---
title: "mixedClust"
output: 
  rmarkdown::html_vignette:
    fig_caption: true
bibliography: bibliography.bib
vignette: >
  %\VignetteIndexEntry{mixedClust}
  %\VignetteEngine{knitr::rmarkdown}
  \usepackage[utf8]{inputenc}
---

## Description

mixedClust is an R package to perform co-clustering on heterogeneous data. The kind of data that are taken into account are:
* Categorical
* Quantitative
* Integer
* Ordinal
* Functional

## Installation
```r
set.seed(5)
```
```r
library(mixedClust)
```

## Datasets

__under construction__

## Simulation of heterogeneous data

The following codes simulate a sample of heterogeneous data.
```r
M <- matrix(0, nrow=150,ncol=250)
```

### Simulation of categorical data
This snippet creates a sample of categorical data with 6 levels.

```r
multinomial6.block1 <- sample(1:6, 120*75, prob = c(0.4,0.25,0.1,0.1,0.05,0.1), replace = TRUE)
multinomial6.block2 <- sample(1:6, 120*50, prob = c(0.1,0.1,0.05,0.6,0.1,0.05), replace = TRUE)
multinomial6.block3 <- sample(1:6, 30*75, prob = c(0.2,0.1,0.2,0.1,0.1,0.3), replace = TRUE)
multinomial6.block4 <- sample(1:6, 30*50, prob = c(0.05,0.2,0.1,0.2,0.4,0.05), replace = TRUE)

M[1:120,1:75] <- multinomial6.block1
M[1:120,76:125] <- multinomial6.block2
M[121:150,1:75] <- multinomial6.block3
M[121:150,76:125] <- multinomial6.block4
```

### Simulation of quantitative data

```r
gaussian.block1 <- rnorm(120*10)
gaussian.block2 <- rnorm(120*40,mean=28,sd=7)
gaussian.block3 <- rnorm(30*10,mean=-12,sd=1.5)
gaussian.block4 <- rnorm(30*40,mean=2,sd=1.5)

M[1:120,126:135] <-gaussian.block1
M[1:120,136:175] <- gaussian.block2
M[121:150,126:135] <- gaussian.block3
M[121:150,136:175] <- gaussian.block4
```

### Simulation of ordinal data

The model Bos is used to simulate ordinal data. This snippet creates a sample of ordinal data with 5 levels.

```r
library(ordinalClust)

m=5 
probaBOS=rep(0,m)
for (im in 1:m) probaBOS[im]=pejSim(im,m,4,0.8)
bos.block1 <- matrix(0,nrow = 120, ncol = 35)
bos.block1 <- sample(1:m,120*35,prob = probaBOS, replace=TRUE)

probaBOS=rep(0,m)
for (im in 1:m) probaBOS[im]=pejSim(im,m,2,0.3)
bos.block2 <- matrix(0,nrow = 120, ncol = 20)
bos.block2 <- sample(1:m,120*20,prob = probaBOS, replace=TRUE)

probaBOS=rep(0,m)
for (im in 1:m) probaBOS[im]=pejSim(im,m,1,0.7)
bos.block3 <- matrix(0,nrow = 120, ncol = 20)
bos.block3 <- sample(1:m,120*20,prob = probaBOS, replace=TRUE)

probaBOS=rep(0,m)
for (im in 1:m) probaBOS[im]=pejSim(im,m,3,0.8)
bos.block4 <- matrix(0,nrow = 30, ncol = 35)
bos.block4 <- sample(1:m,30*35,prob = probaBOS, replace=TRUE)

probaBOS=rep(0,m)
for (im in 1:m) probaBOS[im]=pejSim(im,m,5,0.4)
bos.block5 <- matrix(0,nrow = 30, ncol = 20)
bos.block5 <- sample(1:m,30*20,prob = probaBOS, replace=TRUE)

probaBOS=rep(0,m)
for (im in 1:m) probaBOS[im]=pejSim(im,m,5,0.8)
bos.block6 <- matrix(0,nrow = 30, ncol = 20)
bos.block6 <- sample(1:m,30*20,prob = probaBOS, replace=TRUE)

M[1:120,176:210] <-bos.block1
M[1:120,211:230] <- bos.block2
M[1:120,231:250] <- bos.block3
M[121:150,176:210] <-bos.block4
M[121:150,211:230] <- bos.block5
M[121:150,231:250] <- bos.block6
```

### Shuffling lines and columns
```r
line.sample <- sample(1:150,150,replace = F)

col.sample.cat <- sample(1:125,125,replace=F)
col.sample.gaussian <- sample(126:175,50,replace=F)
col.sample.bos <- sample(176:250,75,replace=F)

M1 <- M[line.sample,c(col.sample.cat, col.sample.gaussian, col.sample.bos)]
```

## Setting parameters

```r
nbSEM=120
nbSEMburn=100
nbindmini=1
init = "kmeans"

kr=2
kc=c(2,2,3)
m=c(6,5)
d.list <- c(1,126,176)
distributions <- c("Multinomial","Gaussian","Bos")
```

## Perform co-clustering

In this section, a co-clustering is executed with the simulated dataset, thanks to the **mixedCoclust** function.

```r
res <- mixedCoclust(x = M1, myList = d.list,distrib_names = distributions,
                    kr = kr, kc = kc, m = m, init = init,nbSEM = nbSEM,
                    nbSEMburn = nbSEMburn, nbindmini = nbindmini)
```


## The particular case of functional data
Functional data is taken into account in this package. However, the way of introducing them is a bit different since they are not represented by a simple matrix. Functional data must be stored in a **functionalData** array with three dimensions:
* nrow = number of row that must be identical to the number of rows of the x data matrix.
* ncol = number of features of the functional type
* nslice = number of points for one function (all functions must have the same number of points)
Then, **functionalData** is passed as an argument to the different functions (co-clustering, clustering, classification).



### Simulation of functional data
The **fda.usc** package is used to simulate functional data

```r
library(fda.usc)

par(mfrow=c(1,2))
lent<-50
tt<-seq(0,1,len=lent)
mu1<-fdata(0.5*cos(2.3*2*pi*tt)+5.4*sin(0.4*2*pi*tt),tt)
mu2<-fdata(cos(2*pi*tt)+sin(2*pi*tt),tt)
mu3<-fdata(2*cos(2*pi*tt)+sin(2*pi*tt*4),tt)
mu4<-fdata(sin(2*pi*tt*5),tt)
nb <- 100
func.block.1 <- rproc2fdata(nb,mu=mu1,sigma="OU",par.list=list("scale"=1))
func.block.2 <- rproc2fdata(nb,mu=mu2,sigma="OU",par.list=list("scale"=1))
func.block.3 <- rproc2fdata(nb,mu=mu3,sigma="OU",par.list=list("scale"=1))
func.block.4 <- rproc2fdata(nb,mu=mu4,sigma="OU",par.list=list("scale"=1))
```

The **functionalData** array is built:

```r
functionalData <- array(0,c(20,20,50))
functionalData[1:10,1:10,]=func.block.1$data
functionalData[1:10,11:20,]=func.block.2$data
functionalData[11:20,1:10,]=func.block.3$data
functionalData[11:20,11:20,]=func.block.4$data

sample.lines <- sample(1:20,20,replace=F)
sample.cols <- sample(1:20, 20, replace=F)
functionalData <- functionalData[sample.lines, sample.cols,]

line.labels <- c(rep(1,10),rep(2,10))[sample.lines]
col.labels <- c(rep(1,10),rep(2,10))[sample.cols]
```

### Setting parameters
One of the limitation of functional data is that the kmeans algorithm cannot be used as initialization.
```r
nbSEM=120
nbSEMburn=100
nbindmini=1
init = "random"
kc = c(2)
kr = 2 
distributions <- c("Functional")
```

### Performing co-clustering with functional data

```r
res <- mixedCoclust(distrib_names = distributions,kr = kr, kc = kc,
                    init = init, nbSEM = nbSEM, nbSEMburn = nbSEMburn,
                    nbindmini = nbindmini, functionalData = functionalData)
```



## References




