---
title: "Simulation with non-canonical matrices"
author: "Matthew Stephens"
date: "`r Sys.Date()`"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{mashr simulation with non-canonical matrices}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE,comment = "#",fig.width = 5,
                      fig.height = 4,fig.align = "center",
                      eval = FALSE)
```

## Goal

To try out some simulations that don't match the canonical covariance
matrices and illustrate how the data driven matrices help.

## Simple simulation

Here the function `simple_sims_2` simulates data in five conditions
with just two types of effect:

1. shared effects only in the first two conditions; and

2. shared effects only in the last three conditions.

```{r}
library(ashr)
library(mashr)
set.seed(1)
simdata = simple_sims2(1000,1)
true.U1 = cbind(c(1,1,0,0,0),c(1,1,0,0,0),rep(0,5),rep(0,5),rep(0,5))
true.U2 = cbind(rep(0,5),rep(0,5),c(0,0,1,1,1),c(0,0,1,1,1),c(0,0,1,1,1))
U.true  = list(true.U1 = true.U1, true.U2 = true.U2)
```
  
## Simple simulation

Run 1-by-1 to add the strong signals and ED covariances.

```{r, collapse = TRUE}
data   = mash_set_data(simdata$Bhat, simdata$Shat)
m.1by1 = mash_1by1(data)
strong = get_significant_results(m.1by1)
U.c    = cov_canonical(data)
U.pca  = cov_pca(data,5,strong)
U.ed   = cov_ed(data,U.pca,strong)

# Computes covariance matrices based on extreme deconvolution,
# initialized from PCA.
m.c    = mash(data, U.c)
m.ed   = mash(data, U.ed)
m.c.ed = mash(data, c(U.c,U.ed))
m.true = mash(data, U.true)
  
print(get_loglik(m.c),digits = 10)
print(get_loglik(m.ed),digits = 10)
print(get_loglik(m.c.ed),digits = 10)
print(get_loglik(m.true),digits = 10)
```

The log-likelihood is much better from data-driven than canonical
covariances. This is good! Indeed, here the data-driven fit is very
slightly better fit than the true matrices, but only very slightly.