--- title: "Simulation with non-canonical matrices" author: "Matthew Stephens" date: "`r Sys.Date()`" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{mashr simulation with non-canonical matrices} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r setup, include=FALSE} knitr::opts_chunk$set(echo = TRUE,comment = "#",fig.width = 5, fig.height = 4,fig.align = "center", eval = FALSE) ``` ## Goal To try out some simulations that don't match the canonical covariance matrices and illustrate how the data driven matrices help. ## Simple simulation Here the function `simple_sims_2` simulates data in five conditions with just two types of effect: 1. shared effects only in the first two conditions; and 2. shared effects only in the last three conditions. ```{r} library(ashr) library(mashr) set.seed(1) simdata = simple_sims2(1000,1) true.U1 = cbind(c(1,1,0,0,0),c(1,1,0,0,0),rep(0,5),rep(0,5),rep(0,5)) true.U2 = cbind(rep(0,5),rep(0,5),c(0,0,1,1,1),c(0,0,1,1,1),c(0,0,1,1,1)) U.true = list(true.U1 = true.U1, true.U2 = true.U2) ``` ## Simple simulation Run 1-by-1 to add the strong signals and ED covariances. ```{r, collapse = TRUE} data = mash_set_data(simdata$Bhat, simdata$Shat) m.1by1 = mash_1by1(data) strong = get_significant_results(m.1by1) U.c = cov_canonical(data) U.pca = cov_pca(data,5,strong) U.ed = cov_ed(data,U.pca,strong) # Computes covariance matrices based on extreme deconvolution, # initialized from PCA. m.c = mash(data, U.c) m.ed = mash(data, U.ed) m.c.ed = mash(data, c(U.c,U.ed)) m.true = mash(data, U.true) print(get_loglik(m.c),digits = 10) print(get_loglik(m.ed),digits = 10) print(get_loglik(m.c.ed),digits = 10) print(get_loglik(m.true),digits = 10) ``` The log-likelihood is much better from data-driven than canonical covariances. This is good! Indeed, here the data-driven fit is very slightly better fit than the true matrices, but only very slightly.