Deciphering Mutational Signatures with R, Amazon EC2 and the mutSignatures package

Important notice: this page is currently a placeholder while the mutSignatures R package undergoes a major upgrade. Sorry for the inconvenience! A paper is (hopefully) getting accepted soon, and this page will then host the tutorial on how to deploy mutSignatures on EC2 instances for deciphering mutational signatures.

Please note that a dev package is currently available (it comes with a vignette). To learn more about it, don’t hesitate to contact the package author/maintainer at the email address listed on CRAN.


The mutSignatures package is

It is As recommended in the original paper

Let’s go for a test drive using this package on Amazon EC2.

Start running an RStudio Server on Amazon EC2. For this we can use one of the Amazon Machine Images (AMIs) from the following URL:

Click on the link (I am using the North Virginia one), log in to Amazon Web Services (AWS), and select the server type (c4.2xlarge).

Make sure to set a security group that allows access to the RStudio Server via the web (allow HTTP access on port 80, at least from your current local IP address). Then create a key pair or select an existing one, and launch the EC2 instance.

This is very similar to what is described in the following post: Working with R and Bioconductor on the cloud (Amazon EC2).

Open the public IP address of the instance in your browser and log in to RStudio Server with user: rstudio; password: rstudio. You can change the password if you want.

The browser will display your RStudio layout.

Install the package and all its dependencies by typing

The installation proceeds with no errors. Once it is completed, you can load the library via:

We are now ready to proceed with the test run.


For this test, we will generate some simulated mutational catalogs. To do so, I will download some of the COSMIC signatures and then combine them to generate different combinations of three selected signatures.

cosmic.url <- ""
cosmic.signatures <- read.delim(cosmic.url)
rownames(cosmic.signatures) <- cosmic.signatures$Somatic.Mutation.Type
cosmic.signatures <- cosmic.signatures[, grep("Signature", colnames(cosmic.signatures))]
selected.signatures <- cosmic.signatures[, c(5, 13, 22)]

We can simulate some mutational catalogues by randomly combining these signatures. Briefly, we calculate the product of the signature matrix and a matrix of randomly assigned exposures (one column per simulated genome, with the three exposures summing to 1). Next, since the framework input is a matrix of counts (and not frequencies), we multiply each column by a random total number of mutations.

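# One column of three exposures per simulated genome; each column sums to 1
exposure.matrix <- sapply(1:100, function(i) {
  exp.a <- runif(1, 0, 1)
  exp.b <- runif(1, 0, 1 - exp.a)
  exp.c <- 1 - (exp.a + exp.b)
  c(exp.a, exp.b, exp.c)
})
# Mutation-type frequencies for each simulated genome
simulated.mutFreqs <- as.matrix(selected.signatures) %*% exposure.matrix
# Convert frequencies to counts using a random total between 200 and 4000
simulated.mutCounts <- apply(simulated.mutFreqs, 2, function(clm) {
  as.integer(runif(1, 200, 4000) * clm)
})
rownames(simulated.mutCounts) <- rownames(simulated.mutFreqs)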
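Since the COSMIC table is not reproduced here, the simulation logic can be checked on its own with a toy signature matrix. The following is a minimal sketch under stated assumptions: `toy.signatures` is a random 96 x 3 matrix standing in for the three selected signatures, and every `toy.*` name is hypothetical rather than part of the original analysis.

```r
set.seed(42)  # fixed seed so the sketch is reproducible

# Hypothetical stand-in for the three selected signatures:
# 96 mutation types x 3 signatures, each column summing to 1.
toy.signatures <- matrix(runif(96 * 3), nrow = 96, ncol = 3)
toy.signatures <- sweep(toy.signatures, 2, colSums(toy.signatures), "/")
rownames(toy.signatures) <- paste0("type", 1:96)

# Exposures for 100 simulated genomes; each column is one genome.
toy.exposures <- sapply(1:100, function(i) {
  exp.a <- runif(1, 0, 1)
  exp.b <- runif(1, 0, 1 - exp.a)
  c(exp.a, exp.b, 1 - (exp.a + exp.b))
})

# Frequencies: signatures (96 x 3) times exposures (3 x 100).
toy.freqs <- toy.signatures %*% toy.exposures

# Scale each genome by a random mutation total and truncate to integers.
toy.counts <- apply(toy.freqs, 2, function(clm) {
  as.integer(runif(1, 200, 4000) * clm)
})
rownames(toy.counts) <- rownames(toy.freqs)

# Sanity checks: exposures and frequencies are proper proportions.
stopifnot(all(abs(colSums(toy.exposures) - 1) < 1e-8))
stopifnot(all(abs(colSums(toy.freqs) - 1) < 1e-8))
stopifnot(all(toy.counts >= 0))
```

Because `as.integer()` truncates, the column totals of the count matrix end up slightly below the randomly drawn totals. That is harmless for a simulation, but `round()` could be used instead if exact totals matter.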


Our mutation count matrix is ready to be processed using the mutSignatures framework. We only need to format it using the setMutCountObject function. Likewise, we need to set up the parameters for the run: how many signatures we want to extract, how many cores to use, and how many iterations to run per core. Most of the other parameters may be left unchanged. The object names on the left-hand side of the assignments below are placeholders.

# Placeholder object names: mutCount.obj, mutParams.obj
mutCount.obj <- setMutCountObject(simulated.mutCounts, datasetName = "simulated genomes")
mutParams.obj <- setMutClusterParams(num.processes.toextract = 3,
                                     tot.iterations = 20,
                                     tot.cores = 6)
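For intuition about what "extracting three signatures" means: signature extraction in frameworks of this kind is typically based on non-negative matrix factorization (NMF), which approximates the count matrix as the product of a signature matrix and an exposure matrix. The sketch below is a bare-bones base-R NMF with multiplicative updates on toy data. It is not the mutSignatures implementation; `toy.nmf` and all other names are hypothetical and used for illustration only.

```r
# Hypothetical helper, NOT part of mutSignatures: factorizes V (types x genomes)
# into W (types x k signatures) and H (k x genomes exposures), all non-negative.
toy.nmf <- function(V, k, n.iter = 200, eps = 1e-9) {
  W <- matrix(runif(nrow(V) * k), nrow = nrow(V))
  H <- matrix(runif(k * ncol(V)), nrow = k)
  err <- numeric(n.iter)
  for (it in seq_len(n.iter)) {
    # Multiplicative updates for the squared-error objective
    H <- H * (t(W) %*% V) / (t(W) %*% W %*% H + eps)
    W <- W * (V %*% t(H)) / (W %*% H %*% t(H) + eps)
    err[it] <- sqrt(sum((V - W %*% H)^2))  # reconstruction error
  }
  list(W = W, H = H, error = err)
}

set.seed(1)
# Toy data built from 3 known non-negative components
true.W <- matrix(runif(96 * 3), nrow = 96)
true.H <- matrix(runif(3 * 50), nrow = 3)
V <- true.W %*% true.H

fit <- toy.nmf(V, k = 3)
# The reconstruction error should shrink as the iterations proceed
print(c(first = fit$error[1], last = fit$error[length(fit$error)]))
```

A single NMF run depends on its random starting point, which is presumably why the run parameters ask for many iterations spread across several cores: repeated runs can be compared to find stable solutions.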

Everything is ready now. To start the analysis, do as follows:

Running such an analysis on an 8-core cluster will…





About Author

Postdoc Research Fellow at Northwestern University (Chicago)
