Survival TCGA

All cancers are caused by mutations in DNA. The Cancer Genome Atlas (TCGA) is a project that began in 2005 and aims to catalogue the mutations that cause cancer in patients. It uses high-throughput sequencing technologies and bioinformatics to achieve this goal. Besides genomic data, TCGA also records and provides detailed anonymized metadata about each patient (such as disease stage, age, sex, and health status), which is very valuable for identifying risk factors.

In this workflow, survival analysis is performed on TCGA metadata and survival data using a regularized Cox regression model. The model is fitted with the glmnet package, one of the most efficient R packages for this kind of analysis.
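For orientation, the objective that glmnet maximizes for the Cox family can be written roughly as follows. This is a sketch based on the elastic-net formulation used by glmnet, assuming no tied event times. The symbols are not from this page: \delta_i is the event indicator of patient i, R_i is the set of patients still at risk at patient i's event time, \lambda is the penalty strength, and \alpha mixes the lasso (\alpha = 1) and ridge (\alpha = 0) penalties.

```latex
\hat{\beta} = \arg\max_{\beta} \left[
  \frac{2}{n} \sum_{i:\,\delta_i = 1}
    \left( x_i^{\top}\beta - \log \sum_{j \in R_i} e^{x_j^{\top}\beta} \right)
  - \lambda \left( \alpha \lVert \beta \rVert_1
                   + \frac{1-\alpha}{2} \lVert \beta \rVert_2^2 \right)
\right]
```

The first term is the scaled Cox partial log-likelihood; the second is the elastic-net penalty controlled by the alpha and lambda parameters of the workflow.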

digraph SURV_TCGA_workflow {
             Data [group = g1; shape = invhouse, label = "Survival data (TCGA)"];
             xdata [shape = box; label = "xdata:\nExpression"];
             ydata [shape = box; label = "ydata:\nSurvival"];
             params [group = g1; shape = box; label = "n lambda: 10k\nalpha: seq(0 , 1, .1)        \nn alpha: 11"];
             glmnet [group = g1; shape = box; label = "glmnet(xdata, ydata,    \nalpha, family=\"cox\")    "];
             Data -> xdata;
             Data -> ydata;
             xdata -> glmnet;
             ydata -> glmnet;
             glmnet -> glmnet [label = " n alpha   "];

             edge[style=invis];
             glmnet -> params;
     }

Survival workflow with TCGA dataset and glmnet()
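
The loop in the diagram can be sketched in R roughly as follows. This is a minimal illustration and not the workflow's actual script: the data are simulated stand-ins for the TCGA tables, the names n, p, time, status, alphas and fits are chosen here for illustration, and nlambda is set to 100 instead of the 10k shown in the diagram so that the example runs quickly.

```r
library(glmnet)

set.seed(1)

# Simulated stand-in for the TCGA data (hypothetical, not the real dataset)
n <- 100   # number of patients
p <- 20    # number of covariates
xdata <- matrix(rnorm(n * p), nrow = n, ncol = p)
colnames(xdata) <- paste0("gene", seq_len(p))

# Survival outcome: follow-up time and event indicator (1 = event, 0 = censored).
# For family = "cox", glmnet accepts a two-column matrix named "time" and "status".
time   <- rexp(n, rate = exp(0.5 * xdata[, 1]))
status <- rbinom(n, size = 1, prob = 0.7)
ydata  <- cbind(time = time, status = status)

# Alpha grid from the diagram: seq(0, 1, .1) gives 11 values
alphas <- seq(0, 1, by = 0.1)

# One regularization path per alpha value (nlambda reduced for speed)
fits <- lapply(alphas, function(a) {
  glmnet(xdata, ydata, family = "cox", alpha = a, nlambda = 100)
})
```

Each element of fits holds a full path of coefficient estimates over lambda for one alpha value; for example, coef(fits[[11]]) returns the coefficients of the lasso fit (alpha = 1) along its lambda path.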

Packages and Dependencies

This workflow uses 2 packages, which in turn depend on 5 additional packages from CRAN.

Used packages:

  • CRAN: glmnet, survival

Package dependencies:

  • CRAN: foreach, Matrix, codetools, iterators, lattice

Data

  • Patient survival data from the TCGA consortium, provided by Andre Verissimo.

License

Copyright (c) 2015 Andre Verissimo (andre.verissimo@tecnico.ulisboa.pt)
Copyright (c) 2015-2016 BeDataDriven B.V.
