In clinical studies, information about disease state and progression is collected for a large number of patients, together with patient information that could (technically) be used for early diagnosis and/or prognosis of disease. In this workflow, clinical prostate cancer data from 97 patients is used to identify molecular markers that correlate with disease stage. The analysis is mainly inspired by the book 'The Elements of Statistical Learning'.
The analyses used in this workflow are: cross-validation of a Lasso-penalized regression fit and identification of the best predictive variables ('ncvreg' and 'leaps' packages); linear regression and fitting with L1 constraints ('stats' and 'lasso2' packages); Lasso-penalized Least Angle Regression with cross-validation ('lars' package); and fitting a General Linear Model ('stats' package).
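To illustrate what a Lasso (L1-penalized) fit with cross-validation does, here is a minimal sketch in plain Python. It is not the workflow's code and does not call any of the R packages listed above. It runs on synthetic data rather than the prostate data. The function names (`make_data`, `lasso_fit`, `cv_error`), the coordinate-descent solver and the grid of penalty values are all assumptions made for the example.

```python
import random


def soft_threshold(z, gamma):
    """Shrink z toward zero by gamma; values within [-gamma, gamma] become 0."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0


def lasso_fit(X, y, lam, n_iter=100):
    """Coordinate descent for (1/2n)*sum(residual^2) + lam*sum(|beta_j|).

    Returns (intercept, beta). The intercept is not penalized.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    intercept = sum(y) / n
    resid = [yi - intercept for yi in y]  # y - intercept - X.beta
    for _ in range(n_iter):
        for j in range(p):
            col = [row[j] for row in X]
            zj = sum(c * c for c in col) / n
            # Covariance of feature j with the residual that ignores feature j.
            rho = sum(c * (r + c * beta[j]) for c, r in zip(col, resid)) / n
            new_bj = soft_threshold(rho, lam) / zj
            delta = new_bj - beta[j]
            resid = [r - c * delta for c, r in zip(col, resid)]
            beta[j] = new_bj
        # Re-centre the intercept so the residuals average to zero.
        shift = sum(resid) / n
        intercept += shift
        resid = [r - shift for r in resid]
    return intercept, beta


def mse(X, y, intercept, beta):
    """Mean squared prediction error of a fitted model."""
    preds = [intercept + sum(x * b for x, b in zip(row, beta)) for row in X]
    return sum((yi - pi) ** 2 for yi, pi in zip(y, preds)) / len(y)


def cv_error(X, y, lam, k=5):
    """Average held-out MSE over k folds for one penalty value."""
    n = len(y)
    errors = []
    for fold in range(k):
        test_idx = set(range(fold, n, k))
        X_tr = [X[i] for i in range(n) if i not in test_idx]
        y_tr = [y[i] for i in range(n) if i not in test_idx]
        X_te = [X[i] for i in sorted(test_idx)]
        y_te = [y[i] for i in sorted(test_idx)]
        b0, b = lasso_fit(X_tr, y_tr, lam)
        errors.append(mse(X_te, y_te, b0, b))
    return sum(errors) / k


def make_data(n=60, seed=0):
    """Synthetic data (assumption): only the first two of four features matter."""
    rng = random.Random(seed)
    X = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(n)]
    y = [2.0 * row[0] - 1.5 * row[1] + rng.gauss(0, 0.5) for row in X]
    return X, y


if __name__ == "__main__":
    X, y = make_data()
    grid = [0.01, 0.1, 1.0, 10.0]  # illustrative penalty values
    best_lam = min(grid, key=lambda lam: cv_error(X, y, lam))
    print("selected penalty:", best_lam)
    print("coefficients:", lasso_fit(X, y, best_lam)[1])
```

The penalty sets weakly predictive coefficients to exactly zero, which is why this kind of fit can be used to pick out predictive variables. Cross-validation chooses the penalty strength by held-out prediction error.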
This workflow uses 6 packages, which depend on 2 additional packages from CRAN (dependencies).
Used packages:
Package dependencies:
Source datasets are from http://cran.r-project.org/web/packages/lasso2/lasso2.pdf