ICML 2012 Workshop on Machine Learning in Genetics and Genomics [MLGG]



The field of computational biology has seen dramatic growth over the past few years, not only in newly available data but also in new scientific questions and new challenges for learning and inference. For instance, multiple types of large-scale (often genome-wide) datasets, such as gene expression, genotyping data, whole-genome sequences, protein-protein interactions, and protein abundance measurements, are widely available for multiple model organisms and across multiple conditions, and some of these data types are also available for large patient cohorts. Combined with appropriate statistical and computational methods for their integration, modeling, and analysis, these datasets have the potential to revolutionize our understanding of basic molecular biology and lead to better diagnosis and therapy for genetic and other diseases.

However, computational approaches for analyzing and learning from these data face major challenges, including scalability, data heterogeneity, missing data, and confounding factors, to name a few. It is becoming clear that out-of-the-box computational approaches are unlikely to be applicable. For example, next-generation sequencing technologies produce gigabytes of data for each sample, bringing the issue of scalability to a whole new level. Data heterogeneity and unobserved confounding effects result in artifacts such as non-biological correlations between samples, giving rise to high false-positive rates and complications during validation.


The goal of this workshop is to present emerging genomics research questions and machine learning techniques that can address some of the challenges on the way to answering fundamental biological questions and refining our understanding of the genesis and progression of diseases.

Important Dates

  1. Deadline for paper submission: May 14, 2012 (Midnight Samoa time)

  2. Author notification: May 21, 2012

  3. Workshop date: July 1, 2012

Anna Goldenberg

University of Toronto

anna [dot] goldenberg [at] utoronto [dot] ca

Pierre Baldi

University of California, Irvine

pfbaldi [at] ics [dot] uci [dot] edu

Sara Mostafavi

Stanford University

saram [at] cs [dot] stanford [dot] edu

Michal Rosen-Zvi

IBM Research, Haifa

rosen [at] il [dot] ibm [dot] com