Context-Specific Bayesian Clustering for Gene Expression Data
Y. Barash and N. Friedman
In Journal of Computational Biology.
Abstract
The recent growth in genomic data and measurements of genome-wide
expression patterns allows us to apply computational tools to examine
gene regulation by transcription factors. In this work, we present a
class of mathematical models that help in understanding the
connections between transcription factors and functional classes of
genes based on genetic and genomic data. Such a model represents the
joint distribution of transcription factor binding sites and of
expression levels of a gene in a unified probabilistic model.
Learning a combined probability model of binding sites and expression
patterns enables us to improve the clustering of the genes based on
the discovery of putative binding sites and to detect which binding
sites and experiments best characterize a cluster. To learn such
models from data, we introduce a new search method that rapidly learns
a model according to a Bayesian score. We evaluate our method on
synthetic data as well as on real-life data and analyze the biological
insights it provides. Finally, we demonstrate the applicability of
the method to other data analysis problems involving gene expression data.
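
As a rough illustration of what a joint model over binding sites and expression levels can look like, the sketch below computes the posterior probability of each cluster for a single gene. It is a simplified stand-in, not the model or search method of the paper: it assumes that, given the cluster, binding-site indicators are independent Bernoulli variables and expression levels are independent Gaussians. The function names, the dictionary keys (`prior`, `site_prob`, `mean`, `std`), and all parameter values are hypothetical.

```python
import math


def log_gaussian(x, mean, std):
    """Log density of a univariate normal distribution."""
    return -0.5 * math.log(2 * math.pi * std ** 2) - (x - mean) ** 2 / (2 * std ** 2)


def cluster_posterior(sites, expression, clusters):
    """Return P(cluster | sites, expression) for one gene.

    sites:      list of 0/1 indicators, one per putative binding site
    expression: list of floats, one expression level per experiment
    clusters:   list of dicts with hypothetical keys 'prior', 'site_prob',
                'mean', 'std'; probabilities must lie strictly in (0, 1)
    """
    log_joint = []
    for c in clusters:
        lp = math.log(c["prior"])
        # Binding sites: Bernoulli term per site (assumed independent given the cluster).
        for s, p in zip(sites, c["site_prob"]):
            lp += math.log(p if s else 1.0 - p)
        # Expression: Gaussian term per experiment (assumed independent given the cluster).
        for x, m, sd in zip(expression, c["mean"], c["std"]):
            lp += log_gaussian(x, m, sd)
        log_joint.append(lp)

    # Normalize in log space (log-sum-exp) to avoid underflow.
    top = max(log_joint)
    weights = [math.exp(lp - top) for lp in log_joint]
    total = sum(weights)
    return [w / total for w in weights]


# Two made-up clusters over two binding sites and two experiments.
clusters = [
    {"prior": 0.5, "site_prob": [0.9, 0.1], "mean": [2.0, -1.0], "std": [1.0, 1.0]},
    {"prior": 0.5, "site_prob": [0.1, 0.5], "mean": [0.0, 0.0], "std": [1.0, 1.0]},
]
posterior = cluster_posterior([1, 0], [1.8, -0.7], clusters)
print(posterior)
```

In this toy setting both kinds of evidence contribute to the same posterior, so a gene whose binding sites and expression profile both match a cluster is assigned to it with higher confidence than either source would give alone.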
nir@cs.huji.ac.il