Information Theory: T. Cover and J. Thomas, "Elements of Information Theory", Chapters 2, 3

Convex Optimization: S. Boyd and L. Vandenberghe, "Convex Optimization", Chapter 3.
Available online

Linear Algebra background: "Convex Optimization", Appendix

Statistics: Daphne Koller and Nir Friedman, "Bayesian Networks and Beyond" (BN Beyond), draft book, Chapter 2

Graphical Models

Representation: BN Beyond

       Parameter Estimation: BN Beyond.
       Expectation Maximization: R. Neal, G. Hinton. A view of the EM algorithm that justifies incremental, sparse, and other variants.

       Exact Inference: BN Beyond.
       Variational Inference: Martin Wainwright and Michael Jordan, A Variational Principle for Graphical Models.
       Sampling 1: BN Beyond.
       Sampling 2: R. Neal, Probabilistic Inference Using Markov Chain Monte Carlo Methods, Chapter 3.

Machine Learning

Chris Bishop, Pattern Recognition and Machine Learning, (the entire book)
(This book comprehensively covers most of the topics in machine learning)


Book 1: Visual Perception: Key Readings Edited by Steven Yantis (Psychology Press) 2000.
(A collection of 25 essential papers in vision science/neuroscience.)

Book 2: Foundations of Vision by Brian A. Wandell (Sinauer) 1995.

Computational Neuroscience

Book: Peter Dayan and L. F. Abbott, Theoretical Neuroscience, MIT Press.

Deep belief networks:
       Hinton, G. E. and Salakhutdinov, R. R., Reducing the dimensionality of data with neural networks. Science, Vol. 313. no. 5786, pp. 504 - 507, 28 July 2006.
       Geoffrey E. Hinton, What kind of a graphical model is the brain?

Sparse coding/ICA:
       Olshausen BA, Field DJ (2004). Sparse Coding of Sensory Inputs, Current Opinion in Neurobiology, 14: 481-487.
       A. Hyvärinen and P.O. Hoyer. Emergence of phase and shift invariant features by decomposition of natural images into independent feature subspaces. Neural Computation, 12(7):1705-1720, 2000.

Slow feature analysis: Berkes, Pietro and Wiskott, Laurenz. Slow feature analysis yields a rich repertoire of complex cell properties. Journal of Vision, 5(6):579-602, 2005.