CS 374 - Algorithms in Biology
This course will cover algorithms and computational models applied to molecular biology. Current, exciting algorithms from a variety of biological areas will be covered. The topics should be of interest to computer scientists and biologists alike. In Fall 2004 we will cover topics from genomics and evolution of DNA, such as sequence comparison methods, annotating DNA with genes and evolutionarily important elements, genomic rearrangements, microarray analysis, and new sequencing technologies. We will also cover topics from protein structure, protein surface and interaction modeling, multiple alignment of proteins, phylogenetic trees, and DNA-based computation. The course will consist primarily of student presentations of topics in the syllabus, which will be prepared with the help of the instructor. Students will help form the syllabus by choosing the topics they would like to present.
Lecture: TTh 3:15-4:30, Clark Center room S361
Staff and Office hours
Instructor: Serafim Batzoglou
Office: S266 Clark Center
Phone: (650) 723-3334
E m a il: serafim (at the address of) cs period stanford period edu (so as to avoid spam)
Office hours: Tuesday 1:15-3:15PM.
TA: Relly Brandman
Prerequisites
These are recommended but will not be strictly enforced.
CS161, Design and Analysis of Algorithms, or equivalent familiarity with algorithmic and data structure concepts.
CS262, Computational Genomics, or CS274, Representations and Algorithms for Computational Molecular Biology, or BIOCHEM218, Computational Molecular Biology; or equivalent familiarity with computational biology concepts, problems, and algorithms.
Course Requirements and Grading
1. Lecture. The main course requirement is to select a topic and prepare a presentation based on 2 papers on the topic. The instructor and TA will meet with each student to help with the preparation and to ensure that the resulting presentation is interesting and accessible to students in the class who are not experts in the given topic. Most of the topics have a strong algorithmic flavor, but some are geared more towards biology. Please sign up for topics to present on a first-come, first-served basis (see Topics below).
2. Scribing. The second requirement is scribing a lecture. Lecture notes should give students taking the class a useful resource for remembering the material presented. Ideally, lecture notes should be written so that they are readable by next year's students, who will not necessarily have read the papers that were presented. Here is a sample of what lecture notes should look like in terms of format and organization. We suggest that you use it as a template to prepare your lecture notes in Word.
Please sign up for scribing on a first-come, first-served basis. To do so, please email both the instructor and the TA with the subject "CS374, signing up for scribing". Lecture notes are due 1 week after the presentation.
3. Summaries. As a third requirement, you should select one of the first 10 lectures and one of the remaining lectures. For each selected lecture, you should find one paper, in addition to the 2 presented, that is related to the topic. Recent papers (2001-2005) are preferable. Then, you should write a 1-page summary of what the paper presents and how it relates to the other two. Each summary is due 1 week after the selected lecture, and it will be made available online 2 weeks after that lecture, once we have edited it together.
Here is a sample structure of this short summary:
- Paper reference
- Abstract: in your own words (preferably simple description), what does the paper present
- Discussion: how do these results relate to the topic? Is it an advance over what was described, a different approach, and what are the main advantages/disadvantages?
4. Attendance. As this is a seminar-style class, attendance is mandatory. Each student can miss up to 2 classes without affecting his/her grade.
Taking the class for 2 units: If you take the class for 2 units, you can drop (2) or (3) above; or, if enrollment is too high, we will consider dropping (1) if you prefer.
Questions should be sent to the instructor and TA directly by email, or communicated to course staff in person after lecture or during office hours.
Topics
Students will select topics from the following list and sign up for a presentation date, both on a first-come, first-served basis. Please email both the instructor and the TA with the subject "CS374, signing up for presentation". Each lecture will cover 2, or occasionally 3, papers. Underlined topics have been assigned.
In selecting topics, note that some of them have several papers, which are always grouped. Please select just one group of papers.
Color code: The topics below are color coded to roughly correspond to the subject area. Sky blue broadly denotes DNA sequence & genomics papers, red denotes systems and modular biology, green denotes protein-related papers, purple denotes biological computation, orange denotes non-CS biology, and gray denotes miscellaneous topics. The ordering of the topics is random.
|2||Repetitive DNA detection and classification||
|3||Networks of Protein Interactions||
B Network Alignment
C Mathematical Properties
D Systems Biology
E Misc. graph algorithms
F Signal Transduction Networks
|4||Indexing large databases for string similarity search||
A Seeded database search
B Multiple seeds and multiple alignments
|5||Regulatory motif finding||
|6||Protein structure and prediction||
A Finding the Beta Helix motif
B Computational musings on protein domains
C Molecular Dynamics Simulation of Drug-Target Proteins
D Graphical Models for Protein Kinetics
A Kernel-based methods
B Graph flow-based methods
|10||Finding elements in DNA that are conserved by evolution||
A Methods for finding conserved elements
B Statistical power of detecting conserved elements
|11||Protein multiple alignment||
|12||Modeling the origin and migration of human populations||
|13||Finding genes based on comparative genomics||
|14||Mining the medical literature||
|15||Modeling regulatory networks||
A Probabilistic Modeling
B Role of noise in gene expression
This presentation, if selected by a student, will differ from the usual format. We will cover a historical perspective based on three classic papers on Chromosomes (1903), Genes (1933), and the Central Dogma of molecular biology (1970).
|17||DNA-based computation and self-assembly||
|18||Transforming cells into automata||
The schedule will be filled in as students sign up for topics. Click on the scribe's name for lecture notes.
|Lecture||Topic||Date||Presenter||Short Paper Summaries||Scribe|
|1||Introduction||9-27||Serafim Batzoglou|| ||Abhishek Rathod|
|2||Comparative Genomics||9-29||Serafim Batzoglou|| ||Vignesh Ganapathy|
|3||Finding Genes Based on Comparative Genomics||10-4||Sam Gross|| ||Ross Bayer|
|4||Classic Papers in Genetics||10-6||Chihiro Fukami|| || |
|5||Networks of Protein Interactions -- A. Introduction and Integration||10-11||Balaji Srinivasan|| || |
|6||Networks of Protein Interactions -- B. Network Alignment||10-13||Tony Novak|| || |
|7||Indexing Large Databases for String Similarity -- A. Seeded Database Search||10-18||Ross Bayer||Indexing a MSA|| |
|8||Protein Multiple Alignment||10-20||Konstantin Davydov||SPEM aligner|| |
|9||Networks of Protein Interactions -- C. Mathematical Properties||10-25||Abhishek J Rathod||Functional Topology||Chihiro Fukami|
|10||Signal Transduction Networks -- no slides, Onn used chalk :-)||10-27||Onn Brandman||ModularAnalysis|| |
|11||Graphical Models for Understanding Protein Kinetics||11-1||Nina Singhal|| || |
|12||Regulatory Motif Finding||11-3||Wenxiu Ma||Phylogenetic Motif Finder 1, 2||Marcin Mejran|
|13||DNA-based Computation and Self-Assembly||11-8||Ho-Lin Chen|| || |
|14||Protein Classification||11-10||Serafim Batzoglou||Coherent Subgraphs|| |
|15||Protein Structure and Prediction -- A. Finding the Beta Helix Motif||11-15||Marcin Mejran||Ab initio prediction|| |
|16||Modeling the Origin and Migration of Human Populations||11-17||Michael Palmer|| ||Melroy Saldanha|
|17||Mining the Medical Literature||11-29||Vignesh Ganapathy|| ||Konstantin Davydov|
|18||Networks of Protein Interactions -- D. Systems Biology||12-1||Ophelia Venturelli|| || |
|19||Regulatory Motif Finding, Part II||12-6||Balaji Srinivasan|| ||Wenxiu Ma|
|20||Phylogenetic Trees||12-8||Melroy Saldanha|| ||Ophelia Venturelli|