CS374

Course Description
This course will cover algorithms and computational models applied to molecular biology. Current, exciting algorithms from a variety of biological areas will be covered. The topics should be of interest to computer scientists and biologists alike. We will cover topics from genomics and evolution of DNA, such as current sequence analysis methods, regulatory programs, speciation and selection, to the applications of computational biology to the study of diseases. We will also cover topics in structural biology (RNA and protein structure), systems biology, imaging, the use of computational methods in metabolic engineering and biological computation. The course will consist primarily of student presentations of topics in the syllabus, which will be prepared with the help of the instructor. Students will help form the syllabus by choosing the topics they would like to present.

Class Schedule
Lecture: Tue Thu 11:00am-12:15pm, Y2E2 building, room 111

Instructor
Serafim Batzoglou
Office: Clark Center S266
Office hours: TBA.
Meeting before your presentation: By appointment.
Phone: (650) 723-3334
Email: ude.drofnats@mifares (written backwards to avoid spam)

Teaching Assistant
Marc A. Schaub
Office: Clark Center S260
Office hours: Monday 2pm-4pm, Wednesday 2pm-4pm
Paper discussion: by appointment.
Phone: (650) 725-6094
Email:

Prerequisites
The following courses are recommended:
  • CS161: Design and Analysis of Algorithms, or equivalent familiarity with algorithmic and data structure concepts.

  • CS262: Computational Genomics, or CS274: Representations and Algorithms for Computational Molecular Biology, or BIOCHEM218: Computational Molecular Biology, or equivalent familiarity with computational biology concepts, problems and algorithms.

Course Requirements
There are four course requirements:
  1. Presentation. The main course requirement is to select a topic and give a presentation based on two papers on the topic. The instructor and TA will meet with each student to help with the preparation, and ensure that the resulting presentation will be interesting and accessible to students in the class who are not experts in the given topic. Most of the topics have a strong algorithmic flavor, but some topics are more geared towards biology. Please send the slides in PowerPoint (.ppt, .pptx) or PDF format to the TA on the day of your presentation.

  2. Critique. The second requirement is the critique of one of the papers that will be presented in lecture. The critique has to be written and submitted before the topic is presented in class. You will be assigned one class for the critique. You should choose one of the papers that will be presented during that class, read and understand the paper, and then write a critique. The critique should be 2 to 3 pages long (using a 12pt font and standard page setup). The critique must be submitted before class (i.e., before 11am on the assigned day) by emailing a PDF file to the TA. Submissions received during or after class will not be considered. You are required to attend the class during which the paper you critiqued is presented, and are strongly encouraged to share your comments about the paper in the discussion. You may be asked to perform some editing (if necessary) before your critique appears online. The critique should not just be a summary of the paper, but rather demonstrate critical thinking about the content of the paper. Points that can be discussed in the critique include the methodology used in the paper, the quality of the results and their validation, possible shortcomings, alternative approaches to solving the problem, ideas for the extension of the presented approach, and comparisons between the paper and other work in the area. You are encouraged to discuss ideas for your critique with the TA. Keep in mind that your critique constitutes an original text; verbatim copying from any sources is not allowed.

  3. Summary. As a third requirement, you need to select two lectures: one of lectures 3 to 10, and one of lectures 11 to 19 (except lectures 15 and 16). For each lecture, you have to find one paper in addition to the two presented; the paper must be related to the topic and relatively recent (published after 2004). Then, you must write a single-page summary of what the paper presents and how it relates to the other two. Refer to the sample entry for an example of what the summary should look like in terms of format and structure. Each of your two summaries is due one week after its respective lecture, but it may appear online later, as you may be asked to perform some editing (if necessary). The summaries should be submitted by emailing a PDF file to the TA. You do not need to sign up for summaries. If you are not sure if a paper is suitable for a summary, or need other advice, feel free to contact the TAs. Keep in mind that your summaries constitute an original text; verbatim copying from any sources (including the paper you are summarizing) is not allowed.

  4. Attendance. As this is a seminar-style class, attendance is mandatory, but each student can miss up to two classes without affecting his/her grade. You are required to attend the full class period. We will circulate an attendance sheet during each class. If you come late or have to leave early, please note this on the attendance sheet. It is an Honor Code violation to sign the attendance sheet on behalf of somebody else, to ask somebody else to sign the attendance sheet when you are not attending the class, or to indicate that you attended the whole class if you missed part of it.

If you take the class for two units, you can drop (2) or (3) above; or, in case enrollment is too high, we will consider dropping (1) if you prefer.

Getting started...
We are assigning presentation topics, presentation dates and critique dates on a first-come, first-served basis. Please note that your presentation, critique and summaries all need to be for distinct classes (and topics). Critiques are assigned by date, not by topic. To sign up for a presentation, you need to find a topic and a date. Consult the topic list and the schedule, which we will try to update on a rolling basis as soon as assignments are made. As soon as possible, but no later than Friday, April 3, you should send an email with:
  1. A list of at least five topics, in order of preference.
  2. A list of at least five presentation dates, sorted by order of preference.
  3. A list of at least five dates for the critique (unless you are taking the class for two units and want to drop the critique).
  4. Whether you are taking the class for two or three units.
Flexibility in terms of possible dates is greatly appreciated. You cannot critique the same day you are presenting, but giving overlapping dates is fine.

Honor Code
The Stanford Honor Code applies to every document you submit for this class. In particular, please be careful not to plagiarize the papers you are presenting or summarizing. Make sure that you always correctly cite your sources (for example, when you show figures or use illustrations in your presentations). Summaries and critiques must be written using your own words, not by copying text from the respective papers. If you need to cite text verbatim from some source, always put it in quotes and mention the source. If you have any questions about this, or in case of doubt, please contact the TAs.

Communication
Questions should be sent to the instructor and the TA directly by email, or communicated to course staff in person after lecture or during office hours.

Topics
The following is a tentative list of topics:

| Area | # | Topic description | Papers | Assigned to |
|---|---|---|---|---|
| Sequence Alignment | 1 | Whole genome alignment | | |
| | 2 | Topics in sequence alignment | | |
| Regulatory motif finding | 3 | Finding regulatory programs | | |
| | 4 | Approaches to motif finding | | |
| Phylogenetic Trees | 5 | Inference of phylogenetic trees | | Zinnia Zheng |
| Chromatin structure | 6 | Nucleosome Positioning | | Gus Katsiapis |
| Current Trends in Genome Sequencing | 7 | Sequencing human genomes | | |
| | 8 | Short read assembly | | |
| | 9 | Metagenomics | | |
| RNA Biology | 10 | RNA secondary structure prediction | | |
| | 11 | Non-coding RNAs and function | | |
| | 12 | MicroRNA detection | | |
| Structural Biology | 13 | RNA tertiary structure modeling | | |
| | 14 | Protein folding and molecular dynamics as a Markov process | | |
| | 15 | Calculation of protein-ligand binding | | Hieu Nguyen |
| | 16 | Protein Structure Determination | | |
| Systems Biology | 17 | Comparison of biological networks | | |
| | 18 | Building protein interaction networks | | |
| | 19 | Models and methods in systems biology | | Daniel Kluesing |
| | 20 | Learning networks from experimental data | | Marc Schaub (part of lecture 19) |
| Population Genetics and Evolution | 21 | Identifying population structure and evolution | | Marc Schaub (part of lecture 3) |
| | 22 | Positive selection | | Marc Schaub (part of lecture 3) |
| | 23 | Human migrations | | Jennifer Chen |
| | 24 | Comparative genomics and evolution | | Antony Vydrin |
| | 25 | Stories of speciation | | |
| | 26 | Ancestry inference | | |
| | 27 | Haplotype reconstruction | | David Breeden |
| Bioinformatics and disease | 28 | Networks in disease | | Marc Schaub (part of lecture 19) |
| | 29 | Methods in genome wide association studies | | Norú Moreno |
| | 30 | From genome wide association studies to medicine | | Florian Schmitzberger |
| | 31 | Genomes and diseases | | Daniel Kluesing |
| Special topics | 32 | Imaging in biology | | |
| | 33 | Metabolic engineering | | Lekan Wang |
| | 34 | Self-assembly of DNA | | |
| | 35 | Transforming cells into automata | | |

Schedule
As the quarter progresses, the following schedule will be updated accordingly. Please check back often for the latest material.

| Lecture | Date | Title | Presenter | Summaries | Critique |
|---|---|---|---|---|---|
| 1 | 3/31 | Introduction | Serafim Batzoglou, Marc Schaub | | No critique |
| 2 | 4/2 | Human population genetics | Serafim Batzoglou | | No critique |
| 3 | 4/7 | Methods for high throughput population genetics | Marc Schaub | | No critique |
| 4 | 4/9 | Inference of phylogenetic trees | Zinnia Zheng | | |
| 5 | 4/14 | Methods in genome wide association studies | Norú Moreno | | Jennifer Chen |
| 6 | 4/16 | Research presentation | Marc Schaub | | No critique |
| 7 | 4/21 | Human migrations | Jennifer Chen | 1 2 3 | Hieu Nguyen |
| 8 | 4/23 | Nucleosome positioning | Gus Katsiapis | | Lekan Wang |
| 9 | 4/28 | From genome wide association studies to medicine | Florian Schmitzberger | 1 2 3 4 | Norú Moreno |
| 10 | 4/30 | Haplotype reconstruction | David Breeden | 1 2 3 | |
| 11 | 5/5 | Metabolic engineering | Lekan Wang | 1 2 | Zinnia Zheng |
| 12 | 5/7 | Models and methods in systems biology | Daniel Kluesing | 1 2 | Antony Vydrin |
| 13 | 5/12 | Calculation of protein-ligand binding | Hieu Nguyen | 1 2 | |
| 14 | 5/14 | Comparative genomics and evolution | Antony Vydrin | | Florian Schmitzberger |
| 15 | 5/19 | No class (RECOMB'09) | | | |
| 16 | 5/21 | No class (RECOMB'09) | | | |
| 17 | 5/26 | Recomb recap | Marc Schaub | | No critique |
| 18 | 5/28 | Genomes and diseases | Daniel Kluesing | | Gus Katsiapis |
| 19 | 6/2 | Networks in diseases | Marc Schaub | 1 | No critique |
| 20 | 6/5 | No class (Day before finals) | | | |
| 21 | | Summaries on other topics | | 1 2 | |

Sample entry from previous year:

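| Lecture | Date | Title | Presenter | Summaries | Critique |
|---|---|---|---|---|---|
| | | Microarray Analysis and Clustering | Yu Bai | 1 2 3 | |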