CS374

Course Description
This course covers recent developments in computational algorithms applied to a large variety of problems in biology. We will discuss computational methods that are used in current research on topics including sequence analysis, genetics, evolution, structural biology, systems biology, and more. We will cover the most recent advances in genotyping and sequencing technology, and their application to personalized genetics and medicine. The content of the course should be of interest to computer scientists and biologists alike.

The course will consist primarily of student presentations on topics in the syllabus. Presentations will be prepared with the help of the instructor. Students will help form the syllabus by choosing the topics they would like to present.

This course offers a great opportunity to explore cutting-edge research across the field of computational biology, to critically read and discuss recent research, and to practice presentation skills.

Class Schedule
Lecture: Tue Thu 12:50-2:05pm, James H. Clark Center, room S361 (enter by going through Peet's Coffee on the third floor)

Instructor
Serafim Batzoglou
Office: Clark Center S266
Office hours: TBD
Meeting before your presentation: By appointment.
Phone: (650) 723-3334
Email: ude.drofnats@mifares (written backwards to avoid spam)

Teaching Assistants
Daniel Newburger
Office: Clark Center S260
Office hours: Thursday, 10:45AM-12:45PM
Paper discussion: by appointment.
Email: den7.cs374@cs.stanford.edu

Dorna Kashef
Office: Clark Center S256
Office hours: Tuesday, 10:00AM-12:00PM
Paper discussion: by appointment.
Email: dkashef.cs374@cs.stanford.edu

Course Requirements
Overview:
  • Students taking the class for three units can choose to do either of the following:
    • One presentation, one critique and two summaries
    • Two presentations (if enough slots available)
  • Students taking the class for two units can choose to do either of the following:
    • One presentation and one critique
    • One presentation and two summaries
  • The attendance requirement applies to all students in the class.

Details:
  • Presentation. The main course requirement is to select a topic and give a presentation based on two papers on the topic. In general, students should pick a topic from the list given below, but if a topic that is of particular interest to you is not on the list, feel free to suggest it to the instructor and TA. The instructor and TA will meet with each student to help with the preparation, and ensure that the resulting presentation will be interesting and accessible to students in the class who are not experts in the given topic. Most of the topics have a strong algorithmic flavor, but some topics are more geared towards biology. Please send the slides as a PDF to the TA on the day of your presentation. If you would like to present from PowerPoint, please send the slides in .ppt or .pptx format as well.

  • Critique. For this requirement, you should write a short critique of one of the papers that will be presented in lecture. The critique has to be written and submitted before the topic is presented in class. You will be assigned one class for the critique. You should choose one of the papers that will be presented during that class, read and understand the paper, and then write a critique. The critique should be 2 to 3 pages long (using a 12pt font and standard page setup). The critique must be submitted before class by emailing a PDF file to the TA. The critique should not just be a summary of the paper, but rather demonstrate critical thinking about the content of the paper. Points that can be discussed in the critique include the methodology used in the paper, the quality of the results and their validation, possible shortcomings, alternative approaches to solving the problem, ideas for the extension of the presented approach, and comparisons between the paper and other work in the area. You are encouraged to discuss ideas for your critique with the TA. Keep in mind that your critique constitutes an original text; verbatim copying from any sources is not allowed.

  • Summary. For this requirement, you need to select two lectures: one from the class dates between 10/4 and 11/1, and one from the class dates between 11/3 and 12/8. These two summaries will be due 11/5 and 12/9, respectively, before midnight. For each summary, you have to find one paper in addition to the ones presented; the paper must be related to the topic and relatively recent (published after 2007, with exceptions given for particularly interesting and relevant papers from earlier in this millennium). Then, you must write a single-page summary, with 1/3 of the page summarizing what the paper presents and 2/3 of the page discussing how it relates to the papers presented in class. Refer to the sample entry for an example of what the summary should look like in terms of format and structure. The summaries should be submitted by emailing a PDF file to the TA. You do not need to sign up for summaries. If you are not sure whether a paper is suitable for a summary, or need other advice, feel free to contact the TA. Keep in mind that your summaries constitute an original text; verbatim copying from any sources (including the paper you are summarizing) is not allowed.

  • Attendance. As this is a seminar-style class, attendance is mandatory, although each student can miss up to two classes without affecting his/her grade. You are required to attend the full class period. We will circulate an attendance sheet during each class. If you arrive late or have to leave early, please note this on the attendance sheet. It is an Honor Code violation to sign the attendance sheet on behalf of somebody else, to ask somebody else to sign the attendance sheet if you are not attending the class, or to indicate that you attended the whole class if you missed part of it.

Getting started...
We are assigning presentation topics and critique dates on a first-come, first-served basis. Your order of preference for presentation dates will be accommodated as well as possible, but the course staff will choose among students' preferred presentation dates to roughly structure the flow of lecture topics. Please note that your presentation, critique and summaries all need to be on distinct classes (and topics). Critiques are assigned by date, not by topic. To sign up for a presentation, you need to find a topic and a date. Consult the topic list and the schedule, which we will try to update on a rolling basis as soon as assignments are made.

All preliminary topic choices will be posted by Tuesday, September 27. After 9/27, but no later than Friday, 9/30, you should send the following to den7.cs374@cs.stanford.edu:
  1. A list of five topics, in order of preference.
  2. A list of five presentation dates, in order of preference.
  3. A list of five dates for the critique (if applicable).
  4. Whether you are taking the class for two or for three units, and which requirements you want to complete.
Flexibility in terms of possible dates is greatly appreciated. You cannot critique on the same day you are presenting, but giving overlapping dates is fine.

Honor Code
The Stanford Honor Code applies to every document you submit for this class. In particular, please be careful not to plagiarize the papers you are presenting or summarizing. Make sure that you always correctly cite your sources (for example, when you show figures or use illustrations in your presentations). Summaries and critiques must be written using your own words, not by copying text from the respective papers. If you need to cite text verbatim from some source, always put it in quotes and mention the source. If you have any questions about this, or in case of doubt, please contact the TA.

Communication
Questions should be sent to the instructor and the TA directly by email, or communicated to course staff in person after lecture or during office hours.

Topics
The following is a tentative list of topics:

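| # | Area | Topic description | Papers | Assigned to |
|---|------|-------------------|--------|-------------|
| 1 | Sequencing | Sequencing technologies | | Kevin Dalton |
| 2 | Sequencing | De novo assembly | | Alex Morgan |
| 3 | Sequencing | Short read alignment | | Jaehyun Park |
| 4 | Sequencing | Alternative splicing and RNA isoforms | | Jesse Rodriguez |
| 5 | Sequencing | Structural variation | | |
| 6 | Sequencing | Sequencing extinct human ancestors | | |
| 7 | Sequencing | DNA sequence compression | | Michael Chung |
| 8 | Sequencing | Metagenomics | | Juan-Carlos Foust |
| 9 | Sequencing | MicroRNA folding prediction algorithms | | George Michopoulos |
| 10 | Population genomics and evolution | Identifying population structure and evolution | | |
| 11 | Population genomics and evolution | Signs of different types of selection | | |
| 12 | Population genomics and evolution | Comparative genomics and evolution | | |
| 13 | Population genomics and evolution | Evolution in dogs | | Abhinay Nagpal |
| 14 | Population genomics and evolution | Ancestry inference | | |
| 15 | Population genomics and evolution | Identity-by-descent inference and mapping | | |
| 16 | Comparing multiple genomes | Multiple genome alignment | | |
| 17 | Beyond sequencing | Chromatin structure | | |
| 18 | Beyond sequencing | Epigenetics | | |
| 19 | Systems biology | Identifying biological interactions | | Simon Ye |
| 20 | Systems biology | Detecting significant network structures | | Serene Kosaraju |
| 21 | Systems biology | Reconstructing signaling with Bayesian inference | | Juthika Dabholkar |
| 22 | Systems biology | Deciphering cell differentiation | | |
| 23 | Systems biology | Game theory for social behavior and dynamics | | Laney Kuenzel |
| 24 | Personalized medicine | Genome-wide association studies: methods and applications | | |
| 25 | Personalized medicine | Privacy in genome-wide association studies | | Yongwhan Lim |
| 26 | Personalized medicine | Clinical risk from common variants and environmental factors | | |
| 27 | Personalized medicine | Identifying disease genes | | |
| 28 | Personalized medicine | Missing heritability in genome-wide association studies | | |
| 29 | Personalized medicine | Imputing missing data and haplotype phase | | |
| 30 | Cancer | Dysregulated subnetworks in cancer | | |
| 31 | Cancer | Finding cancer driver mutations | | Rifat Joyee |
| 32 | Cancer | Tumor normal algorithms | | |
| 33 | Gene regulation | Interpreting evidence for cis-regulation | | |
| 34 | Miscellaneous topics | Cell tracking using imaging | | |
| 35 | Miscellaneous topics | Synthetic biology | | Bob Arrigo |
| 36 | Miscellaneous topics | Self-assembly of DNA | | |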

Schedule
The following schedule will be updated as the quarter progresses. Please check back often for the latest material.

| # | Date | Title | Presenter | Summaries | Critique |
|---|------|-------|-----------|-----------|----------|
| 1 | 9/27 | Introduction | Serafim Batzoglou, Daniel Newburger | | No critique |
| 2 | 9/29 | Introduction (cont.) | Serafim Batzoglou | | No critique |
| 3 | 10/4 | Sequencing Technologies | Kevin Dalton | | |
| 4 | 10/6 | Identifying biological interactions | Simon Ye | | |
| 5 | 10/11 | Game theory for social behavior and dynamics | Laney Kuenzel | | Abhinay Nagpal |
| 6 | 10/13 | DNA Sequence Compression | Michael Chung | | Simon Ye |
| 7 | 10/18 | Reconstructing signaling with Bayesian inference | Juthika Dabholkar | | Michael Chung |
| 8 | 10/20 | Genome-wide association studies and heritability | Marc Schaub | | |
| 9 | 10/25 | MicroRNA folding prediction algorithms | George Michopoulos | | Juan-Carlos Foust |
| 10 | 10/27 | Short read alignment | Jaehyun Park | | Rifat Joyee |
| 11 | 11/1 | Synthetic biology | Bob Arrigo | | |
| 12 | 11/3 | Privacy in genome-wide association studies | Yongwhan Lim | | Bob Arrigo |
| 13 | 11/8 | High-throughput population genetics | Marc Schaub | | |
| 14 | 11/10 | Metagenomics | Juan-Carlos Foust | | Kevin Dalton |
| 15 | 11/15 | Alternative splicing and RNA isoforms | Jesse Rodriguez | | Serene Kosaraju |
| 16 | 11/17 | Detecting significant network structures | Serene Kosaraju | | Jaehyun Park |
| 17 | 11/22 | No class (Thanksgiving Break) | | | |
| 18 | 11/24 | No class (Thanksgiving Break) | | | |
| 19 | 11/29 | Evolution in dogs | Abhinay Nagpal | | Juthika Dabholkar |
| 20 | 12/1 | Finding cancer driver mutations | Rifat Joyee | | Laney Kuenzel |
| 21 | 12/6 | Methods for detecting recent positive selection and what they can tell us about disease | Erik Corona (Guest Lecturer) | | |
| 22 | 12/8 | De Novo Assembly | Alex Morgan | | George Michopoulos |

Sample entry from previous year:

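| Title | Presenter | Summaries | Critique |
|-------|-----------|-----------|----------|
| Comparative Genomics and Evolution | Max Libbrecht | 1 2 3 4 5 | Daniel Newburger |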