CS374

Course Description
This course will cover algorithms and computational models applied to molecular biology. Current, exciting algorithms from a variety of biological areas will be covered. The topics should be of interest to computer scientists and biologists alike. We will cover topics from genomics and evolution of DNA, such as sequence comparison methods, annotating DNA with genes and evolutionarily important elements, genomic rearrangements, microarray analysis, and new sequencing technologies. We will also cover topics from protein structure, protein surface and interactions modeling, multiple alignment of proteins, phylogenetic trees, and DNA-based computation. The course will consist primarily of student presentations of topics in the syllabus, which will be prepared with the help of the instructor. Students will help form the syllabus by choosing the topics they would like to present.

Class Schedule
Lecture: Tue Thu 3:15pm-4:30pm, Clark Center S361. There is no Friday section this year.

Instructor
Serafim Batzoglou
Office: Clark Center S266
Office hours: Tue 1:15pm-3:15pm
Phone: (650) 723-3334
Email: ude.drofnats@mifares (written backwards to avoid spam)

Teaching Assistant
George Asimenos
Office: Clark Center S260
Office hours: General questions: Tue 1:15pm-3:15pm. Paper discussion: By appointment.
Phone: (650) 725-6094
Email: ude.drofnats@sonemisa (written backwards to avoid spam)

Prerequisites
The following courses are recommended:
  • CS161: Design and Analysis of Algorithms, or equivalent familiarity with algorithmic and data structure concepts.

  • CS262: Computational Genomics, or CS274: Representations and Algorithms for Computational Molecular Biology, or BIOCHEM218: Computational Molecular Biology, or equivalent familiarity with computational biology concepts, problems and algorithms.

Course Requirements
There are three course requirements:
  1. Presentation. The main course requirement is to select a topic and give a presentation based on two papers on the topic. The instructor and TA will meet with each student to help with the preparation, and to ensure that the resulting presentation will be interesting and accessible to students in the class who are not experts in the given topic. Most of the topics have a strong algorithmic flavor, but some topics are more geared towards biology. To sign up for a presentation, you need to pick a topic and a date. Consult the topic list and the schedule, and then email both the instructor and the TA with subject "CS374, signing up for presentation", listing your choices in order of preference. Requests are handled on a first-come, first-served basis, so you are encouraged to give us more than one choice.

  2. Scribing. The second requirement is scribing a lecture. Lecture notes should give students taking the class a useful resource for recalling the material presented. Ideally, they should be written so that next year's students, who may not have read the presented papers, can follow them. To sign up for scribing a lecture, consult the schedule and then email both the instructor and the TA with subject "CS374, signing up for scribing", listing your choices in order of preference. Once again, requests are handled on a first-come, first-served basis. Refer to the sample entry for an example of what lecture notes should look like in terms of format and organization. We suggest preparing your notes in Microsoft Word using that sample as a template. Your notes are due one week after the lecture, and they should be submitted by emailing a PDF file to the TA; you can also submit any other popular format (Microsoft Word, PostScript), in which case the TA will convert it to PDF. Keep in mind that your notes constitute an original text; verbatim copying from any source is not allowed.

  3. Summary. As a third requirement, you need to select two lectures: one from lectures 3 to 14, and one from lectures 15 to 26. For each lecture, you must find one paper in addition to the two presented; the paper must be related to the topic and relatively recent (after 2001). Then, you must write a single-page summary of what the paper presents and how it relates to the other two. Refer to the sample entry for examples of what the summary should look like in terms of format and structure. Each of your two summaries is due one week after its respective lecture, but it may appear online later, since you may be asked to edit it.

As this is a seminar-style class, attendance is mandatory; each student can miss up to two classes without affecting his/her grade. If you take the class for two units, you can drop requirement (2) or (3) above; or, if enrollment is too high, we will consider dropping (1) if you prefer.

Communication
Questions should be sent to the instructor and the TA directly by email, or communicated to course staff in person after lecture or during office hours.

Topics
The following is a tentative list of topics:

#   Topic description                                   Papers  Assigned to

Large Scale Genome Properties
1   Genomic Rearrangements                                      Nandhini Nandiwada Santhanam
2   Repetitive DNA Detection and Classification                 Vijay Krishnan

Searching Biological Sequence Databases
3   Index-based search of single sequences                      Omkar Mate
4   Multiple indexes and multiple alignments                    Siddharth Jonathan

Sequence Alignment
5   Multiple Sequence Alignment                                 Sarah Aerni
6   Inverse Alignment                                           Bahman Bahmani

Regulatory motif finding
7   Ab initio motif finding                                     Ryo Shimizu
8   Comparative motif finding                                   Mayukh Bhaowal

RNA Structure
9   RNA Secondary Structure Prediction                          Greg Goldgof
10  RNA regulation                                              Marc Schaub
11  RNA finding                                                 Leticia Britos

Phylogenetic Trees
12  Inference of phylogenetic trees
13  Gene trees                                                  Abhita Chugh

Protein Structure
14  Evolution of Multidomain Proteins                           Wissam Kazan
15  Stochastic roadmap simulations of protein kinetics
16  Protein Folding Dynamics
17  Protein Structure Alignment                                 Ramji Srinivasan
18  Machine Learning for Protein Classification                 Ashutosh Saxena

Networks of Protein Interactions
19  Construction of Networks from Diverse Data Sources          Neda Nategh
20  Comparison of Networks Across Species                       Chuan Sheng Foo
21  Properties of Interaction Networks                          Susan Tang

Human Population Genetics
22  Human Migrations                                            Anjalee Sujanani
23  Human Evolution                                             Sharareh Noorbaloochi
24  Human-Chimp Speciation                                      Frank Chan

Computation using DNA and cells
25  Robust Self-Assembly of DNA                                 Eduardo Abeliuk Acuna
26  Transforming Cells into Automata                            Ravi Tiruvury

Biological Data mining
27                                                              Rashmi Raj

Schedule
As the quarter progresses, the following schedule will be updated accordingly. Please check back often for the latest material.

#   Date      Title                                               Presenter                     Summaries          Scribe
1   9/26      Introduction                                        Serafim Batzoglou                                Ravi Tiruvury
2   9/28      A zero-knowledge based introduction to biology      George Asimenos                                  Anjalee Sujanani
3   10/3      RNA Finding                                         Leticia Britos                1 2                Greg Goldgof
4   10/5      RNA Secondary Structure Prediction                  Greg Goldgof                  1                  Chuan Sheng Foo
5   10/10     Human Evolution                                     Sharareh Noorbaloochi         1 2 3              Wissam Kazan
6   10/12     Properties of Interaction Networks                  Susan Tang                    1                  Neda Nategh
7   10/17(a)  Transforming Cells into Automata                    Ravi Tiruvury                 1 2                Rashmi Raj
8   10/17(b)  Index-based search of single sequences              Omkar Mate                                       Abhita Chugh
9   10/19     Multiple indexes and multiple alignments            Siddharth Jonathan                               Susan Tang
10  10/24(a)  Evolution of Multidomain Proteins                   Wissam Kazan                                     Mayukh Bhaowal
11  10/24(b)  Human Migrations                                    Anjalee Sujanani              1 2                Ashutosh Saxena
12  10/26(a)  Comparison of Networks Across Species               Chuan Sheng Foo               1                  Frank Chan
13  10/26(b)  Repetitive DNA Detection and Classification         Vijay Krishnan                1 2                Ryo Shimizu
14  10/31     Biological Data Mining                              Rashmi Raj                    1 2 3 4 5 6 7 8 9  Siddharth Jonathan
15  11/2      Ab initio motif finding                             Ryo Shimizu                                      Eduardo Abeliuk Acuna
16  11/7      Construction of Networks from Diverse Data Sources  Neda Nategh                   1                  Nandhini Nandiwada Santhanam
17  11/9(a)   Genomic Rearrangements                              Nandhini Nandiwada Santhanam  1                  Sarah Aerni
18  11/9(b)   Gene trees                                          Abhita Chugh                                     Vijay Krishnan
19  11/14     Protein Structure Alignment                         Ramji Srinivasan                                 Omkar Mate
20  11/16     Machine Learning for Protein Classification         Ashutosh Saxena               1 2 3 4 5 6 7      Ramji Srinivasan
21  11/28     RNA Regulation                                      Marc Schaub                   1 2 3 4
22  11/30     Robust Self-Assembly of DNA                         Eduardo Abeliuk Acuna
23  12/5(a)   Comparative motif finding                           Mayukh Bhaowal                1 2                Leticia Britos
24  12/5(b)   Human-Chimp Speciation                              Frank Chan                    1 2                Sharareh Noorbaloochi
25  12/7(a)   Multiple Sequence Alignment                         Sarah Aerni                   1 2 3 4 5          Bahman Bahmani
26  12/7(b)   Inverse Alignment                                   Bahman Bahmani                1

Sample entry from previous year:

              Microarray Analysis and Clustering                  Yu Bai                        1 2 3              George Asimenos