MB&B 447b3 (747b3)

Go to New Homepage for Spring 1999 Course

Contents of Old Spring 1998 Homepage follow below.

Brief Description

Computational analysis of gene sequences and protein structures on a large scale. Topics include sequence alignment, biological database design, geometric analysis of protein structure, and macromolecular simulation. [Blue Book Entry]


Mark Gerstein
MB&B Department, Bass 432A, Yale University, New Haven, CT 06520
Phone: 203 432-6105, E-mail: Mark.Gerstein@yale.edu

Class Handouts and Other Documents

Handouts and readings are available from Janice Murphy (432-5600, Janice.Murphy@yale.edu) in Bass 420.

For all on-line documents go to http://bioinfo.mbb.yale.edu/course/classes

Survey. To be completed by Friday 1/16/98 by both for-credit and non-credit attendees.
Statistical Analysis of Survey Results [html] [pdf]

Expanded Description

This course will provide an overview of bioinformatics, the application of computational methods to interpret the rapidly expanding amount of biological information. Following the natural flow of this information in the cell, the course will begin with the analysis of gene sequences and progress to the study of protein structures. The classic dynamic programming method of sequence alignment will be presented first, and then it will be shown how this can be extended to allow rapid searching and scoring of the thousands of sequences in a genome. This will naturally lead to the question of how large amounts of biological information can be intelligently organized into a database. Discussion of sequence-structure relationships will form the bridge to protein structure. Particular emphasis will be placed here on statistically based "predictions" of secondary structure. For the analysis of 3D structures, mathematical constructions, such as Voronoi polyhedra, will be presented for calculating simple geometric quantities, such as distances, angles, axes, areas, and volumes. Finally, it will be shown how these simple quantities can be related to the basic properties of proteins and this will naturally lead to a brief overview of the more physical calculations that are possible on protein structures, namely molecular dynamics and Monte Carlo simulation.
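As a rough illustration of the dynamic programming method of sequence alignment mentioned above, the following is a minimal sketch of global alignment scoring in the style of Needleman and Wunsch. It is not taken from the course materials. The function name and the scoring values (match +1, mismatch -1, linear gap penalty -2) are assumptions chosen for illustration only; real applications typically use substitution matrices and more elaborate gap penalties.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Return the optimal global alignment score of sequences a and b.

    The scoring parameters are illustrative assumptions, not values
    prescribed by the course.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * cols for _ in range(rows)]

    # Aligning a prefix against the empty sequence costs one gap per residue.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    # Fill the table: each cell extends the best of three smaller alignments.
    for i in range(1, rows):
        for j in range(1, cols):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            diag = score[i - 1][j - 1] + pair  # align a[i-1] with b[j-1]
            up = score[i - 1][j] + gap         # a[i-1] aligned to a gap
            left = score[i][j - 1] + gap       # b[j-1] aligned to a gap
            score[i][j] = max(diag, up, left)

    return score[-1][-1]


if __name__ == "__main__":
    print(global_alignment_score("GATTACA", "GATTACA"))
    print(global_alignment_score("AC", "AGC"))
```

The table has (len(a) + 1) x (len(b) + 1) cells and each cell is filled in constant time, so the running time grows with the product of the two sequence lengths. That cost is one reason faster heuristic methods are needed for searching against the thousands of sequences in a genome.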


The course will meet during the first half of the 1998 Spring term, from 1/12/98 to 2/27/98, for 2.5 hours per week.

First meeting: Bass 405, 9:30-10:20, Monday 1/12/98.
Second meeting: Bass 405, 9:30-10:20, Wednesday 1/14/98.
Third meeting: Bass 405, 9:05-10:20, Monday 1/19/98.
Remainder of meetings in Bass 405 on Mondays and Wednesdays from 9:05 am to 10:20 am.

This module is related to the MB&B module on computational crystallography (MB&B 460b4), which follows at the same time and place in the second half of the spring semester.

Rough Outline of Topics

  1. General Overview
  2. Sequences I: Alignment via Dynamic Programming
  3. Sequences II: Multiple Alignment and Consensus Patterns
  4. Sequences III: Scoring schemes and Matching statistics
  5. Sequences IV: Secondary Structure Propensities and Prediction
  6. Structures I: Basic Protein Geometry and Least-Squares Fitting
  7. Structures II: Calculation of Volume and Surface
  8. Structures III: Structural Alignment
  9. Structures IV: Molecular Dynamics & Monte Carlo
  10. Databases I: Relational Database Concepts
  11. Databases II: Protein Domains and Modules
  12. Databases III: Clustering and Trees
  13. Databases IV: Large-scale Censuses and Genome Comparisons
  14. Summary Lecture

[More Detailed Outline, Tentative]

Readings, general

Readings will be excerpted from a number of original research papers. In addition, sections from the following books will be used:

Sequence Analysis Primer by Gribskov & Devereux

Database System Concepts by Korth & Silberschatz

Dynamics of Proteins & Nucleic Acids by McCammon & Harvey.

Work Required

Approximately 25-30 pages of reading will be required each week. Students will be evaluated on the basis of a final paper/project, class participation, and 1-2 short exercises.


Students should have:
1. A basic knowledge of biochemistry and molecular biology.
2. A knowledge of basic quantitative concepts, such as single-variable calculus, some probability and statistics, and basic programming skills.
These requirements are covered by the following prerequisites statement: "Prerequisites: Biol. 122b and Mathematics 115 or permission of the instructor."

A more detailed discussion of background knowledge is in the first lecture and in the survey.

Summer Jobs in Bioinformatics

If you're really motivated, take a look at http://bioinfo.mbb.yale.edu/jobs.
