MB&B 447b3 (747b3)
BIOINFORMATICS
Go to New Homepage for Spring 1999 Course
Contents of Old Spring 1998 Homepage follow below.
Brief Description
Computational analysis of gene sequences and protein structures on
a large scale. Topics include sequence alignment, biological database design,
geometric analysis of protein structure, and macromolecular simulation.
[Blue Book Entry]
Instructor
Mark Gerstein
MB&B Department, Bass 432A,
Yale University, New Haven, CT 06520
Phone: 203 432-6105, E-mail: Mark.Gerstein@yale.edu
Class Handouts and Other Documents
Handouts and readings are available from Janice Murphy (432-5600,
Janice.Murphy@yale.edu) in Bass 420.
For all on-line documents go to http://bioinfo.mbb.yale.edu/course/classes
Survey. To be completed by Friday 1/16/98
by both for-credit and non-credit attendees.
Statistical Analysis of Survey Results [html]
[pdf]
Expanded Description
This course will provide an overview of bioinformatics, the application
of computational methods to interpret the rapidly expanding amount of biological
information. Following the natural flow of this information in the cell,
the course will begin with the analysis of gene sequences and progress
to the study of protein structures. The classic dynamic programming method
of sequence alignment will be presented first, and then it will be shown
how this can be extended to allow rapid searching and scoring of the thousands
of sequences in a genome. This will naturally lead to the question of how
large amounts of biological information can be intelligently organized into
a database. Discussion of sequence-structure relationships will form the
bridge to protein structure. Particular emphasis will be placed here on
statistically based "predictions" of secondary structure. For the analysis
of 3D structures, mathematical constructions, such as Voronoi polyhedra,
will be presented for calculating simple geometric quantities, such as
distances, angles, axes, areas, and volumes. Finally, it will be shown
how these simple quantities can be related to the basic properties of proteins.
This will lead to a brief overview of the more physical calculations
that are possible on protein structures, namely molecular dynamics and
Monte Carlo simulation.
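As a rough illustration of the dynamic programming approach to sequence alignment mentioned above, here is a minimal sketch of a global alignment score in the Needleman-Wunsch style. It is not taken from the course materials: the function name and the match, mismatch, and gap values are assumptions chosen only for illustration, and a linear gap penalty is used for simplicity.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Return the best global alignment score of strings a and b.

    Scoring values are illustrative assumptions, not course-specified.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * cols for _ in range(rows)]

    # Aligning a prefix against an empty string costs one gap per residue.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    for i in range(1, rows):
        for j in range(1, cols):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            diag = score[i - 1][j - 1] + pair  # align a[i-1] with b[j-1]
            up = score[i - 1][j] + gap         # gap in b
            left = score[i][j - 1] + gap       # gap in a
            score[i][j] = max(diag, up, left)

    return score[-1][-1]


if __name__ == "__main__":
    print(global_alignment_score("GATTACA", "GATACA"))
```

The table has one cell per pair of prefixes, so the cost grows with the product of the two sequence lengths. That cost is the reason faster heuristics are needed for searching the thousands of sequences in a genome.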
Timing
The course will meet during the first half of the 1998 Spring term,
from 1/12/98 to 2/27/98, for 2.5 hours per week.
First meeting: Bass 405, 9:30-10:20, Monday 1/12/98.
Second meeting: Bass 405, 9:30-10:20, Wednesday 1/14/98.
Third meeting: Bass 405, 9:05-10:20, Monday 1/19/98.
Remainder of meetings in Bass 405 on Mondays and Wednesdays from 9:05
am to 10:20 am.
This module relates to some degree to the MB&B module on computational
crystallography (MB&B 460b4), which follows at the same time and place
in the second half of the spring semester.
Rough Outline of Topics
- General Overview
- Sequences I: Alignment via Dynamic Programming
- Sequences II: Multiple Alignment and Consensus Patterns
- Sequences III: Scoring Schemes and Matching Statistics
- Sequences IV: Secondary Structure Propensities and Prediction
- Structures I: Basic Protein Geometry and Least-Squares Fitting
- Structures II: Calculation of Volume and Surface
- Structures III: Structural Alignment
- Structures IV: Molecular Dynamics & Monte Carlo
- Databases I: Relational Database Concepts
- Databases II: Protein Domains and Modules
- Databases III: Clustering and Trees
- Databases IV: Large-scale Censuses and Genome Comparisons
- Summary Lecture
[More Detailed Outline, Tentative]
Readings, general
Readings will be excerpted from a number of original research papers.
In addition, sections from the following books will be used:
Sequence Analysis Primer by Gribskov & Devereux
Database System Concepts by Korth & Silberschatz
Dynamics of Proteins & Nucleic Acids by McCammon & Harvey.
Work Required
Approximately 25-30 pages of reading will be required each week. Students
will be evaluated on the basis of a final paper/project, class participation,
and 1-2 short exercises.
Prerequisites
Students should have:
1. A basic knowledge of biochemistry and molecular biology.
2. A knowledge of basic quantitative concepts, such as single-variable
calculus, some probability and statistics, and basic programming skills.
These can be fulfilled by the following prerequisites statement: "Prerequisites:
Biol. 122b and Mathematics 115 or permission of the instructor."
A more detailed discussion of background knowledge is in the first
lecture and in the survey.
Summer Jobs in Bioinformatics
If you're really motivated, take a look at http://bioinfo.mbb.yale.edu/jobs.