Thank you for your interest in ECS 124, which will be taught for the first time in spring 2000. The syllabus is contained below. Note that there are some changes in the required and supplemental books from the original "official" syllabus found at:

http://www.cs.ucdavis.edu/instruction/exp_course_desc/124.html

The lecture is MWF 2:10-3:00pm. Classes start March 31. The exam is June 16 at 4pm. Two sections of the lab will be offered, one M 10-10:50am (CRN 75732) and the other M 11-11:50am (CRN 76023). Both are for enrolled students only. The first lab is Monday, April 3. We may organize a few help sessions for people not familiar with Unix. For auditors, almost all the labs can be done on any machine with web access and Perl.

ECS 124 THEORY AND PRACTICE OF BIOINFORMATICS (4) III

Lecture: 3 hours
Laboratory: 1 hour

Prerequisites: Course CSE 10 or 30 or E 5 or E 6; Stat 12 or 13 or 32 or 100 or Math 131/Stat 131A; Bio Sci 1A or MCB 10

Grading: Letter; 5-7 homework/laboratory sets (60%), final (40%)

Catalog Description:
Fundamental biological, mathematical and algorithmic models underlying bioinformatics; sequence analysis, database search, gene prediction, molecular structure comparison and prediction, phylogenetic trees, high throughput biology, massive datasets; applications in molecular biology and genetics; use and extension of common bioinformatics tools.

Goals:
  I.   Understanding the role and utility of bioinformatics in modern biology
  II.  Understanding basic biological, mathematical and algorithmic concepts, techniques and models underlying bioinformatics tools
  III. Mastery of common bioinformatics tools
  IV.  Simple programming in Perl to extend the utility of common bioinformatics tools

Expanded Course Description:
  I. Initial examples of the power of bioinformatics in modern biology
     A. The importance of sequence and structure comparison and of database search
     B. The use of sequence analysis in laboratory protocols
     C. The use of phylogenetics in evolution and non-evolutionary areas of biology
  II. Sequence analysis
     A. Probabilistic and biological models underlying sequence alignment
     B. Computational efficiency and the need for compromises in the models
     C. The general technique of Dynamic Programming
     D. Pairwise sequence alignment - algorithms for global, local alignment and variations
     E. Algorithms for multiple sequence alignment and the identification/use of motifs
     F. Database search - FASTA, BLAST, PSI-BLAST, scoring matrices, statistical significance and its significance
     G. Creation and use of motif models
     H. Novel uses of sequence analysis in studying DNA, RNA and proteins
     I. Sequence analysis in genomics and high throughput biology
  III. Phylogenetic algorithms
     A. Probabilistic and ideal-data models underlying phylogenetic algorithms
     B. Distance-based methods
     C. Character/parsimony-based methods
     D. Maximum-likelihood methods
     E. PAUP, PHYLIP
     F. Evolutionary and non-evolutionary uses for phylogenetics
     G. The interaction of phylogenetics and sequence analysis
  IV. Protein and RNA structure comparison and prediction
     A. Ideal-data models underlying structure comparison and prediction
     B. Algorithms for RNA folding
     C. Methods and problems in protein structure comparison and prediction
     D. Biological use of structure prediction and comparison tools
  V. Overview of common bioinformatics utilities and web-based resources such as GCG and Entrez

Textbooks:
  Required Texts
     A. Johnson, Elements of Programming with Perl, Manning Publications, 2000
     D. Gusfield and K. Stevens, Notes for an Undergraduate Course on Bioinformatics, 200?, distributed on-line
     Additional library readings available on-line
  Supplemental Texts
     R. Durbin et al., Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids, Cambridge Press, 1998
     A. Baxevanis and B. Ouellette, Bioinformatics: A Practical Guide to the Analysis of Genes and Proteins, Wiley-Interscience, 1998
     M. Bishop and C. Rawlings, DNA and Protein Sequence Analysis: A Practical Approach, IRL Press, 1997
     D. Gusfield, Algorithms on Strings, Trees and Sequences: Computer Science and Computational Biology, Cambridge Press, 1997

Homework:
Each homework set includes creative problems as well as recitation problems to strengthen understanding and discover new material.

Computer Usage:
The lab portion of the class will emphasize practical computer exercises, both using established bioinformatics software and writing simple programs in Perl.

ABET Category Content:
  Engineering Science: 1 unit
  Engineering Design: 0 units

Instructor: Dan Gusfield
Prepared by: Dan Gusfield (April 1999)

The course is aimed at both biology and computer science students. It is expected that the typical biology student will have a stronger background in molecular biology, genetics and biochemistry (not listed as a prerequisite) than is reflected in the prerequisite list, and that the computer science student will have a stronger background in programming and mathematics than is listed in the prerequisites. Some of the laboratory assignments will be done by groups mixing biology and computer science students whose backgrounds should complement each other.

The laboratory portions of the course will teach the hands-on computer tools and some programming in Perl, while the lectures will focus on the fundamental biological, mathematical and algorithmic chain of reasoning leading to the models that underlie these tools. Thus, the course requires some sophistication in mathematics and some intuitive understanding of what computer programming is (prior exposure to some computer programming is required, and prior exposure to Unix is desirable), but we do not assume an extensive background in programming. No prior experience in Perl is assumed.

Facility in using a web browser is assumed. I will use Netscape in lab, but others can be used. Almost all the lab assignments can be done on any machine that has web access and Perl. The scheduled time for the computer lab is intended to get students started on their computer work, but additional computer work outside the lab time is expected.