Assignment #7

Hello all,

Today’s assignment is a short one highlighting some of the multiple sequence alignment techniques that we discussed in Monday’s lecture.  Expect to work through this exercise during the first hour or so of today’s class, after which we will discuss the results as a group.

Assignment #7: MCB_5472_Assignment_7 (pdf) MCB_5472_Assignment_7 (docx)

Lecture: MCB5472_Assignment_7_lecture_Mar-5-14 (pptx) MCB5472_Assignment_7_lecture_Mar-5-14 (pdf)

The multiple sequence file that you will use as input is: BALiBASE3_faa

The corresponding reference alignment for these sequences is: BALiBASE3_msf

“.msf” files are a common multiple sequence alignment format. You can use the handy “seqret” program to convert this format into something else (and vice versa). All of the multiple sequence alignment programs that we will use in this exercise output aligned multi-FASTA files by default, and these can also be read by most alignment viewers.
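
If you want to sanity-check an aligned multi-FASTA file yourself before opening it in a viewer, something along these lines may help. This is only a minimal sketch, written in Python rather than the Perl we use in class; the function names and the toy sequences are made up for illustration and are not taken from the BALiBASE files. It assumes that every row of a valid alignment has the same number of columns once gaps are included.

```python
def read_aligned_fasta(text):
    """Parse aligned multi-FASTA text into {sequence_id: aligned_sequence}."""
    alignment = {}
    current_id = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # The ID is the first whitespace-delimited token of the header.
            current_id = line[1:].split()[0]
            alignment[current_id] = ""
        elif current_id is not None:
            # Sequence lines may be wrapped, so append to the current record.
            alignment[current_id] += line
    return alignment


def alignment_length(alignment):
    """Return the shared column count, or raise if rows differ in length."""
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("rows differ in length (or no rows); not a valid alignment")
    return lengths.pop()


# Made-up toy alignment for illustration only.
example = """>seq1 toy
MK-LV
A
>seq2 toy
MKALV
-
"""
aln = read_aligned_fasta(example)
print(alignment_length(aln))
```

The same check is a quick way to confirm that a format conversion did not drop or truncate any rows.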

Lecture #6 Mar 3/14

Hello all,

Happy March to everyone!  This week’s lecture will be on sequence alignment, demonstrating basic principles, variation between methods, their effects on downstream analysis, and some pragmatic ways to deal with it all.  This will be the final lecture before the midterm.  Next week will be a review session, so come prepared with the questions that you would like me to cover!

MCB5472_Lecture_6_Mar-3-14 (pptx)

MCB5472_Lecture_6_Mar-3-14 (pdf)

Lecture #5 Feb 24/14

Hello all – Please find today’s lecture below.  We will be taking a wild romp through gene prediction and annotation, especially focusing on differences between the various methods in order to understand how far we should trust these data.  We will also look at a large number of different annotated databases that are good starting places for all kinds of projects in molecular evolution.

MCB5472_Lecture_5_Feb-24-14 (pptx)

MCB5472_Lecture_5_Feb-24-14 (pdf)

Assignment #5

Hello all,

This week’s assignment has two parts. In the first, we will do one last experiment with our E. coli dataset, finding orthologous proteins (reciprocal BLAST hits) between the two complete genomes and counting the conserved and unique proteins in each genome. An additional aim of this exercise is to reinforce the utility of hashes. The second half of this exercise is a fairly brief comparison of different BLAST applications, showing the power of PSI-BLAST for finding distant homologs.
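
To give a feel for how hashes help here, the sketch below shows one possible way to pair up reciprocal hits. It is written in Python (where a hash is called a dict) rather than Perl, and it rests on assumptions that are not part of the assignment itself: hits are given as (query, subject, bit score) tuples, "best" means highest bit score, and all of the protein names and scores are invented toy data.

```python
def best_hits(blast_rows):
    """Map each query to its highest-scoring subject.

    blast_rows: iterable of (query, subject, bit_score) tuples.
    """
    best = {}
    for query, subject, score in blast_rows:
        # Keep the hit only if it beats the best score seen so far.
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return the set of (a, b) pairs that are each other's best hit."""
    best_ab = best_hits(a_vs_b)
    best_ba = best_hits(b_vs_a)
    return {(a, b) for a, b in best_ab.items() if best_ba.get(b) == a}


# Invented toy hits: genome A searched against B, and B against A.
a_vs_b = [("A1", "B1", 200.0), ("A1", "B2", 50.0),
          ("A2", "B2", 120.0), ("A3", "B1", 80.0)]
b_vs_a = [("B1", "A1", 198.0), ("B2", "A2", 118.0), ("B2", "A1", 45.0)]

pairs = reciprocal_best_hits(a_vs_b, b_vs_a)

# Proteins in genome A with no reciprocal partner count as unique to A.
all_a = {"A1", "A2", "A3"}
unique_a = all_a - {a for a, _ in pairs}
print(len(pairs), sorted(unique_a))
```

Because each lookup in the dict takes roughly constant time, the comparison stays fast even with thousands of proteins per genome.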

Assignment: MCB_5472_Assignment_5

Lecture: MCB5472_Assignment_5_lecture_Feb-19-14 (pdf) MCB5472_Assignment_5_lecture_Feb-19-14 (ppt)

Assignment #4

Hello all,

Please find assignment #4 attached here: MCB_5472_Assignment_4.  This week we will be starting to work with command line BLAST, using it to answer some interesting questions about our genomes from assignment #3.  We will also be continuing to develop our Perl skills, especially using arrays to parse the BLAST output and hashes to tabulate it into something useful.  This assignment is due before class on Wednesday Feb 19/14.
The corresponding lecture can be found here: MCB5472_Assignment_4_lecture_Feb-12-14 (pptx) MCB5472_Assignment_4_lecture_Feb-12-14 (pdf)
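
As a rough illustration of the "arrays to parse, hashes to tabulate" idea, here is a minimal sketch in Python (lists and dicts standing in for Perl arrays and hashes). It assumes the BLAST output is in the standard 12-column tabular format (BLAST+ `-outfmt 6`), where the query ID is column 1 and the E-value is column 11; the assignment may use a different format. The function name, the E-value cutoff and the toy lines are all made up for illustration.

```python
from collections import defaultdict


def count_hits_per_query(lines, max_evalue=1e-5):
    """Count hits per query at or below an E-value cutoff."""
    counts = defaultdict(int)
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        # Split the row into a list of fields (the "array" step).
        fields = line.split("\t")
        query, evalue = fields[0], float(fields[10])
        if evalue <= max_evalue:
            # Tally by query ID (the "hash" step).
            counts[query] += 1
    return dict(counts)


# Invented toy rows in 12-column tabular layout.
toy = [
    "geneA\tsubj1\t98.5\t200\t3\t0\t1\t200\t1\t200\t1e-100\t380",
    "geneA\tsubj2\t40.0\t150\t90\t2\t5\t154\t10\t160\t2e-08\t55.1",
    "geneB\tsubj3\t30.0\t80\t56\t1\t1\t80\t3\t82\t0.5\t28.0",
]
hit_counts = count_hits_per_query(toy)
print(hit_counts)
```

In the toy data, geneB's only hit has an E-value above the cutoff, so geneB does not appear in the tally at all.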