Author: Jonathan Klassen

Assignment #7

Hello all,

Today’s assignment is a short one highlighting some of the multiple sequence alignment techniques that we discussed in Monday’s lecture. Expect to work through this exercise during the first hour or so of today’s class, after which we will discuss the results as a group.

Assignment #7: MCB_5472_Assignment_7 (pdf) MCB_5472_Assignment_7 (docx)

Lecture: MCB5472_Assignment_7_lecture_Mar-5-14 (pptx) MCB5472_Assignment_7_lecture_Mar-5-14 (pdf)

The multiple sequence file that you will use as input is: BALiBASE3_faa

The corresponding reference alignment for these sequences is: BALiBASE3_msf

“.msf” files are a common multiple sequence alignment format. You can use the handy “seqret” program to convert this format into another one (and vice versa); see: http://emboss.open-bio.org/rel/dev/apps/seqret.html. All of the multiple sequence alignment programs that we will use in this exercise output aligned multi-FASTA files by default, and most alignment viewers can read these as well.
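
As a quick sanity check on an aligned multi-FASTA file, the sketch below reads the records into a dictionary and confirms that every row has the same length, as it must in a valid alignment. It is written in Python rather than the Perl used in the assignments, and the function names and the three toy sequences are hypothetical, chosen only for illustration.

```python
from io import StringIO


def read_fasta(handle):
    """Return a dict mapping each sequence ID to its (possibly gapped) sequence."""
    records = {}
    name = None
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first word of the header line as the sequence ID.
            name = line[1:].split()[0]
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {n: "".join(parts) for n, parts in records.items()}


def alignment_length(records):
    """Return the shared row length, or raise if the rows differ in length."""
    lengths = {len(seq) for seq in records.values()}
    if len(lengths) != 1:
        raise ValueError("rows have different lengths; not a valid alignment")
    return lengths.pop()


# Toy aligned input (hypothetical sequences, gaps shown as '-').
example = StringIO(">seq1\nMKV-LA\n>seq2\nMK-ILA\n>seq3\nM-VILA\n")
aligned = read_fasta(example)
print(alignment_length(aligned))
```

For a real file, pass an open file handle to `read_fasta` in place of the `StringIO` object.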

Lecture #6 Mar 3/14

Hello all,

Happy March to everyone! This week’s lecture will be on sequence alignment, demonstrating basic principles, variation between methods, their effects on downstream analysis and some pragmatic ways to deal with it all. This will be the final lecture before the midterm. Next week will be a review session, so come prepared with the questions that you would like me to cover!

MCB5472_Lecture_6_Mar-3-14 (pptx)

MCB5472_Lecture_6_Mar-3-14 (pdf)

Lecture #5 Feb 24/14

Hello all – Please find today’s lecture below.  We will be taking a wild romp through gene prediction and annotation, especially focusing on differences between the various methods in order to understand how far we should trust these data.  We will also look at a large number of different annotated databases that are good starting places for all kinds of projects in molecular evolution.

MCB5472_Lecture_5_Feb-24-14 (pptx)

MCB5472_Lecture_5_Feb-24-14 (pdf)

Assignment #5

Hello all,

This week’s assignment has two parts. In the first, we will do one last experiment with our E. coli dataset, using reciprocal BLAST hits to find orthologous proteins between the two complete genomes and counting the conserved and unique proteins in each genome. An additional aim of this exercise is to reinforce the utility of hashes. The second part is a fairly brief comparison of different BLAST applications, showing the power of PSI-BLAST for finding distant homologs.

Assignment: MCB_5472_Assignment_5

Lecture: MCB5472_Assignment_5_lecture_Feb-19-14 (pdf) MCB5472_Assignment_5_lecture_Feb-19-14 (ppt)
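
The reciprocal-hit logic from the first part can be sketched with two lookup tables, one per search direction. This is only an illustration in Python (dictionaries play the role of Perl hashes); the protein IDs and best-hit tables are made up, and in practice they would come from parsing your own BLAST output.

```python
def reciprocal_best_hits(best_a_to_b, best_b_to_a):
    """Return sorted (a, b) pairs where a's best hit is b and b's best hit is a."""
    return sorted(
        (a, b)
        for a, b in best_a_to_b.items()
        if best_b_to_a.get(b) == a
    )


# Hypothetical protein sets for two genomes.
proteins_a = {"A1", "A2", "A3", "A4"}
proteins_b = {"B1", "B2", "B3"}

# Hypothetical best-hit tables: query protein -> best-scoring subject protein.
best_a_to_b = {"A1": "B1", "A2": "B2", "A3": "B2"}
best_b_to_a = {"B1": "A1", "B2": "A2", "B3": "A4"}

pairs = reciprocal_best_hits(best_a_to_b, best_b_to_a)

# Proteins in a reciprocal pair are counted as conserved; the rest as unique.
unique_a = proteins_a - {a for a, _ in pairs}
unique_b = proteins_b - {b for _, b in pairs}

print(len(pairs), "conserved pairs")
print(len(unique_a), "unique to genome A;", len(unique_b), "unique to genome B")
```

Note that A3 hits B2 but B2's best hit is A2, so A3 is not counted as conserved; this asymmetry is exactly what the reciprocal test is meant to catch.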

Assignment #4

Hello all,

Please find assignment #4 attached here: MCB_5472_Assignment_4. This week we will start working with command line BLAST, using it to answer some interesting questions about our genomes from assignment #3. We will also continue to develop our Perl skills, especially using arrays to parse the BLAST output and hashes to tabulate it into something useful. This assignment is due before class on Wednesday Feb 19/14.
The corresponding lecture can be found here: MCB5472_Assignment_4_lecture_Feb-12-14 (pptx) MCB5472_Assignment_4_lecture_Feb-12-14 (pdf)
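
The parse-then-tabulate pattern described above might look something like the following sketch. It is in Python rather than Perl (lists and dictionaries stand in for arrays and hashes), and it assumes BLAST+ tabular output (`-outfmt 6`), where column 1 is the query ID, column 2 the subject ID, column 11 the E-value and column 12 the bit score. The function name, the E-value cutoff and the four example lines are hypothetical and may not match the format that the assignment asks for.

```python
from collections import defaultdict


def tabulate_hits(lines, max_evalue=1e-5):
    """Count hits per query and record each query's highest-scoring subject."""
    hit_counts = defaultdict(int)
    best_hit = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")  # one list of columns per hit
        query, subject = fields[0], fields[1]
        evalue, bitscore = float(fields[10]), float(fields[11])
        if evalue > max_evalue:
            continue  # ignore hits weaker than the cutoff
        hit_counts[query] += 1
        if query not in best_hit or bitscore > best_hit[query][1]:
            best_hit[query] = (subject, bitscore)
    return dict(hit_counts), best_hit


# Made-up tabular BLAST lines for illustration only.
example_lines = [
    "q1\ts1\t98.5\t200\t3\t0\t1\t200\t1\t200\t1e-100\t380",
    "q1\ts2\t45.0\t180\t90\t5\t10\t190\t5\t185\t2e-20\t95.1",
    "q2\ts3\t30.0\t50\t35\t0\t1\t50\t1\t50\t0.5\t25.0",
    "q2\ts4\t60.0\t150\t60\t0\t1\t150\t1\t150\t3e-40\t160",
]

counts, best = tabulate_hits(example_lines)
for query in sorted(counts):
    print(query, counts[query], best[query][0])
```

To run it on a real file, pass an open file handle instead of `example_lines`.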

Assignment #3

Hello all,

As you have likely heard, UConn is closed tomorrow and so we will not be having class. There will be no make-up lecture; however, I am posting a new assignment that builds on the Perl techniques you have already learned and applies them to some of the genome quality issues that we discussed on Monday. The new assignment will be due before class on Wednesday Feb 12/14 and can be found here: MCB_5472_Assignment_3

Questions #1 and #2 from Assignment #2 will be due at 1pm on Thursday Feb 6/14 to accommodate UConn’s closure tomorrow.  Question #3 from this assignment has been moved to Assignment #3.

As always, please post your questions here or email Peter or me directly. Please note that if your question has already been answered on the website we will just refer you there, so you may as well make it your first stop!

Lecture #2 Feb 3/14

Hello all – Please find attached today’s lecture.  We will be covering a bit more about GenBank, and then taking a whirlwind tour of DNA sequencing and assembly.  The goal of this latter section is to understand how the methods used to sequence and assemble a particular genome can affect the quality of the input data used for any evolutionary analysis and how to deal with this problem.

Lecture #2 pdf: MCB5472_Lecture_2_Feb-3-14

Lecture #2 ppt: MCB5472_Lecture_2_Feb-3-14