COMP 6811 Bioinformatics Algorithms

Fall 2022 Section FF


Projects

Projects are individual work.

You must submit to EAS (the electronic assignment submission system). You must submit the project in the format requested. No other format will be accepted (except zip, but only use zip if your internet connection makes it necessary to compress the file).

You must submit to the requested location in EAS, such as "programming_assignment 1", and you must submit to COMP6811 (and not some other course).

Late Penalty: There will be a penalty of 10 percent for each day late.


Project 1

Deadline: 10:59 am Friday 21 October 2022
Submission: A zip file, submitted to "programming_assignment 1" in EAS under COMP6811, containing a directory with your source code, tests, and report.

This project examines the clustering of protein sequences.

You will be provided with a collection of proteins as a FASTA file.

You should apply the CD-HIT program with a threshold of 60 percent identity to cluster the sequences.
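CD-HIT writes its clusters to a `.clstr` text file next to its main output. As a minimal sketch of how that file could be read for later comparison, the Python below assumes the usual layout: a `>Cluster N` header line followed by one member line per sequence, with the identifier between `>` and `...`. The function name `parse_clstr` and the sample identifiers are hypothetical, chosen only for illustration.

```python
import re


def parse_clstr(text):
    """Return {cluster_number: [sequence ids]} from the text of a .clstr file."""
    clusters = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">Cluster"):
            # Header line such as ">Cluster 0" starts a new cluster.
            current = int(line.split()[1])
            clusters[current] = []
        elif current is not None:
            # Member line such as "1  118aa, >protB... at 75.00%".
            match = re.search(r">(.+?)\.\.\.", line)
            if match:
                clusters[current].append(match.group(1))
    return clusters


# Made-up sample in the assumed .clstr layout (not real project data).
example = """>Cluster 0
0\t120aa, >protA... *
1\t118aa, >protB... at 75.00%
>Cluster 1
0\t200aa, >protC... *
"""

print(parse_clstr(example))
```

Note that CD-HIT may truncate long identifiers in this file, so the ids recovered here may need to be matched back to the FASTA headers.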

You should use BLAST all-vs-all to align all the proteins with each other. Then cluster the sequences according to the same 60 percent identity threshold.
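One possible way to turn all-vs-all hits into clusters is single-linkage grouping: any two sequences joined by a hit at or above the threshold end up in the same cluster. The sketch below assumes the hits are in BLAST tabular format (`-outfmt 6`), where the first three columns are query id, subject id, and percent identity. The function name `cluster_by_identity` and the sample data are hypothetical. This is not the method CD-HIT uses (CD-HIT clusters greedily around representative sequences), and it is only one of several reasonable readings of the task.

```python
def cluster_by_identity(tabular_text, sequence_ids, min_identity=60.0):
    """Single-linkage clusters from BLAST tabular hits, using union-find."""
    parent = {seq_id: seq_id for seq_id in sequence_ids}

    def find(node):
        # Register ids that appear in the hits but not in sequence_ids.
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    for line in tabular_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        query, subject, identity = fields[0], fields[1], float(fields[2])
        # Ignore self-hits; merge clusters for hits at or above the threshold.
        if query != subject and identity >= min_identity:
            parent[find(query)] = find(subject)

    groups = {}
    for seq_id in list(parent):
        groups.setdefault(find(seq_id), []).append(seq_id)
    return sorted(sorted(members) for members in groups.values())


# Made-up hits; only the first three tabular columns are shown and used.
hits = "\n".join([
    "A\tA\t100.0",
    "A\tB\t72.5",
    "B\tC\t45.0",
    "C\tD\t61.0",
])

clusters = cluster_by_identity(hits, ["A", "B", "C", "D", "E"])
print(clusters)
```

BLAST reports percent identity over the aligned region only, so a short local match can exceed 60 percent. A stricter variant might also require a minimum alignment coverage before merging; that choice is left open here.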

Compare the results of the two clustering approaches.
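One simple way to quantify how far two clusterings agree is the Rand index: the fraction of sequence pairs that both clusterings treat the same way (together in both, or apart in both). The sketch below is an illustration only; the handout does not prescribe a comparison measure, and the function name `rand_index` and the sample labels are hypothetical.

```python
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Fraction of pairs on which two clusterings agree.

    labels_a and labels_b map sequence id -> cluster label.
    Only ids present in both mappings are compared.
    """
    shared = sorted(set(labels_a) & set(labels_b))
    agree = 0
    total = 0
    for x, y in combinations(shared, 2):
        same_in_a = labels_a[x] == labels_a[y]
        same_in_b = labels_b[x] == labels_b[y]
        if same_in_a == same_in_b:
            agree += 1
        total += 1
    # With fewer than two shared ids there are no pairs to disagree on.
    return agree / total if total else 1.0


# Made-up labels: the second clustering splits C and D apart.
first = {"A": 0, "B": 0, "C": 1, "D": 1}
second = {"A": 0, "B": 0, "C": 1, "D": 2}

print(rand_index(first, second))
```

The loop visits every pair, so the cost grows quadratically with the number of sequences. Reporting cluster counts and size distributions alongside a single score usually gives a clearer picture.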

Write a report in LaTeX.


Project 2

Deadline: 10:59 am Friday 9 December 2022
Submission: A zip file, submitted to "programming_assignment 2" in EAS under COMP6811, containing a directory with your source code, tests, report, and presentation.

The project is to compare two or more related microbiome samples using the facilities of DIAMOND and MEGAN.

Write a report and prepare a presentation.


Last modified on 6 September 2022 by Greg Butler