HOMEWORK 1¶
This assignment is due in lab 18Sept: It is worth 20 points.
The purpose of this homework is to get you more familiar with the tasks I taught you in Lab1, specifically BLAST, alignment, tree making. You may work in groups, but you each must submit your own report. Feel free to come see me for help.
- Follow the instructions exactly in lab 1, all the way down until you are making a file
query.fa
. This command!!
grep -A1 'ENSPTRP00000032491\|ENSPTRP00000032494' dataset1.pep > query.pep
Instead of using ENSPTRP00000032491
and ENSPTRP00000032494
, you are going to use...
- if your last name begins with A-E: ENSPTRP00000000215 and ENSPTRP00000028630
- if your last name begins with F-K: ENSPTRP00000000198 and ENSPTRP00000044401
- if your last name begins with L-Q: ENSPTRP00000009781 and ENSPTRP00000000220
- if your last name begins with R-Z: ENSPTRP00000007897 and ENSPTRP00000005951
Can you see how you would change the original command (grep -A1 'ENSPTRP00000032491\|ENSPTRP00000032494' dataset1.pep > query.pep
) to search for these new sequences?
Once you’ve done that, you can use BLAST just like you did in lab 1:
blastp -evalue 8e-8 -num_threads 8 -db uniprot -query query.pep -max_target_seqs 3 -outfmt "6 qseqid pident evalue stitle"
Remember in lab, that this search gave us several lines of results?? We then searched for these genes using this code:
grep --no-group-separator -A1 'HXA2_HUMAN\|HXA2_BOVIN\|HXA2_PAPAN\|HXA3_HUMAN\|HXA3_MOUSE\|HXA3_BOVIN\|HXA9_HUMAN' uniprot.pep > results.pep
You will do the same type of grep
search, but the names of the genes will be different.. i.e., HXA2_HUMAN\|HXA2_BOVIN\|HXA2_PAPAN
will be different names because you have blasted for different sequences.
In the end, you will have Newick tree code and an alignment. Send me an email with these 2 things.