https://cpt.tamu.edu/bich464/C1-exercise.html
Here is your protein. Your task is to identify it!
>unknown_protein
VVKSSGVRQPFDKEKIYKVLKWACDGHNIDVRAFLENVLELIRDGMTTKQIQRIAAIKYA
ADHISVKEPDWQYVASNLEMFALRKDVYGQFDPIPFYDHIVKMVEAGKYDKEILEKYSKQ
DIQVFERAIDHDKDFEFSYAGSQQLIGKYLVQDRDTGEIFETPQYAFMLIAMCLHQEETG
AQVTHIVDFYNAISDRKLSLPTPIMAGVRTPTRQFSSCVVIESGDSLGSLNAVTSAIKVY
ISQRAGIGVNAGHIRAMGSKIRGGEAVHTGVIPFWKIQTAVKSCSQGGVRGGAATLYYPF
WHLEVENLLVLKNNKGVEENRVRHLDYGVQLNQLMYKRLMNRDYITLFSPDVANDRLYDL
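Before pasting the sequence into BLAST, it can help to confirm that it is a clean FASTA record. The following is a minimal Python sketch, not part of the original exercise: `parse_fasta` and `non_standard_residues` are hypothetical helpers written for this illustration, and only the first two lines of the sequence are included to keep the example short.

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into a dict of {header: sequence}."""
    chunks_by_header = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            header = line[1:]  # drop the leading '>'
            chunks_by_header[header] = []
        elif header is not None:
            chunks_by_header[header].append(line)
    # Join the wrapped sequence lines into one string per record.
    return {name: "".join(chunks) for name, chunks in chunks_by_header.items()}


# One-letter codes for the 20 standard amino acids.
STANDARD_AA = set("ACDEFGHIKLMNPQRSTVWY")


def non_standard_residues(sequence):
    """Return the sorted characters in `sequence` that are not standard amino acids."""
    return sorted(set(sequence.upper()) - STANDARD_AA)


example = """>unknown_protein
VVKSSGVRQPFDKEKIYKVLKWACDGHNIDVRAFLENVLELIRDGMTTKQIQRIAAIKYA
ADHISVKEPDWQYVASNLEMFALRKDVYGQFDPIPFYDHIVKMVEAGKYDKEILEKYSKQ
"""

records = parse_fasta(example)
for name, seq in records.items():
    print(name, len(seq), non_standard_residues(seq))
```

An empty list in the last column suggests the sequence contains only standard residues and is ready to paste into the query box.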
The NCBI BLAST website offers several types of BLAST queries.
We’ll be doing a basic BLAST search with Protein BLAST (blastp). You’ll need to
paste the unknown protein into the Query Sequence box, then choose
nr (the non-redundant protein database) under Choose a Search Set.
You’re ready to BLAST! Hit the BLAST button and be patient.
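The same choices (protein BLAST, the nr database, your query sequence) can also be expressed as a scripted request. The sketch below only builds the form parameters and does not contact NCBI. The parameter names (`CMD`, `PROGRAM`, `DATABASE`, `QUERY`) reflect my understanding of NCBI's BLAST URL API and should be checked against the current NCBI documentation before use; `build_blast_submission` is a hypothetical helper name.

```python
from urllib.parse import urlencode

# Endpoint of the NCBI BLAST web service (assumption: unchanged since writing).
BLAST_URL = "https://blast.ncbi.nlm.nih.gov/Blast.cgi"


def build_blast_submission(sequence, program="blastp", database="nr"):
    """Return URL-encoded form data describing a BLAST search submission.

    Nothing is sent over the network; this only shows which settings
    correspond to the choices made on the web form.
    """
    params = {
        "CMD": "Put",          # submit a new search
        "PROGRAM": program,    # blastp = protein query vs. protein database
        "DATABASE": database,  # nr = non-redundant protein database
        "QUERY": sequence,     # the unknown protein
    }
    return urlencode(params)


# Only the start of the unknown protein is used here, for brevity.
body = build_blast_submission("VVKSSGVRQPFDKEKIYKVLKWACDGHNIDV")
print(BLAST_URL)
print(body)
```

For this exercise the web form is all you need; the sketch is only meant to make explicit which three settings define the search.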
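There are several sections to the BLAST output on the web:
The graphic summary simply shows you where regions of other proteins aligned with your query protein.
The descriptions table gives an overview of the database hits. The table is sorted by E-value (expectation value), the number of hits of at least that quality you would expect to see by chance; the lower the E-value, the more significant the hit, so the best hits come first.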
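To get a feel for how E-values behave, here is a small worked example using the standard relation between bit score and E-value, E = m × n × 2^(−S), where m is the query length, n is the total database length, and S is the bit score. The hit names, bit scores, and lengths below are made up for illustration, and real BLAST uses length-corrected "effective" values, so treat this as a sketch rather than a reproduction of NCBI's numbers.

```python
def expected_hits(bit_score, query_length, database_length):
    """E-value from a bit score: E = m * n * 2^(-S)."""
    return query_length * database_length * 2.0 ** (-bit_score)


# Hypothetical values, chosen only for illustration.
query_length = 300                # residues in the query protein
database_length = 1_000_000_000   # total residues in the database
hits = [("hit_A", 50.0), ("hit_B", 700.0), ("hit_C", 120.0)]  # (name, bit score)

# Sort ascending by E-value, as the web results table does.
ranked = sorted(
    ((name, expected_hits(score, query_length, database_length)) for name, score in hits),
    key=lambda pair: pair[1],
)

for name, e_value in ranked:
    print(f"{name}: E = {e_value:.3g}")
```

Each additional bit of score halves the E-value, which is why strong hits show E-values that are vanishingly small.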
When looking at BLAST results on the web, your main goal is usually to figure out the identity of a protein. Here we see lots of BLAST hits to NrdA, or ribonucleotide reductases. But how trustworthy are those annotations? Since NCBI doesn’t expose any further levels of evidence, we dig through the BLAST results and see...
A hit to phage T4’s NrdA! That’s an extremely good indicator of the identity of your protein (again, in the absence of real wet-lab experiments).
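Digging through a long hit list for a well-studied organism can also be scripted. The sketch below assumes hit descriptions end with the organism name in square brackets, which is how NCBI protein titles are commonly formatted. The three descriptions are invented for illustration and are not real BLAST output; `organism_of` and `hits_from` are hypothetical helper names.

```python
import re


def organism_of(description):
    """Extract the trailing '[Organism name]' from a hit description, if present."""
    match = re.search(r"\[([^\[\]]+)\]\s*$", description)
    return match.group(1) if match else None


def hits_from(descriptions, organism_keyword):
    """Keep only descriptions whose organism contains the keyword (case-insensitive)."""
    keyword = organism_keyword.lower()
    return [d for d in descriptions if keyword in (organism_of(d) or "").lower()]


# Invented descriptions, for illustration only.
example_hits = [
    "ribonucleotide reductase subunit [Hypothetical phage A]",
    "NrdA ribonucleotide reductase [Enterobacteria phage T4]",
    "hypothetical protein [Hypothetical bacterium B]",
]

for description in hits_from(example_hits, "phage T4"):
    print(description)
```

A filter like this only narrows the list; you still need to inspect the alignment itself before trusting the hit.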
We will be running BLAST from within Galaxy and viewing the results there. The workflow will be somewhat different from what you’ve learned in this exercise; however, the underlying theory is the same. You have a query protein, and you’re searching through all the other known proteins for similar sequences.