Bioinformatics. Pfam Workshop
Pfam Domain Databases
Pfam is a database of protein families and domains. Currently, there are over 10,000 entries in Pfam, which match 75% of all sequences in UniProt / GenPept. Pfam can be accessed from: http://pfam.sanger.ac.uk
In the following worked example you will be guided through a Pfam entry.
- STEP 1 – Open the Pfam homepage at either of the two sites.
- STEP 2 – Click on ‘View a Pfam Family’ and enter ‘RBD’ in the text field.
This will take you to the Wiki page on the family. The Summary provides a quick synopsis on the entry.
What protein family is RBD? Which human proteins contain this domain?
- STEP 3 – Click on ‘domain organisation’
The RBD is found associated with many different domains, many of which are involved in signalling. This page shows a summary of all the proteins that contain this domain. Solid, coloured regions are Pfam domains.
How many sequences have the domain architecture RBD, C1_1?
In how many different domain architectures is the RBD domain involved?
(this information can be found at the bottom of the web-page)
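The counting behind these two questions can be sketched in a few lines of Python. This is only an illustration: the protein identifiers and the domain names `DomX` and `DomY` are invented, and the numbers it produces are not the answers for the real RBD entry, which should be read from the Pfam page.

```python
from collections import Counter

# Hypothetical toy data: identifiers and domain orders are invented for
# illustration and are NOT taken from Pfam.
proteins = {
    "protA": ["RBD", "C1_1", "DomX"],
    "protB": ["RBD", "C1_1", "DomX"],
    "protC": ["RBD", "DomY"],
    "protD": ["DomY", "RBD", "RBD"],
}

def architecture_counts(domain_lists):
    """Count how many proteins share each ordered domain architecture."""
    return Counter(", ".join(domains) for domains in domain_lists.values())

counts = architecture_counts(proteins)

# Number of distinct architectures that contain the RBD domain.
rbd_architectures = [arch for arch in counts if "RBD" in arch.split(", ")]

for arch, n in counts.most_common():
    print(f"{n} sequence(s): {arch}")
print("Distinct architectures with RBD:", len(rbd_architectures))
```

An architecture is treated here as the ordered list of domains along the sequence, so `DomY, RBD, RBD` and `RBD, DomY` count as different architectures.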
Pfam clans are groups of related families that have arisen from a single common evolutionary ancestor. A variety of approaches are used for finding related families: structural similarity, sequence similarity, functional similarity and profile-profile comparison tools.
So why are they useful? Clans can provide functional insights for domains with otherwise unknown function. For example, the DUFs (domains of unknown function) in the ubiquitin clan are likely to function as small binding domains. Clans also allow the identification of more distantly related structural homologs. The alignments are at the extreme edge of what can be achieved with current sequence analysis tools, but again can provide clues to key residues within the families. One can also look to see if domains are commonly combined with members of the same clan or if they are specific.
What clan does the RBD belong to?
How many other domain families belong to this clan?
What are the CATH and SCOP descriptions of the fold?
- STEP 6 – Click on ‘Alignments’
This allows you to get the alignment of the protein family in a variety of formats.
Each Pfam entry contains two alignments. The seed alignment contains a set of representative sequences that are used to build a profile HMM. The full alignment contains all examples of the domains.
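Once an alignment has been downloaded, it can be read with a short script. The sketch below assumes the alignment was saved in Stockholm format, one of the formats Pfam alignments are commonly distributed in. The function name `parse_stockholm` and the three toy sequences are invented for illustration and are not part of the RBD family.

```python
def parse_stockholm(text):
    """Return {sequence_name: aligned_sequence} from Stockholm-format text.

    Markup lines (starting with '#') are skipped and parsing stops at the
    '//' terminator. Sequences split over several blocks are concatenated.
    """
    sequences = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        if line == "//":
            break
        name, chunk = line.split(None, 1)
        sequences[name] = sequences.get(name, "") + chunk.replace(" ", "")
    return sequences

# Toy alignment with made-up sequences (NOT real Pfam data).
toy = """# STOCKHOLM 1.0
#=GF ID toy
seq1/1-7   MKV.LAGG
seq2/1-8   MRVALAGG
seq3/1-7   MKV-LSGG
//
"""

alignment = parse_stockholm(toy)
for name, seq in alignment.items():
    print(name, seq)
```

The same function could be applied to either the seed or the full alignment, since both are alignments of the same kind and differ only in which sequences they include.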
- STEP 7 – Go back to the RBD family page and click on ‘Alignments’. Select ‘jalview’ and the ‘full’ alignment and click the ‘view’ button.
- If ‘jalview’ does not work on your computer – look at it in ‘html’
Can you tell which are the most conserved residues in the alignment?
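One simple way to quantify conservation, complementary to inspecting the alignment by eye, is to compute for each column the fraction of sequences that carry the most common residue. The sketch below is a minimal illustration of that idea: the function name `column_conservation` and the toy alignment are invented, and alignment viewers such as Jalview use their own conservation measures.

```python
from collections import Counter

# Toy alignment with made-up sequences (NOT the RBD family).
toy_alignment = [
    "MKVGLAGG",
    "MRVGLAGG",
    "MKVGLSGG",
    "MKIGL-GG",
]

def column_conservation(rows, gap_chars="-."):
    """For each column, return (most common residue, fraction of rows with it).

    Gaps are ignored when picking the residue but still count in the
    denominator, so gappy columns score lower.
    """
    scores = []
    for column in zip(*rows):
        residues = [c.upper() for c in column if c not in gap_chars]
        if not residues:
            scores.append(("-", 0.0))
            continue
        residue, count = Counter(residues).most_common(1)[0]
        scores.append((residue, count / len(column)))
    return scores

scores = column_conservation(toy_alignment)
fully_conserved = [i + 1 for i, (_, frac) in enumerate(scores) if frac == 1.0]

for position, (residue, frac) in enumerate(scores, start=1):
    print(f"column {position}: {residue} {frac:.2f}")
print("Fully conserved columns:", fully_conserved)
```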
Profile HMMs are difficult to interpret directly. To make them easier to understand, Pfam provides an HMM logo tab. This is a graphical representation of the HMM in which the height of each letter denotes the likelihood of that amino acid at that position, so the key residues that define the family can be identified easily.
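To give a feel for how letter heights can be derived from probabilities, the sketch below uses the classic sequence-logo calculation: the information content of a position is the maximum possible entropy minus the observed entropy, and each letter is scaled by its probability. This is an assumption made for illustration. It uses a uniform background over 20 amino acids, the function names and probabilities are invented, and the exact scheme used by Pfam's HMM logos may differ.

```python
import math

def information_content(probs, alphabet_size=20):
    """Information content (bits) of one position, uniform background assumed."""
    entropy = -sum(p * math.log2(p) for p in probs.values() if p > 0)
    return math.log2(alphabet_size) - entropy

def letter_heights(probs, alphabet_size=20):
    """Height of each letter: its probability times the position's information."""
    total = information_content(probs, alphabet_size)
    return {aa: p * total for aa, p in probs.items()}

# Made-up emission probabilities for three positions (NOT from a real HMM).
conserved = {"G": 1.0}
mixed = {"K": 0.75, "R": 0.25}
variable = {aa: 1 / 20 for aa in "ACDEFGHIKLMNPQRSTVWY"}

for label, probs in [("conserved", conserved), ("mixed", mixed), ("variable", variable)]:
    print(f"{label}: {information_content(probs):.2f} bits")

heights = letter_heights(mixed)
```

Under this scheme a fully conserved position gives one tall letter, while a position where all residues are equally likely contributes almost nothing to the logo.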
Using the HMM logo, which is the most conserved residue in this domain?
What do you think the red blocks mean?
- STEP 9 – Click on ‘Species’
By looking at both the Sunburst and the Tree:
Are RBDs found in archaea, bacteria, viruses and eukaryotes?
How many human sequences contain RBD?
- STEP 10 – Click on ‘Interactions’
How many interacting partners have been identified by Pfam?
What sort of domains are these?
- STEP 11 – Click on ‘Structure’
This page shows RBD domains with a known structure. Often a structure can be solved multiple times.
In which species has the RBD structure been solved?
- STEP 12 – Select ‘Jmol’ to view the structure
Right click on the structure and open up the console. Left click on the structure to reveal position. Modify how the structure is represented using Jmol.
Can you