Using bioinformatics to characterize the solute carrier proteins

For your project, we want you to use all of the techniques covered in your assignments to investigate the sequence similarity and evolution of the SLC26 family of proteins. Pick ONE member of the SLC26 family (SLC26A1 – SLC26A11) and run it through the entire suite of tools you have been using for your assignments. Please note that this project is not an assignment; here we expect you to apply what you have learned and choose the best strategy to accomplish the task at hand. We also expect your report to include figures and tables as well as a narrative description of the results. You will most likely need to read a number of papers on the SLC transporters in order to accomplish this task effectively.

Instructions

1. Use the BLAST or FASTA websites to find a number of closely related homologues within the class of SLC26 proteins. Use UniProtKB to find out more about these proteins.
2. Use BLAST or FASTA to find 10 more distantly related homologues (members of the SLC family but not SLC26).
3. Use the PSI-BLAST or COBBLER websites to find 10 even more distant homologues (proteins that are not members of the SLC family).
4. Characterize the domain structure of your SLC26 protein using websites such as BLOCKMAKER, PFSCAN and PFAM.
5. In order to rationally design a drug, we usually need to have a crystal structure. Use VAST, DALI, or PdbEfold to find proteins related by structure. Include images showing the structural alignment.
6. Use ClustalW and other multiple sequence alignment tools (T-COFFEE, MAFFT, MUSCLE) to find and highlight conserved regions in the amino acid sequence. Use at least 10 amino acid sequences from 10 different organisms.
7. Use PHYLIP or MEGA5 to construct phylogenies using these multiple alignments:
  • One based on distance matrix methods
  • One based on maximum parsimony
  • One based on maximum likelihood (you may need to use DNA sequences for this)
  • Evaluate your trees using:
    • bootstrap analysis
    • shuffling order or other online resources
8. Use Primer3 to design primers to study this gene in populations.
9. Use CODEHOP to design primers to fish out related genes from mRNA of an organism that you haven’t previously studied.
10. Design a protocol to clone the entire coding sequence (i.e. from start to stop codon) of a homologous gene related to your assigned gene, but from a different organism, into Bluescript:
  • Identify the ORF
  • Design primers that bind as close to the start and stop codons as possible, and evaluate them using primer finder
  • Describe the procedure for performing the cloning, including the restriction enzymes used
  • Print a map of the recombinant plasmid
  • Print 50 bp of sequence at the junctions between the plasmid and the cloned DNA
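
The "Identify ORF" and primer-design steps of the cloning task can be prototyped before turning to the dedicated tools. The following is a minimal sketch, assuming a forward-strand-only search, an ATG start codon and the standard stop codons; the function names, the 20-nt primer length and the toy sequence are hypothetical choices for illustration, and real primers would still need restriction sites added and checking with the primer tools named above.

```python
# Minimal sketch: longest forward-strand ORF plus naive primers at its ends.
# All names and the 20-nt primer length are illustrative assumptions.
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G", "N": "N"}


def reverse_complement(dna):
    """Return the reverse complement of a DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(dna.upper()))


def find_longest_orf(dna):
    """Return (start, end, orf) for the longest ATG...stop ORF on the forward strand.

    start is 0-based; end is exclusive and includes the stop codon.
    Returns None when no complete ORF is present.
    """
    dna = dna.upper()
    best = None
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                end = i + 3
                if best is None or end - start > best[1] - best[0]:
                    best = (start, end, dna[start:end])
                start = None  # look for the next ORF in this frame
    return best


def end_primers(orf, length=20):
    """Return (forward, reverse) primers covering the start and stop of the ORF."""
    forward = orf[:length]
    reverse = reverse_complement(orf[-length:])
    return forward, reverse


toy_sequence = "CCATGAAATTTGGGTAACC"  # made-up sequence, not a real gene
orf_start, orf_end, orf_seq = find_longest_orf(toy_sequence)
forward_primer, reverse_primer = end_primers(orf_seq, length=6)
```

Reporting `orf_start` and `orf_end` alongside the primers gives the positions the report asks for; note that they are 0-based here, so add 1 to the start when quoting conventional sequence coordinates.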

Report

Prepare a summary in which you first state which gene you started out with, then provide a paragraph for each test in which you summarize the results, discuss why you think that you obtained them and try to explain any unexpected results.
For BLAST/FASTA, tell us how many significant results were found, which sequences were most closely related and which organisms they came from. Are any of your homologues associated with diseases? What ligands do your proteins bind? Identify and try to explain any unexpected similarities and any differences between the searches using RNA versus amino acid sequences. How did you have to modify your search strategy to find more distant homologues (SLC proteins that are not members of SLC26)? What sorts of ligands do the proteins in this broader group bind?
For PSI-BLAST etc., tell us how many more distant relatives were found, what sorts of organisms they came from, and what sorts of proteins were related. Did you find any bacterial anion transporters? If so, which organisms do they come from?
For BLOCKMAKER, PFSCAN etc., tell us how many motifs were found, what parts of the protein they came from, and what other proteins contain these motifs. Identify and try to explain any unexpected results. How do these results compare to what you found using BLAST/FASTA? Are there specific domains that are more common among your hits?
For VAST (or the other structural alignment programs), tell us what sorts of proteins you found, and whether you found any new ones missed by the sequence-based approaches. Discuss any differences from the sequence-based approaches, and identify and try to explain any unexpected results. What parts of your SLC26 protein seem to have structural homologs with crystal structures? How does this agree with the domains you identified above?
For ClustalW, describe the relationships that were identified, and what parts of the protein were related. Identify and try to explain any unexpected results.
For PHYLIP, explain which sequences are the closest relatives, and comment on whether this was an expected or a surprising result. Then, if the three methods come up with different trees, try to explain why this might have happened and which tree is most likely to be correct.
For the tree evaluation, explain what your results mean in terms of the reliability of the trees.
For part 8 just list the primers that were designed.
For part 9 just list the primers that were designed.
For part 10 first state where the ORF started and finished, then list the sequence of the primers and state where they bound. Next describe which restriction enzymes you will use, print a map of the recombinant plasmid and print 50 bp of sequence at each junction between the plasmid and cloned DNA.
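
For the BLAST/FASTA paragraph of the report, counting "significant results" is easier to reproduce if the hits are filtered by script. The sketch below assumes the search was exported in BLAST+ tabular format (`-outfmt 6` with its default twelve columns); the E-value cutoff of 1e-5, the function name and the example rows are illustrative assumptions, not values prescribed by the project.

```python
import csv
import io

# Default column order of BLAST+ tabular output (-outfmt 6).
BLAST_COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
                 "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def significant_hits(handle, max_evalue=1e-5):
    """Return one record per subject sequence with E-value <= max_evalue.

    When a subject has several alignments (HSPs), the one with the lowest
    E-value is kept. Results are sorted from most to least significant.
    """
    best = {}
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        hit = dict(zip(BLAST_COLUMNS, row))
        hit["evalue"] = float(hit["evalue"])
        hit["pident"] = float(hit["pident"])
        if hit["evalue"] > max_evalue:
            continue
        previous = best.get(hit["sseqid"])
        if previous is None or hit["evalue"] < previous["evalue"]:
            best[hit["sseqid"]] = hit
    return sorted(best.values(), key=lambda h: h["evalue"])


# Made-up rows for illustration only; identifiers are hypothetical.
example = io.StringIO(
    "query\thitA\t98.5\t760\t11\t0\t1\t760\t1\t760\t0.0\t1500\n"
    "query\thitB\t35.2\t500\t300\t12\t40\t530\t22\t510\t2e-50\t190\n"
    "query\thitC\t24.0\t120\t85\t4\t200\t315\t90\t205\t0.8\t32.0\n"
    "query\thitB\t30.0\t100\t70\t0\t600\t700\t580\t680\t1e-08\t55.0\n"
)
hits = significant_hits(example)
hit_ids = [h["sseqid"] for h in hits]
```

Whatever cutoff you choose, state it in the report so that the number of significant hits can be compared between the BLAST, FASTA and PSI-BLAST searches.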

Organize your results into sections where everything you present is clearly explained and annotated. Provide a conclusion section wherein you summarize all of your results in narrative form. In order to properly interpret your results and explain their biological significance, you will need to read about your protein, the SLC26 class of proteins and the SLC family.

Bioinformatics

Pfam Workshop

Pfam Domain Databases

Pfam is a database of protein families and domains. Currently, there are over 10,000 entries in Pfam, which match 75% of all sequences in UniProt / GenPept. Pfam can be accessed from: http://pfam.sanger.ac.uk

In the following worked example you will be guided through a Pfam entry.

  • STEP 1 – Open the Pfam homepage at either of the two sites.
  • STEP 2 – Click on ‘View a Pfam Family’ and enter ‘RBD’ in the text field.

This will take you to the Wiki page on the family. The Summary provides a quick synopsis on the entry.

What protein family is RBD? Which human proteins contain this domain?

  • STEP 3 – Click on ‘Domain organisation’

The RBD is found associated with many different domains, many of which are involved in signalling. This page shows a summary of all the proteins that contain this domain. Solid, coloured regions are Pfam domains.

How many sequences have the domain architecture RBD, C1_1?
In how many different domain architectures is the RBD domain involved?
(This information can be found at the bottom of the web page.)

  • STEP 4 – Click on ‘Clan’

Pfam clans are groups of related families that have arisen from a single common evolutionary ancestor. A variety of tools are used for finding related families: structural similarity, sequence similarity, functional similarity and profile-profile comparison tools.
So why are they useful? Clans can provide functional insights for domains with otherwise unknown function. For example, the DUFs (domains of unknown function) in the ubiquitin clan are likely to function as small binding domains. Clans also allow the identification of more distantly related structural homologs. The alignments are at the extreme edge of what can be achieved with current sequence analysis tools, but again can provide clues to key residues within the families. One can also look to see if domains are commonly combined with members of the same clan or if they are specific.

What clan does the RBD belong to?
How many other domain families belong to this clan?

  • STEP 5 – Click on CL0072

What are the CATH and SCOP descriptions of the fold?

  • STEP 6 – Click on ‘Alignments’

This allows you to get the alignment of the protein family in a variety of formats.
Each Pfam entry contains two alignments. The seed alignment contains a set of representative sequences that are used to build a profile HMM. The full alignment contains all examples of the domains.
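
To make the role of the seed alignment more concrete, the sketch below builds a simple position-specific scoring profile from a tiny, made-up seed alignment and uses it to score candidate sequences. This is a deliberately simplified stand-in: a real profile HMM also models insertions and deletions and uses more careful background frequencies and priors. The ungapped alignment, the uniform background, the pseudocount of 1 and all names here are assumptions for illustration.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_profile(seed_alignment, pseudocount=1.0):
    """Build per-column log-odds scores from an ungapped seed alignment.

    Scores are log2(frequency / background) with a uniform background.
    """
    length = len(seed_alignment[0])
    if any(len(seq) != length for seq in seed_alignment):
        raise ValueError("all seed sequences must have the same length")
    background = 1.0 / len(AMINO_ACIDS)
    profile = []
    for position in range(length):
        column = [seq[position] for seq in seed_alignment]
        total = len(column) + pseudocount * len(AMINO_ACIDS)
        scores = {}
        for residue in AMINO_ACIDS:
            frequency = (column.count(residue) + pseudocount) / total
            scores[residue] = math.log2(frequency / background)
        profile.append(scores)
    return profile


def score_segment(profile, segment):
    """Sum the per-column scores of a segment the same length as the profile."""
    if len(segment) != len(profile):
        raise ValueError("segment and profile lengths differ")
    return sum(column[residue] for column, residue in zip(profile, segment))


seed = ["MKV", "MKI", "MRV"]  # made-up seed alignment, not from Pfam
profile = build_profile(seed)
close_score = score_segment(profile, "MKV")
distant_score = score_segment(profile, "WWW")
```

In this picture, the full alignment corresponds to collecting every sequence in the database whose score against the profile is high enough, which is why a small, well-chosen seed can recover a much larger family.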

  • STEP 7 – Go back to the RBD family page and click on ‘Alignments’. Select ‘jalview’ and the ‘full’ alignment and click the ‘view’ button.
  • If ‘jalview’ does not work on your computer, look at the alignment in ‘html’.

Can you tell which are the most conserved residues in the alignment?

  • STEP 8 – Select HMM logo

Profile HMMs are difficult to interpret directly. To make them easier to understand, Pfam provides the HMM logo tab. This is a graphical representation of the HMM, where the height of each letter denotes the likelihood of that amino acid at that position. Thus, the key residues that define the family can easily be identified.
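
The idea that letter height reflects conservation can be illustrated with a classic sequence-logo calculation on alignment columns. This is only a sketch of the general principle, assuming raw residue counts and a maximum of log2(20) bits per column; it is not how Pfam derives its HMM logos, which are drawn from the model's emission probabilities. The function name and the toy alignment are hypothetical.

```python
import math
from collections import Counter


def column_information(alignment):
    """Return (most common residue, information in bits) for each column.

    Information is log2(20) minus the Shannon entropy of the column,
    so a perfectly conserved column scores about 4.32 bits. Gaps ('-')
    are ignored; an all-gap column scores 0.
    """
    max_bits = math.log2(20)
    result = []
    for column in zip(*alignment):
        residues = [r for r in column if r != "-"]
        if not residues:
            result.append(("-", 0.0))
            continue
        counts = Counter(residues)
        n = len(residues)
        entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
        top_residue = counts.most_common(1)[0][0]
        result.append((top_residue, max_bits - entropy))
    return result


toy_alignment = ["GKS", "GRT", "GKA", "GHS"]  # made-up aligned sequences
info = column_information(toy_alignment)
```

Columns with the highest information content correspond to the tallest stacks in a logo, which is where to look first when answering the question about the most conserved residue.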

Using the HMM logo, which is the most conserved residue in this domain?
What do you think the red blocks mean?

  • STEP 9 – Click on ‘Species’

By looking at both the Sunburst and the Tree:

Are RBDs found in archaea, bacteria, viruses and eukaryotes?
How many human sequences contain RBD?

  • STEP 10 – Click on ‘Interactions’

How many interacting partners have been identified by Pfam?
What sort of domains are these?

  • STEP 11 – Click on ‘Structure’

This page shows RBD domains with a known structure. Often a structure is solved multiple times.
In which species has the RBD structure been solved?

  • STEP 12 – Select ‘Jmol’ to view the structure

Right click on the structure and open up the console. Left click on the structure to reveal position. Modify how the structure is represented using Jmol.

Can you
