Data Analysis Part 1 – Mutation

Human cystic fibrosis transmembrane conductance regulator (CFTR) mRNA

 

  1. The cDNA sequence representing the complete mRNA of the normal or wild-type human CFTR gene is given in Appendix I. Use this to determine the amino acid sequence of the protein.

There are several DNA translation tools that you can use for this:

https://web.expasy.org/translate/

https://www.ebi.ac.uk/Tools/st/emboss_transeq/

Which reading frame gives you the correct protein sequence (remember you only need to translate the top strand)?
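
If you would rather check the web tools with a script, the three forward reading frames can be translated in a few lines. The following is a minimal sketch, assuming the standard genetic code; the function names `translate` and `longest_orf` and the short toy sequence are made up for illustration and are not part of the worksheet. Paste the Appendix I sequence in place of the toy sequence to use it.

```python
from itertools import product

# Standard genetic code, built from the conventional T, C, A, G codon ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(seq, frame=1):
    """Translate the top strand in forward frame 1, 2 or 3 ('*' marks a stop codon)."""
    seq = seq.upper().replace("U", "T")
    codons = (seq[i:i + 3] for i in range(frame - 1, len(seq) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def longest_orf(protein):
    """Return the longest stretch that starts at a methionine and ends before a stop."""
    best = ""
    for segment in protein.split("*"):
        m = segment.find("M")
        if m != -1 and len(segment) - m > len(best):
            best = segment[m:]
    return best


# Hypothetical toy sequence, NOT the CFTR cDNA.
toy = "gcatgcagaggtcgtaagg"
for frame in (1, 2, 3):
    protein = translate(toy, frame)
    print(frame, protein, len(longest_orf(protein)))
```

The frame whose longest open reading frame is by far the longest is usually the one that encodes the protein.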

 

  2. Identify the protein coding sequence (that is, the region which is translated into protein), and the 3’ and 5’ untranslated regions and complete Table 1. How many amino acids are encoded by the coding sequence?

 

Table 1.  CFTR mRNA features

Feature | Nucleotide positions
3’ UTR |
Coding sequence (CDS) |
5’ UTR |
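
The positions for Table 1 can also be worked out programmatically. The sketch below is one possible approach, under two assumptions that you should check against your course conventions: the coding sequence is taken to be the longest ATG-to-stop open reading frame on the top strand, and the stop codon is counted as part of the CDS. The function names and the toy sequence are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_longest_cds(seq):
    """Return (start, end) of the longest ATG...stop ORF, 1-based inclusive.

    Assumption: the stop codon is included in the reported range.
    """
    seq = seq.upper()
    best = None
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if best is None or (i + 3 - start) > (best[1] - best[0] + 1):
                    best = (start + 1, i + 3)
                start = None
    return best


def mrna_features(seq):
    """Split a cDNA into 5' UTR, CDS and 3' UTR coordinates (1-based, inclusive)."""
    cds = find_longest_cds(seq)
    if cds is None:
        return None
    start, end = cds
    return {
        "5' UTR": (1, start - 1),
        "CDS": (start, end),
        "3' UTR": (end + 1, len(seq)),
        # Codons in the CDS minus the stop codon.
        "amino acids": (end - start + 1) // 3 - 1,
    }


# Hypothetical toy sequence, NOT the CFTR cDNA.
print(mrna_features("gcatgcagaggtcgtaagg"))
```

For the toy sequence this reports a CDS at positions 3 to 17 encoding 4 amino acids. A UTR reported as `(1, 0)` would simply mean that the sequence starts directly at the ATG.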

 

 

 

 

 

  3. Over 1700 mutations have been identified in the CFTR gene that are associated with cystic fibrosis. Four of the most common ones are listed in Table 2.

 

For each mutation, determine (i) what kind of mutation it is (for example missense, nonsense or deletion) and (ii) its effect on the protein sequence.

To do this you will need to identify and change the relevant nucleotide(s) in the wild-type cDNA and then use a translation tool to determine the effect on the protein. You can then use a multiple sequence alignment programme to compare the mutant proteins to the wild-type. The most commonly used is Clustal Omega (https://www.ebi.ac.uk/Tools/msa/clustalo/).
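
Editing the sequence by hand is error-prone, so it can help to script the change and check that the base you are replacing is the one you expect. The following is a minimal sketch, assuming the standard genetic code; the helper names (`substitute`, `delete`, `translate_cds`, `classify_substitution`) and the toy sequence are hypothetical. Note that the toy sequence starts at its ATG, whereas the positions in Table 2 refer to the full cDNA, so for the real data you would edit the full cDNA and then translate from the start of the coding sequence.

```python
from itertools import product

# Standard genetic code, built from the conventional T, C, A, G codon ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate_cds(cds):
    """Translate from the first base and stop at the first stop codon."""
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE.get(cds[i:i + 3].upper(), "X")
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def substitute(seq, pos, ref, alt):
    """Replace the base at 1-based position pos, checking it matches ref first."""
    if seq[pos - 1].upper() != ref.upper():
        raise ValueError(f"expected {ref} at position {pos}, found {seq[pos - 1]}")
    return seq[:pos - 1] + alt.lower() + seq[pos:]


def delete(seq, start, end):
    """Remove bases start..end (1-based, inclusive)."""
    return seq[:start - 1] + seq[end:]


def classify_substitution(wt_protein, mut_protein):
    """Rough label for a single-base substitution, judged from protein length."""
    if mut_protein == wt_protein:
        return "silent"
    if len(mut_protein) < len(wt_protein):
        return "nonsense"
    if len(mut_protein) == len(wt_protein):
        return "missense"
    return "stop-loss"


# Hypothetical toy coding sequence (M G F R stop), NOT the CFTR cDNA.
wt = "atgggattccgctga"
wt_protein = translate_cds(wt)
for label, mutant in [
    ("G>A at 11", substitute(wt, 11, "g", "a")),
    ("G>T at 4", substitute(wt, 4, "g", "t")),
    ("del 7-9", delete(wt, 7, 9)),
]:
    print(label, wt_protein, "->", translate_cds(mutant))
```

The classifier is only meant for single-base substitutions; deletions are better judged by aligning the mutant and wild-type proteins, as described above.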

 

Table 2. Common CFTR mutations

Mutation | Nucleotide position(s)* | Mutation type | Amino acid change | Protein domain affected
M1: G → A | 420 | Missense | R117H | TMD 1
M2: del CTT | 1591 – 1593 | 3 bp deletion | Deletion of F508 | ATP binding domain 1
M3: G → T | 1694 | Nonsense | G542X | ATP binding domain 1 (premature termination)
M4: G → A | 1722 | Missense | G551D | ATP binding domain 1

* Refers to the nucleotide position in the cDNA

 

  4. Using the information in Table 3, draw a cartoon of the CFTR protein showing the domain organisation and indicate on this the positions of the amino acid changes identified in the mutants.

 

Table 3. CFTR protein domain regions

Domain | Amino acids
Transmembrane domain 1 | 85 – 303
ATP binding domain 1 | 389 – 670
Regulatory (R) domain | 639 – 849
Transmembrane domain 2 | 866 – 1147
ATP binding domain 2 | 1208 – 1480
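
When placing an amino acid change on your cartoon, you can look up which domain (if any) a residue falls in directly from the ranges in Table 3. The sketch below uses a hypothetical helper, `domains_at`, and returns a list because the ranges for ATP binding domain 1 and the R domain overlap in Table 3 as given.

```python
# Domain boundaries copied from Table 3 (amino acid numbers, inclusive).
DOMAINS = [
    ("Transmembrane domain 1", 85, 303),
    ("ATP binding domain 1", 389, 670),
    ("Regulatory (R) domain", 639, 849),
    ("Transmembrane domain 2", 866, 1147),
    ("ATP binding domain 2", 1208, 1480),
]


def domains_at(residue):
    """Return the names of every Table 3 domain that contains this residue."""
    return [name for name, start, end in DOMAINS if start <= residue <= end]


# Arbitrary example residues, chosen only to show the three possible outcomes.
for residue in (10, 100, 650):
    print(residue, domains_at(residue))
```

An empty list means the residue lies in a linker or terminal region outside the listed domains.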

 

 

  5. A patient was diagnosed with CF; neither parent had CF. Routine analysis identified a known mutant allele on one chromosome and an unknown mutant allele on the other chromosome. The novel mutation was found to be located between exons 8 and 11. RT-PCR analysis was carried out on RNA samples from the patient and both parents, using primers designed to amplify this region (Figure 1B). The PCR products were analysed by agarose gel electrophoresis (Figure 1A) and then sequenced (Appendix II).

 

 

 

 

 

 

 

 

 

Figure 1: (A) Analysis of RT-PCR products of RNA isolated from blood samples of the patient (J28), and his mother (Mo) and father (Fa).  The position of the 500 bp band in the 100 bp marker lane is indicated. (B) Sequences of the forward (F) and reverse (R) primers used in the PCR.

 

 

  • Locate the position of the primers on the complete cDNA. Remember you will first need to determine the reverse complement of the R primer in order to do this.
  • What size product is expected for amplification of the wild-type cDNA?
  • How do you interpret the results of the RT-PCR?
  • Compare the wild-type and mutant DNA sequences (Appendix II) using a programme such as Clustal Omega (https://www.ebi.ac.uk/Tools/msa/clustalo/) to determine the nature of the mutation.
  • Can you speculate how this mutation arose (hint: look at the positions of the exon junctions in the cDNA; Appendix III)?
  • What protein domain is affected by the mutation and in what way?
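
The first few steps in the list above (reverse-complementing the R primer, locating both primers, working out the expected product size and finding where two sequences differ) can be sketched in code. This is an illustration only: the primer and template sequences below are invented toy data, since the real primers are given in Figure 1B, and `find_deletion` assumes the mutant differs from the wild-type by one contiguous deletion, which you should confirm with an alignment.

```python
COMPLEMENT = str.maketrans("acgtACGT", "tgcaTGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def expected_product(template, forward, reverse):
    """Return (start, end, size) of the amplicon, 1-based inclusive, or None."""
    t = template.lower()
    f = t.find(forward.lower())
    # The R primer anneals to the top strand as its reverse complement.
    r = t.find(reverse_complement(reverse).lower())
    if f == -1 or r == -1:
        return None
    end = r + len(reverse)
    return (f + 1, end, end - f)


def find_deletion(wild_type, mutant):
    """Locate one contiguous deletion; return its (start, end) in the wild-type.

    Works by trimming the shared prefix and suffix. Returns None if the
    difference is not a single simple deletion.
    """
    wt, mt = wild_type.lower(), mutant.lower()
    if len(mt) >= len(wt):
        return None
    prefix = 0
    while prefix < len(mt) and wt[prefix] == mt[prefix]:
        prefix += 1
    suffix = 0
    while suffix < len(mt) - prefix and wt[-1 - suffix] == mt[-1 - suffix]:
        suffix += 1
    if prefix + suffix != len(mt):
        return None
    return (prefix + 1, len(wt) - suffix)


# Hypothetical toy data, NOT the CFTR sequences or the Figure 1B primers.
template = "ttacgtaggctagcatcgatcggatatt"
print(expected_product(template, "acgtagg", "atccgatc"))
print(find_deletion("aaaccgggtttcc", "aaacctttcc"))
```

Positions returned by `find_deletion` are relative to the sequence you pass in, so for the RT-PCR products they must be offset by the position of the forward primer to obtain cDNA coordinates. If the bases flanking a deletion are repeated, the reported boundaries may be shifted by a few bases relative to the true breakpoint.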

 

 

Appendix I. CFTR cDNA sequence

 

Note: the sequence below is presented in FASTA format, a text-based format used in bioinformatics to represent nucleotide or protein sequences using their single-letter abbreviations. The first line, which begins with ‘>’, is a description line used to identify or describe the sequence.

 

>human_CFTR_wildtype_cDNA_6070 bp

gtagtaggtctttggcattaggagcttgagcccagacggccctagcagggaccccagcgcccgagagaccatgcagaggtcgcctctggaaaaggccagcgttgtctccaaactttttttcagctggaccagaccaattttgaggaaaggatacagacagcgcctggaattgtcagacatataccaaatcccttctgttgattctgctgacaatctatctgaaaaattggaaagagaatgggatagagagctggcttcaaagaaaaatcctaaactcattaatgcccttcggcgatgttttttctggagatttatgttctatggaatctttttatatttaggggaagtcaccaaagcagtacagcctctcttactgggaagaatcatagcttcctatgacccggataacaaggaggaacgctctatcgcgatttatctaggcataggcttatgccttctctttattgtgaggacactgctcctacacccagccatttttggccttcatcacattggaatgcagatgagaatagctatgtttagtttgatttataagaagactttaaagctgtcaagccgtgttctagataaaataagtattggacaacttgttagtctcctttccaacaacctgaacaaatttgatgaaggacttgcattggcacatttcgtgtggatcgctcctttgcaagtggcactcctcatggggctaatctgggagttgttacaggcgtctgccttctgtggacttggtttcctgatagtccttgccctttttcaggctgggctagggagaatgatgatgaagtacagagatcagagagctgggaagatcagtgaaagacttgtgattacctcagaaatgattgaaaatatccaatctgttaaggcatactgctgggaagaagcaatggaaaaaatgattgaaaacttaagacaaacagaactgaaactgactcggaaggcagcctatgtgagatacttcaatagctcagccttcttcttctcagggttctttgtggtgtttttatctgtgcttccctatgcactaatcaaaggaatcatcctccggaaaatattcaccaccatctcattctgcattgttctgcgcatggcggtcactcggcaatttccctgggctgtacaaacatggtatgactctcttggagcaataaacaaaatacaggatttcttacaaaagcaagaatataagacattggaatataacttaacgactacagaagtagtgatggagaatgtaacagccttctgggaggagggatttggggaattatttgagaaagcaaaacaaaacaataacaatagaaaaacttctaatggtgatgacagcctcttcttcagtaatttctcacttcttggtactcctgtcctgaaagatattaatttcaagatagaaagaggacagttgttggcggttgctggatccactggagcaggcaagacttcacttctaatggtgattatgggagaactggagccttcagagggtaaaattaagcacagtggaagaatttcattctgttctcagttttcctggattatgcctggcaccattaaagaaaatatcatctttggtgtttcctatgatgaatatagatacagaagcgtcatcaaagcatgccaactagaagaggacatctccaagtttgcagagaaagacaatatagttcttggagaaggtggaatcacactgagtggaggtcaacgagcaagaatttctttagcaagagcagtatacaaagatgctgatttgtatttattagactctccttttggatacctagatgttttaacagaaaaagaaatatttgaaagctgtgtctgtaaactgatggctaacaaaactaggattttggtcacttctaaaatggaacatttaaagaaagctgacaaaatattaattttgcatgaaggtagcagctatttttatgggacattttcagaactccaaaatctacagccagactttagctcaaaac
tcatgggatgtgattctttcgaccaatttagtgcagaaagaagaaattcaatcctaactgagaccttacaccgtttctcattagaaggagatgctcctgtctcctggacagaaacaaaaaaacaatcttttaaacagactggagagtttggggaaaaaaggaagaattctattctcaatccaatcaactctatacgaaaattttccattgtgcaaaagactcccttacaaatgaatggcatcgaagaggattctgatgagcctttagagagaaggctgtccttagtaccagattctgagcagggagaggcgatactgcctcgcatcagcgtgatcagcactggccccacgcttcaggcacgaaggaggcagtctgtcctgaacctgatgacacactcagttaaccaaggtcagaacattcaccgaaagacaacagcatccacacgaaaagtgtcactggcccctcaggcaaacttgactgaactggatatatattcaagaaggttatctcaagaaactggcttggaaataagtgaagaaattaacgaagaagacttaaaggagtgcttttttgatgatatggagagcataccagcagtgactacatggaacacataccttcgatatattactgtccacaagagcttaatttttgtgctaatttggtgcttagtaatttttctggcagaggtggctgcttctttggttgtgctgtggctccttggaaacactcctcttcaagacaaagggaatagtactcatagtagaaataacagctatgcagtgattatcaccagcaccagttcgtattatgtgttttacatttacgtgggagtagccgacactttgcttgctatgggattcttcagaggtctaccactggtgcatactctaatcacagtgtcgaaaattttacaccacaaaatgttacattctgttcttcaagcacctatgtcaaccctcaacacgttgaaagcaggtgggattcttaatagattctccaaagatatagcaattttggatgaccttctgcctcttaccatatttgacttcatccagttgttattaattgtgattggagctatagcagttgtcgcagttttacaaccctacatctttgttgcaacagtgccagtgatagtggcttttattatgttgagagcatatttcctccaaacctcacagcaactcaaacaactggaatctgaaggcaggagtccaattttcactcatcttgttacaagcttaaaaggactatggacacttcgtgccttcggacggcagccttactttgaaactctgttccacaaagctctgaatttacatactgccaactggttcttgtacctgtcaacactgcgctggttccaaatgagaatagaaatgatttttgtcatcttcttcattgctgttaccttcatttccattttaacaacaggagaaggagaaggaagagttggtattatcctgactttagccatgaatatcatgagtacattgcagtgggctgtaaactccagcatagatgtggatagcttgatgcgatctgtgagccgagtctttaagttcattgacatgccaacagaaggtaaacctaccaagtcaaccaaaccatacaagaatggccaactctcgaaagttatgattattgagaattcacacgtgaagaaagatgacatctggccctcagggggccaaatgactgtcaaagatctcacagcaaaatacacagaaggtggaaatgccatattagagaacatttccttctcaataagtcctggccagagggtgggcctcttgggaagaactggatcagggaagagtactttgttatcagcttttttgagactactgaacactgaaggagaaatccagatcgatggtgtgtcttgggattcaataactttgcaacagtggaggaaagcctttggagtgataccacagaaagtatttattttttctggaacatttagaaaaaacttggatccctatgaacagtgg
agtgatcaagaaatatggaaagttgcagatgaggttgggctcagatctgtgatagaacagtttcctgggaagcttgactttgtccttgtggatgggggctgtgtcctaagccatggccacaagcagttgatgtgcttggctagatctgttctcagtaaggcgaagatcttgctgcttgatgaacccagtgctcatttggatccagtaacataccaaataattagaagaactctaaaacaagcatttgctgattgcacagtaattctctgtgaacacaggatagaagcaatgctggaatgccaacaatttttggtcatagaagagaacaaagtgcggcagtacgattccatccagaaactgctgaacgagaggagcctcttccggcaagccatcagcccctccgacagggtgaagctctttccccaccggaactcaagcaagtgcaagtctaagccccagattgctgctctgaaagaggagacagaagaagaggtgcaagatacaaggctttagagagcagcataaatgttgacatgggacatttgctcatggaattggagctcgtgggacagtcacctcatggaattggagctcgtggaacagttacctctgcctcagaaaacaaggatgaattaagtttttttttaaaaaagaaacatttggtaaggggaattgaggacactgatatgggtcttgataaatggcttcctggcaatagtcaaattgtgtgaaaggtacttcaaatccttgaagatttaccacttgtgttttgcaagccagattttcctgaaaacccttgccatgtgctagtaattggaaaggcagctctaaatgtcaatcagcctagttgatcagcttattgtctagtgaaactcgttaatttgtagtgttggagaagaactgaaatcatacttcttagggttatgattaagtaatgataactggaaacttcagcggtttatataagcttgtattcctttttctctcctctccccatgatgtttagaaacacaactatattgtttgctaagcattccaactatctcatttccaagcaagtattagaataccacaggaaccacaagactgcacatcaaaatatgccccattcaacatctagtgagcagtcaggaaagagaacttccagatcctggaaatcagggttagtattgtccaggtctaccaaaaatctcaatatttcagataatcacaatacatcccttacctgggaaagggctgttataatctttcacaggggacaggatggttcccttgatgaagaagttgatatgccttttcccaactccagaaagtgacaagctcacagacctttgaactagagtttagctggaaaagtatgttagtgcaaattgtcacaggacagcccttctttccacagaagctccaggtagagggtgtgtaagtagataggccatgggcactgtgggtagacacacatgaagtccaagcatttagatgtataggttgatggtggtatgttttcaggctagatgtatgtacttcatgctgtctacactaagagagaatgagagacacactgaagaagcaccaatcatgaattagttttatatgcttctgttttataattttgtgaagcaaaattttttctctaggaaatatttattttaataatgtttcaaacatatataacaatgctgtattttaaaagaatgattatgaattacatttgtataaaataatttttatatttgaaatattgactttttatggcactagtatttctatgaaatattatgttaaaactgggacaggggagaacctagggtgatattaaccaggggccatgaatcaccttttggtctggagggaagccttggggctgatgcagttgttgcccacagctgtatgattcccagccagcacagcctcttagatgcagttctgaagaagatggtaccaccagtctgactgtttccatcaagggtacactgccttctcaactccaaactgactcttaa
gaagactgcattatatttattactgtaagaaaatatcacttgtcaataaaatccatacatttgtgtgaaa

 

 

 

Appendix II. Sequence of RT-PCR products

 

>Wild-type_RT-PCR product

ctgcgcatggcggtcactcggcaatttccctgggctgtacaaacatggtatgactctcttggagcaataaacaaaatacaggatttcttacaaaagcaagaatataagacattggaatataacttaacgactacagaagtagtgatggagaatgtaacagccttctgggaggagggatttggggaattatttgagaaagcaaaacaaaacaataacaatagaaaaacttctaatggtgatgacagcctcttcttcagtaatttctcacttcttggtactcctgtcctgaaagatattaatttcaagatagaaagaggacagttgttggcggttgctggatccactggagcaggcaagacttcacttctaatggtgattatgggagaactggagccttcagagggtaaaattaagcacagtggaagaatttcattctgttctcagttttcctggattatgcctggcaccattaaagaaaatatcatctttggtgtttcctatgatg

 

 

>Mutant_RT-PCR product

ctgcgcatggcggtcactcggcaatttccctgggctgtacaaacatggtatgactctcttggagcaataaacaaaatacaggatttcttacaaaagcaagaatataagacattggaatataacttaacgactacagaagtagtgatggagaatgtaacagccttctgggaggagacttcacttctaatggtgattatgggagaactggagccttcagagggtaaaattaagcacagtggaagaatttcattctgttctcagttttcctggattatgcctggcaccattaaagaaaatatcatctttggtgtttcctatgatg

 

 

 

 

 

Appendix III. Exon positions

 

exon | nucleotides | exon | nucleotides | exon | nucleotides
exon 1 | 1 – 123 | exon 10 | 1280 – 1462 | exon 19 | 3059 – 3209
exon 2 | 124 – 234 | exon 11 | 1463 – 1654 | exon 20 | 3210 – 3437
exon 3 | 235 – 343 | exon 12 | 1655 – 1749 | exon 21 | 3438 – 3538
exon 4 | 344 – 559 | exon 13 | 1750 – 1836 | exon 22 | 3539 – 3787
exon 5 | 560 – 649 | exon 14 | 1837 – 2560 | exon 23 | 3788 – 3943
exon 6 | 650 – 813 | exon 15 | 2561 – 2689 | exon 24 | 3944 – 4033
exon 7 | 814 – 939 | exon 16 | 2690 – 2727 | exon 25 | 4034 – 4206
exon 8 | 940 – 1186 | exon 17 | 2728 – 2978 | exon 26 | 4207 – 4312
exon 9 | 1187 – 1279 | exon 18 | 2979 – 3058 | exon 27 | 4313 – 6070
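
To relate a cDNA position or range to the exon table, a small lookup can be used. The sketch below copies the exon start positions from the table above; the helper names (`exon_at`, `exon_bounds`, `exons_spanned`) are hypothetical, and positions must be given in cDNA coordinates, not positions within an RT-PCR product.

```python
import bisect

# First nucleotide of exons 1..27, copied from Appendix III.
EXON_STARTS = [
    1, 124, 235, 344, 560, 650, 814, 940, 1187,
    1280, 1463, 1655, 1750, 1837, 2561, 2690, 2728, 2979,
    3059, 3210, 3438, 3539, 3788, 3944, 4034, 4207, 4313,
]
CDNA_LENGTH = 6070  # last nucleotide of exon 27


def exon_at(position):
    """Return the exon number containing a 1-based cDNA position."""
    if not 1 <= position <= CDNA_LENGTH:
        raise ValueError("position outside the cDNA")
    return bisect.bisect_right(EXON_STARTS, position)


def exon_bounds(exon):
    """Return (start, end) of an exon in cDNA coordinates, inclusive."""
    start = EXON_STARTS[exon - 1]
    end = EXON_STARTS[exon] - 1 if exon < len(EXON_STARTS) else CDNA_LENGTH
    return (start, end)


def exons_spanned(start, end):
    """List every exon touched by the cDNA range start..end."""
    return list(range(exon_at(start), exon_at(end) + 1))


# Arbitrary example range, chosen only to show the output format.
print(exons_spanned(100, 130), exon_bounds(2))
```

Comparing the boundaries of a sequence change with `exon_bounds` shows whether it coincides with exon junctions.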

 
