'''
According to the central dogma of molecular biology, genetic information in
biological systems flows from DNA to RNA to protein. To make an mRNA, a part of
a DNA strand is transcribed. The resulting mRNA has nucleotides complementary
to the DNA template (note: RNA contains uracil instead of thymine). To produce
a protein, ribosomes then read codons (sets of 3 nucleotides) on the mRNA,
which are matched with their complementary anticodons on amino-acid-carrying
tRNAs.

Let's go through the process of a) transcription, b) translation, and
c) checking for mutations in GFP, everyone's favourite marker protein.
'''

'''
Exercise: Pseudocode

For each of the exercises below, start by writing pseudocode on a piece of
paper. Only once you're done with the pseudocode should you start working on
Python code in VS Code.
'''

'''
Exercise A: GFP transcription

Write a for loop that will take the string GFP_DNA and convert it to the
complementary RNA sequence. Put it into a string called GFP_RNA.
You do not have to worry about reversing the order. You may assume the input
is all capital letters and contains only A, T, C, and G. The GFP_RNA string
should be in capital letters as well.
'''

GFP_DNA = "TATTTACGATGAGCAAAGGCGAAGAACTGTTTACCGGCGTGGTGCCGATTCTGGTGGAACTGGATGGC\
GATGTGAACGGCCATAAATTTAGCGTGAGCGGCGAAGGCGAAGGCGATGCGACCTATGGC\
AAACTGACCCTGAAATTTATTTGCACCACCGGCAAACTGCCGGTGCCGATGCCGACCCTG\
GTGACCACCTTTAGCTATGGCGTGCAGTGCTTTAGCCGCTATCCGGATCATGCGAAACAG\
CATGATTTTTTTAAAAGCGCGATGCCGGAAGGCTATGTGCAGGAACGCACCATTTTTTTT\
AAAGATGATGGCAACTATAAAACCCGCGCGGAAGTGAAATTTGAAGGCGATACCCTGGTG\
AACCGCATTGAACTGAAAGGCATTGATTTTAAAGAAGATGGCAACATTCTGGGCCATAAA\
CTGGAATATAACTATAACAGCCATAACGTGTATATTATGGCGGATAAACTGAAAAACGGC\
ATTAAAGTGAACTTTAAAATTCGCCATAACATTGAAGATGGCAGCGTGCAGCTGGCGGAT\
CATTATCAGCAGAACACCCCGATTGGCGATGGCCCGGTGCTGCTGCCGGATAACCATTAT\
CTGAGCACCCAGAGCGCGCTGAGCAAAGATCCGAACGAAAAACGCGATCATATGGTGCTG\
CTGGAATTTGTGACCGCGGCGGGCATTACCCATGGCATGGATGAACTGATTAAA"

# Your code here

'''
Exercise B: GFP translation

Write a for loop that will take the output string GFP_RNA from Exercise A and
produce the protein sequence GFP_protein (as capitalized one-letter amino acid
codes). You *cannot* assume that the coding sequence is aligned with the start
of your sequence (i.e., there may be nucleotides in the RNA sequence before the
start codon).

Use the standard RNA codon table from:
https://en.wikipedia.org/wiki/DNA_and_RNA_codon_tables
'''

# Your code here

'''
Exercise C: GFP mutations

After trying to grow the cells with the protein you just translated, the cells
were yellow. It turns out the colleagues who gave us the GFP actually gave us a
GFP variant. Write code that gives a list of the amino acid mutations written
as XNY, where X is the original amino acid, N is the position of the mutation,
and Y is the new amino acid.

Example: Mutation of K to G at position 55 is written as: K55G

As input you have:
* GFP_protein - your sequence from the previous section
* GFP_reference - a reference sequence

You may assume the sequences are the same length.
'''

GFP_reference = "MSKGEELFTGVVPILVELDGDVNGHKFSVSGEGEGDATYGKLTLKFICTTGKLPVPWPTL\
VTTFSYGVQCFSRYPDHMKQHDFFKSAMPEGYVQERTIFFKDDGNYKTRAEVKFEGDTLV\
NRIELKGIDFKEDGNILGHKLEYNYNSHNVYIMADKQKNGIKVNFKIRHNIEDGSVQLAD\
HYQQNTPIGDGPVLLPDNHYLSTQSALSKDPNEKRDHMVLLEFVTAAGITHGMDELYK"

# Your code here
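
'''
Sketch for Exercise C: one possible approach, not the official solution.

The XNY notation described above can be illustrated by walking through two
sequences position by position. The following are assumptions made for this
sketch and do not come from the exercise text: the helper name list_mutations,
1-based positions (so that K55G would refer to the 55th amino acid), and the
short made-up sequences used in the demonstration.
'''


def list_mutations(reference, variant):
    """Return mutations as 'XNY' strings.

    X is the reference residue, N the 1-based position, Y the variant residue.
    Both sequences are assumed to have the same length.
    """
    mutations = []
    # zip pairs up residues at the same position; start=1 makes positions 1-based.
    for position, (ref_aa, var_aa) in enumerate(zip(reference, variant), start=1):
        if ref_aa != var_aa:
            mutations.append(f"{ref_aa}{position}{var_aa}")
    return mutations


# Toy demonstration with made-up 5-residue sequences (not GFP data).
toy_reference = "MSKGE"
toy_variant = "MSGGD"
print(list_mutations(toy_reference, toy_variant))  # ['K3G', 'E5D']

# Once GFP_protein has been produced in Exercise B, the same comparison could
# be applied as list_mutations(GFP_reference, GFP_protein).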