How to print out part of an array?

Hi all,

I'm currently working on a program that takes RNA letters A, C, G, and U and converts them into codons. Ex. AUG = MET (codon)

I am then supposed to go through the array that contains the codons and output the expressed protein, which is a sequence that starts with MET, ends with the codon STO (stop), and MUST have at least one other codon besides MET or STO in between. If not, then the program should just ignore it. EDIT: I am not allowed to use strings. Only for loops and if/else statements. For example:

If these are the elements in the array:
ILE-ASN-ASP-ARG-LYS-ASN-STO-MET-LYS-SER-ASP-LYS-STO-ARG-GLN-ASP-SER-LYS-GLY-SER-MET-STO-RP-GLU-HIS-ALA-MET

Then the program should print out MET-LYS-SER-ASP-LYS-STO-. So far, my program has been able to do this... sometimes. However, when there is a MET codon near the end of the array with no STO codon after it, it prints out everything from that MET until the end of the array.

Here is the section of my program that is supposed to do this:

for (int i = 0; i < arraySize; i+=3)
{
    if (codons[i] == 'M' && (codons[i+3] != 'M' || codons[i+4] != 'T'))
    {
        for (codons[i] = 'M';i < arraySize; i+=3)
        {
            if (codons[i] == 'M' && codons[i+4] == 'T')
            {
                continue;
            }
            else if ( codons[i+1] == 'T' && codons[i+2] == 'O')
            {
                cout << codons[i] << codons[i+1] << codons[i+2] << "-";
                break;
            }
            else
            {
                cout << codons[i] << codons[i+1] << codons[i+2] << "-";
                continue;
            }
        }
        //this is just to separate the outputs and make things look neater
        cout << endl;
        cout << "------------------------------------------" << endl;
    }
}

Any help in this matter is greatly appreciated. Thanks in advance!
#include <iostream>
#include <string>
#include <sstream>

int main()
{
    // http://www.mochima.com/tutorials/strings.html
    const std::string met = "MET" ;
    const std::string stop = "STO" ;
    constexpr char seperator = '-' ;

    const char array[] = "ILE-ASN-ASP-ARG-LYS-ASN-STO-MET-LYS-SER-ASP-LYS-STO-"
                           "ARG-GLN-ASP-SER-LYS-GLY-SER-MET-STO-RP-GLU-HIS-ALA-"
                           "MET-SER-LYS-GLY-SER-ASP-ARG-LYS-STO-MET-LYS-SER-ASP" ;

    // http://www.artima.com/cppsource/streamstrings3.html
    std::istringstream stm(array) ;

    std::string protein ;
    std::string codon ;

    // http://www.cplusplus.com/reference/string/string/getline/
    while( std::getline( stm, codon, seperator) )
    {
        if( codon == met ) protein = met ; // start new protein

        // http://www.cplusplus.com/reference/string/string/find/
        else if( protein.find(met) == 0 ) // if protein starts with "MET"
        {
            if( codon == stop )
            {
               if( protein.size() > met.size() ) // if there is a codon after "MET"
                   std::cout << protein + seperator + stop << '\n' ; // print it
               protein = "" ; // "STO", that is the end of this protein
            }
            else protein += seperator + codon ; // append the codon to the protein
        }
    }
}

http://coliru.stacked-crooked.com/a/bc23054513db77c0
Dear JLBorges,

Thank you very much for your reply! However, would there be a way to do the same thing without using strings? Thanks again!
Of course. You can use C-strings and strstr: http://www.cplusplus.com/reference/cstring/strstr/

1. Use strstr to find the first MET in the array. *
2. Use strstr to find the first STO after the location of that MET. *
3. strstr returns pointers, i.e. locations. If the MET and the STO are not consecutive, print the piece of the array from the MET through the STO. Otherwise, repeat from (1), starting at the location of the STO.

[*] If strstr does not find the codon, then stop.


[Edit] If you are translating from RNA, then you are reading bases and converting them to codons according to a table. At that point you could represent each residue in the protein with a single character, rather than the 3-char codons that you use now. Just invent a char for STO. Furthermore, you could keep track, with pointers, of where the MET and STO occur while you translate.

When you finally print, you can convert the 1-char protein sequence into 3-char format.


This is obviously a homework question, given the reinvention of the wheel and the standard library being forbidden territory.