Changing 3 letters to 1 in between spaces

I am trying to write code that replaces a three-letter code in a string with a single letter. I am creating spaces in that string to separate the three-letter codes, and then manipulating the contents of the string.
My string is:
TCAATGTAACGCGCTACCCGGAGCTCTGGGCCCAAATTTCATCCACT

My code:
pos = code.find(startc);        // startc = AUG; looking for the 3-letter code AUG.
if (pos != string::npos) {      // if found, replace with the letter M, with spaces around 'M'.
    code.replace(code.find(startc), startc.length(), " M ");
}

cout << "Open reading frames: " << endl;
do {

    i = 0;

    do {                                            // Looping to create an open reading frame.

        for (b = 0; b*3 < code.length(); b++) {     // Separating code into codons (length 3 letters).
            for (a = 0; a < 3; a++) {
                cout << code[(a + b*3) + i];
            }
            cout << " ";                            // creating space between the three letters.
        }

        i++;
        cout << endl;
    } while (i < 3);

    reverse(code.rbegin(), code.rend());            // Reversing to create the second set of reading frames.

    c++;
    cout << endl;

} while (c < 2);

My output looks like this:
Open reading frames:
UCA M UAA CGC GCU ACC CGG AGC UCU GGG CCC AAA UUU CAU CCA CU
CA M U AAC GCG CUA CCC GGA GCU CUG GGC CCA AAU UUC AUC CAC U
A M UA ACG CGC UAC CCG GAG CUC UGG GCC CAA AUU UCA UCC ACU

UCA CCU ACU UUA AAC CCG GGU CUC GAG GCC CAU CGC GCA AU M A CU
CAC CUA CUU UAA ACC CGG GUC UCG AGG CCC AUC GCG CAA U M AC U
ACC UAC UUU AAA CCC GGG UCU CGA GGC CCA UCG CGC AAU M ACU

As you can see, AUG is replaced by M, but sometimes there is a space inside the AUG, and in that case M should not replace it.
I am trying to figure out how to substitute the one letter only when the AUG is intact.
Thanks for the help in advance!

hbjgd (83)

pos = code.find(startc);        // startc = AUG; looking for the 3-letter code AUG.
if (pos != string::npos) {      // if found, replace with the letter M, with spaces around 'M'.
    code.replace(code.find(startc), startc.length(), " M ");
}

cout << "Open reading frames: " << endl;
do {

    i = 0;

    do {                                            // Looping to create an open reading frame.

        for (b = 0; b*3 < code.length(); b++) {     // Separating code into codons (length 3 letters).
            for (a = 0; a < 3; a++) {
                cout << code[(a + b*3) + i];
            }
            cout << " ";                            // creating space between the three letters.
        }

        i++;
        cout << endl;
    } while (i < 3);

    reverse(code.rbegin(), code.rend());            // Reversing to create the second set of reading frames.

    c++;
    cout << endl;

} while (c < 2);

Wrap your code in [code][/code] tags; it looks better.
OK, and one more thing: does startc = AUG? I'm not seeing AUG anywhere in the string, unless it's just me. And is this DNA? Because that's what it looks like.

brian96853 (10)

Yeah, startc = AUG; it is set at the start of my code. I'm still getting the hang of the code block formatting. I have barely posted in forums, because I can usually find what I am looking for, but I haven't found anything on this.
Yes, this is DNA. I am supposed to take the DNA sequence, find its complementary code, transcribe it to RNA, find the longest possible sequence, and find the one-letter code for each codon triplet. I am currently trying to find the longest possible code, but I am stuck at this spot, just trying to find AUG without spaces in between.
Did you change anything? I went through and did not see anything changed...

Pravesh Koirala (267)

@brian96853
I am not sure I understood your question, but I think I may be able to help you. You need to post your full code and explain precisely what you want, though.
P.S.
My biology is extremely horrible.

hbjgd (83)

OK. Since this is DNA to RNA, every 'T' becomes a 'U'. I got that. So basically you want this program to take all of the T's, make them U's, and then replace every part that says "AUG" with an 'M'?

Is this what you were trying to say?

brian96853 (10)

Yes. If anyone knows about open reading frames, that would help, but I will try to explain:
I have now separated my code into codons of 3 letters each, with a space between them, like this:

AGU UAC AUU GCG CGA UGG GCC UCG AGA CCC GGG UUU AAA GUA GGU GA
GUU ACA UUG CGC GAU GGG CCU CGA GAC CCG GGU UUA AAG UAG GUG A
UUA CAU UGC GCG AUG GGC CUC GAG ACC CGG GUU UAA AGU AGG UGA

Now I want to find every occurrence of AUG, but only where it is intact within one codon. If you look hard, you can find AUG, but it is sometimes split up. In line 1, at the 18th position, it appears as A UG. In line 2, at the 17th position, it is AU G. In line 3, at the 16th position, I actually get AUG, which means I want to change that one to M. My code does not read it like that, and I want to know why and how to fix it. I will post my entire code in another reply. Thanks!

Edit: I guess I didn't really explain open reading frames, sorry.

brian96853 (10)

Here is my code. I have been playing with it for a little bit, but I am still getting the same results.

int main ()
{
    string startc = "AUG";
    char *defaultcode = "TCAATGTAACGCGCTACCCGGAGCTCTGGGCCCAAATTTCATCCACT";
    int i = 0,b, a;
    int c = 0;
    size_t pos = 0;
    string code;
    char choice;
    
    cout<< "Do you want the default code? "; // prompting user for default or input own
    cin >> choice;
    choice = toupper(choice);
    
    if (choice == 'Y') {
        code = defaultcode;        
        
    }
    else
        if (choice != 'Y') {
            cout << " Enter code: ";
            cin >> code;
            
            
        }

    
    cout <<  "The size of your code is: " << code.length() << endl;
    cout << " Code: " << "5' " << code << " 3' " << endl;
    
    while(code[i])     // Changing all code to upper case.
    {
        
        
        code[i] = toupper(code[i]);
        i++;
        
    }
    cout << endl;
   /* cout << "This is code after upper: " << code << endl;
    cout << endl;*/
    
    complement(code);     // function that finds the DNA's complement code.

    cout << endl << "\t\t\t\t\t\t\t" << " 5' "<< code << " 3'" << endl;

    for(int i = 0; i < code.length(); i++)   // finding the mRNA strand from 3' to 5'
    {
        switch (code[i]) {
            case 'T':
                code[i] = 'A';
                break;
            case 'A':
                code[i] = 'U';
                break;
            case 'G':
                code[i] = 'C';
                break;
            case 'C':
                code[i] = 'G';
                break;
                
            default:
                break;
        }
    }
    
    cout << "\t\t\t\t\t\t\t\t";
    for (int l = 0; l < code.length(); l++) {
        cout  << "|";
    }
   
    cout << endl << "The Code From Transcription: " << "3' " << code << " 5' ";
    cout << endl << endl;

    
    cout << "Open reading frames: " << endl;
    do {
 
        i = 0;

        do {                                                // Looping to create an open reading frame.

            for (b = 0; b*3 < code.length(); b++) {         // Separating code into codons (length 3 letters). 
                for (a = 0; a < 3; a++) {
                    cout << code[(a + b*3) + i];
                    
                    
                }
               cout << " " ;            // creating space between the three letters.
                
                    
                    
           pos = code.find(startc);
                if (pos != string::npos)
                    {
                        if (isspace(int(pos+ 1))) {
                            code.replace(code.find(startc), startc.length(), " M ");
                        }
                    }

                }

            i++;
            cout << endl;
        } while (i < 3);
        
        reverse(code.rbegin(), code.rend());            // Reversing to create the second set reading frame.
        
        c++;
        cout << endl; 
        
        } while (c < 2);
    

    cout << endl << endl;
    

    return 0;
}

hbjgd (83)

You need to add code to test whether or not there is a blank space in the codon that line 94 finds. A for loop would accomplish this easily. If there is one, the replace part should not be executed. Sorry, that took a while to decipher.

brian96853 (10)

Thank you, I will work on that.

brian96853 (10)

So would I use the 'isspace' function to accomplish this?

Pravesh Koirala (267)

Hey, is this what you are looking for?

#include <iostream>
#include <string>
#include <cstring>
using namespace std;
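inline int Investigate(const string& s, int i)
{
    char tmp[4] = {0};      // zero-filled, so a short read is still a valid C string
    int k=0;
    int j=i;
    while (k<3 && i < (int)s.size())    // stop at the end of the string
    {
        if (s[i] != ' ' ) //Ignore blanks
            tmp[k++] = s[i];
        i++;
    }
    if (strcmp(tmp,"AUG") == 0) return i-j;     // number of characters consumed
    return 0;
}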
int main()
{
    string s = "TCAATGTAACGCGCTACCCGGAGCTCTGGGCCCAAATTTCATCCACT";
    //First convert all T to U
    for (int i=0;s[i];i++)
        if (s[i]=='T') s[i]='U';
    cout <<s<<endl;

    //Now separate into three
    for (int i=3;i<s.size();i+=4)
        s.insert(i," ");
    cout <<s<<endl;

    //Now run over and replace all occurrences of AUG with M
    int j=0;
    for (int i=0;s[i];i++)
        if ((j = Investigate(s,i)) !=0)
        {
            //Now replace
            s.erase(i,j);
            s.insert(i,"M");
            if (s[i] != ' ') s.insert(i," ");
            if (s[i+2] != ' ') s.insert(i+2," ");
        }
    cout << s;

}

hbjgd (83)

Pravesh Koirala wrote:
Hey, is this what you are looking for?

Was trying to let him keep his code.

brian96853 wrote:
So would I use the 'isspace' function to accomplish this?

Yes, you could either use isspace() or you could use an if statement like this:

if (x == ' ')

Either one works.

Pravesh Koirala (267)

@hbjgd
My problem was that I hadn't yet understood what he was trying to do (I guess the credit goes to my horrible biology).
So, I just wanted to know if that was what he intended to do.

hbjgd (83)

HaHa it's all good. Just saying. I hadn't a clue what was going on at first either. Took me a good 20 minutes to decipher his code.

brian96853 (10)

So I do apologize for the extra help, but I do not seem to be getting this. I have the loop now at this:

pos = code.find(startc);                    // pos at the initial position of AUG
if (pos != string::npos) {
    for (int s = 0; s < 3; s++) {           // looping 3 times to check for a space
        pos++;                              // incrementing the position
        if (isspace(int(pos))) {            // validating if there is a space at the next position
            pos = code.find(startc);        // if there is, I want to find the next AUG
        } else {                            // if there is not a space, replace it with M.
            code.replace(code.find(startc), startc.length(), " M ");
        }
    }
}

The program terminates early with "terminate called without an active exception". Sorry for the extra help, but I am still a rookie at this stuff...

hbjgd (83)

Strike that, I was wrong. I read it wrong. Hold on, I'll find it.

brian96853 (10)

I did write my complement as its own function; otherwise it would mess up everything else.
Here it is:

void complement(string code);                        // forgot what this is called. (Drawing a blank)

void complement(string code)
{
    for(int i = 0; i < code.length(); i++)                 // set up for loop to switch the codons
    {
        switch (code[i]) {
            case 'T':
                code[i] = 'A';
                break;
            case 'A':
                code[i] = 'T';
                break;
            case 'G':
                code[i] = 'C';
                break;
            case 'C':
                code[i] = 'G';
                break;
                
            default:
                break;
        }
    }
    
    
    cout << "Your Complementary code is: " << " 3' " <<  code << " 5' " << endl;
    
    
}

Also, there still seems to be an error when I run the code:

terminate called without an active exceptionAGU sharedlibrary apply-load-rules all
(gdb)

The cursor also jumps to complement(code) with this:

Thread 1: Program received signal "SIGABRT".

I noticed if I comment out:
code.replace(code.find(startc), startc.length(), " M ");
it will run, but obviously not change AUG to M

hbjgd (83)

Here, I got it to work on my computer. What are you programming with? What software? I use Dev.

#include <algorithm>   // reverse
#include <cctype>      // toupper
#include <cstdlib>     // system
#include <iostream>
#include <string>

using namespace std;
void complement(string code); 
int main ()
{
    string startc = "AUG";
    char *defaultcode = "TCAATGTAACGCGCTACCCGGAGCTCTGGGCCCAAATTTCATCCACT";
    int i = 0,b, a;
    int c = 0;
    size_t pos = 0;
    string code;
    char choice;
    
    cout<< "Do you want the default code? "; // prompting user for default or input own
    cin >> choice;
    choice = toupper(choice);
    
    if (choice == 'Y') {
        code = defaultcode;        
        
    }
    else
        if (choice != 'Y') {
            cout << " Enter code: ";
            cin >> code;
            
            
        }

    
    cout <<  "The size of your code is: " << code.length() << endl;
    cout << " Code: " << "5' " << code << " 3' " << endl;
    
    while(code[i])     // Changing all code to upper case.
    {
        
        
        code[i] = toupper(code[i]);
        i++;
        
    }
    cout << endl;
   /* cout << "This is code after upper: " << code << endl;
    cout << endl;*/
    
    complement(code);     // function that finds the DNA's complement code.

    cout << endl << "\t\t\t\t\t\t\t" << " 5' "<< code << " 3'" << endl;

    for(int i = 0; i < code.length(); i++)   // finding the mRNA strand from 3' to 5'
    {
        switch (code[i]) {
            case 'T':
                code[i] = 'A';
                break;
            case 'A':
                code[i] = 'U';
                break;
            case 'G':
                code[i] = 'C';
                break;
            case 'C':
                code[i] = 'G';
                break;
                
            default:
                break;
        }
    }
    
    cout << "\t\t\t\t\t\t\t\t";
    for (int l = 0; l < code.length(); l++) {
        cout  << "|";
    }
   
    cout << endl << "The Code From Transcription: " << "3' " << code << " 5' ";
    cout << endl << endl;

    
    cout << "Open reading frames: " << endl;
    do {
 
        i = 0;

        do {                                                // Looping to create an open reading frame.

            for (b = 0; b*3 < code.length(); b++) {         // Separating code into codons (length 3 letters). 
                for (a = 0; a < 3; a++) {
                    cout << code[(a + b*3) + i];
                    
                    
                }
               cout << " " ;            // creating space between the three letters.
                
                    
                    
            for (int x = 0; x < code.length(); x++)
                if (code[x] == 'A' && code[x+1] == 'U' && code[x+2] == 'G')
                {
                   code[x] = 'M';
                   for(int l = x+1; l < code.length(); l++)
                        code[l] = code[l+1];
                }
                }

            i++;
            cout << endl;
        } while (i < 3);
        
        reverse(code.rbegin(), code.rend());            // Reversing to create the second set reading frame.
        
        c++;
        cout << endl; 
        
        } while (c < 2);
    

    cout << endl << endl;
    
    system("pause");
    return 0;
}


void complement(string code)
{
    for(int i = 0; i < code.length(); i++)                 // set up for loop to switch the codons
    {
        switch (code[i]) {
            case 'T':
                code[i] = 'A';
                break;
            case 'A':
                code[i] = 'T';
                break;
            case 'G':
                code[i] = 'C';
                break;
            case 'C':
                code[i] = 'G';
                break;
                
            default:
                break;
        }
    }
    
    
    cout << "Your Complementary code is: " << " 3' " <<  code << " 5' " << endl;
    
    
}

And is this the way you want the output to look?

Do you want the default code? y
The size of your code is: 47
 Code: 5' TCAATGTAACGCGCTACCCGGAGCTCTGGGCCCAAATTTCATCCACT 3'

Your Complementary code is:  3' AGTTACATTGCGCGATGGGCCTCGAGACCCGGGTTTAAAGTAGGTGA
5'

                                                         5' TCAATGTAACGCGCTACCCG
GAGCTCTGGGCCCAAATTTCATCCACT 3'
                                                                ||||||||||||||||
|||||||||||||||||||||||||||||||
The Code From Transcription: 3' AGUUACAUUGCGCGAUGGGCCUCGAGACCCGGGUUUAAAGUAGGUGA
5'

Open reading frames:
AGU UAC AUU GCG CGM GGG CCU CGA GAC CCG GGU UUA AAG UAG GUG A
GUU ACA UUG CGC GMG GGC CUC GAG ACC CGG GUU UAA AGU AGG UGA   \
UUA CAU UGC GCG MGG GCC UCG AGA CCC GGG UUU AAA GUA GGU GA   \c

 AG UGG MGA AAU UUG GGC CCA GAG CUC CGG GMG CGC GUU ACA UUG A
AGU GGM GAA AUU UGG GCC CAG AGC UCC GGG MGC GCG UUA CAU UGA   \
GUG GMG AAA UUU GGG CCC AGA GCU CCG GGM GCG CGU UAC AUU GA   \c



Press any key to continue . . .

brian96853 (10)

Not really. There shouldn't be an M on every line, only where AUG falls together in one triplet. Then M should completely replace that triplet. For example, this would be the normal output without any substitution going on:

AGU UAC AUU GCG CGA UGG GCC UCG AGA CCC GGG UUU AAA GUA GGU GA
GUU ACA UUG CGC GAU GGG CCU CGA GAC CCG GGU UUA AAG UAG GUG A
UUA CAU UGC GCG AUG GGC CUC GAG ACC CGG GUU UAA AGU AGG UGA

AGU GGA UGA AAU UUG GGC CCA GAG CUC CGG GUA GCG CGU UAC AUU GA
GUG GAU GAA AUU UGG GCC CAG AGC UCC GGG UAG CGC GUU ACA UUG A
UGG AUG AAA UUU GGG CCC AGA GCU CCG GGU AGC GCG UUA CAU UGA

Now with substitution:
AGU UAC AUU GCG CGA UGG GCC UCG AGA CCC GGG UUU AAA GUA GGU GA
GUU ACA UUG CGC GAU GGG CCU CGA GAC CCG GGU UUA AAG UAG GUG A
UUA CAU UGC GCG M GGC CUC GAG ACC CGG GUU UAA AGU AGG UGA

AGU GGA UGA AAU UUG GGC CCA GAG CUC CGG GUA GCG CGU UAC AUU GA
GUG GAU GAA AUU UGG GCC CAG AGC UCC GGG UAG CGC GUU ACA UUG A
UGG M AAA UUU GGG CCC AGA GCU CCG GGU AGC GCG UUA CAU UGA

For some reason, it does not seem to see the spaces that I created.
I'm using Xcode on a Mac.

hbjgd (83)

I'm sorry it's 4am where I am. You've repeated that like times. I'm too tired to go anymore. I'll be back tomorrow.

Salutations.
