Hello, I got this exercise where I have to compare the gene fragments of two (or more) sheep and print out the matching coefficient. But I get random data when I try to do it this way. What would be the correct syntax to get it working? The data is provided this way:
4 6 // the number of sheep, the length of the DNR fragment
3 // number of the sheep the others are compared to
Baltukas TAGCTT // sheep names and their DNR fragments
Bailioji ATGCAA
Doli AGGCTC
Smarkuolis AATGAA
So basically I have to find how many matching fragments the 3rd sheep (in this case) has with the others. For example, with the first sheep it would be 3.
Also the exercise is asking to use arrays and structures.
#include <iostream>
#include <fstream>
using namespace std;
struct genes{
char name[20];
char DNR[20];
int matching[20];
}sheep[20];
int main()
{
int n; // amount of sheep (2<=n<=20)
int m; // DNR fragment length (4<=m<=20)
int e; // sheep number that is being examined
ifstream duom ("U2.txt");
duom >> n;
duom >> m;
duom >> e;
n=n*2;
for (int i=0; i<n; i++){
duom >> sheep[i].name[i];
duom >> sheep[i].DNR[i];
}
for (int i=0; i<n; i+=2){
if (sheep[e].DNR[i]==sheep[i].DNR[i+1]){
matching[i]++;
}
}
duom.close();
return 0;
}
If these are C-strings containing text data, then you can use strcmp:
if (strcmp(a, b) == 0), they are equal.
If it is not text data (e.g. binary bytes), memcmp is needed.
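A small sketch of the two calls mentioned above, in case it helps. The helper names sameFragment and sameBytes are made up for illustration; strcmp and memcmp are the standard <cstring> functions.

```cpp
#include <cstring>

// Hypothetical helper: true when two NUL-terminated strings are identical.
bool sameFragment(const char* a, const char* b) {
    return std::strcmp(a, b) == 0;
}

// Hypothetical helper: true when the first len bytes of both buffers are
// identical. Unlike strcmp, this does not stop at a '\0' byte.
bool sameBytes(const void* a, const void* b, std::size_t len) {
    return std::memcmp(a, b, len) == 0;
}
```

Note that both only tell you whether the whole thing is equal, not how many letters agree.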
If a 'matched fragment' is 'a letter of text', then you just have to do it the old-fashioned way and compare letter by letter.
Does order/position matter? If not, sorting the data before comparison makes it easier.
e.g.
Doli AGGCTC
Smarkuolis AATGAA
sorted, you get
Doli ACCGGT
Smarkuolis AAAAGT
AGT match ... easy to loop to produce this, right?
If order/position matters, you need to understand the 'rules' of what a 'match' is to proceed, and I failed biology.
@jonnin it's the last thing: just check whether the letters match. But I was asking whether what I wrote is correct, because it gave me some random data instead of anything useful.
:29:12: error: 'matching' was not declared in this scope
matching[i]++;
^~~~~~~~
Once you fix that, your match does not look correct at all.
It's iterating up to n, which is fixed, while looking at what appear to be variable-length strings (?), or are they all 6? Is n*2 == 6? Write out n, to start. After setting n = n*2 it had better be 6.
Yours is checking [i] against [i+1], where i is tied to n above, so it is either not looking at all of the letters in the fragment or looking one past the end of the array, and it seems totally wrong depending on what n is (?). It seems they should both be [i]?!
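To illustrate the fix described above, here is a minimal sketch of a positional compare that indexes both fragments with the same position and loops over the fragment length m rather than n. The function name matchingLetters is hypothetical, and it assumes both arrays hold at least m letters.

```cpp
// Hypothetical helper: counts the positions (0 .. m-1) at which the two
// fragments hold the same letter. Assumes both arrays have at least m letters.
int matchingLetters(const char* a, const char* b, int m) {
    int count = 0;
    for (int j = 0; j < m; j++) {
        if (a[j] == b[j]) {   // same index on both sides
            count++;
        }
    }
    return count;
}
```

With the sample data, matchingLetters("AGGCTC", "TAGCTT", 6) gives 3, which is the value expected for the first sheep.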
Actually I've been reworking this code and fixed this specific issue, but could you tell me whether I can pass a structure array to a function? The "sheep[20].DNR[20]" part specifically.
yes.
void foo (genes* gp)
{
cout << gp[0].DNR;
}
...
foo(sheep);
Speaking of sheep, sheep is a global. That is bad practice, as
is using C-style variables hidden after the struct's closing brace. I would prefer:
struct genes{
char name[20];
char DNR[20];
int matching[20];
}; //no variables hiding here!
int main()
{
genes sheep[20]; //globals are baaad...
foo(sheep);
baaa();
}
etc.
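As a slightly fuller sketch of the same idea: an array decays to a pointer when it is passed, so the function cannot see how many elements are valid, and it usually helps to pass the count along. The function findByName and its parameters are assumptions for illustration, not part of the exercise.

```cpp
#include <cstring>

struct genes {
    char name[20];
    char DNR[20];
    int matching[20];
};

// Hypothetical helper: returns a pointer to the first sheep called `name`
// among the first `count` entries of `flock`, or nullptr if there is none.
const genes* findByName(const genes* flock, int count, const char* name) {
    for (int i = 0; i < count; i++) {
        if (std::strcmp(flock[i].name, name) == 0) {
            return &flock[i];
        }
    }
    return nullptr;
}
```

With genes sheep[20] declared in main, this would be called as findByName(sheep, n, "Doli").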
//I am not sure what you really want. Here are 2 ideas...
#include <algorithm> //sort
#include <iostream>
#include <string>
using namespace std;
struct genes{
string name;
string DNR;
int matching[20];
}sheep[20];
string matcher(string a, string b)
{ //this is deduped/sorted/matched, so ABC and BXC returns BC
static string result;
result = "";
bool ahas[256] = {false};
bool bhas[256] = {false};
for(char& c : a)
ahas[static_cast<unsigned char>(c)] = true; //cast avoids a negative index
for(char& c : b)
bhas[static_cast<unsigned char>(c)] = true;
for(char c = 'A'; c <= 'Z'; c++)
{
if(ahas[c] && bhas[c])
result+=c;
}
return result;
}
string matcher2(string a, string b)
{//this is just positional compare.
//so CBAA and ABBC returns B (second b same position, same value)
static string result;
result = "";
int i = 0;
for(char& c : a) //assumes a and b same length
if(b[i++] == c)
result+= c;
return result;
}
int main()
{
sheep[0].DNR = "TAGCTT";
sheep[1].DNR = "ATGCAA";
sheep[2].DNR = "AGGCTC";
sheep[3].DNR = "AATGAA";
sheep[4].DNR = "AAGGXX"; //0-3 are yours and give less than exciting output
sheep[5].DNR = "ATGCAA";
//example of c++ sort that we don't need explicitly now.
//The matcher is sorting and deduplicating the string using a bucket method.
sort(sheep[0].DNR.begin(), sheep[0].DNR.end());
cout << "-->" << sheep[0].DNR << endl << endl;
//sheep[0] is still sorted here, keep in mind..
cout << matcher(sheep[0].DNR, sheep[1].DNR) << endl;
cout << matcher(sheep[0].DNR, sheep[2].DNR) << endl;
cout << matcher(sheep[0].DNR, sheep[4].DNR) << endl;
cout << matcher2(sheep[0].DNR, sheep[4].DNR) << endl;
cout << matcher2(sheep[5].DNR, sheep[1].DNR) << endl;
}
-->ACGTTT
ACGT
ACGT
AG //all that match
AG //first and third letters
ATGCAA //sheep[5] and sheep[1] are identical, so every position matches
Is this anything at all like what you wanted?
#include <iostream>
#include <fstream>
#include <string>
using namespace std;
struct genes{
string name;
char DNR[20];
int coeffi=0;
};
int main()
{
genes sheep[20];
ifstream duom ("U2.txt");
ofstream rez ("U2rez.txt");
int n; // number of sheep
int m; // DNR fragment length
int e; // comparison subject
duom >> n;
duom >> m;
duom >> e;
for (int i=0; i<n; i++){
duom>>sheep[i].name;
for (int j=0; j<m; j++){
duom>>sheep[i].DNR[j];
}
}
for (int i=0; i<n; i++){
if (i!=e-1){
for (int j=0; j<m; j++){
if (sheep[e-1].DNR[j]==sheep[i].DNR[j]){
sheep[i].coeffi++;
}
}
}
}
rez << sheep[e-1].name << endl;
for (int i=0; i<n; i++){
if (i!=e-1){
rez << sheep[i].name << " " << sheep[i].coeffi << endl;
}
}
duom.close();
rez.close();
return 0;
}
I need to move this:
for (int i=0; i<n; i++){
if (i!=e-1){
for (int j=0; j<m; j++){
if (sheep[e-1].DNR[j]==sheep[i].DNR[j]){
sheep[i].coeffi++;
}
}
}
}
into a separate function, and then make another function that sorts the results from it.
It should sort from the highest number to the lowest and, if the results are the same, alphabetically by name.
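One possible way to split this up, as a sketch under a few assumptions: the struct is the one from the program above, e is the 1-based number of the examined sheep, and the function names countMatches, byCoeffThenName and sortResults are made up here. Because sorting reorders the array, sortResults first moves the examined sheep to index 0 and sorts only the rest, so it can still be printed first.

```cpp
#include <algorithm>
#include <string>
#include <utility>

struct genes {
    std::string name;
    char DNR[20];
    int coeffi = 0;
};

// Fills coeffi for every sheep except the examined one (1-based number e)
// with the number of positions where its fragment equals the examined one.
void countMatches(genes sheep[], int n, int m, int e) {
    for (int i = 0; i < n; i++) {
        if (i == e - 1) continue;          // do not compare the sheep to itself
        sheep[i].coeffi = 0;
        for (int j = 0; j < m; j++) {
            if (sheep[e - 1].DNR[j] == sheep[i].DNR[j]) {
                sheep[i].coeffi++;
            }
        }
    }
}

// Ordering rule: higher coefficient first; equal coefficients by name A to Z.
bool byCoeffThenName(const genes& a, const genes& b) {
    if (a.coeffi != b.coeffi) return a.coeffi > b.coeffi;
    return a.name < b.name;
}

// Moves the examined sheep to index 0, then sorts sheep[1] .. sheep[n-1].
// Call this only after countMatches, since index e-1 is no longer valid afterwards.
void sortResults(genes sheep[], int n, int e) {
    std::swap(sheep[0], sheep[e - 1]);
    std::sort(sheep + 1, sheep + n, byCoeffThenName);
}
```

After countMatches(sheep, n, m, e) and sortResults(sheep, n, e), sheep[0] is the examined sheep and sheep[1] to sheep[n-1] are in output order. For the sample input that should give Bailioji 3, Baltukas 3, Smarkuolis 1.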