Reading coordinates from Matrix file without 2D arrays

Hi all,

I'm working on a program for a bioinformatics project that reads several files, including a matrix, and writes its results to another file. The part I'm struggling with is reading the matrix file like a coordinate system (for lack of a better term). For example, suppose I have the following amino acid sequences:

fileA: CTTNCLAPLA
fileB: CTTNSITPVA

The program would then read the two files, compare each letter, and refer to the matrix to find the number corresponding to the two letters, which in turn determines the probability of a letter in fileA mutating to a letter in fileB.

Since the first letter in each file is C, the program would read the matrix and output in a separate file:


C
.
S

The "." means that the number from the matrix for that pair was 0, even though the two letters are not the same.

Here is part of the matrix (the rest wouldn't fit):
NOTE: The matrix I must use is in a .csv file and, I believe, does not include spaces.


_	A	R	N	D	C	Q	E	G	H	I	L	K
A	2	-2	0	0	-2	0	0	1	-1	-1	-2	-1
R	-2	6	0	-1	-4	1	-1	-3	2	-2	-3	3
N	0	0	2	2	-4	1	1	0	2	-2	-3	1
D	0	-1	2	4	-5	2	3	1	1	-2	-4	0
C	-2	-4	-4	-5	12	-5	-5	-3	-3	-2	-6	-5
Q	0	1	1	2	-5	4	2	-1	3	-2	-2	1
E	0	-1	1	3	-5	2	4	0	1	-2	-3	0
G	1	-3	0	1	-3	-1	0	5	-2	-3	-4	-2
H	-1	2	2	1	-3	3	1	-2	6	-2	-2	0
I	-1	-2	-2	-2	-2	-2	-2	-3	-2	5	2	-2
L	-2	-3	-3	-4	-6	-2	-3	-4	-2	2	6	-3
K	-1	3	1	0	-5	1	0	-2	0	-2	-3	5
M	-1	0	-2	-3	-5	-1	-2	-3	-2	2	4	0
F	-3	-4	-3	-6	-4	-5	-5	-5	-2	1	2	-5
P	1	0	0	-1	-3	0	-1	0	0	-2	-3	-1
S	1	0	1	0	0	-1	0	1	-1	-1	-3	0
T	1	-1	0	0	-2	-1	0	0	-1	0	-2	0
W	-6	2	-4	-7	-8	-5	-7	-7	-3	-5	-2	-3
Y	-3	-4	-2	-4	0	-4	-4	-5	0	-1	-1	-4
V	0	-2	-2	-2	-2	-2	-2	-1	-2	4	2	-2


I apologize if my explanation is confusing. Please let me know if you need any clarification. Any help is greatly appreciated. Thanks in advance!
You should read the entire matrix file into a two-dimensional array stored in memory. Then you can perform row-column lookups like, for example:
int val = matrix[i][j];
It would be needlessly complicated to try to read from a position in the actual matrix file.
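As a rough sketch of what loading the file could look like: the helper name loadMatrix is made up for illustration, and it assumes the file is comma-separated, that the first line is the header, and that every later line starts with a one-letter row label.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: reads a comma-separated score matrix whose first
// line is a header and whose rows each start with a one-letter label.
std::vector<std::vector<int>> loadMatrix(std::istream& in)
{
    std::vector<std::vector<int>> matrix;
    std::string line;

    std::getline(in, line);                 // skip the header line

    while (std::getline(in, line))
    {
        if (line.empty())
            continue;

        std::istringstream fields(line);
        std::string field;
        std::getline(fields, field, ',');   // discard the row label

        std::vector<int> row;
        while (std::getline(fields, field, ','))
            row.push_back(std::stoi(field)); // convert each score to int

        matrix.push_back(row);
    }
    return matrix;
}
```

With the matrix from the first post, matrix[0][0] would then hold the A/A score, provided the rows and columns appear in the same order.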
Hi yulingo, thanks for the response. I'm not familiar with 2D arrays, but by using them to perform row-column lookups for let's say, row A and column A, will that give me the number 2 as a result? Or would I have to write some other code to get that? Thanks for your time.
You have to access arrays by a numerical index. You can use an enumeration to associate the appropriate amino acid with the appropriate row/column index.

// Enum for rows
enum ROWS
{
    ROW_A = 0,
    ROW_R = 1,
    ...
};

// Convert a char to a row index
int GetRowIndex(char amino)
{
    int index;

    switch (amino)
    {
        case 'A':
            index = ROW_A;
            break;
        case 'R':
            index = ROW_R;
            break;
        ...
        default:
            index = -1;
            break;
    }

    return index;
}

...
// Read in your amino acids
int rowIdx = GetRowIndex(rowAmino);
int colIdx = GetColIndex(colAmino);

// Only index the matrix when both letters were recognised
if ((rowIdx != -1) && (colIdx != -1))
{
    int val = matrix[rowIdx][colIdx];
    // use val here
}
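If the switch gets long, one possible shortcut is to keep the letters in a string and use the position of the letter as the index. This is only a sketch: the names kAminoOrder and getIndex are made up here, the order is copied from the row labels in the first post, and it assumes the columns follow the same order as the rows (the excerpt only shows the first twelve columns).

```cpp
#include <string>

// Row order copied from the matrix excerpt in the first post.
// Assumption: the columns of the full matrix use the same order.
const std::string kAminoOrder = "ARNDCQEGHILKMFPSTWYV";

// Returns the 0-based position of the amino acid letter, or -1 if the
// letter is not in the list.
int getIndex(char amino)
{
    std::string::size_type pos = kAminoOrder.find(amino);
    return (pos == std::string::npos) ? -1 : static_cast<int>(pos);
}
```

Under that assumption the same function can serve for both the row and the column index, since both would use the same letter order.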
Thank you again for your response yulingo. I just found out from my professor that we are not allowed to use 2D arrays for this particular project.

So far, I think I have found a way to search through the top line of the matrix, find the letter I need, store the index of that letter in a variable, and search through the rest of the matrix using that index.

However, this is rather difficult due to the commas. I tried using the ignore function but it just doesn't seem to work. I also tried the getline(...,..., ',') which would basically get one character at a time.

Is there a way to either replace the commas from the file with the next character or another way to ignore them?
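One way the replacement might be done, as a sketch only: read a whole line, swap every comma for a space with std::replace, and then pull the values out with operator>>. The function name parseRow is made up for illustration, and it assumes each line looks like "A,2,-2,0" (a one-letter label followed by integers). It stores a single row in a one-dimensional vector, which may or may not be allowed by the assignment rules.

```cpp
#include <algorithm>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: turns one CSV line such as "A,2,-2,0" into its
// numeric fields by replacing commas with spaces and using operator>>.
std::vector<int> parseRow(std::string line, char& label)
{
    // After this call the line reads "A 2 -2 0".
    std::replace(line.begin(), line.end(), ',', ' ');

    std::istringstream fields(line);
    std::vector<int> values;
    int value;

    fields >> label;               // first token is the row letter
    while (fields >> value)        // remaining tokens are the scores
        values.push_back(value);

    return values;
}
```

Because operator>> skips any whitespace, the same parsing would also cope with a file that turns out to be tab-separated.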

Thanks!
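For reference, the idea described above (find the letter in the top line, remember its position, then search the remaining lines) could be sketched roughly like this without storing the matrix at all. The function name lookupScore is made up, and the sketch assumes a comma-separated file whose header starts with a placeholder cell and whose rows each start with their letter.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: looks up one score directly from a comma-separated
// matrix stream without storing the matrix. Returns true on success.
bool lookupScore(std::istream& in, char rowLetter, char colLetter, int& score)
{
    std::string line, field;

    // Step 1: find the position of colLetter in the header line.
    if (!std::getline(in, line))
        return false;

    std::istringstream header(line);
    int colIdx = -1;
    for (int i = 0; std::getline(header, field, ','); ++i)
    {
        if (!field.empty() && field[0] == colLetter)
        {
            colIdx = i;
            break;
        }
    }
    if (colIdx == -1)
        return false;               // letter not found in the header

    // Step 2: find the row labelled rowLetter and take field number colIdx.
    // The row label occupies field 0, just like the header placeholder,
    // so the same index works for both lines.
    while (std::getline(in, line))
    {
        if (line.empty() || line[0] != rowLetter)
            continue;

        std::istringstream row(line);
        for (int i = 0; std::getline(row, field, ','); ++i)
        {
            if (i == colIdx)
            {
                score = std::stoi(field);
                return true;
            }
        }
        return false;               // row was shorter than the header
    }
    return false;                   // no row with that label
}
```

When the stream is a file, it would need to be rewound before each new lookup (for example with in.clear() followed by in.seekg(0)), since every call reads from the current position.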