Hello,
I'm working on a project which involves manipulating a two-dimensional array. Please bear with me while I explain the purpose of the program. Assume that we have this table:
A        B  C  D
SUCCESS  1  9  2
FAILURE  5  9  1
SUCCESS  4  8  4
SUCCESS  0  5  7
SUCCESS  7  8  6
FAILURE  6  9  9
FAILURE  8  0  2
SUCCESS  2  0  1
FAILURE  1  2  1
FAILURE  4  7  5
FAILURE  3  3  5
"SUCCESS" and "FAILURE" are actually represented by numerical values; I only wrote the table this way to make it easier to explain. The actual tables consist entirely of double values and have several thousand records and about 20-30 columns.
The idea is to find the ranges of successive values in each column that give the highest "success rate". Something like: "when column D is between 3 and 5, column C is between 1 and 2, and column B is between 6 and 8, the success rate is 90%".
Of course we also need a minimum number of records that fulfill the above criteria.
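To make the goal concrete, here is a minimal sketch of how one such rule could be scored against the example table. The names success_rate, criteria and min_records are made up for illustration, and I am assuming SUCCESS is stored as 1 and FAILURE as 0 in column A:

```python
# Example table as (A, B, C, D) tuples; assumption: 1 = SUCCESS, 0 = FAILURE.
ROWS = [
    (1, 1, 9, 2), (0, 5, 9, 1), (1, 4, 8, 4), (1, 0, 5, 7),
    (1, 7, 8, 6), (0, 6, 9, 9), (0, 8, 0, 2), (1, 2, 0, 1),
    (0, 1, 2, 1), (0, 4, 7, 5), (0, 3, 3, 5),
]


def success_rate(rows, criteria, min_records=1):
    """Score one rule.

    criteria maps a column index (1 = B, 2 = C, 3 = D) to an inclusive
    (low, high) range. Returns (rate, count); rate is None when fewer
    than min_records rows satisfy every range.
    """
    matched = [
        row for row in rows
        if all(low <= row[col] <= high for col, (low, high) in criteria.items())
    ]
    count = len(matched)
    if count == 0 or count < min_records:
        return None, count
    successes = sum(row[0] for row in matched)
    return successes / count, count


# Rule "column B is between 0 and 2", requiring at least 3 matching records.
rate, count = success_rate(ROWS, {1: (0, 2)}, min_records=3)
print(rate, count)
```

A rule over several columns is just a larger criteria dict, e.g. {1: (0, 2), 2: (5, 9)} for "B between 0 and 2 and C between 5 and 9".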
Anyway, I'm not looking for code here, just some ideas on how I should implement it. I'm currently thinking about this: begin by sorting the table by one column. E.g., let's start with column B:
A        B  C  D
SUCCESS  0  5  7
SUCCESS  1  9  2
FAILURE  1  2  1
SUCCESS  2  0  1
FAILURE  3  3  5
SUCCESS  4  8  4
FAILURE  4  7  5
FAILURE  5  9  1
FAILURE  6  9  9
SUCCESS  7  8  6
FAILURE  8  0  2
Then, get all possible runs of successive records, as long as the number of records in a run is above a certain minimum, and copy each run to a new array. E.g., get all records where column B is 1-2, then all records where column B is 1-3, then 1-4, 1-5, 1-6, 1-7, 1-8, 2-3, 2-4, 2-5, 2-6, 2-7, 2-8, etc.
In each of those new arrays, compute the ratio of "SUCCESS" records, and if it is above a certain threshold, report it. Then continue by doing the same thing for column C, etc.
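For a single column, the sort-and-scan step described above could be sketched like this. This is only a rough illustration, not a finished design: scan_column, min_records and min_rate are hypothetical names, and SUCCESS/FAILURE are assumed to be stored as 1/0 in column A. Instead of copying each run into a new array, this variant keeps a running count of successes over the sorted rows, so the ratio for any range is one subtraction and one division:

```python
# Example table as (A, B, C, D) tuples; assumption: 1 = SUCCESS, 0 = FAILURE.
ROWS = [
    (1, 1, 9, 2), (0, 5, 9, 1), (1, 4, 8, 4), (1, 0, 5, 7),
    (1, 7, 8, 6), (0, 6, 9, 9), (0, 8, 0, 2), (1, 2, 0, 1),
    (0, 1, 2, 1), (0, 4, 7, 5), (0, 3, 3, 5),
]


def scan_column(rows, col, min_records, min_rate):
    """Return (low, high, count, rate) for every inclusive value range of
    column `col` with at least min_records rows and rate >= min_rate."""
    ordered = sorted(rows, key=lambda row: row[col])

    # prefix[i] = number of successes among the first i sorted rows.
    prefix = [0]
    for row in ordered:
        prefix.append(prefix[-1] + row[0])

    # For each distinct value, the slice [first, last) it occupies in `ordered`.
    first, last = {}, {}
    for i, row in enumerate(ordered):
        first.setdefault(row[col], i)
        last[row[col]] = i + 1

    values = sorted(first)
    results = []
    for a, low in enumerate(values):
        for high in values[a:]:
            start, end = first[low], last[high]
            count = end - start
            if count < min_records:
                continue
            rate = (prefix[end] - prefix[start]) / count
            if rate >= min_rate:
                results.append((low, high, count, rate))
    return results


# Column B (index 1): ranges with at least 4 records and a 75% success rate.
print(scan_column(ROWS, col=1, min_records=4, min_rate=0.75))
```

This only covers one column at a time; combining ranges across several columns is where the number of candidates explodes.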
Obviously this will be very time-consuming. If we have 20 columns and 10 "ranges" of values in each column, then picking one range per column already gives 10^20 combinations, and we'd have to create a new array for each of them. I have some ideas about how to reduce the number of iterations, but for the time being I wonder whether my basic idea is efficient, or whether there is a faster way to do the same thing.
If you've read to this point, congratulations :D