Hey guys, I am at the final stage of a model I am working on, where I am minimizing the variance to get the best parameter fit of my results against clinical data sets. I am using Powell's optimization method, which I have already got working for a simple case with just four data files, each having a single row.
However, my goal is to solve the model with the Runge-Kutta method for each dataset (taking the parameter values from four different input files that all have the same number of rows, one row per dataset), optimize the model parameters, and then increment the variance as appropriate.
My challenge now is that I have 4 files that contain parameters unique to each dataset, and each line in these files represents one dataset. The files all have the same number of rows, which is the total number of datasets that I have, but a varying number of columns, depending on the parameters that each file holds.
For example:
X0.dat: Each row contains the 30 initial values for the dependent variables, which I store as thirty valarray elements.
X0 X1 X2 ............ X29
X0 X1 X2 ............ X29
X0 X1 X2 ............ X29
X0 X1 X2 ............ X29
X0 X1 X2 ............ X29
X0 X1 X2 ............ X29
Inf.dat: Each row contains the species 'i', the location 'j', the value of I[ij], the time it starts, t_start, and the time it stops, t_stop. I have set every other I[ij] to 0 except the one declared in each row of this file.
Again, my Inf.dat file has the same number of rows as the X0.dat file above:
i j Iij t_start t_stop \\All for data set 1
i j Iij t_start t_stop \\ Dataset 2
i j Iij t_start t_stop
i j Iij t_start t_stop
i j Iij t_start t_stop
i j Iij t_start t_stop
time.dat: This is a file I was able to digitize from the clinical data sets; it holds the time points at which the concentrations were taken. I store the time points of each dataset as a vector of times. The first entry of each row is the number of data points in the dataset, and the time points follow.
8 t0 t1 t2 t3 t4 t5 t6 t7 \\All for data set 1
4 t0 t1 t2 t3 \\ Dataset 2
5 t0 t1 t2 t3 t4
6 t0 t1 t2 t3 t4 t5
2 t0 t1
8 t0 t1 t2 t3 t4 t5 t6 t7
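To make the time.dat layout concrete, here is a minimal sketch of reading one count-prefixed row into a vector. The helper name parseCountedRow is made up for illustration, and the sketch assumes the values are whitespace-separated numbers; anything after the first n values on a line is ignored.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: parse one row of the form "n t0 t1 ... t(n-1)".
// Returns false if the count is missing, negative, or fewer than n values follow.
bool parseCountedRow(const std::string& line, std::vector<double>& out)
{
    std::istringstream ss(line);
    int n = 0;
    if (!(ss >> n) || n < 0) return false;

    out.assign(static_cast<std::size_t>(n), 0.0);
    for (int k = 0; k < n; ++k)
        if (!(ss >> out[k])) return false;   // row is shorter than its count claims
    return true;
}
```

Calling this once per line of time.dat (for example inside a std::getline loop) would give one vector of times per dataset, each with its own length.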
Lastly, I have a set of experimental concentrations from the clinical data, in a file with the same number of rows and columns as the time file above. In this file, each row represents a clinical dataset, and the first entry on each row is the species number (0-29 from my X0) whose concentrations over time were available from that dataset.
Xclin.dat
1 X1(t0) X1(t1) X1(t2) X1(t3) X1(t4) X1(t5) X1(t6) X1(t7) \\Dataset 1
1 X1(t0) X1(t1) X1(t2) X1(t3) \\ Dataset 2
3 X3(t0) X3(t1) X3(t2) X3(t3) X3(t4)
2 X2(t0) X2(t1) X2(t2) X2(t3) X2(t4) X2(t5)
4 X4(t0) X4(t1)
5 X5(t0) X5(t1) X5(t2) X5(t3) X5(t4) X5(t5) X5(t6) X5(t7)
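One way to keep everything belonging to a dataset together might be a single struct that holds the matching row from each of the four files, filled by reading the files in lockstep. This is only a sketch: the struct Dataset, the function readDatasets and the parameter nStates are hypothetical names, and it assumes whitespace-separated numeric rows without trailing comments.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <utility>
#include <valarray>
#include <vector>

// Hypothetical container: row k of all four files ends up in one Dataset.
struct Dataset {
    std::valarray<double> x0;            // initial values (X0.dat)
    int i = 0, j = 0;                    // species and location (Inf.dat)
    double Iij = 0, ton = 0, toff = 0;   // value, start and stop times (Inf.dat)
    std::vector<double> t;               // sampling times (time.dat)
    int species = 0;                     // measured species (Xclin.dat)
    std::vector<double> xclin;           // measured concentrations (Xclin.dat)
};

// Read one line from each stream per iteration, so that all four rows
// of a dataset are parsed together. Stops at the first malformed row.
std::vector<Dataset> readDatasets(std::istream& fx0, std::istream& finf,
                                  std::istream& ftime, std::istream& fclin,
                                  std::size_t nStates = 30)
{
    std::vector<Dataset> all;
    std::string lx, li, lt, lc;
    while (std::getline(fx0, lx) && std::getline(finf, li) &&
           std::getline(ftime, lt) && std::getline(fclin, lc)) {
        Dataset d;
        std::istringstream sx(lx), si(li), st(lt), sc(lc);

        bool ok = true;
        d.x0.resize(nStates);
        for (std::size_t k = 0; k < nStates && ok; ++k)
            ok = static_cast<bool>(sx >> d.x0[k]);
        ok = ok && (si >> d.i >> d.j >> d.Iij >> d.ton >> d.toff);

        int n = 0;                       // number of time points in this row
        ok = ok && (st >> n) && n >= 0;
        if (ok) {
            d.t.resize(n);
            d.xclin.resize(n);
            ok = static_cast<bool>(sc >> d.species);
            for (int k = 0; k < n && ok; ++k)
                ok = static_cast<bool>(st >> d.t[k]) &&
                     static_cast<bool>(sc >> d.xclin[k]);
        }
        if (!ok) break;
        all.push_back(std::move(d));
    }
    return all;
}
```

Passing four std::ifstream objects opened on X0.dat, Inf.dat, time.dat and Xclin.dat would then return one vector entry per dataset, so the number of datasets is simply the size of the returned vector.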
Currently, I can handle the single-dataset case: solve the RK4 and optimize the transfer variable (not included here). I would appreciate your suggestions on how I can do this for all the datasets in each file.
I have tried a vector of structs, but I can only store the row entries of a single file.
#include <iostream>
#include <fstream>
#include <iomanip>
#include <valarray>
#include <vector>
#include <string>
#include <sstream>
using namespace std;

typedef valarray<double> val;

void read_X(ifstream &infile, val &x);   // declared for later use, not defined yet
void read_I(val &I);

// One row of the input file: species i, location j, value Iij, start and stop times.
struct Inf
{
    int i, j;
    double Iij;
    double ton, toff;
};

// Map the 2-D index (i, j) to a 1-D index.
int ijIndex(int i, int j)
{
    return 5 * (i - 1) + j;
}

// Read one line into an Inf. A malformed line (for example a blank last line)
// sets failbit, so the read loop stops instead of storing a stale record.
istream& operator>>(istream &is, Inf &s)
{
    string line;
    if (getline(is, line))
    {
        istringstream ss(line);
        if (!(ss >> s.i >> s.j >> s.Iij >> s.ton >> s.toff))
            is.setstate(ios::failbit);
    }
    return is;
}

vector<Inf> Xdata;
val Iij(20);

int main()
{
    Iij = 0;
    ifstream infile("Data.txt");
    if (!infile.is_open())
    {
        cerr << "The file cannot be opened" << endl;
        return 1;
    }
    Inf I;
    while (infile >> I) Xdata.push_back(I);
    for (const Inf &s : Xdata) cout << s.ton << " " << s.Iij << endl;
    infile.close();
    return 0;
}
Data.txt
1 1 0.1204 0.0 1.170
1 2 0.142 1.0 1.170
1 3 0.523 5.0 5.170
1 4 0.200 2.0 4.170
2 1 0.4 3.0 4.170
2 2 0.1204 0.0 1.170
2 3 0.1204 0.0 1.170
2 4 0.1204 0.0 1.170
3 1 0.1204 0.0 1.170
3 2 0.1204 0.0 1.170
3 3 0.1204 0.0 1.170
3 4 0.1204 0.0 1.170
4 1 0.1204 0.0 1.170
4 2 0.1204 0.0 1.170
4 3 0.1204 0.0 1.170
4 4 0.1204 0.0 1.170
Basically, my algorithm would be:
** in main
1. Open all the files;
2. Call Powell's method;
3. Take the user-supplied number of datasets in the four files, say N;
4. Start with i = 0;
5. Read row i from each of the 4 files and store the values as appropriate;
6. Solve the RK4 equations and update the tolerance;
7. Increment i (i++);
8. When i == N, stop and print the final value of the tolerance;
9. Otherwise, go to 5.
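The loop in the steps above could be sketched roughly as follows, with the loop over datasets placed inside the objective function that an optimizer such as Powell's method would evaluate for each trial parameter set. Everything model-specific here is an assumption made for illustration: the right-hand side dx/dt = -k x is a stand-in for the real model, there is a single parameter k, integration is assumed to start at t = 0, the "tolerance" is taken to be the sum of squared residuals, and the names Dataset, rk4Step and objective are hypothetical.

```cpp
#include <cstddef>
#include <functional>
#include <valarray>
#include <vector>

using State = std::valarray<double>;
using Rhs = std::function<State(double, const State&)>;

// One classical fourth-order Runge-Kutta step of size h.
State rk4Step(const Rhs& f, double t, const State& x, double h)
{
    State k1 = f(t, x);
    State k2 = f(t + 0.5 * h, x + 0.5 * h * k1);
    State k3 = f(t + 0.5 * h, x + 0.5 * h * k2);
    State k4 = f(t + h, x + h * k3);
    return x + (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4);
}

// Hypothetical dataset, reduced to what the loop needs.
struct Dataset {
    State x0;                    // initial values
    std::vector<double> t;       // sampling times, ascending
    int species = 0;             // index of the measured species
    std::vector<double> xclin;   // measured concentrations at those times
};

// Sum of squared residuals over all datasets for the trial parameter k.
double objective(double k, const std::vector<Dataset>& data,
                 int stepsPerInterval = 100)
{
    // Stand-in model (assumption): first-order decay of every species.
    Rhs f = [k](double, const State& x) { return State(-k * x); };

    double sum = 0.0;
    for (const Dataset& d : data) {              // steps 4 to 9: loop over datasets
        State x = d.x0;
        double t = 0.0;
        for (std::size_t m = 0; m < d.t.size(); ++m) {
            // Integrate from the current time to the next sampling time.
            double h = (d.t[m] - t) / stepsPerInterval;
            for (int s = 0; s < stepsPerInterval; ++s) {
                x = rk4Step(f, t, x, h);
                t += h;
            }
            t = d.t[m];                          // remove accumulated round-off
            double r = x[d.species] - d.xclin[m];
            sum += r * r;                        // accumulate the residual
        }
    }
    return sum;
}
```

With this arrangement the optimizer only ever sees one function of the parameters, and each call to it walks through all N datasets, which avoids restarting the optimizer per dataset.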