Best way to store records / fields of csv-files in memory?

Jan 15, 2019 at 8:58pm
Hello,
my CSV file has 1000 records, and every record has 15 fields separated by "/".
Now I want to analyse this data, and I'm not sure how to store it in memory for fast access.
Should I store the records in a vector of objects, or what is your advice?
What is the easiest way to read all the fields of a record at once into a structure / object?

Thank you
Jan 15, 2019 at 9:05pm
Sure, store it in a std::vector<Record>.

e.g.
struct Record {
    std::string name;
    int age;
    // etc. (the remaining fields, 15 in total)
};


If you're asking about how to actually parse the file, use getline.
https://en.cppreference.com/w/cpp/string/basic_string/getline

getline can take in a delimiter as its 3rd argument.
// Example program
#include <iostream>
#include <string>
#include <sstream>
#include <fstream>
int main()
{
    // replace this with your file:
    std::istringstream f("do/you/understand?");
    //std::ifstream f("test.txt");
    
    for (int i = 0; i < 3; i++)
    {
        std::string token;
        std::getline(f, token, '/');
        std::cout << token << '\n';
    }
}


Another example:
// Example program
#include <iostream>
#include <string>
#include <sstream>
#include <fstream>
#include <vector>

struct Record {
    std::string foo;
    std::string bar;
    std::string bah;
};

int main()
{
    // replace this with your file:
    std::istringstream f("test1/patient1/1/test2/patient2/2");
    //std::ifstream f("test.txt");
    
    int num_records = 2;
    std::vector<Record> records(num_records);
    
    for (int i = 0; i < num_records; i++)
    {
        Record rec;
        std::string token;
        
        std::getline(f, token, '/');
        rec.foo = token;
                
        std::getline(f, token, '/');
        rec.bar = token;    
        
        std::getline(f, token, '/');
        rec.bah = token;
        
        records[i] = rec;
    }
    
    for (int i = 0; i < num_records; i++)
    {
        std::cout << records[i].foo << " " << records[i].bar << " " << records[i].bah << '\n';
    }
}
Output:
test1 patient1 1
test2 patient2 2


I'm using std::istringstream because it keeps the example self-contained and easy to post here. You probably want to replace std::istringstream with std::ifstream if you are reading from a file.
Last edited on Jan 15, 2019 at 9:14pm
Jan 16, 2019 at 6:59pm
Hello,
thanks for the examples. The second one is perfect for my requirements, and it's easy for me to understand.
Jan 16, 2019 at 8:21pm
One more question: how about using "new" to put the bulk of the data on the heap instead of the stack?
Jan 16, 2019 at 8:28pm
A vector places the "data" on the heap already.
Jan 16, 2019 at 8:43pm
formatyes, why not try researching it, and if you can't figure it out, show your attempt to us?

The boilerplate looks like this:
int* arr = new int[size];
// access arr[index]
delete[] arr; // always 'delete[]' anything you 'new[]'

But as jlb said, use vectors to avoid these pitfalls.
Last edited on Jan 16, 2019 at 8:44pm
Jan 17, 2019 at 6:03am
omg ... yes, I remember ... I read about the heap and vectors, but I forgot it.
Now I'm already working with your information and I'm happy with the first results :-)
thank you