reading tabulated data

May 26, 2011 at 8:59am

Hello guys,

I'm a complete C++ newbie and am experiencing some difficulties.

I want to read tabulated data and store it in a 2D array. Sounds quite easy.

I've got a piece of code from a friend:

std::ostringstream inputFile;
inputFile << "path/filename";
std::ifstream inputData;
inputData.open(inputFile.str().c_str());

for (int i=0; i<numberOfRows; i++)
{
  for (int j=0; j<numberOfColumns; j++)
  {
      inputData >> Field[i][j];
  }
}

1. What does .str().c_str() mean?
2. If I don't know how many columns and rows there are, how can I detect line breaks and EOF and use them instead of numberOfColumns/numberOfRows?

Regards,
linch

Last edited on May 26, 2011 at 8:59am

May 26, 2011 at 9:48am

hamsterman (4538)

.str().c_str() first converts the ostringstream to a string, and then the string to a const char*. There is no reason for the stream in this code, though. Remove the first two lines and change the fourth to inputData.open( "path/filename" );
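
A minimal sketch of those two conversions, in case it helps to see them separately. The helper names pathViaStream and asCString are made up for illustration and are not part of the original snippet.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper mirroring the first two lines of the original snippet:
// the path is written into an ostringstream and then taken back out again.
std::string pathViaStream(const char* path)
{
    std::ostringstream inputFile;
    inputFile << path;        // write the text into the in-memory stream
    return inputFile.str();   // ostringstream -> std::string (a copy of its contents)
}

// std::string -> const char*. The pointer refers to the string's own buffer,
// so it stays valid only while that string is alive and unmodified.
const char* asCString(const std::string& s)
{
    return s.c_str();
}
```

The string that comes out is the same text that went in, which is why passing the literal straight to open() does the same job here.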

The second one is more complicated.
If there are no spaces between the last word and the newline character, you can simply check the next character with .peek() or .get(). It will be ' ' (a space), '\t' (a tab) or '\n' (a newline, which is what you are looking for).
However, if there may be a space at the end of a line, this will fail, as it only checks one character.
One solution is to put .get (but not .peek, which does not consume the character) in a loop to discard any ' 's or '\t's.
Another is to read a string with getline, build a stringstream from that string and read words from that stringstream.
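
The getline-plus-stringstream approach could look roughly like this. It is only a sketch: countWords and nextRowWidth are hypothetical names, and it assumes words are separated by spaces or tabs.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: counts whitespace-separated words in one line.
std::size_t countWords(const std::string& line)
{
    std::istringstream words(line);
    std::string word;
    std::size_t count = 0;
    while (words >> word)   // operator>> skips spaces and tabs, fails at end of line
        ++count;
    return count;
}

// Hypothetical helper: reads the next line from `in` and reports its word count.
// Returns false once there are no more lines.
bool nextRowWidth(std::istream& in, std::size_t& width)
{
    std::string line;
    if (!std::getline(in, line))
        return false;
    width = countWords(line);
    return true;
}
```

Because each line gets its own stringstream, a trailing space or tab before the newline does no harm: the inner loop simply ends when the line runs out.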

May 26, 2011 at 10:43am

linch (16)

Thanks Hamsterman for the fast reply.

Does the new code look OK?

//inputFile << "path/filename";
std::ifstream inputData;
inputData.open("path/filename");

int i=0;
while (inputData.good())
{
  int j=0;
  while((inputData.good())&&(inputData.peek()!='\n'))
  {
      inputData >> Field[i][j];
      j++;
  }
  i++;
}

In this case I will probably need another loop before this one to somehow size Field to fit the table...

And could you please explain briefly what ostringstream is? I looked here: www.cplusplus.com/reference/iostream/ostringstream/ but I didn't get it.

Last edited on May 26, 2011 at 11:56am

May 26, 2011 at 11:53am

hamsterman (4538)

Sort of ok (except the first line).
As you say, you'll have problems if you don't know the size you need. The thing to do here is use an std::vector (are you familiar with it?).
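
For what it's worth, a sketch of how std::vector removes the need to know the size up front. Row, Table and readTable are hypothetical names, and the sketch assumes the file holds numbers separated by spaces or tabs, one table row per line.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

typedef std::vector<double> Row;
typedef std::vector<Row> Table;

// Hypothetical helper: reads whitespace-separated numbers, one table row per line.
// Neither the row count nor the column count has to be known in advance.
Table readTable(std::istream& in)
{
    Table table;
    std::string line;
    while (std::getline(in, line))       // one iteration per line of input
    {
        std::istringstream cells(line);
        Row row;
        double value;
        while (cells >> value)           // stops at end of line or at non-numeric text
            row.push_back(value);
        if (!row.empty())                // skip blank lines
            table.push_back(row);
    }
    return table;
}
```

Pass it a std::ifstream opened on the file; afterwards table[i][j] is used much like Field[i][j], and table.size() and table[0].size() give the number of rows and columns.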
In ostringstream, "o" stands for output, "string" means that it can be built from and converted to a string, and "stream" is a way of handling data, sort of like a queue. Anyway, you can think of it as a "std::cout" that does not print to the console.
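
A small sketch of the "std::cout that does not print" idea. The function name describePoint and the text it produces are made up for illustration.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: formats a label and a value exactly as one would with
// std::cout, but the text ends up in a string instead of on the console.
std::string describePoint(int row, int column, double value)
{
    std::ostringstream out;
    out << "Field[" << row << "][" << column << "] = " << value;
    return out.str();   // everything written so far, as a std::string
}
```

This is where an ostringstream earns its keep: mixing text and numbers into one string, for example to build a file name that contains an index.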

May 26, 2011 at 12:03pm

linch (16)

As you say, you'll have problems if you don't know the size you need. The thing to do here is use an std::vector (are you familiar with it?).

As I said, I am an absolute C++ newcomer and I'm not familiar with anything :) I only have some C experience. I'll search for std::vector and post my questions here if I need help.

Thank you once again!

May 26, 2011 at 12:22pm

linch (16)

If I got it right, a vector's memory is dynamically reallocated, so it can "grow" when new elements are added, right?

It could be an alternative to arrays, but I'm not sure if it is efficient to continuously reallocate the memory while reading the data because the file might be pretty large.

Is there a quick way to identify the number of rows in the file and the number of columns (words) in the first row without looping over the whole file contents?

May 26, 2011 at 12:40pm

hamsterman (4538)

1. right.

2. while there is some overhead, vector is a well written and optimized class. It won't copy the whole array every time you call push_back.

3. no.

May 26, 2011 at 1:21pm

linch (16)

www.cplusplus.com/reference/stl/vector:

...their elements can be accessed not only using iterators but also using offsets on regular pointers to elements

It means, that vector elements are located in the same memory block, right?
Namely you said:

while there is some overhead, vector is a well written and optimized class. It won't copy the whole array every time you call push_back

but I still can't imagine how it can work. For example, if I create a vector with 10 elements, is memory for e.g. ~15 elements being allocated? Even well written and well optimized code can't predict how many elements are still to be added... So if I start with one element and continuously increase the number of elements up to e.g. 10,000, will memory be reallocated 9,999 times or not? It seems so if I read the push_back description:

Adds a new element at the end of the vector, after its current last element. The content of this new element is initialized to a copy of x.
This effectively increases the vector size by one, which causes a reallocation of the internal allocated storage if the vector size was equal to the vector capacity before the call. Reallocations invalidate all previously obtained iterators, references and pointers

3. no.

It's a pity... I've just thought: I'll also have to read the axes before I start with the data. These are files with only one row or column. So I just have to count all separators (tabs, spaces and newline characters) to obtain the number of stored points. What would be the easiest way to do that?

Last edited on May 26, 2011 at 1:38pm

May 26, 2011 at 1:52pm

hamsterman (4538)

A vector has a size and a capacity. Size is the number of elements the vector holds; capacity is the number of elements it can hold without reallocating. When it has to reallocate, a vector allocates more memory than the old size+1 needs. You can write a program that push_backs things into a vector and prints size and capacity to see how it behaves.
My point was that you shouldn't worry much about this though.
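
Such an experiment might be sketched like this. countReallocations is a hypothetical name; the exact growth pattern is up to the standard library implementation, so the numbers differ between compilers.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: pushes `count` elements and counts how many times
// the capacity changed, i.e. how many reallocations took place.
std::size_t countReallocations(std::size_t count)
{
    std::vector<int> v;
    std::size_t reallocations = 0;
    std::size_t lastCapacity = v.capacity();
    for (std::size_t i = 0; i < count; ++i)
    {
        v.push_back(static_cast<int>(i));
        if (v.capacity() != lastCapacity)   // capacity only changes on reallocation
        {
            ++reallocations;
            lastCapacity = v.capacity();
        }
    }
    return reallocations;
}
```

For 10,000 elements the result is a few dozen at most, not 9,999, because each reallocation grows the capacity by a factor rather than by one. If the final size is known beforehand, calling reserve() first avoids the reallocations altogether.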

What would be the easiest way to do that?

A simple loop with .get() in it. I suppose you only need to find gaps in the first line, and the '\n's in the rest.

May 30, 2011 at 8:50am

linch (16)

Thanks a lot

May 31, 2011 at 11:15am

linch (16)

Well, it works for the first file, but closing the file doesn't work
inputData.close;
gives a compile error:

error: statement cannot resolve address of overloaded function

And if I open a new file without closing the old one
inputData.open("path/anotherFile");
inputData.good() stays false

Edit: sorry, my fault. The right way to close is:
inputData.close();
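
For anyone reusing one ifstream for several files, a sketch of the sequence. reopen is a hypothetical helper name; the clear() call is an extra precaution that matters on older (pre-C++11) libraries, where a successful open() did not reset error flags left over from the previous file.

```cpp
#include <fstream>

// Hypothetical helper: points an existing ifstream at another file.
bool reopen(std::ifstream& in, const char* path)
{
    if (in.is_open())
        in.close();   // note the parentheses: close is a function call
    in.clear();       // reset eofbit/failbit left over from the previous file
    in.open(path);
    return in.is_open();
}
```

Calling open() on a stream that is still open fails and sets failbit, which matches the good() == false behaviour described above.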

Last edited on May 31, 2011 at 11:19am

Topic archived. No new replies allowed.
