Read the last 5 lines of a file using STL fstream

Like the tail command on Linux: is it possible to implement this using C++ STL fstream?

I'd like to do a reverse lookup for the last 5 lines (that is, without reading the whole file into an array or skipping the first N-5 lines).

Any ideas?
Thanks
The following is a quick example of how it could be done that I typed up a few years ago:

#include <algorithm>  // std::min
#include <iostream>
#include <fstream>
#include <cstdlib>
#include <vector>
#include <stack>
#include <string>

class ReverseReader
{
  public:
    ReverseReader(const std::string& fileName,int numberOfLines,int estimatedAvgLineLength=80)
    : stream(fileName.c_str(),std::ios::ate|std::ios::binary), currentLine(0), cpos(-1), totalLines(numberOfLines),
      avgLineLength(estimatedAvgLineLength), lastLinePos(0)
    {
      if (stream.tellg()==0)return;
      try {while (currentLine<totalLines)readLine();}
      catch (int) {}
      printAllLines();
    }

  private:
    void fillVector()
    {
      int blockSize=std::min(int((totalLines-currentLine)*avgLineLength*1.3),int(stream.tellg()));
      if (blockSize<=0 || !stream.good())return;
      stream.seekg(-blockSize,std::ios::cur);
      buf.resize(lastLinePos);
      buf.insert(buf.begin(),blockSize,0);
      stream.read(&buf[0],blockSize);
      stream.seekg(-blockSize,std::ios::cur);
      cpos+=blockSize;
      lastLinePos+=blockSize;
    }

    char peek()
    {
      if (cpos<0)fillVector();
      if (cpos<0)throw int();
      return buf[cpos];
    }

    char readChar()
    {
      char ch=peek();
      cpos--;
      return ch;
    }

    void readLine()
    {
      try {while (readChar()!='\n');}
      catch(int)
      {
        if (lastLinePos-cpos-1)output(std::string(&buf[cpos+1],lastLinePos-cpos-1));
        throw;
      }
      output(std::string(&buf[cpos+2],lastLinePos-cpos-2));
      while (peek()=='\r')readChar();
      currentLine++;
      lastLinePos=cpos+1;
    }

    void output(const std::string& str)
    {
      if (currentLine==0 && str.empty())currentLine--;
      else lines.push(str);
    }

    void printAllLines()
    {
      while (lines.size()>0)std::cout << lines.top() << std::endl,lines.pop();
    }

    std::ifstream stream;
    int currentLine;
    int cpos;
    int totalLines;
    int avgLineLength;
    std::vector<char> buf;
    std::stack<std::string> lines;
    int lastLinePos;
};

int main(int argc,char** argv)
{
  if (argc<2)return 1;
  int n=argc>=3 ? atoi(argv[2]) : 0;
  if (n==0)n=10;
  ReverseReader reader(argv[1],n);
}


You could modify that to suit your own needs.
Thanks Athar, it seems like a lot of work, even using the STL.
Why not just push the lines onto a deque, and delete the ones on the other side so as to keep the length at 5?
Or write something that reads the file backwards.
Why not just push the lines onto a deque, and delete the ones on the other side so as to keep the length at 5?

Because that would entail reading the entire file, which is usually not an option when you resort to the tail command.
Hmm, but yours seems to read the whole file too... Or am I missing something? I don't see any seek commands in there! To do this I would write a reversefgetc function like this:
int reversefgetc(FILE *f){
	fseek(f,-1,SEEK_CUR);
	int c=fgetc(f);
	fseek(f,-1,SEEK_CUR);
	return c;
}

And then go back until I reached the nth '\n' and then output the file starting right after that '\n'.
Ok here it is:
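#include <stdio.h>

/* Step back one byte and return it; the file position ends up on that byte. */
int reversefgetc(FILE *f){
	if(fseek(f,-1,SEEK_CUR))return EOF; /* already at the start of the file */
	int c=fgetc(f);
	fseek(f,-1,SEEK_CUR);
	return c;
}

int main(int argc,char **argv){
	int i=0,c,n=5;
	if(argc<2){
		fputs("You need to enter some file!\n",stderr);
		return 1;
	}
	if(argc>=3)sscanf(argv[2],"%d",&n);
	FILE *f=fopen(argv[1],"r");
	if(!f){
		perror(argv[1]);
		return 1;
	}
	fseek(f,0,SEEK_END);
	/* Walk backwards until n+1 newlines were seen (the first is usually the file's final '\n'). */
	while(i<=n&&(c=reversefgetc(f))!=EOF)
		if(c=='\n')i++;
	if(i>n)fgetc(f); /* skip the newline we stopped on; at the start of the file keep the first byte */
	while((c=fgetc(f))!=EOF)putchar(c);
	fclose(f);
	return 0;
}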

This only reads the part that is output, although it reads it twice.
Oh, but it does seek ;)
The program is trying to guess how many bytes it has to read to cover the desired number of lines and reads the data in large blocks from the end of the file.

Of course, you could do it character by character, but the overhead would be rather large.
There won't be any difference if you read just 5 lines, but it becomes very noticeable if you read thousands of lines.
Personally, I use tail for log files that are too large to fully load into a text editor and I generally need a few thousand lines of context to diagnose a particular problem.

Locally, reading the data in chunks is up to a few hundred times faster than reading char by char.
But where it really matters is when reading a file from the network (e.g. a log file in a directory on a server mounted with SFTP). In that case the character-by-character method is far too slow for practical use.
Right, so you'd have to make a better version of reversefgetc that does some buffering.
I thought that the standard only defines fseek in text mode for seeking to the beginning of the file or to positions previously recorded with ftell. Anything else is implementation-defined or something like that. The same goes for the fstream family. You could open the file in binary mode, but then you have to handle line termination sequences manually. (This may or may not be desirable anyway.)
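As a rough sketch of what handling the line terminators manually could mean: the helper below (splitLines is a made-up name) takes bytes that were read in binary mode and splits them into lines, accepting both "\n" and "\r\n". It assumes the data fits in a std::string and ignores old Mac-style lone "\r" endings.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: splits raw bytes read in binary mode into lines,
// treating both "\n" and "\r\n" as line terminators.
std::vector<std::string> splitLines(const std::string& raw)
{
    std::vector<std::string> lines;
    std::size_t start = 0;
    while (start < raw.size()) {
        std::size_t end = raw.find('\n', start);
        if (end == std::string::npos) end = raw.size();  // last line without terminator
        std::string line = raw.substr(start, end - start);
        if (!line.empty() && line.back() == '\r') line.pop_back();  // drop the CR of CRLF
        lines.push_back(line);
        start = end + 1;
    }
    return lines;
}
```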

Regards