Counting characters in several files

I want to know how many characters, words or lines I've got in my source code for something else I'm writing without having to count each file separately and then add them up in my head. So I'm writing a program to do it for me -- sort of like the UNIX wc program except I want to be able to count up to 512 files in total.

I can't think of a condition for my while loop:
void countchars(std::string infiles) {
    int x[512], total, cfno = total = 0;
    int i, j = i = 0;

    // Get total amount of files
    while (j < int(infiles.length()))
        if (infiles[j] != ';')
            j++;

    std::string cfname = "";

    while (cfno < j) {
        // Get current filename
        while (infiles.find(";") == std::string::npos) {
            cfname += infiles[i];
            ++i;
        }

        // The current file's number
        ++cfno;

        std::ifstream curFile;
        curFile.open(cfname.c_str());

        if (curFile.is_open())
            while (!curFile.eof()) {
                curFile.get();
                ++(x[cfno]); // Counted one more character
            }
    }

    // Got characters per file, give total characters in all files:
    while (i++ < int(sizeof(x))) {
        total += x[i];
        if (x[i] != 0)
            std::cout << "Characters in file #" << i << ": " << x[i] << "\n";
    }

    std::cout << "Total characters in all files: " << total << std::endl;
}


The files in the string "infiles" are semi-colon-separated.

Am I doing this right? I'm not sure I am :l
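
As posted, the first loop never advances j once it reaches a ';', and the condition infiles.find(";") == std::string::npos does not depend on i, so neither loop can finish normally. One possible way to sidestep hand-written index loops is to let std::getline split the string on ';'. This is only a sketch; splitFileList is a made-up name, not part of the program above.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not from the original program): split a
// semicolon-separated list of file names into a vector.
std::vector<std::string> splitFileList(const std::string& infiles) {
    std::vector<std::string> names;
    std::istringstream in(infiles);
    std::string name;
    // std::getline with a custom delimiter reads up to the next ';'.
    while (std::getline(in, name, ';')) {
        if (!name.empty())          // skip empty entries such as "a;;b"
            names.push_back(name);
    }
    return names;
}
```

The size of the returned vector then gives the number of files, so no fixed 512-element array is needed.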
I was frustrated by the same problem of counting the lines of code across all of my files, so I wrote a simple KSH script. I'm not sure whether you specifically want to do this in C/C++ or whether a simple script would work for you. If a script will do, here is the code:

#!/bin/ksh

# resultant number of lines
typeset -i result=0

# every entry in the current directory that is not itself a directory
for n in *
do
    if [[ ! -d $n ]]
    then
        result+=$(wc -l < "$n")
    fi
done

echo $result
That's pretty cool. I might try doing it your way; I like the idea of writing a full program to do it but your way might be better.

If I can't figure this out then I'll do it your way, thanks :P

Edit: I can't believe I didn't think of using a vector to store the files!
So it's going ok. I have the line counting function, it's not throwing ridiculous errors at me this time and seems to be working.

I must be doing something simple wrong:
chris@chris:~/Projects/C++/mash/multiwc/mwc/bin/Release$ ./mwc -l -v -in file.txt
mwc: error no usable arguments
mwc: error no usable arguments
Amount of lines in file #0: 1
Amount of lines in file #1: 0
Amount of lines in file #2: 21
Amount of lines in file #3: 0
Amount of lines in file #4: 607676608
Total amount of lines in all 4 files: 15


This is the actual file:
H
e
l
l
o

w
o
r
l
d
!

This file contains 14 lines!


Now obviously my program is wrong... I'm guessing it's the for loop in main() suggesting that all arguments are files and then the program trying to open all of them. I'm not sure what to do about it though.

Here is the source:
main.cpp
#include "mwc.h"

int main(int argc, char*  argv[]) {
    if (argc < 4) {
        showmessage();
        return 0;
    }

    std::vector <std::string> infiles;
    std::string curArg, mode;
    mode = curArg = "";
    bool verbose = false;

    for (int i = 0; i < argc; ++i) {
        curArg = argv[i];

        if (curArg == "-eg" || curArg == "--show-example") {
            infiles.push_back("source/main.cpp");
            infiles.push_back("source/mwc.h");
            infiles.push_back("source/count.cpp");
            countlines(infiles, verbose);
            break;
        } else if (curArg == "-c" || curArg == "--count-characters") {
            mode = "char";
        } else if (curArg == "-w" || curArg == "--count-words") {
            mode = "word";
        } else if (curArg == "-l" || curArg == "--count-lines") {
            mode = "line";
        } else if (curArg == "-in") {
            for (int j = i; j < argc; ++j) {
                infiles.push_back(std::string(argv[j]));
                infiles.push_back(";");
            }
        } else if (curArg == "-v" || curArg == "--verbose") {
            verbose = true;
        } else if (curArg == "-h" || curArg == "--help") {
            showmessage();
        }
    }

    if (mode == "char")
        countchars(infiles, verbose);
    else if (mode == "word")
        countwords(infiles, verbose);
    else if (mode == "line")
        countlines(infiles, verbose);

    std::cout << std::endl;
    return 0;
}

void error(std::string errmsg, bool showno) {
    std::cerr << "mwc: error " << errmsg;
#ifdef errno
    if (showno)
        std::cerr << " errno: " << errno;
#endif
    std::cout << std::endl;
}

void showmessage() {
    std::cout << "Multi word-count -- count characters, words or lines in specified files.\n\n";
    std::cout << "Usage:\n\tmwc [mode] [options] [-in [infiles]]\n";
    std::cout << "Modes:\n\t-eg | --show-example\n\t\tShow an example.\n\t";
    std::cout << "-c | --count-characters\n\t\t";
    std::cout << "Counts characters in file(s).\n\t";
    std::cout << "-w | --count-words\n\t\tCounts words in file(s).\n\t";
    std::cout << "-l | --count-lines\n\t\tCounts lines in file(s).\n";
    std::cout << "Options:\n\t-v | --verbose\n\t\tToggle verbosity.\n";
    std::cout << "Infiles:\n\t -in [infiles] -- space separated arbitrary"
              << " amount of infiles." << std::endl;
}
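
About the guess that main() treats every argument as a file: in the listing above the inner loop starts at j = i, so "-in" itself is pushed onto infiles, and a ";" is pushed after every entry. For ./mwc -l -v -in file.txt that gives four entries ("-in", ";", "file.txt", ";"), which appears to match the "all 4 files" in the output. A minimal sketch of one way to collect only the names that follow -in; collectInfiles and the rule "stop at the next argument starting with '-'" are assumptions for illustration.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: collect the arguments that follow "-in",
// stopping at the next argument that begins with '-'.
std::vector<std::string> collectInfiles(int argc, const char* const argv[]) {
    std::vector<std::string> infiles;
    for (int i = 1; i < argc; ++i) {              // argv[0] is the program name
        if (std::string(argv[i]) == "-in") {
            for (int j = i + 1; j < argc; ++j) {  // start after "-in" itself
                std::string arg = argv[j];
                if (!arg.empty() && arg[0] == '-')
                    break;                        // next option: stop collecting
                infiles.push_back(arg);
            }
        }
    }
    return infiles;
}
```

Only real file names end up in the vector this way, so the counting functions never try to open "-in" or ";".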


mwc.h
#ifndef _MWC_H_
#define _MWC_H_

// Includes:

#include <cstdio>
#include <errno.h>
#include <fstream>
#include <iostream>
#include <vector>

// Function prototypes by file:

// main.cpp
int main(int, char**);
void error(std::string, bool);
void showmessage();

// count.cpp
void countchars(std::vector <std::string>, bool);
void countwords(std::vector <std::string>, bool);
void countlines(std::vector <std::string>, bool);

#endif 


count.cpp
#include "mwc.h"

void countlines(std::vector <std::string> infiles, bool verbose) {
    size_t fno = infiles.size(); // Amount of files
    std::string fname, line;
    int i, total = i = 0, perfile[fno];

    std::ifstream f;

    while (i++ < int(fno)) {
        fname = infiles.back(); // Current filename is last on vector (LIFO vector)

        f.open(fname.c_str());

        if (f.is_open()) {
            while (!f.eof()) {
                std::getline(f, line);
                ++total;        // One more line in total
                ++(perfile[i]); // One more line in file i
            }
        }

        f.close();

        infiles.pop_back();     // Remove filename we just used
    }

    i = 0;

    do {
        if (verbose) {
             std::cout << "Amount of lines in file #" << i;
             std::cout << ": " << perfile[i] << "\n";
        }
    } while (i++ < int(fno));

    std::cout << "Total amount of lines in all " << fno << " files: ";
    std::cout << total << std::endl;
}

void countwords(std::vector <std::string> infiles, bool verbose) {
}

void countchars(std::vector <std::string> infiles, bool verbose) {
}
One problem I see is that perfile[i] is uninitialized; set each element to 0 before incrementing it.
The other problem is while (i++ < int(fno)): i is incremented before the loop body runs, so the body sees indices 1 to fno, which skips perfile[0] and writes one element past the end of the array.

I would recommend incrementing i inside the loop instead of in the condition. I would also recommend declaring const size_t fno = infiles.size(), since you are using fno to declare the size of an array.

Hope this helps!
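
Both points can be sketched with a std::vector<int>, which starts out zeroed and knows its own size. Note that int perfile[fno] with a size only known at run time is a compiler extension rather than standard C++, with or without const. The helper names makeCounters and sumCounters are invented for this example.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: one counter per file, every element starting at 0.
std::vector<int> makeCounters(std::size_t fno) {
    return std::vector<int>(fno, 0);
}

// Hypothetical helper: add up the per-file counters.
int sumCounters(const std::vector<int>& perfile) {
    int total = 0;
    // The increment sits in the for header, so the body sees 0 .. size-1.
    for (std::size_t i = 0; i < perfile.size(); ++i)
        total += perfile[i];
    return total;
}
```

In countlines this would amount to replacing the array with std::vector<int> perfile(fno, 0) and using the same style of loop for counting and printing.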
Oh right, thanks.

It sort of works now:
$ ./mwc -l -v -in file.txt
Amount of lines in file #0: 15
Total amount of lines in all 1 file(s): 15


Somewhere the amount of lines must be getting incremented too much or something. I can probably do the rest myself, thanks :)

I would be grateful if anyone could work out where the following bugs are coming from:

1. There is one more line than there should be. Rather than jump on a quick fix (such as always decrementing the amount of lines by 1) I'd like to find the root of the problem.
2. The amount of characters and words have no real relation to the true amount. It reports 15 lines, 39 characters and 4 words, which is wrong.
3. When I use the -eg (show example) option, which is supposed to report the total amount of lines of the source code I get
Amount of lines in file #0: 15 File: source/mwc.h
Amount of lines in file #1: 145 File: source/main.cpp
Amount of lines in file #2: 74 File: source/count.cpp
Amount of lines in file #3: 29 File: source/fexist.cpp
Total amount of lines in all 4 file(s): 263

however according to wc that should be
28 source/mwc.h
73 source/main.cpp
144 source/count.cpp
14 source/fexist.cpp
259 total
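
The per-file numbers in the -eg output look like wc's results in reverse order, each one too high by one (15 against 14, 145 against 144, 74 against 73, 29 against 28). That would be consistent with taking file names from the back of the vector while counting but labelling the results from the front. A sketch that keeps each name next to its own count, assuming a made-up helper called countLinesPerFile:

```cpp
#include <cstddef>
#include <fstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: returns (file name, line count) pairs in the same
// order as the input, so a name is never paired with another file's count.
std::vector<std::pair<std::string, int> >
countLinesPerFile(const std::vector<std::string>& infiles) {
    std::vector<std::pair<std::string, int> > result;
    for (std::size_t i = 0; i < infiles.size(); ++i) {
        std::ifstream f(infiles[i].c_str());
        std::string line;
        int lines = 0;
        while (std::getline(f, line))   // a file that failed to open yields 0
            ++lines;
        result.push_back(std::make_pair(infiles[i], lines));
    }
    return result;
}
```

Taking the vector by const reference and iterating forward also means nothing has to be popped, so the names are still there when the results are printed.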


Here is the file named file.txt that I've been using for testing:
H
e
l
l
o

w
o
r
l
d
!

This file contains 14 lines!


It contains 14 lines, 16 words and 53 characters according to wc (not mine, I think it's the GNU version of the UNIX tool).
mwc.h -- main header file:
#ifndef _MWC_H_
#define _MWC_H_

// Includes:

#include <cstdio>
#include <errno.h>
#include <fstream>
#include <iostream>
#include <sys/stat.h>
#include <vector>

// Function prototypes by file:

// main.cpp
int main(int, char**);
void error(std::string, bool);
void showmessage();

// count.cpp
void countchars(std::vector <std::string>, bool); // Count chars in file(s),
void countwords(std::vector <std::string>, bool); // Count words in file(s);
void countlines(std::vector <std::string>, bool); /* Count lines in file(s).
                                                     As per the std::vector, an arbitrary amount of files is possible. */

// fexist.cpp
bool fexist(std::string); // Check a file exists and we have access to it.
#endif 


main.cpp -- contains a few functions plus the main function:
#include "mwc.h"

int main(int argc, char*  argv[]) {
    if (argc < 4) {
        showmessage();
        return 0;
    }

    std::vector <std::string> infiles;
    std::string curArg, mode;
    mode = curArg = "";
    bool verbose = false;

    for (int i = 0; i < argc; ++i) {
        curArg = argv[i];

        if (curArg == "-eg" || curArg == "--show-example") {
            infiles.push_back("source/main.cpp");
            infiles.push_back("source/mwc.h");
            infiles.push_back("source/count.cpp");
            countlines(infiles, verbose);
            break;
        } else if (curArg == "-c" || curArg == "--count-characters") {
            mode = "char";
        } else if (curArg == "-w" || curArg == "--count-words") {
            mode = "word";
        } else if (curArg == "-l" || curArg == "--count-lines") {
            mode = "line";
        } else if (curArg == "-in") {
            for (int j = i; j < argc; ++j) {
                curArg = std::string(argv[j]);
                if (fexist(curArg)) // Valid file found.
                    infiles.push_back(curArg);
            }
        } else if (curArg == "-v" || curArg == "--verbose") {
            verbose = true;
        } else if (curArg == "-h" || curArg == "--help") {
            showmessage();
        }
    }

    if (mode == "char")
        countchars(infiles, verbose);
    else if (mode == "word")
        countwords(infiles, verbose);
    else if (mode == "line")
        countlines(infiles, verbose);

    std::cout << std::endl;
    return 0;
}

void error(std::string errmsg, bool showno) {
    std::cerr << "mwc: error " << errmsg;
#ifdef errno
    if (showno)
        std::cerr << " errno: " << errno;
#endif
    std::cout << std::endl;
}

void showmessage() {
    std::cout << "Multi word-count -- count characters, words or lines in specified files.\n\n";
    std::cout << "Usage:\n\tmwc [mode] [options] [-in [infiles]]\n";
    std::cout << "Modes:\n\t-eg | --show-example\n\t\tShow an example.\n\t";
    std::cout << "-c | --count-characters\n\t\t";
    std::cout << "Counts characters in file(s).\n\t";
    std::cout << "-w | --count-words\n\t\tCounts words in file(s).\n\t";
    std::cout << "-l | --count-lines\n\t\tCounts lines in file(s).\n";
    std::cout << "Options:\n\t-v | --verbose\n\t\tToggle verbosity.\n";
    std::cout << "Infiles:\n\t -in [infiles] -- space separated arbitrary"
              << " amount of infiles." << std::endl;
}


count.cpp -- contains the counting functions:
#include "mwc.h"

void countlines(std::vector <std::string> infiles, bool verbose) {
    const size_t fno = infiles.size(); // Amount of files
    std::string fname, line;
    int i, total = i = 0, perfile[fno];

    std::ifstream f;

    for (i = 0; i < int(fno); ++i)
        perfile[i] = 0;

    i = 0;

    while (i < int(fno)) {
        fname = infiles.back(); // Current filename is last on vector (LIFO vector)

        f.open(fname.c_str());

        if (f.is_open()) {
            while (!f.eof()) {
                std::getline(f, line);
                total++;        // One more line in total
                perfile[i]++; // One more line in file i
            }
        }

        f.close();

        infiles.pop_back(); // Remove filename we just used
        ++i;
    }

    i = 0;

    while (i < int(fno)) {
        if (verbose) {
             std::cout << "Amount of lines in file #" << i;
             std::cout << ": " << perfile[i];
             std::cout << "\tFile: " << infiles[i] << "\n";
        }
        ++i;
    }

    std::cout << "Total amount of lines in all " << fno << " file(s): ";
    std::cout << total << std::endl;
}

void countwords(std::vector <std::string> infiles, bool verbose) {
    const size_t fno = infiles.size(); // Amount of files
    std::string fname, line;
    int i, total = i = 0, perfile[fno];

    std::ifstream f;

    for (i = 0; i < int(fno); ++i)
        perfile[i] = 0;

    i = 0;

    while (i < int(fno)) {
        fname = infiles.back(); // Current filename is last on vector (LIFO vector)

        f.open(fname.c_str());

        if (f.is_open()) {
            while (!f.eof()) {
                std::getline(f, line);

                for (int j = 0; j < int(line.length()); ++j) {
                    if (line[j] == ' ' || line[j] == '\n') {
                        total++;      // Add another word to total
                        perfile[i]++; // Counted another word
                    }
                }
            }
        }

        f.close();

        infiles.pop_back(); // Remove filename we just used
        ++i;
    }

    i = 0;

    while (i < int(fno)) {
        if (verbose) {
             std::cout << "Amount of words in file #" << i;
             std::cout << ": " << perfile[i];
             std::cout << "\tFile: " << infiles[i] << "\n";
        }
        ++i;
    }

    std::cout << "Total amount of words in all " << fno << " file(s): ";
    std::cout << total << std::endl;
}

void countchars(std::vector <std::string> infiles, bool verbose) {
    const size_t fno = infiles.size(); // Amount of files
    std::string fname, line;
    int i, total = i = 0, perfile[fno];

    std::ifstream f;

    for (i = 0; i < int(fno); ++i)
        perfile[i] = 0;

    i = 0;

    while (i < int(fno)) {
        fname = infiles.back(); // Current filename is last on vector (LIFO vector)

        f.open(fname.c_str());

        if (f.is_open()) {
            while (!f.eof()) {
                std::getline(f, line);
                total += line.length();
                perfile[i] += line.length();
            }
        }

        f.close();

        infiles.pop_back(); // Remove filename we just used
        ++i;
    }

    i = 0;

    while (i < int(fno)) {
        if (verbose) {
             std::cout << "Amount of chars in file #" << i;
             std::cout << ": " << perfile[i];
             std::cout << "\tFile: " << infiles[i] << "\n";
        }
        ++i;
    }

    std::cout << "Total amount of chars in all " << fno << " file(s): ";
    std::cout << total << std::endl;
}


fexist.cpp -- contains the fexist() function, which decides if a file is accessible or not:
#include "mwc.h"

bool fexist(std::string fname) {
    struct stat stFileInfo;
    int Stat;

    Stat = stat(fname.c_str(), &stFileInfo);

    if (Stat == 0) // file exists
        return true;
    else
        return false; /* Bear in mind it might be that we don't have
                         read permission for the file or folder... */
}
The problem is in

while (!f.eof()) {
    std::getline(f, line); 
    total++;        // One more line in total
    perfile[i]++; // One more line in file i
}


With std::getline(f, line), you increment the line count straight after reading, whether or not the read succeeded. The eof flag is only set once a read actually hits the end of the file, so the loop runs one extra time: the final getline reads nothing, but the count has already been incremented by the time the loop condition is checked again. So I guess it is fine to decrement the count by one after leaving the loop. But if you don't want that, then just do:

bool keepReading = true;
while (keepReading) {
    std::getline(f, line);
    if (!f.eof()) {
        total++;        // One more line in total
        perfile[i]++;   // One more line in file i
    } else {
        keepReading = false;
    }
}
Ok, thanks :)

Any ideas about the characters/words issue? Words is probably because I'm counting spaces and newlines; but... never mind. I'll figure it out, thanks.