Need a split function!

Hello,

I'm making a web crawler for my website.
I've finished the code that downloads the page.
It gives output like this:
Original file:
<html>
<head>
</head>
<body>
</body>
</html>

Output from my code:
<html>,,,<head>,,,</head>,,,<body>,,,</body>,,,</html>

Now I need a string split function.
I've tried a lot of functions I found on the web,
but they all have the same problem:
instead of splitting on the whole delimiter, they treat every character in it as a separator and remove it.
For example, when you feed this in:
abcdcba
and say it should split on cd, I want to get this:
array[0]="ab";
array[1]="cba";
But what I don't want is this:
array[0]="ab";
array[1]="ba";

I use g++ and I compile on Linux.

Please help!

Greetings
Note: it has to split on the whole delimiter.
Otherwise it would change the page code, making it not a web crawler but a web crapper.
The other code I found didn't split on the delimiter; it removed the individual characters in it.
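
For what it's worth, that behaviour is typical of split functions built on strtok or std::string::find_first_of, which treat the delimiter argument as a set of single characters rather than as one string. A minimal sketch of that character-set behaviour, to show where the unwanted result comes from (the function name split_on_any_char is made up for illustration):

```cpp
#include <string>
#include <vector>

// Hypothetical helper: splits on ANY single character found in `chars`.
// With chars == "cd", both 'c' and 'd' act as separators on their own,
// which is the behaviour the question wants to avoid.
std::vector<std::string> split_on_any_char(const std::string& str,
                                           const std::string& chars)
{
    std::vector<std::string> result;
    std::size_t start = 0;

    while (start < str.length())
    {
        // find_first_of matches the first character that is in `chars`,
        // not the first occurrence of `chars` as a whole.
        std::size_t end = str.find_first_of(chars, start);
        if (end == std::string::npos)
        {
            end = str.length();
        }

        // Skip empty pieces between adjacent separator characters.
        if (end > start)
        {
            result.push_back(str.substr(start, end - start));
        }

        start = end + 1;
    }

    return result;
}
```

Calling split_on_any_char("abcdcba", "cd") returns "ab" and "ba": the trailing 'c' is eaten as a separator too. Searching with std::string::find instead matches the delimiter only as a whole.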
You can create something yourself. You can use this code as a starting point:

#include <iostream>
#include <vector>
#include <string>

using namespace std;

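// Splits str on every occurrence of the whole string delim (not on its
// individual characters) and appends the non-empty pieces to result.
void split(const string& str, const string& delim, vector<string>& result)
{
    // An empty delimiter matches at every position and would never advance
    // start_pos, so treat the whole string as a single token instead.
    if (delim.empty())
    {
        if (!str.empty())
        {
            result.push_back(str);
        }
        return;
    }

    size_t start_pos = 0;
    size_t match_pos;
    size_t substr_length;

    while((match_pos = str.find(delim, start_pos)) != string::npos)
    {
        substr_length = match_pos - start_pos;

        // Skip empty tokens (two delimiters in a row, or one at the start).
        if (substr_length > 0)
        {
            result.push_back(str.substr(start_pos, substr_length));
        }

        // Continue searching right after the matched delimiter.
        start_pos = match_pos + delim.length();
    }

    // Whatever follows the last delimiter is the final token.
    substr_length = str.length() - start_pos;

    if (substr_length > 0)
    {
        result.push_back(str.substr(start_pos, substr_length));
    }
}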

int main()
{
    string input = "Hello<delim>world<delim><delimnot>!";
    string delim = "<delim>";

    vector<string> tokens;

    split(input, delim, tokens);

    for(int i = 0; i < (int)tokens.size(); ++i)
    {
        cout << tokens[i] << endl;
    }
}
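
One thing to be aware of for the crawler: the split above drops empty tokens, so two delimiters in a row (a blank line in the page, for example) leave no trace in the result. If the page has to be reconstructed exactly, a variant that keeps empty tokens may be a better fit. A rough sketch, assuming that is what you want (the name split_keep_empty and the return-by-value style are my own choices, not part of the code above):

```cpp
#include <string>
#include <vector>

// Hypothetical variant of split: matches the whole delimiter, returns the
// tokens by value, and keeps empty tokens instead of skipping them.
std::vector<std::string> split_keep_empty(const std::string& str,
                                          const std::string& delim)
{
    std::vector<std::string> result;

    // An empty delimiter would match everywhere without advancing,
    // so return the input as a single token.
    if (delim.empty())
    {
        result.push_back(str);
        return result;
    }

    std::size_t start_pos = 0;
    std::size_t match_pos;

    while ((match_pos = str.find(delim, start_pos)) != std::string::npos)
    {
        // May push an empty string when two delimiters are adjacent.
        result.push_back(str.substr(start_pos, match_pos - start_pos));
        start_pos = match_pos + delim.length();
    }

    // Remainder after the last delimiter (possibly empty).
    result.push_back(str.substr(start_pos));
    return result;
}
```

With the crawler output from the first post, split_keep_empty("<html>,,,<head>,,,</head>,,,<body>,,,</body>,,,</html>", ",,,") gives the six original lines, and split_keep_empty("a,,,,,,b", ",,,") gives "a", an empty string, and "b". Joining the tokens with the delimiter again reproduces the input.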
Thanks, man!
You found the code I've been searching for!
