C-string vector acts differently than std::string vector?

Please consider the following example code:
#include <iostream>
#include <vector>
#include <string>
#include <sstream>
int main()
{
    std::vector<std::string> vec1;
    std::vector<const char*> vec2;
    std::string str{"This\nis\na\ntest"};
    std::istringstream ss{str};

    std::string temp;
    while(ss >> temp)
    {
        vec1.push_back(temp);
        vec2.push_back(temp.c_str());
    }
    for(auto a: vec1) std::cout << a << "-";
    std::cout << std::endl;
    for(auto a: vec2) std::cout << a << "-";
}


The output is as follows:

This-is-a-test-
test-test-test-test-


Why is the output different for these two? Shouldn't both print This-is-a-test-?

EDIT: I just put this into an online C++ compiler and I get the correct result, but when I compile using GCC on my machine, it gives the wrong result as above. What is going on?

EDIT 2: I compiled using Clang as well and am getting the same incorrect result. But on online compilers it is showing the correct result.
Each element of vec1 is a unique std::string object.

Each element of vec2 is the pointer returned from temp.c_str()

http://www.cplusplus.com/reference/string/string/c_str/ writes:
C++11: The pointer returned points to the internal array currently used by the string object to store the characters that conform its value.

The pointer returned may be invalidated by further calls to other member functions that modify the object.


If temp does not reallocate on each input, the array stays the same and all four calls return the same address.
If temp does reallocate, the previously returned pointers become invalid.
> I just put this into an online C++ compiler and I get the correct result

Which online compiler? It seems unusual to me for it to give what you call the "correct" result, although it is possible (if c_str() copied the data somewhere or if temp used different storage for each string, which seems wasteful). But this cannot be relied on. In fact, the storage has almost certainly been freed.

The vector of const char* only stores the address of a string, not the string itself. You are (potentially at least) storing the exact same address in all of the elements: the address returned by temp.c_str(). That's why you only see the last string. You need to allocate separate storage for each string.

char *s = new char[temp.size() + 1];
std::copy(temp.c_str(), temp.c_str() + temp.size() + 1, s);
vec2.push_back(s);

And you should of course delete the storage, too.
> Which online compiler

The online compiler that runs when you click on "Edit and Run" in my original post (cpp.sh).
That site uses gcc 4.9.2 whereas the current version is 10.2. Apparently that version inefficiently stores the strings in different memory each time, even if the current memory is large enough to hold the new string. In fact, the strings are small enough to be stored in the string structure itself (small string optimization), which is what the version I'm using does.
Is there an easier way to input into C-strings using std::string::c_str without having to worry about manually dynamically allocating a char array and copying it over?
> Apparently that version inefficiently stores the strings in different memory each time, even if the current memory is large enough to hold the new string.

Pre-GCC 5.1 std::string was implemented using CoW. But then the C++11 standard (indirectly) banned copy-on-write implementations in order to better support concurrency. Rationale here:
https://wg21.link/n2534

As a result GCC 5.1 had to break ABI. This is the _GLIBCXX_USE_CXX11_ABI stuff you may have encountered:
https://gcc.gnu.org/onlinedocs/libstdc++/manual/using_dual_abi.html
> Is there an easier way to input into C-strings using std::string::c_str without having to worry about manually dynamically allocating a char array and copying it over?

Yes. Just use std::string. That's what it's for. It handles the allocation/deallocation for you.

@mbozzi, thanks. That's interesting stuff.
> Yes. Just use std::string. That's what it's for. It handles the allocation/deallocation for you.


I am well aware of this alternative. Unfortunately, it isn't an option for me. I have to call Linux functions such as fork() and exec(), which expect C-style strings, from a C++ program. I have managed some workarounds: I store the arguments in a std::vector<std::string> (to be passed as the "argv" parameter of exec()), run std::transform() over it to produce a std::vector<const char*>, retrieve the raw buffer with the std::vector::data() member function, and const_cast it in order to pass it as an argument to exec(). Obviously, this is a very messy solution.
#include <iostream>
#include <vector>
#include <string>
#include <unistd.h>

void my_exec(std::vector<std::string>& v)
{
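    std::vector<char*> w;
    // non-const std::string::data() (returning char*) requires C++17
    for (auto& s: v) w.push_back(s.data());
    w.push_back(nullptr);   // the argument list must end with a null pointer
    execvp(w[0], w.data()); // only returns if the exec fails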
}

int main()
{
    std::vector<std::string> v { "ls", "-l" };
    my_exec(v);
}

> I have to access Linux functions such as fork() and exec(), which expect C-style strings, in a C++ program.


std::string has .c_str() which returns a C-style null-terminated string.

See http://www.cplusplus.com/reference/string/string/c_str/
@seeplus:
The .c_str() returns a pointer and that is what started this thread.

@dutch:
 In function 'void my_exec(std::vector<std::basic_string<char> >&)':
9:41: error: invalid conversion from 'const char*' to 'std::vector<char*>::value_type {aka char*}' [-fpermissive]

One can add explicit cast:
for (auto& s: v)
  w.push_back( const_cast<char*>(s.data()) );



There are many compilers that support C++ standards to varying level. GCC supports more than one C++ standard (with and without GNU extensions). Each version of GCC has some standard as default, but can be told (the -std= option) to use different standard. Hence (default) behaviour can differ.

Pre-C++11 was less specific about the .c_str() / .data().
> The .c_str() returns a pointer and that is what started this thread.


The OP was regarding vector of pointers, not passing arguments to functions.

@dutch Shouldn't the second param to execvp be the address of the second element of w - not the first as that's the command? Also, before C++11 .data() was not guaranteed to return a null-terminated array.

For VS,

#include <iostream>
#include <vector>
#include <string>
#include <process.h>

void my_exec(std::vector<std::string>& v)
{
	std::vector<const char*> w;

	for (const auto& s : v)
		w.push_back(s.c_str());

	w.push_back(nullptr);
	_execvp(w[0], &w[1]);
}

int main()
{
	std::vector<std::string> v {"ls", "-l", "-no"};
	my_exec(v);
}


@keskiverto, I compiled as c++17 with full warnings (as usual) with g++ 9.3, and for whatever reason it worked, although it won't compile as c++11 or c++14.

seeplus wrote:
> Shouldn't the second param to execvp be the address of the second element of w

No, the first argument is also the command. It just lets the program know what filename it was started with.
Also, your code will not compile for me, no matter what standard I choose.
Clearly this is fiddly business. :-)
VS uses _execvp() rather than execvp()......

OK. For the program test12:

#include <iostream>

int main(int argc, char* argv[])
{
	std::cout << '\n';

	for (int i = 0; i < argc; ++i)
		std::cout << i << "  " << argv[i] << '\n';
}


which just displays the arguments.

And for test64:

#include <iostream>
#include <vector>
#include <string>
#include <process.h>

void my_exec(const std::vector<std::string>& v)
{
	std::vector<const char*> w;

	for (const auto& s : v)
		w.push_back(s.c_str());

	w.push_back(nullptr);
	_execvp(w[0], w.data());
}

int main()
{
	const std::vector<std::string> v {"test12", "a1", "a2"};

	my_exec(v);
}


which gives:


c:\MyProgs>test64

c:\MyProgs>
0  test12
1  a1
2  a2


> I compiled as c++17 with full warnings (as usual) with g++ 9.3, and for whatever reason it worked,
> although it won't compile as c++11 or c++14.

The non-const overload of std::basic_string<>::data() did not exist prior to C++17.
https://en.cppreference.com/w/cpp/string/basic_string/data
> VS uses _execvp() rather than execvp()......

obviously I changed that......

And, what, your test proves my point???
It was just meant to highlight that _exec??() is used with VS rather than exec??() and to show it working with VS.

:) :)