How do I use string_view where a C-string is expected?

Aug 6, 2018 at 7:02am
I've written a function foo() whose purpose is to call something that requires a C-string:

void foo(std::string_view x)
{ 
  std::puts(x.data()); // expects null-terminated string
}

However, a std::string_view does not necessarily refer to a null-terminated string, which leaves plenty of room for errors.
This code demonstrates one potential problem:
auto my_sv = "Hello, World!\n"sv;
my_sv.remove_suffix(9); 
    
std::cout << my_sv << "\n"; // prints "Hello"
foo(my_sv); // prints "Hello, World!" 

See:
http://coliru.stacked-crooked.com/a/0875515cfa6b9631

While simply promoting the std::string_view to a std::string would prevent the problem, it eliminates most of the benefits of string_view in the first place, which I'd like to avoid.

What's the best way to write foo()?

A few possible approaches come to mind.
- I could write foo(const char*) and foo(std::string const&) and give up on string_view.
- Or I could write a new component (c_string_view?) that guarantees the string is zero-terminated, or use one that already exists.
- Or I could just require the string_view be null-terminated as a precondition.

Thanks!
Last edited on Aug 6, 2018 at 7:38am
Aug 6, 2018 at 8:44am
I vote for the first option, foo(const char*) and foo(std::string const&). If the function expects a null-terminated string, it shouldn't take a std::string_view, in my opinion, because a std::string_view makes no such guarantee.
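
A minimal sketch of what I mean, assuming foo() only needs to forward to std::puts. The null-pointer guard and the int return value are my own additions for illustration, not something from the original code:

```cpp
#include <cstdio>
#include <string>

// Both overloads only accept sources that are null-terminated by construction.
int foo(const char* s)
{
    // Assumption: a null pointer is treated as an empty string.
    return std::puts(s ? s : "");
}

int foo(const std::string& s)
{
    // std::string::c_str() always returns a null-terminated sequence.
    return foo(s.c_str());
}
```

Passing a std::string_view to either overload does not compile, because there is no implicit conversion from std::string_view to const char* or to std::string. The caller has to make the copy explicit.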
Last edited on Aug 6, 2018 at 8:47am
Aug 6, 2018 at 9:34am
The more I think about it the more I like the idea of having a c_string_view type. The only problem is that it's not part of the standard library, so one has to either find a third-party implementation or write one's own.

I guess c_string_view would have the same interface as std::string_view except that it lacks any function that could give rise to a string that is not null-terminated (e.g. the (const char*, size_t) constructor, remove_prefix, remove_suffix), and with the addition of a c_str() function. I'm not sure what the substr function would return. A std::string_view?

One might even want to make c_string_view implicitly convertible to std::string_view. The conversion should be trivial and perfectly safe.
Aug 6, 2018 at 1:43pm
> The more I think about it the more I like the idea of having a c_string_view type.
> The only problem is that it's not part of the standard library
> so one has to either find a third-party implementation or write one's own.

Writing a small wrapper would just be a few lines of code. For example:

#include <iostream>
#include <string>
#include <string_view>
#include <cstdio>
#include <vector>
#include <memory> // std::addressof

struct c_string_view_wrapper
{
    // note: the life-time of the underlying sequence of characters must not end
    //       before the last use of the c_string_view_wrapper / the wrapped string_view
    
    c_string_view_wrapper( const std::string& str ) : view(str), null_terminated(true) {}
    
    constexpr c_string_view_wrapper( const char* cstr ) : view( cstr ? cstr : ""  ), null_terminated(true) {}
    
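    // sz == 0 is checked first so that cstr[sz-1] is never evaluated for an empty range
    constexpr c_string_view_wrapper( const char* cstr, std::size_t sz ) : view( cstr, sz ), null_terminated( sz != 0 && cstr[sz-1] == 0 )
    { if(null_terminated) view.remove_suffix(1) ; }
    
    // the parameter is named sv (not view) so that the constructor body modifies the member,
    // not the by-value parameter; the empty check avoids calling back() on an empty view
    constexpr c_string_view_wrapper( std::string_view sv ) : view(sv), null_terminated( !sv.empty() && sv.back() == 0 )
    { if(null_terminated) view.remove_suffix(1) ; }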

    constexpr operator std::string_view() const { return view ; }
    
    const char* c_str() const // not thread safe; in addition,
                              // the usual caveats for std::string::c_str() apply
    {
        if(null_terminated) return view.data() ;

        static std::string str ;
        str.assign( view.begin(), view.end() ) ;
        return str.c_str() ;
    }

    std::string_view view ;
    bool null_terminated ;
};

std::size_t foo( c_string_view_wrapper view_wrapper )
{
    std::string_view view = view_wrapper ;
    const char* cstr = view_wrapper.c_str() ;

    // ...

    std::cout << "null terminated view? " << std::boolalpha
              << view_wrapper.null_terminated << " (size:" << view.size() << ") => " ;
    std::puts(cstr) ;
    return view.size() ;
}

int main()
{

    foo( "hello world!" ) ;

    foo( { "hello world!", 5 } ) ;
    foo( { "hello world!"+6, 6 } ) ;
    foo( { "hello world!"+6, 7 } ) ;

    const std::string str = "hello world!" ;
    foo(str) ;

    std::vector<char> vec { str.begin(), str.end() } ;
    foo( std::string_view( std::addressof( vec.front() ), vec.size() ) ) ;

    vec.push_back(0) ;
    foo( std::string_view( std::addressof( vec.front() ), vec.size() ) ) ;
}

http://coliru.stacked-crooked.com/a/daacb6dd16468731
Aug 8, 2018 at 5:41am
I've taken the suggestion and written a c_string_view class, but I've tried to create something more of a vocabulary type than something with a strict invariant. I don't like it, but I wanted to give an update anyways. It's 400 lines of boilerplate code, so I'll just link it and summarize the differences between it and string_view:
  - remove_suffix() removed
  - remove_prefix() still there
  - char const*, size_t constructor still there (not sure about that)
  - c_string_view::substr() returns std::string_view
  - c_string_view is implicitly convertible to std::string_view
  - std::string_view is explicitly convertible to c_string_view
  - c_string_view::c_str() is a synonym for data().

The code is here. Consider it a rough draft:
http://coliru.stacked-crooked.com/a/f33bcd9324fc7a60
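
To illustrate why remove_prefix() can stay while remove_suffix() has to go, here is a small sketch using plain std::string_view. ends_at_terminator is a hypothetical helper for this demonstration only. It reads one character past the end of the view, which is valid here only because the views come from string literals:

```cpp
#include <string_view>

// Assumption: the caller knows the character at data()[size()] is readable,
// e.g. because the view was created from a string literal or a std::string.
// For an arbitrary std::string_view this read would be out of bounds.
constexpr bool ends_at_terminator(std::string_view sv)
{
    return sv.data()[sv.size()] == '\0';
}

constexpr bool prefix_removal_keeps_terminator()
{
    std::string_view sv = "Hello, World!";
    sv.remove_prefix(7);             // now "World!"
    return ends_at_terminator(sv);   // the end of the view has not moved
}

constexpr bool suffix_removal_keeps_terminator()
{
    std::string_view sv = "Hello, World!";
    sv.remove_suffix(8);             // now "Hello"
    return ends_at_terminator(sv);   // data()[5] is ',' rather than '\0'
}

static_assert(prefix_removal_keeps_terminator());
static_assert(!suffix_removal_keeps_terminator());
```

The same reasoning is why substr() returns a std::string_view: a substring with an explicit length can end anywhere, so the result cannot promise a terminator.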

@JLBorges: IINM, c_string_view_wrapper::c_str() has a problem because the underlying sequence could have changed after setting null_terminated but before the call. I think you can't cache null_terminated in general.
Last edited on Aug 8, 2018 at 5:55am
Aug 8, 2018 at 6:07am
That entire code is based on the assumption that the underlying sequence does not change during the period of use of the wrapper (not thread-safe, the wrapper is a temporary object passed by value to a function which uses it, and then discards it).

In general, caching does not work with string views into mutable sequences.
std::string str = "abcdefghijkl" ;
std::string_view view(str) ;
str += "mnopqrst" ;
// using the string view here is undefined behaviour 
Sep 21, 2018 at 7:43pm
Just wanted to post an update:

I've been using something like the c_string_view above for a few weeks now, but I am not convinced that it will save the time it has absorbed.

There is an issue interfacing with std::string, since the standard library has this inverted dependency: std::string has a conversion to std::string_view, in order to prevent <string_view> from depending on <string>. This allows std::string to compare with string_view.
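
A short sketch of that conversion at work (same_text and demo are names I made up for illustration):

```cpp
#include <string>
#include <string_view>

// Takes views only; this signature needs nothing from <string>.
bool same_text(std::string_view a, std::string_view b)
{
    return a == b;
}

bool demo()
{
    const std::string owned = "hello";
    const std::string_view viewed = "hello";

    // std::string converts implicitly through its own conversion operator,
    // so it can be passed where a std::string_view is expected ...
    const bool via_function = same_text(owned, viewed);

    // ... and compared against a std::string_view directly.
    const bool via_operator = (owned == viewed);

    return via_function && via_operator;
}
```

The conversion operator lives in std::string, so only the code that actually uses std::string has to include <string>.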

Ideally, std::string would compare with c_string_view too, but that would force me to include <string> in the header file, which is harmful for compile speed. It is unfortunate to require so much of the standard library in translation units which only care about some C interface, and the slower compilation wastes my time.

Additionally, given the use-case, it may be preferable to drop the size member and store only a pointer instead of a pointer-length pair, but I'm not sure and this isn't a big deal anyway.
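
For what it's worth, the pointer-only variant might look something like this. It's a sketch under my own assumptions: c_string_ptr is a hypothetical name, a null pointer is treated as an empty string, and the length is recomputed with std::strlen whenever it is needed.

```cpp
#include <cstddef>
#include <cstring>
#include <string_view>

// Hypothetical pointer-only variant: no stored length.
class c_string_ptr
{
public:
    constexpr c_string_ptr(const char* s) noexcept : ptr_(s ? s : "") {}

    constexpr const char* c_str() const noexcept { return ptr_; }

    // O(n) on every call, and it stops at the first '\0', so embedded
    // null characters cannot be represented.
    std::size_t size() const noexcept { return std::strlen(ptr_); }

    // Converting to std::string_view scans for the terminator once.
    operator std::string_view() const noexcept { return std::string_view(ptr_); }

private:
    const char* ptr_;
};
```

The trade-off is a smaller object that is cheap to pass straight to a C interface, against a linear-time size() and the loss of strings with embedded nulls.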

I'm considering switching back to plain-old string_view.
Last edited on Sep 21, 2018 at 8:08pm