I've split my string into vector<string_view> and now I'd like to get back the original string (or a list of substrings if elements were deleted). my question is, does string_view::substr allow for increasing size according to the standard? any suggestions for using algorithms or ranges/view in smoothen()? (I'm aware splitting could be done with split_view too.) how about using a view_interface instead of vector, would I still be able to distinguish the individual words after join_view, or would it just become a complicated kind of string?
here's my test-code:
does string_view::substr allow for increasing size according to the standard?
I'm not sure I'm understanding what is meant by this.
std::string_view is a read-only view referencing the underlying data. You can have multiple views of the same underlying data, but you can't change that data through a string_view. Nor can you concatenate or append data to a string_view.
the "more detail" is in my example program!
my question is if c++20 standard allows for what I'm doing in line 11 or if an exception must be thrown in that line instead or if it's undefined behaviour.
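For reference, a minimal sketch of how substr() treats an oversized count, as far as the standard wording goes: the returned length is min(count, size() - pos), and std::out_of_range is thrown only when pos > size(). So a large count is neither an error nor undefined behaviour, but it also cannot make the view longer than the view it is called on. The helper names below are made up for illustration.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string_view>

// Hypothetical helper: length of the view that substr() actually returns.
// The count argument is clamped to size() - pos.
std::size_t substr_len(std::string_view sv, std::size_t pos, std::size_t count) {
    return sv.substr(pos, count).size();
}

// Hypothetical helper: reports whether substr() throws for this start position.
// Per the standard, that happens only when pos > size().
bool substr_throws(std::string_view sv, std::size_t pos) {
    try {
        (void)sv.substr(pos);
        return false;
    } catch (const std::out_of_range&) {
        return true;
    }
}
```

With a view of "this" taken from "this is my string", substr_len(word, 0, 100) gives 4, not 17. Getting a longer view means constructing a new string_view from a pointer and a length, which is only valid if that whole range really lies inside the same underlying character array.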
many thanks. solves the mystery why it doesn't work. en.cppreference.com didn't mention anything about that behaviour of string_view::substr() and string_view::size().
all that is nifty and stuff, but the actual problem can be solved by
string s1;
... whatever, get data into s1
string s2{s1};
... code to mess up s1
....
s2 is still the original string.
if you deleted something, you can either apply the same to s2 (if possible, and it should be) or you end up back trying to do the above complexity (seems best to avoid, in this specific use case?)
It seems to me that OP's problem would be better solved with just the original string and a vector of offsets/pointers into it. The point of a string_view is to be a read-only slice of an object. If you're talking about shifting or splicing a string_view, it's probably not what you want. The code that uses the string_view should not assume that there's any more memory to extend into.
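A rough sketch of the offsets idea, assuming the words are separated by single spaces as in the thread's example. Span, split_spans and cover are hypothetical names, not anything from the OP's code.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>
#include <vector>

// Hypothetical type: one word stored as offset + length into the original string.
struct Span {
    std::size_t pos;
    std::size_t len;
};

// Split on single spaces, recording positions instead of views.
std::vector<Span> split_spans(std::string_view s) {
    std::vector<Span> out;
    std::size_t p = 0;
    for (auto n = s.find(' '); n != std::string_view::npos; p = n + 1, n = s.find(' ', p))
        out.push_back({p, n - p});
    out.push_back({p, s.size() - p});
    return out;
}

// View covering spans[first] through spans[last], separators included.
// It is always taken from the original string, so it never reaches outside it.
std::string_view cover(std::string_view s, const std::vector<Span>& spans,
                       std::size_t first, std::size_t last) {
    assert(first <= last && last < spans.size());
    const std::size_t begin = spans[first].pos;
    const std::size_t end = spans[last].pos + spans[last].len;
    return s.substr(begin, end - begin);
}
```

Because every view is cut from the original string on demand, merging neighbouring words is just arithmetic on offsets, and there is no need to grow an existing string_view.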
Yes, one can always make a deep copy of a string (or construct a string from a C-style NTBS) instead of creating a view; that was how it used to be normally done before 2017.
It's not the views; it's the manipulation and then trying to go back to the original that I was saying can be solved via a copy. The view part is fine: views don't cost much and are a wonderful addition.
Well the first issue is that the split doesn't work! It goes into an infinite loop.
n=s.find(' ',n)+1
When find() doesn't find a match it returns npos, and npos + 1 wraps around to 0, so the loop's exit condition is never met and the loop never exits. Try:
for (auto n = s.find(' '); n != std::string_view::npos; p = n + 1, n = s.find(' ', p))
    v.push_back(s.substr(p, n - p));
v.push_back(s.substr(p, s.size() - p));
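The wrap-around can be checked in isolation. This is only a sketch; next_start is a made-up helper that mirrors the `s.find(' ', n) + 1` expression quoted above.

```cpp
#include <cstddef>
#include <string_view>

// npos is the largest value of the unsigned size_type, so adding 1 wraps to 0.
static_assert(std::string_view::npos + 1 == 0);

// Hypothetical helper mirroring the expression from the broken loop:
// returns the index just past the next space, or 0 when there is no space.
std::size_t next_start(std::string_view s, std::size_t from) {
    return s.find(' ', from) + 1;
}
```

So for a string with no further space the "next start" is 0 again, which is indistinguishable from a fresh start at the beginning of the string.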
#include <cstddef>
#include <vector>
#include <string_view>
#include <iostream>

using namespace std::string_view_literals;

// Merge neighbouring views when the next one starts exactly where the
// previous one ends (i.e. they are contiguous in the same underlying string).
std::vector<std::string_view> smoothen(const std::vector<std::string_view>& v) {
    if (v.empty())
        return {};

    std::vector<std::string_view> sv {v.front()};

    for (auto itr = v.begin() + 1; itr != v.end(); ++itr)
        if (sv.back().data() + sv.back().size() == itr->data())
            // C++20 iterator-pair constructor: [start of merged run, end of *itr)
            sv.back() = std::string_view(sv.back().data(), itr->data() + itr->size());
        else
            sv.push_back(*itr);

    return sv;
}

// Split on ' '. If incspace is true, each word keeps its trailing space.
std::vector<std::string_view> split(std::string_view s, bool incspace)
{
    std::vector<std::string_view> v1;
    std::size_t p = 0;

    for (auto n = s.find(' '); n != std::string_view::npos; p = n + 1, n = s.find(' ', p))
        v1.push_back(s.substr(p, n - p + incspace));

    v1.push_back(s.substr(p, s.size() - p));   // last word (no trailing space)
    return v1;
}

int main() {
    constexpr auto s = "this is my string"sv;

    const auto v1 {split(s, false)};
    const auto v2 {split(s, true)};
    const auto sm1 {smoothen(v1)};
    const auto sm2 {smoothen(v2)};

    for (const auto& s1 : v1)
        std::cout << "!" << s1 << "!" << '\n';
    std::cout << '\n';

    for (const auto& s1 : sm1)
        std::cout << "!" << s1 << "!" << '\n';
    std::cout << '\n';

    for (const auto& s1 : v2)
        std::cout << "!" << s1 << "!" << '\n';
    std::cout << '\n';

    for (const auto& s2 : sm2)
        std::cout << "!" << s2 << "!" << '\n';
}
!this!
!is!
!my!
!string!

!this!
!is!
!my!
!string!

!this !
!is !
!my !
!string!

!this is my string!
smoothen() 'merges' adjacent string_views when the beginning of the next one follows directly from the end of the current one.
In this case, if the split doesn't include the ' ', the next view doesn't start where the current one ends, so no joins are done and the result is the same as the input.
If the split does include the ' ', each view starts exactly where the previous one ends, so the next and current are merged into one.
thanks, that's exactly the clearheadedness I was looking for. hope you're aware that when giving split() the parameter "false", the same can be achieved with c++20 stl functions in a more readable way.
as for explanation of what I'm trying to achieve: I discovered the library "dtl" which creates a diff between 2 vectors. so when I feed it vectors of string_view where each element contains either a whole word (or maybe alphanumeric sequence) or a single character, it would tell me at which characters or words the two strings differ, just like some fancy diff-visualizer widgets do. so I can program a widget to visualize the differences with colours or further analyze why the words differ (when analyzing plagiarism or spam). all I have to do is to smoothen the output whenever it's continuous, to save on space...