RegEx problem

Hey All,

I'm just getting started with regular expressions in C++ and I've noticed a couple of problems. I spent 20+ years using them in Perl, so there are certain things I "just expect", and I'm not finding them.

Specifically, I expect a regex to have a "multiline" mode where it will match beyond the first newline included in the target string. I can't find one.

Similarly, I have found that no search will proceed past a null character in the buffer (which a std::string can legally contain).

As a more minor complaint (because there is a workaround) I don't find a case-insensitive mode either. Am I missing something?

BTW the reason these things matter is I'm about to write a file searching utility that wants to be able to use regexes. If I can't solve the problem I will have to break the string into pieces at newlines or nulls and search the pieces individually. Ugh.

TIA,

Lars
icase: Look at 'ninth' in the example here: http://www.cplusplus.com/reference/regex/basic_regex/basic_regex/

What is "perl multiline"?
#include <iostream>
#include <string>
#include <regex>

int main ()
{
  using namespace std::regex_constants;
  std::regex ninth ("lo\\nw", ECMAScript | icase );

  std::string subject = "Hello\nWorld";
  std::cout << subject << std::endl;
  std::string replacement = "yup";

  std::cout << std::regex_replace (subject, ninth, replacement);
  std::cout << std::endl;

  return 0;
}
C++17 added a multiline constant to basic_regex.
https://en.cppreference.com/w/cpp/regex/basic_regex
Thanks!
Not exactly the Perl approach, but it'll work. In Perl you do it with modifiers as you apply the regex, e.g. if ($x =~ /foo/i) { dosomething(); } would get you a case-insensitive match.
I would say that it is quite close.
/foo/i
std::regex bar("foo", icase);

A difference here is that Perl uses an unnamed literal, whereas C++ prefers to create a named object. Both carry the regular expression and its modifiers (flags).
The literal is used directly in the match expression, while the named regex can be reused in multiple expressions.

An unnamed temporary is possible too:
if ( std::regex_search( x, std::regex("foo", icase) ))
   { dosomething(); }

So the real difference is that Perl has the binary operator =~ while C++ uses a function, regex_search().
Furry Guy: haven't been able to get multiline to work. Here's my test code:
<code>
#include <iostream>
#include <string>
#include <regex>

int main()
{
    std::string str("\nabcd");
    std::regex r("^abc", std::regex::multiline);

    std::smatch m;
    std::regex_search(str, m, r);

    for (auto v : m)
        std::cout << v << std::endl;
}
</code>

I get: E0135 class "std::basic_regex<char, std::regex_traits<char>>" has no member "multiline" (project RegexTest, file C:\projects\RegexTest\RegexTest.cpp, line 8)


And yes, I did switch the project to C++17. Am I doing something wrong?

TIA,

Lars
Keskiverto: yes, I don't find the C++ syntax problematic, except where it doesn't seem to work (see above). The Perl syntax is nicely terse, but you can't reuse a regex unless you store it in a variable (which works fine), e.g. if ($x =~ /$regex/) { dosomething(); }
Which compiler version do you have, and how complete is its C++17 support?

Could there be something like: https://developercommunity.visualstudio.com/content/problem/268592/multiline-c.html
Keskiverto: I'm using VS2019 16.8.1.
Keskiverto: it does seem to be multiline by default, but a search still won't read past a null character.
Reported to Microsoft 12/12/2020.