regex doesn’t like me :(

Apr 26, 2020 at 6:34am
I don’t get regex. :/ I’ve been trying for the past two days straight to just match strings, but so far nothing wants to work. I honestly don’t even know what the problem is. The regular expression I have for strings at the moment is
("([.\r\n[^"]]*(\\")*)*"), which feels kinda redundant, but
([^"]*(\\")*) didn’t want to work either for some reason, because whenever I tried using any of these, it matches only things between the strings.
For example, if I had the input js js " jdjdn " " djdjdn ",
it would match " " as a string instead of " jdjdn ".

My original intent was to match a string as anything that begins and ends with ", except that \" doesn’t end it, so the double quote character can be escaped. Could someone tell me why it is not working?? So confused :(
Apr 26, 2020 at 7:16am
Just an example.
https://regex101.com/r/yR2jyH/1

Actually, the site is very useful at explaining what each part of your regex means, and how that matches (or doesn't) your test input.

Apr 26, 2020 at 7:29am
eem. That’s blocked/restricted for me.
Apr 26, 2020 at 10:33am
Blocked by whom?

Your ISP
Your Employer
Your State
Apr 26, 2020 at 2:50pm
So if I'm understanding you correctly, you want to comb through a given input, matching only the substrings that are inside of double quotes (" ")?

salem c's regex looks like this:
std::regex("\"([^\"]+)\"")

It seems to work.
Last edited on Apr 26, 2020 at 2:51pm
Apr 26, 2020 at 7:11pm
I’ve tried that one and I agree that it should work, but it just doesn’t. That’s basically why I’m asking this question. Why wouldn’t that work? I think I’ll post code too and then try that sequence again though...
Apr 26, 2020 at 8:35pm
Which standard library are you using? We'll need the version number.

In particular, some versions of GNU libstdc++ do not completely implement the <regex> header - as in, some versions just throw exceptions unconditionally whenever you construct a regex. Others "sort of" work, IINM.
Last edited on Apr 26, 2020 at 11:36pm
Apr 27, 2020 at 2:47am
I honestly do not know. I use an online ide so I’m not sure.
Apr 27, 2020 at 3:20am
What result do you get from this:

#include <iostream>
#include <iterator>
#include <regex>
#include <string>

int main()
{
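    // Raw string literals: the delimiters are ""( and )"", so the quotes
    // inside are taken literally and need no backslashes.
    std::string s {R""(js js " jdjdn " " djdjdn ")""};
    std::regex re(R""("([^"]+)")"");   // a quote, one or more non-quotes, a quote

    // Iterate over every non-overlapping match in s.
    auto it = std::sregex_iterator(s.begin(), s.end(), re);
    auto it_end = std::sregex_iterator();
    for ( ; it != it_end; ++it)
        std::cout << (*it)[0] << '\n'; // or (*it)[1] for 1st capture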
} 

Apr 28, 2020 at 8:51pm
Sorry for responding so late; my parents took my compiler.


I get
" jdjdn "
" djdjdn "


So basically that works.. 🤷‍♂️ I’m f’ing something up ok cause that works......... :/
Last edited on Apr 28, 2020 at 8:51pm
Apr 28, 2020 at 8:56pm
The only problem with that match is that it doesn’t allow for escaped double quotes or empty strings... I’m gonna play with it for a bit..
Last edited on Apr 28, 2020 at 8:56pm
Apr 28, 2020 at 9:09pm
my parents took my compiler.

Haha :D
Apr 28, 2020 at 9:38pm
Yeah, it's just a simplification of what you are after, but I thought it would be a good place to start.

Detecting an empty string is easy. Just change the + (1 or more) to a * (0 or more).

std::regex (with its default ECMAScript grammar) doesn't support lookbehind, so I'm not sure how to detect a " with a \ before it. If you could use lookbehind then it would look like this: R""("(.*?)(?<!\\)")""

(Note that the raw string delims that I've used are ""( and )"", so those aren't part of the regex string.)

This part (?<!\\)" says to match a quote that is not preceded by a backslash. (Even in a raw string we still need to put two backslashes here; in a regular string we would need four!)

my parents took my compiler.

I remember the days when I would stay up into the wee hours coding. ;-)
Apr 28, 2020 at 9:57pm
Look behind? Never even heard of that yet! :D I am learning off of the ECMAScript specs on cplusplus.com right here. That was the same thing I was having trouble with, yeah.. how does that work? Is it the ! meaning not, like in C++? What do all the other chars mean? I know it’s an answer, but could you break it down for me more? I’m still learning regex so I just think it might be handy to know... thank you!



You know it, lol XD 4 in the morning, I’d still be coding. :)
Apr 28, 2020 at 10:38pm
The (? is a "special open parenthesis", the < means lookbehind instead of lookahead, and the !, like you say, means not (otherwise you would use =). So (?<!\\) means "only match if the next char does not have a \ before it".

But like I said, as far as I know it doesn't work in C++.

You might also notice the ? after the *, which makes the match non-greedy. Without it the "(.*)" would match the longest string it could. With it "(.*?)" it matches the shortest string it can.
Apr 28, 2020 at 11:14pm
Ahh ok, thank you! :) I’m gonna try this out now, and maybe look around to see if I can make it work...
May 4, 2020 at 1:22am
Sorry forgot to mark as answered! Lol. Thanks guys! I found that I could do look behinds using Boost.regex and also a working refined version of my original attempts.
May 4, 2020 at 1:47am
BTW, regex doesn’t like anyone...
May 4, 2020 at 2:30am
You definitely have a point there lol.