how do i determine and cout an email from a string

basically, i have to do it WITHOUT using regex. people told me that i have to find '@' and check if it's valid to be an email address, but since it has to be done with functions, i'm a little lost. it has to be done with a function, or i guess it has to be one. could you please help me with it and explain? thanks in advance
Well, the email regex is this: https://emailregex.com/
So if you don't want to use regexes but still want to correctly detect email addresses, then you'll need to write a state machine that's equivalent to that regex.
Well, a simple way is to find the @, then find the preceding space (or the beginning of the string) and the succeeding space (or the end of the string). The email address lies between these two. So consider:

#include <string>
#include <iostream>
#include <algorithm>
#include <iterator>

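// Returns the space-delimited token containing the first '@', or an empty string if there is no '@'.
std::string email(const std::string& str)
{
	if (const auto at {std::find(str.cbegin(), str.cend(), '@')}; at != str.cend())
		// Search backwards from the '@' for the preceding space and forwards for the following one.
		return std::string(std::find(std::reverse_iterator(at), str.crend(), ' ').base(), std::find(at, str.cend(), ' '));
	else
		return {};
}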

int main()
{
	const std::string str {"qwer my@asd.df.co rrr"};

	std::cout << email(str) << '\n';
}



my@asd.df.co

Email is exceedingly difficult to verify. The standard is too forgiving.
It's something like: only one @ symbol, the last token comes from a known published list, and almost anything at all is allowed on either side of the @ symbol. There is a little more to it than this (you can read the published standard online), but that is the gist of it.
If you are not checking legality and are just fishing it out of a string, the above is perfect (find the @ and the ends).
Hello laura fidarova,

I was thinking about working on this, but my question is how much checking do you need to do?

Are you just looking for the "@" or do you need more?

If this is a school assignment, or not, post the full instructions that you have so everyone does not have to guess at what you need.

Andy
thank you all!!
Handy Andy, it's one of the questions from my school assignment, all it says is 'determine an email from a given string using function' that's all. But my teacher told me that i can't use regex for some reason
Hello laura fidarova,

I think the question being asked is if "jdoe@somebis.com" is an e-mail address because it has the "@" in it.

My question is if "john.doe@somebis.com" or "john_doe@somebis.com" is an e-mail address because the lhs of "@" is properly formed. Then does the rhs of "@" end with ".com", ".net", ".org", ".gov" and some others? The name on the lhs of "." is not important unless the entire rhs of "@" needs to match a given domain name.

Also the rhs of "@" could also contain a 2 letter country code.

Something to think about and it can be done by checking the string and sub-strings.

The program I started with checks the string for the "@" and if present creates 2 strings for the lhs and rhs, but after that I am not sure what you need to do.
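A minimal sketch of that first step, assuming an address must contain exactly one "@". The name split_at_sign is hypothetical and not from the assignment; it only splits the string and does no further validation.

```cpp
#include <optional>
#include <string>
#include <utility>

// Hypothetical helper: splits "lhs@rhs" at its single '@'.
// Returns std::nullopt when the string has no '@' or more than one.
std::optional<std::pair<std::string, std::string>>
split_at_sign(const std::string& s)
{
    const auto at = s.find('@');
    if (at == std::string::npos || s.find('@', at + 1) != std::string::npos)
        return std::nullopt;

    // lhs is everything before the '@', rhs is everything after it.
    return std::make_pair(s.substr(0, at), s.substr(at + 1));
}
```

Calling split_at_sign("jdoe@somebis.com") would give the pair "jdoe" and "somebis.com", which can then be checked separately.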

Andy
'determine an email from a given string using function'


You need some clarification (or an alternative translator). Does your question say:
- "determine if this string is a valid email address";
or
- "extract the email address that is contained within this string" (for example: "send this message to laura.fidarova@somewhere.com.")? (Which also poses some interesting questions about terminating punctuation).
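For the second reading, here is a rough sketch of extraction that also drops terminating punctuation. The function name extract_email and the set of characters treated as trailing punctuation are assumptions made for illustration, not part of the assignment.

```cpp
#include <string>

// Hypothetical helper: returns the space-delimited token containing the
// first '@', with trailing sentence punctuation removed.
// Returns an empty string if there is no '@'.
std::string extract_email(const std::string& text)
{
    const auto at = text.find('@');
    if (at == std::string::npos)
        return {};

    // Token boundaries: the space before the '@' (or the string start)
    // and the space after it (or the string end).
    const auto before = text.rfind(' ', at);
    const auto begin = (before == std::string::npos) ? 0 : before + 1;
    auto end = text.find(' ', at);
    if (end == std::string::npos)
        end = text.size();

    // Assumption: none of these characters can end an address.
    const std::string trailing = ".,;:!?)\"'";
    while (end > at + 1 && trailing.find(text[end - 1]) != std::string::npos)
        --end;

    return text.substr(begin, end - begin);
}
```

With the example sentence above, the final period is stripped and only the address remains. It does not check that what remains is a valid address.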
Hello laura fidarova,

Personally I do not care if this is for school or not, but post the complete instructions that you were given so that this does not become an X/Y problem, http://xyproblem.info/ , and so that everyone will know what you have to do.

Andy
Assuming for simplicity that we don't accept quoted local parts or bracketed domain parts and spreading out the regex, it becomes much simpler:

(?:
  [a-z0-9!#$%&'*+/=?^_`{|}~-]+
  (?:
    \.[a-z0-9!#$%&'*+/=?^_`{|}~-]+
  )*
)
@
(?:
  (?:
    [a-z0-9]
    (?:
      [a-z0-9-]*
      [a-z0-9]
    )?
    \.
  )+
  [a-z0-9]
  (?:
    [a-z0-9-]*
    [a-z0-9]
  )?
)
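A hand-written matcher for the simplified pattern above might look like the sketch below. All function names are made up for illustration, and it assumes lower-case input exactly as the pattern is written (no case-insensitive flag).

```cpp
#include <cstddef>
#include <string>

// [a-z0-9]
bool is_lower_alnum(char c)
{
    return (c >= 'a' && c <= 'z') || (c >= '0' && c <= '9');
}

// [a-z0-9!#$%&'*+/=?^_`{|}~-]
bool is_atom_char(char c)
{
    const std::string punct = "!#$%&'*+/=?^_`{|}~-";
    return is_lower_alnum(c) || punct.find(c) != std::string::npos;
}

// Local part: one or more atoms separated by single periods.
bool matches_local(const std::string& s)
{
    bool prev_dot = true; // start counts as "after a period", so a leading '.' fails
    for (char c : s) {
        if (c == '.') {
            if (prev_dot) return false;
            prev_dot = true;
        } else if (is_atom_char(c)) {
            prev_dot = false;
        } else {
            return false;
        }
    }
    return !prev_dot; // also rejects an empty string and a trailing '.'
}

// Domain part: at least two labels separated by periods; each label starts
// and ends with [a-z0-9] and may contain '-' in between.
bool matches_domain(const std::string& s)
{
    std::size_t labels = 0;
    std::size_t start = 0;
    while (true) {
        const auto dot = s.find('.', start);
        const auto end = (dot == std::string::npos) ? s.size() : dot;
        if (end == start) return false; // empty label
        if (!is_lower_alnum(s[start]) || !is_lower_alnum(s[end - 1])) return false;
        for (std::size_t i = start; i < end; ++i)
            if (!is_lower_alnum(s[i]) && s[i] != '-') return false;
        ++labels;
        if (dot == std::string::npos) break;
        start = dot + 1;
    }
    return labels >= 2;
}

// Whole-string match: '@' is not an atom character, so the first '@' must
// be the separator; any later '@' makes the domain check fail.
bool matches_simplified_pattern(const std::string& s)
{
    const auto at = s.find('@');
    if (at == std::string::npos) return false;
    return matches_local(s.substr(0, at)) && matches_domain(s.substr(at + 1));
}
```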
It just means, don't do everything in main... I doubt you need to be as robust as the regex that helios linked.

#include <iostream>
#include <string>

bool is_valid_email_address(const std::string& address)
{
    // Look, I'm using a function!
    return true;
}

int main()
{
    bool valid = is_valid_email_address("dog@dog.com");
    if (valid)
        std::cout << "valid!\n";
    else
        std::cout << "invalid!\n";
}


https://www.cplusplus.com/reference/string/string/find/
https://www.cplusplus.com/reference/string/string/substr/
The assignment is badly specified. The instructor should have given a description of what he thinks a valid email is.

I suggest considering the following (at least) to be invalid:

a@b@c.d        # more than one @
a.b@           # no domain part
@c.d           # no local part
a.b@c          # no period in domain part
.a.b@c.d       # local part starts with period
a.b.@c.d       # local part ends with period
a..b@c.d       # local part has 2 or more periods in a row
a.b@.c.d       # domain part starts with period
a.b@c.d.       # domain part ends with period
a.b@c..d       # domain part has 2 or more periods in a row

You may also want to limit the types of characters allowed in each part.
The pieces of the domain part that are separated by periods allow only letters and digits, and also a dash as long as it isn't the first or last character in a piece. (Also the pieces can't be more than 63 chars.)
The local part allows quite a few punctuation characters along with letters and digits.

The different checks offer a lot of opportunity for multiple functions.
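As a rough sketch of how the structural checks listed above could be split into functions (the names are hypothetical, and the character-set and 63-character limits are left out to keep it short):

```cpp
#include <string>

// True if s is non-empty and has no leading, trailing, or consecutive periods.
bool periods_well_placed(const std::string& s)
{
    return !s.empty()
        && s.front() != '.'
        && s.back() != '.'
        && s.find("..") == std::string::npos;
}

// Applies only the structural rules from the list above:
// exactly one '@', non-empty local and domain parts, well-placed periods,
// and at least one period in the domain part.
bool looks_like_email(const std::string& s)
{
    const auto at = s.find('@');
    if (at == std::string::npos || s.find('@', at + 1) != std::string::npos)
        return false;

    const std::string local = s.substr(0, at);
    const std::string domain = s.substr(at + 1);

    return periods_well_placed(local)
        && periods_well_placed(domain)
        && domain.find('.') != std::string::npos;
}
```

Each of the ten invalid examples above is rejected by this sketch, while a.b@c.d is accepted. A separate function per part could then restrict the allowed characters.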