Mar 6, 2011 at 9:49pm UTC
Hello everyone in this great forum, which I finally joined!
I parse wordlists:
    ifstream StmFI("wordlist");
    string s;
    while (StmFI >> s) {                      // e.g. "-string-with-dashes--"
        while (s.find('-') == 0)
            s.erase(0, 1);
        while (s.rfind('-') == s.size() - 1)
            s.erase(s.size() - 1, 1);
        // ...
    }
If the string is "-" ==> boom!
The first loop leaves it empty, so s.size() - 1 == 0 - 1 == -1, and the failed rfind returns string::npos, which is also -1
==> s.erase(-1, 1) !!!
A cumbersome solution:
    while (s.size() && s.rfind('-') == s.size() - 1)
        s.erase(s.size() - 1, 1);
Does anyone have a more elegant solution?
Last edited on Mar 6, 2011 at 9:55pm UTC
Mar 6, 2011 at 10:16pm UTC
If the string is "-" wouldn't the first while loop take care of everything?
Mar 6, 2011 at 11:31pm UTC
@firedraco: But then the second while loop would run, breaking as TC has stated.
Mar 6, 2011 at 11:35pm UTC
Ah... In that case, I don't see anything better than what you have, although testing for s.empty() is probably easier to read.
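A minimal sketch of the guarded loops with the empty() test, assuming C++11 for front(), back() and pop_back(). The helper name strip_dashes is made up for illustration and does not appear in the thread:

```cpp
#include <string>

// Hypothetical helper: removes leading and trailing '-' characters.
// Checking empty() first means s.size() - 1 is never evaluated on an
// empty string, so it cannot wrap around to std::string::npos.
std::string strip_dashes(std::string s)
{
    while (!s.empty() && s.front() == '-')
        s.erase(0, 1);
    while (!s.empty() && s.back() == '-')
        s.pop_back();
    return s;
}
```

Called on "-" this returns an empty string instead of reaching erase() with an out-of-range position, and dashes inside the word are left alone.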
Mar 7, 2011 at 1:05am UTC
Tbh, I would probably prefer the original, it was a lot easier to figure out what you were doing.
Mar 7, 2011 at 10:09pm UTC
Damn, it's that simple:
    while (s.find('-') == 0)
        s.erase(0, 1);
    while (s.rfind('-') == static_cast<unsigned char>(s.size() - 1))
        s.erase(s.size() - 1, 1);
or as a one-liner:
    while (!(i = s.find('-')) || (i = s.rfind('-')) == (UCHAR)(s.size() - 1))   // UCHAR is typedef'd
        s.erase(i, 1);
...same story as with EOF!
Last edited on Mar 7, 2011 at 10:10pm UTC
Mar 8, 2011 at 12:31am UTC
...I have fun with my wordlists... millions of words... :-)
==> but better to cast to ushort:
    while (!(i = s.find('-')) || (i = s.rfind('-')) == (USHORT)(s.size() - 1))
        s.erase(i, 1);
@Duoas: good idea, I will check this...
Last edited on Mar 8, 2011 at 12:48am UTC
Mar 8, 2011 at 3:13am UTC
Er, why are you casting to random Windows #defined type aliases? Use the proper thing:
    if ((i = s.rfind('-')) != std::string::npos)
Mar 8, 2011 at 8:55pm UTC
...you're right, STL at its best:
    const char *pcc_Char2Cut = "!\"#'()*+,-./:;<>?@[]_`{|}~";   // lead + tail
    ifstream StmFI;
    void S::F_Parse()
    {
        string s;
        while (StmFI >> s) {
            s.erase(s.find_last_not_of(pcc_Char2Cut) + 1).erase(0, s.find_first_not_of(pcc_Char2Cut));
            // ...
        }
    }
a one-liner, and w/o any loop :-)
Remember:
erase(0, 0) is okay (does nothing)
erase(size(), npos) is okay (ditto)
but erase(-1) crashes (the position is past size(), so it throws std::out_of_range)!
One might think the one-liner runs as:
~find_last_not_of()
~erase()
~find_first_not_of()
~erase()
but apparently the compiler is free to generate:
~find_first_not_of()
push result
~find_last_not_of()
~erase()
~erase()
Remember:
erase the tail before the head,
else crash!
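A hedged sketch of one way to sidestep the ordering question entirely: put each erase in its own statement, so every find runs on the string as it is at that moment. Before C++17 the order in which a chained call's arguments are evaluated relative to the earlier calls in the chain was unspecified; C++17 tightened this, but separate statements work with any standard. The helper name trim_chars and its parameters are hypothetical, not from the thread:

```cpp
#include <string>

// Hypothetical helper: trims any of the characters in `cut` from both ends.
// Each step is a full statement, so no index can go stale between the
// find and the erase that uses it.
std::string trim_chars(std::string s, const char* cut)
{
    // Tail first: find_last_not_of returns npos when every character is
    // in `cut`; npos + 1 wraps to 0, so erase(0) clears the string.
    s.erase(s.find_last_not_of(cut) + 1);
    // Head: on an empty string find_first_not_of returns npos, and
    // erase(0, npos) is a harmless no-op.
    s.erase(0, s.find_first_not_of(cut));
    return s;
}
```

With the statements split like this, the head could just as well be trimmed before the tail; the "tail first" rule only matters when both erases share one expression.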
@Duoas: Er, why are you casting to random Windows #defined type aliases?
To avoid a crash. And uchar works for string lengths up to 255, but ushort up to 65K.
@Duoas: if ((i = s.rfind('-')) != std::string::npos)
I only want to delete leading/trailing chars.
Thanx again for your inspiration!
Last edited on Mar 9, 2011 at 12:20am UTC
Mar 9, 2011 at 1:13am UTC
Seems you don't like the one-liner... :-(
The reason for the crash I want to avoid is absolutely clear:
the not-found constant npos is defined as -1, hex 0xFFFFFFFF
an empty string has size 0; my formula subtracts 1, which also gives -1
the comparison is true, so it calls erase(-1) and crashes
a cast to uchar gives 0x000000FF, to ushort 0x0000FFFF
this compares false ==> no call ==> no crash
Sorry, I thought this was clear... maybe I'm too terse sometimes ;-)
Last edited on Mar 9, 2011 at 1:22am UTC
Mar 9, 2011 at 12:35pm UTC
dutChBZ,
npos may be defined as static const size_t npos = -1;
but semantically it is the greatest possible value of type size_t. If size_t is not the type you expect, or changes for some reason in a particular implementation of the STL, your 'magic number' code may well break.
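To illustrate the point, here is a small sketch that puts the narrowing-cast loop from earlier in the thread next to a version that never relies on the numeric value of npos. Both function names are made up for this comparison:

```cpp
#include <string>

// Reproduces the narrowing-cast loop from earlier in the thread
// (hypothetical function name). The cast keeps only the low 8 bits
// of s.size() - 1, so the comparison is only meaningful for short strings.
std::string strip_tail_cast(std::string s)
{
    while (s.rfind('-') == static_cast<unsigned char>(s.size() - 1))
        s.erase(s.size() - 1, 1);
    return s;
}

// Same job without depending on what npos happens to be.
std::string strip_tail_safe(std::string s)
{
    while (!s.empty() && s[s.size() - 1] == '-')
        s.erase(s.size() - 1, 1);
    return s;
}
```

For a 300-character word ending in '-', rfind returns 299 while the cast yields 299 % 256 = 43, so the cast version leaves the trailing dash in place; the guarded version removes it. Both agree on short inputs such as "-" or "abc--".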
Mar 9, 2011 at 1:45pm UTC
Yes, I like one-liners. I like highly optimized, cool-looking code. That's my ASM origin.
Mea culpa.
Mar 9, 2011 at 8:48pm UTC
@jsmith
I think we're wasting our time here.