worked on a small example but threw me an "Segmentation fault: 11" - error when running the compiled program |
The code that you posted looks okay, but I see that it isn't the entire program. I suspect that the crash came somewhere else in the code. Can you post the whole thing?
Is there a more efficient way? |
There are several things that could improve your code:
1 2
|
for (int i = 65; i < 91; i++) ascii.push_back(i);
for (int i = 97; i < 123; i++) ascii.push_back(i);
|
While this works, it's hard to understand. What are the magic numbers 65, 91, 97 and 123? The code would be much clearer if you wrote it like this:
1 2
|
for (int i = 'A'; i <= 'Z'; i++) ascii.push_back(i);
for (int i = 'a'; i < 'z'; i++) ascii.push_back(i);
|
Also consider this:
find(ascii.begin(), ascii.end(), (int) word[i])
To execute this code, the program does:
1 2 3 4 5 6 7 8 9
|
Is word[i] equal to 'A'? No? Hmm.
Okay, is word[i] equal to 'B' then? No....
What about 'C'? Is it 'C'? Dang.
It's gotta be 'D'. Is it 'D'? Rats!
...
Sigh.... okay then 'Z'. It must be 'Z' right? Oh boy. This is getting old.....
Alright then, maybe it's 'a'. Is it 'a'?
...
etc. etc. until it find the letter
|
A more efficient way to determine if a character is a letter is to check if it's in the range, sort of like the code that populates the ascii table:
if (word[i] >= 'A' and word[i] <= 'Z' or word[i] >= 'a' && word[i] <= 'z') ...
Since the set of all ascii characters is small, an even faster way would be to create a vector<bool> that is indexed by the ASCII value of a character, and returns whether the character is a letter. Then you could do something like this:
if (isletter[word[i]]) ...
.
It turns out that such a thing already exists in the C library. It's called isalpha():
1 2 3
|
#include <cctype>
...
if (isalpha(static_cast<unsigned char>(word[i])) ...
|
You are probably asking what's with that cast to unsigned char? Isalpha()'s parameter is an ASCII value expressed as an integer. But the type char can be signed. Meaning that values might be in the range -128 to +127. You want to be sure that you pass a value in the range 0 to 255, so you need to cast the char to an unsigned char.
There's also a bug in the way that you're using
word[i].erase()
. Consider when word contains the string "?!ABC". The first time through the loop you find that '?' isn't a letter, so you execute erase, changing word[i] to "!ABC". So far so good, but then
you increment i to 1. So the next time though the loop you check word[1], which is now 'A'. As a result, you've mistakenly left in a non-alpha character.
There's also efficiency to consider. If word[i] is "!@#$%^ABCDE" then the that first call to erase() must copy "@#$%^ABCDE" (10 characters) over to the left. The next time (assuming that you first fix the above bug), it would copy "#$%^ABCDE" (9 characters) over. Then 7, then 6 and finally 5. That's a lot of copying. It might be faster to build up a second string from scratch using only the alpha characters. This method is easy to understand it avoids the above bug:
1 2 3 4 5 6
|
string word;
for (auto c : str) {
if (isalpha(static_cast<unsigned char>(c))) {
word += c;
}
}
|
This program puts it all together:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22
|
#include <iostream>
#include <string>
#include <cctype>
using std::cin;
using std::cout;
using std::string;
int
main()
{
string str, word;
while (cin >> str) {
string word;
for (auto c : str) {
if (isalpha(static_cast<unsigned char>(c))) {
word += c;
}
}
cout << str << " shrinks to " << word << '\n';
}
}
|