The function you posted looks for an identifier. Let's go through the exact rules it follows:
    while (isspace(c = getch()))
        ;
This loop runs for as long as it reads whitespace from the input. Because the loop body is empty, it simply swallows all leading whitespace (spaces, tabs and newlines).
    if (c != EOF)
        *w++ = c;
    if (!isalpha(c)) {
        *w = '\0';
        return c;
    }
The first statement checks for end-of-file (no more characters to read). If the first non-whitespace character is not EOF, it is stored as the first character of the string w.
The second statement checks whether that character is non-alphabetic. If so, it terminates the string and returns the character. EOF takes this branch too, so at end of input the function returns EOF and leaves the string empty.
    for ( ; --lim > 0; w++)
        if (!isalnum(*w = getch())) {
            ungetch(*w);
            break;
        }
Next comes the main part of the function, the part that swallows up the rest of the word. The loop continues until it reaches the maximum number of characters (--lim > 0) or until it reads a non-alphanumeric character (the if condition), which it pushes back onto the input with ungetch so that the next call can read it.
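To see how the three fragments behave together, here is a minimal sketch that assembles them into one function. Two things are assumptions rather than part of the fragments: the closing statements (*w = '\0'; return word[0];) are taken from the complete listing, and getch/ungetch are demo-only stand-ins that read from a string, set with the hypothetical helper set_input, instead of from standard input.

```c
#include <ctype.h>
#include <stdio.h>

/* Demo-only input source: getch reads from a string instead of stdin. */
static const char *input = "";
static int pushed = EOF;          /* one pushed-back character; EOF means "none" */

/* Hypothetical helper: choose the text the demo getch will read. */
void set_input(const char *s)
{
    input = s;
    pushed = EOF;
}

int getch(void)
{
    if (pushed != EOF) {
        int c = pushed;
        pushed = EOF;
        return c;
    }
    return *input ? (unsigned char)*input++ : EOF;
}

void ungetch(int c)
{
    pushed = c;
}

/* The original getword, assembled from the fragments discussed above. */
int getword(char *word, int lim)
{
    int c;
    char *w = word;

    while (isspace(c = getch()))  /* skip leading whitespace */
        ;
    if (c != EOF)
        *w++ = c;
    if (!isalpha(c)) {            /* not a letter: return the single character */
        *w = '\0';
        return c;
    }
    for ( ; --lim > 0; w++)
        if (!isalnum(*w = getch())) {
            ungetch(*w);          /* leave the delimiter for the next call */
            break;
        }
    *w = '\0';
    return word[0];
}
```

With the input foo_bar 42, successive calls fill word with foo, _, bar, 4 and 2, and the sixth call returns EOF. The underscore comes back as a token of its own, and digits that do not follow a letter are returned one at a time.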
Putting these pieces together, the function accepts an alphabetic character followed by any number of alphanumeric characters. An underscore is neither alphabetic nor alphanumeric, so it stops the parser, as does any other non-alphanumeric character. A version that also accepts underscores could look like this:
    #include <ctype.h>   /* isspace, isalpha, isalnum */
    #include <stdio.h>   /* EOF */

    /* getword: read the next word or single character from the input */
    int getword(char *word, int lim)
    {
        int c, getch(void);
        void ungetch(int);
        char *w = word;

        while (isspace(c = getch()))      /* skip leading whitespace */
            ;
        if (c != EOF)
            *w++ = c;
        if (!isalpha(c) && c != '_') {    /* not a letter or an underscore */
            *w = '\0';
            return c;
        }
        for ( ; --lim > 0; w++) {
            c = getch();                  /* keep it in an int so EOF survives */
            if (!isalnum(c) && c != '_') {    /* end of the word */
                ungetch(c);
                break;
            }
            *w = c;
        }
        *w = '\0';
        return word[0];
    }
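The listing only declares getch and ungetch, so it will not link until they are defined somewhere. A minimal sketch of the usual pushback-buffer pair follows. The names BUFSIZE, buf and bufp and the buffer size are assumptions, and the buffer holds int rather than char so that a pushed-back EOF is preserved.

```c
#include <stdio.h>

#define BUFSIZE 100        /* assumed capacity of the pushback buffer */

static int buf[BUFSIZE];   /* pushed-back characters; int so that EOF fits */
static int bufp = 0;       /* next free position in buf */

/* Return a pushed-back character if there is one, otherwise read stdin. */
int getch(void)
{
    return (bufp > 0) ? buf[--bufp] : getchar();
}

/* Push a character back so that the next getch() returns it. */
void ungetch(int c)
{
    if (bufp >= BUFSIZE)
        fprintf(stderr, "ungetch: too many characters\n");
    else
        buf[bufp++] = c;
}
```

Characters come back in last-in, first-out order, which is all getword needs, since it never pushes back more than one character at a time.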
You can extend getword in the same way to accept string constants (a leading and a trailing quotation mark), comments (a leading /* and a trailing */), or preprocessor control lines (a leading #), all of which can contain spaces. That is left as an exercise for the reader.
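As a hint for the string-constant case only, here is a minimal sketch of one possible approach, not a complete solution. The helper read_string_constant is hypothetical: the idea is that getword would call it right after reading an opening quotation mark. The getch shown here is a demo-only stand-in that reads from a string chosen with the hypothetical set_input.

```c
#include <stdio.h>

/* Demo-only input source (an assumption for this sketch). */
static const char *input = "";

void set_input(const char *s)
{
    input = s;
}

int getch(void)
{
    return *input ? (unsigned char)*input++ : EOF;
}

/* Hypothetical helper: word[0] already holds the opening '"'.
   Copies the rest of the string constant, spaces included, up to and
   including the closing quote, and returns the resulting length.
   At most lim - 1 characters are stored, followed by '\0'. */
int read_string_constant(char *word, int lim)
{
    int c;
    int n = 1;                        /* skip over the opening quote */

    while (n < lim - 1 && (c = getch()) != EOF) {
        word[n++] = c;
        if (c == '\\') {              /* keep the escaped character, e.g. \" */
            if (n < lim - 1 && (c = getch()) != EOF)
                word[n++] = c;
        } else if (c == '"') {
            break;                    /* closing quote: done */
        }
    }
    word[n] = '\0';
    return n;
}
```

The backslash check is what stops an escaped quotation mark inside the string from ending the constant early. Comments and preprocessor lines can be handled with the same pattern: recognise the opening delimiter, then copy or skip characters until the closing one.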