How many words are in the C++ language

I'm an engineer, but not a programmer. My question is simple, but it's hard to find the answer with a search engine.

"How many words are in the C++ language?"

By definition, C++ is a language, so it should fit this definition:

All languages rely on the process of semiosis to relate a sign with a particular meaning. Spoken and signed languages contain a phonological system that governs how sounds or visual symbols are used to form sequences known as words or morphemes, and a syntactic system that governs how words and morphemes are used to form phrases and utterances.


The definition says that a language must have two things: words and a syntactic system. For a programming language, the syntactic system is of course the syntax. But what are the "words" in C++ (or any programming language), and how many are there?

Could we say that the "words" of a programming language are its semantics?

A programming language is usually split into the two components of syntax (form) and semantics (meaning) and many programming languages have some kind of written specification of their syntax and/or semantics.


http://en.wikipedia.org/wiki/Semantics#Computer_science

Or could we say that the "words" in C++ are the reserved words and predefined identifiers, which are collectively referred to as keywords? Could we say that keywords, operators, operator precedence, and the "functions, constants, classes, objects and templates of the standard library" are the "words" of C++?

Sure, we can add new "words" to C++ by using declarations, but my question is straightforward: how many basic "words" are there in C++, in the same sense as the question "How many words are there in the English language?"

Since there is no exact answer to a question like this, a rough answer is sufficient.

Can anyone help me?
If you want to compare programming languages with natural languages, then words would be all sequences of letters or digits and maybe some symbols. Programming languages are different from natural languages, though. I don't think they have phonological systems (there is definitely nothing 'phono-' about them).
If you're looking for a list of keywords, see http://begincpp.blogspot.com/2009/11/list-of-all-c-keywords.html
I don't think the term 'word' is used in formal languages.
I guess the most similar thing is the lexeme of a terminal symbol; this would include all possible literals and identifiers.
The number of possible valid combinations of characters is huge and depends on the specific implementation.
@Bazzy: Actually, it is. A word of a formal language is any string that can be constructed from its terminal symbols using the rules of the language.

@OP: If you view C++ as a formal language, any program that can be written in C++ would be a 'word'. Otherwise, regarding keywords, here is a list of all of them:
http://cs.smu.ca/~porter/csc/ref/cpp_keywords.html


Please keep in mind that programming languages (and other formal languages) really aren't the same thing as natural languages.
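
To make the difference between keywords and identifiers concrete, here is a minimal C++ sketch. The helper names is_keyword and looks_like_identifier are made up for this illustration, and the keyword table is deliberately a small sample, not the full list from the links above. The identifier check is also simplified (it ignores universal character names, for example).

```cpp
#include <algorithm>
#include <array>
#include <cctype>
#include <string>

// A deliberately incomplete sample of C++ keywords (illustration only).
static const std::array<const char*, 8> kSampleKeywords = {
    "char", "class", "for", "if", "int", "return", "void", "while"};

// Hypothetical helper: true if `token` is in the sample table above.
bool is_keyword(const std::string& token) {
    return std::any_of(kSampleKeywords.begin(), kSampleKeywords.end(),
                       [&](const char* kw) { return token == kw; });
}

// Hypothetical helper: true if `token` has the shape of an identifier
// (a letter or underscore first, then letters, digits, or underscores)
// and is not one of the sample keywords.
bool looks_like_identifier(const std::string& token) {
    if (token.empty() || is_keyword(token)) return false;
    const unsigned char first = static_cast<unsigned char>(token[0]);
    if (!(std::isalpha(first) || first == '_')) return false;
    return std::all_of(token.begin(), token.end(), [](unsigned char c) {
        return std::isalnum(c) || c == '_';
    });
}
```

With this sketch, is_keyword("while") is true, while "cout" and "question" are only identifier-shaped names. The keyword table is finite and fixed by the language; the set of possible identifiers is practically unlimited.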
I think a programming language is easier to understand if it is approached through a familiar concept.

Based on what I quoted (from the wiki) and on how foreign languages are learned, I think "words" is the correct term. To be able to use English as a second language, you start by learning the meaning of words and how to use the words (syntax).

In a programming language "the number of possible valid combinations of characters is huge", but the basic words are limited. Basic words, I think, are those that don't need to be declared. Basic words are code that the machine (compiler) understands immediately, without needing an explanation first. For example, the machine recognizes the word "while" without being told or given a definition of "while".

I think you got that wrong. A machine doesn't understand the instruction while at all.

What you mean would be the operation codes of CPUs, but those don't really have anything to do with C++ anymore.

A compiler is not a machine; it is a program that translates code from one language (C++ in this case) into a form that is readable by a machine.

It would be best to learn programming first and become familiar with its concepts, and not compare it too much with natural languages, because while there are similarities, they are completely distinct things.
That's why, when I install a C++ compiler on my machine, the basic words recognized by the machine are everything in Reserved Words, Predefined Identifiers, Operators, Operator Precedence, and Preprocessor Directives, plus everything defined in the C++ Standard Library.

If I want to use new words, constructed from terminal symbols, I have to tell the machine first, since the machine is "stupid". The process of telling the machine is "declaration", right?
For me, software and hardware are both the machine, where software is just another "virtual" machine.

Actually, I want to compare it with how physics first defines its object of study and then builds up and defines the entire system step by step, to learn how the system works.

Since physics uses both mathematical language and "definition language", I see words as the matter and syntax as the law that dictates how matter interacts. Therefore, the definitions used by physics are not entirely the same as in natural language. Let's just say that I want to use the language of physics to define programming (with which I am a little familiar), using C++ as the case study.

That way, a machine, whether a compiler or hardware, recognizes a limited set of types of matter and a limited set of laws of interaction.

As the compiler is a system, words here are the objects of study. We can create a new object of study, but then we have to define which laws apply to it.

For me, software and hardware are both the machine, where software is just another "virtual" machine.


Sorry, but in that case you have a completely wrong image of what software is. Programs are data that a machine operates on. There are programs that behave like machines (http://en.wikipedia.org/wiki/Virtual_machine ), but they are not the same thing.


And as for programming, a top-down approach would be better. Programming languages are, compared to machine languages and machines in general, highly abstract things. Knowing how machines work won't help you understand programming, much as knowing how sound waves work won't help you learn to speak. Knowledge about the machines you program for does prove useful in practice (especially knowing the practical limits of your programs), but in the end programs are not things that were created to be executed by machines; it is machines that were built to execute programs.
My world is engineers and scientists, and not many of us fully understand programming languages.

Try to define software if you can, but in the terminology of physics or biology. As far as I know, physics has no firm definition of information yet, and I believe software is information, no longer mere data, since the set of data is now organized, structured, or presented in a given context.

I'm testing my explanation that although a programming language is different, it's still a language. Since a language is characterized by two aspects, the matter (words) and the law (the syntactic system), what is the matter (words) of the C++ language?

Since a lot of programmers are quick to say that C++ has a lot of "words", it makes me wonder whether programmers can see that those "words" could be organized in a slightly different fashion, to view them from the perspective of a different field of study.

Seriously, I think the "basic words" in C++ are everything in Reserved Words, Predefined Identifiers, Operators, Operator Precedence, and Preprocessor Directives, plus everything defined in the C++ Standard Library.

We can define a new word as

char question[]= "";

but we have to tell the machine that the word "question" is a character array by using the word for defining something, which is "char".

I believe the code above should be called a declaration. Can you see the difference between the words "char" and "question"? The computer recognizes char because, during installation of the compiler on the computer, we tell the computer what the definition of char is, and the compiler converts char into low-level language to be processed by the processor. But the word "question" cannot exist without a declaration.

Then, when we use the code

cout << question;

can you see the difference between the words "cout" and "char"? I believe that, in the sense of words, "cout" here can be considered a verb while "char" is an adjective.
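
The two fragments above can be put together into a small compilable sketch. The function names print_question and question_as_string are made up for this illustration, and the question text is just sample data. The stream is passed in as a parameter so that the output can be checked; passing std::cout gives the behaviour of the original snippet.

```cpp
#include <iostream>
#include <sstream>
#include <string>

// Hypothetical helper: writes a declared character array to any output stream.
void print_question(std::ostream& out) {
    // `char` is a keyword: the compiler knows it without any declaration.
    // `question` is an identifier: it exists only because this line declares it.
    char question[] = "How many words are in the C++ language?";

    // `out` stands in for std::cout here. `<<` is an operator that the
    // standard library overloads for output streams.
    out << question;
}

// Hypothetical helper: captures what print_question writes, as a string.
std::string question_as_string() {
    std::ostringstream buffer;
    print_question(buffer);
    return buffer.str();
}
```

Calling print_question(std::cout) from main would print the question to the console. Note that three different kinds of names appear: a keyword (char), a name introduced by the program (question), and names that come from the standard library headers (std::cout, std::ostream).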
Honestly, I think you are sort of stabbing strawmen here. The difference here is whether you define a "word" as the set of the terminal symbols, or the set of possible combinations of terminal symbols over the grammar of the language. I have actually no idea what you're getting at, because the code above IS a declaration (and a definition).

I believe the code above should be called a declaration. Can you see the difference between the words "char" and "question"? The computer recognizes char because, during installation of the compiler on the computer, we tell the computer what the definition of char is, and the compiler converts char into low-level language to be processed by the processor.


Are you really an engineer? The concept of datatypes doesn't exist on machine level at all. Datatypes are part of the semantics of the language.

Also, cout is not a "word" if you define words as the set of terminal symbols: "cout" is simply a predefined object of the type "ostream", and "ostream" in turn is a standard class made available by the "iostream" header, and so on and so on...


Honestly, please tell me what exactly you are trying to say here.
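
The point about cout can be checked with a short C++ sketch. The function name reuse_cout_as_a_name is made up for this illustration. The static_assert confirms at compile time that std::cout is an object of type std::ostream, and the function shows that cout, unlike a keyword, is a name a program is free to reuse.

```cpp
#include <iostream>
#include <type_traits>

// std::cout is an ordinary object of type std::ostream, not a keyword.
static_assert(std::is_same<decltype(std::cout), std::ostream>::value,
              "std::cout is declared as a std::ostream");

// Because `cout` is only a library name, a program may reuse it as a local
// variable name. The same is not possible with a keyword such as `while`.
int reuse_cout_as_a_name() {
    int cout = 42;  // legal: this local `cout` is unrelated to std::cout
    return cout;
}
```

Replacing the local variable name with while would make the compiler reject the function, which is one practical way to tell a keyword from a library identifier.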
Engineering has many branches. Mine is materials engineering. Can you see the relation between software engineering and mine? We also have a broad range of specialties, ranging from metallurgy to materials simulation.

Defining a "word" as the set of terminal symbols is of no use for my question. "Basic words", for me, are the basic words that are already defined for the machine.

Different fields of study have different definitions, and neither (natural) language nor semantics is our field. I get it: everyone's opinion is not to mix up natural languages with programming languages. I only want to see how far I can describe a programming language (C++, the popular one) using the language of physics.

Physics can be applied to a virtual system as long as the relations of the matter in the virtual system are defined first. Ergo, a programming language is a virtual system according to physics.

I already stated my conclusion: by defining "basic words" first, I conclude that the "basic words" in C++ are everything in Reserved Words, Predefined Identifiers, Operators, Operator Precedence, and Preprocessor Directives, plus everything defined in the C++ Standard Library.

I defined "basic words" as the set of terminal symbols that are recognized by the compiler (recognized = having a definition) without the need for a declaration.

It's up to you guys, "the programmers", which area you want to take my topic to.
Sorry for my grammar.
http://en.wikipedia.org/wiki/Terminal_and_nonterminal_symbols

I completely messed up the definition of terminal symbol here.

Sorry again
closed account (zb0S216C)
I believe that restrict is also a keyword; I don't think it's standard (it's defined in _mingw.h).

Edit -------------------8<------------------
Here's a link that explains what restrict does: http://developers.sun.com/solaris/articles/cc_restrict.html
Well, I see now.

You see C++ at a much deeper layer, while I still see it from the "user" point of view, in which I only know how to "use" C++.

What I define as "words" cannot easily be defined in computer science without affecting other established definitions. I can also see that the definition of "words" from natural language cannot be used, because computer science has already created a different meaning for "sequences of symbols" or "words".

I think this is where the definition of terminal and nonterminal symbols comes in.

If in natural language we define a word as a set of symbols forming the smallest meaningful unit that can stand by itself, then, from what I understand, a terminal symbol is equivalent to the symbol in the natural-language definition of a word. In other words, a set of terminal symbols forming the smallest meaningful unit that can stand by itself should be the meaning of "word" in a formal language.

But that does not apply, since there is confusion about what "meaningful" means here. Apparently in computer science, since it is built on definitions, the definitions go deep into every layer, and meaning becomes the subject of heavy discussion, since apparently everything has meaning.

Well, that's my understanding as a non-computer-science engineer.
Well, depends. In general use, what we call a word would be, for example, "computer". However, if we view the English language as a whole, "computer" would be a symbol, and "I know how to operate a computer" would be a word that can be constructed in the language "English". All general-use words of English would form the alphabet of the English language.

As I said, I still don't really get where you're trying to go with this. Do you want to learn something about C++, or do you want to talk about formal languages, or what?

Oh and @Framework: Of course it's standard, but only in C99. And you normally don't need to include a header for it either... (that would sorta go against the concept of having a keyword for this, no?) - maybe at that point some sort of macro is defined that marks pointers as unaliased like the restrict keyword does without having the compiler follow C99?
closed account (zb0S216C)
And you normally don't need to include a header for it either

I believe you do in GCC. GCC complains about restrict when _mingw.h isn't included. Although restrict is among the keywords list, only __restrict__ actually works.
Have you tried gcc test.c -o test -std=c99? As I said, restrict is a keyword in C99, but GCC doesn't automatically assume you are in C99 mode.
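
For the C++ side of this: restrict is a C99 keyword and is not part of standard C++, but GCC and Clang accept the spelling __restrict__ as an extension in C++ mode. The following is a minimal sketch that assumes one of those compilers; the function name add_arrays is made up for this illustration.

```cpp
#include <cstddef>

// Assumes GCC or Clang: `__restrict__` is a compiler extension, not standard C++.
// It promises the compiler that the memory reached through each of these
// pointers is not accessed through the others inside this function.
void add_arrays(int* __restrict__ out,
                const int* __restrict__ a,
                const int* __restrict__ b,
                std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        out[i] = a[i] + b[i];
    }
}
```

The qualifier only gives the optimizer extra information; breaking the promise (for example, passing the same array as out and a) is the caller's error and the compiler does not check it.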
closed account (zb0S216C)
Hmmm...You have a point there, Hanst.