Still Trying To Understand Pointers

Jun 23, 2011 at 12:03pm

Pretend I'm an amateur. Shouldn't be hard to do. Now, pretend I don't understand the following, which I don't. Then, remembering that we're pretending I'm an amateur, please explain to me simply, concisely, but sufficiently why this is so --

int main()
{
	char* ptr1 = "Computer";

	cout << "*ptr1 = " << *ptr1 << endl << endl;
	cout << "ptr1 = " << ptr1 << endl << endl;

	return 0;
}

The output is:

*ptr1 = C

ptr1 = Computer

Why am I getting only one character with the first, and the whole enchilada with the second?

I know it has a lot to do with the fact that *ptr1 is referring to the address (I think), but it seems that I should get the whole word when I declare it that way, for if I make it an array it works differently.

Last edited on Jun 23, 2011 at 12:14pm

Jun 23, 2011 at 12:16pm

Athar (4466)

ptr1 is a pointer to char (like the name implies, a char stores a single character), so dereferencing gets you a character.
However, when printing a pointer to char, it is treated as a pointer to the beginning of a C string. See
http://en.wikipedia.org/wiki/C_string

Pointers to char are treated specially in that regard. When trying to print other pointer types (e.g. int*), you'll get a representation of the address they contain.

Last edited on Jun 23, 2011 at 12:18pm

Jun 23, 2011 at 12:22pm

Lamblion (642)

I understand what you're saying, but I still don't understand the concept. For example, if I do this --

 char* ptr[] = {
"one",
"two",
"three"
};

Then ptr[0] = one, ptr[1]= two, ptr[2] = three, i.e., it translates to the whole word. But I guess it's the same thing because I'm not dereferencing here either. I guess I'll read that article carefully. I'm also reading this article --

http://pw1.netcom.com/~tjensen/ptr/pointers.htm

Jun 23, 2011 at 12:34pm

Athar (4466)

Well, in that case you have an array of char pointers. Each of those pointers points to the beginning of a string literal (those are C strings).
Now if you write ptr[0], you have a char pointer and you're back at the first example.
If you dereference it again, you'll get the first character of "one" and if you print it, the entire C string it points to is printed.

Jun 23, 2011 at 12:42pm

Lamblion (642)

It becomes a little clearer each time I get into it, but I think it is crucial that I understand pointers very well if I'm really going to make real progress in programming C/C++ Win32.

Jun 23, 2011 at 1:53pm

kaije (34)

char* ptr1 is a pointer to a char. Since you assigned "Computer" to it, it points to the first element of an array of 9 chars (the 8 letters plus the terminating '\0'). You can then refer to single characters in the array by dereferencing, like

*ptr1 is C (mem addr 1000)
*(ptr1 + 1) is o (mem addr 1001)
*(ptr1 + 2) is m (mem addr 1002)
*(ptr1 + 3) is p (mem addr 1003)
etc. (the addresses are made up for illustration)

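#include <cstdio>

void crappyPrint(const char* str)
{
  const char* pch = str; // point at the first memory address
  while (*pch != '\0') // C strings end with a 0, i.e. the char '\0'
  {
    std::printf("%c", *pch); // print the current character
    ++pch; // advance to the next character
  }
}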

Jun 23, 2011 at 2:08pm

Disch (13742)

A char is a single character. Example:

char c = 'A';

cout << c;  // prints "A"

When cout is passed a single char, like the above, it will only print the single char.

Now, C++ treats a string literal as an array of const chars, and that array converts to a pointer to its first char when you assign it or pass it to a function.

So:

// note the 'const' here.
const char* ptr = "example";

cout << ptr;  // prints "example";

This is kind of a special case as Athar mentioned. When given a const char*, cout will interpret that as a C-style string and print the entire string. The actual pointer points to the first character in the string (in this case, the first 'e' in example). The rest of the string can be "found" by incrementing the pointer, since the string is stored sequentially in memory.

That said... let's look at your first example:

const char* ptr1 = "Computer";  // note: added const here for good measure

cout << "*ptr1 = " << *ptr1 << endl << endl;
cout << "ptr1 = " << ptr1 << endl << endl;

Here, you're using the * operator to dereference a pointer. When you dereference a pointer, you get whatever the pointer is pointing to.

So if ptr1 is a const char*, then that means that *ptr1 is a const char (ie: it's a single character).

Since ptr1 points to the first character in the string ('C'), that is what is printed when you print *ptr1. You're giving just a single char to cout, so that's all it prints.

However when you pass the pointer to cout, it will print the whole string because of the special case noted above.

Really, this is easier to understand with a type other than char, because chars are treated specially. Let's look at an int:

int foo = 5;
int* ptr = &foo;  // ptr points to foo

cout << *ptr;  // prints '5'
cout << ptr;  // prints the address of foo

Here, *ptr is a single int. Since ptr points to foo, printing *ptr is the same as printing foo. You get the output of '5'. But when you print ptr, that's exactly what you're printing: the pointer. The address in memory at which foo is stored.

char pointers are the same idea, only instead of printing the address, they are treated specially and are printed as a string instead.

Now with your second example:

const char* ptr[] = {
"one",
"two",
"three"
};

Here, ptr isn't really a pointer. It's an array of pointers. Kind of like a const char** (ie a pointer to a pointer). (Hyper-technically this isn't true, but it might help you understand it.)

Therefore, in order to get a single character, you'd need to dereference the pointer twice:

ptr;   // <- a const char**  (pointer to a pointer)
*ptr;  // <- a const char* (pointer)
**ptr; // <- a const char (single char)

Now remember that the bracket operator [] dereferences just like the indirection operator * does: ptr[n] means *(ptr + n), so ptr[0] is exactly the same as *ptr.

So another way to view the above:

ptr;        // <- a const char**
ptr[0];     // <- a const char*
ptr[0][0];  // <- a const char

So when you do this:

cout << ptr[0];

ptr[0] is a pointer to a const char, so "one" is printed as you'd expect.

Last edited on Jun 23, 2011 at 2:08pm

Jun 23, 2011 at 3:09pm

Lamblion (642)

Thanks for that detailed response. I'm beginning to understand pointer arithmetic better, but I haven't reached that point where everything just sort of "clicks". I still have to go back through my notes or look it up in books on certain things (or do the trial and error routine with the compiler), but it's coming.

As I said, I feel instinctively that unless I get a really good handle on pointers I'm going to be badly handicapped. Understanding pointers also helps me understand the computer architecture, and although I'm not really interested in learning Assembly (at least at this point), I'm reading Kip Irvine's book on Assembly so I can better understand what's under the hood, which in turn should help me better understand pointers, as well as other stuff.

I also don't understand the bitwise shifts (if that's the right terminology), which I also need to learn and understand, and which I think the Assembly will help with as well.

Jun 24, 2011 at 7:05am

closed account (2b4hAqkS)

Hi Lamblion,

You're right to want to get the pointer thing dead right. The better you feel about pointers, the better you will feel about programming in general. Even in languages that "don't have pointers" like Java, you still need to understand what it means to pass an object as a function argument and how that works.

To your question, I think something helpful to know is that arrays and pointers are used in much the same way, even though they are not literally the same thing.

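const char* str1 = "text";
char str2[] = "text";

From the point you declare these, str1 and str2 can be indexed and read in the same ways. Basically, when you declare an array of something, the array name acts like a pointer to the very first spot in the array whenever you use it in an expression. (Also note that the pointer is of type const char*, while the array is of type char[5]: four letters plus the terminating '\0'. The array holds its own writable copy of the text, while the pointer refers to a string literal that must not be modified.) For example: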

cout << str1[2];
cout << *(str1 + 2);
cout << str2[2];
cout << *(str2 + 2);

All four of these print the single character 'x'. Regardless of whether you declared something as an array or as a pointer initially, indexing works the same way. Since a C string is just a character array, a pointer to its first character can stand in for it.

In the case of cout, if you have something like this:

int intArray[] = {3,4,5};
cout << intArray;

Now here you're essentially handing cout an int*, and cout says hey, that's an address, and it prints the value of the address (something in hex, like 0xb0334aa0). The thing that is weird about char*'s and cout is that they behave as an exception. Cout knows that what you really want to do is not print the starting address of the string, but ask it to print every character in the string sequentially. It does this by essentially doing something like this pseudocode:

const char* str = "My String";
cout << str;

// cout internally does something like
while (*str != '\0') { // is this character the last (null) character?
   cout.put(*str); // one character
   str++; // point to next character
}

This relies on the fact that when you hand it a char*, it is going to keep printing out every single character until it finds the null character '\0'. This behavior is impossible with passing cout another array type (like int*) because it has absolutely no way of knowing when to stop. When you declare a string like "My String", what you really get is an array of characters like this: {'M', 'y', ' ', 'S', 't', 'r', 'i', 'n', 'g', '\0'};

The null '\0' character is implicitly tacked on at the end to track the end of the string.

Sorry for the long-winded response, but pretty much all of that is necessary to truly understand why you're seeing the behavior you are.

Last edited on Jun 24, 2011 at 7:08am

Jun 24, 2011 at 9:15am

h9uest (157)

Just wanna say, I second Disch.
The real difference lies in cout. In your case, it is cout that decides what to output for a given input.

Lamblion wrote:
Understanding pointers also helps me understand the computer architecture

From my personal experience, that's not the case. If you want to understand computer architecture, you might want to start with computer organization and then go into architecture stuff gradually. Start from a single-cycle machine, then multi-cycle, then pipelined, then superscalar, multi-core, and so on. You might also want to look into topics like data caching, memory coherence, etc. There is so much to computer architecture, and understanding pointers didn't really help me understand it. Read Patterson's Computer Organization and Design and Computer Architecture: A Quantitative Approach; they are helpful.

If you simply want to understand pointers from a high level, I guess knowing the memory model well will be sufficient.

Jun 24, 2011 at 10:44am

Lamblion (642)

crimsonking: That was actually very helpful, the way you explained it.

h9uest: There's so much that could be learned and absorbed that I just have to start somewhere. Right now, learning pointers, and also learning not only how assembly language interacts with the computer but how code from C/C++ actually translates to Assembly, is a full plate for me.

Thanks for the replies. I appreciate them.

Jun 24, 2011 at 7:28pm

freddie1 (1838)

Don't know if you are on information overload by now, Lamblion, as everyone has presented really good and detailed information, but here is a little program with its output that shows the kinds of things I agonized over when I was learning pointers. This little program shows something like what I do just about every day in creating custom controls, and in other things too, when you can't work with named variables but must work with pointers: allocating memory for objects so as to 'take possession' of them. The concept of 'possession' and memory allocations go hand in hand with pointers. Anyway, what this program does is set up on the stack an array of pointers, szData, to the character strings "Zero", "One", "Two", and "Three". Then the program dynamically allocates storage for a buffer to hold pointers to these four strings, then independent storage for the four strings themselves. Finally, it outputs lots of info on everything in terms of where it's at.

If you take the time to look at this, when you are looking at it, think about, for example, how a listbox control in Windows manages internally all the various strings you can feed into it, and at that point I'm sure it will dawn on you how important and powerful a concept this is.

This was an old C program I had, but I just checked it with a modern C++ compiler, and it's OK (had to add a few casts).

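#include "Windows.h"
#include <stdio.h>
#include <string.h>  // strlen, strcpy
#define  LAST_INDEX  3
// Note: the (unsigned) pointer casts and the "16 bytes" message below assume a 32-bit build.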
//
int main(void)
{
 char* szData[]={(char*)"Zero",(char*)"One",(char*)"Two",(char*)"Three"}; // Initial test data is in szData[]
 BOOL blnFree=(BOOL)NULL;                                                 //array, but for purposes of exposition
 char **pMem=NULL;                                                        //this data will be copied to dynamically allocated
 unsigned int i,j;                                                        //storage accessed by char pointer to pointer pMem.
 HANDLE hHeap;
 //
 hHeap=GetProcessHeap();
 pMem=(char**)HeapAlloc(hHeap,HEAP_ZERO_MEMORY,sizeof(char*)*(LAST_INDEX+1));
 if(pMem)
 {
    printf("Allocate 16 bytes to hold four char pointers.\n");
    printf("There will be one allocation to hold the four\n");
    printf("pointers, i.e., pMem, and then four more allocations\n");
    printf("for memory into which to copy the strings themselves.\n\n");
    printf("pMem=%u\n\n",(unsigned)pMem);
    puts("                        UINT (%u)       str (%s)");
    puts("i       &pMem[i]        pMem[i]         pMem[i] ");
    puts("==================================================");
    for(i=0;i<=LAST_INDEX;i++)
    {
        pMem[i]=(char*)HeapAlloc(hHeap,HEAP_ZERO_MEMORY,strlen(szData[i])+1);
        if(pMem[i])
        {
           strcpy(pMem[i],szData[i]);  //Copy *szData[] strings to storage
           printf("%u\t%u\t\t%u\t\t%s\n",i,(unsigned)&pMem[i],(unsigned)pMem[i],pMem[i]);
        }
    }
    printf("\n");
    for(i=0;i<=LAST_INDEX;i++)
    {
        if(pMem[i])
        {
           for(j=0;j<strlen(pMem[i]);j++)  //try some fancy byte
               printf("%c\t",pMem[i][j]);  // manipulations!
           printf("\n");
        }
    }
    printf("\n");
    for(i=0;i<=LAST_INDEX;i++)
    {
        if(pMem[i])
        {
           blnFree=!!HeapFree(GetProcessHeap(),0,pMem[i]);
           printf("blnFree=%u\n",blnFree);
        }
    }
    printf("\n");
    blnFree=!!HeapFree(GetProcessHeap(),0,pMem);
    printf("blnFree=%u\n",blnFree);
 }
 else
    puts("Memory Allocation Failure!");
 getchar();
 //
 return 0;
}

/*
'Output:
'
'Allocate 16 bytes to hold four char pointers.
'There will be one allocation to hold the four
'pointers, i.e., pMem, and then four more allocations
'for memory into which to copy the strings themselves.
'
'pMem=2307184
'
'                        UINT (%u)       str (%s)
'i       &pMem[i]        pMem[i]         pMem[i]
'==================================================
'0       2307184         2303368         Zero
'1       2307188         2307208         One
'2       2307192         2307224         Two
'3       2307196         2307240         Three
'
'Z       e       r       o
'O       n       e
'T       w       o
'T       h       r       e       e
'
'blnFree=1
'blnFree=1
'blnFree=1
'blnFree=1
'
'blnFree=1
*/

Jun 24, 2011 at 8:00pm

Disch (13742)

Lamblion wrote:
I also don't understand the bitwise shifts (if that's the right terminology), which I also need to learn and understand, and which I think the Assembly will help with as well.

Bitwise stuff is simple.

The only thing you need to know is that everything is represented in binary. Let's take a simple unsigned char that's 8 bits wide. Each bit can be either 0 or 1. Let's say we have this 8-bit value:
01001101
Each bit has a "weight" of 2^n, where n is the bit number. So bit number 0 (the least significant bit) has a weight of 2^0 = 1. Bit 1 has a weight of 2^1 = 2, Bit 2 has a weight of 2^2 = 4, etc.

Now just like normal decimal numbers, digits are typically written most significant digit first. So the above bits would be numbered 76543210

To get the value of this, you sum all the weights of the bits that are "on" (ie: 1, not 0)

so again with this number: 01001101

We can see that bits 0,2,3, and 6 are on
which gives us:
2^0 = 1
2^2 = 4
2^3 = 8
2^6 = 64

1 + 4 + 8 + 64 = 77

Therefore 01001101 (in binary) equals 77 (in decimal)

All bitshifting does is shift the bits over. So:

01001101  <- 77
10011010  <- 77 left shifted 1
00110100  <- 77 left shifted 2

Jun 24, 2011 at 9:58pm

Lamblion (642)

Thanks very much for that program, freddie1. I am going to compile and run it later and study it. What Windows does under the hood, such as with the list boxes you mentioned, is quite amazing to me sometimes.

If I could have one programming wish, I'd wish I could program well enough to write the compilers themselves, such as VS 2010! -:)

Disch, is this the same thing as ORing? Again, I'm not sure of the terminology. I've read about it in several books and even copied the code before, but I can't find any of them right now.

And by the way, I'm not worried about information overload. I learn bits and pieces every time one of you writes something, and I really do appreciate it.

As I say, I am now -- and will always be -- just a hobbyist programmer, but I take my hobbies seriously! -:)

Last edited on Jun 24, 2011 at 10:04pm

Topic archived. No new replies allowed.