Copying Strings

1) Why can't I do the following?
char b [] = a;

A C-style array cannot be copy-initialized from another C-style array for the same reason it cannot be copy-assigned: the designers of C deliberately broke compatibility with B here. In B such statements were a common programming idiom (array rehoming), and that idiom could never work in C.

Note that if you wrap arrays in structs, they become copyable (B had no structs, so no B programmer could have written such code):

struct S {
    char a[6];
};
int main() {
    S a = {"hello"}; 
    S b = a;
}

2) How come in the following code I can do "student1.name = n;"?
3) How come in the code above when I change student 2's name it changes student 1's name?

name is a pointer, that's a whole different (although related) ball game.

The line

student1.name = n;

creates a pointer to n[0] and copies that pointer into name.

The line MITStudent student2 = student1; makes another copy of that pointer, student2.name - it's still pointing at the same n[0].

The line student2.name[0] = 'b'; changes n[0]

Aug 13, 2012 at 5:57pm

In reference to your response to #1:
- I am still not clear on #1. Can you make your explanation simpler please.
- Also, I've never heard of B, and what is copy-initialized and copy-assigned?
- What is the difference between char a[] = "blah" and char *a = "blah"?

In reference to your response to #2 & #3:
- So what you're saying is since the student's name is a pointer, it points to the address of n. Then when I say MITStudent student2 = student1, since the object's name parameter is a pointer, I am simply equating the pointers, hence when I change one I change them all. Correct?

Aug 13, 2012 at 6:07pm

I've never heard of B,

That was the language before C. It is extinct, don't worry about it. It's just that the reason the line you asked about doesn't compile is this old B to C migration.

What is the difference between char a[] = "blah" and char *a = "blah"?

One creates a C-style array of five char called "a" and populates it with the characters 'b', 'l', 'a', 'h', and '\0'. This array will be destroyed at the next closing brace.

The other creates a nameless read-only character array of five char at program startup (which is only destroyed at program termination), then creates a pointer called "a" and stores the address of the first character (the character 'b') in it. Incidentally, this is an error in modern C++; the correct syntax is const char *a = "blah";

Note that in C++, you should be using std::string a = "blah"; (but then you wouldn't ever have a chance to learn about B!)

since the student's name is a pointer, it points to the address of n.

It points to the address of n[0], the first character of your character array. But yes, otherwise it is about right: when you change the value pointed by one pointer, you can observe the change through another pointer to the same char object.

Last edited on Aug 13, 2012 at 6:10pm

Aug 13, 2012 at 6:22pm

Great explanation!

I'm still not clear on that first question though, why can't I say "char b [] = a"?

int main() {
char a [] = "hello";  // This creates 6 characters in memory and names them a
char b [] = a;          /* Doesn't this also create six characters and copy them?
                                   The only reason why it doesn't work that I could think of
                                    is that a only points to a[0]. But then wouldn't b at
                                    least be equal to 'h'? */    
}

Also why do I need the "const" before char *a = "blah"?

I haven't learnt about strings in c++ yet ... the MIT course I'm looking at sticks closer to C first, and yes I wouldn't have known about B :P

Thanks!

Aug 13, 2012 at 6:37pm

char b [] = a; /* Doesn't this also create six characters and copy them? */
It does this with C++ strings, C++ arrays, and C arrays inside structs, and it *would* do that for raw C arrays, except that the creators of C decided to block this specific case for the sake of the B programmers.

why do I need the "const" before char *a = "blah"

Because that 'b' is a const char: a string literal is an array of const char, so a pointer to its first character is a const char*.

I haven't learnt about strings in c++ yet ... the MIT course I'm looking at sticks closer to C first

Sounds like that "Introduction to C++" nonsense that some undergrads put together for opencourseware: http://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-096-introduction-to-c-january-iap-2011/index.htm It was a good exercise for the students that made it, but it's useless as teaching material.

Last edited on Aug 13, 2012 at 6:39pm

Aug 13, 2012 at 6:56pm

viliml (791)

why do I need the "const" before char *a = "blah"

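Because whenever you use a literal in your code, the compiler puts it somewhere in the RAM of your computer. If you just changed that memory like crazy, something bad would happen. Try this code (it reads far past the end of the literal, which is undefined behavior, so what you get depends on your platform):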

#include <iostream>
int main()
{
	for (int i=0; i<10000; i++) std::cout<<"Hello World!"[i];
}

It will probably have alternating chunks of random data and C-string names of C++ stuff like constructors, destructors, vtables, lambdas, keywords, and other library stuff.

Aug 13, 2012 at 8:33pm

@Cubbi:
1) so are you saying that if a and b were integers this would work? Ie

int a [] = {1, 2, 3}; 
char b [] = a;

2) are you saying b is a constant char because of what @viliml said?

3) That is exactly where I'm learning from, I am very open to hearing of other places I should learn from.

Thanks guys!

Aug 13, 2012 at 9:20pm

1) so are you saying that if a and b were integers this would work?

No, I am saying that if a and b were strings, C++ arrays, vectors, or pretty much anything that's not a C-style array, it would work:

#include <vector>
int main()
{
    std::vector<int> a = {1, 2, 3}; 
    std::vector<int> b = a; 
}

online demo: http://ideone.com/zZK37

#include <array>
int main()
{
    std::array<int, 3> a = {1, 2, 3}; 
    std::array<int, 3> b = a; 
}

online demo: http://ideone.com/ZY68E

#include <string>
int main()
{
    std::string a = "blah"; 
    std::string b = a; 
}

online demo: http://ideone.com/vwMRt

etc.

2) are you saying b is a constant char because of what @viliml said?

Not really. I am saying that the 'b' in "blah" is a constant char because that's what happens when you use double quotes in source code: a read-only array is created. Viliml is pointing out some of the possible repercussions of modifying memory around the array on platforms where the read-only property is not supported at the OS level.

3) That is exactly where I'm learning from, I am very open to hearing of other places I should learn from.

There are a few decent books: "Accelerated C++", "Programming: Principles and Practice using C++", "C++ Primer" (I'd preorder the new edition though, if it's not in stores yet).

Last edited on Aug 13, 2012 at 9:22pm

Aug 14, 2012 at 2:00pm

1) I think I get it.

2) Okay, I just want to clarify: a constant char means the value stored in the variable cannot be changed. Correct? And the reason why it cannot be changed is because it is in read-only memory. Correct? Well, if this is the case then wouldn't the second line of code below work, because it is not in read-only memory? Also, I don't understand what happens when you use double quotes.

char *a = "hello";
char b[] = "yello";

3) Thanks!

Aug 14, 2012 at 2:26pm

Assuming your code is inside a function:

The line char *a = "hello"; does this:
1. In the read-only section of the program image, the characters 'h', 'e', 'l', 'l', 'o', '\0' are stored, say, at .rodata offset 0 <- this is what the double quotes do

2. At the point in the program where you wrote that line, a pointer-to-char is created (typically, in a CPU register) and the value .rodata+0 is stored in it.

The line char b[] = "yello"; does this:

1. In the read-only section of the program image, the characters 'y', 'e', 'l', 'l', 'o', '\0' are stored, say, at .rodata offset 6

2. At the point in the program where you wrote that line, a 6-character array is allocated on the stack, then a loop is compiled that copies the six characters from read-only memory (.rodata+6 through .rodata+11) to the six locations on the stack.

As a result, a[0] .. a[5] are read-only locations, while b[0] .. b[5] are writable.

Aug 14, 2012 at 2:50pm

THAT WAS SUCH A GOOD EXPLANATION! THANK YOU SO MUCH!

I have two more questions if you don't mind...

1) Is this loop that copies the data to the stack executed at run time or during compile time?
2) Stack memory is last in first out right. Well wouldn't that be an issue? I mean if I have 5 character arrays and I wanted to access the 3rd one that would be a problem. Am I not understanding stack memory?

Aug 14, 2012 at 2:53pm

By the way I ordered the book:

http://www.amazon.com/Primer-Plus-Edition-Developers-Library/dp/0321776402/ref=sr_1_2?ie=UTF8&qid=1344954371&sr=8-2&keywords=C%2B%2B+Primer

Aug 14, 2012 at 3:07pm