"Is this correct" ? "What do you prefer?" : It doesn't matter.

closed account (jvqpDjzh)
Talk about the possible leaks and wrongs things to do. You can also say, if 2 things are equivalent, what is the best:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
int main()
{
    using namespace std;

    const unsigned LEN = 5;

    char str0[LEN] = "FAIL";
    char* str1 = new char[LEN];
    const char* str2 = "GOOD";

    //1. IS THE SECOND OPTION STUPID OR COULD I MAKE A MISTAKE 
//AND MAYBE NOT COPY THE '\0' CHARACTER?
    for(unsigned i=0; i<= LEN; i++)//EDIT: ERROR: OUT_OF_BOUNDS
        str1[i] = str0[i];
    cout << str1 << endl;

    //2. I AM CORRECT
    for(unsigned i=0; i< LEN; i++)
        str1[i] = str0[i];
    str1[LEN] = str0[LEN];//EDIT: ERROR: OUT_OF_BOUNDS
    cout << str1 << endl;

    //3. AM I BETTER THAN MY OLDER BROTHERS?
    unsigned c = 0;
    cout <<"\nCopying str0 to str1: ";
    while(str1[c] = str0[c++]);

    cout << "\nMaking str1 point to '\\0': ";
    while(*str1) cout << *str1++;

    cout <<"\nShowing str1: ";
    for(int i= - 4; str1[i] !='\0'; i++)
        cout << str1[i];

    cout <<"\nShowing str1 again (note: I output also '\\0'): ";
    for(int i=0; i> -5; i--)
        cout << str1[i];

    cout << "\nMaking str1 point to the first character again: ";
    for(unsigned i=0; i<4; i++) str1--;

    cout << "\nIn fact now str1 = "<<str1;

    cout <<"\nMaking str1 point to '\\0' again: ";
    while(*str1) cout << *str1++;

    cout <<"\nCopying str2 to str1 (I AM DANGEROUS!): ";
    while(*--str1 = *str2++);

    cout << "\nAHAH, I am beginning from 1, I am rebel!: ";
    for(int i=1; str1[i] !='\0'; i++)
        cout <<str1[i];

    cout << "\nNo, you are stupid, look at me now: ";
    while(*str1) cout <<*str1++;


    cin.get();
    return 0;
}


Last edited on
1) Access out of bounds here: str1[LEN] access in invalid
2) Remove line 19 (out of bounds again) and you will copy whole content of one character array to another
3) undefined behavior: using c, which changes, twice in unsequenced order.
correct way: strcpy(str1, str0) or strncpy() (safer but not exactly what your codedoes)
Also you are copying c-string which contained inside array, not the whole array (That is not matters in your case)

Line 27: correct, but you loses pointer to beginning of array.

Lines 30, 34: unsafe. Anything changes in your program and you are in trouble. But negative indexes are pretty standard.

Line 38: you have reclaimed pointer back, but it is terribly unsafe too.
Also I would use str1 -= 4; here.

Line 43 = 27

Line 46: you are copying backward. You keep trailing 0 and that is fine.

Line 49: OK

Line 54: OK

Whole program: LEAK. You did not delete str1 array.
It is very unsafe. It works only because you know all sizes and content beforehand. If anything changes (like array size or array content) and you need to walk through your program changing things.


Bonus: following is completely legal:
1
2
char line[] = "Hello!";
std::cout << 5[line];
Last edited on
closed account (jvqpDjzh)
strcpy(str1, str0) or strncpy()
Why you are always suggesting functions from the libraries? Do you think I don't know strcpy()?
Last edited on
Why you are always suggesting functions from the libraries
Because
a) Everyone understand what happens at the moment they see them. Your 1 and 2 examples are not crear in what are they supposed to do and will confuse people changing your code.
b) They often implemented more effectively than naive approach (strcpy might be implemented to read chunks of aligned data and copy it (fast)).
c) They are laconic and descriptive. There is no need to invent the wheel when you have standard function which does exactly what you want.

I actually forgot to suggest memcpy() for first two examples.
closed account (jvqpDjzh)
1) Access out of bounds here: str1[LEN] access in invalid
Why is it invalid? '\0' is also a character and I can copy it to str1, which has to have at the end that exact char

Remove line 19 (out of bounds again) and you will copy whole content of one character array to another

Well, if a const char* string has at the end of its content a '\0', why shouldn't I copy also that to the new string?
I could also do: str1[LEN] = '\0', it's the same!

Line 27: correct, but you loses pointer to beginning of array.
? pointer arithmetic? Never heard about that?

Line 54: OK
????


Whole program: LEAK. You did not delete str1 array.
Do you think that was what I have in mind?
This is not a program, of course. This are just experiments.
Why is it invalid?
char str1[LEN] contains LEN characters with valid indexes [0, LEN-1]. Like int x[2] will have 2 integers with indexes 0 and 1.
In your case trailing 0 is fifth character which is held in array by index 4 or LEN-1

I could also do: str1[LEN] = '\0', it's the same!
Read before. If you declare char arr[5] then arr[5] is out of bounds

pointer arithmetic? Never heard about that?
Which is easy to make mistake with. It is not an actual complain, just note that it is unsafe.

Line 54: OK
????
Lines were pointing to the beginning of the block I was talking about as whole. It is the same as block starting at line 28 and I did not think that you need warning about unsafe operations again. Aside from that both of them are perfectly valid way to show string character by character.

Do you think that was what I have in mind?
zwulu wrote:
Talk about the possible leaks
Yes I do. You have a leak here. And you cannot delete your array unless you know exactly what operations were performed and what is the length of all strings involved.

Just imagine that str2 was declared as extern const char* str2;
closed account (jvqpDjzh)
contains LEN characters with valid indexes [0, LEN-1].
So you are saying that '\0' is not a valid index? So for what should we use it? Everybody knows that every c-string has to have at the end a null terminated character to be considered a c-string, so I know that at the position 5 (LEN) there is that character, because I have just inserted it there! And I have allocated memory for exactly 5 characters to have space also for our '\0'. I don't understand why you are saying that '\0' is invalid or why accessing it is invalid, it's nonsense what you are saying

Read before. If you declare char arr[5] then arr[5] is out of bounds
No, at the position 5 we have the '\0' that I have just inserted. I know that it is there, it's not invalid.

Which is easy to make mistake with. It is not an actual complain, just note that it is unsafe.
I am not studying C++, because it's safe! I study also java and c#, no problem. I could use the smart points, but I don't see any acceptable reason to do it! ++C == JAVA && ++C == C#.


It is the same as block starting at line 28
Really? Didn't see it

Yes I do. You have a leak here.
Anyway, thank you!

Just imagine that str2 was declared as extern const char* str2;
Should we still use extern?
Last edited on
closed account (jvqpDjzh)
Sorry, I am completely stupid, I mean: str1[LEN -1] = str0[LEN - 1];
Last edited on
Should we still use extern?
Sure thing. When you need to access enviromental variables or you want to access variable in another file. Or you are in embedded enviroment which is one of the few reasons to use C features in C++

Really? Didn't see it
Find the difference:
while(*str1) cout << *str1++;   
while(*str1) cout <<*str1++;  
I might have confused you because I count only strings which actually do work here.

I am not studying C++, because it's safe!
C++ Standard Comitee wants people to not use C features in C++ unless there is no other way, instead replacing them with safer C++11 alternatives.
However you should know about those features anyway so I did not complained about it before.

I am not good at explaining, but look here:
1
2
3
4
5
char str0[LEN] = "FAIL";
//F A I L \0  ←Letters
//0 1 2 3 4   ←Indexes
//Total of 5 (LEN) chracters with indexes 0, 1, 2, 3, 4. 
//index 5 is out of bounds 
These tutorials might explain better:
http://www.cplusplus.com/doc/tutorial/arrays/
http://www.cplusplus.com/doc/tutorial/ntcs/

EDIT:
I mean: str1[LEN -1] = str0[LEN - 1];
Ok, but you still have out of bound access in (1) and in (2) this line is not needed because loop itself handles this case.
Last edited on
closed account (jvqpDjzh)
while(*str1) cout << *str1++;
while(*str1) cout <<*str1++;
I was joking.

Sorry, I have just got confused, but I meant LEN - 1, of course, so this should be correct, because it copies also the '\0', right?
1
2
    for(unsigned i=0; i< LEN; i++)
        str1[i] = str0[i];


EDIT:
Ok, but you still have out of bound access in (1) and in (2) this line is not needed because loop itself handles this case.
Yes, you are right, sorry.

EDIT: I have to pay more attention to these things, they are not complicated, but as you see I'm still making mistakes. Really, 1 index can change whole your program.
Last edited on
closed account (jvqpDjzh)
3) undefined behavior: using c, which changes, twice in unsequenced order.
correct way: strcpy(str1, str0) or strncpy() (safer but not exactly what your codedoes)
Also you are copying c-string which contained inside array, not the whole array (That is not matters in your case)
1
2
unsigned c = 0;
while(str1[c] = str0[c++]);

The first thing that happens is assigning the values at position 0 from one array to another and then c increments. In my case, when c is 3, it will copy the last character 'l' and then c increments agains, at that point what really happens? Is first assign the '\0' to str1 or the expression is evaluated immediately as 0 and the loop finishes?

The first thing that happens is assigning the values at position 0 from one array to another and then c increments.
This might happens. And it happens often enough for people assume that it is expected behavior.

However what might happens too (and happens on some platforms. I learned this the hard way):
Character at str0[c] saved in register
c is incremented.
Pointer to str1[c] is calculated (using new value of c)
Saved character is saved to memory location calculated pointer points to.
Repeat


If that was not an UB, it would copy everything up to first encountered '\0' symbol: operator= returns reference to value it had assigned and it checked by loop after assigment takes place. So this means loop will end after '\0' character was assigned.

I also noted that in first two examples you have copied whole array. If it contained '\0' in the middle it would copy whole array anyway.
Third one is copies a c-string: sequence of character ending with '\0' character. If there was '\0' in the middle of the array it would copy only that part leaving everything after unchanged.
Those examples have different semantics so it is impossible to tell what is "better" without knowing what are they used for.
closed account (jvqpDjzh)
1
2
for(int i=0; str0[i] != '\0'; i++) str1[i] = str0[i];
//This copies until the first '\0' character, without including it. 

//so str1 doesn't have the null terminated character yet, therefore we put it at the end of the array anyway:
str1[len - 1] = '\0';
At least, in this case, str1 will have just 1 '\0' and no undefined behaviour, that I can see, could happen
Last edited on
Yes, This loop is fine and it does what you want.

Common implementation also includes:
1
2
3
4
5
char* s1 = str1;
char* s0 = str0;
while(*s1++ = *s0++)
    ;
//Faster than index based loop on non- or badly-optimizing compilers 
It is actually common approach to strcpy() implementation. Unless library uses heavy optimized functions. In this case you will see something like this: http://www.stdlib.net/~colmmacc/strlen.c.html
(It is code for srtlen in GNU C library)
Here is code for strcpy: http://code.metager.de/source/xref/gnu/glibc/string/strcpy.c
1
2
for(int i=0; str0[i] != '\0'; i++) str1[i] = str0[i];
str1[len - 1] = '\0';

//This copies until the first '\0' character, without including it.
//so str1 doesn't have the null terminated character yet, therefore we put it at the end of the array anyway:
// At least, in this case, str1 will have just 1 '\0' and no undefined behaviour, that I can see, could happen


Lets play then:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
const size_t len {5};
char str1[len];

char str0[len] = "Hi";
// str0 == { 'H', 'i', '\0', '\0', '\0' }

for(int i=0; str0[i] != '\0'; i++) str1[i] = str0[i];
str1[len - 1] = '\0';
// str1 == { 'H', 'i', ?, ?, '\0' }

// content of str1 is (partially) undefined. Not OK.


char str2[] = "Hello"; // array has 6 elements
for(int i=0; str2[i] != '\0'; i++) str1[i] = str2[i];
// no out-of-bounds error, for last written character, 'o', is at legal str1[len - 1]
// a close shave

str1[len - 1] = '\0';
// str1 == "Hell"
// str1 is valid C-string 

Pushing a longer str2 into str1 is an obvious error.
However, it is the subtle str0 that your code does not expect.
closed account (jvqpDjzh)
yes, you are right, I have to find something better
Either move declaration of i out of the loop and after loop make str1[i] = '\0'; or use one of alternative approaches I posted: either BSD implementation of strcpy (in post directly) or glibc one (one of the links). And use strcpy directly after you done implementing it yourself.
closed account (jvqpDjzh)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
//For string literals, this has to work!
//returns the size including the '\0'
int cstrlen(const char* str1="")
{
    int n;
    for(n = 0; str1[n] != '\0'; n++);
    return ++n;
}

//This only works with pointers that point to an allocated memory
void cpycstr(char* &str0, int alloc_mem, const char* str1)
{
    int str1_len = cstrlen(str1);
    if(alloc_mem < str1_len)
    {
        delete[] str0;
        str0 = new char[str1_len];
        for(int i=0; i < str1_len; i++) str0[i] = str1[i];;//Copies also the '\0'
    }
    else if(alloc_mem == str1_len)
        for(int i=0; i < str1_len; i++) str0[i] = str1[i];
    else//does not reallocates memory if str0 has the capacity to contain str1
        //instead the rest elements are assigned to '\0'
    {
        for(int i=0; i<str1_len; i++) str0[i] = str1[i];
        for(int i=str1_len; i<alloc_mem; i++) str0[i] = '\0';
    }
}

void show_str(char* a)
{
    while(*a) std::cout << *a++;
    std::cout << '\n';
}

int main()
{
    char* a = new char[3];
    char* b = new char[10];
    char* c = new char[6];

    const char* str1 = "Hello";//length = 6
    cpycstr(a, 3, str1);
    show_str(a);

    cpycstr(b, 10, str1);
    show_str(b);
    b[5] = 'a';
    b[6] = 'b';
    b[7] = 'c';
    b[8] = 'd';
    //b[9] = '\0';//has already it!//don't override this last character
    show_str(b);

    cpycstr(c, 6, str1);
    show_str(c);

    std::cin.get();
    return 0;
}



Last edited on
1) cstrlen works fine, but including trailing 0 is not the behavior people expects from length function. If you did not aim to replicate standard library behavior it is fine.

2) cpycstr. It works, however it is slightly ineffective (why do you need to fill all excess memory with 0?) and it is more verbose than it should (rule of thumb: if line repeats more than twice and sometimes more than once it is a good indicator that you should rethink your program flow). Also passing pointer by reference and changing it is rarely good idea: in real program there would be probably mutiple copies of same pointer floating around.
It might be fine in wrapper around direct copy routine.

Changed cpystr to be less verbose without changing semantics:
1
2
3
4
5
6
7
8
9
10
11
12
void cpycstr(char* &str0, int alloc_mem, const char* str1)
{
    int str1_len = cstrlen(str1);
    if(alloc_mem < str1_len)  {
        delete[] str0;
        str0 = new char[str1_len];
        alloc_mem = str1_len;
    }
    for(int i=0; i<str1_len; i++) str0[i] = str1[i];
    //if str1_len = alloc_mem this loop will not be executed
    for(int i=str1_len; i<alloc_mem; i++) str0[i] = '\0';
}
closed account (jvqpDjzh)
but including trailing 0 is not the behavior people expects from length function.
I know, I have just written it to match it with the other functions

why do you need to fill all excess memory with 0?
Yes, you are right, in fact I was thinking that the function was to large for such a "small" job.
Topic archived. No new replies allowed.