Runtime Error - Codewars - Mumbling - C++ (Newbie)

Hi, I'm just starting to teach myself C++ and using Codewars to gain some experience.

I'm really stuck on a runtime error I got yesterday.

My code appears to be an adequate solution to the "Mumbling" Kata (C++), but I've clearly done something wrong that is causing runtime errors.

I believe this is some sort of memory / page fault, and perhaps I'm misusing string str initialization or manipulation?

It would be great if someone could tell me why I'm getting the runtime errors.


  class Accumul
{
public:
    static std::string accum(const std::string &s)
    {
     size_t i, a, b, c=0;
     static std::string str;
           
      i=s.size();
      for (a=0;a<i;a++)
          {
          s[a]<91 ? str[c] = s[a] : str[c] = (s[a]-32);
          c++;
          for (b=a;b>0;b--)
              {
              s[a]>91 ? str[c] = s[a] : str[c] = (s[a]+32); 
              c++;
              }
        if (a<(i-1)) str[c]='-'; 
        c++;
           }
      return str.c_str(); 
    }
 };

/*   Sample Tests
void testequal(std::string ans, std::string sol) {
    Assert::That(ans, Equals(sol));
}
static void dotest(std::string s, std::string expected)
{
    testequal(Accumul::accum(s), expected);
}
Describe(accum_Tests)
{
    It(Fixed_Tests)
    {
        dotest("ZpglnRxqenU", "Z-Pp-Ggg-Llll-Nnnnn-Rrrrrr-Xxxxxxx-Qqqqqqqq-Eeeeeeeee-Nnnnnnnnnn-Uuuuuuuuuuu");
        dotest("NyffsGeyylB", "N-Yy-Fff-Ffff-Sssss-Gggggg-Eeeeeee-Yyyyyyyy-Yyyyyyyyy-Llllllllll-Bbbbbbbbbbb");
    }
};
*/ 


This is the output I'm getting:

Time: 2152ms Passed: 1 Failed: 0 Exit Code: 1
Test Results:
accum_Tests
Fixed_Tests
Test Passed
STDERR
UndefinedBehaviorSanitizer:DEADLYSIGNAL
==1==ERROR: UndefinedBehaviorSanitizer: SEGV on unknown address 0x000000000000 (pc 0x00000042a691 bp 0x00000042aef0 sp 0x7ffda61d6c20 T1)
==1==The signal is caused by a READ memory access.
==1==Hint: address points to the zero page.
==1==WARNING: invalid path to external symbolizer!
==1==WARNING: Failed to use and restart external symbolizer!
#0 0x42a690 (/workspace/test+0x42a690)
#1 0x4256a2 (/workspace/test+0x4256a2)
#2 0x7fa052b92bf6 (/lib/x86_64-linux-gnu/libc.so.6+0x21bf6)
#3 0x404609 (/workspace/test+0x404609)

UndefinedBehaviorSanitizer can not provide additional info.
==1==ABORTING


Line 12 immediately accesses the first character of an empty string.

std::string accum(const std::string &s)
{
    std::string result;
    
    for (std::size_t i = 0; i < s.size(); ++i)
    {
        if (i) result.push_back('-');
        
        result.append(1, s[i] &~ 0x20);
        result.append(i, s[i] |  0x20); 
    }   
    
    return result;
}
Thank you @mbozzi

"Line 12 immediately accesses the first character of an empty string."

Can you expand on this?

I create str on line 7 but do not initialize it, is that the problem?

In your elegant example you create result on line 3 and also don't initialize it?

On line 12 I copy s[a] into str[c], but why is this an issue?

You also add s[i] to result in your line 9, although you do so specifically using the .append() function. Is it critical that I use .append() when adding to a string in C++?

I have now read up on .push_back() and .append() - thank you


result.append(1, s[i] &~ 0x20); // 1 time, char
result.append(i, s[i] | 0x20); // i times, char

What does the bitwise operator &~ do here?
What is 0x20? Is this a memory address? Why do you use it?

What does the bitwise operator | do here?

Would be great if you could help me out with this.


Ok, using .append() really helped. Thank you.

class Accumul
{
public:
    static std::string accum(const std::string &s)
    {
        size_t i, a, b;
        std::string str;
        i=s.size();

        for (a=0;a<i;a++)
        {
            s[a]<91 ? str.append(1,s[a]) : str.append(1,(s[a]-32));
            for (b=a;b>0;b--) s[a]>91 ? str.append(1,s[a]) : str.append(1,(s[a]+32));
            if (a<(i-1)) str.append(1,'-');
        }
        return str.c_str();
    }
};

The solution above works with no errors.

I now see that 0x20 is hex for 32, which is also the offset between ASCII uppercase and ASCII lowercase, but I can't quite grasp how you are using the &~ and | bitwise operators to ensure your line 9 is always uppercase, and your line 10 is repeated i times in lowercase.

result.append(1, s[i] &~ 0x20);
result.append(i, s[i] | 0x20);



On line 12 I copy s[a] into str[c], but why is this an issue?


Because at the beginning str has no content, so its length is 0, but this is attempting to access the non-existent str[c] char. Even if c is 0, at the start str[0] is not a character you can assign to, as str is empty.
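To make the point about the empty string concrete, here is a minimal sketch. The helper names can_write_at and build_by_appending are made up for illustration and are not part of the kata. It relies on two facts about std::string: a default-constructed string has size 0, and at() checks the index and throws std::out_of_range, whereas writing a character through operator[] at or beyond size() is undefined behaviour and is not checked.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: tries to write ch at index pos using the
// bounds-checked at(). Returns false if the index does not exist.
bool can_write_at(std::string& str, std::size_t pos, char ch)
{
    try
    {
        str.at(pos) = ch;   // at() throws if pos >= str.size()
        return true;
    }
    catch (const std::out_of_range&)
    {
        return false;
    }
}

// Hypothetical helper: grows the string instead of indexing into it.
std::string build_by_appending()
{
    std::string str;        // empty, size() == 0
    assert(str.empty());
    str.push_back('A');     // size() is now 1
    str.append(2, 'a');     // size() is now 3
    return str;
}
```

With an empty string, can_write_at(str, 0, 'x') returns false, which is the checked equivalent of what str[c] = s[a] was doing unchecked. build_by_appending() returns "Aaa".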



I see. Thank you @seeplus

Just to clarify terminology...
A page fault usually refers to the operating system and hardware working together to move memory in blocks (pages) between physical RAM and backing storage such as disk. A page fault occurs when the current CPU instruction needs data on a page that is not resident in RAM (fault!) and has to wait for it to be brought in before the program can proceed (having this happen frequently is a major performance hit). This is not an error; it is a normal operation that you want to happen as infrequently as possible.

What you had was a coding error, specifically an invalid memory access (an attempt to use memory that does not belong to your program, trapped by the operating system and prevented before you could cause cascading instability).
… a page fault usually refers to the operating system and hardware working together to move memory in blocks (pages) between physical RAM and backing storage…


I’m sure you’re right @jonnin - thank you for the clarification

Happy New Year!
I can't quite grasp how you are using the &~ and | bitwise operators to ensure your line 9 is always uppercase, and your line 10 is repeated i times in lowercase.


Both our approaches assume an ASCII-compatible text encoding.
In ASCII-compatible encodings the difference between an uppercase letter 'A'-'Z' and a lowercase letter 'a'-'z' is that the sixth bit is unset in the former and set in the latter.

For example 'A' is represented by a byte containing the binary value 1000001 while 'a' is represented by a byte with value 1100001.

The bit math is intended to set and unset the sixth bit.
('A' | 0x20) == 'a'; // 0100 0001 | 0010 0000 == 0110 0001
('a' | 0x20) == 'a'; // 0110 0001 | 0010 0000 == 0110 0001
('A' &~ 0x20) == 'A'; // 0100 0001 & ~0010 0000 == 0100 0001 & 1101 1111 == 0100 0001
('a' &~ 0x20) == 'A'; // 0110 0001 & ~0010 0000 == 0110 0001 & 1101 1111 == 0100 0001

I wrote 0x20 (instead of 32) not to confuse, but rather to emphasize the pattern of bits.
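The bit patterns above can be checked by the compiler. This is only a sketch under the same ASCII assumption; the function names to_upper_ascii, to_lower_ascii and check_all_letters are invented here for illustration.

```cpp
#include <cassert>

// Assumption: ASCII-compatible encoding, where bit 0x20 is the only
// difference between 'A'-'Z' and 'a'-'z'.
constexpr char to_upper_ascii(char ch) { return static_cast<char>(ch & ~0x20); } // clear the bit
constexpr char to_lower_ascii(char ch) { return static_cast<char>(ch | 0x20); }  // set the bit

static_assert(to_lower_ascii('A') == 'a', "setting 0x20 lowercases an uppercase letter");
static_assert(to_lower_ascii('a') == 'a', "setting 0x20 leaves a lowercase letter alone");
static_assert(to_upper_ascii('A') == 'A', "clearing 0x20 leaves an uppercase letter alone");
static_assert(to_upper_ascii('a') == 'A', "clearing 0x20 uppercases a lowercase letter");

// The trick is only meaningful for letters: '1' is 0x31, and clearing
// 0x20 turns it into the control character 0x11.
static_assert(to_upper_ascii('1') != '1', "non-letters are not preserved");

// Runtime check over the whole alphabet.
void check_all_letters()
{
    for (char upper = 'A'; upper <= 'Z'; ++upper)
    {
        const char lower = to_lower_ascii(upper);
        assert(lower == upper + 0x20);
        assert(to_upper_ascii(lower) == upper);
    }
}
```

The last static_assert is the reason this shortcut suits a kata whose input is letters only, but not general text.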

A programmer could also use the standard library functions std::tolower and std::toupper to do the same thing, but in the spirit of your original solution, I didn't.
https://en.cppreference.com/w/cpp/string/byte/toupper
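For comparison, the same loop written with std::toupper and std::tolower might look like the sketch below. The names accum_std and accum_std_example are made up for this post. The cast to unsigned char is there because passing a negative char value to these functions is undefined behaviour.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Sketch: same algorithm, but the standard library does the case conversion.
std::string accum_std(const std::string& s)
{
    std::string result;

    for (std::size_t i = 0; i < s.size(); ++i)
    {
        if (i) result.push_back('-');

        // Convert through unsigned char before calling toupper / tolower.
        const unsigned char ch = static_cast<unsigned char>(s[i]);
        result.append(1, static_cast<char>(std::toupper(ch))); // one uppercase copy
        result.append(i, static_cast<char>(std::tolower(ch))); // i lowercase copies
    }

    return result;
}

// Usage example.
void accum_std_example()
{
    assert(accum_std("abcd") == "A-Bb-Ccc-Dddd");
}
```

Unlike the bit math, this version leaves non-letters such as digits unchanged.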

Line 22 should be
return str;
What do you suppose str.c_str() does and why do you think it was needed?

Both our approaches assume an ASCII-compatible text encoding.
In ASCII-compatible encodings the difference between an uppercase letter 'A'-'Z' and a lowercase letter 'a'-'z' is that the sixth bit is unset in the former and set in the latter.

For example 'A' is represented by a byte containing the binary value 1000001 while 'a' is represented by a byte with value 1100001.


Great - got it. I would never have spotted the pattern, but it's clear now you've explained it. The sixth bit from the right is 32 in 8 bit binary. Very nice.

The bit math is intended to set and unset the sixth bit.

('A' | 0x20) == 'a'; // 0100 0001 | 0010 0000 == 0110 0001
('a' | 0x20) == 'a'; // 0110 0001 | 0010 0000 == 0110 0001
('A' &~ 0x20) == 'A'; // 0100 0001 & ~0010 0000 == 0100 0001 & 1101 1111 == 0100 0001
('a' &~ 0x20) == 'A'; // 0110 0001 & ~0010 0000 == 0110 0001 & 1101 1111 == 0100 0001


Ah - I see. The bitwise operator is being applied to the 8 bit binary equivalents of both the value prior to, and immediately after, the operator.

This now makes perfect sense.

Thank you for taking the time to explain it. That's really helpful :).

I did a little bit of hobby programming in the late 1980s and very early 1990s, but haven't done any programming for the best part of 30 years until picking it back up a couple of weeks ago, so I'm very rusty!


Line 22 should be
return str;
What do you suppose str.c_str() does and why do you think it was needed?


Yes, I figured that out earlier today.

Yesterday, when I first encountered the runtime errors, I wasn't sure what was causing them, so, as an aid to debugging, I added a line including std::cout << str - which refused to print str.

I found that cout would only print str if I applied it as str.c_str(). This led me to wrongly suspect that I also needed to use str.c_str() to return its value.

return str.c_str(); - this works fine, but I now realize it's completely unnecessary.
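For anyone reading later and wondering why return str.c_str(); compiled and passed: c_str() yields a const char*, and because the function returns std::string, a new string is constructed from that pointer, copying characters up to the first null. A minimal sketch, with function names made up for illustration, where an embedded null is added only to make the difference visible:

```cpp
#include <cassert>
#include <string>

// Returns the string object itself; every character is kept.
std::string return_directly()
{
    std::string str("ab");
    str.push_back('\0');    // embedded null, for demonstration only
    str.push_back('c');
    return str;             // size() == 4
}

// Returns through a const char*; std::string(const char*) stops
// at the first null character.
std::string return_via_c_str()
{
    std::string str("ab");
    str.push_back('\0');
    str.push_back('c');
    return str.c_str();     // builds a second string holding only "ab"
}

// Usage example.
void c_str_example()
{
    assert(return_directly().size() == 4);
    assert(return_via_c_str() == "ab");
}
```

For the kata's input (letters and dashes only) both forms give the same result, so return str; is simply the shorter and cheaper one.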

Thanks for all the help.

Happy New Year!
