I'm implementing a basic routine in C++ to integrate with some of Amazon's AWS services. The one I'm having trouble with is Amazon Glacier: the program crashes when one of my class functions builds the canonical request.
If I run this class against a small source file, it works fine, but if I run it against a larger file (e.g. 100MB), I get EXC_BAD_ACCESS. The underlying hash calculation routine has been tested and works without errors on both small and large files, so the problem seems to lie in the integration with the parent class. The backtrace:
#0 0x00007fff80e66c53 in __objc_personality_v0 ()
#1 0x00007fff809ff4c6 in unwind_phase2 ()
#2 0x00007fff809fe96e in _Unwind_RaiseException ()
#3 0x00007fff86ff71dc in __cxa_rethrow ()
#4 0x00007fff80e6718d in _objc_terminate ()
#5 0x00007fff86ff6001 in safe_handler_caller ()
#6 0x00007fff86ff605c in std::terminate ()
#7 0x00007fff86ff7152 in __cxa_throw ()
#8 0x00007fff8c925255 in std::__throw_length_error ()
#9 0x00007fff8c94b84e in std::string::_Rep::_S_create ()
#10 0x00007fff8c94c1d7 in std::string::_Rep::_M_clone ()
#11 0x00007fff8c94c40e in std::string::reserve ()
#12 0x00007fff8c94c7d9 in std::string::append ()
#13 0x0000000100006b81 in std::operator+<char, std::char_traits<char>, std::allocator<char> > (__lhs=@0x10001a858, __rhs=@0x7fff5fbff5f8) at basic_string.h:2085
#14 0x0000000100003531 in mynamespace::AmzGlacier::canonicalPost (this=0x7fff5fbff850, pVaultName=@0x7fff5fbffa80, pHostName=@0x7fff5fbffa70) at amazon_glacier.cpp:22
#15 0x0000000100009056 in main (argc=4, argv=0x7fff5fbffb58) at main.cpp:52
This is the top of the amazon_glacier.cpp file; line 22, referred to in the backtrace, is the last line of this code extract. main.cpp:52 simply calls the canonicalPost function (i.e. amzn.canonicalPost("joe","bloggs");).
As you've rightly deduced, that suggests that kAmzGCanocLinearPfx is broken.
Something in your program is overwriting the memory used by kAmzGCanocLinearPfx. You now have to work backwards to work out when that happens and find the offending code.
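One way to work backwards is to drop a cheap check at several points in the program and see where the string first stops matching its expected value. Here is a minimal sketch of that idea; stillEquals and checkpoint are hypothetical helper names, not part of your code, and if the string object is badly damaged the check itself may crash, which still tells you roughly where things went wrong.

```cpp
#include <cassert>
#include <cstring>
#include <iostream>
#include <string>

// Hypothetical helper: true while `s` still holds exactly `expected`.
// The length is compared first so that a corrupted, huge size() is never
// used as the byte count for the comparison.
bool stillEquals(const std::string& s, const char* expected) {
    const std::size_t n = std::strlen(expected);
    return s.size() == n && std::memcmp(s.data(), expected, n) == 0;
}

// Hypothetical checkpoint: logs where the check ran and whether it passed.
bool checkpoint(const char* where, const std::string& s, const char* expected) {
    const bool ok = stillEquals(s, expected);
    std::cerr << where << ": " << (ok ? "intact" : "CORRUPT") << '\n';
    return ok;
}
```

Call checkpoint with the constant and its literal value after each major step (after opening the file, after each read, after each hash). The first call that reports CORRUPT narrows the overrun down to the code between it and the previous call.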
Either my understanding of C is poor (not difficult ;-) ... or you've failed to grasp my problem (not your fault, probably my description !).
Let's say my compiled program is called x.
If I run
./x small-file.abc
It works
If I run
./x big-file.def
It crashes out.
Therefore I fail to comprehend your reasoning about something overwriting kAmzGCanocLinearPfx. Because if it was overwriting, it would be overwriting for both runs, not just one. Secondly, as I have pointed out, kAmzGCanocLinearPfx is a const, and I am only using that const in one place in my code... the place that I posted the extract above. That's it.
The header defines it as : const std::string kAmzGCanocLinearPfx = "x-amz-content-sha256:";
And the code uses it as : kAmzCanocLinear = kAmzGCanocLinearPfx + mynamespace::HexConvertSS(linearHash) + mynamespace::kNewLine;
cat and grep my code all you like, but you won't find it anywhere else. ;-)
Just to reinforce this further, if I comment out the call to canonicalPost and just call the constructor shown above, it works perfectly for both large and small file. As soon as I add the call to canonicalPost it crashes out at the location detailed above.
If I change kAmzCanocLinear = kAmzGCanocLinearPfx + mynamespace::HexConvertSS(linearHash) + mynamespace::kNewLine;
Therefore I fail to comprehend your reasoning about something overwriting kAmzGCanocLinearPfx.
Your own logging shows it's corrupt.
Because if it was overwriting, it would be overwriting for both runs, not just one.
Maybe that's a clue. One code path runs correctly while the other has a memory overrun.
Secondly as I have pointed out, kAmzGCanocLinearPfx is a const
It's logical const rather than physical const. It's not put into a read-only data segment or anything like that. It just makes the compiler check that you're not changing it with non-const members.
However, the object has an implementation that's in memory somewhere. That buffer overrun is writing over part of the implementation of kAmzGCanocLinearPfx. That much you've already proved.
You need to step thru and verify where in your big-file.def scenario the memory overrun is happening. I can assure you, there is one there somewhere.
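To illustrate how identical code can behave for a small input and trash a neighbour for a large one, here is a toy sketch. It is not your program: Arena, fillScratch and readTag are made-up names, and the "overrun" is deliberately kept inside one array so the demo itself is well defined. The point is only that the reader's const view does nothing to stop a writer elsewhere from changing the same bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// One block of memory holding two logical regions, so the "overrun" below
// stays inside a single array.
struct Arena {
    unsigned char bytes[32];
};

const std::size_t kScratchSize = 8;  // region the writer is meant to use
const std::size_t kTagOffset = 8;    // neighbouring data starts here
const char kTag[] = "x-amz";

void initArena(Arena& a) {
    std::memset(a.bytes, 0, sizeof a.bytes);
    std::memcpy(a.bytes + kTagOffset, kTag, sizeof kTag);  // copies the '\0' too
}

// Buggy on purpose: trusts `len` instead of clamping it to kScratchSize.
// It only clamps to the whole arena so that the demo stays well defined.
void fillScratch(Arena& a, std::size_t len) {
    if (len > sizeof a.bytes) len = sizeof a.bytes;
    std::memset(a.bytes, 'A', len);
}

// Reads the neighbour through a reference-to-const: const restricts what
// this function may do, not what other code does to the same bytes.
std::string readTag(const Arena& a) {
    const unsigned char* p = a.bytes + kTagOffset;
    std::size_t n = 0;
    while (kTagOffset + n < sizeof a.bytes && p[n] != 0) ++n;
    return std::string(reinterpret_cast<const char*>(p), n);
}
```

With a length of 4 the tag survives; with a length of 12 the same call writes past the 8-byte scratch region and the tag no longer reads back as "x-amz". Nothing about the logic changed, only the size of the data.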
EDIT:
Back in the old days we'd use an ICE (In Circuit Emulator) to watch that memory for change. When the i386 became available, we were able to do that with Soft-ICE. I'm sure modern debuggers can watch memory and break on change.
You need to look at kAmzGCanocLinearPfx, check out where the string value is held in memory (and its length) and watch that range in the debugger. You can either step thru each line while watching it manually or set up the debugger to watch it for you.
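If it helps, a small helper along these lines can report the range to watch. This is only a sketch: WatchRange, watchRange and printWatchRange are hypothetical names. The std::string::_Rep frames in your backtrace suggest the older reference-counted libstdc++ string, which keeps its length in a header just in front of the characters; I'm assuming that layout rather than asserting it, but it would mean the bytes immediately before the character data are worth watching as well.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical record of the memory a debugger watchpoint should cover.
struct WatchRange {
    const void* object;   // address of the std::string object itself
    const void* chars;    // address of the first character
    std::size_t length;   // number of characters currently stored
};

// Collects the addresses without modifying the string.
WatchRange watchRange(const std::string& s) {
    WatchRange r;
    r.object = &s;
    r.chars = s.data();
    r.length = s.size();
    return r;
}

// Prints the range so it can be copied into the debugger.
void printWatchRange(const char* name, const std::string& s) {
    const WatchRange r = watchRange(s);
    std::printf("%s: object at %p, characters at %p, %lu bytes\n",
                name, r.object, r.chars,
                static_cast<unsigned long>(r.length));
}
```

Call printWatchRange at the very start of main, while the value is still good, then set a watchpoint on the reported addresses (gdb's watch command can do this) and run the big-file case until it triggers.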
The program logic executed is exactly the same. The only thing that changes is that it freads the big file instead of the small one.
The program takes no special actions depending on the attributes of the file, it follows exactly the same actions for whatever file you tell it to read.... big, small, txt, exe... it doesn't care... it performs the same actions every single time.
The program only operates on one file at a time, there are no threads, forks or anything fancy. Just a very simple program. It opens the file, calculates the hashes and exits... that's it.
yeah... and that is why I'm here, because I don't understand why. I thought someone might actually be able to enlighten me and give me the second pair of eyes on the problem that I need.
Obviously I'm wasting my time here. Could have re-written the whole thing in another programming language by now !