Goto loop entry :-@

Here is something that has bothered me for ages, and for which I cannot seem to find a very satisfying answer in C++.

I am still playing around with bignums, and as part of the initialization from string I have a simple loop to multiply digits into the number (code simplified to remove stuff not important to this post, like error handling):

1
2
unsigned radix = <whatever the user specifies in 2..36>;
string digits( "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ", radix );
1
2
3
4
5
for (unsigned n = start; n < s.length(); n++)
  {
  *this *= radix;
  *this += digits.find( std::toupper( s[ n ] ) );
  }

This is swell enough, but there is also the sign of the number to handle. Naturally, the sign of the number is somewhere near the beginning of the string. Since I decided to do it brown, the sign can be specified in one of two places: before or after the radix specification. Here is the full syntax for a number:

    [ ['+'|'-'] [ decimal_digits '%' ['+'|'-'] ] digits ]

The reason is that you can say things like "-2%1010" for negative 10, specified in binary. If you have just "-1010" and you know it is binary, all you need is to concatenate it to "2%" and let the biginteger parse the string "2%-1010". (Of course, the user can say things like "-2%+1010", but the specification states that negative signs always win ;-)

Now, the reason this becomes important is because the first time through the loop that multiplication on line 3 is zero times radix, and zero is not allowed to be negative (by specification).


There are a number of ways to solve the problem.

 1. Create an extra boolean variable to remember whether or not the resulting number
    is supposed to be negative.
 
bool am_i_supposed_to_be_negative = <is there a '-' in the right spot>;
1
2
3
4
5
6
7
8
for (unsigned n = start; n < s.length(); n++)
  {
  *this *= radix;
  *this += digits.find( std::toupper( s[ n ] ) );
  }

if (am_i_supposed_to_be_negative)
  negate();


 2. Move the flag into the worst spot possible: inside the loop.
 
if (<there a '-' in the right spot>) negate();
1
2
3
4
5
6
for (unsigned n = start; n < s.length(); n++)
  {
  if (n != start)
    *this *= radix;
  *this += digits.find( std::toupper( s[ n ] ) );
  }


 3. Unroll the first iteration. (My current solution.)
 
if (<there a '-' in the right spot>) negate();
1
2
3
4
5
6
*this += digits.find( std::toupper( s[ n ] ) );
for (unsigned n = start + 1; n < s.length(); n++)
  {
  *this *= radix;
  *this += digits.find( std::toupper( s[ n ] ) );
  }


 4. Just goto the proper entry point. (Gasp!)
 
if (<there a '-' in the right spot>) negate();
1
2
3
4
5
6
7
8
unsigned n = start;
goto entry;
while (n < s.length())
  {
  *this *= radix;
  entry:
  *this += digits.find( std::toupper( s[ n++ ] ) );
  }

I hate flaggy code, and putting stuff into the loop is a mistake, so options 1 and 2 are out. (This also reduces to options that do not perform a multiplication by zero.)

Option 3 is nice, but increases code size. Not too big a deal in this case, and I am unsure whether the compiler can actually optimize it away.

Option 4 seems ideal, because it performs the proper optimization, but at the expense of being on the programmer's code level, which makes code less friendly and potentially disables other useful compiler optimizations in the loop -- a bad tradeoff.


Nevertheless, I have often wondered how to do something just like this in C and C++. My Pascal and Delphi compilers will automatically optimize option 3 into option 4 for me, but I don't think the C++ compiler does. I don't want to second guess the optimizer, so I am sticking with option 3, of course, but I have to wonder, is there a way to force my hand here? Much like the fabled Duff's Device? http://www.faqs.org/docs/jargon/D/Duff%27s-device.html

Sorry for the weirdness, and thank you for your time.
To use Duff's adage, If your code is too slow, you must make it faster. If no better algorithm is available, you must trim cycles.

The question is, is opt 3 too slow and is opt 4 faster? If not, I'd stick with opt 3.
1
2
3
4
5
6
7
unsigned n = start;
while (n < s.length())
  {
  *this += digits.find( std::toupper( s[ n++ ] ) );
  if (n == s.length()-1) break;
  *this *= radix;
  }


Although that isn't the most elegant solution.

Option 3 seems reasonable and I see no reason to switch it, optimizing it at such levels does not really matter much. (As I have been told many times.)
@kbw
Yes, for speed optimization (which I want) option 3 and 4 are identical. Hence my current choice of option 3. The advantage of option 4 is that it also has space optimization.

@Kyon
That is a variation on option 2, alas.
Last edited on
The advantage of option 4 is that it also has space optimization.
I took a look at the assembler generated by Visual Studio 2008 /O2 (Speed Optimisation). Opt4 was much smaller, I imagined the compiler doing a better job, only in my dream it seems. I may have to pay more attention to what this compiler generates in future.
Yes, it is a high-level loop optimization. Like I said, the BP compilers handle it nicely (Pascal code is highly optimized), but C and C++ compilers all vary -- the GCC included -- and usually are somewhat lacking. Alas.
Topic archived. No new replies allowed.