ADHD Summary:
Score: Float 489 ms vs Uint32 495 ms
TASK:
Convert 3X uint16_t raw RGB data to uint16_t HSV for analyses. Scale hue from 0-360 degrees to 0 to 65535. Scale saturation/value from 0-1 to 0-65535. And, Step On It!
Standard image size is 36 MPix with uncompressed TIF file size ~217MB. All data in Uint16_t arrays
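For the saturation and value channels the scaling can stay entirely in integers. The sketch below shows one possible way to do it and is not the code being timed: `max3`, `min3`, `val_u16` and `sat_u16` are hypothetical helper names, and the round-to-nearest division is an illustrative choice.

```c
#include <stdint.h>

/* Largest of three 16-bit channels, widened to 32 bits. */
static uint32_t max3(uint16_t r, uint16_t g, uint16_t b)
{
    uint32_t m = (r > g) ? r : g;
    return (m > b) ? m : b;
}

/* Smallest of three 16-bit channels, widened to 32 bits. */
static uint32_t min3(uint16_t r, uint16_t g, uint16_t b)
{
    uint32_t m = (r < g) ? r : g;
    return (m < b) ? m : b;
}

/* Value: the max channel is already on the 0..65535 scale. */
static uint16_t val_u16(uint16_t r, uint16_t g, uint16_t b)
{
    return (uint16_t)max3(r, g, b);
}

/* Saturation: (max - min) / max scaled to 0..65535, rounded to nearest.
   The product is at most 65535 * 65535, which still fits in uint32_t. */
static uint16_t sat_u16(uint16_t r, uint16_t g, uint16_t b)
{
    uint32_t max = max3(r, g, b);
    uint32_t min = min3(r, g, b);
    if (max == 0U)
        return 0U;               /* black: saturation taken as 0 */
    return (uint16_t)(((max - min) * 65535U + max / 2U) / max);
}
```

Multiplying before dividing keeps the full 16 bits of precision, and the `max / 2U` term turns the truncating division into a rounding one.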
Sample Greenish formula from
https://www.cs.rit.edu/~ncs/color/t_convert.html <see full text below>
delta = max - min;
h = 60 * (2 + ( b - r ) / delta);
// Results of GREEN zone calc land between 60 deg YELLOW to 180 deg AQUA (CYAN). w/120 deg -> Green
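As a worked illustration of the green-zone formula, the following sketch computes the hue in degrees and then maps it onto the 0 to 65535 range. The function names are hypothetical, and adding 0.5f before truncating is an assumed rounding choice rather than the exact code that was timed.

```c
#include <assert.h>
#include <stdint.h>

/* Hue in degrees for a pixel whose largest channel is green.
   Assumes g is the max and g > min(r, b), so delta is non-zero. */
static float green_hue_degrees(uint16_t r, uint16_t g, uint16_t b)
{
    uint16_t min = (r < b) ? r : b;
    assert(g >= r && g >= b && g > min);
    float delta = (float)(g - min);
    return 60.0f * (2.0f + ((float)b - (float)r) / delta);
}

/* Map 0..360 degrees onto 0..65535, rounding to nearest.
   A result of 65536 (i.e. 360 degrees) wraps to 0 in the final cast. */
static uint16_t hue_degrees_to_u16(float deg)
{
    return (uint16_t)(uint32_t)(deg * (65536.0f / 360.0f) + 0.5f);
}
```

With this mapping, pure yellow (r == g, b == 0) gives 60 degrees, pure green gives 120 degrees, and cyan (g == b, r == 0) gives 180 degrees, matching the range stated above.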
DESIGN:
The original code was 100% integer. The uint16 subtractions were done first on 16-bit values, and the results were then widened to uint32 before bit-shifting to scale. The division was done last, after scaling in uint32, to limit integer truncation. The sextant (a 60 degree sector, 6 per circle) was determined up front, both to avoid negative subtraction results and because max/min are needed anyway.
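The sextant selection can be sketched as follows. This is an assumption about how the six paths might be laid out, not the actual program: `hue_u16` is a hypothetical name, the branch order is illustrative, and the base constants 21845, 43691 and 65536 are rounded multiples of 65536/3. Sorting the channels into max, mid and min first means every subtraction is non-negative.

```c
#include <stdint.h>

/* Integer hue on a 0..65535 scale. Each branch is one sextant;
   `up` says whether the offset is added to or subtracted from base. */
static uint16_t hue_u16(uint16_t r, uint16_t g, uint16_t b)
{
    uint32_t max, mid, min, base;
    int up;

    if (r >= g) {
        if (g >= b)      { max = r; mid = g; min = b; base = 0U;     up = 1; } /* r >= g >= b */
        else if (r >= b) { max = r; mid = b; min = g; base = 65536U; up = 0; } /* r >= b >  g */
        else             { max = b; mid = r; min = g; base = 43691U; up = 1; } /* b >  r >= g */
    } else {
        if (r >= b)      { max = g; mid = r; min = b; base = 21845U; up = 0; } /* g >  r >= b */
        else if (g >= b) { max = g; mid = b; min = r; base = 21845U; up = 1; } /* g >= b >  r */
        else             { max = b; mid = g; min = r; base = 43691U; up = 0; } /* b >  g >  r */
    }
    if (max == min)
        return 0U;                       /* grey: hue undefined, report 0 */

    /* Multiply first, divide last; 10922 * 65535 fits in uint32_t. */
    uint32_t off = 10922U * (mid - min) / (max - min);

    /* 65536 - 0 wraps to 0 in the cast, which is the same hue as 360 degrees. */
    return (uint16_t)(up ? base + off : base - off);
}
```

Because the per-sextant scale is truncated to 10922, results at sextant boundaries can sit 1 below the float value (yellow gives 10922 here rather than 10923).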
A float version was written to validate the results. There were some differences, but a check in code confirmed they were never more than 1.0f, the max float->int roundoff error.
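A check of this kind might look like the sketch below, which compares an integer and a float form of one sextant (g > r >= b) over a coarse grid. The helper names and the grid are hypothetical, and adding 0.5f before truncating stands in for roundf() so the example needs no math library.

```c
#include <stdint.h>
#include <stdlib.h>

/* Integer form for the sextant g > r >= b, using the 10922U scale. */
static uint16_t hue_int(uint32_t r, uint32_t g, uint32_t b)
{
    return (uint16_t)(21845U - 10922U * (r - b) / (g - b));
}

/* Float form of the same sextant. Adding 0.5f and truncating matches
   round-to-nearest here because the value is always positive. */
static uint16_t hue_float(uint32_t r, uint32_t g, uint32_t b)
{
    float ratio = (float)(r - b) / (float)(g - b);
    return (uint16_t)((2.0f - ratio) * 10922.667f + 0.5f);
}

/* Largest absolute difference between the two forms over a coarse grid
   of pixels with g > b and b <= r <= g. */
static int max_hue_diff(uint32_t step)
{
    int worst = 0;
    if (step == 0U)
        step = 1U;                       /* avoid an endless loop */
    for (uint32_t b = 0U; b < 65536U; b += step)
        for (uint32_t g = b + 1U; g < 65536U; g += step)
            for (uint32_t r = b; r <= g; r += step) {
                int d = abs((int)hue_int(r, g, b) - (int)hue_float(r, g, b));
                if (d > worst)
                    worst = d;
            }
    return worst;
}
```

Calling `max_hue_diff(997U)` scans roughly a hundred thousand pixels; a return value above 1 would indicate a disagreement larger than roundoff.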
When the INT version was commented out and the FLOAT version alone was timed, the results were fascinating.
RESULTS:
Using Uint32 was slower than converting to FLOAT and back again.
I ran 4 different tests, 10 runs each:
A) Uint32, B) Float, C) Double and D) NULL (just assigned 0, no calc at all)
UINT32 -> Average time = 494.7 ms (with scale=65536/3 crunched to 10922U similar to float)
ush[huec] = (uint16_t)(21845U - 10922U*(usr[REDC]-usr[BLUEC])/ ((uint32_t)(usr[GREENC]-usr[BLUEC])) );
Ave=0.4947, sum=4.9471, count=10, SDev=0.0075
FLOAT -> Average time = 489.3 ms
ush[huec] = (uint16_t)(roundf((2.0f-( (usr[REDC]-usr[BLUEC])/((float)(usr[GREENC]-usr[BLUEC]))))*10922.667f));
Ave=0.4893, sum=4.8931, count=10, SDev=0.0069
DOUBLE -> Average time = 521.8 ms
ush[huec] = (uint16_t)(round((2.0-( (usr[REDC]-usr[BLUEC])/ ((double)(usr[GREENC]-usr[BLUEC]))))*10922.667));
Ave=0.5218, sum=5.2176, count=10, SDev=0.0107
NULL -> Average time = 449.6 ms
ush[huec]=(uint16_t) 0; continue; // A cast + assignment, but no calc
Ave=0.4496, sum=4.4955, count=10, SDev=0.0079
UINT32 -> Average time = 497.7 ms (Older version with a scale factor of 64k/3, one times and one gozinta.)
Employs "mega-fast" bit-shift trick, vastly accelerating multiplication scaling.
ush[huec]=(uint16_t)((65536U-((((uint32_t)usr[REDC]-usr[BLUEC])<<15U)/((uint32_t)(usr[GREENC]-usr[BLUEC]))))/3U);
Ave=0.4977, sum=4.9769, count=10, SDev=0.0077
These results are from changing 1 program line, recompiling, and converting 1 raw to HSV 10 times in a loop, reading from a fast SSD and writing to an even faster one to minimize disk overhead. The change affects only one of 6 code paths (selected by sextant), but that path handles ~48% of pixels.
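The run-10-times loop can be sketched as below. This is only an assumed harness, not the one that produced the numbers above: `time_runs` and the placeholder workload `dummy_convert` are hypothetical, and the standard `clock()` function reports processor time, so it may not count disk waits the way a wall-clock timer would.

```c
#include <time.h>

/* Placeholder workload (hypothetical); a real run would convert one image. */
static int g_calls = 0;
static void dummy_convert(void)
{
    g_calls++;
}

/* Call fn() `runs` times and return the average seconds per call as
   measured by clock(). The total is also stored in *sum if sum is set. */
static double time_runs(void (*fn)(void), int runs, double *sum)
{
    double total = 0.0;
    for (int i = 0; i < runs; i++) {
        clock_t start = clock();
        fn();
        total += (double)(clock() - start) / CLOCKS_PER_SEC;
    }
    if (sum)
        *sum = total;
    return (runs > 0) ? total / runs : 0.0;
}
```

A call such as `time_runs(dummy_convert, 10, &sum)` would give the Ave and sum figures in the same form as the result lines above.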
Hex dumps were also made of both the raw and the hsv data: the first 10 pixels plus 10 more spread evenly over the image. Differences in the least significant digit were seen, but never greater than 1.0f (other than in the NULL runs).
Is there a better way to organize the integer math to expedite the process?
TYVM,
B
&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&
Minutia:
gcc version 4.9.2 (x86_64-posix-sjlj, built by strawberryperl.com project)
gcc -O4 -ffast-math -m64 -Ofast -march=corei7-avx -mtune=corei7-avx c:/bin/raw2hsv.c -o c:/bin/raw2hsv.a.exe
windoz7/64, 32GB DDR3 @~1600 MHz, 2700K @ 3.4GHz
==========================================================================
SAMPLE CODE:
from
https://www.cs.rit.edu/~ncs/color/t_convert.html
// r,g,b values are from 0 to 1
// h = [0,360], s = [0,1], v = [0,1]
// if s == 0, then h = -1 (undefined)
void RGBtoHSV( float r, float g, float b, float *h, float *s, float *v )
{
    float min, max, delta;
    min = MIN( r, g, b );
    max = MAX( r, g, b );
    *v = max; // v
    delta = max - min;
    if( max != 0 )
        *s = delta / max; // s
    else {
        // r = g = b = 0 // s = 0, v is undefined
        *s = 0;
        *h = -1;
        return;
    }
    if( r == max )
        *h = ( g - b ) / delta; // between yellow & magenta
    else if( g == max )
        *h = 2 + ( b - r ) / delta; // between cyan & yellow
    else
        *h = 4 + ( r - g ) / delta; // between magenta & cyan
    *h *= 60; // degrees
    if( *h < 0 )
        *h += 360;
}