LZMA SDK and Optimization Issues

closed account (43RGz8AR)
I've found myself in need of a good compression library, and LZMA fits the bill. I'm using the MinGW64 compiler, set to compile just the C core files of the LZMA SDK with the options "-s", "-O3", "-Wall", plus a few per-CPU-target flags (I do multiple builds). I'm coding for multiple platforms: OS X (Intel target), *nix (Intel and AMD targets), and Windows (Intel and AMD targets). Before you ask why the optimization matters: this is for a real-time render engine (a.k.a. a game engine). I've managed to fix all the problems of working the LZMA SDK's C files into a C++ project, except this one warning, which I ignored at first, until the test run.
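For reference, each core .c file gets built with roughly a command like the one below. Treat it as a sketch of the setup rather than my exact build line: the per-target CPU flags are left out, "gcc" stands in for whatever the MinGW64 toolchain calls its driver, and only the three files from the warnings are listed.

gcc -s -O3 -Wall -c XzDec.c XzEnc.c XzIn.c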

X:\code\LZ\XzDec.c||In function 'XzUnpacker_Code':|
X:\code\LZ\XzDec.c|822|warning: dereferencing type-punned pointer will break strict-aliasing rules [-Wstrict-aliasing]|
X:\code\LZ\XzEnc.c||In function 'Xz_WriteFooter':|
X:\code\LZ\XzEnc.c|120|warning: dereferencing type-punned pointer will break strict-aliasing rules [-Wstrict-aliasing]|
X:\code\LZ\XzEnc.c|131|warning: dereferencing type-punned pointer will break strict-aliasing rules [-Wstrict-aliasing]|
X:\code\LZ\XzIn.c||In function 'Xz_ReadBackward':|
X:\code\LZ\XzIn.c|216|warning: dereferencing type-punned pointer will break strict-aliasing rules [-Wstrict-aliasing]|
||=== Build finished: 0 errors, 4 warnings ===|

After seeing that this does cause problems, I looked up on the internet what the warning means for the code. I found that I could get rid of it by removing all the optimization flags. Before searching, though, I had already found the exact source of the warning: these macros in the file "CpuArch.h":

#ifdef MY_CPU_LE_UNALIGN

#define GetUi16(p) (*(const UInt16 *)(p))
#define GetUi32(p) (*(const UInt32 *)(p))
#define GetUi64(p) (*(const UInt64 *)(p))
#define SetUi16(p, d) *(UInt16 *)(p) = (d);
#define SetUi32(p, d) *(UInt32 *)(p) = (d);
#define SetUi64(p, d) *(UInt64 *)(p) = (d);

#else 

The lines the warnings actually point to are:

/// file: "XzDec.c"
if ( CRC_GET_DIGEST ( p->crc ) != GetUi32 ( p->buf ) )
	return SZ_ERROR_CRC;

/// file: "XzEnc.c"
SetUi32 ( buf, CRC_GET_DIGEST ( crc ) );

/// file: "XzEnc.c"
SetUi32 ( buf, CrcCalc ( buf + 4, 6 ) );

/// file: "XzIn.c"
if ( GetUi32 ( buf ) != CrcCalc ( buf + 4, 6 ) )
	return SZ_ERROR_ARCHIVE;


I understand exactly what they do, but I don't understand why this is causing a problem or how to fix it.

Thanks, Xenouis.
What's in the #else block?
closed account (43RGz8AR)
#else

#define GetUi16(p) (((const Byte *)(p))[0] | ((UInt16)((const Byte *)(p))[1] << 8))

#define GetUi32(p) ( \
             ((const Byte *)(p))[0]        | \
    ((UInt32)((const Byte *)(p))[1] <<  8) | \
    ((UInt32)((const Byte *)(p))[2] << 16) | \
    ((UInt32)((const Byte *)(p))[3] << 24))

#define GetUi64(p) (GetUi32(p) | ((UInt64)GetUi32(((const Byte *)(p)) + 4) << 32))

#define SetUi16(p, d) { UInt32 _x_ = (d); \
    ((Byte *)(p))[0] = (Byte)_x_; \
    ((Byte *)(p))[1] = (Byte)(_x_ >> 8); }

#define SetUi32(p, d) { UInt32 _x_ = (d); \
    ((Byte *)(p))[0] = (Byte)_x_; \
    ((Byte *)(p))[1] = (Byte)(_x_ >> 8); \
    ((Byte *)(p))[2] = (Byte)(_x_ >> 16); \
    ((Byte *)(p))[3] = (Byte)(_x_ >> 24); }

#define SetUi64(p, d) { UInt64 _x64_ = (d); \
    SetUi32(p, (UInt32)_x64_); \
    SetUi32(((Byte *)(p)) + 4, (UInt32)(_x64_ >> 32)); }

#endif 

There's the same sort of thing with byte swapping for big-endian CPUs, but my compiler never touches any of that because all my targets are little-endian. Here is where MY_CPU_LE_UNALIGN gets defined:

#if defined(MY_CPU_X86_OR_AMD64)
#define MY_CPU_LE_UNALIGN
#endif 


I've fixed the issue by simply turning all the macros into functions like so:

UInt16 GetUi16 ( const void* p ) {
	return ( * ( const UInt16* ) ( p ) );
}
UInt32 GetUi32 ( const void* p ) {
	return ( * ( const UInt32* ) ( p ) );
}
UInt64 GetUi64 ( const void* p ) {
	return ( * ( const UInt64* ) ( p ) );
}
void SetUi16 ( void* p, UInt16 v ) {
	* ( UInt16* ) ( p ) = ( v );
	return;
}
void SetUi32 ( void* p, UInt32 v ) {
	* ( UInt32* ) ( p ) = ( v );
	return;
}
void SetUi64 ( void* p, UInt64 v ) {
	* ( UInt64* ) ( p ) = ( v );
	return;
}

I still have no idea why it doesn't work the other way though.
I'm guessing it's because the explicit conversion through (void *) hides the type punning from the compiler, so it can't apply (or warn about) aliasing-based optimizations there.
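If you want to keep the single-instruction loads/stores without the warning, another well-known option (just a sketch, not code from the SDK; the _safe names are placeholders) is to go through memcpy. A fixed 4-byte memcpy is normally folded into a single unaligned mov on x86, and copying the bytes this way doesn't break the aliasing rules:

#include <string.h>   /* memcpy */
#include "Types.h"    /* the SDK's UInt32 typedef (header name may differ between SDK versions) */

static UInt32 GetUi32_safe(const void *p)
{
    UInt32 v;
    memcpy(&v, p, sizeof(v));  /* copy raw bytes, no type-punned dereference */
    return v;
}

static void SetUi32_safe(void *p, UInt32 v)
{
    memcpy(p, &v, sizeof(v));
}

Like the original macros, these read and write in host byte order, so they behave like the MY_CPU_LE_UNALIGN branch, which is fine here since all your targets are little-endian.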

Another thing you could do is manually code the deserialization:
template <typename T>
T deserialize(const void *p) {
    const unsigned char *buffer=(const unsigned char *)p;
    T r=0;
    for (unsigned a=0;a<sizeof(r);a++)
        r|=(T)buffer[a]<<8*a;   //assemble the value little-endian, byte by byte
    return r;
}

Or, if you're working exclusively with C,
#define DEFINE_DESERIALIZE(type)                          \
type deserialize_##type(const void *p) {                  \
    const unsigned char *buffer=(const unsigned char *)p; \
    type r=0;                                             \
    unsigned a;                                           \
    for (a=0;a<sizeof(r);a++)                             \
        r|=(type)buffer[a]<<8*a;                          \
    return r;                                             \
}

DEFINE_DESERIALIZE(UInt16)
DEFINE_DESERIALIZE(UInt32)
DEFINE_DESERIALIZE(UInt64)
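
For what it's worth, at the XzIn.c call site quoted above that would look something like this (illustrative only; the SetUi32 call sites in XzEnc.c would need matching serialize_* helpers):

if (deserialize_UInt32(buf) != CrcCalc(buf + 4, 6))
    return SZ_ERROR_ARCHIVE;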
closed account (43RGz8AR)
Thank you, and you're right!

Since this is from within a C library, I'll use something like the latter to keep things intact. I might also email whoever maintains the original and point this out, so there will be a fix for other people as well.