Textual representation of mt19937

Some time ago I found what I thought was a bug in GCC. I reported it but never got any response.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60441

Looking at the standard, I still cannot see how the GCC implementation could be correct.

If you run this program in VC++, or another non-GCC compiler, what is the output?

#include <cstdint>
#include <iostream>
#include <random>
#include <sstream>

int main()
{
	std::mt19937 r;
	std::stringstream ss;
	ss << r;
	int valueCount = 0;
	std::uint32_t val;
	while (ss >> val)
	{
		++valueCount;
	}
	std::cout << r.state_size << std::endl;
	std::cout << valueCount << std::endl;
}
GCC output:
	624
	625
Expected output:
	624
	624

If I'm correct about this it means there will most likely be a compatibility issue between compilers. It's important that these things are implemented correctly so that you can save the state in a program compiled with one compiler and later restore the state in a program that was compiled with a different compiler.
g++ writes an extra number 624 (state_size) right at the end.
So, if we append more information to the same stream into which the state of the twister was saved, there would be portability problems between g++ and other implementations.

The workaround is to use a separate file into which nothing more is written after the twister state is saved.

#include <iostream>
#include <random>
#include <fstream>
#include <string>

int main( int argc, char* argv[] )
{
    using namespace std::literals ;

    if( argc == 3 )
    {
        if( argv[1] == "write"s )
        {
            std::mt19937 twister(100) ;
            std::ofstream( argv[2] ) << twister ;
            
            for( int i = 0 ; i < 10 ; ++i ) std::cout << twister() << ' ' ;
            std::cout << '\n' ;
        }

        else if( argv[1] == "read"s )
        {
            std::mt19937 twister ;
            std::ifstream( argv[2] ) >> twister ;
            
            for( int i = 0 ; i < 10 ; ++i ) std::cout << twister() << ' ' ;
            std::cout << '\n' ;
        }
    }
}

echo g++ write && g++ -std=c++14 -O3 -Wall -Wextra -pedantic-errors main.cpp -lsupc++ && ./a.out write twister.txt
echo clang++ read && clang++ -std=c++14 -stdlib=libc++ -O3 -Wall -Wextra -pedantic-errors main.cpp -lsupc++ && ./a.out read twister.txt
echo g++ read && g++ -std=c++14 -O3 -Wall -Wextra -pedantic-errors main.cpp -lsupc++ && ./a.out read twister.txt
echo ---------------
echo clang++ write && clang++ -std=c++14 -stdlib=libc++ -O3 -Wall -Wextra -pedantic-errors main.cpp -lsupc++ && ./a.out write twister.txt
echo g++ read && g++ -std=c++14 -O3 -Wall -Wextra -pedantic-errors main.cpp -lsupc++ && ./a.out read twister.txt
echo clang++ read && clang++ -std=c++14 -stdlib=libc++ -O3 -Wall -Wextra -pedantic-errors main.cpp -lsupc++ && ./a.out read twister.txt

g++ write
2333906440 2882591512 1195587395 1769725799 1823289175 2260795471 3628285872 638252938 20267358 673068980 
clang++ read
2333906440 2882591512 1195587395 1769725799 1823289175 2260795471 3628285872 638252938 20267358 673068980 
g++ read
2333906440 2882591512 1195587395 1769725799 1823289175 2260795471 3628285872 638252938 20267358 673068980 
---------------
clang++ write
2333906440 2882591512 1195587395 1769725799 1823289175 2260795471 3628285872 638252938 20267358 673068980 
g++ read
2333906440 2882591512 1195587395 1769725799 1823289175 2260795471 3628285872 638252938 20267358 673068980 
clang++ read
2333906440 2882591512 1195587395 1769725799 1823289175 2260795471 3628285872 638252938 20267358 673068980 

http://coliru.stacked-crooked.com/a/17084d32270d43e5
It works, but if the state size is missing it puts the stream in a failed state.
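Since a read can leave the stream in a failed state, it may be worth checking the stream after `>>` and only replacing the engine when the extraction succeeded. The following is a minimal sketch, not a definitive implementation; `try_restore` and `to_text` are hypothetical helper names introduced only for illustration.

```cpp
#include <istream>
#include <random>
#include <sstream>
#include <string>

// Hypothetical helper: extract into a scratch engine first, so that `engine`
// is only modified when the whole extraction succeeded.
bool try_restore(std::istream& in, std::mt19937& engine)
{
    std::mt19937 scratch;
    if (!(in >> scratch))
        return false;       // failbit is set; `engine` is left untouched
    engine = scratch;
    return true;
}

// Hypothetical helper: the textual state as written by the implementation
// this program was compiled with.
std::string to_text(const std::mt19937& engine)
{
    std::ostringstream out;
    out << engine;
    return out.str();
}
```

With this, the caller can fall back to reseeding or report an error when `try_restore` returns false, instead of silently continuing with a stream in a failed state.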

I can now see that it's probably not a bug. The standard only says what the textual representation should consist of, but it doesn't mean it can't contain more, right?

§26.5.3.2/5
The textual representation of x_i consists of the values of X_{i−n}, ..., X_{i−1}, in that order.

So the sad truth is that >> and << will not work the same on all implementations. I guess the only way to make sure it works on all compilers is to pay attention and put in special handling for the compilers that need it.
As I see it, g++ conforms to the letter, but not the spirit, of the standard.

The IS specifies:
With os.fmtflags set to ios_base::dec|ios_base::left and the fill character set to the space character, writes to os the textual representation of x’s current state. In the output, adjacent numbers are separated by one or more space characters.

So, without special casing, we could write:
ostm << twister << ' ' << twister.state_size << " *** end ***\n" ;
(and then we can use ignore() before we read any additional information from the input stream.)
http://coliru.stacked-crooked.com/a/1bc03ffb795a7ecc
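A rough sketch of how the write and the matching read with ignore() could look is shown here. The function names `save_record` and `load_record` and the one-line `extra` payload are assumptions made for illustration. The sketch only demonstrates a round trip within a single implementation; the cross-compiler behaviour is as described in this thread and is not verified by the code itself.

```cpp
#include <istream>
#include <limits>
#include <ostream>
#include <random>
#include <sstream>
#include <string>

// Hypothetical helper: write the engine state, then state_size and an end
// marker on the same line, then one line of additional data.
void save_record(std::ostream& out, const std::mt19937& twister, const std::string& extra)
{
    out << twister << ' ' << twister.state_size << " *** end ***\n";
    out << extra << '\n';
}

// Hypothetical helper: read the engine state, skip whatever is left on the
// state line (trailing number and marker), then read the additional data.
bool load_record(std::istream& in, std::mt19937& twister, std::string& extra)
{
    std::mt19937 scratch;
    if (!(in >> scratch))
        return false;
    in.ignore(std::numeric_limits<std::streamsize>::max(), '\n');
    if (!std::getline(in, extra))
        return false;
    twister = scratch;
    return true;
}
```

The ignore() call is what makes the reader indifferent to how many trailing numbers were consumed as part of the engine state: everything up to the newline is discarded either way.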
Yeah, that will work.