boost bitset to unsigned integers

I am using boost dynamic bites for the first time and I am kind of stuck with something. There is a variable boost::dynamic_bitset<> bs(200) in my program and that has been set. I am now trying to convert a block of bits into unsigned integers. For the bitset with 200 bits, I would like to convert first 64 bits into an unsigned integer (uint64_t). Again I want to divide into blocks of 64 + 64 + 64 + 8 and get the integer for each block. I know boost::to_block_range will be helpful, but not sure how to use that for the above case.
Use boost::dynamic_bitset<std::uint64_t> (dynamic bitset with 64 bits_per_block),
and boost::to_block_range would yield the numbers.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
#include <iostream>
#include <boost/dynamic_bitset.hpp>
#include <cstdint>
#include <vector>
#include <iterator>

std::vector< std::uint64_t > split_uint64_t( const boost::dynamic_bitset<std::uint64_t>& bits )
{
    std::vector< std::uint64_t > result ;
    boost::to_block_range( bits, std::back_inserter(result) ) ;
    return result ;
}

int main()
{
    boost::dynamic_bitset<std::uint64_t> bits ;

    bits.resize(200) ;
    bits[0] = true ; // 1 (bits 0-63)
    bits[64] = bits[65] = true ; // 3 (bits 64-127)
    bits[128] = bits[129] = bits[130] = true ; // 7 (bits 128-191)
    bits[192] = bits[193] = bits[194] = bits[195] = true ; // 15 (bits 192-199)

    for( std::uint64_t n : split_uint64_t(bits) ) std::cout << n << '\n' ;
}

http://coliru.stacked-crooked.com/a/a64140f1737b2366
@JLBorges Thanks !! That was quite helpful.

Actually, I was trying to find the number of mismatched bits between two bitset variables like this (faster way)

1
2
3
boost::dynamic_bitset<> p(200); 
boost::dynamic_bitset<> q(200); 
int c = (p^q).count();


I think the following is faster than the above piece of code, but I need uint64_t for using __builtin_popcountll

1
2
3
4
5
#include <stdint.h>
int hammingDistance (uint64_t x, uint64_t y) {
        uint64_t res = x ^ y;
        return __builtin_popcountll (res);
}


So I was wondering if converting the bitsets into uint64_t blocks and then applying above will make it faster. Any suggestions?
Last edited on
If the number of bits is known to be a constant (say 200), std::bitset<200> should give the best performance. One would expect that the implementation would use the popcntq instruction for 64-bit segments (on a 64-bit intel architecture).

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
std::size_t cnt_mismatch_std( const std::bitset<200>& a, const std::bitset<200>& b )
{
    return (a^b).count() ;

    /*
        # clang++, libc++, -O3, -march=native
        _Z16cnt_mismatch_stdRKNSt3__16bitsetILm200EEES3_: 

        # (a^b)
        movq    (%rsi), %rax
        movq    8(%rsi), %rcx
        xorq    (%rdi), %rax
        xorq    8(%rdi), %rcx
        movq    16(%rsi), %rdx
        xorq    16(%rdi), %rdx
        movl    24(%rsi), %esi
        xorl    24(%rdi), %esi
        
        # count() 
        popcntq %rax, %rax 
        popcntq %rcx, %rcx 
        addq    %rax, %rcx
        popcntq %rdx, %rdx 
        addq    %rcx, %rdx
        movzbl  %sil, %eax
        popcntq %rax, %rax
        addq    %rdx, %rax
        retq
    */
}

http://coliru.stacked-crooked.com/a/89a38bf69373ed6a

For (p^q).count(), boost::dynamic_bitset<> would be measurably slower because the number of bits is not known at compile-time, and the creation of a temporary bitset (p^q) is much harder to optimise away; typically dynamic memory allocation would be required for the temporary bitset.

If a dynamic bitset must be used, I suspect that (p^q).count() may be slightly faster than boost::to_block_range and then __builtin_popcountll on the copied 64-bit blocks. That is merely a surmise; measure the performance on the actual implementation that will be used to run the code.

Thanks again @JLBorges !!

The number of bits is not constant, so I had to use dynamic::bitset<>. Actually, the original problem was to find the mismatched characters between one string and all other substrings of another strings. For example, P = "abcc" and Q = "abcdabcdabcd", I have to find number of mismatched characters between P and Q[i, j] where i = 0 to |Q|-4 and j = 4 to |Q|. The string has only chars 'a', 'b', 'c', and 'd'.The brute force approach (comparing char by char) was not fast enough for large strings (length of 300K). So I converted the strings into bitsets and doing the bitwise operations to make it faster.
Topic archived. No new replies allowed.