Which Sieve of Eratosthenes is Better?

May 11, 2017 at 12:36pm
I was implementing a Sieve of Eratosthenes for Project Euler, and I initially made the following:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
std::vector<int> primes_upto(int limit)
{
    std::vector<bool> primes_bool_array (limit, true);
    std::vector <int> results;
    for (int i = 2; i <= std::sqrt(limit); ++i)
    {
        if (primes_bool_array[i])
        {
            for (int j = 2; (i * j) <= limit; ++j)
            {
                primes_bool_array[i * j] = false;
            }
        }
    }
    for (int i = 2; i < primes_bool_array.size(); ++i)
    {
        if (primes_bool_array[i])
        {
            results.push_back(i);
        }
    }
    return results;
}


Then, I updated it to this:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
std::vector<int> primes_upto(int limit)
{
    std::vector<bool> primes_bool_array (limit, true);
    std::vector <int> results;
    for (int i = 2; i <= std::sqrt(limit); ++i)
    {
        if (primes_bool_array[i])
        {
            for (int j = (2 * i); j <= limit; j += i)
            {
                primes_bool_array[j] = false;
            }
        }
    }
    for (int i = 2; i < primes_bool_array.size(); ++i)
    {
        if (primes_bool_array[i])
        {
            results.push_back(i);
        }
    }
    return results;
}

Which one of these do you feel is better or more efficient? Do you also have any recommendations?
Last edited on May 11, 2017 at 12:37pm
May 11, 2017 at 2:39pm
Im sorry for doing this, but *bump*
May 11, 2017 at 2:54pm
In theory the second version is faster because of simpified/removed multiplication. A multiplication is considered to be slower than an addition.

Normally you should evaluate std::sqrt(limit) once before the loop otherwise that costly expression is evaluated for each i. But it is possible that the optimizer realizes that limit is not changed and thus does it for you...
May 11, 2017 at 4:03pm
Both versions are broken.

The last valid position in std::vector<bool> primes_bool_array (limit, true); is limit-1;
the loop condition is <= limit.

Once that is fixed, consider this: limit is the square of a prime number and std::sqrt(limit) does not return the precise integer value of its square root.
May 13, 2017 at 6:52pm
So should I use std::floor? I don't understand your assertion that it has to be limit - 1
May 13, 2017 at 7:24pm
> So should I use std::floor?

A simple, reasonably inexpensive fix is to just add 1 to the square root narrowed to an integer.


> I don't understand your assertion that it has to be limit - 1

The size of std::vector<bool> primes_bool_array (limit, true); is limit
Valid positions in the vector are in the interval [ 0, limit-1 ]

May 13, 2017 at 7:47pm
std::sqrt(limit) does not return the precise integer value of its square root


1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
#include <iostream>
#include <cmath>
#include <cstdlib>
using namespace std;

int main()
{
   unsigned long long i, isq;
   for ( i = 1; i <= 1000000000; i++ )
   {
      isq = i * i;
      if ( i - sqrt( isq ) != 0.0 ) 
      {
         cout << "First failure at i = " << i << "    Difference is " << i - sqrt( isq ) << endl;
         exit( 0 );
      }
   }
}


FWIW, with my compiler (g++) this produces
First failure at i = 94906267    Difference is 5.26779e-009

However, I bet this is implementation-dependent.
May 13, 2017 at 8:50pm
Ok, so I did what you said, but when I enter a number that is a prime for the parameter of the function, that number does not come as a result. Here is my updated code:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
#include <iostream>
#include <vector>
#include <cmath>

std::vector<int> primes_upto(int limit)
{
    std::vector<bool> primes_bool_array (limit, true);
    std::vector <int> results;
    primes_bool_array[0] = primes_bool_array[1] = false;
    int floor_sqrt = std::sqrt(limit) + 1;
    for (int i = 0; i < floor_sqrt; ++i)
    {
        if (primes_bool_array[i])
        {
            for (int j = (2 * i);j < limit; j += i)
            {
                primes_bool_array[j] = false;
            }
        }
    }
    for (int i = 0; i < primes_bool_array.size(); ++i)
    {
        if (primes_bool_array[i])
        {
            results.push_back(i);
        }
    }
    return results;
}



int main()
{
    std::vector<int> vi = primes_upto(61);
    for (int i = 0; i < vi.size() - 1; ++i)
    {
        std::cout << vi[i] << ", ";
    }
    std::cout << vi[vi.size() - 1];
    return 0;
}   

This doesn't print 61
Last edited on May 13, 2017 at 10:02pm
May 14, 2017 at 12:56am
it wouldn't. Its up to 61. Put 62 in to see 61, or make it <=

The sieve isn't as efficient as GCD on the product of all known primes. Its use comes in when you get past the known prime realm, or when the product is too big, both of which hit way out past 61 for sure :)

Last edited on May 14, 2017 at 12:57am
May 14, 2017 at 1:59am
> when I enter a number that is a prime for the parameter of the function, that number does not come as a result.

If limit is to be an inclusive upper bound:

1
2
// std::vector<bool> primes_bool_array (limit, true);
std::vector<bool> primes_bool_array ( limit+1, true );

and then put the loop condition back to <= limit


> with my compiler (g++) this produces
> First failure at i = 94906267 Difference is 5.26779e-009
> However, I bet this is implementation-dependent.

If the floating point environment conforms to he IEEE specification (most do), then the error in the result of std::sqrt() is guaranteed to be less than half ulp.
Topic archived. No new replies allowed.