[code review] finding prime numbers

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
// finds primes in [lower_bound, upper_bound] closed interval

#include <iostream>
#include <vector>
#include <limits>

constexpr int lower_limit = { 2 };
constexpr int upper_limit = { std::numeric_limits<int>::max() };
// currently no upper-limit

bool is_divisible(const int dividend, const int divisor) {
// checks if divisor divides dividend evenly
    return dividend % divisor == 0;
}

bool is_prime(const int candidate, const std::vector<int>& primes) {
// checks whether the candidate is prime
// primes: contains all primes lesser than candidate
    for (const int prime : primes) {
        if (is_divisible(candidate, prime)) {
            return false;
        }
    }
    return true;
}

int main() {

    int lower_bound, upper_bound;
    std::cout << "Enter lower and upper bounds: ";
    std::cin >> lower_bound >> upper_bound;
    if (lower_bound < lower_limit) {
        lower_bound = lower_limit;
    }
    if (upper_bound > upper_limit) {
        upper_bound = upper_limit;
    }

    // find all primes less than or equal to upper-bound
    std::vector<int> primes;
    for (int i = 2; i <= upper_bound; ++i) {
        if (is_prime(i, primes)) {
            primes.push_back(i);
        }
    }

    // display primes in [lower_bound, upper_bound]
    std::cout << "Prime numbers in the closed interval [" << lower_bound
        << ", " << upper_bound << "]:";
    for (const int prime : primes) {
        if (prime >= lower_bound) { 
            std::cout << ' ' << prime;
        }
    }

    return 0;
}


I appreciate any suggestions. Is there a different way to do this?

-----------------

1
2
3
4
5
6
7
int i;
for (i = 0; i < primes.size() && primes[i] < lower_bound; ++i) {
}

for (; i < primes.size(); ++i) {
    std::cout << ' ' << primes[i];
}

Would this snippet be better to use in place of the last for-loop in my program (it doesn't have to test against lower_bound as often) or is it too cryptic?
Last edited on
Apart from 2, all prime numbers are odd. So treat 2 as a special case. If a number is divisible by 2 then it is not a prime. So if an odd number, start at 3 and increment by 2 instead of 1. Also, the greatest number to test for divisor is the square root of the number to be tested.

Your last snippet is more efficient as it doesn't do so many tests. However, i should be of type size_t and not int. If you think it's cryptic, just add a simple comment. Note that you don't need the {} at the end of the for. Just have ); at the end.

Why have is_divisible() function? As its very simple, why not just have the modulo division in-line?

If you some idea of how many primes there are likely to be (there's formula for approximations), then the vector primes can have memory reserved to avoid as much as possible memory reallocations.
seeplus wrote:
Apart from 2, all prime numbers are odd. So treat 2 as a special case. If a number is divisible by 2 then it is not a prime. So if an odd number, start at 3 and increment by 2 instead of 1. Also, the greatest number to test for divisor is the square root of the number to be tested.
That really narrows it down! Thanks!

seeplus wrote:
However, i should be of type size_t and not int
Because size_t is of the type vector::size()? So should I use size_t in every for-loop in which I'm iterating over the elements of a container?

seeplus wrote:
If you think it's cryptic, just add a simple comment..
Ok!

seeplus wrote:
Note that you don't need the {} at the end of the for. Just have ); at the end.
I thought that {} would be easier to spot. But I haven't read much code which have empty bodies. Is ); more common? Thanks for mentioning it though!

seeplus wrote:
Why have is_divisible() function? As its very simple, why not just have the modulo division in-line?
Makes sense!

seeplus wrote:
If you some idea of how many primes there are likely to be (there's formula for approximations), then the vector primes can have memory reserved to avoid as much as possible memory reallocations.
reserved with vector::reserve()? Good suggestion. Thanks!
Last edited on
Instead of the 2 for loops, you could do something like (not tried):

1
2
3
const auto itr {std::find_if(primes.begin(), primes.end(), [lower_bound](auto no) { return no >= lower_bound;})};

std::copy(itr, primes.end(), std::ostream_iterator<int>(std::cout, " "));


or if you're a sado-masochist, then:

 
std::copy(std::find_if(primes.begin(), primes.end(), [lower_bound](auto no) { return no >= lower_bound; }), primes.end(), std::ostream_iterator<int>(std::cout, " "));

Last edited on
seeplus wrote:
Instead of the 2 for loops, you could do something like (not tried):

That looks intimidating! After a bit of googling, I was able to gather that the code does something similar by finding the first prime number to print and then inserting all elements thereon into the output stream. And that this does so with iterators.

Interesting snippet. Thanks.

Would using iterators, like your snippet did, be better for any reason (I have to clarify that I have very little knowledge about iterators; I don't know why they are a thing!)?
I appreciate any suggestions. Is there a different way to do this?


Use a sieve.
lastchance wrote:
Use a sieve.
Thanks!
1
2
3
4
5
6
7
8
9
10
11
    std::vector<int> primes;
    for (int i = 2; i <= upper_bound; ++i) {
        primes.push_back(i);
    }
    for (int i = 0; i != primes.size()-1; ++i) {
        for (int j = i + 1; j < primes.size(); ++j) {
            if (primes[j] % primes[i] == 0) {
                primes.erase(primes.begin() + j);
            }
        }
    }
I did this. It seems to work (to my surprise!).
Here is what I tried to do:
primes[i] is guaranteed to refer to a prime number.
primes[j] needs to be tested for being prime.
So for every possible primes[j] I check if it is evenly divided by any primes[i]; 
depending on the result, it either gets removed from the vector or gets to be a primes[i]

Can I have any improvements with this?
As I mentioned previously, excluding 2 primes start from 3 and are odd. So the first loop can start at 3 and increment by 2.

vector.erase() is inefficient as for each item erased means a memory copy from the element to erase to the end of the vector.

Possibly:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
include <vector>
#include <iostream>
#include <algorithm>

int main()
{
	const size_t upper_bound {100};

	std::vector<int> primes;

	for (int i = 3; i <= upper_bound; i += 2)
		primes.push_back(i);

	for (int i = 0; i != primes.size() - 1; ++i)
		for (int j = i + 1; j < primes.size(); ++j)
			if ((primes[i] > 0) && (primes[j] % primes[i] == 0))
				primes[j] = -1;

	std::copy_if(primes.begin(), primes.end(), std::ostream_iterator<int>(std::cout, " "), [](auto i) {return i > 0; });
}

Last edited on
Actually, I meant that the sieve would generate your primes in the first place. It certainly won't be done by erasing, just marking a vector<bool> array.

The most known is the Sieve of Eratosthenes:
https://en.wikipedia.org/wiki/Sieve_of_Eratosthenes
but there are others.

It comes up regularly as a topic in this forum, so if you search you will find plenty of previous examples.
Possibly something like (excluding 2):

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
include <vector>
#include <iostream>

int main()
{
	const size_t upper_bound {100};

	std::vector<bool> primes(upper_bound);

	for (int i = 3; i < upper_bound - 1; i += 2)
		for (int j = i + 2; !primes[i] && j < upper_bound; j += 2)
			if (j % i == 0)
				primes[j] = true;

	for (int i = 3; i < upper_bound; i += 2)
		if (!primes[i])
			std::cout << i << ' ';
}



3 5 7 11 13 17 19 23 29 31 37 41 43 47 53 59 61 67 71 73 79 83 89 97

Last edited on
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
#include <iostream>
#include <cmath>
#include <vector>
using namespace std;

vector<bool> sieve( int N )
{
   vector<bool> prime( N + 1, true );
   prime[0] = prime[1] = false;

   int imax = sqrt( N + 0.5 );
   for ( int i = 2; i <= imax; i++ )
   {
      if ( prime[i] )
      {
         for ( int j = i * i; j <= N; j += i ) prime[j] = false;
      }
   }

   return prime;
}

int main()
{
   const int N = 1000000;
   vector<bool> prime = sieve( N );

   const int lower = 900000, upper = 900999;
   cout << "The primes between " << lower << " and " << upper << " are:\n";
   for ( int i = lower; i <= upper; i++ ) if ( prime[i] ) cout << i << " ";
}
Last edited on
@seeplus
When i = upper_bound - 2
Then j starts at upper_bound (i+2). But because of j < upper_bound in the condition, the inner loop never executes when i=upper_bound-2.

Did you mean to write j <= upper_bound (The last loop would also have go be modified to i <= upper_bound)? Or perhaps i < upper_bound-2

It's a small detail but I just want to make sure.

Thanks for the code snippet.
@lastchance
sqrt( N + 0.5 )
Why are we adding 0.5?
@lastchance
sqrt( N + 0.5 )
Why are we adding 0.5?

Because I'm taking no chances with floating-point rounding error.
I had never thought about this.

Suppose the square root of 169 is represented as 12.9999 (suppose 13 can't be exactly represented).

Casting this value gives 13 (I guess there is a small amount of rounding).

We got away with the right int because there was a very small error.

But could there be enough error (for some number and some operations) such that what should have been 13 is represented as 2.998 and then when this value is casted to int, the int gets 12 instead of 13 (the fractional part is ignored while casting to int)?

Wouldn't it make more sense to have the default behaviour of casting to int be rounding instead of flooring in that case?
Last edited on
The default behaviour of casting a double to an int is truncating (toward 0). It is only "flooring" if the number is positive (and the floor() function actually returns a double, albeit it is numerically a whole number).

I think that truncation is more predictable. Which way would you choose to "round" 3.5 (not withstanding it might actually be 3.49999999999 or 3.50000000001)?

Last edited on
you can craft a scenario where any default behavior is not ideal, that is why you can forcibly override it with floor() ceil() round() and so on.
I think truncation has a historical basis, since I've seen it as the default behavior for conversion to integers in several languages. It's probable that someone somewhere once needed to specify some method of conversion and noticed that truncation was just the simpler implementation

Which way would you choose to "round" 3.5
Conventionally 3.5 would be rounded up to 4. That way the range that is rounded towards 4 is [3.5; 4.5).

not withstanding it might actually be 3.49999999999 or 3.50000000001
Well, 3.5 is exactly representable. 1 ulp up or down may be different, of course.
I think truncation has a historical basis,

which in turn, I believe, is a hardware bias: I think this is what happens at the CPU level and rather than add instructions to override it, C and C++ let it ride.
Topic archived. No new replies allowed.