I tried both of those, Jonnin. It made no measurable difference. There is a small but measurable cost in actually returning the primes rather than just counting them (about 0.07 seconds; I can get down to 0.471 seconds just counting), but vectors are so heavily optimised that even that is small. I had assumed that vectorised slicing of valarrays would be better, but vector<bool> clearly outperforms valarray<bool>, sadly.
The sieve clearly visits some points twice, which it would be nice to eliminate. I would also like to be able to get rid of the one and only repeated multiply j = i * i, but carrying isq through as a variable and updating it by addition and bitshifting:
(i+2)^2 = i^2 + ((i+1) << 2)
didn't make any difference either.
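For reference, this is roughly the shape of what I tried (a sketch with illustrative names, not my actual code): isq is carried along and updated with one addition and one shift per outer step, instead of computing i*i.

#include <vector>
using namespace std;

/* Sketch only: carry the square of the candidate along incrementally.
   (i+2)^2 = i^2 + 4*(i+1) = i^2 + ((i+1) << 2), so no multiply is needed. */
void sieve_odds( vector<bool>& p, unsigned long z, unsigned long nmax )
{
    unsigned long isq = 9;                          /* 3*3, square of the first odd candidate */
    for ( unsigned long i = 3; i <= nmax; i += 2 )
    {
        if ( p[i] )
            for ( unsigned long j = isq; j <= z; j += i + i )
                p[j] = false;                       /* sieve out odd multiples */
        isq += ( i + 1 ) << 2;                      /* advance isq to (i+2)^2 */
    }
}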
As a beginner in the beginners' corner I learned something new: vectors. Thank you.
I learned also that M$VS2010 is waaaaay too slooooow manipulating a vector if compiled in debug mode. Compiled for release is ok.
Line 10 in your suggestion is int imax = sqrt( N + 0.5 );
Could you please explain the need for that +0.5?
#include <iostream> /* cout, cin, endl */
#include <ctime>    /* clock */
#include <vector>
#include <cmath>    /* sqrt */
using namespace std;

unsigned long const z(100000000);
vector<bool> p(z+1, true);

int main()
{
    unsigned long n, c(1), i, nmax;
    clock_t t0 = clock();               /* time(R) */
    nmax = sqrt( z + .0 );              // sqrt wants a real argument
    for (n = 3; n <= nmax; n += 2)      /* only up to sqrt(z) */
        if (p[n])                       /* candidate is prime */
        {
            c += 1;                     /* count it */
            for (i = n*n; i <= z; i += n+n)
                p[i] = false;           /* sieve out multiple */
        }
    /* sieving is done, all remaining are prime */
    for ( ; n <= z; n += 2)             /* n is already the first uncounted odd > nmax */
        if (p[n]) c += 1;
    double te = (double)( clock() - t0 ) / CLOCKS_PER_SEC;   /* time(E) */
    cout << "From 1.." << z << " found " << c << " primes in " << te << " seconds." << endl;
    cout << "The first 50 of them are:" << endl;
    cout << 2;
    i = 0;
    for (n = 3; i < 50; n += 2)
        if (p[n])
        {
            cout << ' ' << n;
            i++;
        }
    cout << endl;
}
From 1..100000000 found 5761455 primes in 0.437989 seconds.
The first 50 of them are:
2 3 5 7 11 13 17 19 23 29 31 37 41 43 47 53 59 61 67 71 73 79 83 89 97 101 103 107 109 113 127 131 137 139 149 151 157 163 167 173 179 181 191 193 197 199 211 223 227 229 233
BTW, the sqrt in the procedure has nothing to do with Eratosthenes as such; it only arises from restricting the sieve to a finite range out of the infinite supply of numbers.
Edit: M$VS2010 rejects sqrt with integer argument. Changed sqrt( z ) to sqrt( z + .0 )
Line 10 in your suggestion is int imax = sqrt( N + 0.5 );
Could you please explain the need for that +0.5?
The theoretical requirement would be sqrt( N ). Any composite number less than or equal to N must have at least one factor less than or equal to sqrt( N ), so this is part of the sieve algorithm and is the limit of how far you need to check before initiating the slicing.
However, sqrt( N ) is evaluated and returned as a floating-point number before being truncated down to an integer on assignment. As floating-point results are always subject to (in)accuracy, that extra 0.5 covers me for the unfortunate case where the true root is, say, 25 but comes back as 24.99999999 and gets truncated down to 24.
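To illustrate (just a sketch, not part of the sieve; the function name is made up): with the +0.5 the truncation lands on the right integer, and if one wants to be completely paranoid the result can be nudged so that imax*imax <= N always holds.

#include <cmath>

/* Sketch: integer square root with the 0.5 safety margin, plus a paranoid
   correction so that imax*imax <= N < (imax+1)*(imax+1) is guaranteed. */
int int_sqrt( int N )
{
    int imax = (int)std::sqrt( N + 0.5 );
    while ( imax * imax > N ) --imax;                      /* sqrt() came back a touch high */
    while ( ( imax + 1 ) * ( imax + 1 ) <= N ) ++imax;     /* sqrt() came back a touch low  */
    return imax;
}

For N of the size used here the int arithmetic is fine; near INT_MAX the (imax+1)*(imax+1) check would need a wider type.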
I suspect vector<bool> works so well because it packs its elements into a small amount of memory (typically 1 bit each, rather than 1 or 4 bytes). Thus you can keep more of the sieve in fast cache memory at any time. But I bet that is susceptible to whatever else you might be running on the machine at the time!
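A back-of-the-envelope illustration of the packing (the printed numbers are the theoretical payload, not a measurement): for the 10^8 flags used above, a packed vector<bool> needs roughly 12 MB, whereas one byte per flag would need roughly 100 MB, which matters a lot for cache behaviour.

#include <iostream>
#include <vector>

int main()
{
    const std::size_t z = 100000000;
    std::vector<bool> packed( z, true );    /* specialised: roughly z/8 bytes of storage */
    std::vector<char> bytes ( z, 1 );       /* one whole byte per flag: roughly z bytes  */

    std::cout << "vector<bool> payload ~ " << z / 8 / (1024*1024) << " MB\n";
    std::cout << "vector<char> payload ~ " << z / (1024*1024)     << " MB\n";
}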
#include <iostream>
#include <cmath>
#include <ctime>
#include <vector>
using namespace std;

vector<int> sieve( int N )
{
    int M = ( N - 1 ) / 2;                     // Upper limit of odd numbers counting from 3, 5, ...
    vector<bool> A( 1 + M, true );             // A[0] not used
    int oddmax = ( sqrt( N + 0.5 ) - 1 ) / 2;

    for ( int odd = 1; odd <= oddmax; odd++ )  // Only worry about odd numbers (first is 3, second is 5, etc.)
    {
        if ( A[odd] )
        {
            int odd2 = odd << 1;
            int oddstep = odd2 + 1;
            for ( int j = odd2 * ( odd + 1 ); j <= M; j += oddstep ) A[j] = false;
        }
    }

    vector<int> P = { 2 };                     // 2 is the only even prime
    for ( int odd = 1; odd <= M; odd++ ) if ( A[odd] ) P.push_back( odd + odd + 1 );   // Include odd primes
    return P;
}

int main()
{
    const int N = 100000000;
    clock_t t0 = clock();
    vector<int> primes = sieve( N );
    double t = (double)( clock() - t0 ) / CLOCKS_PER_SEC;

    cout << "Found " << primes.size() << " primes in " << t << " seconds\n\n";

    const int num = 20;
    cout << "The first " << num << " are: ";
    for ( int i = 0; i < num; i++ ) cout << primes[i] << " ";
}
Found 5761455 primes in 0.357052 seconds
The first 20 are: 2 3 5 7 11 13 17 19 23 29 31 37 41 43 47 53 59 61 67 71
1. Is there any rational justification for still using M$VS2010?
I was told that the 2010 compiler has fewer bugs than its successors. As a beginner I am not able to verify that claim. On the other hand, I was told the 2015 IDE is supposed to be better. No idea; I am still struggling with terms like Project and Solution, big notions whose boundaries are still obscure to me.
2. What are the grounds for the rejection?
sqrt for integer arguments is not implemented (in the system I am currently using).
3. An explicit cast would make it explicit: sqrt( static_cast<double>(z) )
Thank you. Meanwhile I use my own "integer RooT"
unsigned long rt( unsigned long z )
{
    unsigned long a, b;
    a = z;
    do {
        b = a;
        a = (a + z / a) / 2;
    } while (a < b);
    return b;
}
It is slower than int rt = sqrt( N + 0.0 ), but i) it is called only once, outside the loop, and ii) I have done it on my own. (I only see now that an argument of 0 would need red-carpet treatment, since it leads to z / a with a == 0 -- unless 0/0 == 0 is usual in C/C++.)
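A guarded variant (sketch only; the name rt_guarded is just for illustration) would handle the small arguments up front so the z / a step can never divide by zero:

unsigned long rt_guarded( unsigned long z )
{
    if ( z < 2 ) return z;          /* rt(0) = 0, rt(1) = 1; avoids 0/0 below */
    unsigned long a = z, b;
    do {
        b = a;
        a = (a + z / a) / 2;
    } while (a < b);
    return b;
}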
I was told that the compiler of 2010 has fewer errors than the successors.
You were IMO, to put it bluntly, lied to.
If your intent is to write C++ code that is not capable of being even C++11 compliant then by all means use VS2010.
VS2017 is more than capable of enforcing C++ standard compliant coding.
There is one very minor quibble. The earliest C++ standard it allows appears to be C++14, even when no explicit language standard is chosen.
I learned also that VS2010 is waaaaay too slooooow manipulating a vector if compiled in debug mode. Compiled for release is ok.
In my experience, debug mode with VS2017 can still be a major time bottleneck, if not as severe. With the debug libraries loaded and in use, speed is not even remotely something to be worried about; speed is just not important when debugging code. IMO.
Microsoft thinks they are trendy or even trendsetters.
Project and Solution.
Project should be obvious. At your level, it's one assignment or program, but it can also mean a group of related things; for example, a library you wrote specifically for your program could make two things in one project.
A solution is Microsoft-speak for "program". Lately they have moved on to "app", which is short for "application", which is millennial for "program". All these things mean about the same thing: a freaking executable file on your computing device. Try not to get hung up over the latest buzzword. It's idiotic, but they are going to keep changing the words, so you just have to sigh and roll with it.
Writing your own root finder is great practice, but the CPU probably has this as a circuit, and nothing you can do in C++ is ever going to get within an order of magnitude of *that* for performance. Use the tools they gave you, apart from the 'for fun' and 'for practice' stuff. If you can beat it, do so (pow for integer powers is very beatable in code, but sqrt -- I don't think it's possible; it's a case-by-case thing). You might do it with an approximation, but not a high-precision finder. This stuff can be fun, but you can sink a lot of time into shaving off a few nanoseconds too.
2010 was 9 years ago. That is an *eternity* in this world. Granted I have a few things that date back to 1990, but not my compilers :P
C++ has revisions. It had one each in '11, '14, and '17, so you are THREE generations behind using that tool. (This was already said, just not directly; I want to make it very clear how out of date you are.)
It has now been three weeks since I took my first steps with C++, first with MinGW (where I failed), now with M$VS2010, and I still do not feel comfortable with it, so I am in no position to have an opinion of my own about the pros and cons of compiler versions or IDEs. For me it is not important whether I was told a lie or some taradiddle. My goal was first and foremost to modify an existing program (the trace output of V41, a virtual HP41 you may get at www.hp41.org for free -- well, almost free, you have to hand over your e-mail address). I achieved that in the first week after installing VS2010. Now that I have C++, I am just trying to see whether it could be useful for something else.
In addition, I found this site with its tutorials and a helpful beginners' forum. Quite possibly, I will redo my simulations of an HP7470A pen plotter and an HP82162A printer; I did both in ooREXX some time ago.
Use the tools they gave you, apart from the 'for fun' and 'for practice' stuff.
As a beginner in C++ I just try this and that, nothing serious. Some kind of training. And as with tennis, for the first trials an old racket will be ok.
Now it's getting cruel. This suggestion is still the Sieve of Eratosthenes, but far away from the "aesthetically pleasing" first try.
The idea is: if reducing the size of the vector has such a distinct impact on execution time, I can squeeze out some more multiples (those of 3 and 5) and pay for it with some extra index calculation.
#include <iostream> /* cout, cin, endl */
#include <ctime>    /* clock */
#include <vector>
using namespace std;

unsigned long const z(100000000);
vector<bool> p(((z-2)/30+1)*8, true);

unsigned long rt( unsigned long z )
{
    unsigned long a, b;
    a = z;
    do {
        b = a;
        a = (a + z / a) / 2;
    } while (a < b);
    return b;
}

unsigned long i2n( unsigned long i )   /* bit index -> number */
{
    unsigned long b;
    b = i % 8;
    switch (b) {
    case 0: b =  7; break;
    case 1: b = 11; break;
    case 2: b = 13; break;
    case 3: b = 17; break;
    case 4: b = 19; break;
    case 5: b = 23; break;
    case 6: b = 29; break;
    case 7: b = 31; break;
    // default: /* should not come here */
    }
    return 30 * (i / 8) + b;
}

void pf( unsigned long n )   /* mark number n as composite */
{
    unsigned long a, b;
    b = n % 30;
    switch (b) {
    case  7: b = 0; break;
    case 11: b = 1; break;
    case 13: b = 2; break;
    case 17: b = 3; break;
    case 19: b = 4; break;
    case 23: b = 5; break;
    case 29: b = 6; break;
    case  1: n -= 2; b = 7; break;    /* residue 1 lives in the "31" slot */
    default: return;                  /* multiple of 2, 3 or 5: not stored */
    }
    a = n / 30;
    b = 8 * a + b;
    p[b] = false;
    return;
}

int main()
{
    unsigned long n, c(3), i, rz, m;
    clock_t t0 = clock();                  /* time(R) */
    rz = ((rt(z)-2)/30)*8 + 1;             /* roundabout */
    for (n = 0; n <= rz; ++n)              /* only up to sqrt(z) */
    {
        if (p[n])                          /* candidate is prime */
        {
            c += 1;                        /* increment counter */
            m = i2n(n);
            for (i = m*m; i <= z; i += m+m)
                pf(i);                     /* sieve out multiple */
        }
    }
    /* sieving is done, all remaining are prime */
    for (i = ((z-2)/30+1)*8; i2n(i) > z; --i);   /* largest index whose number is <= z */
    for ( ; n <= i; n++)
        if (p[n]) c++;
    double te = (double)( clock() - t0 ) / CLOCKS_PER_SEC;   /* time(E) */
    cout << "From 1.." << z << " found " << c << " primes in " << te << " seconds." << endl;
}
It looks awful but runs a few milliseconds faster (only with optimization level -O3, the maximum).
From 1..100000000 found 5761455 primes in 0.347293 seconds.
The runtime varies somewhat from run to run; this is one of the best. So I assume the value represents wall-clock time, not CPU time.
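To check rather than assume (a sketch; it assumes a C++11 compiler with <chrono>): the C standard says clock() measures processor time, but some implementations report elapsed wall-clock time instead, so comparing it against std::chrono::steady_clock, which is always a wall clock, would settle the question on a given system.

#include <chrono>
#include <ctime>
#include <iostream>

int main()
{
    std::clock_t c0 = std::clock();
    auto w0 = std::chrono::steady_clock::now();

    volatile unsigned long sum = 0;                       /* some busy work to time */
    for ( unsigned long i = 0; i < 300000000UL; ++i ) sum += i;

    double cpu  = (double)( std::clock() - c0 ) / CLOCKS_PER_SEC;
    double wall = std::chrono::duration<double>( std::chrono::steady_clock::now() - w0 ).count();
    std::cout << "clock(): " << cpu << " s,  steady_clock: " << wall << " s\n";
}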
#include <iostream>
#include <cmath>
#include <ctime>
#include <vector>
using namespace std;

int sieve( int N )
{
    int M = ( N - 1 ) / 2;                     // Upper limit of odd numbers counting from 3, 5, ...
    vector<bool> A( 1 + M, true );             // A[0] not used
    int oddmax = ( sqrt( N + 0.5 ) - 1 ) / 2;

    for ( int odd = 1; odd <= oddmax; odd++ )  // Only worry about odd numbers (first is 3, second is 5, etc.)
    {
        if ( A[odd] )
        {
            int odd2 = odd << 1;
            int oddstep = odd2 + 1;
            for ( int j = odd2 * ( odd + 1 ); j <= M; j += oddstep ) A[j] = false;
        }
    }

    int c = 1;                                 // 2 is the only even prime
    for ( int odd = 1; odd <= M; odd++ ) if ( A[odd] ) c++;   // Include odd primes
    return c;
}

int main()
{
    const int N = 100000000;
    clock_t t0 = clock();
    int p = sieve( N );
    double t = (double)( clock() - t0 ) / CLOCKS_PER_SEC;
    cout << "Found " << p << " primes in " << t << " seconds\n\n";
}
#include <iostream>
#include <cmath>
#include <ctime>
#include <vector>
using namespace std;

int sieve( int N )
{
    int odd;
    int M = ( N - 1 ) / 2;                     // Upper limit of odd numbers counting from 3, 5, ...
    vector<bool> A( 1 + M, true );             // A[0] not used
    int oddmax = ( sqrt( N + 0. ) - 1 ) / 2;
    int c = 1;                                 // 2 is the only even prime

    for ( odd = 1; odd <= oddmax; odd++ )      // Only worry about odd numbers (first is 3, second is 5, etc.)
    {
        if ( A[odd] )
        {
            ++c;
            int odd2 = odd << 1;
            int oddstep = odd2 + 1;
            for ( int j = odd2 * ( odd + 1 ); j <= M; j += oddstep ) A[j] = false;
        }
    }

    for ( ; odd <= M; odd++ ) if ( A[odd] ) c++;   // Include remaining odd primes (odd is already oddmax+1, not yet counted)
    return c;
}

int main()
{
    const int N = 100000000;
    clock_t t0 = clock();
    int p = sieve( N );
    double t = (double)( clock() - t0 ) / CLOCKS_PER_SEC;
    cout << "Found " << p << " primes in " << t << " seconds\n\n";
}
Because when I timed it this didn't make any measurable difference. sqrt(M) is several thousand times smaller than M, so it makes negligible difference whether I start the extra counting loop from 1 or sqrt( M ) + 1.
The timing swings wildly on c++ shell from about 0.27 to 0.33 seconds, with the occasional one above 0.4 seconds. So I guess it is very dependent on the size of the sieve array and how much of that fits in cache memory. (And what else is being done on the machine at the same time.)
So a good segmented sieve will probably perform better.
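Something along these lines is what I mean by segmented (an illustrative sketch only, not tuned; the name segmented_count and the block size of 2^18 are arbitrary choices): sieve [2, N] block by block, with each block small enough to stay in cache, using the primes up to sqrt(N) obtained from a small base sieve.

#include <algorithm>
#include <cmath>
#include <iostream>
#include <vector>
using namespace std;

long long segmented_count( long long N, long long segment = 1 << 18 )
{
    long long root = (long long)sqrt( N + 0.5 );

    /* Base sieve: primes up to sqrt(N) */
    vector<bool> base( root + 1, true );
    vector<long long> primes;
    for ( long long i = 2; i <= root; i++ )
        if ( base[i] )
        {
            primes.push_back( i );
            for ( long long j = i * i; j <= root; j += i ) base[j] = false;
        }

    /* Sieve [2, N] one cache-sized block at a time */
    long long count = 0;
    vector<bool> block;
    for ( long long low = 2; low <= N; low += segment )
    {
        long long high = min( low + segment - 1, N );
        block.assign( high - low + 1, true );
        for ( long long p : primes )
        {
            long long start = max( p * p, ( low + p - 1 ) / p * p );   /* first multiple of p in the block */
            for ( long long j = start; j <= high; j += p ) block[j - low] = false;
        }
        for ( long long j = low; j <= high; j++ )
            if ( block[j - low] ) ++count;
    }
    return count;
}

int main()
{
    cout << segmented_count( 100000000 ) << '\n';    /* should report 5761455 */
}

The point is that block, not the whole 10^8-entry vector, is what has to fit in cache while sieving.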
So I guess it is very dependent on the size of the sieve array
In addition, the speed of the inner (sieving) loop has the most impact. I should find a faster index calculation. Meanwhile I do the sieving for some primes everybody knows by heart in advance, just to avoid the slow index calculation for those.
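One thing I might try (a sketch only, not measured; the table names and n2i_table are made up for illustration): replace the two switch statements in i2n() and pf() with small lookup tables, so the residue mapping becomes an array access rather than a jump table.

/* Sketch: table-driven residue mapping for the mod-30 wheel.
   RES maps bit position 0..7 to the wheel residue; IDX maps n % 30 to the
   bit position, or 255 for residues the wheel never stores. */
unsigned long const RES[8] = { 7, 11, 13, 17, 19, 23, 29, 31 };

unsigned char const IDX[30] = {
    255,   7, 255, 255, 255, 255, 255,   0, 255, 255,   /* residue 1 lives in the "31" slot */
    255,   1, 255,   2, 255, 255, 255,   3, 255,   4,
    255, 255, 255,   5, 255, 255, 255, 255, 255,   6
};

unsigned long i2n_table( unsigned long i )               /* bit index -> number */
{
    return 30 * ( i / 8 ) + RES[i % 8];
}

bool n2i_table( unsigned long n, unsigned long& idx )    /* number -> bit index, if stored */
{
    unsigned char b = IDX[n % 30];
    if ( b == 255 ) return false;                        /* multiple of 2, 3 or 5: not stored */
    if ( b == 7 ) n -= 2;                                /* residue 1 is kept in the "31" column of n-2's row */
    idx = 8 * ( n / 30 ) + b;
    return true;
}

pf( n ) would then reduce to: unsigned long idx; if ( n2i_table( n, idx ) ) p[idx] = false;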
If statements (or really, jumps of any sort) are notorious for bogging down perfectly good code. I can't help but think that if you could avoid some of the jumps, it would earn you some clock cycles. In the grand scheme they are a tiny part of this code, though.