High-speed processor

Hi,

I have written some C++ code to do calculations on a huge text file. It took about 10 days to do the calculation on a high-speed laptop.

Are there any online C++ servers or mainframes that I can use to do the calculations? Or anything else I can use to speed up the calculations? I have searched but couldn't find anything.
Did you make sure you were making a release build? Was your code properly multi-threaded? Are you sure you don't have any superfluous code? Are you using the fastest algorithms you could find?

There are *all sorts* of ways you could improve performance. ;)

But in response to your question about online C++ servers, I would be somewhat surprised if there was something outside of supercomputers that you could buy time on that would let you compile and run your C++ code on them. It's still worth looking into, though. :)

-Albatross
Did you make sure you were making a release build? Was your code properly multi-threaded? Are you sure you don't have any superfluous code? Are you using the fastest algorithms you could find?...

...Are you using a faster PC than a Pentium 3?
Thank you for your replies....

I'm using an Intel Core Duo (2.13 GHz). I think I'm using the fastest serial algorithm that I can write...
I did not use a release build; will this have a huge effect on the speed?

My C++ code is very simple. However, my dataset is huge... The dataset contains names of people, and my code counts the number of times each name occurs...

The dataset contains names of people, and my code counts the number of times each name occurs...



Have you tried sorting your dataset first? This should change the Big O of your algorithm for counting repeated instances in your dataset.
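For example, a sort-then-count approach might look something like this (a minimal sketch, not the OP's actual code; it assumes every line holds exactly one name and uses "names.txt" as a stand-in for the real file):

#include <algorithm>
#include <fstream>
#include <iostream>
#include <string>
#include <vector>

int main() {
    std::ifstream in("names.txt");            // placeholder file name
    std::vector<std::string> names;
    std::string line;
    while (std::getline(in, line))
        names.push_back(line);                // read the file once

    std::sort(names.begin(), names.end());    // O(N log N); equal names become adjacent

    for (std::size_t i = 0; i < names.size(); ) {
        std::size_t j = i;
        while (j < names.size() && names[j] == names[i]) ++j;   // run of identical names
        std::cout << names[i] << ": " << (j - i) << '\n';
        i = j;
    }
}

One sort plus one linear scan replaces the per-name rescans, so the total work is roughly O(N log N) instead of O(N * number of names).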
I think I'm using the fastest serial algorithm that I can write
This is a custom algorithm, and not std:: or some other library?

How big is the dataset, Terabytes of data? How many passes through the dataset does your algorithm do?
Nouf wrote:
I did not use a release build; will this have a huge effect on the speed?

Yes, it will likely make a considerable difference; at least it did for me when I timed linked-list sorts.
Thanks for the replies...

Yes, this is a custom algorithm... My dataset contains nearly 83,000,000 lines... Each line has a name...
The algorithm does nearly 50,000,000 passes through the dataset, once for each name.

No, I didn't sort the dataset. Does sorting 83,000,000 lines in a text file take a reasonably short time on an Intel Core Duo PC at 2.13 GHz?
As Naraku says, a release build will probably make a big difference. Don't just trust your IDE to do it all for you either. Your compiler probably has many optimisation options that you can turn on.

Dig out a decent profiler and let it watch your code run for a while; it will point out to you the true bottlenecks.

Reading from disk is expensive. Really expensive. It takes forever. At the moment, if you're simply reading through the data over and over, you can make some serious time savings there.
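A rough sketch of that idea, assuming the lines fit in RAM (load_lines and the file name are illustrative, not the OP's code; the reserve figure is just the line count mentioned in the thread):

#include <fstream>
#include <string>
#include <vector>

// Read the whole file into memory once; every later pass then works on the in-memory copy.
std::vector<std::string> load_lines(const std::string& path) {
    std::ifstream in(path);
    std::vector<std::string> lines;
    lines.reserve(83000000);                 // rough size known in advance, avoids repeated reallocations
    std::string line;
    while (std::getline(in, line))
        lines.push_back(line);
    return lines;
}

int main() {
    std::vector<std::string> names = load_lines("names.txt");   // placeholder file name
    // ... all further passes use 'names' instead of re-reading the disk ...
}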

The list of ways to optimise C++ code is long: loop unrolling, references over pointers, speed-for-safety exchanges of C++ style for C style, dropping to assembler, and lots more.

My dataset contains nearly 83,000,000 lines... Each line has a name...
The algorithm does nearly 50,000,000 passes through the dataset, once for each name.
Do you read the file for each pass, or do you read it into memory once?

My dataset contains nearly 83,000,000 lines... Each line has a name...
The algorithm does nearly 50,000,000 passes through the dataset, once for each name.


If the dataset fits fully in memory, a single pass is enough.
If the dataset doesn't fit fully in memory, three passes should be enough.
If you need more, you are doing it wrong.
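
As an illustration of the single-pass case, here is a minimal sketch using std::unordered_map (again assuming one name per line, with a placeholder file name; this is not the OP's code):

#include <fstream>
#include <iostream>
#include <string>
#include <unordered_map>

int main() {
    std::ifstream in("names.txt");                    // placeholder file name
    std::unordered_map<std::string, long long> counts;
    std::string name;
    while (std::getline(in, name))
        ++counts[name];                               // one pass: read each line once, bump its counter

    for (const auto& entry : counts)
        std::cout << entry.first << ": " << entry.second << '\n';
}

Note that the map only has to hold the distinct names and their counts, not all 83,000,000 lines.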

BTW: a Core 2 Duo is not a high-speed laptop. I'd say it is a common laptop.
BTW2: Using a release build probably won't help much, because compilers are not intelligent enough to fix broken algorithms.
Use the Intel Compiler. It has optimisations that could speed up processing your work by 300%.

It's free for home-users.

http://software.intel.com/en-us/articles/intel-compilers/
Start by trying to minimise your algorithmic complexity.

It has optimisations that could speed up processing your work by 300%.


That was true 10 years ago. Now VS and GCC are nearly as good. A faster compiler still doesn't solve the problem - a suboptimal algorithm is a suboptimal algorithm, regardless of low-level optimisations.
That was true 10 years ago. Now VS and GCC are nearly as good. A faster compiler still doesn't solve the problem - a suboptimal algorithm is a suboptimal algorithm, regardless of low-level optimisations.


I wrote a program a few days ago to search for dipoles in a magnetic field map, which is a very complicated problem that needs processor power. You'd be surprised to learn that VS with its /O2 is 200-250% faster than MinGW's -O3, and the Intel compiler (Composer 2011) with its optimisation flag (-fast) is 300-350% faster than MinGW.
There's no point in emphasising optimisations if you don't know where the program is spending its time. If 5% of the time is spent doing processing and 95% of the time is spent doing I/O, compiling with some kind of optimisation or slating the CPU isn't really going to help.
83,000,000 is a large number, but I just don't believe that 10 days is required to process it.

Can you post an example of 1~2 lines from the text file and indicate what processing is required on them?

@kbw

The guy didn't share any code, and that's why the only suggestion we can provide is optimisations. You can't judge that the guy has written bad algorithms just because his program takes a long time to run. In my master's thesis, I had to run a program that took 2 days to finish (it involved discretising continuum objects onto a 3D grid of a 1 mm^3 sample at nm resolution). Does that make me a bad programmer or a bad algorithm writer? It's a very subjective thing, and you have to give advice with what you have. The guy asked whether he could improve the execution time of his program, and the answer is: with optimisations, not by accusing him of being a bad algorithm writer.
Now and then, when presented with something interesting, people come up with better ways to do it. Like this (beautiful) monstrosity: http://www.cplusplus.com/forum/general/59883/

If you post enough of your actual problem, someone might solve it.

I see. But as the OP hasn't responded with further detail, there's only so far you can go.