Hello all, I'm new to the forums, though I'm not new to the language. I have a fairly simple question:
Foreword:
I'm working on a large computational problem that involves hundreds and hundreds of calculations, all of which I would like to do in separate threads. That would be easy enough if I simply wanted to pass parameters into a thread function once, but I want my threads to stay alive and do many, many jobs.
The Problem:
How do I make a ton of threads, get information to them while they are running, and make sure a job isn't done twice?
So far my code is minimal. I'm using a simple class to manage what needs to be done as far as calculations go: basically a deque of jobs (function pointers that take a void* parameter), with member functions to add and remove entries. I'm not sure how to use events (or whether they would be fast enough), but I do know some socket programming. Can you give me any guidance? I would greatly appreciate any code you could offer =]
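To give a clearer picture, here's roughly what my job class looks like right now (a simplified sketch with the names changed for the post; note there's no locking in it yet, which is part of what I'm asking about):

```cpp
// Rough sketch of the job queue I have so far, simplified for the post.
// No synchronization yet -- that's part of the question.
#include <deque>

typedef void (*JobFunc)(void*);   // a job is a function plus its parameter block

struct Job {
    JobFunc func;
    void*   params;
};

class JobQueue {
public:
    void add(JobFunc f, void* p) {
        Job j = { f, p };
        jobs_.push_back(j);
    }

    bool take(Job& out) {          // pop the next job, if any
        if (jobs_.empty())
            return false;
        out = jobs_.front();
        jobs_.pop_front();
        return true;
    }

    bool empty() const { return jobs_.empty(); }

private:
    std::deque<Job> jobs_;
};
```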
You might want to post this in General C++ Programming; this is the beginner forum. Something that involves as many computations as you're describing is probably beyond a beginner's reach.
I'm confused: you say you're a veteran C++ user, so why would you think sockets would help in this instance? Is this for a cloud or something? To make sure a job isn't done twice, I would just remove that instance from the working queue; if that would adversely affect some other part of your application, then use a static class member to indicate which part of the working queue is to be worked on next.
Are you using the Win32 API? SDL? SFML? Boost::ASIO? Something else? If you give us more details, we can give you better guidance.
Sockets are not necessary in this instance; I only mentioned them in case there might be some slight advantage in borrowing socket programming techniques (e.g. select() and poll()).
Yes, I was well aware of that and have already been removing each job from the queue. However, as my main thread adds computational jobs, it has to scan the queue to see whether a job already exists. I'm going to end up using a list or something similar (with a mutex, I'm thinking) that tracks the jobs currently being done (roughly the sketch at the bottom of this post).
Didn't think about the static member, good idea!
Currently it's not using any outside libs; it's all STL (as of this moment), but I can switch to a POSIX-compliant system or Windows if necessary.
Sorry for the lack of details; I had hoped that someone more experienced than I am would instantly know what to do.
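Here's a very rough sketch of the "in progress" tracking I have in mind; the job IDs and helper names are just made up for illustration. The idea is that the main thread checks the set before queueing a job, and a worker removes the ID once the job is finished:

```cpp
// Rough sketch: a mutex-protected set of job IDs that are queued or running,
// so the same job is never handed out twice. Names are made up for the post.
#include <pthread.h>
#include <set>

class InProgressList {
public:
    InProgressList()  { pthread_mutex_init(&lock_, NULL); }
    ~InProgressList() { pthread_mutex_destroy(&lock_); }

    // Returns true if the job was not already pending/running and was claimed.
    bool tryClaim(int jobId) {
        pthread_mutex_lock(&lock_);
        bool claimed = jobs_.insert(jobId).second;
        pthread_mutex_unlock(&lock_);
        return claimed;
    }

    // Called by a worker when the job is finished.
    void finish(int jobId) {
        pthread_mutex_lock(&lock_);
        jobs_.erase(jobId);
        pthread_mutex_unlock(&lock_);
    }

private:
    pthread_mutex_t lock_;
    std::set<int>   jobs_;   // IDs of jobs queued or currently being worked on
};
```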
hundreds and hundreds of calculations, all of which I would like to do in separate threads
I'm not experienced in multithreading, and I've only really used a large number of threads in Java, but wouldn't hundreds of threads slow things down a hell of a lot?
I've written more than a dozen multithreaded apps with real-world applications. The trouble is that the STL doesn't provide MT capability yet, and you haven't picked a library that does.
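For example, with plain POSIX threads the skeleton of a pool of long-lived workers looks something like this. It's a bare-bones sketch, not tailored to your job class: a shared queue, a mutex, and a condition variable so idle workers sleep until work arrives:

```cpp
// Bare-bones pthreads skeleton: long-lived workers pulling jobs off a shared
// queue. The Job type is a placeholder -- adapt it to your own class.
#include <pthread.h>
#include <deque>

struct Job { void (*func)(void*); void* params; };

static std::deque<Job> g_jobs;
static pthread_mutex_t g_lock    = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  g_hasWork = PTHREAD_COND_INITIALIZER;
static bool            g_shutdown = false;

void* worker(void*) {
    for (;;) {
        pthread_mutex_lock(&g_lock);
        while (g_jobs.empty() && !g_shutdown)
            pthread_cond_wait(&g_hasWork, &g_lock);   // sleep until work arrives
        if (g_shutdown && g_jobs.empty()) {
            pthread_mutex_unlock(&g_lock);
            return NULL;
        }
        Job j = g_jobs.front();
        g_jobs.pop_front();
        pthread_mutex_unlock(&g_lock);

        j.func(j.params);                             // do the work while unlocked
    }
}

void addJob(void (*f)(void*), void* p) {
    pthread_mutex_lock(&g_lock);
    Job j = { f, p };
    g_jobs.push_back(j);
    pthread_cond_signal(&g_hasWork);                  // wake one waiting worker
    pthread_mutex_unlock(&g_lock);
}
```

You'd spawn a fixed number of these workers with pthread_create() at startup and let them run for the life of the program, rather than creating a thread per job.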
I'm not experienced in multithreading, and I've only really used a large number of threads in Java, but wouldn't hundreds of threads slow things down a hell of a lot?
That's what I thought too, although I don't have much experience with multithreading either.
Nonetheless, I'm not sure I see why you would want hundreds of threads across a processor with at best 8/16 virtual cores. That's why I suggested CUDA/OpenCL, although I have little experience with them myself.
Yes, running hundreds of threads in any environment will slow the system down considerably, but it slows down considerably more, and makes intercommunication much harder, if you run hundreds of processes on the same system. I've been informed recently by my local guru that this may not be the case on Linux, though; I plan on looking into it later today.
First off, for a project like this you use a dedicated system, not your desktop.

The main problem with running too many threads on a system is that if you take up too much memory, the OS will start paging data out to disk; you can query for this, but the OS doesn't outright tell you it's doing it. With everything running in parallel, the OS would need to go right back to the disk to fetch the data again, which means it pages out another thread's memory block, and this happens over and over ad nauseam. Any time you hit the hard disk (both writing to it and reading from it) you drastically reduce your data throughput and lose the benefit that multithreading provides. On Windows this can potentially lock up the system, since given equal priority the Windows scheduler assigns priority based on the number of threads in a process.
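If it helps, on Windows you can at least watch the overall memory load before handing out more parallel work. A rough sketch (the 90% threshold is just an example I picked):

```cpp
// Rough sketch (Windows-only): check how loaded physical memory is before
// queueing more parallel jobs, to avoid driving the system into paging.
#include <windows.h>

bool memoryIsTight() {
    MEMORYSTATUSEX status;
    status.dwLength = sizeof(status);
    if (!GlobalMemoryStatusEx(&status))
        return false;                   // query failed; assume we're fine
    return status.dwMemoryLoad > 90;    // dwMemoryLoad = % of physical RAM in use
}
```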