Hi, I'm writing a daemon and I need to launch some processes, but I'm not sure what I need to use. I've read a lot about forking, threads and so on, but it isn't clear to me yet. I'll try to explain myself:
First I read a config file which contains several IP addresses and some more info about each IP. Then I need to open a connection to each of those IPs. Then the daemon should enter "the big loop" and check every 20 seconds whether those processes are still running. If not, I should create a new connection for each connection that is down.
Those connections are supposed to run continuously. So my first thought was that I would need to fork and create as many child processes as I have IP connections. Plus, it would be very easy to keep track of the pids and check whether they're running or not.
But on the other hand, those "child processes" don't have much to do with the parent one, so it doesn't make much sense to me, as I understand that a child process is a copy of the parent process. I also read about zombie processes, which I think would be a problem here. What's more, I would need to pass some parameters (IP address and some other info) to each of the new processes, and with forking I can't find a way to do that.
Threads don't seem like a good solution either: they are not independent from each other and they don't have their own memory space (maybe I'm saying something stupid there), and I can't see how to manage the reconnection with them.
I'm quite lost here because I think I'm missing something, maybe some other way to manage this besides forking and threading. I hope you can give me a clue about which way to go. I have always found the answer in these forums :)
This is a classic problem and I believe most developers solve it using a pool concept. That is, maintain a pool of threads/connections/objects etc. Once a request comes in, look for a "free" one in the pool. If all are busy, the request waits. Those that were running become free again once their job completes, and the waiters are woken up to make use of them.
I don't know how to describe it in detail, but the main idea is there. Whether to implement the pool with processes, threads etc. is up to the individual developer.
I believe it is "cheaper" to maintain a pool of threads than a pool of processes. Inter-process communication is expensive. But the "cheaper" threads come at a cost: a proper synchronization mechanism must be in place, since all the threads now live together in the same process.
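A minimal sketch of the pool idea described above. It is written in Python only for brevity (the thread does not name a language), and `handle_request`, `results` and `run_pool` are hypothetical names made up for illustration. It shows a fixed number of worker threads, requests waiting when all workers are busy, and a lock guarding the data the threads share.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# Hypothetical shared state: every worker thread sees the same dict,
# so updates must be guarded by a lock.
results = {}
results_lock = threading.Lock()


def handle_request(request_id):
    """Hypothetical job: pretend to serve one request."""
    answer = request_id * 2      # stand-in for real work
    with results_lock:           # synchronize access to shared data
        results[request_id] = answer
    return answer


def run_pool(request_ids, workers=3):
    """Serve all requests with at most `workers` threads.

    Requests beyond the pool size wait in the executor's internal
    queue until a worker becomes free again.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() hands results back in the order of the inputs.
        return list(pool.map(handle_request, request_ids))
```

Calling `run_pool(range(10))` pushes ten requests through three threads. A pool like this suits many short jobs; it is less obviously a fit for a handful of connections that are meant to stay up indefinitely.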
Thank you for your reply, I didn't know about pools. I have been reading about them and it looks like I could use one to do most of what I wanted. I can manage creation and destruction easily (or so it seems) and also pass some parameters when a thread is started (though I don't need any inter-process communication beyond that startup data).
As you say, it comes at some cost. I have read that it doesn't work well if threads have a long lifetime, and there is also a limit on the number of threads running at the same time. I would need the threadpool library too (or boost, but that's just too big, I couldn't use it).
I will start getting into the details of pools and try to use them in my code, but I will keep looking for alternatives :)
Of course, there are variations in pool implementations. Some fine-tune it so that if, say, no request comes in for X minutes, some idle threads in the pool are killed to reduce memory consumption, database resources etc.
I wouldn't re-invent pools, as I believe there are Open Source libraries for that already. Why not just re-use them? If boost's threadpool library is too big, as you say, then try to hunt for other threadpool libraries with a smaller footprint.
Yes, you are right. I just didn't want to use any external library; I need to stick to the basic standard libraries because I need very portable code. I would need to run it on YDL, Debian, Ubuntu, or even on mobile devices.
One crazy and maybe stupid idea that came to my mind... what if I forked every thread? That way I would get an exact copy of the thread (so I would be able to initialize variables and pass some data from main at the beginning), and I would also get a pid to manage plus the advantages of forking (memory, independence, etc.).
Hi, I spent quite a lot of time working on this and reading about what you recommended, but in the end I was able to do it using some not-too-weird forking :) Thank you
So forking means you spawn processes instead of spawning threads within a single process? Just need to take note that spawning processes is much more expensive than spawning threads. The advantage of the process approach is isolation: a misbehaving thread can crash the whole process, but a misbehaving process only takes itself down. A trade-off, I would think.
Yes, my program shouldn't crash if one of the processes crashes (which is likely to happen) or sleeps. They need to be independent from each other, so that's why I went for fork(). I should have no more than 10 or 15 children running at the same time, so I guess that's not too many. Regards!
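For readers wondering what such a fork()-based supervisor might look like, here is a rough sketch. It is not the poster's actual code, and it is written in Python only for brevity: `os.fork`, `os.waitpid` and `os.kill` are thin wrappers over the POSIX calls of the same names, so the structure should carry over to C. The names `spawn`, `is_alive`, `supervise`, `stop_all` and the `worker` callback are hypothetical. Two concerns from the original question are covered: the child inherits a copy of the parent's variables, so the per-IP parameters need no extra passing mechanism, and polling with `waitpid(..., WNOHANG)` reaps dead children so they do not linger as zombies.

```python
import os
import signal
import time


def spawn(target, worker):
    """Fork one child that runs worker(target); return its pid to the parent."""
    pid = os.fork()
    if pid == 0:
        # Child: it starts as a copy of the parent, so `target`
        # (the IP and its settings) is already available here.
        status = 1
        try:
            worker(target)
            status = 0
        finally:
            os._exit(status)  # never fall back into the parent's code
    return pid


def is_alive(pid):
    """Non-blocking check; reaps the child if it has exited (no zombie left)."""
    try:
        done, _status = os.waitpid(pid, os.WNOHANG)
    except ChildProcessError:
        return False  # not our child any more: already reaped
    return done == 0


def supervise(targets, worker, interval=20, rounds=None):
    """The "big loop": every `interval` seconds restart children that died.

    Runs forever when `rounds` is None; a finite `rounds` is only for demos.
    Returns the current {target: pid} map and the number of restarts.
    """
    children = {t: spawn(t, worker) for t in targets}
    restarts = 0
    completed = 0
    while rounds is None or completed < rounds:
        time.sleep(interval)
        for target, pid in list(children.items()):
            if not is_alive(pid):
                children[target] = spawn(target, worker)
                restarts += 1
        completed += 1
    return children, restarts


def stop_all(children):
    """Terminate and reap every child in the {target: pid} map."""
    for pid in children.values():
        try:
            os.kill(pid, signal.SIGTERM)
            os.waitpid(pid, 0)
        except (ProcessLookupError, ChildProcessError):
            pass  # already gone
```

A daemon would call something like `supervise(ip_list, run_connection)`, where `run_connection` stands for whatever keeps one connection alive. With only 10 to 15 children, polling each pid every 20 seconds is cheap.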
Just need to take note that spawning processes is much more expensive than spawning threads.
That isn't quite true nowadays. Any reasonable Un*x OS uses copy-on-write (COW) semantics, meaning fork() does not copy the parent's memory up front, so the fork itself is nearly as cheap as creating a thread, and can even be faster than the old LinuxThreads pthread implementation (prior to NPTL).
The real difference is in sharing data. Processes don't share memory space/data structures; threads do.
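A small sketch of that difference, in Python only for brevity (`os.fork` wraps the POSIX call directly). The names `counter`, `bump`, `bump_in_thread` and `bump_in_child` are hypothetical. The same update is made once from a thread and once from a forked child, and only the thread's change is visible to the parent afterwards.

```python
import os
import threading

# Hypothetical piece of state owned by the parent process.
counter = {"value": 0}


def bump():
    counter["value"] += 1


def bump_in_thread():
    """Run bump() in a thread: it shares the parent's memory."""
    t = threading.Thread(target=bump)
    t.start()
    t.join()
    return counter["value"]


def bump_in_child():
    """Run bump() in a forked child: it only changes the child's own copy."""
    pid = os.fork()
    if pid == 0:
        bump()       # modifies the child's private (copy-on-write) memory
        os._exit(0)  # leave without running any of the parent's code
    os.waitpid(pid, 0)  # reap the child
    return counter["value"]
```

Starting from a fresh `counter`, `bump_in_child()` leaves the parent's value unchanged while `bump_in_thread()` increments it. Getting a result back from a child therefore needs an explicit channel such as a pipe, which is the price of the isolation.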