MultiTasking

Hi All,

I'm new to programming in C++, so please bear with me. Here's the problem: I have a list/vector of external jobs/programs I need to run. Some of them take 10+ minutes to complete, and I have ~1000 of them to run. However, I have 12 CPUs to work with, so I would like to write a program that keeps 12 jobs running continuously until all of them are complete. I don't know how to even start...

I know how to run all the jobs at once (but this would crash the computer), and I also know how to run 1 job at a time (but this takes too long).

Here is what would work to run 1 job at a time:

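#include <cstdlib>
#include <string>
#include <vector>
using namespace std;

// Defined elsewhere: returns the names of the files found in the directory
vector<string> getfilenames(const string& dir);

int main() {
    // Location of the files to be run
    string fileLocation = "filesAreHere";
    // Build a vector of file names by calling getfilenames
    vector<string> files = getfilenames(fileLocation);

    // Execute the files one at a time, visiting every entry in the vector
    for (size_t i = 0; i < files.size(); ++i) {
        system(files[i].c_str());  // system() expects a const char*
    }
    return 0;
}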

Thanks for any help,
Mike
The first thing we need to do before we even approach this step is identify any potential race conditions and exclusive file locks. Do you know for a fact that each of these processes works with a different set of files, and that one does not have to wait for another to finish before it starts?

Don't use "system(...)" for this task; it's a bad design choice. Which OS are you running on? We can show you how to use the API to start up new processes.
First, thanks for the fast reply. I'm working on a Unix (FC 14) system. I know for a fact that none of the processes need information from the other processes. I'm really just trying to execute one program with a different input file over and over. The vector just contains the different file names. Here is the command:

./rungms filename[1]
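For reference, since the jobs are independent and the system is Unix, here is a minimal sketch of how a C++ program could keep a fixed number of them running with the POSIX calls fork(), execlp() and wait(). The function name runPool and its parameters are made up for illustration; it assumes each job is started as "program input", as in the command above.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#include <cerrno>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: runs `program <input>` for every entry in `inputs`,
// keeping at most `maxJobs` child processes alive at any moment.
// Returns the number of jobs that exited with status 0.
int runPool(const std::string& program,
            const std::vector<std::string>& inputs,
            std::size_t maxJobs) {
    std::size_t next = 0;     // index of the next input to launch
    std::size_t running = 0;  // number of children currently alive
    int succeeded = 0;

    while (next < inputs.size() || running > 0) {
        // Fill every free slot with a new child process.
        while (running < maxJobs && next < inputs.size()) {
            pid_t pid = fork();
            if (pid == 0) {
                // Child: replace this process image with the job.
                execlp(program.c_str(), program.c_str(),
                       inputs[next].c_str(), static_cast<char*>(nullptr));
                _exit(127);  // reached only if exec failed
            }
            if (pid < 0) break;  // fork failed: wait for a child, then retry
            ++running;
            ++next;
        }
        if (running == 0) break;  // nothing could be started: give up

        // Block until any one child finishes, which frees a slot.
        int status = 0;
        pid_t done = wait(&status);
        if (done < 0) {
            if (errno == EINTR) continue;  // interrupted by a signal: retry
            break;                         // unexpected error: stop waiting
        }
        --running;
        if (WIFEXITED(status) && WEXITSTATUS(status) == 0) ++succeeded;
    }
    return succeeded;
}
```

With the vector from the first post, the call would look something like runPool("./rungms", files, 12). A new job starts only when wait() reports that an earlier one has finished, so no shared file or sleep is needed to hand out the work.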
It seems to me that a simple shell script that starts the processes in the background would be much simpler for you than writing it in C or C++... especially since the overhead of starting a new process in C or C++ is not for the faint-hearted or beginners...
I just knew as soon as I hit submit that you would come back and say you're working on *Nix... I just installed Slackware a few days ago myself, so I would probably do more harm than good with any suggestions I might have, other than to try the Linux/Unix section of this forum. Sorry.
Ok, thanks anyway.

I did write a C shell script to do this. It works, but I get errors every now and then: the scripts sometimes try to run the same file because they start at the same time. I'm sure there is a better way to do this... I tried adding "sleep" statements, but I still sometimes get errors.

My script creates and executes 12 other scripts, each of which reads a file to get a file name and then deletes that entry. As mentioned, this can cause problems. Again, I'm sure there is a better way. I thought C++ might be a good choice since I would have to execute the script from a C++ program anyway (running the files is just one part of what I'm trying to do).

Mike
Now I'm confused. Post your exact program requirements.
Below is my script that runs all the files in the "files-to-be-run" directory. First, the file names are written to a file called "File_names". Then "PCUs" is set to the number of jobs to run at once. Next, "PCUs" scripts are created and executed. Each script reads and then deletes the first line of the "File_names" file. This crazy scheme works, but sometimes two scripts read the same first line, which causes 2 lines to be deleted from "File_names" and the same file to be executed twice.


#! /bin/csh
# Script to generate GAMESS inputs
### Directory Location of Files to be Run (DL) ###
set DL = files-to-be-run/
# Create File of Files Names to be Run
ls -1 $DL >! File_names
### Number of Files to Run at Once (PCUs) ###
set PCUs = 12
set XX = 1
while ($XX <= $PCUs)
# Create runPX.s Files
echo '#! /bin/csh' >! .runP$XX.s
echo 'set AA=99' >> .runP$XX.s
echo 'while ($AA > 0)' >> .runP$XX.s
echo 'set FN=`head -1 File_names | sed '\''s/....$//'\''`' >> .runP$XX.s
echo 'sed -i '1d' File_names' >> .runP$XX.s
echo 'echo $FN' >> .runP$XX.s
echo 'rungmsFS2 '$DL'$FN >& '$DL'$FN.log' >> .runP$XX.s
echo 'rm '$DL'$FN.dat' >> .runP$XX.s
echo 'sleep 1' >> .runP$XX.s
echo 'set AA=`wc -l File_names | cut -d " " -f1` ' >> .runP$XX.s
echo end >> .runP$XX.s
echo exit >> .runP$XX.s
chmod +x .runP$XX.s
.runP$XX.s &
sleep 2
@ XX = ($XX + 1)
end
exit
Ah, your problem is exactly what Computergeek01 indicated: you have a race condition on the input file.

See http://tipstrickshowtos.blogspot.com/2010/02/parallel-programming-in-linux-shell.html for how to create a lock using the shell. Each sub-script should only modify your "File_names" file while it has the lock.
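Since you mentioned wanting to do this from C++ anyway, here is a rough sketch of the same idea using the flock() system call. It is not taken from the linked page. The function name popFirstLine and the ".lock" file suffix are made up for illustration, and it assumes every worker goes through this function instead of calling head and sed on the list directly.

```cpp
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: removes the first line of `listPath` and stores it in
// `line`, holding an exclusive lock for the whole read-modify-write step.
// Returns false when the list is empty or the lock cannot be taken.
bool popFirstLine(const std::string& listPath, std::string& line) {
    // A separate lock file is used so the list itself can be rewritten freely.
    const std::string lockPath = listPath + ".lock";
    int fd = open(lockPath.c_str(), O_CREAT | O_RDWR, 0644);
    if (fd < 0) return false;
    if (flock(fd, LOCK_EX) != 0) {  // blocks until no other worker holds it
        close(fd);
        return false;
    }

    // Critical section: only one worker at a time can be in here.
    std::vector<std::string> rest;
    bool found = false;
    {
        std::ifstream in(listPath);
        std::string current;
        while (std::getline(in, current)) {
            if (!found) {
                line = current;  // the entry this worker claims
                found = true;
            } else {
                rest.push_back(current);
            }
        }
    }
    if (found) {
        // Write back everything except the claimed first line.
        std::ofstream out(listPath, std::ios::trunc);
        for (const std::string& r : rest) out << r << '\n';
    }

    flock(fd, LOCK_UN);
    close(fd);
    return found;
}
```

Each worker would loop on popFirstLine("File_names", name) and run the job it gets back until the function returns false. Because reading the first line and deleting it happen under one lock, two workers can no longer claim the same entry.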

Good luck!
Thanks, I think I got it working now.