Shared Memory

Hi all,

I am trying to build a model where two processes write to a common file, so I was wondering which method would be best (in terms of speed).

1. Semaphores
2. Message Queues
3. Memory mapped files.

P.S.: The files would be very large, say several gigabytes. Can someone please shed some light on this?
When two processes write to a common file, I think you may need a mutex lock.
Good luck.
Is it a log file or a binary file? Is the file only written to by the processes, or do they read from it as well?

There are different ways to solve this problem; the best solution will depend on the read/write access patterns of the processes and whether they live on the same machine.

Shared memory won't work for files that size if you are using a 32-bit system: the largest application process size is something like 2 or 3 GB (depending on the OS). The rest of the address space is given over to the OS.
It doesn't matter; I can make it either a log file or a binary file. And the processes only write to the file. There won't be any reading.

It's something like this: each process does some computation and dumps a line into the file. The computation time may vary, so it's not strict one-by-one writing. And yes, they will reside on the same machine.

Because of the limitation of physical memory, I was thinking memory-mapped files would be OK, but I am still not clear. Can someone shed more light on this?
In reference to this, I created a small example from a UNIX programming book.

/* writes a random number into the file */
#include <stdlib.h>
#include <stdio.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <time.h>
#include <unistd.h>
#define FILE_LENGTH 0x100

/* Return a pseudo-random integer in [low, high]. Signed parameters are
   needed because the caller passes a negative lower bound. */
int random_range(int low, int high)
{
	int range = high - low + 1;
	return low + (int) (((double) range) * rand() / (RAND_MAX + 1.0));
}

int main(int argc, char *const argv[])
{
	int fd;
	void *file_memory;

	if (argc < 2) {
		fprintf(stderr, "usage: %s FILE\n", argv[0]);
		return 1;
	}

	/* Seed the random number generator */
	srand((unsigned) time(NULL));

	/* Prepare a file large enough to back the whole mapping */
	fd = open(argv[1], O_RDWR | O_CREAT, S_IRUSR | S_IWUSR);
	if (fd < 0 || ftruncate(fd, FILE_LENGTH) < 0) {
		perror(argv[1]);
		return 1;
	}

	/* Create the memory mapping */
	file_memory = mmap(0, FILE_LENGTH, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	close(fd);
	if (file_memory == MAP_FAILED) {
		perror("mmap");
		return 1;
	}

	/* Write a random integer to the memory-mapped area. */
	sprintf((char *) file_memory, "%d\n", random_range(-100, 100));

	/* Release the memory (unnecessary because the program exits). */
	munmap(file_memory, FILE_LENGTH);

	return 0;
}


/* reads the file and stores twice the value read back into the file */
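#include <stdlib.h>
#include <stdio.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>
#define FILE_LENGTH 0x100

int main(int argc, char *const argv[])
{
	int fd, integer;
	void *file_memory;

	if (argc < 2) {
		fprintf(stderr, "usage: %s FILE\n", argv[0]);
		return 1;
	}

	/* Open the file prepared by the first program */
	fd = open(argv[1], O_RDWR);
	if (fd < 0) {
		perror(argv[1]);
		return 1;
	}

	/* Create the memory mapping */
	file_memory = mmap(0, FILE_LENGTH, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	close(fd);
	if (file_memory == MAP_FAILED) {
		perror("mmap");
		return 1;
	}

	/* Read the integer from the memory-mapped area. This must be sscanf:
	   the original call was scanf, which reads stdin and treats the
	   mapping as a format string, so the value was never parsed. */
	if (sscanf((char *) file_memory, "%d", &integer) != 1) {
		fprintf(stderr, "no integer found in %s\n", argv[1]);
		munmap(file_memory, FILE_LENGTH);
		return 1;
	}
	printf("value: %d\n", integer);
	sprintf((char *) file_memory, "%d\n", 2 * integer);

	/* Release the memory (unnecessary because the program exits). */
	munmap(file_memory, FILE_LENGTH);

	return 0;
}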


Firstly, I don't get why it is not behaving properly. I am not getting the desired result.
Secondly, is going in a similar direction best suited to my needs?

I am very new to this, so I don't have much idea about it. Thanks
Why are you using a memory-mapped file? Surely writing a record in a block protected by a mutex should be sufficient.
Since you are just writing (appending) to a file, I recommend creating one thread/process (P0) that collects results from a message queue (MQ) and writes them to the file, and multiple threads/processes (P1..Pn) that do the computation and send their results as messages (m) to the message queue for the file writer.

P1-m-\
P2-m--MQ-->m--P0-->file
Pn-m-/
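
A minimal sketch of the layout in the diagram above, assuming System V message queues (msgget/msgsnd/msgrcv) and workers started with fork; the post does not say which queue API was meant. The names run_single_writer, worker, line_msg, MSG_LINE and MSG_DONE are made up for illustration. Each worker posts one message per result and a final "done" marker; only P0 ever touches the file, so the file itself needs no lock.

```c
#include <stdio.h>
#include <string.h>
#include <sys/ipc.h>
#include <sys/msg.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define MSG_LINE 1L   /* payload is one line of output */
#define MSG_DONE 2L   /* worker has finished */
#define TEXT_MAX 128

/* Layout expected by msgsnd/msgrcv: a long type followed by the payload. */
struct line_msg {
	long mtype;
	char text[TEXT_MAX];
};

/* Worker (P1..Pn): post each result as one message, then a DONE marker. */
static void worker(int mq, int id, int nlines)
{
	struct line_msg m;
	int i;

	m.mtype = MSG_LINE;
	for (i = 0; i < nlines; i++) {
		snprintf(m.text, sizeof m.text, "worker %d result %d\n", id, i);
		if (msgsnd(mq, &m, strlen(m.text) + 1, 0) < 0)
			break;
	}
	m.mtype = MSG_DONE;
	m.text[0] = '\0';
	msgsnd(mq, &m, 1, 0);
	_exit(0);
}

/*
 * Writer (P0): start the workers, then drain the queue into `path`.
 * Returns the number of lines written, or -1 on error.
 */
int run_single_writer(const char *path, int nworkers, int nlines)
{
	struct line_msg m;
	int mq, i, started = 0, done = 0, written = 0;
	FILE *out;

	mq = msgget(IPC_PRIVATE, IPC_CREAT | 0600);
	if (mq < 0)
		return -1;

	for (i = 0; i < nworkers; i++) {
		pid_t pid = fork();
		if (pid == 0)
			worker(mq, i, nlines);       /* never returns */
		if (pid > 0)
			started++;
	}

	out = fopen(path, "w");
	if (out == NULL) {
		written = -1;
	} else {
		/* Only this process touches the file, so no lock is needed. */
		while (done < started) {
			if (msgrcv(mq, &m, sizeof m.text, 0, 0) < 0)
				break;
			if (m.mtype == MSG_DONE) {
				done++;
			} else {
				fputs(m.text, out);
				written++;
			}
		}
		fclose(out);
	}

	msgctl(mq, IPC_RMID, NULL);          /* remove the queue */
	while (wait(NULL) > 0)
		;                                /* reap the workers */
	return written;
}
```

Calling run_single_writer("results.log", 3, 50) from main should leave 150 lines in results.log. Lines from different workers interleave, but each line arrives whole because a message is delivered as a unit. For writers that live in separate programs, the queue would need a shared key (for example from ftok) instead of IPC_PRIVATE.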
@kbw Can you please elaborate on that? Since I have very large files to write, I thought memory-mapped files would be more efficient than the others. Isn't that so? Or am I missing something here?

@PanGalactic Are message queues faster than memory-mapped files? I don't know much about either. Can you please let me know why you think an MQ would be a better choice? Speed matters a lot to me. I have long simulations, and even a small gain in speed will help me a lot.

So, I want to choose the one that will be fastest. Thank you.
@PanGalactic's suggestion is a standard solution and is generally better. But mine considers the case where the writers are not in the same program. If you wanted to do it properly, you'd use the thread hop as described by @PanGalactic, and use the mutex in the writer thread if necessary.

Pushing messages on a queue is fast and frees up the worker threads, rather than have them block, waiting on some slow i/o to complete.
Well, in my case the writers are in different programs. Can I still use @PanGalactic's suggestion?
Yes, and I think you should. But you'll still need a mutex in the writer thread to sync between programs.
Thanks kbw. Let me know if you have any good links for learning about message queues and the other things required above. Thanks once again.
Can someone tell me how to make a mutex that works across processes?
Thanks kbw. But I need that mutex variable to be in shared memory so that the different processes have access to it, don't I?

And secondly, I read at

http://stackoverflow.com/questions/1428117/linux-ipc-multiple-writers-single-reader

that MQs are inherently synchronized. Is that the case? If it is, then I guess I don't need mutex locks. Can you please explain a bit?

Thanks
Sorry, that was a bad example. Unix provides semaphores for this sort of thing. A semaphore with a count of 1 is equivalent to a mutex. Create the semaphore with the same name across processes.

I think an MQ is overkill in this instance.
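
A minimal sketch of the semaphore-as-mutex idea, assuming POSIX named semaphores (sem_open); the post does not say which semaphore API was meant. The semaphore name "/shared_log_sem" and the function append_line_locked are hypothetical. Every program that uses the same name gets the same semaphore, and an initial count of 1 makes it behave like a mutex.

```c
#include <fcntl.h>
#include <semaphore.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical name; every cooperating process must use the same one. */
#define LOG_SEM_NAME "/shared_log_sem"

/*
 * Append one line to `path` while holding the named semaphore.
 * Returns 0 on success, -1 on error.
 */
int append_line_locked(const char *path, const char *line)
{
	sem_t *sem;
	int fd, rc = -1;

	/* Count 1: the first caller creates it, later callers just open it. */
	sem = sem_open(LOG_SEM_NAME, O_CREAT, 0600, 1);
	if (sem == SEM_FAILED)
		return -1;

	if (sem_wait(sem) == 0) {                /* lock */
		fd = open(path, O_WRONLY | O_CREAT | O_APPEND, 0600);
		if (fd >= 0) {
			size_t len = strlen(line);
			if (write(fd, line, len) == (ssize_t) len)
				rc = 0;
			close(fd);
		}
		sem_post(sem);                       /* unlock */
	}
	sem_close(sem);
	return rc;
}
```

A real program would open the semaphore and the file once at start-up rather than on every call; they are opened per call here only to keep the sketch short. One caveat: if a process dies between sem_wait and sem_post, the semaphore stays locked. On older glibc versions this may need to be linked with -pthread.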
Yeah, but the question is: if we go ahead with MQs, do we really need to do the synchronization manually? It seems MQs take care of that automatically.

Secondly, now I am back to square one. What is the best method to serve my purpose?
I am trying to build a model where two processes write to a common file.


Is there something reading from this file while the processes are writing to it? If not, you could perhaps go low tech:

#!/bin/sh
# Each process writes to its own file, so no locking is needed.
process0 > file.1 &
pid0=$!
process1 > file.2 &
pid1=$!
wait "$pid0"
wait "$pid1"
cat file.1 file.2 | sort > final_answer # e.g. if your lines have a timestamp

It all depends on your use case...

Otherwise... flock might be an alternative to the semaphore suggested by kbw, though it may be slower... or it may not be.
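
A minimal sketch of the flock alternative, assuming each writer process opens the file itself and calls a helper like the one below for every line. The name append_line_flock is made up for illustration. The lock is advisory, so it only helps if every writer takes it.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/file.h>
#include <unistd.h>

/*
 * Append `line` to the already-open descriptor `fd` under an exclusive
 * advisory lock. Returns 0 on success, -1 on error.
 */
int append_line_flock(int fd, const char *line)
{
	size_t len = strlen(line);
	int rc = 0;

	if (flock(fd, LOCK_EX) < 0)          /* blocks while another process holds it */
		return -1;
	/* Seek to the current end inside the lock so lines never overlap. */
	if (lseek(fd, 0, SEEK_END) < 0 || write(fd, line, len) != (ssize_t) len)
		rc = -1;
	flock(fd, LOCK_UN);
	return rc;
}
```

Each process would do something like fd = open("results.log", O_RDWR | O_CREAT, 0600) once and then call append_line_flock(fd, "some result\n") per line. The descriptor should come from the process's own open call: a descriptor inherited across fork shares the same lock, so it would not exclude the parent or its siblings.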
Last edited on
For this problem you can use a FIFO (named pipe).
That will be simple and good for 2 processes acting as consumer/producer.

I think that will be the easiest of the IPC mechanisms to use in this case. Other, more advanced mechanisms might not be needed here.
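
A minimal sketch of the FIFO approach, assuming one producer and one consumer as described above, with the producer started by fork to keep the example in one file. The names producer and fifo_to_file are made up for illustration; in practice the two sides could be separate programs that agree on the FIFO path.

```c
#include <fcntl.h>
#include <signal.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Producer: open the FIFO for writing and send nlines short lines. */
static void producer(const char *fifo_path, int nlines)
{
	char line[64];
	int i, fd = open(fifo_path, O_WRONLY);   /* blocks until a reader opens */

	if (fd < 0)
		_exit(1);
	for (i = 0; i < nlines; i++) {
		int len = snprintf(line, sizeof line, "result %d\n", i);
		if (write(fd, line, (size_t) len) != len)
			_exit(1);
	}
	close(fd);
	_exit(0);
}

/*
 * Consumer: create the FIFO, start the producer, and copy everything
 * that arrives into out_path. Returns bytes copied, or -1 on error.
 */
long fifo_to_file(const char *fifo_path, const char *out_path, int nlines)
{
	char buf[4096];
	long total = 0;
	ssize_t n;
	int in, out;
	pid_t pid;

	unlink(fifo_path);                       /* drop a stale FIFO, if any */
	if (mkfifo(fifo_path, 0600) < 0)
		return -1;
	pid = fork();
	if (pid == 0)
		producer(fifo_path, nlines);         /* never returns */
	if (pid < 0) {
		unlink(fifo_path);
		return -1;
	}

	in = open(fifo_path, O_RDONLY);          /* blocks until the producer opens */
	out = open(out_path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
	if (in < 0 || out < 0)
		total = -1;

	/* read returns 0 (end of file) once the producer closes its end. */
	while (total >= 0 && (n = read(in, buf, sizeof buf)) > 0) {
		if (write(out, buf, (size_t) n) != n)
			total = -1;
		else
			total += n;
	}

	if (in >= 0)
		close(in);
	if (out >= 0)
		close(out);
	if (total < 0)
		kill(pid, SIGTERM);                  /* do not leave the producer blocked */
	waitpid(pid, NULL, 0);
	unlink(fifo_path);
	return total;
}
```

Calling fifo_to_file("/tmp/results.fifo", "results.log", 100) should copy 100 lines into results.log. Writes of at most PIPE_BUF bytes to a FIFO are atomic, which is why short lines from more than one producer would not be interleaved mid-line; with several producers the consumer also has to cope with one producer closing before another has opened the FIFO, which this sketch does not attempt.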