Overlapped IO back reading

Oct 12, 2010 at 1:28pm
Hi,

I've been working on a program which uses overlapped IO for speed, and reads a buffer of about half a MB at a time and checks for records. To avoid the index going out of bounds it doesn't check the last 200 bytes of each block.
I've been trying to change the overlapped structure so that, after reading the first block, the next read starts 200 bytes early and includes the last 200 bytes of the first block. However, it doesn't seem to like this.
My code currently works, and I have a function that advances the offset in the overlapped structure by the number of bytes read each time, i.e. moving on to the next block. I then tried advancing it by the block size minus 200, hoping the next read would include the last 200 bytes of the previous block, but this fails with an error. Is there some flag I need to set to allow back reading?
I can provide some of the code if you want.

Thanks in advance

Oct 12, 2010 at 2:11pm
minime wrote:
I can provide some of the code if you want.
just if you want some help...
Oct 12, 2010 at 2:48pm
To avoid the index going out of bounds it doesn't check the last 200 bytes of each block.
I don't understand what this means.

It's fairly straightforward.
1. You reset the event you placed in the overlapped struct.
2. You set the read offset in the overlapped struct.
3. You say how much you want to read when you call ReadFile. So you should pass in a buffer large enough at that point and not touch it until the event is signalled.
4. Check if the read has completed by checking if the event is signalled. If not, you can go off and do some work. Repeat until signalled.
5. GetOverlappedResult tells you how much was actually read.
Oct 13, 2010 at 7:54am
else
		{
			vDiskPos += FILE_BUFFER;
			//not first read so switch buffers if done
			dwError = GetLastError();
			bResult = GetOverlappedResult(hfile,
										  &stOverlapped,
										  &byteRead,
										  FALSE) ;
			if (!bResult)
			{
				switch (dwError = GetLastError()) 
				{ 
					case ERROR_HANDLE_EOF: 
					{ 
						// Handle an end of file
						return 0;
						break;
					} 
					case ERROR_IO_INCOMPLETE:
					{
						// Operation is still pending, allow while loop
						// to loop again after printing a little progress.
						do
						{
							bResult = GetOverlappedResult(hfile,
														 &stOverlapped,
														 &byteRead,
														 FALSE) ;
							if (!bResult)
							{
								Sleep(15);
							}
							check++;
						}while ((!bResult) && (check < 300));
						if (check == 300)
						{
							dwError = GetLastError();
							printf("Error occured reading disk before end of disk, error: %d\n",dwError);
							return 0;
						}
						byteRead -= 200;
						_UpdateOffset(&stOverlapped, byteRead);
						break;
					}

					default:
					{
						//an error occured
						return 0;
					}
				}
			}
			//all good now switch buffers
			std::swap(fileBuffer, fileBufferNext);
			//start next read
			byteRead -= 200;
			_UpdateOffset(&stOverlapped, byteRead);
			ReadFile(hfile, fileBufferNext, FILE_BUFFER, &byteRead, &stOverlapped); 
			return 1;
		}
void _UpdateOffset(OVERLAPPED* inPtr, __int64 inOffset) {
	__int64 tempOffset;
	tempOffset   = inPtr->OffsetHigh;
	tempOffset <<= 32;
	tempOffset  |= inPtr->Offset;

	tempOffset += inOffset;

	inPtr->Offset     = (DWORD)(tempOffset & 0x00000000FFFFFFFFULL);
	inPtr->OffsetHigh = (DWORD)(tempOffset >> 32ULL);
}





kbw - The program looks for records in a file/physical disk.
The records are usually less than 200 bytes.
To avoid a record running over into the next buffer, I don't start looking for records in the last 200 bytes, although any record that starts before that point is still followed to its end.
I then update the overlapped structure to point the read at the next block, using the number of bytes read. However, I need the blocks to overlap so that they include those 200 bytes, so instead of advancing the read by half a MB, I want to advance it by half a MB minus 200 bytes.
It works for half a MB; when I tell it to advance by 200 bytes less, it errors.
Hope this is clearer.
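
As a rough illustration of the layout being described (not the poster's actual code), the sketch below lists the start offsets of consecutive reads when each block must begin 200 bytes before the end of the previous one. The names `blockStarts`, `splitOffset`, and `OffsetParts` are made up for the example; `OffsetParts` merely mimics the `Offset`/`OffsetHigh` pair of an `OVERLAPPED`, and no Win32 call is involved.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the Offset / OffsetHigh fields of OVERLAPPED.
struct OffsetParts {
    std::uint32_t low;
    std::uint32_t high;
};

// Split a 64-bit file position into its low and high 32-bit halves.
OffsetParts splitOffset(std::uint64_t position) {
    return { static_cast<std::uint32_t>(position & 0xFFFFFFFFu),
             static_cast<std::uint32_t>(position >> 32) };
}

// Start offsets of every read needed to cover totalSize bytes when each
// block is blockSize bytes long and shares `overlap` bytes with the previous one.
std::vector<std::uint64_t> blockStarts(std::uint64_t totalSize,
                                       std::uint64_t blockSize,
                                       std::uint64_t overlap) {
    assert(overlap < blockSize);  // otherwise the reads would never advance
    const std::uint64_t stride = blockSize - overlap;
    std::vector<std::uint64_t> starts;
    for (std::uint64_t pos = 0; pos < totalSize; pos += stride) {
        starts.push_back(pos);
        if (pos + blockSize >= totalSize) break;  // this block reaches the end
    }
    return starts;
}
```

With a block size of 1000 and an overlap of 200, the reads advance by 800 bytes each time, so every block repeats the final 200 bytes of the one before it.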
Oct 13, 2010 at 8:39am
Are you using two OVERLAPPED structures to manage the reads from the two buffers? You need to, but I can't see that you are.

You can make your completion check more efficient by waiting on the event in the OVERLAPPED structure rather than calling GetOverlappedResult repeatedly. Also, if you check more than 300 times, you assume the read has completed and carry on as if it had. What if it hasn't?
case ERROR_IO_INCOMPLETE:
    while (true)
    {
        DWORD dwResult = WaitForSingleObject(stOverlapped.hEvent, 15);  // 15ms timeout
        if (dwResult == WAIT_OBJECT_0)
        {
            // the read has completed, get the results
            // byteRead and bResult are the variables from the enclosing code,
            // so the result is still visible after the switch.
            // FALSE would also work as the 4th parameter; the read has completed, so it doesn't matter.
            bResult = GetOverlappedResult(hfile, &stOverlapped, &byteRead, TRUE);
            byteRead -= 200;
            _UpdateOffset(&stOverlapped, byteRead);
            break;
        }
        else if (dwResult == WAIT_TIMEOUT)
        {
            // update your progress indicator
        }
    }
    break;


How do you find your records? Are they just bit patterns in the file? What are they? Without understanding that I can't comment on the comparison.
Last edited on Oct 13, 2010 at 8:41am
Oct 13, 2010 at 8:57am
No, I am using a single overlapped structure and switching the buffers round each time so that the overlapped structure always refers to the same buffer name, even though the buffers are switched round.

The timeout was set due to a bug with a disk emulator, which mounts a disk image. This seems to lock every now and again and never return the data, so this check was set so that if that happened it would escape after 300 attempts and then the program could handle this.

The records are found by looking for bit patterns and/or keywords. They are (deleted) internet history records from an SQLite database.

Also, when I try to read from the file after subtracting 200 from the byteRead variable, I get error 996 - Overlapped I/O event is not in a signaled state.

Thanks for the help so far
Oct 13, 2010 at 9:46am
You can't use a single overlapped structure. Think about it. It holds the state of one in-flight read, so you must use a different structure for each parallel read.

The lock you experience could be due to corruption of your overlapped structure.

You could read an extra two hundred bytes into your buffers, so they'd be FILE_BUFFER+200 big, and you'd read FILE_BUFFER+200 bytes each time, but still advance the offset in FILE_BUFFER increments. That way, you'll always have the extra 200 bytes to do your comparison in.

It won't cost any extra I/O because Windows will still have that data cached.
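
A minimal sketch of that idea, using an in-memory byte string in place of the disk so it doesn't depend on the Win32 calls. `findRecords`, the `block`/`tail` parameters, and the marker-based notion of a "record" are assumptions made purely for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Scan `data` in windows of block + tail bytes, advancing by `block` each time.
// A marker is reported only if it STARTS in the first `block` bytes of the
// window, so the tail supplies look-ahead without producing duplicate hits.
std::vector<std::size_t> findRecords(const std::string& data,
                                     const std::string& marker,
                                     std::size_t block,
                                     std::size_t tail) {
    assert(block > 0 && !marker.empty() && marker.size() <= tail + 1);
    std::vector<std::size_t> hits;
    for (std::size_t start = 0; start < data.size(); start += block) {
        // "Read" block + tail bytes (fewer at the end of the data).
        const std::size_t len = std::min(block + tail, data.size() - start);
        const std::string window = data.substr(start, len);
        std::size_t pos = window.find(marker);
        while (pos != std::string::npos && pos < block) {
            hits.push_back(start + pos);
            pos = window.find(marker, pos + 1);
        }
    }
    return hits;
}
```

A marker that straddles a block boundary is still found once, because the window it starts in extends `tail` bytes past the boundary, while the next window skips it since it no longer starts there.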
Last edited on Oct 13, 2010 at 9:48am
Oct 13, 2010 at 10:03am
So basically have 2 overlapped structures, if we say the block size was 100 bytes for ease, then set the first one to 0 and the second to 100 and then add 200 to each after each read so they cover all the blocks?

I'm going to try with the extra 200 bytes on file buffer and see how that works

Thanks for the help!
Oct 13, 2010 at 1:04pm
With regard to the blocks, say FILE_BUFFER is 1000.

You declare the buffers to be FILE_BUFFER+200 large.
You call ReadFile as:
ReadFile(hfile, fileBufferNext, FILE_BUFFER + 200, &byteRead, &stOverlapped);

But still read in FILE_BUFFER increments:
    byteRead -= 200;
    _UpdateOffset(&stOverlapped, byteRead);



With regards to the overlapped structures, you can handle them in the same way you handle the buffers:
    unsigned char buffer1[FILE_BUFFER+200];
    unsigned char buffer2[FILE_BUFFER+200];
    OVERLAPPED overlapped1 = { 0, 0, 0, 0, CreateEvent(0, FALSE, FALSE, 0) };
    OVERLAPPED overlapped2 = { 0, 0, 0, 0, CreateEvent(0, FALSE, FALSE, 0) };

    unsigned char* currentptr = buffer1;
    unsigned char* nextptr = buffer2;
    OVERLAPPED* currentoverlapped = &overlapped1;
    OVERLAPPED* nextoverlapped = &overlapped2;


When you swap the buffers, you swap the overlapped structure pointers. A read might look like:
ReadFile(hfile, nextptr , FILE_BUFFER + 200, &byteRead, nextoverlapped);
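
One way to keep a buffer and its overlapped state from drifting apart is to bundle them and swap a single pointer. The sketch below is plain C++ so it compiles anywhere; `ReadSlot`, its `offset` and `pending` fields, and `DoubleBuffer` are hypothetical stand-ins, with `offset` and `pending` playing the role of the real `OVERLAPPED`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical bundle: one buffer plus the state of the read that fills it.
struct ReadSlot {
    std::vector<unsigned char> buffer;
    std::uint64_t offset = 0;   // file position of the read into this buffer
    bool pending = false;       // true while a read into this buffer is in flight

    explicit ReadSlot(std::size_t size) : buffer(size) {}
};

struct DoubleBuffer {
    ReadSlot a;
    ReadSlot b;
    ReadSlot* current = &a;  // slot being scanned
    ReadSlot* next = &b;     // slot being filled

    explicit DoubleBuffer(std::size_t size) : a(size), b(size) {}

    // The pointers refer to this object's own members, so copying is disabled.
    DoubleBuffer(const DoubleBuffer&) = delete;
    DoubleBuffer& operator=(const DoubleBuffer&) = delete;

    // Hand the freshly filled slot to the scanner and reuse the old one.
    void swapSlots() {
        assert(!next->pending);  // never swap while the read is still running
        std::swap(current, next);
    }
};
```

Because the buffer and its state travel together, a single swap cannot leave one buffer paired with the other read's bookkeeping.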
Last edited on Oct 13, 2010 at 3:14pm
Oct 14, 2010 at 7:06am
Got the two overlapped structures set up correctly now, however this line of code:

 
ReadFile(hfile, fileBufferNext, FILE_BUFFER + 200, &byteRead, &stOverlapped);

brings back error 87, which I looked up and found to be "Invalid parameter". I have also tried creating a DWORD named readSize, which I set to FILE_BUFFER + 200. This produces the same result.
Oct 14, 2010 at 7:57am
You're still passing in stOverlapped. I thought you might be passing in a pointer to an OVERLAPPED that is swapped at the same time you swap buffers.
Oct 14, 2010 at 8:00am
Yes, but this happens at the first read, before any swapping of buffers. I am actually using the currentOverlapped pointer as you put it above, with the same result.
Oct 14, 2010 at 10:47am
Double check the initialisation of OVERLAPPED, I haven't compiled any of the demo code I've posted, it's more a brain dump.
Oct 14, 2010 at 1:51pm
After all that, it didn't like the number of bytes I wanted to read! I changed it and it seems to work fine.

Thanks for all the help!
Oct 14, 2010 at 6:18pm
That doesn't make sense.
Oct 15, 2010 at 9:06am
No, it doesn't really. I did, however, change the overlap from 200 to 1024, which seemed to fix it. Maybe it likes powers of 2!
Oct 15, 2010 at 9:23am
That's interesting. Thanks for the update.
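
A possible explanation, offered as an assumption since the thread never shows how the handle was opened: when a file is opened with FILE_FLAG_NO_BUFFERING, or the handle refers to a raw volume or physical disk, Windows requires read sizes and file offsets to be whole multiples of the sector size, and ReadFile fails with error 87 otherwise. With 512-byte sectors, an overlap of 200 breaks that rule and 1024 satisfies it. The helpers below are a sketch only: `roundUpToSector` and `isSectorAligned` are made-up names, and the sector size is whatever the caller passes in.

```cpp
#include <cassert>
#include <cstdint>

// Round `bytes` up to the next multiple of `sectorSize`.
std::uint64_t roundUpToSector(std::uint64_t bytes, std::uint64_t sectorSize) {
    assert(sectorSize > 0);
    return ((bytes + sectorSize - 1) / sectorSize) * sectorSize;
}

// True when `value` is a whole number of sectors, i.e. usable as a read size
// or file offset under the alignment rule described above.
bool isSectorAligned(std::uint64_t value, std::uint64_t sectorSize) {
    assert(sectorSize > 0);
    return value % sectorSize == 0;
}
```

For example, `roundUpToSector(200, 512)` gives 512, so an overlap of one full sector would be the smallest value that keeps every offset aligned. The actual sector size should be queried rather than assumed, for instance via the bytes-per-sector value returned by GetDiskFreeSpace.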