SetFilePointer/SetEndOfFile

I am trying to figure out exactly how the SetFilePointer/SetEndOfFile functionality works.

Essentially, I'm rewriting a program to have more efficient disk access. Originally the program would take a 4k chunk of data and append it to the end of a file, up to a certain size (say 600MB for example). Now I preallocate this defined size and, using standard file streams (fstream), I just seek and write into the location. The whole reason for this was disk fragmentation... I cut fragmentation in my program from an average of 15000 fragments (not joking) to an average of 1-2.

Now I run into a problem: if my program crashes, I need to know the last offset I wrote to in the file, so I can go back to that position and continue writing from there. I've been trying to find out more about these functions from MSDN, and originally I thought that by using them I was creating a 'sparse' file. Because of this original notion, I figured I could use GetCompressedFileSize() to get the 'actual' amount of data written to disk within the sparse file. I tried this out, and it returned the full preallocated size rather than the amount of data I had actually written. I've also tried reading the whole file into memory and then searching backwards for the first byte that isn't null. This worked quite well and was decently quick, but was EXTREMELY expensive on memory (no bueno).

In reading more about the functions, I now know I have to use DeviceIoControl to mark a file as sparse. After doing some more research into how NTFS handles sparse files, and given what I am doing within the context of my program, I don't think I'm too fond of using sparse files for my purposes.
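For reference, the backwards search for the last non-null byte does not require loading the whole file. Below is a minimal sketch that walks the file from the end in fixed-size chunks, so memory use stays at one buffer. The function name `findEndOfData` and the 64 KB default chunk size are assumptions made for illustration, not part of the original program. It also assumes that real data never ends in zero bytes; if it can, the returned offset will fall short of the true end of the data.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: returns the offset one past the last non-zero byte,
// 0 if the file is empty or all zeros, and -1 if the file cannot be read.
std::int64_t findEndOfData(const std::string& path, std::size_t chunkSize = 64 * 1024) {
    assert(chunkSize > 0);
    std::ifstream in(path, std::ios::binary | std::ios::ate);
    if (!in) return -1;

    const std::streamoff size = in.tellg();   // opened at end, so this is the file size
    std::int64_t pos = size;
    const std::int64_t step = static_cast<std::int64_t>(chunkSize);
    std::vector<char> buf(chunkSize);

    while (pos > 0) {
        // Read the chunk that ends at the current position.
        const std::int64_t n = std::min(pos, step);
        pos -= n;
        in.seekg(pos, std::ios::beg);
        in.read(buf.data(), n);
        if (in.gcount() != n) return -1;      // unexpected short read

        // Scan this chunk from its last byte towards its first.
        for (std::int64_t i = n - 1; i >= 0; --i) {
            if (buf[static_cast<std::size_t>(i)] != 0) return pos + i + 1;
        }
    }
    return 0;
}
```

The return value can be used directly as the offset to resume writing from. The cost is still proportional to the amount of trailing zero data, since every zero chunk after the last write has to be read.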

Here is the code that creates the file:

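LPCTSTR lpfname = TEXT("C:\\test.tmp");
LONG lsize = 10000000; // ~10MB
// The creation dispositions are plain values, not bit flags, so they cannot be
// OR'ed together. OPEN_ALWAYS opens the file if it exists and creates it otherwise.
HANDLE file = CreateFile(lpfname,
                         GENERIC_WRITE,
                         FILE_SHARE_WRITE,
                         NULL,
                         OPEN_ALWAYS,
                         FILE_ATTRIBUTE_NORMAL,
                         NULL);
// Check the handle first; GetLastError() is only meaningful after a failure.
if (file == INVALID_HANDLE_VALUE) {
	cout << "Error Code: " << GetLastError() << endl;
} else {
	// Move the file pointer to the target size, then make that position the end of file.
	if (SetFilePointer(file, lsize, NULL, FILE_BEGIN) == INVALID_SET_FILE_POINTER ||
	    !SetEndOfFile(file)) {
		cout << "Error Code: " << GetLastError() << endl;
	}
	CloseHandle(file);
}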


Here's the code that I use to read the data:
char *buffer = new char[len]; // <- length is passed into the read function
offset = _lastWriteOffset; // <- _lastWriteOffset is my problem
fstream fileIn(fileName, ios::binary|ios::in|ios::out);
fileIn.seekg(offset, ios::beg);
fileIn.read(buffer, len);


My problem ultimately boils down to one question:

Does using SetFilePointer/SetEndOfFile actually allocate the disk space, or does it create a sparse file in the background anyway, with the NTFS file table keeping track of where the data and the 0's are?
If so, is there a quick way of getting this info (maybe with something like a file map call)?


Thanks for the help! This has been plaguing me for a couple of days now.
Hi txtxthelp!

I can't really help you other than to say I found your post interesting. I use all those functions you mentioned (particularly CreateFile(), SetFilePointer(), and SetEndOfFile()) frequently, but never with text files. I'm clueless as to how they behave in that mode. I always use them with random access binary data. All I can say is that the returned error codes seem to be particularly helpful when using these functions. Good luck!

Hey freddie1,
Thanks for the support. I'm actually using them for binary data as well; I was just using the file name above as an example. Unfortunately the error codes won't give me any insight into my conundrum, because my file handles are always valid. It's the behavior of the kernel that I am trying to figure out. I'm doing some more searching on the FSCTL family of control codes and the kernel file functions to help with my situation, looking into the MFT and things of that nature... In the meantime I'm playing some more with SetFilePointer/SetEndOfFile and some binary searching...
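For illustration, one way the binary search could look is sketched below. It only works under assumptions that are not confirmed anywhere in this thread: chunks are written sequentially from offset 0, every written chunk contains at least one non-zero byte, and everything after the last written chunk reads back as zeros. The names `chunkIsZero` and `countWrittenChunks` are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <fstream>
#include <string>
#include <vector>

// True if the chunk at the given index reads back as all zero bytes.
static bool chunkIsZero(std::ifstream& in, std::int64_t index, std::int64_t chunkSize,
                        std::vector<char>& buf) {
    in.clear();                                   // reset flags left by a short read
    in.seekg(index * chunkSize, std::ios::beg);
    in.read(buf.data(), chunkSize);
    const std::streamsize got = in.gcount();      // last chunk may be partial
    return std::all_of(buf.begin(), buf.begin() + got,
                       [](char c) { return c == 0; });
}

// Hypothetical helper: returns how many leading chunks contain data,
// or -1 if the file cannot be opened.
std::int64_t countWrittenChunks(const std::string& path, std::size_t chunkBytes = 4096) {
    assert(chunkBytes > 0);
    std::ifstream in(path, std::ios::binary | std::ios::ate);
    if (!in) return -1;

    const std::int64_t chunkSize = static_cast<std::int64_t>(chunkBytes);
    const std::int64_t size = in.tellg();         // opened at end, so this is the file size
    std::vector<char> buf(chunkBytes);

    // Find the first all-zero chunk in [0, total chunks).
    std::int64_t lo = 0;
    std::int64_t hi = (size + chunkSize - 1) / chunkSize;
    while (lo < hi) {
        const std::int64_t mid = lo + (hi - lo) / 2;
        if (chunkIsZero(in, mid, chunkSize, buf)) {
            hi = mid;                             // boundary is at or before mid
        } else {
            lo = mid + 1;                         // boundary is after mid
        }
    }
    return lo;
}
```

Multiplying the result by the chunk size gives the offset to resume writing at. The search touches only about log2(N) chunks, so for a 600MB file of 4k chunks that is roughly 18 reads. If a written chunk can legitimately be all zeros, the result is not reliable.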

Thanks!