Ok. To make it easiest to compile, I'd recommend installing Cygwin if you're on Windows. On a UNIX[-like] system you should just be able to type "make" and a .img file will appear. Copy it to a floppy disk or a flash drive or something.
There might be a problem, however, that I didn't anticipate. Does your program rely on any C++ libraries? Does it use any C++ constructs? If you used the C++ standard library at any point, the code will not work. If you used C++ constructs (classes, templates, etc.), the code might not work.
You can't use any standard libraries except for what I've written, unfortunately. Of course, you're more than welcome to extend what I've got so far.
DOS probably isn't a viable option, as it has severe memory limitations. I don't recall exactly, but I'd be surprised if you can have over 4 MB of RAM. IIRC it's even less than that.
Anyway, you are micro-optimizing. The drawing won't be the bottleneck; the computations will (as you already pointed out). You're focusing on one area of your program that likely isn't going to impact the speed very much, and you're convinced you need to optimize it.
OpenGL can render a small scene frighteningly fast. If all you're doing is plotting a pixel here and there I'm sure you can get well over 3000 FPS (as long as you don't vsync).
What's more, if all you're doing is showing progress updates, you don't need a fast refresh rate. You should be able to get by with like 5 FPS (or lower -- maybe even 1 FPS would suffice).
So assuming you draw 5 updates a second, and can get 3000 FPS, each update costs 1/3000 of a second, so drawing eats 5/3000 = 1/600 of the run time. This means that for every 10 minutes the program runs, one second will be added on top of execution time. And if I did my calculations right... if you leave the program running 24 hours a day for a full 7 days... that will only add about 17 minutes on top of execution time. This is not something worth getting tied up in knots about.
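To make that arithmetic easy to re-run with other numbers, here is a minimal sketch. The function and parameter names are made up for illustration, and it assumes every frame costs exactly 1 / FPS seconds, which is only a rough model of a real renderer.

```cpp
#include <cassert>
#include <cstdio>

// Extra seconds spent drawing over a run, assuming each frame costs
// 1 / frames_per_second seconds and the program draws
// updates_per_second frames for every second it runs.
// (Hypothetical helper; all names are illustrative.)
double drawing_overhead_seconds(double run_seconds,
                                double updates_per_second,
                                double frames_per_second)
{
    assert(frames_per_second > 0.0);
    return run_seconds * updates_per_second / frames_per_second;
}

// Prints the two cases from the post: a 10-minute run and a 7-day run.
void print_overhead_examples()
{
    const double ten_minutes = 10.0 * 60.0;
    const double one_week = 7.0 * 24.0 * 60.0 * 60.0;
    std::printf("10 minutes: %.1f s extra\n",
                drawing_overhead_seconds(ten_minutes, 5.0, 3000.0));
    std::printf("7 days: %.1f min extra\n",
                drawing_overhead_seconds(one_week, 5.0, 3000.0) / 60.0);
}
```

Calling print_overhead_examples() from main reproduces the figures above: 1 second per 10 minutes, and 1008 seconds (roughly 17 minutes) per week.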
And you can cut that off even more. For instance have an "update on demand" option where it only gives an update when you press the space bar or something. Or disable updating for long stretches (going afk, going to bed, etc).
In short: don't waste your time. Just use a library. The time you're burning trying to figure out how to get direct access to video memory is probably more than the combined time your program will ever spend inside a library.
chrisname's approach of a mini-OS might be worth a try. Having a full OS running will probably do more to slow down computations than anything else. If you can make your program the sole process, that will undoubtedly speed things up.
The OS idea is really interesting. It may even be worth the extra time as this program may end up running for the rest of my life. I mean, the program basically does what a lot of scientists do: posit new models and test them. Only, it should do it very rapidly.
The funny part here is that I figured that, once I did the theory and got all the concepts done, coding it would be easy. In general, programming always struck me as a very calming, fundamental line of thought; but, that was before I started trying to do things like graphics and networking and such.
The idea of a mini-OS with no other programs is neat because, if I can assume that certain, large ranges of memory addresses will always be free, I don't have to bother allocating memory or storing pointers; I can have a systematic way of referring to where data "should" be in the memory and always just drop it there and pick it up there. That seems like it could speed things up.
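For what it's worth, the "always drop it there and pick it up there" idea can be tried out in an ordinary hosted C++ program before committing to a bare-metal setup. The sketch below is only an illustration under assumptions: the region size, the slot layout, and every name in it are invented, and a static array stands in for the range of physical addresses a mini-OS would leave free.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical layout: the region is split into fixed-size slots, so the
// location of any value follows from its indices and no pointer needs
// to be stored anywhere.
constexpr std::size_t kSlots = 1024;        // assumed number of slots
constexpr std::size_t kValuesPerSlot = 16;  // assumed values per slot

// Stand-in for the memory range that would "always be free".
static double g_region[kSlots * kValuesPerSlot];

// Address of a slot, computed from its index alone.
double* slot_address(std::size_t slot)
{
    assert(slot < kSlots);
    return g_region + slot * kValuesPerSlot;
}

// Drop a value where it "should" be...
void store_value(std::size_t slot, std::size_t index, double value)
{
    assert(index < kValuesPerSlot);
    slot_address(slot)[index] = value;
}

// ...and pick it up from the same computed place later.
double load_value(std::size_t slot, std::size_t index)
{
    assert(index < kValuesPerSlot);
    return slot_address(slot)[index];
}
```

Whether this is measurably faster than normal allocation depends on the program; the sketch only shows that the addressing scheme itself doesn't require a custom OS.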
However, I don't know anything about making an OS (or using the one that chrisname has already provided). Any idea how long it typically takes to learn?
I haven't coded any of it up yet. I tested some of the basic concepts in VBA within Excel, but I wanted to get a feel for all of the code that I'd need before beginning a more complete version in C++. I kinda doubt that I'd need anything from the standard library that I couldn't easily remake myself, although I would definitely need the ability to use C++ functions.
You know, what I'm doing may not be entirely common, but it's probably not unique, either. If you can make an OS that turns the computer into a giant graphing calculator, while the OS itself stays in a small section of memory and leaves the rest to a calculation-producing program, it may have some pretty strong demand among computational researchers. In fact, such an OS has to already exist, because I can't imagine no one making one by now. (If anyone knows what it is, I'm all ears.) But if there really isn't such an OS yet, then it strikes me as worth doing.
First of all, as I believe someone pointed out, it will take you FAR more time to learn how to use the very low level stuff than it will to use a couple of libraries. The libraries might not even slow your program down noticeably. You might think that because the low level stuff is "simpler", you can easily learn it. But low level generally translates to "less abstraction". Abstraction is usually what makes things nice and simple, which is why doing a few simple things in C++ takes faaaar fewer lines and far less effort than it would in Assembly. The stuff you seem set on doing would be even harder.
Secondly, I do research in Computational Physics (Simulations, like it seems you want to do). When we need to do very resource intensive stuff, we don't create "giant graphing calculators". We make our program thread-safe and use a cluster. Time using the cluster here, even at our fairly small school, is available to almost anyone doing research who asks. Considering you're doing a PhD, I'm positive there is a cluster available to you. No matter how much you "optimize" your program and machine, it will be a joke compared to a cluster.
I didn't think of that. If you can get access to a cluster, then go for it and just use the libraries (say, OpenGL) as was originally recommended. I also think it would be good to use threading as I originally recommended -- run the part doing the graphing and the part doing the heavy calculations in separate threads. Maybe the main thread could do the calculating, and every time it needed something done it would set a variable to indicate something needs graphing. Then the second thread systematically checks (or polls) the variable and does the graphing when necessary.
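A rough sketch of that two-thread arrangement, using only the C++11 standard library. Everything named here (compute loop, draw_update, the 200 ms polling interval) is a placeholder I made up; in a real program draw_update() would call into whatever graphics library is used, and the loop body would be the actual calculation.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <thread>

// Shared flags. std::atomic makes the cross-thread reads and writes safe.
std::atomic<bool> needs_redraw{false};  // set by the calculating thread
std::atomic<bool> finished{false};      // set when the calculation ends
std::atomic<int> frames_drawn{0};       // only here so the sketch is observable

// Placeholder for the real drawing code (e.g. OpenGL calls).
void draw_update() { ++frames_drawn; }

// Second thread: polls the flag roughly 5 times a second.
void graphing_loop()
{
    while (!finished.load()) {
        // exchange() reads the flag and clears it in one atomic step.
        if (needs_redraw.exchange(false)) draw_update();
        std::this_thread::sleep_for(std::chrono::milliseconds(200));
    }
    // Draw one last time if something changed just before the end.
    if (needs_redraw.exchange(false)) draw_update();
}

// Main thread: does the heavy calculating and only sets a flag.
long run_simulation(int steps)
{
    assert(steps >= 0);
    finished.store(false);
    std::thread grapher(graphing_loop);
    long total = 0;
    for (int i = 0; i < steps; ++i) {
        total += i;                // placeholder for the real work
        needs_redraw.store(true);  // "something needs graphing"
    }
    finished.store(true);
    grapher.join();
    return total;
}
```

The calculating thread never waits on the drawing, so its cost is one atomic store per step. On some toolchains the program needs to be linked with thread support (for example -pthread with GCC).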
Edit: Also you're probably not going to get much more optimized than an OpenGL implementation.