Need help understanding cpu utilization (multiprogramming)

I wanted to post a graph, but that's not possible so I hope I explain my question right.

This is all in regard to I/O wait versus CPU utilization, measured at different percentages across different degrees of multiprogramming.

For instance, at a given degree of multiprogramming, CPU utilization measured at roughly 60%, 85%, and 100% corresponds, respectively, to 80%, 50%, and 20% I/O wait.

I'm confused about the I/O wait. What exactly is it? Is it a good thing?
Would 60% cpu utilization at 80% wait or 100% utilization at 20% wait be better?
I/O is not instant. Disks are slower than registers, cache, and RAM (SSDs are getting closer to RAM speeds; old spinning disks are much slower, but buffered as well, so it's hard to estimate cleanly). Other I/O is similar: anything not living inside the CPU is going to respond slowly, and the CPU has to wait for it if it needs the result to proceed.

A waiting CPU is 'bad' if you are not waiting on human interactions. If your CPU is waiting on a disk, your performance would be better if you could somehow read ahead and have the data buffered up in RAM/cache/etc. by the time you need it. If it's waiting on a sensor, it is what it is. So what you can improve depends on the nature of the I/O.
Okay, so in this scenario the purpose of keeping the CPU utilized through multiprogramming is to cut down on the amount of I/O wait. That makes sense.
Note that I put quotes around 'bad' etc. above.
If you have a program that is waiting on I/O, you can still be in a good place, of course.
It's all relative. It's 'bad' because your I/O is potentially not optimal (or maybe it is: I/O is slow, as I noted above, and apart from adding disks there isn't a lot you can do past some point). It's 'good' because your processing code is so efficient that you are waiting on more input; that may be where you want to be. If you are optimizing the whole program, then you know to look at the I/O and ignore the processing for now. It just depends on the question you want to answer, which is unclear in your classroom / generic question.
I follow you. It was enough general information to understand my graph though. So I thank you.
In my own work, waiting on I/O most often takes one of five forms: waiting on network I/O (TCP/IP connections), waiting on disks, waiting on other serial I/O (USB, say), waiting on GPU interactions, or waiting on RAM.

RAM is something of its own category, because it is so basic to the machine and you can only do so much about it. It isn't as though a system call or other application call is made to get RAM; it happens merely as the result of feeding both instructions and data from RAM into the CPU, and the techniques for dealing with it are entirely different from the other four, usually taking the form of being aware of the CPU cache, the weaknesses of RAM addressing, etc.
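The "being aware of the CPU cache" point can be made concrete with the classic traversal-order example (my own illustration, not from a real benchmark): summing a row-major matrix row by row walks memory sequentially and rides the cache, while summing it column by column strides across memory and stalls on RAM far more often.

```cpp
#include <vector>

// Sum a rows x cols matrix stored row-major. row_major=true walks
// memory sequentially (cache-friendly); false strides by `cols`
// elements each step, touching a new cache line on almost every access.
long long sum_matrix(const std::vector<long long>& m,
                     int rows, int cols, bool row_major) {
    long long s = 0;
    if (row_major) {
        for (int r = 0; r < rows; ++r)
            for (int c = 0; c < cols; ++c)
                s += m[r * cols + c];
    } else {
        for (int c = 0; c < cols; ++c)
            for (int r = 0; r < rows; ++r)
                s += m[r * cols + c];
    }
    return s;
}
```

Both orders produce the same sum; time the two on a matrix of a few hundred million elements and the difference is purely the RAM-wait the text is describing.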

The GPU is, similarly, a unique case. It is yet another computer, purpose-built, into which we feed entire programs (typically shaders) and bulk data, then wait for processing to complete.

The others have one common theme, which jonnin makes clear: they are all external devices connected to the machine (which you're focusing on as the CPU), and they feed data through some OS layer (a device driver) over time - lots of time, from the CPU's perspective.

There are two general objectives I find when dealing with this puzzle, one of which is tied to user interfacing. The user saves a file, or scrolls a window, upon which a new batch of data is to be fetched from some external device. While that I/O is performed, the user waits. Jonnin's point of buffering ahead is key.

The other general objective is to avoid having an I/O device wait on the CPU. This isn't something I've yet read in this thread, so I'll focus on that a moment.

Let's say you're reading chunks of data from a disk, some USB device (there are myriad such devices that aren't disks), or perhaps TCP/IP (some source on the web, maybe streaming game world data). In a "naive" scenario, the CPU waits while a sufficient bucket of data arrives that is meaningful and can be processed (think of web pages which you can see partially load, then update several times, re-rendering the content as more data arrives). However, in this "naive" scenario, once that data arrives, the CPU works to "render" or process it, during which time no new request is made to the source of the data - here, the source is waiting on the CPU to get ready to receive more.

Of course, as you're probably thinking, a separate thread could continue streaming in the next batch of data, and this is common. This may increase the CPU usage of the processing code that renders the data as it comes in, because it is no longer waiting on the data as much; another thread is handling that.
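That "separate thread keeps streaming" idea is the producer/consumer pattern. A minimal sketch, with the blocking read simulated (the reader thread here just fabricates chunks; in real code that loop would be making blocking read calls against the device):

```cpp
#include <condition_variable>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <vector>

// Reader thread hands chunks to the processing thread through a queue,
// so the "rendering" side only waits when the queue is actually empty.
std::vector<std::string> stream_and_process(int n_chunks) {
    std::queue<std::string> q;
    std::mutex m;
    std::condition_variable cv;
    bool done = false;
    std::vector<std::string> processed;

    // Reader thread: stands in for a loop of blocking read() calls.
    std::thread reader([&] {
        for (int i = 0; i < n_chunks; ++i) {
            {
                std::lock_guard<std::mutex> lk(m);
                q.push("chunk-" + std::to_string(i));
            }
            cv.notify_one();
        }
        {
            std::lock_guard<std::mutex> lk(m);
            done = true;
        }
        cv.notify_one();
    });

    // Consumer ("rendering") loop on the calling thread.
    for (;;) {
        std::unique_lock<std::mutex> lk(m);
        cv.wait(lk, [&] { return !q.empty() || done; });
        if (q.empty() && done) break;
        std::string chunk = std::move(q.front());
        q.pop();
        lk.unlock();                 // process outside the lock
        processed.push_back(chunk);  // "render" the chunk
    }
    reader.join();
    return processed;
}
```

The key property is in the consumer loop: it only blocks when there is genuinely nothing buffered, which is exactly the read-ahead behavior jonnin described earlier.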

The nature of cycling between fetching and processing data comes from the fact that many of the OS calls to read such data are blocking calls. Once such a function begins, the operating system occupies the thread that made the call until the data is ready, so that thread isn't available to the application to do more work while the data loads.

Some I/O functions have non-blocking versions, which typically use callbacks. A request for data is made, much as before, but the application does not expect the data to have arrived when the call returns. In a non-blocking version, the call returns almost instantly (relative to OS calls that block - still hundreds of clock ticks, of course, but not tenths of seconds or whole seconds of blocking). When the data has arrived, the OS invokes a function at that moment - the callback provided to the non-blocking version of the call. That callback often runs on an OS thread servicing the data itself, so it may be wise to merely post a message to the application so it can respond on its own servicing thread, which in older forms of this was the application's main thread.
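The shape of that pattern, stripped to its bones (this simulates the OS side with a plain thread and a sleep; real non-blocking APIs such as Windows overlapped I/O or POSIX aio differ in their details, but the call-returns-immediately / callback-fires-later structure is the same):

```cpp
#include <chrono>
#include <functional>
#include <future>
#include <string>
#include <thread>

// Simulated non-blocking read: returns immediately; the callback fires
// later on a "service" thread once the data is ready. A real API would
// talk to a device instead of sleeping.
std::thread read_async(std::function<void(std::string)> on_data) {
    return std::thread([cb = std::move(on_data)] {
        // Pretend the device took some time to produce the data.
        std::this_thread::sleep_for(std::chrono::milliseconds(10));
        cb("payload");  // the OS would invoke our callback here
    });
}
```

A caller might hand in a callback that fulfills a `std::promise`, do other work, and only later collect the result from the matching future - note the callback runs on the service thread, which is exactly why the text suggests just posting a message back to your own thread from inside it.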

This is more complicated, but it is efficient. It is far more convenient to spin off a thread that makes the blocking calls; the waiting then happens on that thread, not at the expense of the threads which use the data for rendering or other processing.

Now back to the close of your question. If you see 100% CPU utilization with some low I/O wait figure (like 20%), that indicates the processing duty for rendering (or whatever is being done with the data) is so heavy that the CPU is possibly overtaxed. What matters there is a third metric: the saturation of the data source. If, for example, you are reading from an SSD that could sustain 350 Mbytes per second, but that 20% figure actually represents only some 100 Mbytes per second, then the 100% CPU utilization is a hint that if the rendering or processing were more efficient, you could pull in more data and complete the work faster.

When CPU utilization is lower, and the I/O is higher, it is more likely that the bandwidth of the source device is saturated, and the CPU is fully capable of processing that data as fast as the device can provide it.

A super simple scenario I explored shows this kind of issue, relative to reading data through streams (the way students are taught). A simple loop reading text from a large file with "cin >> somestring" shows that, frankly, it isn't efficient. The thread reading that data, on typical modern hardware this side of a 3 GHz 64-bit CPU, may only process about 150 Mbytes per second, even from a device that could provide up to 900 Mbytes per second.
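For concreteness, the shape of the loop being measured looks like this (pointed at a file stream rather than `cin` itself; function name and paths are mine):

```cpp
#include <fstream>
#include <string>

// Count whitespace-separated tokens the slow way: one operator>> call
// per token. This is the "cin >> somestring" pattern, and the per-token
// overhead of formatted extraction is where the throughput goes.
long count_tokens_stream(const char* path) {
    std::ifstream in(path);
    std::string tok;
    long n = 0;
    while (in >> tok)
        ++n;
    return n;
}
```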

Comparing this to several other approaches, like the "fread" family from the old C library, shows that fread, while inconvenient and perhaps not as safe in its origins, is several times more efficient, able to read over 600 Mbytes per second in the same basic loop.
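The fread version amortizes the call overhead over a large buffer instead of paying it per token. A sketch of that loop (buffer size and names are my choices, not from the original benchmark):

```cpp
#include <cstddef>
#include <cstdio>

// Count the bytes in a file by reading fixed-size chunks with fread.
// One library call per 64 KB, versus one per token with operator>>.
// Returns -1 if the file cannot be opened.
long count_bytes_fread(const char* path) {
    std::FILE* f = std::fopen(path, "rb");
    if (!f) return -1;
    static char buf[64 * 1024];
    long total = 0;
    std::size_t n;
    while ((n = std::fread(buf, 1, sizeof buf, f)) > 0)
        total += static_cast<long>(n);
    std::fclose(f);
    return total;
}
```

Tokenizing would then happen in memory, over `buf`, which is cheap compared to pulling each token through the stream machinery.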

I tried memory-mapped file techniques, using the boost library, to read the data as a region of mapped memory. This was the most efficient on both Windows and Linux, pulling in 900+ Mbytes per second on the same hardware.

In those three experiments, no threading was used. However, the nature of memory-mapped file operations is that the operating system responds to page faults to fill memory, operating "behind the scenes" in its own threads/processes. Referencing memory that isn't paged in blocks the CPU, but not for the entire duration of filling the memory region - only for the portion being accessed when the page fault is first processed.
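A dependency-free sketch of the same idea using the raw POSIX calls (boost's `mapped_file_source` wraps `mmap` on Linux and the equivalent Windows API, so this is the mechanism underneath, not the portable code I actually used):

```cpp
#include <cstdio>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

// Sum the bytes of a file through a memory mapping. No explicit read()
// calls appear: touching each page triggers a fault that the OS
// services behind the scenes, filling memory lazily as we walk it.
long sum_mapped_bytes(const char* path) {
    int fd = open(path, O_RDONLY);
    if (fd < 0) return -1;
    struct stat st;
    if (fstat(fd, &st) != 0) { close(fd); return -1; }
    if (st.st_size == 0)     { close(fd); return 0; }
    void* m = mmap(nullptr, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    if (m == MAP_FAILED)     { close(fd); return -1; }
    const unsigned char* p = static_cast<const unsigned char*>(m);
    long sum = 0;
    for (off_t i = 0; i < st.st_size; ++i)
        sum += p[i];  // page faults fill the region as we go
    munmap(m, st.st_size);
    close(fd);
    return sum;
}
```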

My point in closing is that, depending on what device you're waiting on, there may be several solutions more attuned to the machine's native operation that are more efficient than merely threading. For example, pulling data from a file using "cin" in a thread isn't going to move data faster than not using a thread; switching to fread has more headroom for speed. If the drive is slow (say, some rotating laptop drive), the source may only provide 60 Mbytes per second, so no real difference in performance would be observed, though you might see somewhat higher CPU utilization inside the OS code pulling that data in.

Memory mapping is unique to disk operations, but it is somewhat portable between Linux, macOS, and Windows (boost provides a way to make it quite portable). Other devices, however, have few if any such options (the 'net "is what it is", as jonnin said).
You're asking about a solution. We could give much better info if you described the problem that you're having instead.

Guessing at your problem, here are some thoughts.

I/O wait is "good" in the sense that it means your code is fast. The faster your code is, the more time it will spend waiting for I/O.

One way to measure if you're doing I/O efficiently is to see how fast you can read the data and dump it in the bit bucket. If your program processes the data at about the same speed then you're probably going as fast as the hardware allows. If the program processes the data substantially slower then it's time to look at how to speed it up.
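One way to run that comparison, sketched (chunk size and names are mine): read the file in large chunks, throw the bytes away, and time it.

```cpp
#include <chrono>
#include <fstream>
#include <string>

// Read a file in 1 MB chunks, discard the bytes, and return Mbytes/sec.
// Compare this figure with your full program's throughput: if they are
// close, the device is the bottleneck; if the program is much slower,
// the processing code is where to look.
double bit_bucket_mbps(const char* path) {
    std::ifstream in(path, std::ios::binary);
    if (!in) return -1.0;
    static char buf[1 << 20];
    long long total = 0;
    auto t0 = std::chrono::steady_clock::now();
    while (in) {
        in.read(buf, sizeof buf);
        total += in.gcount();  // bytes go straight to the bit bucket
    }
    double dt = std::chrono::duration<double>(
        std::chrono::steady_clock::now() - t0).count();
    return dt > 0 ? (total / 1e6) / dt : 0.0;
}
```

Run it a second time to see the effect of the OS file cache - the warm-cache number tells you the ceiling your code could reach if the device were infinitely fast.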

There are slow ways and fast ways to read disk files. In roughly slower to faster order they are:
- C++ streams
- C stdio functions (fopen, fclose, fread, fwrite)
- OS calls (open, close, read, write on UNIX/Linux systems)
- Memory mapped files.
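The third rung on that ladder, for comparison with the stream and fread loops above - a sketch using the raw OS calls (POSIX only; error handling kept minimal):

```cpp
#include <fcntl.h>
#include <unistd.h>

// Count the bytes in a file with open/read/close directly: no stdio
// buffering layer in between, just the kernel's. Returns -1 on error.
long count_bytes_read(const char* path) {
    int fd = open(path, O_RDONLY);
    if (fd < 0) return -1;
    static char buf[64 * 1024];
    long total = 0;
    ssize_t n;
    while ((n = read(fd, buf, sizeof buf)) > 0)
        total += n;
    close(fd);
    return n < 0 ? -1 : total;
}
```

In practice this lands between stdio and memory mapping mostly because you control the buffer size and skip a copy through stdio's internal buffer.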
