With Unix Epoll and Windows Completion Port we can read from multiple fd's/sockets at the same time and as soon as one of them has data to read epoll/completion port will notify us.
How can this be set up without the help of epoll or IOCP? Basically, what do epoll and IOCP do under the hood to figure out when there is data available on one of many files?
The only way I could think of is that somewhere down the chain (at a very low level) there has to be a loop that repeatedly checks whether data is available on any of the fd's and, if so, notifies the caller.
There are no loops. At the lowest level, the kernel posts a DMA request to an I/O device like "read from block foo and copy the contents into memory address bar". The CPU can then idle or go do something else. The device writes to RAM directly and once it's done it sends a hardware interrupt that the kernel can handle. Eventually the usermode process gets notified.
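As a loose user-space analogy only (not real driver code), the flow above can be sketched with a thread standing in for the device and a callback standing in for the interrupt handler. The names `dma_read` and `read_block` are made up for this illustration.

```python
import threading


def dma_read(device_data, buffer, on_interrupt):
    """Pretend device: copy data straight into `buffer`, then raise an 'interrupt'."""
    def transfer():
        buffer[:len(device_data)] = device_data  # device writes to "RAM" directly
        on_interrupt(len(device_data))           # completion "interrupt"

    t = threading.Thread(target=transfer)
    t.start()
    return t


def read_block(device_data):
    """Post the request, sleep until the completion callback fires, return the data."""
    buffer = bytearray(len(device_data))
    done = threading.Event()
    transferred = []

    def handler(nbytes):
        # Stand-in for the kernel's interrupt handler.
        transferred.append(nbytes)
        done.set()

    t = dma_read(device_data, buffer, handler)
    done.wait()  # the caller sleeps here; nothing polls the buffer
    t.join()
    return bytes(buffer[:transferred[0]])
```

The point of the sketch is that the waiting side never inspects the buffer in a loop: it is woken by the completion signal.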
If you don't want to use asynchronous I/O, you could also create a separate thread for each file you want to read and perform a normal synchronous read in each. When a read completes, the thread signals it via a semaphore/event/condition variable, and the main thread waits on all of them. It's not really any more efficient (probably a lot less so).
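A minimal sketch of that thread-per-file approach, assuming pipes as the things being read and a `queue.Queue` as the signalling mechanism (the function names are made up for illustration):

```python
import os
import queue
import threading


def reader(name, fd, done):
    """Block in a normal synchronous read, then report the result."""
    data = os.read(fd, 4096)  # blocks until data is available or EOF
    done.put((name, data))    # signal the main thread


def wait_for_all(fds):
    """Start one reader thread per descriptor and collect results as they complete."""
    done = queue.Queue()
    threads = [
        threading.Thread(target=reader, args=(name, fd, done))
        for name, fd in fds.items()
    ]
    for t in threads:
        t.start()
    # The main thread sleeps inside get(); it does not spin checking the fds.
    results = [done.get() for _ in threads]
    for t in threads:
        t.join()
    return results
```

Each blocked thread costs a stack and a scheduler entry, which is one reason this tends to scale worse than a single epoll/IOCP wait.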
How do people who write kernels/operating systems make these requests? What libraries and languages do they use?
Pfft... No idea. Grab the source for an open source kernel and start browsing. They don't use any libraries, I can tell you that.
<edit>
The relevant sources in Linux seem to be under /drivers/ata/sata_* in the latest source tarballs.
</edit>
By device you mean hard disk for example?
Yes.
So how do you think epoll monitors multiple files? That's still usermode. I should be able to set it up on my own as efficiently as those libraries do.
Nope. epoll has a kernel mode component.
The central concept of the epoll API is the epoll instance, an in-kernel data structure
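For comparison, here is a rough sketch of what the user-mode side looks like. It uses Python's `selectors.DefaultSelector`, which is backed by epoll on Linux. The `read_ready` helper and the use of pipes are assumptions for the example.

```python
import os
import selectors


def read_ready(fds, timeout=1.0):
    """Register every descriptor once, then block until some are readable."""
    sel = selectors.DefaultSelector()  # epoll-backed on Linux
    for name, fd in fds.items():
        sel.register(fd, selectors.EVENT_READ, data=name)
    results = {}
    # select() hands the waiting to the kernel and returns only ready descriptors.
    for key, _events in sel.select(timeout):
        results[key.data] = os.read(key.fd, 4096)
    sel.close()
    return results
```

The user-mode code only registers interest and sleeps. Tracking which descriptors became ready happens in the in-kernel structure mentioned above.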
I'm going way, way back to UNIX System V, but as I recall, when a process blocked, the kernel put it on a queue with a small bit of data indicating the event that it was blocked on. That might be the inode of a file if it was waiting for I/O, or the wakeup time if it was sleeping, etc. The system call for checking multiple I/O sources is poll() (or select() on BSD). select() takes a bitmap of the file descriptors it is waiting for, while poll() takes an array of pollfd structures, so it might make sense for that same set of descriptors to be the event that the process waits for.
When events happen that could wake up a process, the kernel checks the wait queue. If any process was waiting for the event, it schedules the process to run.
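A toy model of that wait-queue bookkeeping, purely illustrative: the class and method names are hypothetical and do not correspond to any real kernel's API.

```python
from collections import defaultdict, deque


class ToyKernel:
    """Toy model of a wait queue; not how any particular kernel implements it."""

    def __init__(self):
        self.waiting = defaultdict(deque)  # event -> processes blocked on it
        self.run_queue = deque()           # processes ready to be scheduled

    def block(self, pid, event):
        """Park a process until `event` happens (e.g. I/O on an inode, a timer)."""
        self.waiting[event].append(pid)

    def wake(self, event):
        """Called when an event occurs, e.g. from an interrupt handler."""
        woken = self.waiting.pop(event, deque())
        self.run_queue.extend(woken)  # schedule every process that was waiting
        return list(woken)
```

Here `wake` does work only when an event actually fires, and it looks up just the processes waiting on that event, so nothing scans all descriptors in a loop.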