Compilers, interpreters, JIT and browsers.

Hi guys,

Let me preface this by saying it's been a long time since I posted, but it's great to be back. I've been keeping busy with IT-related ventures, but I digress. This thread's a fun one, and I hope to ask some questions that I've never encountered before. Before that, I'll give a quick summary of my understanding of compilation, interpreters and JIT (just-in-time compilation). If you'd rather skip this summary section, by all means do.

I'm going to use Java as my example when explaining compilation, interpreters and JIT, since I feel the language lends itself well to examples like this. Java is a statically typed compiled language, but it is also an interpreted language. Source code written in Java (.java files) is fed to a compiler such as javac, which compiles it into bytecode. The bytecode is just a set of instructions that every Java virtual machine's interpreter can understand; this is what helps make Java platform agnostic. The bytecode is saved to .class files.

.class files can be run by the JVM (Java's implementation of a virtual machine) that ships with the Java runtime environment. The JVM has an interpreter that reads our Java bytecode instruction by instruction, translating each one to native instructions as it goes. This can be slow, especially for large functions that contain constructs such as loops. To overcome this, the JVM uses just-in-time compilation (JIT). JIT is implemented as follows (or close to it): the JVM has caches/buffers that keep track of function calls, store machine instructions (the code cache), etc. When a function has been called a number of times, the JVM compiles it to machine instructions and stores them in the code cache; instead of translating the entire function again and again, the JVM then fetches the machine instructions directly from the code cache and executes them. This greatly speeds up the JVM (although, as far as I know, it is still generally slower than languages that compile directly to machine instructions, such as C and C++).
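
To make the counting-and-caching idea concrete, here is a minimal C++ sketch. It is not how any real JVM is implemented: the three-instruction bytecode, the TinyVm class, the threshold of 3 and the "square" function are all made up for illustration, and the "compiled" version is just a C++ lambda standing in for freshly generated machine code.

```cpp
#include <functional>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical three-instruction bytecode, just enough to square a number.
enum class Op { PushArg, Mul, Ret };

// Slow path: decode and execute the bytecode one instruction at a time.
int interpret(const std::vector<Op>& code, int arg) {
    std::vector<int> stack;
    for (Op op : code) {
        switch (op) {
        case Op::PushArg:
            stack.push_back(arg);
            break;
        case Op::Mul: {
            int rhs = stack.back(); stack.pop_back();
            int lhs = stack.back(); stack.pop_back();
            stack.push_back(lhs * rhs);
            break;
        }
        case Op::Ret:
            return stack.back();
        }
    }
    return 0;  // bytecode ended without a Ret
}

// Toy VM (an assumption for illustration, not a real JVM design).
class TinyVm {
public:
    int interpreted_calls = 0;
    int compiled_calls = 0;

    int call(const std::string& name, int arg) {
        // Fast path: the function is already in the code cache.
        auto cached = code_cache_.find(name);
        if (cached != code_cache_.end()) {
            ++compiled_calls;
            return cached->second(arg);
        }
        // Slow path: interpret, and count how often this function is called.
        ++interpreted_calls;
        if (++call_counts_[name] >= kHotThreshold) {
            // Stand-in for the JIT: a real one would translate the bytecode
            // to machine code here. This lambda is hard-coded for "square".
            code_cache_[name] = [](int x) { return x * x; };
        }
        return interpret(bytecode_.at(name), arg);
    }

private:
    static constexpr int kHotThreshold = 3;  // made-up "hotness" threshold
    std::unordered_map<std::string, std::vector<Op>> bytecode_{
        {"square", {Op::PushArg, Op::PushArg, Op::Mul, Op::Ret}}};
    std::unordered_map<std::string, int> call_counts_;
    std::unordered_map<std::string, std::function<int(int)>> code_cache_;
};
```

Calling vm.call("square", 7) repeatedly gives the same answer every time; only the path taken changes once the function becomes "hot".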

This is where the question comes about. Browsers such as Firefox are almost like virtual environments: they contain the means to parse HTML, XML, CSS and more, and they also contain JavaScript interpreters that let devs add dynamic behaviour and functionality to their front ends. JavaScript is a dynamically typed language that is interpreted and JIT compiled. The thing that confuses me is this: it's pretty clear how Java's JVM interacts with the CPU, since it first translates the bytecode to machine instructions and runs them natively, but how does JavaScript do this??? JavaScript is contained inside of a sandbox (the browser), so how does the JavaScript interpreter or virtual machine translate JavaScript bytecode to native machine instructions?

It makes things confusing because the browser sits in the way. Does the browser interact with the CPU (because you will be changing the look of what the browser contains, i.e. the web page), or does the JavaScript interpreter send the translated machine code to the CPU?

Thanks

how does JavaScript do this???
The steps are pretty much the same as for Java, only instead of running in separate processes, the bytecode compiler and the JIT run as separate functions in the same process[1].

It makes things confusing because the browser sits in the way. Does the browser interact with the CPU (because you will be changing the look of what the browser contains, i.e. the web page), or does the JavaScript interpreter send the translated machine code to the CPU?
I think you're confused about how execution works. You never "send" stuff to the CPU. The CPU is not a device like a disk is and has no capacity to "receive" things. The CPU controls what the machine (including itself) does. As such, it can do things like
* Read numbers from memory into its registers.
* Write numbers from its registers into memory.
* Change the contents of its registers.
* Execute an instruction in memory, based on the value of its registers.
* Request a device to write data to memory from its internal memory, or vice versa.
So what the JS engine does is "simply":
1. Compile the JS to bytecode.
2. Translate the bytecode into native machine instructions. When I say "native machine instructions" I mean that the CPU can directly execute them.
3. Place these instructions into executable memory. What I mean by "executable memory" is that most OSs organize memory into pages (typically of around 4 KiB) and give each a number of toggleable flags, such as writable, executable, etc. Pages that don't have the executable flag set cannot be executed (obviously).
4. When the browser is ready to execute JS code, it jumps into it. From the point of view of the code, what this would look like is that you'd have a bunch of pointers in some data structure like
 
std::map<std::string, void *> js_functions;

The pointers point to raw byte arrays. When you want to execute function foo you do
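// Signature shared by every JIT-compiled JS function.
typedef void (*js_function)(js_param *, size_t);

auto f = js_functions["foo"];   // raw pointer to the generated machine code
auto foo = (js_function)f;      // reinterpret it as a callable function
js_param p[] = { 42 };
foo(p, 1);                      // jump into the generated code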
The final call through the pointer tells the CPU "pull the value of pointer 'foo' from memory and load it into the instruction pointer, saving its previous value by pushing it onto the stack". The instruction pointer holds the address of the next instruction that should be executed, so this causes the CPU to jump to the address pointed to by foo, and when the function finally returns, the previous value of the instruction pointer will be popped off the stack, which will cause the caller to resume executing.
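
For steps 3 and 4, a rough sketch of what "placing instructions into executable memory and jumping into them" can look like, under heavy assumptions: a POSIX system (mmap/mprotect), an x86-64 CPU, and an OS policy that allows executable anonymous pages. The names jit_function and make_return_42 are made up for this example, and a real JS engine is far more involved. On any other architecture, or if the OS refuses, the function just returns nullptr.

```cpp
#include <cstddef>
#include <cstring>
#include <sys/mman.h>   // POSIX: mmap, mprotect, munmap
#include <unistd.h>     // POSIX: sysconf

using jit_function = int (*)();

// Returns a callable pointer to freshly written machine code, or nullptr if
// the platform is unsupported or the OS refuses the request.
// The page is intentionally never freed in this sketch.
jit_function make_return_42() {
#if defined(__x86_64__)
    // x86-64 machine code for: mov eax, 42 ; ret
    static const unsigned char code[] = {0xB8, 0x2A, 0x00, 0x00, 0x00, 0xC3};

    const long page_size = sysconf(_SC_PAGESIZE);
    if (page_size <= 0) return nullptr;
    const std::size_t page = static_cast<std::size_t>(page_size);

    // Ask the OS for one page that is writable but not yet executable.
    void* mem = mmap(nullptr, page, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED) return nullptr;

    // "Place these instructions into memory".
    std::memcpy(mem, code, sizeof code);

    // Flip the page flags: drop writable, set executable.
    if (mprotect(mem, page, PROT_READ | PROT_EXEC) != 0) {
        munmap(mem, page);
        return nullptr;
    }

    // Treat the raw bytes as a function; calling it jumps into them.
    return reinterpret_cast<jit_function>(mem);
#else
    return nullptr;  // other architectures need different instruction bytes
#endif
}
```

If the returned pointer is non-null, calling it as f() makes the CPU jump into the copied bytes exactly as described above for foo.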


[1] Probably. Modern browsers make things more complicated because the same logical "instance" runs as multiple processes that cooperate via IPC.
Very true, I probably should have worded that paragraph a little better. I should have said that the Java virtual machine translates the bytecode into machine code and stores those instructions in memory, and the CPU then fetches the instructions from executable memory (the CPU is the one that fetches, although I/O devices can have microcontrollers or processors that fetch data from memory too). That would be more precise, I think?

Point 4 is where it starts to click. Since the JS interpreter is a child of the browser, those machine instructions can only be executed inside of the browser (I'm sure there are probably ways in which they can escape).

In terms of actually manipulating the elements inside a web page: let's say we have a simple web page that has a button, and we attach an event listener to said button. Once we click that button, the browser's JS virtual machine will translate our JS code to native machine instructions. Furthermore, let's say that when we click the button, the page will change the color of all the <h2> tags.
I'm finding it hard to grasp how machine instructions can manipulate DOM elements such as changing the colors of <h2> tags.

Thanks Helios
I'm finding it hard to grasp how machine instructions can manipulate DOM elements such as changing the colors of <h2> tags.

Maybe it generates a call to a pre-written function that is provided by the browser?
Similar to how you can call library functions in C++ to change the colour of something if you used a graphics/GUI library.
I'm finding it hard to grasp how machine instructions can manipulate DOM elements such as changing the colors of <h2> tags.
The functions that manipulate the DOM are implemented by the browser. The JS just needs to call into them, and the JIT simply generates dynamic code that calls into the browser's static code, passing the appropriate parameters.
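
A small sketch of that idea in C++, with everything in it invented for illustration (Element, Document, HostApi, dom_set_color_by_tag and generated_on_click are not real browser APIs). The "generated" function is ordinary C++ standing in for what a JIT would emit at run time; the point is only that it changes the page by calling through a pointer into code the browser already contains.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, heavily simplified DOM owned by the browser.
struct Element {
    std::string tag;
    std::string color;
};
struct Document {
    std::vector<Element> elements;
};

// The browser's static, precompiled code: sets the color of every element
// with the given tag and returns how many elements it changed.
std::size_t dom_set_color_by_tag(Document* doc, const char* tag,
                                 const char* color) {
    std::size_t changed = 0;
    for (Element& e : doc->elements) {
        if (e.tag == tag) {
            e.color = color;
            ++changed;
        }
    }
    return changed;
}

// Table of entry points the browser exposes to generated code.
using dom_set_color_fn = std::size_t (*)(Document*, const char*, const char*);
struct HostApi {
    dom_set_color_fn set_color_by_tag;
};

// Stand-in for JIT output for a click handler that turns all <h2> tags red.
// The dynamic code does no DOM work itself; it just passes the appropriate
// parameters to the browser's function.
std::size_t generated_on_click(const HostApi& host, Document* doc) {
    return host.set_color_by_tag(doc, "h2", "red");
}
```

From the CPU's point of view, the call into dom_set_color_by_tag is the same kind of jump as any other function call; it simply lands in code that was compiled along with the browser rather than generated on the fly.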
You should pick up SICP at some point and work through it.
https://web.mit.edu/6.001/6.037/sicp.pdf

It has a reputation for being hard, too hard for its target audience (complete beginners), but you're not a beginner and I'd strongly recommend it anyway.

Alternatively I came across this short NES emulator here:
https://github.com/binji/smolnes/blob/main/deobfuscated.cc
Which is probably small enough to get a handle on with some effort.