Creating a (custom-language) compiler able to use third-party libraries

Hello,
I'm trying to make my own compiler and ran into a few issues.

I know how to get the lexer and parser to work, but how can my compiler use third-party APIs/libraries, things like "windows.h" or SFML or anything similar?

(the examples I've given are coded in c++, but they wouldn't necessarily be in that language)

I'm not sure at what level I should even try to use these libraries. Am I supposed to call the .dll file (on Windows) they provide myself, with machine code? And if so, how would I do that? If not, what else has to be done?

Thanks
Before talking about using libraries, how advanced is your language? You've mentioned the lexer and parser, but that's just the very basics of a language implementation. Can you execute code? Do you have a type system? Do you have subroutines? There's little point in continuing the discussion if you can't at least do these things.
I'm planning things out at the moment, and I know I will be able to execute code, and yes, I do have a type system. I don't know what you mean by subroutines -and if you think I need any of those things, then maybe instead of claiming this thread is useless, you would be of more use by actually pointing out why it matters (or if you can't be bothered, other links that do so).
And if by subroutines you mean functions, yes, functions are supported.

I'm gonna get a compiler working no matter what, and I need that question answered because it is what I feel is the thing keeping me from doing so. I didn't ask "why am I unable to make a compiler", I asked a very specific question: so instead of assuming I did no research whatsoever (with no evidence, mind you), keep to the topic unless you have a reason not to.
Easy there. No need to get an attitude.

I didn't assume anything, I merely asked a question. Do you expect me to just intuitively know what you have implemented/planned?

What you need is what's known as a foreign function interface (or FFI). Let's suppose that you want to support calling external C functions, which is the most common use case for an FFI, and C is a relatively simple language to interface with.
First let's look at how it would be done in C++:
//Declare for the type system the type of the function you want to call.
typedef double (*some_function_f)(const char *);

void test(){
    //Load a DLL or shared-object from the file system.
    auto module = load_dynamic_library("library");
    if (!module){ /*...*/ }
    
    //Dynamic libraries have a table where they list the symbols they export.
    //C symbols are listed simply with their name. C++ symbols have their names
    //mangled (because of function overloading).
    auto function = module.find_symbol("some_function");
    if (!function){ /*...*/ }
    
    //Tell the type system that the void * we obtained actually points to a
    //function of the correct type.
    auto some_function = (some_function_f)function;
    
    //Finally, call the function. The language takes care to put your arguments
    //in the correct format for the function and then interpret the return value
    //accordingly.
    auto some_double = some_function("hello");
}
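For something concrete to compare against, here is a minimal sketch of the same steps using the POSIX calls dlopen and dlsym. It assumes a Linux system with glibc, where the C math library is available as "libm.so.6" and exports "cos"; the helper name call_from_library is made up for this example. On Windows the corresponding calls would be LoadLibraryA and GetProcAddress.

```cpp
#include <dlfcn.h>    // POSIX: dlopen, dlsym, dlclose, dlerror
#include <stdexcept>
#include <string>

// Declare for the type system the type of the function we want to call.
using unary_math_f = double (*)(double);

// Hypothetical helper: load `library`, look up `symbol`, call it with `argument`.
double call_from_library(const char *library, const char *symbol, double argument) {
    // Load a shared object from the file system.
    void *module = dlopen(library, RTLD_NOW);
    if (!module) {
        const char *reason = dlerror();
        throw std::runtime_error(std::string("dlopen failed: ") + (reason ? reason : "unknown"));
    }

    // Look the symbol up in the library's export table by its (unmangled) name.
    void *address = dlsym(module, symbol);
    if (!address) {
        dlclose(module);
        throw std::runtime_error(std::string("symbol not found: ") + symbol);
    }

    // Tell the type system what the untyped pointer really points to.
    auto function = reinterpret_cast<unary_math_f>(address);

    // The C++ compiler generates the actual call according to the platform ABI.
    double result = function(argument);
    dlclose(module);
    return result;
}
```

Older glibc versions may need the program to be linked with -ldl. If the declared type does not match the real function, the call is undefined behaviour, which is exactly the "declared improperly" crash scenario.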
So from this we can see a few things you'll need:
* First of all, your type system needs to be at least somewhat compatible with C's. You can choose to support the type system incompletely (e.g. no structs) but then you won't be able to call any C functions that use those features. If the user still insists on calling such a function by declaring it improperly, the program will likely crash.
* You also need to provide some way to load modules and find symbols in them. Some languages support this without having to write code in functions. For example, in C#:
[DllImport("library", CallingConvention = CallingConvention.Cdecl)]
private static extern double some_function([MarshalAs(UnmanagedType.LPArray)] byte[] argument);
* The user needs to be able to say what the type of an external symbol is.
* Finally, you need to be able to actually call the function. This means: put the stack and CPU registers in the correct state according to the calling convention, jump to the function, interpret the return value and put it somewhere, and restore the stack and CPU registers so the user's code can continue executing.
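One way to picture the "declare the type, then call" part without writing any machine code is a dispatcher that supports a fixed set of signatures and lets the host C++ compiler produce the real call sequence. This is only a sketch: the Value type, the signature strings such as "d(d)", and the functions half and count_chars (stand-ins for symbols found in a loaded library) are all invented for the example. A general FFI would build the call frame for arbitrary signatures, either by emitting the code itself or through a library such as libffi.

```cpp
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <string>
#include <variant>
#include <vector>

// A value as the (hypothetical) language runtime might represent it.
using Value = std::variant<std::int64_t, double, std::string>;

// Stand-ins for C functions that would normally come from a loaded library.
extern "C" double half(double x) { return x / 2.0; }
extern "C" int count_chars(const char *s) { return static_cast<int>(std::strlen(s)); }

// Call a C function through an untyped function pointer. The signature string
// is what the user declared for the external symbol: "d(d)" means
// "returns double, takes double"; "i(s)" means "returns int, takes a C string".
Value call_foreign(void (*fn)(), const std::string &signature, const std::vector<Value> &args) {
    if (signature == "d(d)") {
        auto typed = reinterpret_cast<double (*)(double)>(fn);
        return typed(std::get<double>(args.at(0)));
    }
    if (signature == "i(s)") {
        auto typed = reinterpret_cast<int (*)(const char *)>(fn);
        // Marshal the runtime string into the const char * that C expects.
        const std::string &text = std::get<std::string>(args.at(0));
        return static_cast<std::int64_t>(typed(text.c_str()));
    }
    throw std::invalid_argument("unsupported signature: " + signature);
}
```

std::get throws std::bad_variant_access if the runtime value does not match the declared parameter type, so a mismatch between the declaration and the arguments is caught before the call rather than crashing inside it.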
There are some things that are not clear from your post.

When you say you want to write a compiler, that implies you want to generate executable code. Generating executable code is complex and requires an intimate knowledge of the target architecture (x86, for example). It also requires knowledge of how to call and bind operating system routines.
Do you want to save the executable instructions you generate?
Do you want to be able to bind them into other programs?
Do you want to be able to execute your compiled code from the command line?

Implementing an interpreter is much, much simpler.
With an interpreter, you can define your own instruction formats.
Your compiler and interpreter can execute instructions "on the fly" as the syntax is parsed,
or you can generate a file of instructions which can be loaded and executed at any time.
These defined instructions can be at as high or low a level as you want.

For example, you can have language syntax that says "CLEAR SCREEN".
This generates a "CLEAR SCREEN" opcode in your file. When your interpreter reads the opcode for the "CLEAR SCREEN" instruction, it calls the system API to clear the screen.
I would look at an example of assembly that calls a library -- the old 'the gun' project is a good example (it recreated windows notepad in assembly using calls to the GUI libraries). Most of what you need would be found in a simple example of this. (I am assuming that a compiler generates assembly language, but you could also generate C or C++ and compile the result, many a tool does this).
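If you go the "generate C or C++" route, the translator's share of the FFI work can be as small as emitting a declaration for each external function and leaving the loading and calling to the C++ toolchain. A minimal sketch, assuming the front end has already mapped the language's types to C type names (ExternFunction and emit_declaration are made-up names):

```cpp
#include <cstddef>
#include <string>
#include <vector>

// What the (hypothetical) front end knows about an external function
// after parsing the user's declaration of it.
struct ExternFunction {
    std::string name;
    std::string return_type;                   // already a C type name
    std::vector<std::string> parameter_types;  // already C type names
};

// Emit the C++ declaration that goes into the generated source file.
// extern "C" turns off C++ name mangling so the linker looks for the plain
// C symbol name.
std::string emit_declaration(const ExternFunction &function) {
    std::string text = "extern \"C\" " + function.return_type + " " + function.name + "(";
    for (std::size_t i = 0; i < function.parameter_types.size(); ++i) {
        if (i != 0) text += ", ";
        text += function.parameter_types[i];
    }
    text += ");";
    return text;
}
```

For a function named some_function that takes a const char * and returns double, this emits: extern "C" double some_function(const char *);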

--------
subroutine, function, procedure, and method are all near-synonyms. There are probably a couple more in the family as well.

I learned Pascal as a wee tot, and the rule there was that functions return a value (think math, y=f(x)), but not all jargon users follow that definition.
Just to be clear, I'd like general answers for machine code, since that's the hardest case and solving that will solve any other alternative (so as not to ask in the future, and for others finding this thread).
In reality, I am compiling to c++ and letting the c++ compiler generate the code to load dlls for the time being.
Oh, and I know what a compiler is, and yes, I realize that I'm currently making a translator and not a compiler. I'm not making a runtime interpreter.

Yes, I get that part -the problem is the execution, very specifically, how to use a specific function, for example:
In windows, to set the position of a window via instructions, you need to use "SetWindowPos", inside WinUser.h.
I -obviously- wouldn't be able to include that file, since it's written in c++.
This means that I would have to implement my own way of calling them.
I understand your proposal of making, basically, the equivalent of the c++ standard for this. I think it's a great idea -however I wanted a more general way to call ANY function.

So to sum up the issue -I need a way to call "SetWindowPos" without including any c++ file, even if it is in c++. I do not know how to do that.
I've looked into the file itself and found that the expression expands to:
__declspec(dllimport)
int
__stdcall
SetWindowPos(_In_ HWND hWnd, _In_opt_ HWND hWndInsertAfter, _In_ int X, _In_ int Y, _In_ int cx, _In_ int cy, _In_ UINT uFlags);

What does that code do? I'm assuming it's a function declaration that won't be resolved until runtime, when the dll file is loaded. Is this right?
Is there no need to specify the specific .dll file to load?
Would this work the same if this was a third-party dll? If not, how would I be able to call a function in it?
Would it work the same with libraries in other (compatible) languages?
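For reference, that snippet is only a declaration: __declspec(dllimport) says the function is imported from a DLL, and __stdcall names the calling convention. It does not name the DLL. In a normal C or C++ build that information comes from the import library given to the linker (user32.lib for SetWindowPos), and the Windows loader fills in the address when the program starts. The same lookup can be done explicitly at run time, which is roughly what a compiler's generated code or runtime would have to do. A sketch, with these assumptions: move_window and set_window_pos_f are made-up names, HWND is treated as an opaque pointer, and windows.h is included only for LoadLibraryA, GetProcAddress and the SWP_* constants. On non-Windows builds the function just reports failure.

```cpp
#ifdef _WIN32
#include <windows.h>  // used here for LoadLibraryA, GetProcAddress, SWP_* flags

// Hand-written function pointer type matching the declaration from WinUser.h.
using set_window_pos_f = int(__stdcall *)(void *, void *, int, int, int, int, unsigned int);
#endif

// Hypothetical helper: move a window to (x, y) without relying on the
// SetWindowPos declaration from the header. Returns true on success.
bool move_window(void *window_handle, int x, int y) {
#ifdef _WIN32
    // Name the DLL explicitly instead of relying on an import library.
    HMODULE user32 = LoadLibraryA("user32.dll");
    if (!user32) return false;

    // Look up the exported symbol by name.
    auto address = GetProcAddress(user32, "SetWindowPos");
    if (!address) return false;

    // Tell the type system the real signature and calling convention.
    auto set_window_pos = reinterpret_cast<set_window_pos_f>(address);

    // Keep the current size and Z order; only change the position.
    return set_window_pos(window_handle, nullptr, x, y, 0, 0, SWP_NOSIZE | SWP_NOZORDER) != 0;
#else
    (void)window_handle; (void)x; (void)y;
    return false;  // no user32.dll outside Windows
#endif
}
```

The same approach should work for a third-party DLL, or one written in another language, as long as it exports the symbol under a known name and you declare the matching signature and calling convention yourself.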
Get John Levine's book Linkers and Loaders, ISBN 978-1558604964

It's available online:
http://www.becbapatla.ac.in/cse/naveenv/docs/LL1.pdf