If you could create your own language...

Brace yerself, helios. Nobody uses Bison or Yacc for anything anymore, they use LEMONS!

http://www.hwaci.com/sw/lemon/

Just kidding. Again, xorebxebx, why do you even keep trying?

-Albatross
(2 pages in)
I agree with xorebxebx. I'm just not entirely sure that it's possible for you to take anything he says as constructive criticism instead of troll-baiting. If this is just a language to get used to the concept, then fine, flaws are flaws. This is a test. If you're writing your first OpenGL application, you're not going to write thousands of lines of code to optimize it. But if you're planning on making this language easy to program in, lose the HTML-like syntax.
Just for the fun of it is why I'm doing it, and it's becoming less verbose the more I work with it. It's starting to look more like BASIC ._.
Well, I'm not sure I really want to get involved here, but the topic was interesting, at least ostensibly.

I have been designing a simple interpreted language called aliae (A List Is An Evaluation). It is something of a cross between Tcl and Scheme. There are still some issues to figure out with the REPL, scoping, and some syntax (particularly with quoting), but it looks something like this:

1
2
3
4
5
6
7
8
9
10
11
12
13
function n! n  # Compute the factorial of n
  {
  if (n <= 1)
    then { return 1 }
    else { return (n * (n! (n - 1))) }
  }

variable x
print "Calculate the factorial of a number\n",
      "Please enter a number\n",
      "> "
input x
print "f( ", x, " ) = ", (n! x), "\n"

The basic structure in the language is a list. The basic premise of the language is that a list is an expression to be evaluated. That is, as in Tcl and Scheme, a list is both code and data. Hence, a list is an evaluation.

The elements of a list are separated by whitespace, and elements may themselves be lists (that is, lists are technically s-expressions). Unlike in Scheme, all lists are proper lists. Sublists are delimited by parentheses (either '(' and ')' or '{' and '}', which are equivalent but not interchangeably pairable).

A list is sequentially evaluated, using a lazy, accumulating reduce and transform process. The language allows nice functional constructs: lambdas, currying, proper tail-recursion, and closures in a convenient, almost automatic way. For example, the recursion on line 5 of the example above is recognized as tail-recursive, and handled as such for you. The way the REPL works also makes creating new control constructs surprisingly painless and easy. (Imagine writing your own if control structure!)

A special type of function (or named lambda) is an operator, which is a binary function that can be used in infix notation. There is no concept of precedence, so parentheses (or sublists) must be properly applied (as you can see on line 5 of the example above).
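To make the "no precedence" rule concrete, here is a rough Python sketch (not aliae itself) of evaluating a nested list strictly left to right. The `OPERATORS` table and the `evaluate` name are assumptions made for illustration; the post does not say which operators are built in or how the real evaluator is structured.

```python
import operator

# Hypothetical operator table; the actual aliae built-ins are not given in the post.
OPERATORS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "<=": operator.le,
}

def evaluate(expr):
    """Evaluate a nested list left to right, with no operator precedence."""
    if not isinstance(expr, list):
        return expr  # a plain value evaluates to itself
    result = evaluate(expr[0])
    # Consume (operator, operand) pairs in the order they appear.
    for i in range(1, len(expr), 2):
        op = OPERATORS[expr[i]]
        result = op(result, evaluate(expr[i + 1]))
    return result

print(evaluate([2, "+", 3, "*", 4]))    # flat list: (2 + 3) * 4
print(evaluate([2, "+", [3, "*", 4]]))  # the sublist forces 3 * 4 first
```

The flat list and the list with a sublist give different results, which is why the factorial example has to parenthesize `(n * (n! (n - 1)))`.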

A special operator is the lambda operator, which is expressed as the comma (see lines 9-11 and 13). It is an n-ary operator that applies the next N elements of the list to the previous lambda (instead of the previous result, as binary operators do). The 'print' function is a unary function, so the lambda operator is useful to print more than one thing at once.

Commentary is signalled with a leading hash. It may be whitespace delimited (see line 1) or parenthetically delimited:

    print "Hello ", #( a comment )# "world!" #(don't forget the newline)# ,"\n" #tada!

Identifiers may be composed of any sequence of characters excluding the following:

  ( )  sublist delimiters
  { }  sublist delimiters
  " '  quotes
  [ ]  list splice
  .    scope resolution
  ,    lambda operator
  #    comment
  :    reserved
  ;    reserved
  \    character escape

Also, built-in identifiers (like if and lambda) may not be redefined, unlike in Tcl and Scheme, where they can be.

It was designed to be easy to implement in a small amount of code, and usable in either an embedded context or as a stand-alone program. I was considering using it as part of a series of articles here... Alas for time. (And energy.)

Well, that's it for the moment...
@Duoas,
That looks very interesting. The syntax looks quite tidy.
I had a quick question. Well... actually two.

1. What made you decide on # for comments?
2. Are you planning to implement some functionality that permits you to put some code between a "then" clause and an "if" clause, or is the "then" only there for code readability?

-Albatross
@Albatross: Can you post some example code in your programming language (POOL)?
Ah, POOL (recursive acronym for POOL Object Oriented Language). I'm considering changing the name to CYARA (CYARA's Yet Another Recursive Acronym), but I dunno. I could give a short example, with comments.
#include_searched stdio.pool
main //POOL implements states. main is a keyword that represents a state, specifically the first one run.
{
    state  
    /* state is a keyword that when put inside a main block gives various features for passing args, and the like.
    Most importantly, the state block/object determines the flow of the program.
    By default, a state block/object in a state is run first.*/
    {
        this.next -> extern.printer.public.print_queue.run_all;
        /* -> and <- can be a bit misleading. One can think of them as copying operators.
        Most objects have a boolean variable that if changed to true will start some chain of events in the object. 
        However, all objects have a member, "next", that pauses the object when copied from.
        The this keyword indicates you're accessing a local member of an object.
        The extern keyword indicates you're accessing the member of another object.*/
    }
    //Something worth noting is that an object can be called before it's declared.
    console::out printer
    {
        this.protected.print_queue.0 <- "GTFO WORLD!!!";
        //"Protected" in POOL is different than protected in C++.
        //Protected here indicates you're accessing an object that can only be replaced on the object's construction or by an object's friend.
        extern.state.exit <- 0;
    }
}


With a minimum amount of fancy formatting, the hello world program is 7 lines long. It drops to 6 if I move the line containing this.next onto the same line as the state keyword, and to 5 if I delete the extern.state.exit <- 0; line, but that last change could result in a memory leak.

Although I'm writing the compiler already, some of POOL's semantics and its library are still a work in progress.
© Albatross 2010. All rights reserved.

-Albatross
.. Recursive .. acronyms?.. ...
*head explodes*

:P
POOL? That's pretty COOL.
@Albatross: What are the differences between states and functions?

COOL: is this an acronym of Chrisname's Object Oriented Language? Hahaha
Excellent question, Null.

States in POOL have no return value aside from an integer upon program termination. States can also inherit structures (collections of objects) from other states, and if a state reaches its "end" without calling another state, the program terminates. Technically, one could use them as functions; however, I had different ideas for the calling and definition of subroutines that could be thought of as "one step" (still working out the syntax for the definition part).

Mostly the idea behind states was the ability to radically and easily change the way a program works if necessary.

-Albatross
1. What made you decide on # for comments?

It is a common comment marker, and it works well with shell script magic.

2. Are you planning to implement some functionality that permits you to put some code between a "then" clause and an "if" clause, or is the "then" only there for code readability?

I prefer the readability. (I'm a Pascal man through and through.) I may make the word optional, as it is in Tcl, just so that people familiar with C and C++ can have their way too. Nevertheless, it makes for more readable and organized code and incurs no real penalty on the programmer.

It comes down to a difference in philosophy about code flow. C and C++ people tend to consider the if condition and the else branch to have equal prominence. Algol and Pascal people tend to consider both the then and else branches to be subordinate to the if test.

Making it an optional keyword has ramifications for the REPL... (particularly with currying/lambda application). It is one of the things I still have to iron out about the language.


My current idea is that each "word" (list element) is repeatedly evaluated until it cannot be further evaluated (either it is a value or a lambda). If it is a lambda, then the appropriate number of arguments are applied (as available -- possibly currying). Once done, the lambda is evaluated. Arguments are applied with lazy evaluation -- meaning that an argument value may not be evaluated at all if not used in the lambda expression.
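A rough Python sketch of that lazy-argument idea, using thunks. None of these names (`Thunk`, `my_if`, `traced`) come from aliae; they are invented here to show how a user-defined "if" can skip the branch it does not take.

```python
class Thunk:
    """A delayed computation, evaluated at most once and only on demand."""

    def __init__(self, compute):
        self._compute = compute
        self._done = False
        self._value = None

    def force(self):
        if not self._done:
            self._value = self._compute()
            self._done = True
        return self._value

def my_if(condition, then_branch, else_branch):
    """A user-defined 'if': only the chosen branch is ever forced."""
    return then_branch.force() if condition.force() else else_branch.force()

evaluated = []  # records which arguments were actually computed

def traced(name, value):
    """Wrap a value in a thunk that logs its name when forced."""
    def compute():
        evaluated.append(name)
        return value
    return Thunk(compute)

result = my_if(traced("cond", True), traced("then", 1), traced("else", 2))
print(result, evaluated)  # the "else" argument never appears in the log
```

Evaluating every argument up front, by contrast, would put all three names in the log.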

Suppose someone declares a variable named "then". Clearly this is a problem. (I could simply forbid variables having "reserved" names, but this doesn't solve user-defined control constructs.)

The other option is simply to evaluate all arguments, in order, before applying them to the lambda, but this defeats lazy evaluation...

Heh...
@Duoas,
I think the reason so many scripting languages use '#' for comments is because of the '#!' that starts many shell scripts, to tell the shell what interpreter to call:
#!/usr/bin/env aliae
You got it backwards. #! was introduced because # was used for comments.
Oh, really? Never mind, then.
Just a quick question about those shell scripts, since I have only tested them on Ubuntu so far: do they work on Windows, Mac OS, and other Linux distros too?
Anything that uses a shell based on the Bourne Shell (sh), which includes bash, ksh, zsh and I think [t]csh although they aren't strictly sh-based.

Edit: Can anyone point me to a tutorial for writing a recursive descent parser? I want to write a class that handles summations in the format
<value>
   Σ  <expression>
<variable> = <value>

e.g.
  4
  Σ 2n
n = 0

The only bit I want to parse is the expression (e.g. 2n) because the rest can be passed as normal variables. The only part I haven't written yet is the only part that's more than one line long, and that's the parser. I've looked at the Wikipedia page, but the code is not commented at all, so I can't really tell what's happening. I don't really want to use a generator, I'd rather write it myself.
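For what it's worth, a hand-written recursive descent parser for that kind of expression can be fairly short. The Python sketch below is only one possible layout: the grammar, the `Parser` class, and the `summation` helper are assumptions chosen for illustration. It handles integers, single-letter variables, + - * /, parentheses, and implicit multiplication such as 2n, but not unary minus.

```python
import re

TOKEN = re.compile(r"\s*(\d+|[A-Za-z]|[-+*/()])")

def tokenize(text):
    """Split the expression into numbers, single-letter variables, and symbols."""
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if not match:
            raise SyntaxError(f"unexpected character at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens

class Parser:
    """One method per grammar rule:
    expr   := term (('+' | '-') term)*
    term   := factor (['*' | '/'] factor)*
    factor := NUMBER | VARIABLE | '(' expr ')'
    """

    def __init__(self, tokens, env):
        self.tokens, self.pos, self.env = tokens, 0, env

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self):
        token = self.peek()
        self.pos += 1
        return token

    def expr(self):
        value = self.term()
        while self.peek() in ("+", "-"):
            if self.advance() == "+":
                value += self.term()
            else:
                value -= self.term()
        return value

    def term(self):
        value = self.factor()
        while True:
            token = self.peek()
            if token in ("*", "/"):
                self.advance()
                rhs = self.factor()
                value = value * rhs if token == "*" else value / rhs
            elif token is not None and (token.isalnum() or token == "("):
                value *= self.factor()  # implicit multiplication, as in "2n"
            else:
                return value

    def factor(self):
        token = self.advance()
        if token is None:
            raise SyntaxError("unexpected end of input")
        if token.isdigit():
            return int(token)
        if token.isalpha():
            return self.env[token]  # look up the variable's current value
        if token == "(":
            value = self.expr()
            if self.advance() != ")":
                raise SyntaxError("expected ')'")
            return value
        raise SyntaxError(f"unexpected token {token!r}")

def summation(expression, variable, start, stop):
    """Sum the expression for variable = start .. stop (inclusive)."""
    tokens = tokenize(expression)
    total = 0
    for i in range(start, stop + 1):
        parser = Parser(tokens, {variable: i})
        total += parser.expr()
        if parser.peek() is not None:
            raise SyntaxError(f"unexpected token {parser.peek()!r}")
    return total

print(summation("2n", "n", 0, 4))  # 0 + 2 + 4 + 6 + 8
```

Each grammar rule maps to one method, and the call chain expr -> term -> factor is what gives * and / a tighter binding than + and -. Re-parsing the tokens on every iteration is wasteful but keeps the sketch small; building a tree once and evaluating it repeatedly would be the usual next step.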
But that syntax is terrible both to parse and to use. Why not something more sensible like sum(i=1,n,2*i)?
Helios, is there any language here that seems like one you would like to use? If not, can you give us an example of a nonexistent language that you would like to use?

-Albatross