This summer I’m taking a short break from grad school and working as a research intern at Mozilla. Awesome experience so far, and it’s great to be back working on open source projects! I don’t think I’ve ever cloned so many GitHub repos in such a short time.
What’s new with LLJS?
We recently added arrays, with both statically stack-allocated and
dynamically heap-allocated variants. These resemble C and C++ array
syntax, respectively. I also added
union types, to complement our
struct types. These behave precisely as you would expect
from C. Recently I added syntax for defining structs and unions inside
other structs or unions, which significantly simplifies writing complex data layouts.
I also worked on optimizing our malloc implementation, which is now much faster. We use a naive malloc algorithm from the K&R book, but since we don’t actually need to worry about paging, I modified it so that all memory lives on a single page. This cut out the function calls for allocating new pages, speeding up allocation quite a bit.
My mentor here at Mozilla, Michael Bebenita, added functions and optional constructors to structs, so LLJS structs are beginning to look a lot more like C++ structs than C structs.
Finally, perhaps the most exciting new thing in LLJS is memory checking! Tim Disney, another intern here at Mozilla, implemented Valgrind style memory checking for LLJS. It can currently detect the most common memory errors: use after free, uninitialized reads, double frees, and memory leaks.
Esprima in LLJS
As part of another project, I needed super fast JS parsing in JS. Since I was already working with LLJS, I figured I might as well try it out on a larger scale and see if I couldn’t get Esprima ported over to LLJS and using structs for the generated AST. I hoped this would make an already fast parser even faster and leaner by using manual memory allocation.
Well, over 4000 lines of LLJS later, I finished the port. String handling is somewhat inefficient (it creates a new C string for every string in the program), but it works! Check out the sources on GitHub if you’re interested.
So, was it faster? Well… not really. As it turns out, modern JS engines are very good at allocating objects, so manual allocation does not appear to be faster than the engine’s object allocation. The LLJS version does use about 10% less memory when parsing large JS sources, so that’s a win. From some preliminary testing, traversing the AST appears to be faster with LLJS, since property access is fast, but this comes at the cost of making traversal code harder to write.
Where I think we can definitely win, though, is code that allocates and frees memory very often. Manual memory reuse and freeing could provide speedups over engine garbage collection for some applications (note: I have yet to test this…).
What’s next?
At the beginning of the summer I started off adding source map support to LLJS, so that debug tools in the browser will show the corresponding LLJS sources instead of the compiled JS sources. This is mostly implemented in escodegen, which we use to generate the JS code from a rewritten AST. It got a little bogged down by a bug in the Chrome DevTools (sources from a data-URI source map aren’t loaded), so I haven’t finished it off yet. Pretty much all that’s left to do there is testing, though.
As far as language features go, I will definitely be adding support
for enum types as soon as I get the chance. We are also looking at
implementing bit fields
as a convenience for bit packing in structs.
If you have more ideas for where LLJS should head, please open an issue on GitHub or (even better) toss us a pull request. Most importantly though, go try it out! If you find it helpful (or frustrating), let us know.