Sunday, November 1, 2015

SSL Ciphers

I just checked in support for doing encryption/decryption through OpenSSL.  Crack provides a wrapper library (crack.crypt.ssl.cipher) that lets you encrypt or decrypt through a writer.  For example, to encrypt:

     import crack.crypt.ssl.cipher EVP_aes_256_cbc, EncryptWriter;

     # 'key' (the encryption key) and 'src' (a Reader for the plaintext)
     # are assumed to be defined elsewhere.
     backing := (cwd/'outfile').writer();
     out := EncryptWriter(EVP_aes_256_cbc(), key, backing);
     for (data := src.read(1024); data; data = src.read(1024))
         out.write(data);
     out.close();   # You must explicitly close (or let the writer go out of scope).
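Decryption is symmetric.  A minimal sketch, assuming the library provides a corresponding DecryptWriter class with the same constructor signature as EncryptWriter:

```crack
import crack.crypt.ssl.cipher EVP_aes_256_cbc, DecryptWriter;

# Wrap a writer for the plaintext output; everything written to the
# DecryptWriter is decrypted and forwarded to the backing writer.
# 'key' and 'ciphertext' are assumed to be defined elsewhere; 'key'
# must be the same key used to encrypt.
plain := (cwd/'plainfile').writer();
dec := DecryptWriter(EVP_aes_256_cbc(), key, plain);
dec.write(ciphertext);
dec.close();
```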

There were a few ciphers (specifically the AES CCM and GCM ciphers) that didn't pass a round-trip test.  These require some special setup that I don't feel motivated to figure out.

Wednesday, July 22, 2015

Crack 0.11 Released

I am pleased to announce the release of Crack 0.11.  Enhancements include:

  • Allowed importing crack.compiler outside of an annotation.
  • Fixed casting: we can now correctly cast from secondary bases and even cast across bases.
  • Added crack.protobuf.ann, which provides an annotation that expands inline protobuf message definitions into generated code.
  • Streamlined generic serialization (only serialize the file name and line number when they change).
  • Enhanced the jack and alsa modules, and made the alsa extension more OO.
  • Lots of smaller fixes and enhancements.


Thursday, July 16, 2015

Moving to github

Sadly, googlecode is shutting down and we needed to find a new home for Crack.

I've been a long-time fan of Mercurial (I think I was using it before git existed), but I seem to be in the minority.  Most Crack contributors appear to prefer git, and a huge community has developed around github.  So, with our current project hosting going away, this seemed like a good time to convert entirely to git and relocate to github.

In truth, I've been running the project out of git for over two months now.  I migrate changes to the mercurial repo on googlecode whenever I push, and I expect I will continue to do so for the foreseeable future, but the repository on github (crack-lang/crack) should now be regarded as canonical.  Patches to the codebase should be sent as pull requests or as patches generated by "git format-patch".  We will no longer accept requests to pull from a mercurial repository.

Hopefully we'll get around to replacing our home page before googlecode disappears entirely.

Friday, July 10, 2015

Protobuf Annotations

We've had a module to deal with protobuf wire protocols for a while now (crack.protobuf), but you've always had to code your message serialization by hand.  Not any more!

I've just checked in crack.protobuf.ann, an annotation that lets you write protobuf message definitions inline and generates the appropriate code.  So, for example, this:

 @protobuf {
    # Uncomment "debug" to emit the generated code to standard output with
    # line numbers.
    # debug

    message Bar {
        repeated string name = 1;
    }
 }
Generates this:

class Bar @impl Message {
     Array[String] name = {};

     oper init() {}

     void serialize(ProtoWriter dst) {
         for (item :in name)
             dst.write(1, item);
     }

     void addField(Field field) {
         if (field.id == 1)
             name.append(field.getString());
     }

     int cmp(Bar other) {
         int rc;
         rc = cmp(name, other.name);
         return rc;
     }

     int cmp(Object other) {
         if (o := Bar.cast(other))
             return cmp(o);
         else
             return Object.cmp(other);
     }

     uint makeHashVal() {
         return makeHashVal(name);
     }

     @static Bar makeFromField(Field field) {
         Bar inst = {};
         inst.addField(field);
         return inst;
     }
}
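For context, here's a sketch of how the generated class might be used.  The ProtoWriter constructor signature is an assumption here; check crack.protobuf for the actual API:

```crack
import crack.io StringWriter;
import crack.protobuf ProtoWriter;

Bar b = {};
b.name.append('alice');
b.name.append('bob');

# Serialize the message to an in-memory buffer.
dst := StringWriter();
b.serialize(ProtoWriter(dst));
# dst now holds the wire-format encoding of the message.
```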


As protobuf generation goes, this isn't all that great.  For one thing, it would be much better if we could generate Crack code from the proto compiler, the way every other language does, so we could share the same .proto files.  The implementation is also very incomplete: it only supports the int32, int64, string and bytes types (and int64 is currently broken), and it doesn't support any protobuf syntax that is even slightly advanced.  That said, it still beats coding the serialization by hand.

Monday, March 30, 2015

Crack 0.10 Released

We're pleased to announce the release of crack 0.10.

It has been over a year since the release of version 0.9, way longer than we had intended.  But we were determined to get caching working reasonably well, and this has been a difficult feature to get right.

Fixing the bugs in caching required a major overhaul of many of the internals of the executor.  As a result, we now have a much cleaner, much more reliable codebase.

The new release is also generally much faster.  Caching produces an observable speedup, but even more significant are the gains from deferred JITting (see Fast from late last year).

Caching is now enabled by default.  Though we know of no bugs in it, I expect that there are still a few.  If you suspect problems, try disabling it with the -K option or by setting the environment variable CRACK_CACHING to "false".  Providing a snapshot of your .crack/cache directory will help us debug the problem.

Enjoy the new release, and happy cracking :-)

Wednesday, February 18, 2015

I just pushed a change that fixes a long-standing problem with interfaces.

Interfaces are implemented as ordinary annotations.  When the @interface annotation is used, it expands into a definition of a class derived from VTableBase.  This lets us mix the interface in with arbitrary base classes and also lets us define a class that implements multiple interfaces.  (Diamond inheritance, where a class inherits from multiple base classes with a common ancestor, is not allowed in Crack except through VTableBase, so interfaces cannot be derived from Object, as classes are by default.)

We could just derive from VTableBase, except that we still need the reference counting mechanism defined in Object.  For example:

    class A {}
    class B : VTableBase {}
    class C : A, B {}
    C c = {};
    B b = c;
    c = null;

In the example above, when 'c' is assigned null, the reference count of its object drops to 0 and the object is deleted, because B doesn't have bind and release operators or a reference count.  'b' now points to unmanaged memory.

Part of what @interface does is to generate bind and release operators that delegate to Object.oper bind() and Object.oper release().  But to be able to use these, we need a way to convert the interface to the Object that its implementation class will be based on.
As a hack, the code used to generate a special method: _iface_get$(Interface)Object() (where $(Interface) is the name of the interface class).  This was abstract in the interface and defined in the implementation to simply "return this", allowing us to do the conversion.  But _iface_get* had some problems because it was based on the simple class name:
  • Aliasing interfaces didn't work, because "@implements Alias" caused _iface_getAliasObject() methods to be generated instead of _iface_getRealClassNameObject() methods.
  • You couldn't implement two different interfaces with the same simple name, for example Functor1[void, int] and Functor1[void, String] (for generics, we only used the name of the generic class, not the instantiation class).
The solution to this is the special "oper from ClassName" operator. Like "oper to", "oper from" is implemented in the parser and has full knowledge of the type that is specified. Therefore, it can implement the operator based on the canonical name of the type, solving the problems above.
I originally considered creating a more complete implementation of virtual base classes, but with "oper from", I no longer see a use case for it. I think that everything that you can do with a virtual base class you can also do with "oper from" and annotations.
Of course, I'm open to a counterexample.
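For reference, here's what using interfaces looks like in practice.  This is a minimal sketch assuming the @interface and @implements annotations exported by crack.ann; the exact module contents may differ:

```crack
import crack.io cout;
@import crack.ann interface, implements;

@interface Greeter {
    @abstract void greet();
}

# Impl derives from Object (the default), so it carries the reference
# count; @implements mixes in the Greeter interface, whose generated
# bind/release operators delegate to Object via "oper from Greeter".
class Impl @implements Greeter {
    void greet() { cout `hello\n`; }
}

Greeter g = Impl();  # safe: the reference is counted through the interface
g.greet();
```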

Thursday, August 7, 2014


Fast

A long-standing goal of the Crack language has been to produce a system that is fast.  Specifically, we wanted to minimize the code-test cycle.  The language has thus far fallen short in this respect: startup times for anything but very small programs have tended to be over five seconds.

We were never able to obtain any solid conclusions from profiling the code, but we knew there was a lot to be gained from caching.  So over two years ago we started work on caching.  Needless to say, this was a harder problem than we imagined.

Caching is simply storing the artifacts of a successful compile so that we don't have to produce them again.  In theory, the work involved in fully parsing and validating the source, and in generating and optimizing LLVM IR, is greater than the effort of simply deserializing those structures.

Getting it to work right has been very difficult, largely because of a number of shortcuts and bad design choices that I made while producing the original compiler.  My first reaction to this was, of course, to try to adapt caching to the existing code.  However, when these attempts failed I did finally do the right thing and make the big changes to the design that were necessary to get caching to work.

And now, finally, it does.  I can run code, make changes, and run again with the most complicated programs that I have and everything just works.  The modules for the changed code and all of their dependencies get rebuilt, everything else gets loaded from the cache.  Sweet!  (I'm sure there are still some bugs in there somewhere...)

So I tried this out against the most complicated program that we have -- the "wurld" game engine demo in 'examples'.  Without any cached files, it takes 12 seconds to build on my laptop.  After caching, it takes ... 9 seconds.

9 seconds?  Really?  A 25% reduction after all of that work?  That's depressing.  So I did some more benchmarking.

One of the results of the restructuring work required for caching is that all of the jitting of a cached program now happens at the end, just before the program is run and after all of the modules are loaded from the cache.  Originally we jitted every module when we finished loading it.  The new structure makes it much easier to see how much time we're spending on jitting versus everything else.

It turns out, we were spending most of that time, 7-8 seconds of it, on jitting.

We were storing the addresses of all of our functions before running everything.  We need function addresses in order to generate source file and function names when reporting stack traces.  But in the course of looking all of these addresses up, we were also generating all of the functions in the program, whether we needed them or not.

LLVM's ExecutionEngine lets you watch for notifications for a number of different internal events.  In particular, it lets you get a notification when a function gets jitted. So I replaced the registration loop with a notification callback.  Now instead of function registration driving jitting, jitting drives function registration, and only for the functions that are used by the program.  This got us down to about 5 seconds total runtime.

The ExecutionEngine also lets you enable "lazy jitting."  By default, when you request a function address, that function and all of its dependencies get jitted.  By contrast, lazy jitting defers the jitting of a function until it is actually called.  Enabling this feature brought the total run time down to under 2.5 seconds.  (The hacked version of "wurld" I was using for benchmarking loads everything and does initialization but then terminates immediately, so much of the code is never called anyway.)

I'm not sure if there's anything more we can do to improve on this, but 2.5 seconds is well in the realm of tolerable.

It's somewhat embarrassing that the huge effort of caching initially yielded only a small share of the gains, while an hour or two of investigation and simple tweaking had such a big impact.  But on the other hand, the simple tweaking couldn't have worked without some of the changes we put in for caching.  And as it turns out, post-tweaks, caching ends up saving us 60% of the startup time, which is huge.

Going forward, after we release 1.0, I still want to start to experiment with alternate backends.  libjit, in particular, looks like a promising alternative to LLVM for our JIT backend.  But as it stands, in terms of performance, I think we're in pretty good shape for a 1.0 release.