Thursday, June 25, 2015

Killing Off Wasabi

Jacob Krall (comments):

For the first time since FogBugz’ creation some 15 years ago, it contained 0 lines of VBScript or Wasabi code. The ‘End‘ of an era.

For those of you that don’t know, Wasabi was our own in-house .NET compiler. In this post, I’m going give you the background of Wasabi, including why we created it and what it enabled us to do, before giving you the inside scoop on why we’ve now killed it off. In a follow-up post, we’ll talk about some parts of Wasabi that we’ll be open-sourcing.

[…]

In 2011, Ted Unangst proposed “Project Soy Sauce”, a combination of .NET Reflector and Perl scripts to turn the Wasabi binary output into nearly-readable C# source code. This time, we finally succeeded, because of Roslyn, the newly open-sourced .NET Compiler Platform.

[…]

Building an in-house compiler rarely makes sense. It is much more economical to lean heavily on other people’s work and use standard, open technology. You must be prepared to not only create your own universe but to then nurture it long-term. We weren’t ignorant of this when the project began, but it’s much easier to switch languages at the beginning of a project than halfway through. A homegrown compiler might make sense when trying to port to a new platform, which Wasabi helped us do several times.

Update (2015-07-12): tptacek:

Quick rant: if this had been a Lisp project, and they’d accomplished this stuff by writing macros, we’d be talking this up as a case study for why Lisp is awesome. But instead because they started from unequivocally terrible languages and added the features with parsers and syntax trees and codegen, the whole project is heresy.

gecko:

Let me start by saying that Wasabi as a strategic move was brilliant. If David disagrees there, I’m a bit surprised: FogBugz represented an awful lot of battle-tested low-bug code, and finding a way to preserve it, instead of rewriting it, made one hell of a lot of sense. I’m with you that the general thoughts in this forum that we’d have to be insane to write a compiler are misguided. Wasabi let us cleanly move from VScript and ASP 3 to .NET without doing a full rewrite, and I’d be proud to work at a place that would make the same decision in the same context with full hindsight today.

That said, I think Wasabi made two technical decisions that I disagreed with at the time and still disagree in with in retrospect. First, Wasabi was designed to be cross-platform, but targeted .NET prior to Microsoft open-sourcing everything and Mono actually being a sane server target. At the time, I thought Wasabi should’ve targeted the JVM, and I still think in retrospect that would’ve been a much better business decision. I really prefer .NET over Java in general, but I know that it caused us an unbelievable amount of pain back in the day on Unix systems, and I think we could’ve avoided most of that by targeting the JVM instead. Instead, a significant portion of "Wasabi" work was actually spent maintaining our own fork of Mono that was customized to run FogBugz.

Second, Wasabi worked by compiling to C# as an intermediary language. There was a actually an attempt to go straight to IL early on, but it was rejected by most of the team as being a more dangerous option, in the sense that maybe three people on staff spoke IL, whereas pretty much everyone could read C#. I also think this was a mistake: the C# code was not human-readable, made debugging more complicated (VS.NET had something similar to source maps at the time, so it wasn’t impossible, but it was very indirect and quirky for reasons I can get into if people are curious), and that decision meant that Wasabi had all of the limitations both of its own compiler, and of Microsoft’s C# compiler. IMHO, these limitations are a big part of why the ultimate move away from Wasabi was even necessary in the first place, since they increased both the maintenance and developer burden.

So from my own perspective, I think that Wasabi was a mistake in that, if we were going to go to C#, we should’ve just got the translation good enough to really go to C# and then ditch Wasabi; and if we weren’t, we should’ve actually owned what we were doing and written a genuine direct-to-IL compiler so we’d have more control over the experience, instead of going through C#. But I still really do genuinely believe that our going to Wasabi was a brilliant strategic decision, and I think Fog Creek would have suffered immeasurably had we not done it.

Update (2015-07-16): Jacob Krall (comments):

I am also releasing Wasabi’s new code generator as open source software. I believe this is the first real-world transpiler targeting C# with Roslyn to be open sourced. It is my hope that this article and associated source code will serve as a rough guidepost for anyone who wants to write a .NET code generator using Roslyn.

[…]

We anticipated that CodeDOM would provide some level of static typing to our code generation process, as our previous generators always included a period of fixing obvious typographical errors (a missing semicolon here, an extra open-bracket there). CodeDOM also exposed a handy method for compiling directly to C#. However, we were slightly disappointed when we discovered that CodeDOM simply writes to disk and shells out to csc.exe. Another irritation was that word “common” I included in my description of CodeDOM: it is a lowest-common-denominator tool. Basically, the output was intended for a compiler, not human consumption.

[…]

CodeDOM’s representation is a mutable graph, so I would often pass node references as parameters and mutate them in other methods. Roslyn represents source code as an immutable graph of nodes, so the correct way to generate a given node is from the bottom up. However, because I was trying to modify a copy of the CodeDOM generator into shape, I had to change the calling convention of some of the methods which depended on mutation. There are many convenience methods in Roslyn to allow you to create a mutated copy of your node, but it wasn’t always clear where the result should be stored.

[…]

One complaint I have about Roslyn is with the SyntaxKind type – for performance reasons, it is implemented as a gigantic enum. However, this means it is not type safe – you can pass any SyntaxKind to any method that takes one, even where doing so will cause an inevitable crash. I ended up having to fix dozens of cases where I passed the wrong SyntaxKind (for example, passing a QuestionToken where I meant to pass a ConditionalExpression), and Roslyn would throw an exception.

vog:

Although slightly off-topic, I find the mentioned idea of “syntax trivia” very compelling. This should not only be part of code parsers, but of any parsers.

2 Comments RSS · Twitter

Leave a Comment