Quixotic, Exegetic Programming

Saturday, June 17, 2006

Sending JNI to reform school

Listened to the discussion about JNI on the JavaPosse today, 2-Jun-06. (Incidentally, I'm using the Posse as a means of making myself blog. I send them an email on their topic, then I expand it to be a blog entry. This seems to be working in terms of making me do the writing, so I think I'm going to try to keep it up.)

Anyway, it was a good discussion on the podcast, but there was not a lot of meat on any of the various ways in which JNI could be reformed to be more usable, just the suggestion that it should be possible. I think I have some ideas and that's what this post is about.

This topic is especially interesting to me because for the past several years at work, I've had to deal with this one third-party package that is unmanaged native code. In fact, it is the worst code I've ever seen. AFAICT, it was written in C in either the late '80s or early '90s to support a very particular business which has absolutely mushroomed over the years. So this ancient code is supporting some complex products that had not even been invented when it was written and it's grown into a real mess. To give you the flavor of what I'm talking about here, the entire API is contained in one header file that is more than 100 pages long when printed. All access to the API begins with one struct definition which takes up probably 10 of those pages and which wraps arrays of pointers to the structs and #define's which make up the rest. In fairness to the firm which makes this thing, they do wrap this API with another simplified API that most people use. Unfortunately, my particular use requires dealing with the nastiness of the original.

Following the standard Java industry practice, I wrapped all of this code in a JNI layer. It's ugly, it's impossible to debug and getting ant to build it correctly on both Windows and Linux is like having a root canal. Having spent time with JNI before, I just thought that there was nothing that could be done to fix all this. But I now know that I was wrong. JNI takes a very naive, brute force approach to the problem of native code and there are better ways.

Post-my-recent-job-change, I have had to use the same library, only doing the wrapping and invocation from C# running in the CLR (this library is a standard in my industry and understanding it at a deep level makes one extremely valuable in certain organizations). For this particular problem (accessing native code from managed code) Microsoft has addressed the issue in a much nicer way than Java does with JNI. Basically they have created a language which looks a lot like C++ but which executes inside the CLR. They call this C++.NET, presumably because it's _NOT_ C++, it only looks a lot like it.

Here're the big differences between C++.NET and standard C++ that I have seen:

a) no multiple inheritance, C++.NET uses single inheritance and interfaces since, like the JVM, that's what the CLR supports.
b) GC operates on C++.NET objects, but keywords have been added to defeat GC where necessary (like when you have a pointer to a C struct in native code).
c) Loads of new semantics necessary to deal with a and b.
d) C++.NET objects and their methods are directly accessible from C# and VB in much the same way that Java and Groovy objects can directly communicate, i.e. you just make the call directly without any intermediate dispatch.

So why is this better than JNI? First, because you write managed code in one language only, rather than having your code split between two languages (with one of them name-mangled) as with JNI. C++.NET directly parses and imports the headers (or rather header in this case) for the native code and it allows you to access the functions defined there directly in the managed language. Second, writing tests becomes _way_ easier, as you can test your code directly rather than having to go through the name mangled interface. Third, the standard debuggers work over your code which exercises the external native library (they don't work over the third party code since that has no debug symbols anyway, but that's something you can live with). And fourth, all of this naturally fits into your IDE. If you have ever had to debug JNI code, you know exactly what I'm talking about here.
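To make the "name-mangled" half concrete, here is a minimal sketch of the convention JNI uses to pair a Java native method with its C function. The class JniNames, the package com.example.vendor and the method open_session are all made up for illustration. The escaping rules are the short-name rules from the JNI specification as I understand them; overloaded native methods get a longer name with the argument signature appended, which this sketch ignores.

```java
final class JniNames {

    // Escape one name the way JNI does when it builds a C symbol.
    static String mangle(String name) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < name.length(); i++) {
            char c = name.charAt(i);
            if (c == '.' || c == '/') {
                out.append('_');            // package separator
            } else if (c == '_') {
                out.append("_1");           // a literal underscore
            } else if (c == ';') {
                out.append("_2");
            } else if (c == '[') {
                out.append("_3");
            } else if ((c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
                    || (c >= '0' && c <= '9')) {
                out.append(c);              // plain ASCII alphanumerics pass through
            } else {
                // everything else becomes _0 followed by four lowercase hex digits
                out.append(String.format("_0%04x", (int) c));
            }
        }
        return out.toString();
    }

    // The C function name the JVM looks for when resolving a native method.
    static String nativeSymbol(String className, String methodName) {
        return "Java_" + mangle(className) + "_" + mangle(methodName);
    }

    public static void main(String[] args) {
        // prints Java_com_example_vendor_PricingLib_open_1session
        System.out.println(nativeSymbol("com.example.vendor.PricingLib", "open_session"));
    }
}
```

So a single underscore in a Java method name already gives you a C function name nobody would choose by hand, and every test or debugging session on the native side has to go through names like that.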

Now, the C++.NET language is ugly, but so are C and C++, so MS's solution just seems reasonable. Adding similar functionality to the JVM seems especially reasonable when you consider that:

a) the Sun tools group that does C++ is moving to the NetBeans platform,
b) Sun has a new emphasis on other languages running in the JVM and
c) .NET _is_ the competition for Java after all.

Someone from Sun recently made the remark to me that internally Sun is divided into "C people" and "Java people" and that this is not an official division, but rather a cultural one. So, I'd like to see Sun work on a similar language to C++.NET as one way of ending that cultural divide. They have the skills internally, they have the tools and they have the ability to experiment with JVM enhancements that could support this. And they're behind on this particular technology which, IMHO, will be needed to move the world out of running unmanaged code.

Saturday, May 27, 2006

How is Gilad Bracha like Herbert von Karajan?

This was originally an email I sent to the JavaPosse after listening to an episode of theirs which got me thinking about what I'd like to see from the JVM in the future. But I've decided to revise and extend those remarks here to help me clarify my thinking on the topic.

So here's what I meant by the title of this post:

A friend and I were once discussing the classic Deutsche Grammophon recording of von Karajan conducting the Berlin Philharmonic in Beethoven's Ninth. I said to my friend (parroting reviews that I had read) how great that music was. My friend replied: "I know that Herbert von Karajan knows more about the Ninth than I can ever hope to, but still... that's the most ponderously slow performance of that piece that I have ever heard".

I know that Gilad Bracha knows more about the inner workings of the JVM and the JLS than I can ever hope to, but still...

As a little background: I attended Gilad's talk at JavaOne this year on supporting dynamic languages in the JVM. (You'll need to log in as contentbuilder/doc789 when prompted if you follow that link.)

Gilad has started a JSR which he anticipates will propose that a new instruction be added to the JVM to support dynamic invocation of methods, i.e. an instruction which won't require the heavy type checking that the current invoke* instructions do. The notion is that this will improve the performance of other languages which target the JVM as a runtime environment but which, unlike Java, make use of dynamic typing. The goal and desired outcome seem quite reasonable, and during the talk I found out things about the JVM that either I didn't know or had forgotten long ago. But, unlike most J1 sessions, the Q&A session which followed fascinated me almost as much as the talk.
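As a rough illustration of the problem: when a language runtime doesn't know the static type of a receiver, it can't emit an ordinary typed call, so it typically falls back on looking the method up by name at run time. The sketch below does that with plain reflection. DynamicCall, Duck and Robot are hypothetical names, and this is only my guess at the general shape of the workaround, not how any particular language implementation does it.

```java
import java.lang.reflect.Method;

final class DynamicCall {

    // Two unrelated classes that happen to share a method name.
    static final class Duck {
        public String speak() { return "Quack"; }
    }

    static final class Robot {
        public String speak() { return "Beep"; }
    }

    // Invoke a public no-argument method by name on any receiver.
    // The lookup happens on every call, which is the cost a dedicated
    // dynamic-invocation instruction is meant to avoid.
    static Object send(Object receiver, String methodName) {
        try {
            Method m = receiver.getClass().getMethod(methodName);
            return m.invoke(receiver);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("cannot send " + methodName, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(send(new Duck(), "speak"));   // prints Quack
        System.out.println(send(new Robot(), "speak"));  // prints Beep
    }
}
```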

The first questioner was Corky Cartwright. Full disclosure: I had spoken with Corky (whom I know, like and respect) prior to Gilad's talk and had agreed with him that getting tail calls supported in the JVM was of prime importance if functional languages were to be first class citizens in the Republic of Java. So when I saw Corky approach the microphone, I knew he was coming loaded for bear.

What followed was (for me anyway) the wonderful spectacle of two passionate professors engaging in a theological debate on the value of particular JVM features with each wanting their particular esoteric feature added. (Gilad's title at Sun after all is "Computational Theologist".) Basically, folks in attendance were getting a grad school seminar without paying tuition. It was wasted on most of the audience. Needless to say, I expect that it's Gilad's esoteric feature that will make the cut for Dolphin, while Corky's is viewed as being too complex.

After the presentation Gilad wrote a blog entry on being surprised that no one had brought up continuations as a desirable feature. That's the action that motivated me to write this blog entry. Basically, I'm arguing that invokedynamic, tail calls, and continuations should all get serious consideration for support in Dolphin, and that if Sun can't/won't do that for fear of breaking the JVM, open sourcing Java would allow others to do it and see the effect.

First, you may want to review this discussion of continuations from the Lambda the Ultimate site. (LtU is one of my favorite sites). I think it says what needs to be said about continuations in the briefest manner:

A continuation *is* a closure - there's no difference, conceptually (there may be implementation-level differences, which essentially reflect that a continuation is a closure which has been created in a different way than most closures).

The things that tend to mess people up when thinking about continuations are (a) not fully understanding closures in the first place; (b) the behavior of what Scheme calls call/cc, which wraps up the "current continuation" as a closure; and (c) the idea that functions don't have to return to the place that they're called from, which is fundamental to the notion of continuation.
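Point (c) is easier to see in code. Java has no call/cc, so the sketch below is hand-written continuation-passing style: each function takes an extra closure k standing for "the rest of the computation" and calls it instead of returning a value. CpsDemo and its method names are made up, and the lambda syntax is from Java versions much later than the JVM under discussion.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

final class CpsDemo {

    // Direct style for comparison: the result goes back to the caller.
    static int addDirect(int a, int b) {
        return a + b;
    }

    // Continuation-passing style: "returning" means calling k.
    static void addCps(int a, int b, Consumer<Integer> k) {
        k.accept(a + b);
    }

    static void squareCps(int x, Consumer<Integer> k) {
        k.accept(x * x);
    }

    // Computes (a + b) squared. The closure passed to addCps is the
    // continuation of the addition: square the sum, then carry on with k.
    static void sumSquaredCps(int a, int b, Consumer<Integer> k) {
        addCps(a, b, sum -> squareCps(sum, k));
    }

    public static void main(String[] args) {
        List<Integer> results = new ArrayList<>();
        // The final continuation just records the answer.
        sumSquaredCps(2, 3, results::add);
        System.out.println(results); // prints [25]
    }
}
```

Every call here is the last thing its function does, so without tail call elimination each step still costs a stack frame, which is one reason the two features tend to get argued for together.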

So, working with that definition, let's talk about Gilad's two posts a bit more. The example of usage of continuations that Gilad gives (and dismisses) in his blog entry is the easiest to understand example for all 10,000 of the Java programmers out there who have written a web framework (and had it discussed on the JavaPosse). It is NOT the appropriate use case for deciding if continuation support should go into the JVM, though. More illuminating examples for that decision are ones that Gilad doesn't respond to, but which arise in the comments of his blog posts. These are 1) support for languages other than Java in the JVM and 2) grid computing.

In this longish post on JRuby at the Headius blog site, I found the following to be quite revealing on the topic of language support:

In JRuby's case, a major missing piece was the inability to longjmp, C's function for leaping from one call stack to another. longjmp is heavily used (understatement!) in Ruby for everything from threading to continuations to exception handling. Missing longjmp in Java presents a very large hole when porting Ruby C. Many creative attempts to mimic longjmp were therefore created: exception-based flow control allowed loop keywords like 'next' to throw control back to a higher-level loop construct; a recursive evaluator repeatedly called itself for new AST nodes, ever-deepening the stack but always keeping lower nodes within the context of higher ones; exception-based "return sleds" allowed returns to bubble their results back up to the appropriate recipient; and on and on. Many of these approaches were extremely novel, worthy of their own papers and accolades. Indeed, several of them have shown up in academic papers and PhD theses in some form or another.

This (and some other discussion in that post) implies that dynamic languages which implement continuations (or just closures) have to do their JVM implementation in a pretty lame manner - they end up keeping the stack for the language implementation in the JVM heap so that they can do the necessary manipulation of the stack that the language demands. In effect, they have to manipulate their flow control structures in a manner that reminds me of someone doing all of their Java method invocations via reflection.

In JRuby's case, they end up sort of simulating the Ruby VM on the Java VM, it seems. This has to damage Java interoperability and the language's performance simultaneously and it severely complicates the design.
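To give a feel for the exception-based flow control mentioned in that quote, here is a minimal sketch of a non-local exit built on an exception. It is not JRuby's code. NonLocalExit, BreakSignal and each are hypothetical names, and the only point is the shape of the trick: a block deep inside a loop construct throws, and the construct that owns the loop catches.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

final class NonLocalExit {

    // A control-flow signal, not an error. Suppression and stack trace
    // capture are switched off to keep the throw as cheap as possible.
    static final class BreakSignal extends RuntimeException {
        final Object value;

        BreakSignal(Object value) {
            super(null, null, false, false);
            this.value = value;
        }
    }

    // A loop construct for an imaginary interpreted language: runs the
    // body for each item, and a BreakSignal from the body ends the loop.
    static Object each(List<?> items, Consumer<Object> body) {
        try {
            for (Object item : items) {
                body.accept(item);
            }
            return null; // loop ran to completion
        } catch (BreakSignal signal) {
            return signal.value; // control jumped here from inside the body
        }
    }

    // The body "breaks" out of the loop with the first negative number.
    static Object firstNegative(List<Integer> numbers) {
        return each(numbers, n -> {
            if (((Integer) n) < 0) {
                throw new BreakSignal(n);
            }
        });
    }

    public static void main(String[] args) {
        System.out.println(firstNegative(Arrays.asList(3, 7, -2, 9, -5))); // prints -2
    }
}
```

It works, but every such construct needs its own try/catch plumbing, and the jump can only go up the current Java stack, which is a long way short of what longjmp gives the C implementation.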

If Sun really values having other languages on the JVM, the two killer features that need direct JVM support would seem (to me) to be continuations and tail calls. Get those right and Lisp, Scheme, Ruby, Scala and others can be supported with their full native performance and, I predict, Groovy would get way faster. This sort of "foreign" language support is something I bet we won't see happening in .NET anytime soon, providing a clear differentiating feature for Java.
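For tail calls, the usual workaround today is a trampoline: instead of making the tail call, a function returns a small heap object describing the next call, and a plain loop keeps bouncing until there is a result. The sketch below is my own minimal version with hypothetical names (Trampoline, Step, sumStep), again using lambda syntax from later Java versions.

```java
import java.util.function.Supplier;

final class Trampoline {

    // One step of a computation: either a final result, or a thunk
    // that produces the next step when called.
    static final class Step {
        final long result;
        final Supplier<Step> next; // null once the computation is done

        private Step(long result, Supplier<Step> next) {
            this.result = result;
            this.next = next;
        }

        static Step done(long result) { return new Step(result, null); }

        static Step more(Supplier<Step> next) { return new Step(0, next); }
    }

    // Ordinary tail recursion. The JVM keeps a frame per call, so a large
    // enough n will typically end in a StackOverflowError.
    static long sumRecursive(long n, long acc) {
        return n == 0 ? acc : sumRecursive(n - 1, acc + n);
    }

    // Same logic, but the tail call is returned as data instead of made.
    static Step sumStep(long n, long acc) {
        return n == 0 ? Step.done(acc) : Step.more(() -> sumStep(n - 1, acc + n));
    }

    // The trampoline: a loop that runs steps one at a time at constant stack depth.
    static long run(Step step) {
        while (step.next != null) {
            step = step.next.get();
        }
        return step.result;
    }

    public static void main(String[] args) {
        System.out.println(run(sumStep(1_000_000, 0))); // prints 500000500000
    }
}
```

The price is one heap allocation per call, which is exactly the kind of overhead that direct JVM support for tail calls would remove.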

The second example that Gilad fails to take into account is the one that's really near and dear to my heart - grid computing with Jini.

In the kind of compute grids that spark my interest, what you really want to do is create a closure and dispatch it, along with a set of data to curry it, many times to a large number of computers, each of which binds the received data, then invokes the closure and puts the result of the closure evaluation somewhere where it can be found later. Think about it: that's asking for a closure which can be saved and executed elsewhere and which doesn't perform a synchronous return to the caller, i.e. it's asking for a continuation as defined above.
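Here is a single-JVM sketch of that dispatch pattern, with a thread pool standing in for the computers and a concurrent map standing in for the place where results get left. GridSketch and scatter are hypothetical names. The hard parts are deliberately skipped: nothing is serialized, no code moves between machines, and for simplicity the caller waits for the pool to drain before reading the results.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Function;

final class GridSketch {

    // Dispatch one closure against many inputs. Each "node" binds its input,
    // evaluates the closure, and writes the result into a shared space
    // rather than returning it to whoever dispatched the work.
    static <A, R> Map<A, R> scatter(Function<A, R> closure, Iterable<A> inputs, int nodes) {
        Map<A, R> space = new ConcurrentHashMap<>();
        ExecutorService grid = Executors.newFixedThreadPool(nodes);
        for (A input : inputs) {
            grid.execute(() -> space.put(input, closure.apply(input)));
        }
        grid.shutdown();
        try {
            if (!grid.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("grid did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for the grid", e);
        }
        return space;
    }

    public static void main(String[] args) {
        int offset = 10; // captured by the closure, so it travels with it
        Function<Integer, Integer> task = x -> x * x + offset;
        Map<Integer, Integer> results = scatter(task, Arrays.asList(1, 2, 3), 4);
        System.out.println(results.get(3)); // prints 19
    }
}
```

On a real grid the closure and its captured data have to cross the wire and be trusted on arrival, and that is where all the difficulty lives.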

The fascinating thing about this model of a compute grid, when compared to models like MPI, WebServices and even JXTA, is that it allows your code to be mobile around the grid - a feature which, when run on Windows, is, unfortunately, called a virus. This, I'm betting, is why you don't see a Jini-like thing on Windows - anyone who suggests mobile code in Redmond is guaranteed to be hooted down as an advocate of allowing SoBig.F to run loose around the net forever. (But... it could be another differentiating feature for Java if Sun ever really grokked what Jini is.)

So... doing continuations in a secure manner requires one to be VERY careful. Unfortunately, being careful in the current JVM also requires an enormous number of special tricks - tricks which could be obviated, I suspect, if the JVM had first class support of closures and continuations without breaking the Java security model.

The current best tricks, IMHO, to use in support of this model are called - wait for it - Jini and JavaSpaces. Jini/JavaSpaces allows one to use RMI, or rather, JERI which should have been RMI 2.0 (see JSR-78 and JSR-76), with cryptographically strong codebase annotations to create a sort of lightweight closure/continuation.

Rereading the gobbledygook that is that last sentence should tell you everything you need to know about why Jini hasn't stormed the world. It's just damned hard to understand, much less implement, secure models of mobile code. The fact remains, nonetheless, that it is the current best way to do a compute grid. Look at the combination of the computeserver project and NetBeans, and I think you glimpse part of the future of compute grids. But I am somewhat partisan, as one can probably tell from my latest JavaOne presentation. (You'll need to log in as contentbuilder/doc789 when prompted if you follow that link.)

So I would like to see Gilad von Karajan and the rest of the JVM team at Sun quit being so ponderously slow to introduce continuations (and tail calls and invokedynamic) and make life easier for Groovy, Ruby, Lisp and Jini. But I ain't holding my breath.

Anyway, my fingers are now tired...

Friday, May 05, 2006

Things that need to be done for ComputeCycles

So I'm working on getting the ancillary materials together for our JavaOne presentation. We will need to post on the ComputeCycles webpage: http://computecycles.dev.java.net a list of the subprojects which make up ComputeCycles and the kinds of things we see people working on. We also need to check in some source code that currently is sitting in the old repository. So here's my list of tasks:

Move the groovy config language package (dynamics) to the latest groovy release and check it in to computecycles.dev.java.net.
Add more tests to dynamics and perhaps adapt it to computeserver (http://computeserver.dev.java.net) when the source for that becomes available
Check in and test Sean's new SSO config webapp
Look again at working with an unmodified version of com.sun.jini.start and see if there are any possibilities of resolving the truststore issue without custom code.
Debug and test using JACC and com.sun.jini.tool.DebugPolicyProvider together in the latest version of Glassfish
Determine if we can simply reconfigure Glassfish to get our URLStreamHandlerFactory in place at launch, rather than modifying actual code. I'm actually pretty hopeful on this score.
Get an example working of Jini services deployed under Glassfish and vending proxies via JNDI as well as via an LUS
Port the existing Jini Service Browser replacement to run under Glassfish rather than under Tomcat, i.e. provide a webapp which uses JNDI vended LUS proxies to discover other services.
Configure the Jini services under Glassfish to vend https proxies rather than SSL proxies, i.e. modify the canonical Tim/Brian examples to use https so that we can go through firewalls.
Test the bootstrapper with configuration to use a JNDI-vended LUS rather than using the normal LU discovery protocol with the attendant problems that that presents for firewalls.
Improve and test the base service and client classes
Port current trinity and edison classes to derive from the improved base classes
Write SpaceGhost on the new base classes
Write Pharoah on the new base classes
Write the test cases
Demo!

Yeah, that's all we need to do. :^)