Friday, March 25

Linear Algebra + Java = ?

I like Java. As a matter of fact, Java is the language I get along with best, and I like to use it in as many projects as I can. The current project needs a lot of specialized mathematics, most of it in the area of linear algebra - meaning matrices and their decompositions, determinants, eigenvectors, singular values, stuff like that. I want to do that in Java as well. The JDK doesn't ship the needed algorithms, so what to do?

As I don't like to implement that kind of stuff myself (it is fairly complicated to do, and the result will probably not work very well), I searched the web for implementations of suitable algorithms. Here are the results of my research:

First of all I should mention that (being a student and not getting any money for this mainly scientific project) I was looking for cost-free code. So, to get it over with, there are of course some commercial mathematical systems - if you care, here are some names: Maple, Matlab, VisualNumerics. All of those products are much more comprehensive than anything you will find in open source. Matlab is probably THE mathematical system. In case you wonder, because they don't specifically talk about Java - there are interfaces that allow you to call them from Java.

But I was after free code for linear algebra. What I found first was LAPACK. This is a library written in Fortran 77 and it is pretty mature. It doesn't contain the newest research, but it works. It works closely together with the BLAS codebase, so the two always come together. There are optimized implementations from AMD as well as from Intel for their processors. I have also seen code from Sun for their SPARC architecture. Linear algebra functions are fairly complex and may take some time to compute, so there is a real need for the best possible implementation, optimized for the system at hand. I don't know how well these optimized implementations really work. I didn't try them, because I am still after a version for Java.

At this point I was already a little disappointed, because there just won't be any optimized implementation in Java. The code will run inside the Java VM and won't know anything about cool special features of the CPU. Or, in other words, it relies on the VM to know about these features and use them. If it doesn't, well - then it doesn't. But let's get over this disappointment and look at the projects.

JAMA - This one is a pure Java implementation of basic linear algebra functions. It can compute the SVD as well as LU and QR decompositions. It has a fairly nice API and is easy to use. Its SVD is not fast enough for large matrices, but as it is very easy to learn, I do recommend it for smaller computations.
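To give an idea of the kind of routine such a pure Java library packages up, here is a minimal sketch of a determinant computed by LU-style elimination with partial pivoting. This is not JAMA's code or API; the class name LuSketch and the method determinant are made up for illustration, and a real library does a lot more (error handling, reuse of the factorization, better numerics).

```java
/** Hypothetical helper for illustration only; not part of JAMA or the JDK. */
final class LuSketch {

    /** Returns the determinant of a square matrix; the input is left untouched. */
    static double determinant(double[][] a) {
        int n = a.length;
        double[][] lu = new double[n][];
        for (int i = 0; i < n; i++) {
            if (a[i].length != n) {
                throw new IllegalArgumentException("matrix must be square");
            }
            lu[i] = a[i].clone(); // work on a copy
        }

        double det = 1.0;
        for (int k = 0; k < n; k++) {
            // Partial pivoting: pick the row with the largest entry in column k.
            int p = k;
            for (int i = k + 1; i < n; i++) {
                if (Math.abs(lu[i][k]) > Math.abs(lu[p][k])) {
                    p = i;
                }
            }
            if (lu[p][k] == 0.0) {
                return 0.0; // no usable pivot: the matrix is singular
            }
            if (p != k) {
                double[] tmp = lu[p];
                lu[p] = lu[k];
                lu[k] = tmp;
                det = -det; // each row swap flips the sign
            }
            det *= lu[k][k]; // the determinant is the product of the pivots

            // Eliminate the entries below the pivot.
            for (int i = k + 1; i < n; i++) {
                double factor = lu[i][k] / lu[k][k];
                for (int j = k + 1; j < n; j++) {
                    lu[i][j] -= factor * lu[k][j];
                }
            }
        }
        return det;
    }
}
```

Calling LuSketch.determinant(new double[][] {{4, 3}, {6, 3}}) should give -6, up to rounding. The point is only that even this small routine has pivoting and sign bookkeeping to get right, which is why I prefer a tested library.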

JLAPACK - This one is basically BLAS+LAPACK for Java. It was converted automatically from Fortran 77 and has neither an object-oriented design nor a plausible API. It is a bunch of methods with names like DGESVD, and it returns results through call by reference (filling double[] arrays that you pass in) instead of through return values. It takes some detective work to understand which parameters are needed and why, and also which function does what. There is a pattern that relates the function name to what it does: as far as I can tell, in DGESVD the D stands for double precision, GE for a general matrix and SVD for the operation. I managed to call DGESVD and was pretty happy about that already. I would not recommend using it, because it is not easy to use at all. You may still prefer it if you need some of its functions, though. It is more comprehensive than JAMA, so you may not be able to get around it. I did get around it.
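To make the complaint about call by reference concrete, here is a small sketch of the two calling styles side by side. Nothing in it is JLAPACK code: fortranStyleEig2 is a made-up routine (it computes the eigenvalues of a symmetric 2x2 matrix), and its parameter list only imitates the Fortran habit of passing a flat column-major array, a leading dimension, an output array and an info flag. The second method wraps it in the kind of API I would rather use.

```java
/** Hypothetical example; the names and signatures are not taken from JLAPACK. */
final class OutputParameterSketch {

    /**
     * Fortran-style routine: returns nothing and writes its results into the
     * arrays passed in. a holds a symmetric 2x2 matrix in column-major order,
     * lda is the leading dimension, w receives the eigenvalues in ascending
     * order, and info[0] is set to 0 on success or -1 for bad arguments.
     */
    static void fortranStyleEig2(double[] a, int lda, double[] w, int[] info) {
        if (lda < 2 || a.length < lda + 2 || w.length < 2) {
            info[0] = -1;
            return;
        }
        double a11 = a[0];
        double a21 = a[1];       // off-diagonal entry (lower triangle)
        double a22 = a[lda + 1];
        double mean = 0.5 * (a11 + a22);
        double radius = Math.hypot(0.5 * (a11 - a22), a21);
        w[0] = mean - radius;
        w[1] = mean + radius;
        info[0] = 0;
    }

    /** Wrapper in plain Java style: takes the entries, returns the result. */
    static double[] eigenvalues(double a11, double a21, double a22) {
        double[] packed = {a11, a21, a21, a22}; // column-major 2x2
        double[] w = new double[2];
        int[] info = new int[1];
        fortranStyleEig2(packed, 2, w, info);
        if (info[0] != 0) {
            throw new IllegalStateException("routine failed, info = " + info[0]);
        }
        return w;
    }
}
```

With the wrapper, OutputParameterSketch.eigenvalues(2, 1, 2) hands back the eigenvalues 1 and 3 directly. With the raw routine the caller has to allocate w and info, know the storage layout, and remember to check info afterwards, and that is the part that took me the detective work.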

JScience - This one looks promising (at least it has a pretty web site). It is even an active project; the most recent version (1.0.4) at the time of this writing is from January 26, 2005. (Which makes it 6 years younger than JAMA and I think 250 years younger than LAPACK). It already has some implementations of mathematical functions, but I do miss the SVD (because I need it). The linear algebra part of this package is lacking important features. Its matrix class is similar to the one in JAMA, but JAMA additionally offers SVD and QR. It has LU, though. As this project (unlike JAMA) is actively developed, it is worth keeping an eye on. The API looks good and the author plans to add more features this year. It doesn't have enough for me right now, though. If you are a physicist, you may get more value from this library than I do.

GTP (look in the software section) - This one is not a library but a tool. It parses text documents and indexes them using LSI. It uses a very efficient SVD function based on the iterative Lanczos method. It is basically a Java port of SVDPACK. It doesn't contain all the functionality of SVDPACK but only those functions that its SVD algorithm needs. Try not to look at its design from an OOP point of view, because you won't like it. It's ugly, but still efficient. I am using it for SVD, calling it from my own Java application.
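To show what "iterative" means here, the following is a minimal sketch of power iteration for the largest singular value. Be aware that this is not the Lanczos method and not GTP's or SVDPACK's code; it is a much simpler relative that shares the basic idea of only ever multiplying by the matrix and its transpose instead of decomposing it directly. The class and method names are made up, and the fixed starting vector is an assumption that can fail if it happens to be orthogonal to the dominant singular direction.

```java
/** Hypothetical illustration of an iterative method; not Lanczos, not GTP code. */
final class PowerIterationSketch {

    /** Estimates the largest singular value of a by power iteration on A^T A. */
    static double largestSingularValue(double[][] a, int iterations) {
        int n = a[0].length;
        double[] v = new double[n];
        for (int j = 0; j < n; j++) {
            v[j] = 1.0 / Math.sqrt(n); // assumed start vector of unit length
        }
        for (int it = 0; it < iterations; it++) {
            double[] u = times(a, v);            // u = A v
            double[] w = transposeTimes(a, u);   // w = A^T u = A^T A v
            double length = norm(w);
            if (length == 0.0) {
                return 0.0; // v was mapped to zero, nothing to iterate on
            }
            for (int j = 0; j < n; j++) {
                v[j] = w[j] / length;            // renormalize
            }
        }
        return norm(times(a, v)); // ||A v|| for unit v approximates sigma_max
    }

    private static double[] times(double[][] a, double[] x) {
        double[] y = new double[a.length];
        for (int i = 0; i < a.length; i++) {
            for (int j = 0; j < x.length; j++) {
                y[i] += a[i][j] * x[j];
            }
        }
        return y;
    }

    private static double[] transposeTimes(double[][] a, double[] x) {
        double[] y = new double[a[0].length];
        for (int i = 0; i < a.length; i++) {
            for (int j = 0; j < y.length; j++) {
                y[j] += a[i][j] * x[i];
            }
        }
        return y;
    }

    private static double norm(double[] x) {
        double sum = 0.0;
        for (double value : x) {
            sum += value * value;
        }
        return Math.sqrt(sum);
    }
}
```

For the diagonal matrix with entries 3 and 1, a call like PowerIterationSketch.largestSingularValue(m, 100) should converge to 3. As I understand it, methods of the Lanczos family get several singular values at once and converge faster, which matters for the large sparse matrices that show up in LSI.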

That's what I found, and I hope it sheds some light on the topic.

So long, have a nice day.

Eff