Mar 23, 2009

OORegress: Another Stats Package

A while back I complained about how statistics functionality in OpenOffice is sadly lacking. Then I discovered that JRuby can fairly easily tie into OpenOffice, so I started thinking I might be able to tie the two together.

I mucked around a bit and rolled up something that can help out. It is a stats program that ties into OpenOffice from the command line, allowing you to enter commands to do statistical things. For example, the following code will run a regression:
regress(Y = X)
It will evaluate the model:
Yi = α + β*Xi + ei
Note that you don't specify the intercept; it is implied. Eventually I'll add a feature that lets you force a zero intercept.
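To make the implied intercept concrete, here is a minimal Python sketch of a one-variable least-squares fit. This is not the OORegress code, just an illustration; the function name simple_ols is made up for the example.

```python
def simple_ols(x, y):
    """Fit y = alpha + beta * x by ordinary least squares.

    Hypothetical helper for illustration; not part of OORegress.
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of squared deviations of x, and cross-deviations of x and y.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x has no variation, so the slope is undefined")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    beta = sxy / sxx
    # The intercept falls out of the fit; the caller never specifies it.
    alpha = mean_y - beta * mean_x
    return alpha, beta


# Data that lies exactly on y = 1 + 2x, so the fit recovers those values.
alpha, beta = simple_ols([1, 2, 3, 4], [3, 5, 7, 9])
```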
You can also do some fancier things:
regress(ln(Y) = X1^2 + D*X2)
Which will obviously regress
ln Yi = α + β1*X1i^2 + β2*Di*X2i + ei
Where X1, X2 and D are independent variables.
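One way to picture the fancier formula is that each term becomes its own transformed column before the fit runs. The Python sketch below hard-codes the transformations for this one formula rather than parsing it, and the function name build_columns and the dictionary layout are assumptions for illustration, not how OORegress actually does it.

```python
import math


def build_columns(data):
    """Build the transformed columns for ln(Y) = X1^2 + D*X2.

    `data` maps column names to lists of numbers (an assumed layout).
    Returns the response column and a dict of regressor columns.
    """
    # Left-hand side: natural log of Y, so every Y must be positive.
    response = [math.log(y) for y in data["Y"]]
    regressors = {
        # Each right-hand-side term becomes one column with one coefficient.
        "X1^2": [x ** 2 for x in data["X1"]],
        "D*X2": [d * x for d, x in zip(data["D"], data["X2"])],
    }
    return response, regressors


sample = {"Y": [1.0, 2.0], "X1": [2, 3], "D": [0, 1], "X2": [5, 7]}
response, regressors = build_columns(sample)
```

With a dummy variable D that is 0 or 1, the D*X2 column is zero wherever D is off, so β2 only picks up the effect of X2 for the rows where D is on.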

The interface follows some simple conventions. The columns of the spreadsheet contain the data, and the very first row contains the name of each column. In your regression equation you address the variables by those names. When you run the regression, the program opens a new sheet in Calc containing the regression output: the coefficients, their significance, some properties of the variance of the regression, R², etc. I pretty much just copied what Excel prints out when you run a regression, because this is what my stats classes want. However, I make it easier here, since you can use an actual regression formula instead of having to copy-paste columns and apply formulas in the spreadsheet itself.
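The first-row-is-the-name convention is easy to sketch. The Python below assumes the sheet has already been read into a plain list of rows (it does not talk to OpenOffice at all), and the function name rows_to_columns is made up for the example.

```python
def rows_to_columns(rows):
    """Turn sheet rows into named columns.

    The first row holds the column names; every later row holds data.
    Hypothetical helper for illustration; not part of OORegress.
    """
    if not rows:
        raise ValueError("the sheet needs at least a header row")
    header, *body = rows
    columns = {name: [] for name in header}
    for row in body:
        # Pair each cell with the name at the top of its column.
        for name, value in zip(header, row):
            columns[name].append(float(value))
    return columns


sheet = [["Y", "X"], [3, 1], [5, 2], [7, 3]]
columns = rows_to_columns(sheet)
# columns["Y"] and columns["X"] can now be looked up by the names
# used in a formula such as regress(Y = X).
```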

I'm working on documenting how to use the program, and also working on some new functionality like lagged variables and having ΔYi instead of just some function of Yi. However the regression itself doesn't completely work at the moment, so new stuff will have to wait.
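For anyone unfamiliar with the terms, a lagged variable is just the same column shifted down by one or more rows, and ΔYi is the change from the previous row. Here is a rough Python sketch of both, assuming rows are in time order; the function names are hypothetical and this is not the planned OORegress implementation.

```python
def lag(series, k=1):
    """Return the series shifted by k periods, padded with None at the start."""
    if k < 0:
        raise ValueError("k must be non-negative")
    n = len(series)
    k = min(k, n)  # a lag longer than the series leaves only padding
    return [None] * k + list(series[:n - k])


def first_difference(series):
    """Return the change from the previous row; the first row has no value."""
    if not series:
        return []
    return [None] + [series[i] - series[i - 1] for i in range(1, len(series))]


y = [1, 4, 9]
y_lagged = lag(y)            # [None, 1, 4]
delta_y = first_difference(y)  # [None, 3, 5]
```

Rows where the result is None would have to be dropped before fitting, since there is no earlier observation to compare against.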

You can check out the code if you like from here: I should have a more functional version coming out soon.
