Mar 30, 2009

Discovering Fractals: The Julia Set

After my post yesterday about the Mandelbrot set I started digging around a bit more. I discovered this other thing called a Julia set, which is something similar.

It still uses this formula:
zi+1 = zi + c
However where Mandelbrot set started with z0 = 0 and c as the coordinates of the pixel, the Julia set has z0 as the coordinates of the pixel and c as some number. Depending on the c that you pick, you end up with a different picture.

I also looked into colouring it a bit better than black and blue, it is now based on HSV colours instead of RGB. Then if the formula escapes your radius at iteration i, you set the hue of the pixel to (i+b) mod 360 where b is some offset you can choose to change your colour scheme. With b = 0 you get a bright orange background, but I didn't really like it so I picked b = 180 which gives a nice lightish blue background. Unfortunately I'm not completely liking the bright pink/purple of the swirls, anyone know a few things about colour theory and have suggestions?

Here it is centred at 0.35 - 0.33i, with 4x zoom and c = (φ - 2) + (φ - 1)i, φ being the golden ratio:

Mar 29, 2009

Discovering Fractals: The Mandelbrot Set

I read a blog post a while back about content generation in video games, one section of which is terrain generation. One interesting way to generate realistic terrain is using fractals. One of the most well-known and relatively basic fractals is the Mandelbrot set. After reading through this article I decided to write a simple program that generates pictures based on this set.

Here's how the set works. You take a complex number c. You then take a sequence described by the following formula:
zi+1 = zi2 + c
Where z0 = 0. If this sequence is bounded (it does not escape to infinity), then c is in the set, otherwise it is not.

You can display this using a program by taking each pixel in your window and converting it to a complex number. You then test to see if the number is in the set by iterating over it a certain number of times to see what happens. Of course you won't be 100% precise, but with enough iterations you can get a pretty good approximation.

On top of this you can make it look even prettier by tweaking the colour depending on how quickly each point escapes to infinity. If it shoots off into infinity immediately, you make it darker. If it takes a while, then you make it lighter. I used the following formula:
colour = (0, 0, i / n * 255)
Where i is the iteration that the number escaped a certain threshold where the norm of the complex number is greater than this threshold (I used 2 and 3, didn't notice a huge difference between them) and n is the number of iterations (I used 500, 1000 and 2000, 1000 made the prettiest picture). This means that the colour of each pixel ranges between black and bright blue.

Here's what I got when I calculated it centred at c = 0.38 + 0.1i, with 64x zoom (where 1x zoom is in (1 + i to -1 - i)), 1000 iterations and a threshold of 2 - unfortunately when viewing the picture in the browser (at least for me on Ubuntu with Firefox) it doesn't look as nice as if you actually download the thing so I recommend you do that to appreciate it more:


That's pretty eh?

One thing that I find neat about this set is that it is self-similar. Each one of those little blue blobs is how the picture looks if you centre at 0 + 0i and have 1x zoom. You can then zoom into each one of those and see little bumps on the side that also look exactly the same as the whole. And each of those bumps has smaller bumps on it that also look the same. Pretty cool eh?

UPDATE: I wrote a follow-up about this here.

Mar 25, 2009

The Freaky Sandbox

One thing that can always be useful is the ability to run "unsafe" code inside a sandbox, where it can do limited damage. For a project I am working on, I wanted to do this very thing. Users would be able to upload their own code for execution on the server.

Naturally, this leads to a whole whack of security issues. If you just run code unchecked, users could get access to files, threads, processes, all sorts of things.
Fortunately there is this gem called the Freaky Sandbox which lets you run Ruby code, and it disables access to many security sensitive areas.

Unfortunately the gem doesn't work anymore. If you check out the source and install it, all sorts of bad things happen when you actually try to use the gem.
So before diving in to fix this, I decided to see if JRuby had a similar gem. Fortunately, it does. Now when I first tried it with JRuby 1.1.something it also didn't work, but I made some tweaks to it and now it does. Also with JRuby 1.2 RC1 it works (and also with RC2).
UPDATE (Mar. 28/10): The gem in the repositories no longer works, however you can check out the code to the javasand gem and install it using these instructions.

So now you can do fancy things like this:
require 'sandbox'

s = Sandbox.safe

File.open("untrusted_code.rb") do |f|
s.eval f.read
end
What it does is filters out things like Thread and Kernel.fork, and file manipulation, things like that. It also doesn't see any classes that you have defined unless you import them into the sandbox.

So how could this be useful? One thing I'm thinking of is to allow users to write custom scripts for a site. Of course most users wouldn't be able to take advantage of this feature, but if script writers made their scripts available then they could just grab it and install it.

You could try something like this:
sandbox = Sandbox.safe

sandbox.import RandomAPIClassThatIWrote

res = s.eval untrusted_code

# do stuff with res
Unfortunately importing modules doesn't seem to work, it gives a NullPointerException when you try. If I get some free time soon I may look into it and try to fix it, but for now I'll just have to live with importing classes.

Now the question is: is this perfectly safe? Well, no. The code you run in the sandbox can still have infinite loops in it which will halt your program. Unfortunately it is impossible to tell if arbitrary code will have infinite loops in it or not, so we're out of luck for fixing this problem. However what you can try is the following:
sb_thread = Thread.new do
s.eval untrusted_code
end

sleep TIMEOUT

sb_thread.raise if sb_thread.alive? # use raise since kill doesn't seem to work
Normally this is unsafe, because of blocking operations like synchronous IO and things like that. However with the sandbox, you cannot use the require keyword. You also have no access to the File or Dir classes (they are not defined inside the sandbox). So correct me if I'm wrong but unless you go and import these classes into the sandbox there are no unsafe things that could happen by just killing the thread.

Mar 23, 2009

OORegress: Another Stats Package

A while back I complained about how statistics functionality in OpenOffice is sadly lacking. Then I discovered that JRuby can fairly easily tie into OpenOffice, so I started thinking I might be able to tie the two together.

I mucked around a bit and rolled up something that can help out. It is a stats program that ties into OpenOffice from the command line, allowing you to enter commands to do statistical things. For example, the following code will run a regression:
regress(Y = X)
It will evaluate the model:
Yi = α + β * X + ei
Note that you don't specify the intercept, it is implied. I'll be adding a feature eventually where you can force a zero intercept.
You can also do some fancier things:
regress(ln(Y) = X1^2 + D*X2)
Which will obviously regress
ln Yi = α + β1*X1i2 + Di*X2i + e1
Where X1, X2 and D are independent variables.

The interface follows some simple conventions. The columns of the spreadsheet contain the data, and the very first row contains the name of each column. In your regression equation you address the variables by those names. When you run the regression, the program will open up a new sheet in Calc with the regression output with a bunch of info about the coefficients, their significance, some properties of the variance of the regression, R2, etc. I pretty much just copied the stuff that Excel prints out when you run a regression because this is what my stats classes want. However I make it easier here since you can use an actual regression formula instead of having to copy-paste columns and apply formulas in the spreadsheet itself.

I'm working on documenting how to use the program, and also working on some new functionality like lagged variables and having ΔYi instead of just some function of Yi. However the regression itself doesn't completely work at the moment, so new stuff will have to wait.

You can check out the code if you like from here: http://code.google.com/p/ooregress/. I should have a more functional version coming out soon.

Mar 10, 2009

Erb Comments in JRuby vs. Ruby

With my current job, we built up most of the software with JRuby. More recently I discovered that development is faster using Ruby and then deploying on JRuby.

However then we discovered some bugs that were in code that hadn't changed in a long time. For some reason, certain chunks of our view templates were just disappearing. Nobody could figure out why.

This is where Vim came unexpectedly to the rescue. You can write comments in erb by writing something like this:
<%# this is a comment %>
You can also go like this:
<% # this is a comment %>
However with the latter type of comment, it would always mess up the syntax highlighting in Vim. It wouldn't recognize that the %> was closing the erb comment tag, because it thought it was part of the comment. I had noticed this before but didn't really care that much, because it wasn't my code and I didn't really want to mess with it. But this time in order to find the bug I figured I should have correct syntax highlighting.
Anyway after fixing the comments, I couldn't find anything wrong with the file. I was still baffled. So I figured I'd just go and look at the page in the browser again, and boom! It worked!

My hypothesis is that Ruby and Vim probably use the same logic for parsing those erb comments, but JRuby uses something different. JRuby would behave exactly the same if we had <%# or <% #. Ruby on the other hand will think that anything after the # is part of a comment, and therefore will ignore the %> that is on the same line.

Mar 4, 2009

Adventures in Haskell

I must apologize, I've been uncharacteristically quiet as of late and haven't posted much stuff. And even when I do, it's usually to report about (J)Ruby things that I've discovered. For the most part it is because I rarely have time/motivation to finish off the many half-finished posts I have sitting around. That's mainly because any free time I get is either lost in the world of Oblivion or spent in my geeky pursuit of learning Haskell.

Yes, I'm attempting to learn Haskell. Some people come home, sit down and fire up the TV, or read a book, or even write blogs. I come home, pour myself a glass of wine, and crack open Real World Haskell. Usually this ends up with me getting another glass of wine.
It's quite a well written book. It's freely available online, and the online version allows people to comment on each paragraph so there's usually extra tidbits of information that you can get from other smart people explaining tricky things in a different way than the book.

Anyway to the language itself. I saw Haskell while I was at school, in a lab in CSC101 one exercise asked us to write insertion sort in Haskell, which is about 2 lines of code. However it still managed to baffle pretty much everyone in the class, myself included. Of course, after learning Scheme in later classes the functional programming aspect comes much more easily now, but Haskell is still quite weird.

This is what I've come out with so far:
  1. Type inference is awesome. No need to write things like std::vector< std::pair >::const_iterator or junk like that (don't ask why I have that particular data structure, I made it up on the spot). The Haskell compiler usually looks at your code and says, "Aha! You are probably wanting this type!" Of course it may not always be right, or you want to restrict the types a bit better than what it gives you, but it gives you the option of not specifying things.
  2. Haskell is lazy. This is different than any other language I have used. It does exist in other languages too, although less completely. This should be familiar:
    if (1 || infiniteLoop())
    In C this will work fine, because the infiniteLoop function is never actually called. However, this will result in an infinite loop:
    foo(1, infiniteLoop())
    , even if foo looks like this:
    foo(int i, int j){
    if (i || j)
    return i;
    }
    In Haskell, this is not the case. Expressions are only ever evaluated when they are needed. In the previous example, the value of j is never actually needed, but it is calculated anyway.
    This results in some very interesting results. The standard library of Haskell has a function called repeat, which takes one parameter and just repeats it as a list. Infinitely. You can even write a list like this: [1..], which is just [1, 2, 3, 4, 5, ...]. These aren't all that useful on their own, but combine it with a function like take which takes the first n values of a list:
    take 5 (repeat 'a') == ['a', 'a', 'a', 'a', 'a']
    take 5 [1..] == [1, 2, 3, 4, 5]
    This may be geeky, but I think that's pretty cool.
  3. No side effects - Code is not allowed to have side effects, except is special circumstances. So all variables are immutable once they are set (in Haskell-speak: "bound"). This has some interesting results. Suppose you have this:
    f (g x) (h y)
    Where f, g and h are functions, and x and y are values. Since there are no side effects, it doesn't matter whether you execute g or h first inside of f. In fact, you can run them simultaneously. No side-effects means easy concurrency, no need for locks or the complications that go with it.
I could probably go on for a while about the nifty things I've discovered (like how you can make *-~!!! a valid operator), but I'd recommend checking it out for yourself. If you're up for it anyway, it is a weird language compared to, well, most other languages out there, but I think it is one of those ones worth learning just for the ideas it gives you.