Ubuntu: A Love/Hate Relationship: February 2012

Feb 27, 2012

Vim Relative Line Numbers

A little trick I learned the other day at Montreal.rb: relative line numbering (enabled with :set rnu, short for :set relativenumber). The current line is numbered zero, and the left-hand column shows each line's distance from your cursor. This makes commands like d5k (delete the current line and the 5 lines above it) or d5j (delete the current line and the 5 lines below it) easy to use, since you can read the count straight off the screen.

Feb 24, 2012

Asynchronous Function Looping in C#

It is often the case in code that you have to do several things in a sequence since each computation is dependent on the one(s) before it:

// ...
// stuff 1
// ...
// stuff 2
// ...
// stuff 3
// ... etc.

Good software practice tells you to break some of these up into methods:


stuff1();
stuff2();
stuff3();

If it gets big, you can even put it all in a collection and iterate (we're starting to get into weird coding now, I don't think anybody would actually do this):

var collection = new List<Action>() { stuff1, stuff2, stuff3 };
foreach (var func in collection){
  func();
}

Now the part where this would actually be useful. What if some of these functions could potentially be asynchronous? That is, they depend on some value that may not be readily available - maybe user input, maybe some data from a network, etc. Blocking is not usually a great option - a modal dialog demands that the user pay attention to it even if there is something more important somewhere else. It would be better if this computation could "pause" and then resume later on when we get what we need. In some languages, including Scheme and Ruby, you can accomplish this using a construct called callcc (call/cc, short for call-with-current-continuation, in Scheme). In C#-flavoured pseudo-code it would look like this:

var collection = new List<Action<Action>>() { stuff1, stuff2,
                                              stuff3 };
foreach (var func in collection){
  // pseudo-code warning
  call_cc(func);
}

Here, call_cc() will call func and pass in a function which will start executing right after the call_cc() call: it is a continuation of the loop. When func is done (or when it receives the response it wants), it can call this function to nicely continue executing the loop.

Unfortunately, C# 4.0 and lower do not support anything like callcc. C# 5.0 will support the await and async keywords which will accomplish exactly what we want, but for the time being we'll have to make do with what we have. How can we do that without callcc?

Let's give it a shot using a recursive function:

void AsyncForeach(IEnumerator<Action<Action>> iter){
  if (iter.MoveNext()){
    iter.Current( () => {
      AsyncForeach(iter);
    });
  }
}
void OtherFunc(){
  // ...
  var collection = new List<Action<Action>>() { stuff1, stuff2,
                                                stuff3 };
  
  AsyncForeach(collection.GetEnumerator());
}

This requires every function in collection to adhere to the Action<Action> delegate, and when a function is done it needs to call the continuation manually in order to resume the computation. This is a bit annoying, and it's why all the BeginConnect, BeginSend, etc. methods in System.Net require an AsyncCallback to call when they are done. The new async and await keywords will be extremely useful for our task since the continuation is called automatically:
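To make the control flow easier to see, here is a minimal sketch of the same recursive, continuation-passing loop in Python rather than C#. The names async_foreach, pending, and the bodies of stuff1 to stuff3 are made up for illustration; stuff2 simulates waiting on input by parking its continuation instead of calling it.

```python
from typing import Callable, Iterator

Continuation = Callable[[], None]
Step = Callable[[Continuation], None]


def async_foreach(steps: Iterator[Step]) -> None:
    """Run the next step, handing it a continuation that resumes the loop."""
    step = next(steps, None)
    if step is None:
        return  # iterator exhausted: the loop is over
    step(lambda: async_foreach(steps))


log = []
pending = []  # continuations parked by steps that are "waiting" for something


def stuff1(resume: Continuation) -> None:
    log.append("stuff1")
    resume()  # synchronous step: continue right away


def stuff2(resume: Continuation) -> None:
    log.append("stuff2 waiting")
    pending.append(resume)  # hypothetical: pretend we are waiting on user input


def stuff3(resume: Continuation) -> None:
    log.append("stuff3")
    resume()


async_foreach(iter([stuff1, stuff2, stuff3]))
paused_log = list(log)  # the loop is parked inside stuff2 at this point

pending.pop()()  # the "input" arrives: calling the continuation resumes the loop
```

Nothing blocks while stuff2 is waiting: async_foreach simply returns, and the rest of the loop runs whenever the parked continuation is finally called.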

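// inside a method marked async; stuff1, stuff2 and stuff3 must
// return Task, since a plain Action cannot be awaited
var collection = new List<Func<Task>>() { stuff1, stuff2, stuff3 };
foreach (var func in collection){
  // func doesn't even need to call anything to
  // keep this thing going!
  await func();
}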

It is still useful to learn the continuation-passing approach, though. Say we want to halt the loop prematurely from within one of the functions. In that case, the function could simply not call the continuation. That would end our recursion, causing us to break out of our loop - the equivalent of the break keyword. In order to do that with the await keyword we'd need some sort of exception handling system, or a return type that signals whether to continue, etc.
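For comparison, here is a rough sketch of the "return type" option using Python's asyncio instead of C#. The convention that a step returns False to stop the loop is an assumption made up for this example, as are run_steps and the step bodies.

```python
import asyncio


async def run_steps(steps):
    """Await each step in order; stop early if a step returns False."""
    completed = []
    for step in steps:
        keep_going = await step()
        completed.append(step.__name__)
        if keep_going is False:
            break  # the step asked us to halt the loop
    return completed


async def stuff1():
    await asyncio.sleep(0)  # stand-in for waiting on I/O or user input
    return True


async def stuff2():
    await asyncio.sleep(0)
    return False  # hypothetical convention: False means "stop looping"


async def stuff3():
    return True


result = asyncio.run(run_steps([stuff1, stuff2, stuff3]))
```

With continuations the step stops the loop by doing nothing; here the loop has to inspect every return value, which is the extra machinery mentioned above.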

We could go even further and implement something in the spirit of Python's for...else construct, where the else block runs only if the loop finishes without hitting break (our version will hook the opposite case and run a handler when the loop is cut short):

for i in range(10):
  if i == 5:
    break
else:
  # this is not executed: the loop hit break
  print("should not run")

for i in range(10):
  if i == 12:
    break
else:
  # this is executed: the loop never hit break
  print("should run")

We can do this by adding failure "continuations" to our functions:

void AsyncForeach(IEnumerator<Action<Action, Action>> iter,
                  Action failure){
  if (iter.MoveNext()){
    iter.Current( () => {
      AsyncForeach(iter, failure);
    }, failure);
  }
}
void OtherFunc(){
  // ...
  var collection =
    new List<Action<Action, Action>>() { stuff1, stuff2, stuff3 };
  
  AsyncForeach(collection.GetEnumerator(), () => {
    // handle failure
  });
}

In this case the functions stuff1, stuff2, etc. will call the first function if they should continue looping, or call the second one in case of failure.
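The same two-continuation shape can be sketched in Python. The step bodies and the events list are invented for illustration; stuff2 reports failure, so stuff3 never runs.

```python
def async_foreach(steps, on_failure):
    """Run the next step, passing it a 'proceed' and a 'fail' continuation."""
    step = next(steps, None)
    if step is None:
        return  # nothing left to run
    step(lambda: async_foreach(steps, on_failure), on_failure)


events = []


def stuff1(proceed, fail):
    events.append("stuff1 ok")
    proceed()


def stuff2(proceed, fail):
    events.append("stuff2 failed")
    fail()  # abort the loop instead of continuing


def stuff3(proceed, fail):
    events.append("stuff3 ok")
    proceed()


async_foreach(iter([stuff1, stuff2, stuff3]),
              lambda: events.append("failure handled"))
```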

There's one final tweak to all of this. At the moment there are two problems with AsyncForeach: it depends on the type of the list we're iterating over (IEnumerator<Action<Action, Action>>), and it does not close over any variables that we may need for the loop. Can we do this using a closure?

In fact, we can:

var collection =
    new List<Action<Action, Action>>() { stuff1, stuff2, stuff3 };

// declare looper early so that it closes over itself
Action looper = null;
var iter = collection.GetEnumerator();
looper = () => {
  if (iter.MoveNext()){
    iter.Current(looper, () => {
      // handle failure
    });
  }
};
// don't forget to start the loop
looper();

Since this isn't so DRY, we can top it all off with a function that returns a function:

Action GetAsyncForeach<T>(IEnumerable<T> collection,
                          Action<T> body){
  var iter = collection.GetEnumerator();
  return () => {
    if (iter.MoveNext()){
      body(iter.Current);
    }
  };
}

void OtherFunc(){
  // ...
  var collection =
    new List<Action<Action, Action>>() { stuff1, stuff2, stuff3 };
  
  Action checker = null;
  checker = GetAsyncForeach(collection, (current) => {
    current(checker, () => {
      // handle failure
    });
  });
  checker(); // start the loop
}

We now have a DRY, re-usable component for implementing an asynchronous foreach loop in our code. It's not the most elegant approach, but it works really well and we don't need that much extra boilerplate to get this done (if C# supported a let rec keyword, we could make it even shorter!).
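As a cross-check of the design, here is a small Python sketch of the same function-returning-a-function idea. get_async_foreach mirrors the C# version above; make_step, trace and the sentinel _DONE are hypothetical helpers added for the example.

```python
_DONE = object()  # sentinel marking the end of the collection


def get_async_foreach(collection, body):
    """Return a function that feeds the next item to body each time it is called."""
    it = iter(collection)

    def advance():
        item = next(it, _DONE)
        if item is not _DONE:
            body(item)

    return advance


trace = []


def make_step(name):
    """Build a step that records its name and then continues the loop."""
    def step(proceed, fail):
        trace.append(name)
        proceed()
    return step


def body(current):
    # checker is looked up when body runs, so it can be defined afterwards
    current(checker, lambda: trace.append("failed"))


checker = get_async_foreach(
    [make_step("a"), make_step("b"), make_step("c")], body)
checker()  # start the loop
```

Python resolves checker at call time, so the "declare it as null first" step needed in C# disappears here.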

This is a useful method of looping through some asynchronous tasks that you may have to do. I found myself needing this sort of thing when calling ShowDialog would lock the entire GUI while the system waited for the user to input something, even though the user would sometimes have to attend to something else before responding to the dialog. Since later actions in the loop depended on the result of the dialog box, a more asynchronous method was necessary.

Ultimately, this is why I believe that all programmers should have some experience with functional programming; this is a technique that would be obvious to a programmer in Lisp or OCaml but might be a bit trickier to someone who just has OO experience. Having functional programming know-how in your toolbelt will make you a better C# programmer.

Feb 12, 2012

Non-deterministic Programming - Amb

I've been very slowly plowing through SICP and I've recently read through its chapter on non-deterministic programming. When you program this way your variables no longer have just one value: they can take on all of their possible values until stated otherwise. An example:

x = amb(1, 2, 3, 4)

In this case x is all of 1, 2, 3, and 4. If you try to print out x it will print out 1 because the act of printing it temporarily forces a value, but otherwise you can treat the variable as though it had all of those values.

You can then force certain subsets of the values with assertions:

assert x.odd?

In this case x would become just 1 and 3. If you then added a final assertion that x > 2 you would force a single value and x would be 3. If you instead added the assertion x > 3 then x would have no values: an exception would be thrown saying that x is basically "impossible".
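The narrowing described above can be modelled for a single variable with a toy Python class. This is only a sketch of the observable behaviour, not of how a real amb operator works (that relies on backtracking); the class Amb and its require method are invented names.

```python
class Amb:
    """A toy 'ambiguous value': a list of candidates narrowed by assertions."""

    def __init__(self, *candidates):
        self.candidates = list(candidates)

    def require(self, predicate):
        """Keep only the candidates that satisfy predicate."""
        self.candidates = [v for v in self.candidates if predicate(v)]
        if not self.candidates:
            raise ValueError("no candidate satisfies the assertions")
        return self

    @property
    def value(self):
        """Observing the value temporarily forces the first candidate."""
        return self.candidates[0]


x = Amb(1, 2, 3, 4)
first = x.value                    # printing x would force this candidate
x.require(lambda v: v % 2 == 1)    # assert x.odd?
odd = list(x.candidates)
x.require(lambda v: v > 2)         # assert x > 2
```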

This is useful when you are searching for something. Suppose you're trying to find numbers that satisfy Pythagoras' theorem:

a = amb(*1..10)
b = amb(*1..10)
c = amb(*1..10)

assert a**2 + b**2 == c**2

puts a, b, c
next_value
puts a, b, c

This code would print out 3, 4, and 5 on the first output, followed by another solution on the second (4, 3, and 5 if the search is depth-first in the order the variables were declared, with 6, 8, and 10 turning up after that). The next_value function would tell the amb system to find another solution to the set of variables that satisfy the assertions we specified. If nothing else is found, an exception will be thrown.
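A rough Python model of this search, assuming the system tries values in the order written with the last-declared variable varying fastest (a real amb implementation may order things differently). The generator solutions is a made-up stand-in for the amb machinery, and each next() plays the role of next_value.

```python
from itertools import product


def solutions(limit=10):
    """Yield (a, b, c) with a^2 + b^2 == c^2, trying candidates depth-first."""
    values = range(1, limit + 1)
    # product varies the last position fastest, like backtracking over c,
    # then b, then a
    for a, b, c in product(values, repeat=3):
        if a**2 + b**2 == c**2:
            yield a, b, c


search = solutions()
first = next(search)     # corresponds to the first `puts a, b, c`
second = next(search)    # corresponds to calling next_value
everything = [first, second, *search]  # drain the remaining solutions
```

Once the generator is exhausted, a further next(search) raises StopIteration, which matches the "exception if nothing else is found" behaviour described above.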

Even more interesting, there is a library in Ruby called amb that implements this stuff. Unfortunately it doesn't work quite like the above example: you can only get either the first set of values that fit, or all of them:

require 'rubygems'
require 'amb'
include Amb::Operator

a = amb(*1..10)
b = amb(*1..10)
c = amb(*1..10)

# calling amb with no arguments causes it
# to backtrack until the criteria are met
amb unless a*a + b*b == c*c

# prints out the first match
puts "#{a}, #{b}, #{c}"

# prints out every match and then crashes
amb
puts "#{a}, #{b}, #{c}"

This is a pretty cool way to program, and it would be interesting to know when it might be practical to use. I tried it with a couple of problems on Project Euler, but since the backtracking search that amb uses isn't always the most efficient approach, it would choke a bit. Perhaps if this gem gets some attention and some love, it might end up as something a bit more performant!

Feb 10, 2012

The Brilliant Design of Magic Ink

I've been plowing through Bret Victor's Magic Ink essay and I've noticed a couple of very interesting UI elements that I really enjoy and hope to add to my own writing (it might even be worth adding these features to WordPress's publishing code).

These are:

Anchors at every paragraph: the essay is long. When I sent it to some of the fellow programmers at work, the first thing they did was complain about how long it was. Since you don't usually read a 50 page essay in one sitting, you need a way to bookmark not only the page you are on (useless since everything is on one page) but where you are on that page. Victor takes advantage of HTML's ability to set anchor points within a page to anchor every paragraph, so that when you don't really want to read anymore you can just click on the hash next to the paragraph (only visible when you mouse over the paragraph) and it will update your address bar to point at that paragraph. This is not exactly fancy modern technology: the entire feature just takes advantage of named anchors and the fact that when you click on one the browser doesn't reload the page. Extremely useful for long essays, yet I can't recall seeing this anywhere else.
Footnotes/endnotes are actually sidenotes. It's a bit annoying when you read an essay online that has footnotes/endnotes with a star or a cross or something and you have to scroll all the way to the bottom to see those notes (some posts make it easier by providing a link, but as vi users know, moving your hand to the mouse is a pain). It takes effort, and there is a disconnect between reading the text and navigating to wherever the note is. By putting the note on the side next to the paragraph, it is much easier to just move your eyes to look at the note rather than scrolling or clicking. This comes at the cost of screen real estate, but if you make the actual content of your essay narrower you don't lose a whole lot, since there is still a flow when you're reading line to line.

I really like these UI tweaks because they are so incredibly simple, yet somehow are not very common practice. If ever I write something that is long and actually decent enough to slog through I hope to remember these little features to help people better read my stuff.
