Friday, 31 July 2009

Message Relays, Parrot Fashion

Lately I have been thinking a lot about pirates. Pirates are ubiquitous in programming language design but they've had a bad press. This post hopes to redress the balance a little.

I was recently chatting to my pirate buddy Elfonso, and he related to me a problem that he'd recently encountered concerning his four crewmates: Albert, Blackbeard, Cuthbert and Dread. The five of them had taken to living on the desert islands shown on the delightful map below.

The pirates like their privacy and each insists on having one island to himself. Following a dispute over some buried treasure, Elfonso only really talks to Dread; the other four all talk to each other, except Dread and Blackbeard who've never really seen eye to eye (they both have patches). Even when chillaxing on their private islands, the pirates like to send messages to each other, and do so by the classic medium of string and tin cans. Due to a shortage of string, the islands are now wired together somewhat haphazardly, as the map shows. The pirates will forward messages for each other if no direct route is available between the two islands, but they'd really rather not. Elfonso's question was, "Where should each pirate live so that they can all send messages to their friends with the minimum amount of forwarding?"

Let's try a simple arrangement (see left) and see what happens. Let's say Elfonso takes the middle island, and the other four pirates take the four islands that have a direct connection with him. Now when Elfonso and Dread want to exchange messages, they can do it directly; each of the other pairings (Albert with Blackbeard, Cuthbert, and Dread; Dread with Cuthbert and Cuthbert with Blackbeard) will require Elfonso to forward the message. So if each pair of pirates who are on speaking terms exchange one message, that will require five forwards in total. We'll say that this arrangement has a score of 5, remembering of course that a lower score is better. Is it possible to improve on this arrangement?

There is a simple but very time-consuming way to find the best possible arrangement; we simply try every possible combination of pirates and islands, and work out the score for each one, then take the best of the bunch. This is quite feasible for five pirates and eight islands, but (planning ahead) we'd like a method that works for thousands of islands and hundreds of pirates! Even the fastest computer would take hours to solve a problem that large with this "brute force" method, and pirates are notoriously technophobic in any case. Clearly we need something different.

One solution is to use a method which will hopefully find a "fairly good" arrangement very quickly. This allows us to trade off the quality of the arrangement with how long we are willing to spend looking for it. This is where I felt the CPRG could maybe help Elfonso out.

A simple approximate method is to put the pirate with the most friends on the island with the most wires, the second most popular on the second best-connected island, and so on. In this case this leads us to the arrangement on the left. The real weakness in this case is that Dread and Elfonso are at opposite ends of the map, so every message between them will need to be forwarded by Albert and Blackbeard (or Albert and Cuthbert). But overall this isn't too bad; it has a "score" of 6, which is clearly not ideal, but it was quick to work out and in some contexts that would be good enough.

My current work is in finding a method somewhere between these two; still fast enough to solve examples involving large numbers of pirates and islands, but producing better solutions than the naive method of giving the chattiest pirates the best-connected islands. How can we do this? There are many possible approaches. One of the weaknesses of the simple approach above is that it ignores the structure of the islands and the pirate's communications; even if two pirates talk to each other a lot, they won't necessarily be given nearby islands. I am investigating methods which take this structure into account. Early results suggest that it's often possible to get within 10% - 20% of optimal with approaches that run 10 or 20 thousand times faster than brute force.

Having read this far, you may now be scratching your head and wondering how on Earth this could pass for computer science. Maybe you figured it out already. The islands are analogous to computers, joined together in a network; or maybe, processors joined together on a chip. The pirates are particular programs which need to run, and maybe communicate with each other. At this point the simple model of "communicate or don't communicate" breaks down; instead we need to know how often each program needs to send or receive data to each other program, and how fast the links between the computers can provide that data. But in fact this extra structure usually makes it easy to find a "reasonably good" solution, even if it complicates finding optimal solutions.

The final task, of finding an optimal solution for Elfonso and his piratical buddies, is left as an exercise to the reader.