Thursday, 1 March 2012

Queries in general-purpose programming languages

The CPRG group has informal meetings where we discuss various topics related to programming languages, including current trends in programming languages, new features, and other topics that some of our members are interested in.

At the last meeting, I gave a brief presentation about query support in general-purpose programming languages. The talk does not, of course, cover the broad range of research that has been done in the area - instead, I focused on some languages that I'm familiar with and on features that make them interesting in some way, so I talked about LINQ (in C#), monad comprehensions in Haskell and customizable queries in F# 3.0.

The slides from the talk can be found on SlideShare:



One of the topics discussed in the talk is how the join clause in LINQ corresponds to the operation of applicative functors. To find out more about this topic, you can also read Beyond the Monad fashion (I.): Writing idioms in LINQ and Beyond the Monad fashion (II.): Creating web forms with LINQ on my blog.
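
To give the flavour of the correspondence in Haskell terms - a minimal sketch of the underlying idea, not the LINQ encoding itself - an applicative functor combines computations that are independent of one another, which is roughly the pattern a join captures, whereas a monadic bind lets later steps depend on earlier results:

    -- Applicative: the two sources are independent, so they can be
    -- combined pointwise.
    pairs :: [Int] -> [Int] -> [(Int, Int)]
    pairs xs ys = (,) <$> xs <*> ys

    -- Monadic: the second source depends on the value drawn from the
    -- first, which pure applicative style cannot express.
    dependent :: [Int] -> [(Int, Int)]
    dependent xs = do x <- xs
                      y <- [0 .. x]
                      return (x, y)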

Monday, 3 May 2010

Video Talk: Haskell Type Constraints Unleashed

Unfortunately, due to the air travel disruption in Europe in April, I was unable to fly to Japan to attend FLOPS 2010 to present my paper on extensions to Haskell's type constraint system. Fortunately, my inability to attend provided the motivation to record a video talk. See below for links to the video and to the paper.

Tuesday, 9 February 2010

A domain specific language for binary file formats

I recently wrote a parser for the Flash file format specification, in Haskell. I managed to build support for reading and writing all of the Flash 10 specification (apart from the embedded multimedia formats, such as H.264 and MP3) in only three days of committed coding.

How did I implement this 250+ page specification in such a short amount of time? Simple - I took the specification and made it executable.

The specification is full of tables describing the on-disk format that look like this:



At first, I manually coded up my library from these tables. Unfortunately, each table corresponds to no fewer than three pieces of Haskell:


  • A data type definition for the record (called FOCALGRADIENT in this case)

  • A function for reading that record from the file (of type SwfGet FOCALGRADIENT, where SwfGet is a state monad encapsulating a ByteString)

  • A function for writing that thing back (of type FOCALGRADIENT -> SwfPut (), where SwfPut is another monad)



It quickly became obvious that writing this code (and keeping the three versions in sync!) was totally untenable.

My solution was to turn my parser into a Literate Haskell file which can contain sections like the following:

   p18: RGB color record
\begin{record}
RGB
Field Type Comment
Red UI8 Red color value
Green UI8 Green color value
Blue UI8 Blue color value
\end{record}


I then have a preprocessor which replaces these wholesale with code implementing the data type, reader and writer. For our running example, we get this:

    data RGB = RGB{rGB_red :: UI8, rGB_green :: UI8, rGB_blue :: UI8}
             deriving (Eq, Show, Typeable, Data)

    getRGB
      = do rGB_red <- getUI8
           rGB_green <- getUI8
           rGB_blue <- getUI8
           return (RGB{..})

    putRGB RGB{..}
      = do putUI8 rGB_red
           putUI8 rGB_green
           putUI8 rGB_blue
           return ()


Choice and Repetition



Frequently, some part of the record is repeated a number of times that depends on an earlier part of the record. In other cases, the exact shape a field takes can depend on an earlier one. For example:

    p96: ActionConstantPool
\begin{record}
ActionConstantPool
Field Type Comment
ActionConstantPool ACTIONRECORDHEADER ActionCode = 0x88
Count UI16 Number of constants to follow
ConstantPool STRING[Count] String constants
\end{record}


This record contains a field-controlled repetition count. My generator is smart enough to:


  • Generate a Haskell expression from the repetition count within the square brackets. In this case it is just Count, but in general this can contain arithmetic and references to several fields.

  • Omit the Count field from the generated data type, and infer it from the length of the ConstantPool when we come to put the record back (see the sketch after this list)
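
To make that concrete, here is a hand-written sketch of the kind of reader and writer the generator produces for this record (simplified to ignore the header field; the record field name and the UI16/STRING helpers are my stand-ins, not the library's actual identifiers):

    import Control.Monad (replicateM)

    -- Count is omitted from the data type; only the pool itself is stored.
    data ActionConstantPool = ActionConstantPool
        { actionConstantPool_constantPool :: [STRING] }

    getActionConstantPool
      = do count <- getUI16                     -- read the count...
           pool <- replicateM (fromIntegral count) getSTRING
           return (ActionConstantPool pool)     -- ...but do not store it

    putActionConstantPool (ActionConstantPool pool)
      = do putUI16 (fromIntegral (length pool)) -- count inferred on write
           mapM_ putSTRING pool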



There is similar support for choice. For example, look at the CodeTable field in DefineFontInfo:

    p177: DefineFontInfo
\begin{record}
DefineFontInfo
Field Type Comment
Header RECORDHEADER Tag type = 13
FontID UI16 Font ID this information is for.
FontNameLen UI8 Length of font name.
FontName UI8[FontNameLen] Name of the font (see following).
FontFlagsReserved UB[2] Reserved bit fields.
FontFlagsSmallText UB[1] SWF 7 file format or later: Font is small. Character glyphs are aligned on pixel boundaries for dynamic and input text.
FontFlagsShiftJIS UB[1] ShiftJIS character codes.
FontFlagsANSI UB[1] ANSI character codes.
FontFlagsItalic UB[1] Font is italic.
FontFlagsBold UB[1] Font is bold.
FontFlagsWideCodes UB[1] If 1, CodeTable is UI16 array; otherwise, CodeTable is UI8 array.
CodeTable If FontFlagsWideCodes, UI16[] Otherwise, UI8[] Glyph to code table, sorted in ascending order.
\end{record}


The field changes representation based on FontFlagsWideCodes - this is represented in the corresponding Haskell record as an Either type, and the FontFlagsWideCodes flag is not put in the record at all, as it can be unambiguously inferred from whether we are Left or Right.
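
As a hypothetical illustration of why the flag can be dropped (the function here is made up for exposition), the writer can recover it from the constructor alone:

    -- Left holds a narrow UI8 code table, Right a wide UI16 one, so
    -- FontFlagsWideCodes is just a projection of the constructor.
    codeTableWideCodes :: Either [UI8] [UI16] -> Bool
    codeTableWideCodes (Left _) = False
    codeTableWideCodes (Right _) = True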

Bit Fields



The fields we saw in DefineFontInfo with the type UB were bit fields: UB[n] is an unsigned number n bits wide. Here is another example:

    \begin{record}
BLURFILTER
Field Type Comment
BlurX FIXED Horizontal blur amount
BlurY FIXED Vertical blur amount
Passes UB[5] Number of blur passes
Reserved UB[3] Must be 0
\end{record}


Bit fields pose a problem - how should they be represented in Haskell? If I know the number of bits statically, there might be a suitable type I can choose. For example, my generator will use Bool for UB[1] fields. Unfortunately, in general I'll have to choose a Haskell type which is large enough to hold the maximum possible number of bits I might encounter.

What I have chosen to do is represent any UB[n] field (where n is not 1) with a Word32 (32 bits seems to be the upper limit on any variable-length bitfields I've encountered in practice, though this isn't part of the specification). What this means, however, is that my generator must produce runtime assertions in the writing code to ensure that we won't lose any information by truncating the Word32 to the number of bits we actually have available.
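
A minimal sketch of such an assertion, assuming a putBits primitive alongside the SwfPut monad mentioned earlier (the real generated code differs in detail):

    import Data.Bits (shiftL)
    import Data.Word (Word32)

    -- Write a UB[nbits] field, first checking that the value actually fits.
    putUB :: Int -> Word32 -> SwfPut ()
    putUB nbits x
      | nbits >= 32 || x < (1 `shiftL` nbits) = putBits nbits x
      | otherwise = error ("putUB: " ++ show x ++ " does not fit in "
                           ++ show nbits ++ " bits")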

Abstraction



Sometimes, the structure of a record varies depending on the context in which it is used. For example, the GLYPHENTRY record can only be read and written if we know the number of bits used for its two fields. This is encoded as follows:

    p192: Glyph entry
\begin{record}
GLYPHENTRY(GlyphBits, AdvanceBits)
Field Type Comment
GlyphIndex UB[GlyphBits] Glyph index into current font.
GlyphAdvance SB[AdvanceBits] x advance value for glyph.
\end{record}


The "parameter list" after GLYPHENTRY in the header must be filled out with concrete arguments by any user of GLYPHENTRY, such as in the GlyphEntries field of the following:

    \begin{record}
TEXTRECORD(TextVer, GlyphBits, AdvanceBits)
Field Type Comment
TextRecordType UB[1] Always 1.
StyleFlagsReserved UB[3] Always 0.
StyleFlagsHasFont UB[1] 1 if text font specified.
StyleFlagsHasColor UB[1] 1 if text color specified.
StyleFlagsHasYOffset UB[1] 1 if y offset specified.
StyleFlagsHasXOffset UB[1] 1 if x offset specified.
FontID If StyleFlagsHasFont, UI16 Font ID for following text.
TextColor If StyleFlagsHasColor, If TextVer = 2, RGBA Otherwise RGB Font color for following text.
XOffset If StyleFlagsHasXOffset, SI16 x offset for following text.
YOffset If StyleFlagsHasYOffset, SI16 y offset for following text.
TextHeight If StyleFlagsHasFont, UI16 Font height for following text.
GlyphCount UI8 Number of glyphs in record.
GlyphEntries GLYPHENTRY(GlyphBits, AdvanceBits)[GlyphCount] Glyph entry (see following).
Padding PADDING8 Padding to byte boundary
\end{record}


Records abstracted over parameters generate Haskell reading and writing code which is lambda-abstracted over those same parameters, just as you might expect. Likewise, occurrences of argument lists correspond to applications of arguments to a nested reading or writing function.
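
For example, a sketch of the reader this scheme yields for GLYPHENTRY, and of the corresponding application site inside TEXTRECORD's reader (the helper names are again my own, not the library's):

    getGLYPHENTRY :: Int -> Int -> SwfGet GLYPHENTRY
    getGLYPHENTRY glyphBits advanceBits
      = do glyphIndex <- getUB glyphBits    -- field widths supplied by the caller
           glyphAdvance <- getSB advanceBits
           return (GLYPHENTRY glyphIndex glyphAdvance)

    -- Inside getTEXTRECORD, the argument list becomes a plain application:
    --   glyphEntries <- replicateM (fromIntegral glyphCount)
    --                              (getGLYPHENTRY glyphBits advanceBits)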

Conclusion



Having a declarative description of the contents of SWF files helped enormously. It had the following benefits:


  • Very rapid development. I essentially just pasted in the contents of the specification for each record. This also led to a very low rate of bugs in the code (though I found some specification bugs, I can hardly be blamed for those :-)

  • Easy post-hoc changes when adding new features.
    • I only added the assertions that check bit field lengths quite late in development, and it was a painless generator change. If I had handwritten everything, I would have had to do a lot of tedious and error-prone programming to add a small feature like this.

    • I could make use of the declarative information to generate reasonable error messages for showing to the user when they try to write back an invalid SWF. Again, this is something I decided to do after transcribing the specification.

  • Very easy to read and spot bugs. Legibility was a goal when designing the DSL.

  • It's relatively easy to switch to custom Haskell code for dealing with those records that can't be expressed in the DSL - you can copy and paste the unsatisfactory generated Haskell and then mould it to fit your needs.



Using custom code (as per the last bullet point) is sometimes indispensable for making the Haskell records look nicer. For example, it lets you stop generating this sort of record:

    data RECT = RECT { rECT_nbits :: UI8, rECT_xmin :: SB, rECT_xmax :: SB, rECT_ymin :: SB, rECT_ymax :: SB }


And, by observing that you can compute nbits from the maximum of the base-2 logarithms of the four SB fields, generate this one instead:

    data RECT = RECT { rECT_xmin :: SB, rECT_xmax :: SB, rECT_ymin :: SB, rECT_ymax :: SB } 
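
The recomputation itself is straightforward. A minimal sketch, assuming the SB fields are stored as Int32 and glossing over corner cases such as minBound:

    import Data.Int (Int32)

    -- A signed field needs one sign bit plus enough bits for its magnitude.
    signedBits :: Int32 -> Int
    signedBits x = 1 + magnitudeBits (abs x)
      where magnitudeBits 0 = 0
            magnitudeBits n = 1 + magnitudeBits (n `div` 2)

    -- The omitted Nbits field is the maximum over the four coordinates.
    rectNBits :: RECT -> Int
    rectNBits r = maximum
        (map signedBits [rECT_xmin r, rECT_xmax r, rECT_ymin r, rECT_ymax r])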


The fact that you have to use copy and paste to get this done is something which I'm not totally happy with. I've been looking at approaches to turn my DSL into much more of a combinator library (i.e. taking the focus away from using an external code generator).

A combinator-based approach would make it easy to add extra primitive combinators that know about special cases like this one - but more on that another time!

Interested souls can get the code for my SWF library on GitHub. It can successfully roundtrip all of the example files that I could gather from the Gnash and Flash Gordon projects.

Monday, 23 November 2009

Slides from talk: Ypnos: Declarative, Parallel Structured Grid Programming

Slides from this talk, given on Friday 20th November at the CPRG weekly seminar, can be found here. The paper accompanying the talk can be found here.

Talk abstract:

A fully automatic, compiler-driven approach to parallelisation can result in unpredictable time and space costs for compiled code. On the other hand, a fully manual approach to parallelisation can be long, tedious, prone to errors, hard to debug, and often architecture-specific. This talk presents a declarative domain-specific language, Ypnos, for expressing structured grid computations which encourages manual specification of causally sequential operations but then allows a simple, predictable, static analysis to generate optimised, parallel implementations. Ypnos is sufficiently restricted that optimisation and parallelisation are guaranteed.

Friday, 31 July 2009

Message Relays, Parrot Fashion

Lately I have been thinking a lot about pirates. Pirates are ubiquitous in programming language design but they've had a bad press. This post hopes to redress the balance a little.

I was recently chatting to my pirate buddy Elfonso, and he related to me a problem that he'd recently encountered concerning his four crewmates: Albert, Blackbeard, Cuthbert and Dread. The five of them had taken to living on the desert islands shown on the delightful map below.



The pirates like their privacy and each insists on having one island to himself. Following a dispute over some buried treasure, Elfonso only really talks to Dread; the other four all talk to each other, except Dread and Blackbeard who've never really seen eye to eye (they both have patches). Even when chillaxing on their private islands, the pirates like to send messages to each other, and do so by the classic medium of string and tin cans. Due to a shortage of string, the islands are now wired together somewhat haphazardly, as the map shows. The pirates will forward messages for each other if no direct route is available between the two islands, but they'd really rather not. Elfonso's question was, "Where should each pirate live so that they can all send messages to their friends with the minimum amount of forwarding?"

Let's try a simple arrangement (see left) and see what happens. Let's say Elfonso takes the middle island, and the other four pirates take the four islands that have a direct connection with him. Now when Elfonso and Dread want to exchange messages, they can do it directly; each of the other pairings (Albert with Blackbeard, Cuthbert, and Dread; Dread with Cuthbert and Cuthbert with Blackbeard) will require Elfonso to forward the message. So if each pair of pirates who are on speaking terms exchange one message, that will require five forwards in total. We'll say that this arrangement has a score of 5, remembering of course that a lower score is better. Is it possible to improve on this arrangement?

There is a simple but very time-consuming way to find the best possible arrangement; we simply try every possible combination of pirates and islands, and work out the score for each one, then take the best of the bunch. This is quite feasible for five pirates and eight islands, but (planning ahead) we'd like a method that works for thousands of islands and hundreds of pirates! Even the fastest computer would take hours to solve a problem that large with this "brute force" method, and pirates are notoriously technophobic in any case. Clearly we need something different.
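
For concreteness, here is a minimal Haskell sketch of the brute-force search. Everything in it is made up for illustration: forwards is assumed to give the number of forwards a message needs between two islands, and friends to list the pairs of pirates on speaking terms:

    import Data.List (permutations, minimumBy)
    import Data.Ord (comparing)

    type Pirate = Int
    type Island = Int

    -- Try every assignment of islands to pirates and keep the cheapest;
    -- pirate p lives on placement !! p.
    bruteForce :: [Island] -> [(Pirate, Pirate)]
               -> (Island -> Island -> Int) -> [Island]
    bruteForce islands friends forwards
      = minimumBy (comparing score) (permutations islands)
      where
        score placement
          = sum [ forwards (placement !! a) (placement !! b)
                | (a, b) <- friends ]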

One solution is to use a method which will hopefully find a "fairly good" arrangement very quickly. This allows us to trade off the quality of the arrangement with how long we are willing to spend looking for it. This is where I felt the CPRG could maybe help Elfonso out.


A simple approximate method is to put the pirate with the most friends on the island with the most wires, the second most popular on the second best-connected island, and so on. In this case this leads us to the arrangement on the left. The real weakness in this case is that Dread and Elfonso are at opposite ends of the map, so every message between them will need to be forwarded by Albert and Blackbeard (or Albert and Cuthbert). But overall this isn't too bad; it has a "score" of 6, which is clearly not ideal, but it was quick to work out and in some contexts that would be good enough.
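
In the same illustrative vein, reusing the types from the brute-force sketch and assuming hypothetical degree functions, the heuristic amounts to a pair of sorts:

    import Data.List (sortOn)
    import Data.Ord (Down(..))

    -- Rank pirates by friend count and islands by wire count, descending,
    -- then pair them off rank for rank.
    greedy :: [Pirate] -> [Island]
           -> (Pirate -> Int) -> (Island -> Int) -> [(Pirate, Island)]
    greedy pirates islands pirateDegree islandDegree
      = zip (sortOn (Down . pirateDegree) pirates)
            (sortOn (Down . islandDegree) islands)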

My current work is in finding a method somewhere between these two; still fast enough to solve examples involving large numbers of pirates and islands, but producing better solutions than the naive method of giving the chattiest pirates the best-connected islands. How can we do this? There are many possible approaches. One of the weaknesses of the simple approach above is that it ignores the structure of the islands and the pirates' communications; even if two pirates talk to each other a lot, they won't necessarily be given nearby islands. I am investigating methods which take this structure into account. Early results suggest that it's often possible to get within 10-20% of optimal with approaches that run 10 or 20 thousand times faster than brute force.

Having read this far, you may now be scratching your head and wondering how on Earth this could pass for computer science. Maybe you figured it out already. The islands are analogous to computers, joined together in a network; or maybe, processors joined together on a chip. The pirates are particular programs which need to run, and maybe communicate with each other. At this point the simple model of "communicate or don't communicate" breaks down; instead we need to know how often each program needs to send or receive data to each other program, and how fast the links between the computers can provide that data. But in fact this extra structure usually makes it easy to find a "reasonably good" solution, even if it complicates finding optimal solutions.

The final task, of finding an optimal solution for Elfonso and his piratical buddies, is left as an exercise to the reader.

Tuesday, 19 May 2009

Dictionaries: lazy or eager type class witnesses

I gave a talk at the Cambridge computer lab on May 15, 2009:

Type classes are Haskell's acclaimed solution to ad-hoc overloading. This talk gives an introductory overview of type classes and their runtime witnesses, dictionaries. It asks whether dictionaries should abide by Haskell's default lazy evaluation strategy.

Conceptually, a type class is a type-level predicate: a type is an instance of a type class iff the type provides an implementation for the overloaded functions. For instance, `Eq a' declares that type `a' implements a function `(==) :: a → a → Bool' for checking equality.

Type classes are used as constraints on type variables, in so-called constrained polymorphic functions. E.g. `sort :: Ord a => [a] → [a]' sorts a list with elements of any type `a' that is an instance of the Ord type class, i.e. provides implementations for comparison.

Witnesses for type class constraints are necessary to select the appropriate implementation for the overloaded functions at runtime. For instance, if `sort’ is called with Int elements, the Int comparison must be used, versus say Float comparison for Float elements.

Two forms of witnesses have been considered in the literature: runtime type representations and so-called dictionaries, of which the latter is the most common implementation technique, e.g. in GHC. Haskell implementations treat dictionaries just like all other data, as lazy values that may potentially consist of non-terminating computations. This way part of the type checker's work, which has made sure that the dictionaries do exist, is simply forgotten. Is this really necessary? Can performance be gained by exploiting the strict nature of dictionaries?
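
For readers unfamiliar with the translation, here is a minimal sketch of the standard dictionary-passing encoding (not GHC's exact internal representation):

    -- class Eq a where (==) :: a -> a -> Bool
    -- ...becomes a record of methods:
    data EqDict a = EqDict { eq :: a -> a -> Bool }

    -- A constraint becomes an extra argument, so
    -- elem :: Eq a => a -> [a] -> Bool turns into:
    elemD :: EqDict a -> a -> [a] -> Bool
    elemD d x = any (eq d x)

    -- An instance becomes a concrete dictionary value:
    eqDictInt :: EqDict Int
    eqDictInt = EqDict (==)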

You can get the slides here.

Wednesday, 8 April 2009

Compiler research centres

As Mary Hall, David Padua and Keshav Pingali observed in a February 2009 Commun. ACM article:
There are few large compiler research groups anywhere in the world today. At universities, such groups typically consist of a senior researcher and a few students, and their projects tend to be short term, usually only as long as a Ph.D. project.
This is certainly the case in the UK. Imperial's Software Performance Optimisation group fits that blueprint (OK, with a post-doc sandwiched between the senior researcher and the students). Whilst other research groups in the UK may appear to be larger, compiler researchers typically represent a smaller subset of researchers working on programming languages or computer architecture. For example, in Cambridge the Programming group is part of the Programming, Logic, and Semantics group (and a recent article in the IET magazine suggests that a senior member of the Programming group also belongs to the Computer Architecture group); in Oxford the Programming Tools group is part of the Programming Languages group; in Manchester and Edinburgh compiler researchers work in the Advanced Processor Technologies group and the Compiler and Architecture Design group, respectively. Note that only the Edinburgh group presents itself as compiler researchers (at least judging by the group name).

In the same Commun. ACM article we read the following recommendation:
...researchers must attempt radical new solutions that are likely to be lengthy and involved. The compiler research community (university and industry) must work together to develop a few large centers where long-term research projects with the support of a stable staff are carried out. Industry and funding agencies must work together to stimulate and create opportunities for the initiation of these centers.
The situation is definitely getting better in the US. In April 2009, the Defense Advanced Research Projects Agency (DARPA) awarded $16 million to the Platform-Aware Compilation Environment (PACE) project at Rice University, as part of its Architecture Aware Compiler Environment (AACE) programme. In March 2008, Intel and Microsoft announced a combined $20 million grant to the Parallel Computing Laboratory at UC Berkeley and to the Universal Parallel Computing Research Centre at Illinois (with the universities reportedly applying for additional funding of $7 million and $8 million, respectively, to match the industry grant). No doubt, these parallel computing centres will spend big on compiler projects (e.g. PACE is explicitly focused on compilers).

What about compiler funding in the UK? From what I see (by looking at the number of PhD studentships, post-doctoral fellowships and permanent positions), UK spending on compiler research seems to have been monotonically increasing over the past couple of years. However, this growth does not have as much support as in the US, so the UK risks losing its competitiveness in this field.