The other day we had a meeting at work with a former colleague (now at QMUL) to discuss general project progress. The topics covered included the somewhat complicated workflow that we're using for doing optical music recognition (OMR) on early printed music sources. It includes mensural notation specific OMR software called Aruspix. Aruspix itself is fairly accurate in its output, but the reason why our workflow is non-trivial is that the sources we're working with are partbooks; that is, each part (or voice) of a multi-part texture is written on its own part of the page, or even on a different page. This is very different to modern score notation in which each part is written in vertical alignment. In these sources, we don't even know where separate pieces begin and end, and they can actually begin in the middle of a line. The aim is to go from the double page scans ("openings") to distinct pieces with their complete and correctly aligned parts.
Anyway, our colleague from QMUL was very interested in this little part of the project and suggested that we spend the afternoon, after the style of good software engineering, formalising the workflow. So that's what we did. During the course of the conversation diagrams were drawn on the whiteboard. However (and this was really the point of this post) I made notes in Haskell. It occurred to me a few minutes into the conversation that laying out some types and the operations over those types that comprise our workflow is pretty much exactly the kind of formal specification we needed.
Here's what I typed:
module MusicalDocuments where import Data.Maybe -- A document comprises some number of openings (double page spreads) data Document = Document [Opening] -- An opening comprises one or two pages (usually two) data Opening = Opening (Page, Maybe Page) -- A page comprises multiple systems data Page = Page [System] -- Each part is the line for a particular voice data Voice = Superius | Discantus | Tenor | Contratenor | Bassus -- A part comprises a list of musical sybmols, but it may span mutliple systems --(including partial systems) data Part = Part [MusicalSymbol] -- A piece comprises some number of sections data Piece = Piece [Section] -- A system is a collection of staves data System = System [Staff] -- A staff is a list of atomic graphical symbols data Staff = Staff [Glyph] -- A section is a collection of parts data Section = Section [Part] -- These are the atomic components, MusicalSymbols are semantic and Glyphs are --syntactic (i.e. just image elements) data MusicalSymbol = MusicalSymbol data Glyph = Glyph -- If this were real, Image would abstract over some kind of binary format data Image = Image -- One of the important properties we need in order to be able to construct pieces -- from the scanned components is to be able to say when objects of the some of the -- types are strictly contiguous, i.e. this staff immediately follows that staff class Contiguous a where immediatelyFollows :: a -> a -> Bool immediatelyPrecedes :: a -> a -> Bool immediatelyPrecedes a b = b `immediatelyFollows` a instance Contiguous Staff where immediatelyFollows :: Staff -> Staff -> Bool immediatelyFollows = undefined -- Another interesting property of this data set is that there are a number of -- duplicate scans of openings, but nothing in the metadata that indicates this, -- so our workflow needs to recognise duplicates instance Eq Opening where (==) :: Opening -> Opening -> Bool (==) a b = undefined -- Maybe it would also be useful to have equality for staves too? instance Eq Staff where (==) :: Staff -> Staff -> Bool (==) a b = undefined -- The following functions actually represent the workflow collate :: [Document] collate = undefined scan :: Document -> [Image] scan = undefined split :: Image -> Opening split = undefined paginate :: Opening -> [Page] paginate = undefined omr :: Page -> [System] omr = undefined segment :: System -> [Staff] segment = undefined tokenize :: Staff -> [Glyph] tokenize = undefined recogniseMusicalSymbol :: Glyph -> Maybe MusicalSymbol recogniseMusicalSymbol = undefined part :: [Glyph] -> Maybe Part part gs = if null symbols then Nothing else Just $ Part symbols where symbols = mapMaybe recogniseMusicalSymbol gs alignable :: Part -> Part -> Bool alignable = undefined piece :: [Part] -> Maybe Piece piece = undefined
I then added the comments and implemented the
part function later
on. Looking at it now, I keep wondering whether the types of the
functions really make sense; especially where a return type is a type
that's just a label for a list or pair.
I haven't written much Haskell code before, and given that I've only implemented one function here, I still haven't written much Haskell code. But it seemed to be a nice way to formalise this procedure. Any criticisms (or function implementations!) welcome.