Tuesday, January 11, 2011

Haskell Northface for Mathematica Mountaineers

Preface


This is not a shootout between the Haskell programming language and Mathematica’s core language features. I simply like to illustrate how to solve problems in both languages using a cook book style approach. Mathematica and Haskell are very different beasts and all we do is toying with different tools at solutions to algorithmic problems. I’d like to adopt the view that programming languages are tools to help the mind to shape a solution for a problem.

Disclaimer: It is rather unlikely that you will learn to program in Haskell or in Mathematica just by reading this article! Simply, get a good book on Haskell, go through the online tutorials and utilize both languages to solve real problems. As always, it comes down to practice!

Introduction


For programmers coming from the C/Java language school or those who are mainly versed in imperative programming paradigms, a transition to functional programming thinking can be like a climb of the Eiger Northface when you're just trained in mountain hiking. Haskell’s quite distinct mathematical theorem like syntax seems rather daunting to understand. For Mathematica users who already practiced the functional style and rule style programming, the Haskell mountain may still be a climb but from higher up. Whatever the individual situation, if you are not an Haskell Illuminati you still have to climb!

Now, there are very good online tutorials and books on Haskell already available. I won’t repeat what is presented there. The Haskell tutorial section is quite extensiveand I’m certain you’ll find some good motivating blogs to get you started.

Personally, the two most unusual concepts about Haskell to call my mountain are the intricate use of the versatile parametrized type system which also handles recursive types and the abstract concept of Monads. Neither of those are prominent features in Mathematica and deserve some good consideration.

Haskell is a type-safe compiled language with limited introspection at runtime. The language is strongly typed. Even better, frequently the compiler can infer data types at compile time without the programmers intervention. Type safety is a really good feature, both compiler and programmer have to agree on data types, thus eliminating numerous errors before the actual program is being deployed and used.
Types in Haskell are quite versatile, almost everything can be typed. Basic primitives like integers, doubles and the usual suspects, or lists or enumeration structures, recursive structures and all sorts of functions have type specifiers. Haskell cranks up the complexity a notch by allowing multiple type parameters, which faintly resemble C++ templates, or Java Generics. Let’s don’t go to the comparison here, but at compile time Haskell can treat functions almost as data.
The concept of pure functions, that is functions with no side-effects other than computing derived data from input data, is a unique feature of functional languages. The term “side-effects” deserves a bit of explanation here. Haskell has no notion of an object comprised of data and methods. This is in contrast to other object oriented language (OOL) implementations where methods are invoked on objects. In OOL, objects are a compound of at least temporarily persistent data and methods. This combination can be thought of in a way of a finite state machine with methods to alter the internal state, query aspects of the state, or to derive new data from external to the object ones via methods that mingle the external data with the internal state data.
In OOL, a construct such as
    Y = obj.method(X)
may result in very different outcomes Y, even when X remains unaltered. The notion of “obj.method” hides all the complexity of the computation, the programmer doesn’t know what it does unless he is completely aware of the internal object state. And the object state can depend on a long history of previous alterations. Needless to say, erroneous assumptions of an object's state is frequently a source of programming flaws, and the result is software failure.
In Haskell, a pure functions is much more reliable,
Y = func X
the outcome Y is assured the same if X is the same, the sameness of X induces the sameness of Y (X -> Y).

Haskell, quickly runs into limitations with pure functions in real world applications. We need to get data in and out of the scope of a Haskell program to do something useful. These so called useful things can be processing program arguments, reading/writing files or data on Unix sockets (all external data connections), and of course displaying data or acting on external user generated events.
Now, that is where the abstract concept of Monads come into the game. It’s greek etymological origin “Monos” is not too enlightening either. Beware of the Greeks...! Again, there are some very good tutorials on Monads out there. Personally, I find this one quite elucidating: Yet Another Monad tutorial

I don’t want to delve into the Monad matter nor recite Vigil here. You’ll find plenty of elucidating material in these online tutorials. However, later on when we could need a Monad, I’m happy to give you my perspective.

Now, let’s move on to Mathematica’s core features. Mathematica is an interpreted language. At it’s core is the Lisp type symbolic expression (Sexpr). Essentially everything is an expression, data, functions, procedural programming flow control commands, variable symbols, you name it!
In short, a symbolic expression is like a list {a,b,c,d,...} that starts with a discriminator head: “Integer”, “List”, “Plus”, “If”. Mathematica syntax allows this:
(1)
Integer@@{1},  Plus@@{1,2}, If@@{Equal@@{x,y}, 1, 2},
.
Interestingly, all sorts of complicated stuff, data, procedural programs, pure functions, mathematical equations, symbolic transformations can be encoded in the symbolic expression language. In the early years of Lisp, the language designers still made a distinction between data encoded in Sexpr, versus the data transformations specified in meta expressions (Mexpr). However, it was quickly realized that data and transformations can be conceptually encoded the same way, namely via Sexpr. Hence, there is almost no distinction between data and a program! The examples in (1) are not very nice to read nor to write. Therefore, Mathematica provides plenty of syntactic display sugar and writing shortcuts to hide the fact that everything it internally represents is a simple symbolic expression. Here are some equivalent inputs:

Convenient notationPrefix notation (FullForm)Mathematica (Sexpr) notation
1Integer@@{1}
{1, a, {2, b}}List[1, a, List[2,b]]{1, Symbol@@{“a”}, {2, b}}
1 + 2Plus[1, 2]Plus@@{1,2}
1 + 2 * 3Plus[1, Times[2, 3]]Plus@@{1, Times@@{2,3}}
If[x==y, 1, 2]If[Equal[x,y], 1, 2]If@@{Equal@@{x,y}, 1, 2}
myFunc[x_Integer] = x^2Set[myFunc[x_Integer], Power[x,2]]see below (3)


The universality (homoiconicity) of symbolic expressions allows data to be packed together with methods or programs, hence all different programming styles can be mimiced to some degree. Static type checking doesn’t really exist, Mathematica is not a compiled language, but since symbolic expressions start with a head that discriminates what the expression is, type safety can be build in dynamically. In practice, dynamic type safety in Mathematica comes in the form via matched argument patterns:
(2)

myFunc[x_Integer] = x^2
which means, only use this transformation function when the argument's symbolic expression head is an integer. Truthfully, functions in Mathematica and also in Lisp are unlike functions in any compiled language, Haskell is no exception here!
A function in Mathematica or Lisp are transformation rules on Sexpr, and the transformation rules are also encoded as Sexpr. Hence, you can have very intricate self modifying programming systems, where one can build functions from so called data or modify functions via other functions. In Mathematica, function modification goes way beyond the usually known functional composition. Well, by now you notice that the distinction between data and a function is completely blurred.
In support of that claim we can give the function definition in (2) as an equivalent  Sexpr
(3)
Set@@{myFunc@@{Pattern@@{x, Blank@@{Integer}}}, Power@@{x,2}}
The capability to treat functions as data, obviates the need to introduce concepts such as highler level functions, meta-programming, or parametrized template programming, or generics or whatever. Also, there is no need to introduce new language constructs in order to group data and methods together into what is called object class hierarchies. Naturally, a list of symbolic expressions  already constitutes such a hierarchical grouping.

Lazy Haskell


Amongst the features, that  distinguishes Haskell from other languages is the pervasive use of lazy evaluation, it can be both a blessing and a curse. Let’s be precise what we mean. In Mathematica, we could evaluate the square root of the first 10 elements by firstly evaluating 1 000 000  square roots, then taking the first 10 results:


Timing[Take[Map[N[Sqrt[#]]&, Range[1000000]], 10]]

This takes a relatively long time (16s on 2.4 Ghz core duo 2), and it is pretty stupid to compute things that you don’t need anyway!

A direct translation into Haskell would read like this:

take 10 (map sqrt [1..1000000])

The answer is computed very quickly, only 10 square roots are being evaluated! Practically, the specified range in the map function can be semi-infinite [1..] and only the first 10 elements will ever be computed. Indeed, the example above is too simplistic. It is more a statement to avoid stupid things rather than a showcase for lazy evaluation. One could easily mend Mathematica’s wasted computation by pruning the range!

Now, let’s embark on some more complicated stuff. Say, we have a list of 2n elements, how can we generate all possible pairing between the distinguishable elements of that list? What is it good for? For a moment, let’s think of a group of 2n people and we would like to match them up.
Mathematically, this is called a fixed-point free involution. In Mathematica a list of 4 elements should give us 3 possible pairings:

Involutions[{1,2,3,4}]  -> {{{1, 2}, {3, 4}}, {{1, 3}, {2, 4}}, {{1, 4}, {2, 3}}}
.
Again, it shows that everything is a list in Mathematica; here a list of lists of lists. In Haskell, we would like to represent the involutions via the following representation

involutions [1,2,3,4]  -> [[(1, 2), (3, 4)], [(1, 3), (2, 4)], [(1, 4), (2, 3)]]

utilizing a list of lists of tuples (tuples have the characteristics of immutable lists).
Firstly, we’ll give some Mathematica code to generate the involutions:
(P1)
Involutions[{a_,b_}] := {{{a,b}}}
Involutions[pool_List] := Block[{n = Count[pool, _]}, If[n <= 2,
  {{pool}},
  Flatten[
   Map[
    Flatten[
      Outer[Join,
       {{{pool[[1]], pool[[#]]}}},
      Involutions[
        Drop[pool, {#}] // Rest
        ],
       1
       ],
      1] &,
    Range[2, n]],
   1]
  ]
 ]
.
The basic idea behind this algorithm is to select a pair (1,k) between the first and iteratively the 2nd to the 2n-th element of the given list, then to combine  the (1,k) pair with the involutions from a list where 1 and k is dropped. Sounds simple! Obviously, recursion plays a big role to solve this problem elegantly.

Now onto Haskell:
(P2)
import Data.List
involutions :: [Int] -> [[(Int,Int)]]
involutions [] = []
involutions [a] = []
involutions [a, b] = [[(a,b)]]
involutions (p0:ps) =
           let
               cycleset = init (scanl (\(x:xs) idx -> xs ++ [x]) ps ps)
           in
               concatMap
                   (\(p1:ps1) -> (p0,p1) `combine` (involutions ps1))
                   cycleset
           where
               combine :: (Int, Int) -> [[(Int,Int)]] -> [[(Int,Int)]]
               combine pair [] = [[pair]]
               combine pair (i0:is) = map (\partial -> pair : partial) (i0:is)


Mathematica has a few functions that don’t translate exactly into Haskell, Flatten, Outer, Drop are the ones used in (P1) that don’t have an exact translation or functionality in Haskell. How about translating the Haskell (P2) back into Mathematica with greater similarity?
(P3)
Involutions[{}] = {};
Involutions[{_}] := {};
Involutions[{a_, b_}] := {{{a, b}}}
Involutions[{p0_, ps__}] :=
Block[{cycleset = Drop[FoldList[RotateRight[#1, 1] &, {ps}, {ps}], -1],
  Combine = Function[{pair, inSet}, Map[Join[{pair}, #] &, inSet]] },
 Flatten[
  Map[{p0, #1 // First}~Combine~Involutions[Drop[#1, 1]] &, cycleset], 1]
 ]


The program examples (P1-P3) already exhibit a good a amount of idiosyncrasies in Mathematica’s functional style programming and the translations into the Haskell language. In many cases, it is possible to find equivalent Mathematica constructs for Haskell expressions, i.e. concatMap <-> Flatten[Map[ …], 1], scanl <-> FoldList , to name a few. Local variables and local functions as presented in (P2) in the ‘let’ and ‘where’ sections are placed in Mathematica’s ‘Block’ variable section. Function argument patterns are quite similar in both programming systems. On the other hand, there are many Mathematica rule based and procedural programming styles, that don’t translate well into Haskell.

However, the superficiality of the syntax beguiles the stark computational differences between Mathematica and Haskell. Haskell still uses lazy evaluation, try this in Haskell,
take 2 (involutions [1..100])
.
Now try the same in Mathematica,

Take[Involutions[Range[1,100]], 2]


and you may wait for an answer for an inordinate amount of time! This is a much better example that demonstrates the virtues of lazy evaluation. Haskell’s pull processing model only calculates required data. The default operating mode in Mathematica is the push processing model.

Although, Mathematica has some machinery to defer processing via ‘Hold’ type transformations, it would be a non-trivial task to apply these Hold transformations in the case of (P3) in order to mimic lazy evaluation a la Haskell.


Sequential thinking bubbles up, beware of the Monads!

We would like to apply a somewhat intricate test on our involutions. Specifically, we’d like to test whether an involution is disconnected. Involutions can be thought of as distinguishable beads on a string, one bead for each element in the list. Then, one connects only one chords between each two beads. In summery, there are (2n-1)!! configurations for 2n beads. Now, a disconnected involution is one where one can cut the string between the beads, pulling two pieces apart without cutting any chords.
Here’s some Mathematica code that will do the test
(P4)
IsDecomposable[{{}}] = False;
IsDecomposable[involution_List] :=
   Block[
       {triplesList = Map[{#[[1]], #[[2]], #[[2]] - #[[1]]} &, pairs], steppingList},
       steppingList =
           Fold[ReplacePart[#1, {#2[[1]] -> #2[[3]], #2[[2]] -> -#2[[3]]}] &,
           Range[2 Length[pairs]], triplesList];
       Return[Length[Select[Drop[Accumulate[steppingList], -1], # == 0 &]] > 0]]


A major difference in Haskell is, that there is no way to modify an existing list by replacing elements. Mathematica’s 'ReplacePart' doesn’t exist in Haskell. However, all we need is to quickly construct a new list with a specific element replaced:
(P5.1)
replacePart _ _ [] = []
replacePart pos e l@(y:ys) = concat [take pos l, [e], drop (pos+1) l])

Hence, the Haskell code to perform the same test can be formulated as:
(P5.2)
decomposable :: [(Int,Int)] -> Bool
decomposable [] = False
decomposable [(_,_)] = False
decomposable involution@(x:_) =
           let
               accumulativeList = accumulate steppingList
               steppingList = foldl'
                   (\l s -> replacePart (fst.fst $ s) (snd s)
                                              (replacePart (snd.fst $ s) ((*) (-1) $ snd s) l))
                   initialList
                   triplesList
               triplesList = map (\p -> ((fst p - 1, snd p - 1), snd p - fst p)) involution
           in
               not.null $ filter (== 0) $ init accumulativeList
           where
               accumulate :: [Int] -> [Int]
               accumulate [] = [0]
               accumulate (y:ys) = scanl (+) y ys
               initialList = [0 | iter <- [1..n]] :: [Int]
               n = (*2) $ length involution

Notice, there is only one easily readable function chain in the “in” section. All the aiding definitions are freely distributed in the “let” and “where” sections. Quite interesting is the fact, that the aiding definitions are not written in the sequence in which they are needed. With non-strict evaluation, Haskell figures out the sequence itself! Also, the omission of commas in a function argument sequence leads to grouping ambiguities that need to be resolved with either plenty of parenthesis or an application de-lineator ($). In Haskell, one finds plenty of type definitions, i.e. “accumulate :: [Int] -> [Int]”, stating that accumulate is a function that takes in and produces a list of integers. Otherwise, the list related computational aspects are rather similar in both programs (P4) and (P5).

Let’s print out the first non-decomposable pairing. Naively, we could program a very simple “for loop” with a break statement in Mathematica:
(P6)
invList = Involutions[Range[1, 12]];
For[i = 0, i < Length[invList], i++,
   Print["Looping ", i];
   If[IsDecomposable[invList[[i]]],
       Continue[],
       Print["not decomposable ", invList[[i]]];
       Break[];
   ]
]
.
The shortcomings in (P6) are quite clear. Firstly, Mathematica needs to compute the full list of all involutions, remember there are (2n-1)!! many ones, and then Mathematica loops over it only partially. What a waste!


Astonishingly, an almost imperative sequential “for loop” can be used in Haskell to print out the first non-decomposable involution

(P7.1)
instance MonadPlus IO where
   mzero       = ioError (userError "break point")
   m `mplus` n = m `catch` \_ -> n
main :: IO ()
main = forM_ (involutions [1..12]) $ \x -> do
   putStr "Looping: "
   if (not.decomposable) x
   then
       do
           putStr "not decomposable "
           print x
           guard False
   else return ()

, but do things look strange here! Also, where did the iteration variable “i” go?


Well, I guess it is the time to say something not too verbose and self evident and stupid about Monads. If you heard about Monads in Haskell before and you associate with them unpure functions that do computer IO things or think of them as special function values (monadic value), then please try to forget this association at least temporarily! A large number of tutorials is devoted to make the abstract and language agnostic concept of Monads more palatable. If there is a correlation between an abstract concept and the number of tutorials devoted to it, then we have a proof that Monads are rather abstract. You can follow many different paths to your Monad enlightenment, don't depend on my examples here!


Digging deeper into the “forM_” definition, it states that “forM_” consumes an ordinary value [a0, a1, a2, …] out of a list and places it into a monadic context, whereupon everything is chained together. (P7.1) can be rewritten,

(P7.2)
instance MonadPlus IO where
   mzero       = ioError (userError "break point")
   m `mplus` n = m `catch` \_ -> n


mF :: [(Int,Int)] -> IO ()
mF = \x -> do
   putStr "Looping: "
   if (not.decomposable) x
   then
       do
           putStr "not decomposable "
           print x
           guard False
   else return ()


main :: IO ()
main = mapM_ mF (involutions [1..12])

where “mF” ingests an ordinary value and produces a monadic value of type IO (). Now, how does this help to understand the "for loop" in (P7.1) and to bring back the iteration variable “i”? We’re almost there. The missing link is to realize, that the “mapM_” function also chains together the list of monadic values “mF a”:


mF a0 >> mF a1 >> mF a2 >> ...

All monadic types in a chain must be the same! Indeed, the compiler tries hard to verify the proper types. Here in our case, it is type IO () for input-output with no values handed down the chain.

Equally well, we could pass values (monadic values) between the chain elements, and thus quickly implement a loop counter like this:
(P7.3)

mF2 :: Int -> [(Int,Int)] -> IO Int
mF2 =
       \iter x -> do
           putStr "Looping: "
           print iter
           if (not.decomposable) x
           then
               do
                   putStr "not decomposable "
                   print x
                   guard False
                   return (iter+1)
           else return (iter+1)

main2 :: IO Int
main2 = foldM mF2 0 (involutions [1..12])

Now, the monadic chain of type “IO Int” looks like this:

let mF2f = flip mF2
return 0 >>= mF2f a0 >>= mF2f a1 >>= mF2f a2 >>=...

where let mF2f = flip mF2 defines a function with reversed arguments. A bit more Haskell idiosyncratic is the following syntax

return 0 >>= \idx -> mF2 idx a0 >>= \idx -> mF2 idx a1 >>= \idx -> mF2 idx a2

.One can view monadic chains as a sort of type prison, the monad type is immutable in the chain. A monadic value that progresses down the chain, can’t have its type altered. A monadic IO type chain cannot transform itself into a pure function again. Therefore, the Haskell compiler can safely assume, that the resulting value of the chain is “IO Int”. As an added benefit, the compiler now knows that it must apply strict evaluation in sequence and to consider IO side effects. Please, don't think that Monads are all about IO interactions! The Monad type class and the Monad laws are there to assure the integrity of a Monad chain and to provide a nice interplay with Monad agnostic functions, "mapM, liftM ..." . Some libraries have Monad type constructors without side effects and the need for strict evaluation.


Of course, we could have avoided the “for loop” construction entirely and simply play with pure functions!
take 1 $ filter (not.decomposable) $ involutions [1..12]
Well, thats enough for this article. I plan to tackle more fun problems like solving a Sudoku puzzle in as few steps as deterministically possible or to implement probabilistic updates on Junction Trees in the next articles.

No comments:

Post a Comment