Chapter 7: Trees

I start off confidently, typing in the first example into my scratch
file:

 > data Tree = Leaf a | Branch (Tree a) (Tree a)

Of course, ghci complains bitterly, and strangely

    Prelude> :l 7.hs    [1 of 1] Compiling Tree             ( 7.hs, interpreted )
    7.hs:5:19: Not in scope: type variable `a'    7.hs:5:36: Not in scope: type variable `a'    7.hs:5:45: Not in scope: type variable `a'
    Failed, modules loaded: none.

Eventually I realise I started off a little too confidently and left out
the parametrised variable…

 > data Tree a = Leaf a | Branch (Tree a) (Tree a)

I'm trying to practise my syntax by reading the definitions of the
functions (mapTree, fringe) etc., and then writing
them out before compiling and checking in the book. As it happens I've
previewed them before, but at least this helps get the syntax into my
head!

I'm still finding the error messages hard to read:

 7.hs:13:27:    Couldn't match expected type `a' (a rigid variable)           against inferred type `[a]'      `a' is bound by the type signature for `fringe' at 7.hs:11:19    In the expression: fringe x    In the first argument of `(++)', namely `[fringe x]'    In the expression: [fringe x] ++ [fringe y]

This is because the expression at the bottom should have been

    fringe x ++ fringe y

as fringe already returns a list.

Curried datatypes

I thought I'd test the functions and tried it out

 *Tree> fringe t <interactive>:1:7:    Couldn't match expected type `Tree a'           against inferred type `Tree Integer -> Tree Integer'    In the first argument of `fringe', namely `t'    In the expression: fringe t    In the definition of `it': it = fringe t

Which I thought was odd. Where was the “Tree Integer -> Tree
Integer”
coming from?

 *Tree> :t t t :: Tree Integer -> Tree Integer

I tried constraining the definition of my tree, t:

 > t :: Tree Integer > t = Branch (Branch (Leaf 2) (Leaf 3)) 

Which of course gave a similar error, and I eventually realised that I'd
missed off the second branch…

 > -- much better > t = Branch (Branch (Leaf 2) (Leaf 3)) (Leaf 4)

What can I say, it's been a long day at work… Anyway, with a
structure like this I can play with the functions

 *Tree> fringe $ mapTree (*2) t [4,6,8]

Ex 7.1 Look at the definitions of fringe and treeSize.

Are there commonalities that can be abstracted out, like they were
when we developed fold?

 > fringe :: Tree a -> [a] > fringe (Leaf x)     = [x] > fringe (Branch x y) = fringe x ++ fringe y
 > treeSize :: Tree a -> Integer > treeSize (Leaf _) = 1 > treeSize (Branch x y) = treeSize x + treeSize y

I looked at this stupidly for some time thinking “But the bit that
handles the leaf isn't the same function as the one that sums together
the branches!” and then eventually realised that modern functional
programming languages can probably handle more than one functional
argument…

I came up with

 > -- a new Higher Order Function for trees?? > treeF :: (a->b) -> (b->b->b) -> Tree a -> b > treeF f o (Leaf x) = f x > treeF f o (Branch x y) = (treeF f o x) `o` (treeF f o y)

Which would then be used with functions like

 > fringeF :: Tree a -> [a] > fringeF = treeF list (++) >     where list x = [x]

 > treeSizeF :: Tree a -> Integer > treeSizeF = treeF one (+) >     where one _ = 1

which, incredibly, works! The formulation for treeHeight is a bit
uglier:

 > treeHeightF :: Tree a -> Integer > treeHeightF = treeF zero maxPlusOne >     where zero _         = 0 >           maxPlusOne x y = 1 + max x y

{

Parenthesis on const: I don't like having to create little lexical
stubs to give the literal 0 and 1 above. In Perl you would just do sub {1}
for example, which made be think of a lambda: (\_ -> 1) which is
hideous. Now if only there was a function that threw away one of its
arguments. I got as far as defining it

  *Tree> let _ & a = a  *Tree> map (&1) [2,3,4]  [1,1,1]

when I remembered that I'd seen it in the prelude. It's a bit less
compact than my operator version, but it is called “const”,
which makes sense. So now we can write

 > treeHeightF :: Tree a -> Integer > treeHeightF = treeF (const 0) maxPlusOne >     where maxPlusOne x y = 1 + max x y

}

And, the definition of mapTree was written more intuitively than
rationally – I can barely believe that it works:

 > mapTreeF :: (a->b) -> Tree a -> Tree b  > mapTreeF f = treeF mapLeaf Branch >     where mapLeaf x = Leaf (f x)

See that?! The two things that absolutely blew me away were

  1. The function, f is folded into the Branch calculation -
    e.g. it didn't just do Branch (mapTree x) (mapTree y), but
    (mapTree f x) etc. I didn't understand this, but it
    felt right, so I tried to compile it…
  2. You can use the Branch constructor as a function! Er, ok,
    that might seem old hat if you know algebraic datatypes, but for some
    reason I was pleased with that.

Anyway, the new functions work:

 *Tree> fringeF $ mapTreeF (*2) t [4,6,8]

Exercise 7.2 takeTree and takeTreeWhile

Using the definition of InternalTree

 > data InternalTree a = ILeaf >                     | IBranch a (InternalTree a) (InternalTree a)

I came upon something like the following:

 > takeTree :: Int -> InternalTree a -> InternalTree a > takeTree n (IBranch w x y)  >     | n > 0 = IBranch w (takeTree (n-1) x) (takeTree (n-1) y) > takeTree _ _ = ILeaf

Well, er. That compiles! Which you have to admit is a start. But is
it correct? Hudak's been a little unhelpful here by not actually
providing any debugging tools or setting those as exercises. I was
about to start creating a function dumpTree until I remembered
something about Show

 > data InternalTree a = ILeaf >                     | IBranch a (InternalTree a) (InternalTree a) >                     deriving Show >  > -- here's an example InternalTree to play with. > itree = IBranch 1 (IBranch 2 ILeaf (IBranch 3 ILeaf ILeaf)) ILeaf

Does that work? If it does, then we'll be able to enter it on the ghci
interactive prompt and have a nice dumped representation:

    *Tree> itree    IBranch 1 (IBranch 2 ILeaf (IBranch 3 ILeaf ILeaf)) ILeaf

Yay! So let's try the takeTree function

  *Tree> takeTree 0 itree  ILeaf  *Tree> takeTree 1 itree  IBranch 1 ILeaf ILeaf     *Tree> takeTree 2 itree  IBranch 1 (IBranch 2 ILeaf ILeaf) ILeaf  *Tree> takeTree 3 itree  IBranch 1 (IBranch 2 ILeaf (IBranch 3 ILeaf ILeaf)) ILeaf  *Tree> takeTree 4 itree  IBranch 1 (IBranch 2 ILeaf (IBranch 3 ILeaf ILeaf)) ILeaf

Cheered by this success, I wrote the takeTreeWhile function quite easily

  > takeTreeWhile :: (a->Bool) -> InternalTree a -> InternalTree a  > takeTreeWhile p (IBranch w x y)  >     | p w = IBranch w (takeTreeWhile p x) (takeTreeWhile p y)  > takeTreeWhile _ _ = ILeaf

And again, that seems to give sensible results:

  *Tree> takeTreeWhile (<2) itree  IBranch 1 ILeaf ILeaf  *Tree> takeTreeWhile (<3) itree  IBranch 1 (IBranch 2 ILeaf ILeaf) ILeaf  *Tree> takeTreeWhile (<4) itree  IBranch 1 (IBranch 2 ILeaf (IBranch 3 ILeaf ILeaf)) ILeaf

Exercise 7.3 define foldr and repeat for InternalTree

This is a slightly tricksy question – what does foldr
mean for a Tree? Do its subfunctions take a Tree input, or does it work
on the values contained therein? If the latter (which I think is more
probable), then for the Tree datatype above, the implementation could be
trivially expressed as

  > foldTree :: (a->b->b) -> b -> Tree a -> b  > foldTree f init t = foldr f init (fringe t)
  *Tree> fringe t  [2,3,4]  *Tree> foldTree (+) 0 t  9

Of course, the InternalTree doesn't have values stored at the fringe.
We could define a similar function first

> iTreeVals :: InternalTree a -> [a]
> iTreeVals (ILeaf) = []
> iTreeVals (IBranch w x y) = w : iTreeVals x ++ iTreeVals y

  > foldITree :: (a->b->b) -> b -> InternalTree a -> b  > foldITree f init t = foldr f init (iTreeVals t)
  *Tree> iTreeVals itree  [1,2,3]  *Tree> foldITree (+) 0 itree  6

This of course means iterating the whole tree to get the values before
then iterating the list to fold over it, which might be inefficient. So
we could instead try to fold as we iterate the tree:

  > -- does this work?  > foldITree' :: (a->b->b) -> b -> InternalTree a -> b  > foldITree' op init (ILeaf) = init  > foldITree' op init (IBranch w x y)  >     = w `op` (foldITree' op init x) `op` (foldITree' op init y)

My main doubt was about the last line: how do we fold down 2 separate
paths? It turns out to be justified

 7.hs:76:10:    Couldn't match expected type `a' (a rigid variable)           against inferred type `b' (a rigid variable)      `a' is bound by the type signature for `foldITree'' at 7.hs:73:19      `b' is bound by the type signature for `foldITree'' at 7.hs:73:22    In the first argument of `op', namely        `w `op` (foldITree' op init x)'    In the expression:          (w `op` (foldITree' op init x)) `op` (foldITree' op init y)    In the definition of `foldITree'':        foldITree' op init (IBranch w x y)                     = (w `op` (foldITree' op init x)) `op` (foldITree' op init y)

I think that error message is relating to the `op` line, and thinking
about it, it op is (a->b) then w `op` x `op` y can't
work, because the intermediate result is of type b rather than a. (And
in fact, commenting the second `op` allows it to compile. I'm not
quite sure of how to proceed here. I could change the type signature of
op to (a->b->b->b) but that seems rather clumsy. I may come back to
this.

Now let's turn to repeat. The function is defined

  > repeat :: a -> [a]

if given an element 12, it will repeat it infinitely [12,12,12,12,12].
What does that mean for a tree? One thing I note from above is that we
use ILeaf as the base case just like we use [] in a list. Perhaps
repeat will simply substitute all ILeaves for the list again?

  > repeatITree :: InternalTree a -> InternalTree a  > repeatITree t = let t' = t  >                 in _repeatITree t'  >     where _repeatITree (ILeaf) = t  >           _repeatITree (IBranch w x y)   >                 = IBranch w (_repeatITree x) (_repeatITree y)

I think this is doing the right thing:

  -- formatted the show output manually  *Tree> takeTree 1 $ repeatITree itree                             IBranch 1                       ILeaf             ILeaf  *Tree> takeTree 2 $ repeatITree itree                             IBranch 1                 (IBranch 2                 (IBranch 1              ILeaf    ILeaf)             ILeaf   ILeaf)  *Tree> takeTree 3 $ repeatITree itree                             IBranch 1              (IBranch 2                    (IBranch 1        (IBranch 1   (IBranch 3      (IBranch 2    ILeaf)     ILeaf ILeaf)  ILeaf ILeaf))   ILeaf  ILeaf) 

Ex 7.3. Using any of the definitions of tree above, define
versions of zip and zipWith

Let's stick with InternalTree for the moment

  > zip      :: [a] -> [b] -> [(a,b)]  > zipITree :: InternalTree a -> InternalTree b -> InternalTree (a,b)
  > itree  = IBranch 1 (IBranch 2 ILeaf (IBranch 3 ILeaf ILeaf)) ILeaf  > itree' = IBranch 3 (IBranch 4 ILeaf (IBranch 5 ILeaf ILeaf)) ILeaf
  > zipITree :: InternalTree a -> InternalTree b -> InternalTree (a,b)  > zipITree (ILeaf) (ILeaf) = ILeaf  > zipITree (IBranch w x y) (IBranch w' x' y')  >     = IBranch (w,w') (zipITree x x') (zipITree y y')
  *Tree> zipITree itree itree'  IBranch (1,3) (IBranch (2,4) ILeaf (IBranch (3,5) ILeaf ILeaf)) ILeaf

Do the structures of the trees have to be identical? would it make
sense to zip a Leaf with a Branch? What would go in the tuple (a,b) in
that case? We could supply an “init” argument to use instead. (Or
rather, two init arguments, one for each type of tree). I think right
now it makes sense to stick with the same structured tree. Zipping a
different kind of tree will complain about non exhaustive search
patterns, but we could do one better and define something like

  > zipITree _ _ = error "zipITree can only zip trees with the same structure"

zipWithTree is straightforward to define at this stage

  > zipWithITree :: (a->b->c) -> InternalTree a -> InternalTree b -> InternalTree c  > zipWithITree _  (ILeaf) (ILeaf) = ILeaf  > zipWithITree fn (IBranch w x y) (IBranch w' x' y')   >     = IBranch (fn w w') (zipWithITree fn x x') (zipWithITree fn y y')

And you have no idea how smug^whappy I feel about being able to call a
recursive function with pattern matching and polymorphic types
“straightforward” ;-)

  *Tree> zipWithITree (+) itree itree'  IBranch 4 (IBranch 6 ILeaf (IBranch 8 ILeaf ILeaf)) ILeaf

and we can defined zipITree in terms of zipWithITree as

  > zipITree' :: InternalTree a -> InternalTree b -> InternalTree (a,b)  > zipITree'  = zipWithITree pair  >     where pair x y = (x,y)

Ex 7.5. Enhance the Expr datatype with variables and Let
expressions

When I skimmed through this chapter a couple of weeks ago I had the idea
that this exercise would be trivial in Haskell and, as my laptop was out
of action, I wrote it out longhand with pen and paper. I was so chuffed
by this that I had to dial up my web host from my phone and type it into
a text editor and compile. Modulo a couple of syntax errors, this
worked first time. Apart from a small tweak (my original Let
expressions only took constant values rather than the result of an
arbitrary Expr) this is what I wrote:

  > -- the following is copied from SOE, with addition of V and Let lines  > data Expr = C Value  >           | Expr :+ Expr  >           | Expr :- Expr  >           | Expr :* Expr  >           | Expr :/ Expr  >           | V String  >           | Let String Expr Expr  >  > type Value   = Float  >  > -- we need a table of String -> Value, which I chose  > -- to define as a list of tuples:  > type Vars = [(String, Value)]  >  > -- the following is taken from the book, with very few  > -- changes:  mainly the propagation of the Vars table:  > evaluate :: Vars -> Expr -> Value  > evaluate _  (C x) = x  > evaluate vs (e1 :+ e2) = evaluate vs e1 + evaluate vs e2  > evaluate vs (e1 :- e2) = evaluate vs e1 - evaluate vs e2  > evaluate vs (e1 :* e2) = evaluate vs e1 * evaluate vs e2  > evaluate vs (e1 :/ e2) = evaluate vs e1 / evaluate vs e2  > -- just the following deal with setting and getting variables from  > -- the table  > evaluate vs (V v)        = findVar vs v  > evaluate vs (Let k e' e) = (evaluate (setVar vs k (evaluate vs e')) e)  >  > -- and we need a couple of noddy functions to get and "set" (i.e.,  > -- return a new version of the table but with the variable added  >  > findVar :: Vars -> String -> Value  > findVar ((k,v):vs) s = if s == k then v  >                        else findVar vs s  > findVar _ s = error ("Couldn't find variable " ++ s);  >  > setVar :: Vars -> String -> Value -> Vars  > setVar vs k v = (k,v) : vs

And that's it! When people say, as they often do, that Haskell is a
great language to write programming languages in, I'd never understood
them, but I think I'm starting to get it. As well as supporting the
“data as code” idea like Lisp, the pattern matching and algebraic types
make the above really compact. In other languages you might have to
mess about with a whole lot of tree-walking and book-keeping code.

(My delight at the ease and simplicity of this question are tempered
slightly by the pain and confusion I subsequently faced on several
of the questions in earlier chapters though…)