<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Just another lambdabananacamel, &#187; haskell</title>
	<atom:link href="http://greenokapi.net/blog/tag/haskell/feed/" rel="self" type="application/rss+xml" />
	<link>http://greenokapi.net/blog</link>
	<description>Perl, Haskell, stuff</description>
	<lastBuildDate>Wed, 02 Feb 2011 14:32:11 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.1</generator>
		<item>
		<title>Coin Tricks</title>
		<link>http://greenokapi.net/blog/2009/09/19/coin-tricks/</link>
		<comments>http://greenokapi.net/blog/2009/09/19/coin-tricks/#comments</comments>
		<pubDate>Sat, 19 Sep 2009 17:55:29 +0000</pubDate>
		<dc:creator>osfameron</dc:creator>
				<category><![CDATA[haskell]]></category>
		<category><![CDATA[derrenbrown]]></category>
		<category><![CDATA[random]]></category>

		<guid isPermaLink="false">http://greenokapi.net/blog/?p=170</guid>
		<description><![CDATA[Derren Brown recently claimed he would predict the UK lottery numbers, live on television, and then explain how he did it. It&#8217;s doubtful that he did either: the alternative explanation that he&#8217;d actually committed a massive and improbably daring fraud is vastly more likely than the bullshit he spun out, as it&#8217;s actually possible&#8230; [...]]]></description>
			<content:encoded><![CDATA[<p>Derren Brown recently claimed he would predict the UK lottery numbers, live on television, and then explain how he did it.  It&#8217;s doubtful that he did either: the alternative explanation that he&#8217;d actually committed a massive and improbably daring fraud is vastly more likely than the bullshit he spun out, as it&#8217;s actually possible&#8230;</p>
<p>I&#8217;m certain (or at the very least hopeful) that the actual &#8220;reveal&#8221; will come later on in the series&#8230; but in the meantime, he did show one very cute mathematical trick for winning a coin game.</p>
<p>This is the game:</p>
<ul>
<li> Player 1 chooses a 3-coin combination, say Tails,Tails,Heads</li>
<li> Player 2 chooses another, for example Heads,Tails,Tails</li>
<li> We now throw a series of coins until one of the players&#8217; combinations is thrown in order.</li>
</ul>
<p>You might think, given that any 3-coin combination is as likely as another (will be thrown with a 1/8 probability) that there&#8217;s nothing to choose between them.  But notice that I didn&#8217;t say you threw 3 coins at a time!  For example, if we throw:</p>
<ol>
<li> Tails</li>
<li> Heads</li>
<li> Tails &#8230;  at this point we&#8217;ve thrown 3 coins, and matched no combination</li>
<li> Tails &#8230; and now player 2 has won: coins 2-4 read Heads,Tails,Tails</li>
</ol>
<p>If you look at these 2 combinations, you&#8217;ll see that Player 1 will win only if the sequence goes: Tails, Tails, (any number of further Tails), Heads.  In other words, Player 1 needs the first two throws to both be Tails, which happens 1 time in 4.  Player 2 will win in any other situation, i.e. 75% of the time.<br />
The combination the hapless participant chose (Heads,Heads,Heads) is even worse, losing 87.5% of the time!</p>
<p>It&#8217;s about time for me to try modelling a problem in Haskell, so let&#8217;s try this one!<br />
I find the <a href="http://www.haskell.org/onlinereport/random.html">docs for <tt>System.Random</tt></a> rather confusing, but found some inspiration from a<br />
<a href="http://www.haskell.org/pipermail/haskell-cafe/2005-April/009687.html">haskell-cafe post about random coin throws</a> and encouragement on #haskell from Luke30, Twey, and others.</p>
<p>I won&#8217;t go through the code in detail this time, but here are some key things to note:</p>
<ul>
<li> the <tt>randoms</tt> function returns an infinite stream of random things.  In this case I&#8217;m using it as a stream of Bools, like <tt>[False, True, True, False, True, ...]</tt> and then converting them into <tt>[Tail,Head, ...]</tt>.
<p>lilac pointed out that I could create an instance of Random for <tt>Coin</tt>s; I&#8217;ll have a look at that soon.</p></li>
<li>I&#8217;m using <tt>tails</tt> (nicely overloaded vocabulary ;-) to iterate the infinite sequence of coin flips, stopping when one of the players&#8217; sequences matches.</li>
</ul>
<p>And here&#8217;s the code:</p>
<pre>import System.Random
import Control.Monad
import System.IO
import Data.Maybe
import Data.List

data Coin = Head | Tail
    deriving (Eq, Ord, Show)

-- Derren Brown's coin game.
-- The second player has chosen a combination that will win
-- significantly more often

main = do
    g1 &lt;- guess
    let g2 = counter g1
    putStrLn $ "Player 1 chooses " ++ (show g1)
    putStrLn $ "Player 2 chooses " ++ (show g2)
    coins &lt;- coinFlips
    let winner = head .  catMaybes .
                 map (win g1 g2) $
                    tails coins
    putStrLn $ "Player " ++ (show winner) ++ " wins!"

guess = do f &lt;- coinFlips
           return $ take 3 f

counter [a,b,_] = [rev b, a, b]
    where rev Head = Tail
          rev Tail = Head

win g1 g2 l | g1 `isPrefixOf` l = Just 1
            | g2 `isPrefixOf` l = Just 2
            | otherwise         = Nothing

-- modified from
-- http://www.haskell.org/pipermail/haskell-cafe/2005-April/009687.html
coinFlips :: IO [Coin]
coinFlips = do g &lt;- newStdGen
               let bools = randoms g
               let coins = map bool2coin bools
               return coins
      where bool2coin True  = Head
            bool2coin False = Tail</pre>
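<p>As a rough check on the 75% figure, here is a minimal sketch that plays the game many times over one stream of flips.  The <tt>play</tt> helper, the fixed seed and the number of games are my own assumptions for illustration (they aren&#8217;t part of the program above), and the result is only an estimate that should land somewhere near three quarters.</p>
<pre>import Data.List (isPrefixOf, unfoldr)
import System.Random (mkStdGen, randoms)

data Coin = Head | Tail
    deriving (Eq, Show)

-- Play one game on an infinite stream of flips: return the winner
-- (1 or 2) and the flips left over after the winning combination.
play :: [Coin] -> [Coin] -> [Coin] -> (Int, [Coin])
play g1 g2 flips
    | g1 `isPrefixOf` flips = (1, drop (length g1) flips)
    | g2 `isPrefixOf` flips = (2, drop (length g2) flips)
    | otherwise             = play g1 g2 (drop 1 flips)

main :: IO ()
main = do
    let toCoin b = if b then Head else Tail
        -- arbitrary fixed seed, so that runs are repeatable
        flips    = map toCoin (randoms (mkStdGen 2009))
        games    = 10000 :: Int
        p1       = [Tail, Tail, Head]
        p2       = [Head, Tail, Tail]
        -- each game starts where the previous one stopped
        results  = take games (unfoldr (Just . play p1 p2) flips)
        wins2    = length (filter (== 2) results)
    putStrLn $ "Player 2 won " ++ show wins2
            ++ " of " ++ show games ++ " games"</pre>
<p>Because the flips are independent, carrying on with the leftover stream after each win should be as good as starting from a fresh generator every time.</p>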
]]></content:encoded>
			<wfw:commentRss>http://greenokapi.net/blog/2009/09/19/coin-tricks/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Is currying monadic?</title>
		<link>http://greenokapi.net/blog/2009/05/07/is-currying-monadic/</link>
		<comments>http://greenokapi.net/blog/2009/05/07/is-currying-monadic/#comments</comments>
		<pubDate>Thu, 07 May 2009 12:24:28 +0000</pubDate>
		<dc:creator>osfameron</dc:creator>
				<category><![CDATA[haskell]]></category>
		<category><![CDATA[currying]]></category>
		<category><![CDATA[fp]]></category>
		<category><![CDATA[monads]]></category>
		<category><![CDATA[perl]]></category>

		<guid isPermaLink="false">http://greenokapi.net/blog/?p=163</guid>
		<description><![CDATA[Here&#8217;s a question that came up while I&#8217;ve been trying to implement currying in Perl: is Currying monadic? I&#8217;ve tried a couple of times, but not managed to explain what I mean very well on #haskell, so let&#8217;s see if a longer writeup explains it better. My simplistic understanding of monads is that they take [...]]]></description>
			<content:encoded><![CDATA[<p>
Here&#8217;s a question that came up while I&#8217;ve been trying to<br />
<a<br />
href="http://greenokapi.net/blog/2009/05/04/currying-in-perl/">implement<br />
currying in Perl</a>: is Currying monadic?  </p>
<p>I&#8217;ve tried a couple of times, but not managed to explain what I mean very well on #haskell, so let&#8217;s<br />
see if a longer writeup explains it better.</p>
<p>
My simplistic understanding of monads is that they take various things that<br />
are nests of nested expressions, and allow you to reason about them, and<br />
give them a pretty syntax that makes it look like they are in fact a<br />
<i>sequence</i> of commands.</p>
<p>
For example: (pseudocode)</p>
<p><code>
<pre>
    let a = 1
    let b = 2
    output a+b
</pre>
<p></code></p>
<p>Looks like a sequence of commands, but you couldn&#8217;t just separate each<br />
line and string the commands together.  Rather you have to consider it<br />
as a nest:</p>
<p><code>
<pre>
    (let a = 1
        (let b = 2
            (output a+b )))
</pre>
<p></code></p>
<p>And in many cases, we can go from a nested structure to a monad, for example, we could simplify these horribly nested <tt>if</tt>s:</p>
<p><code>
<pre>
    (if file exists
        (if file is readable
            (if reading the file gave a string
                (if the string matches a username
                    (do some action with the username)))))
</pre>
<p></code></p>
<p>&#8230; into a nice Maybe monad:</p>
<p><code>
<pre>
    do check file exists
       check file is readable
       string <- read from file
       check string matches username
       do something with the username
</pre>
<p></code></p>
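<p>For comparison, here is a small runnable version of that pseudocode in real Haskell.  It&#8217;s only a sketch: the &#8220;file system&#8221; is a <tt>Data.Map</tt>, and the names <tt>files</tt>, <tt>users</tt> and <tt>greet</tt> are made up for the example.</p>
<pre>import qualified Data.Map as M
import Control.Monad (guard)

-- Hypothetical stand-ins for the file system and the user list
files :: M.Map FilePath String
files = M.fromList [("/etc/user", "osfameron")]

users :: [String]
users = ["osfameron", "lilac"]

greet :: FilePath -> Maybe String
greet path = do
    name &lt;- M.lookup path files      -- Nothing if the "file" is missing
    guard (name `elem` users)        -- Nothing if it isn't a known username
    return ("Hello, " ++ name)       -- do some action with the username

main :: IO ()
main = do
    print (greet "/etc/user")        -- Just "Hello, osfameron"
    print (greet "/no/such/file")    -- Nothing</pre>
<p>Each line can fail, and the first failure short-circuits the rest, which is what the nest of <tt>if</tt>s was doing by hand.</p>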
<p>Similarly, nested lists:</p>
<p><code>
<pre>
    (for each i in list 1
        (for each j in list 2
            (is i the same as j?
                (output i))))
</pre>
<p></code></p>
<p>can be abstracted as a List monad:</p>
<p><code>
<pre>
    do i <- list 1
       j <- list 2
       check i == j
       output i
</pre>
<p></code></p>
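<p>In actual Haskell the list version is almost a transliteration.  A minimal sketch (the function name <tt>common</tt> and the sample lists are just for illustration):</p>
<pre>import Control.Monad (guard)

-- Every i from the first list that also appears in the second
common :: [Int] -> [Int] -> [Int]
common xs ys = do
    i &lt;- xs
    j &lt;- ys
    guard (i == j)   -- "check i == j": drop the pairs that don't match
    return i

main :: IO ()
main = print (common [1, 2, 3, 4] [3, 4, 5])   -- [3,4]</pre>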
<p>OK, so I'm not talking about wrapping and unwrapping values, the monad<br />
laws, or typing in general.  They are what gives this stuff its strong<br />
theoretical basis and makes it robust.  But the simple-minded "Nest -><br />
Sequence" idea works for me at least, and gives a good feel for<br />
where monads can come in useful.</p>
<h2> Currying </h2>
<p>So... when I was implementing currying, I noted that you could write the<br />
curried version of a sub such as this one (pseudocode again):</p>
<p><code>
<pre>
    function add3 (a, b, c) {
        return a+b+c
    }
</pre>
<p></code></p>
<p>as something like this:</p>
<p><code>
<pre>
    function add3 (a) {
        return function (b) {
            return function (c) {
                return a+b+c
            }
        }
    }
</pre>
<p></code></p>
<p>This is again a nested expression.  So I wondered if you could again "flatten" it with a monadic do block:</p>
<p><code>
<pre>
    let add3 = do
        a <- get first parameter
        b <- get second parameter
        c <- get third parameter
        return a+b+c
</pre>
<p></code></p>
<p>OK, so I "know" that functions in Haskell (which uses currying for<br />
functions as a general rule) are the "Reader monad".  But I don't<br />
understand it well enough to know if that means you can use Reader to<br />
implement the above...</p>
<p>
(I don't understand Reader at all in fact.  I must bang my head against<br />
it again, but I find it very confusing - how the monad is represented,<br />
what the functions are, and how they get magically applied.)</p>
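<p>For what it&#8217;s worth, here is a small experiment with the plain function monad (the <tt>Monad</tt> instance for <tt>(->) r</tt> that Reader is built on).  It&#8217;s a sketch rather than an answer, and the names <tt>sameArg</tt> and <tt>add3</tt> are invented for the example.  What it seems to show is that every <tt>&lt;-</tt> in the do block is handed the <i>same</i> argument, rather than each one consuming the next parameter, so the do block above wouldn&#8217;t work as written with this monad.</p>
<pre>-- Each bind sees the one shared argument r
sameArg :: Int -> Int
sameArg = do
    a &lt;- id          -- a is the argument itself
    b &lt;- (* 10)      -- b is the same argument, times ten
    return (a + b)

-- Ordinary curried function: one lambda per parameter
add3 :: Int -> Int -> Int -> Int
add3 = \a -> \b -> \c -> a + b + c

main :: IO ()
main = do
    print (sameArg 2)                  -- 2 + 20 = 22
    print (add3 1 2 3)                 -- 6
    print (map (add3 1 2) [10, 20])    -- [13,23]</pre>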
<p>
So, I did attempt to implement it using my Perl module<br />
<tt>Acme::Monads</tt>.  This <a<br />
href="http://github.com/osfameron/acme--monads/blob/master/t/04_curry.t">test</a><br />
shows that this sort of works:</p>
<p><code>
<pre>
    my $add = mdo (2) {
        mbind $x = Curry shift;
        mbind $y = Curry shift;
        return $x+$y;
        };

    say $add->(1)->(2); # 3, yay!
</pre>
<p></code></p>
<p>Note that the "return" isn't a monadic <tt>Unit</tt> but Perl's return,<br />
i.e. a plain value.  That makes this example strictly speaking not<br />
monadic: what I don't know is whether that means the whole idea is fatally flawed, or whether (as I believe) I was just too dumb to fix the errors I got when I tried running with munit...<br />
I suspect that it should be possible, with the<br />
whole expression then being wrapped by some function<br />
(<tt>runCurry</tt>?) which extracts the final result out of the monad.</p>
<p>
Did this explanation make any sense?  Please let me know, and any<br />
comments on whether it's possible/sane to do this (writing currying as a<br />
monad) are appreciated!</p>
]]></content:encoded>
			<wfw:commentRss>http://greenokapi.net/blog/2009/05/07/is-currying-monadic/feed/</wfw:commentRss>
		<slash:comments>8</slash:comments>
		</item>
		<item>
		<title>Currying in Perl</title>
		<link>http://greenokapi.net/blog/2009/05/04/currying-in-perl/</link>
		<comments>http://greenokapi.net/blog/2009/05/04/currying-in-perl/#comments</comments>
		<pubDate>Mon, 04 May 2009 00:03:45 +0000</pubDate>
		<dc:creator>osfameron</dc:creator>
				<category><![CDATA[haskell]]></category>
		<category><![CDATA[perl]]></category>
		<category><![CDATA[fp]]></category>

		<guid isPermaLink="false">http://greenokapi.net/blog/?p=162</guid>
		<description><![CDATA[&#8220;Currying&#8221; is a simple idea that is surprisingly powerful on the one hand, and which was surprisingly hard (at least for me) to get my head around - possibly because the concept didn&#8217;t exist natively in the languages I learnt first. When you declare a function in currying style, each argument is taken one at [...]]]></description>
			<content:encoded><![CDATA[<p>
&#8220;Currying&#8221; is a simple idea that is surprisingly powerful on the one hand,<br />
and which was surprisingly hard (at least for me) to get my head around<br />
- possibly because the concept didn&#8217;t exist natively in the languages I<br />
learnt first.</p>
<p>
When you declare a function in currying style, each argument is taken<br />
one at a time, returning a new, more specialised function each time,<br />
until the function is finally executed at the end.</p>
<p>
Consider this function:</p>
<p><code>
<pre>
    sub add ($left, $right) {
        return $left + $right;
    }
</pre>
<p></code></p>
<p>(Note we&#8217;re assuming that Perl has argument lists&#8230; which it can do<br />
with Devel::Declare, but we&#8217;ll come to that in a bit).  If this were<br />
written in currying style then we wouldn&#8217;t call it with</p>
<p><code>
<pre>
    my $answer = add (1, 3);  # 4
</pre>
<p></code></p>
<p>but instead</p>
<p><code>
<pre>
    my $answer = add(1) #  a function with $left bound to 1
                  ->(3) #  now executed with $right bound to 3
</pre>
<p></code></p>
<h2>Implementation</h2>
<p>This is actually quite simple to implement in pure vanilla Perl.</p>
<p><code>
<pre>
    sub add {
        my $left = shift;
        return sub {
            my $right = shift;
            return $left + $right;
        };
    }
</pre>
<p></code></p>
<p>That isn&#8217;t very pretty or convenient though&#8230; handily there are several<br />
modules to encapsulate this behaviour on CPAN.  One of them is my<br />
<tt><a href="http://search.cpan.org/~osfameron/Sub-Curried/lib/Sub/Curried.pm"><br />
Sub::Curried</a></tt>.  The docs mention some of the other modules with<br />
similar functionality: mine, which uses the shiny goodness of<br />
<tt><a href="http://search.cpan.org/~flora/Devel-Declare-0.005000/lib/Devel/Declare.pm#NAME"><br />
Devel::Declare</a></tt>, has the advantage that you can declare a<br />
curried subroutine with a simple, perlish syntax: in fact you do it more<br />
or less exactly like the example I gave above.  (The difference is that<br />
we can&#8217;t override the <tt>sub</tt> keyword, so we create a new one,<br />
&#8216;<tt>curry</tt>&#8217;):</p>
<p><code>
<pre>
    curry add ($left, $right) {
        return $left + $right;
    }
</pre>
<p></code></p>
<p>Using D::D, the moment the Perl parser sees a symbol &#8216;curry&#8217; being<br />
compiled, it hands control to our custom parser, which then injects code<br />
into the source while it&#8217;s still being compiled.  We can do some cunning<br />
stuff, including telling it &#8220;Hey, when you get to the end of the scope<br />
you&#8217;re compiling, inject some more text!&#8221;  </p>
<p>
I used to keep hold of an array <tt>@filled</tt> of the partially<br />
applied arguments and then apply all in one go at the end.  But it seems<br />
to be more elegant to actually transform into something like the<br />
&#8220;vanilla&#8221; example I gave above.  I say &#8220;something like&#8221; but it&#8217;s<br />
actually a little more complicated:</p>
<p><code>
<pre>
    sub add {
        return ROUTINE unless @_; # a reference to this subroutine
        check_args(2, @_);        # check we weren't called with >2 args
        my $f = sub {
            my $left = shift;
            return ROUTINE ...
            check_args...         # check we weren't called with >1 arg
            my $f = sub {
                my $right = shift;
                return $left + $right; # actually do the thing
                };
            $f = ...
            };
        # now call the subroutine for each
        $f = $f->(shift) for @_;
        return $f;
    }
</pre>
<p></code></p>
<p>Yikes!  The extra boilerplate is there to make sure that we get some<br />
niceties:</p>
<ul>
<li> Die with an informative error message if we&#8217;re called with too many<br />
arguments.</li>
<li> Handle being called with multiple arguments: i.e. treat add(1,2)<br />
the same as add(1)->(2)</li>
<li> When called with zero arguments, the sub returned is logically (and<br />
cutely!) an <i>alias</i>.  (I think the ROUTINE trick isn&#8217;t needed, will<br />
probably disappear with a restructure)</li>
<li> Handle functions that return multiple values smoothly (not shown<br />
above)</li>
</ul>
<h2>Uses of currying</h2>
<p>OK, so we&#8217;ve done a lot of furious work behind the scenes to make<br />
something look very simple while doing a whole lot of extra work.  But<br />
why?  When you program in Haskell, you&#8217;ll find currying useful at every<br />
turn: my hunch is that it&#8217;d be useful in Perl too.  There are certainly<br />
some cute examples of it in common use:</p>
<h3>Currying the invocant</h3>
<p>Modules that do setup on a class would often use class methods, for<br />
example:</p>
<p><code>
<pre>
    package My::Class;
    use base 'Class::Accessor';
    __PACKAGE__->add_accessor('foo');
    __PACKAGE__->add_accessor('bar');
    __PACKAGE__->add_accessor('baz');
</pre>
<p></code></p>
<p><tt>__PACKAGE__</tt> refers to the current package, so this is the same as<br />
writing:</p>
<p><code>
<pre>
    My::Class->add_accessor('foo');
    ...
</pre>
<p></code></p>
<p>It&#8217;s also utterly hideous.  Moose on the other hand provides a syntax<br />
like this:</p>
<p><code>
<pre>
    package My::Class;
    use Moose;
    has 'foo' => ...
    has 'bar' => ...
    has 'baz' => ...
</pre>
<p></code></p>
<p>What&#8217;s going on?  Incredibly, &#8216;has&#8217; is actually a class method just like<br />
&#8216;add_accessor&#8217; was!  But the leftmost argument (the invocant, usually<br />
referred to as <tt>$self</tt> for object methods, or <tt>$class</tt> for<br />
class methods) has been curried into it.  This is because Perl&#8217;s<br />
importing is dynamic.  Instead of just copying the method:</p>
<p><code>
<pre>
    *{CALLER::has} = \&has;
</pre>
<p></code></p>
<p>It can do something like this:</p>
<p><code>
<pre>
    *{CALLER::has} = has($CALLER); # assuming a currying 'has'
</pre>
<p></code></p>
<h3>Sections</h3>
<p>In Haskell you can take references not just to functions, but to<br />
operators.</p>
<p><code>
<pre>
    add = (+)  -- alias
    add 1 2    -- result is 3
    (+) 1 2    -- also 3
</pre>
<p></code></p>
<p>You can also take &#8216;sections&#8217; of these operators, by &#8216;partially applying&#8217;<br />
either the left or the right hand side.</p>
<p><code>
<pre>
    add2       = (+ 2)
    halve      = (/ 2)
    reciprocal = (1.0 /)
</pre>
<p></code></p>
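<p>As a quick runnable check of those sections, here is a minimal sketch with type signatures added.  The name <tt>halve'</tt> is mine: it spells out the right-hand section using <tt>flip</tt>, which is the trick the Perl version below has to fall back on.</p>
<pre>add2 :: Int -> Int
add2 = (+ 2)

halve, reciprocal, halve' :: Double -> Double
halve      = (/ 2)          -- right section: x / 2
reciprocal = (1.0 /)        -- left section:  1.0 / x
halve'     = flip (/) 2     -- the same as halve, without section syntax

main :: IO ()
main = do
    print (add2 5)          -- 7
    print (halve 9)         -- 4.5
    print (reciprocal 4)    -- 0.25
    print (halve' 9)        -- 4.5</pre>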
<p>This isn&#8217;t the same as currying, though you could implement sections<br />
with curried functions:</p>
<p><code>
<pre>
    curry divide ($left, $right) {
        $left / $right
    }

    my $reciprocal = divide(1);         # 1    / $ARG
    my $halve      = flip(divide)->(2); # $ARG / 2
</pre>
<p></code></p>
<p>That&#8217;s rather ugly though, and remembering to <tt>flip</tt> the<br />
arguments is annoying.  So, again with <tt>Devel::Declare</tt> I<br />
implemented <tt><br />
<a href="http://github.com/osfameron/misc-opensource/tree/master/scratch/perl/sub-section/">Sub::Section</a><br />
</tt> (not yet on CPAN, the github repo is linked for now).</p>
<p><code>
<pre>
    my $add2         = op(+ 2);
    my $halve        = op(/ 2);
    my $contains_foo = op(=~ 'foo');
</pre>
<p></code></p>
<p>And of course:</p>
<p><code>
<pre>
    my $greet        = op("Hello " .);
    say $greet->('World');
</pre>
<p></code></p>
<h2>Talk</h2>
<p>I&#8217;ll be talking about Functional Perl at the<br />
<a<br />
href="http://northwestengland.pm.org/meetings/004.html">NorthWestEngland<br />
perlmongers tomorrow Tues 5th</a> in Manchester, UK.</p>
]]></content:encoded>
			<wfw:commentRss>http://greenokapi.net/blog/2009/05/04/currying-in-perl/feed/</wfw:commentRss>
		<slash:comments>9</slash:comments>
		</item>
		<item>
		<title>(rough) Grids in Haskell</title>
		<link>http://greenokapi.net/blog/2009/03/10/rough-grids-in-haskell/</link>
		<comments>http://greenokapi.net/blog/2009/03/10/rough-grids-in-haskell/#comments</comments>
		<pubDate>Tue, 10 Mar 2009 08:56:01 +0000</pubDate>
		<dc:creator>osfameron</dc:creator>
				<category><![CDATA[haskell]]></category>
		<category><![CDATA[fp]]></category>
		<category><![CDATA[game]]></category>
		<category><![CDATA[grid]]></category>

		<guid isPermaLink="false">http://greenokapi.net/blog/?p=160</guid>
		<description><![CDATA[(This isn&#8217;t a full blog post, but a note of a few things about implementing game grids in Haskell). A [[Cell]] structure seems to make sense for a lot of boards. In fact, even the problems I&#8217;m looking at might be approached simply indexing into row then col each time you want to access a [...]]]></description>
			<content:encoded><![CDATA[<p>(This isn&#8217;t a full blog post, but a note of a few things about implementing game grids in Haskell).</p>
<ul>
<li>A [[Cell]] structure seems to make sense for a lot of boards.  In fact, even the problems I&#8217;m looking at might be approached by simply indexing into row then col each time you want to access a cell.  However it feels inelegant, and some of the things I want to do (looking at lines of game pieces in a given direction N/S/E/W) would be more elegant if I could simply traverse the grid in an arbitrary direction.</li>
<li>morrow remembered there had been a <a href="http://www.haskell.org/pipermail/haskell-cafe/2008-December/052615.html">discussion on haskell-cafe about grids</a> (<a href="http://www.haskell.org/pipermail/haskell-cafe/2009-January/052756.html">+</a> <a href="http://www.haskell.org/pipermail/haskell-cafe/2009-January/052891.html">+</a>) recently.  It seems to be mainly about infinite grids, but has lots of interesting stuff to absorb.</li>
<li>paolino remembered that comonads were useful in grid representation.  I don&#8217;t understand what they  <em>are</em> but google finds  <a href="http://blog.sigfpe.com/2006/12/evaluating-cellular-automata-is.html">http://blog.sigfpe.com/2006/12/evaluating-cellular-automata-is.html</a> which looks interesting.</li>
<li>I think I want a zipper, but didn&#8217;t know how it would work on [[Cell]].  Saizan clarified: &#8220;if Zipper [] a is the zipper for [a], then the one for [[a]] is Zipper [] (Zipper [] a), and depending on which layer you use the next/previous operations you move in one dimension or the other&#8221;.  This isn&#8217;t quite true though, as when you move up/down by rows, you have to somehow remember the columnwise index.</li>
<li>(Saizan referred to fmap&#8217;ing operations to keep the columns in sync, which is, I think, a higher-order way of doing the same thing :-)</li>
<li>dolio posted an <a href="http://hpaste.org:80/fastcgi/hpaste.fcgi/view?id=2234#a2235">ADT representation of a grid</a>.   I don&#8217;t understand that at all yet, but it is pretty&#8230; He also wrote a <a href="http://hpaste.org:80/fastcgi/hpaste.fcgi/view?id=2234#a2236">Traversable instance</a> (this would appear to be a generic way to expose functions to move around in your arbitrary datastructure)</li>
<li>I decided to try to fold a [[Cell]] into a Cell { value=x, down=d, right=r }.  This took me several hours last night, during which I cursed my feeble brain, and the fact that mutable references would have made the task trivial in other languages.  (However, mutable references would make the grid useless for many useful things, like keeping intermediate copies of grid state).  My no-doubt crappy code is <a href="http://github.com/osfameron/misc-opensource/blob/bf54cd635d44d506a3166ef7d482f1b152a860fc/scratch/haskell/grid.hs">on github</a> (I&#8217;ll come back to it in more detail later)</li>
<li>While I&#8217;m whining about haskell being &#8220;too hard&#8221; I should mention that I&#8217;ve got bogged down in representing grids (for games or spreadsheet) in Java, Perl, and Javascript too, so I think it may be more a case of my brain being feeble.  That or I&#8217;m just overcomplicating things&#8230;</li>
<li>(As I remember it, the OO representation of grids is simpler to get started with but then I get bogged down with issues of multiple inheritance etc.)</li>
</ul>
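<p>To make the zipper-of-zippers idea a bit more concrete, here is a minimal sketch of how I read Saizan&#8217;s suggestion.  All the names (<tt>Zipper</tt>, <tt>moveRows</tt>, <tt>fromLists</tt> and so on) are invented for this example and aren&#8217;t taken from the haskell-cafe threads or from dolio&#8217;s paste.  Moving up or down steps the outer zipper; moving east or west steps <i>every</i> row, which is how the column index gets remembered.</p>
<pre>-- A list zipper: items to the left (nearest first), the focus, items to the right
data Zipper a = Zipper [a] a [a]
    deriving Show

fromList :: [a] -> Maybe (Zipper a)
fromList []     = Nothing
fromList (x:xs) = Just (Zipper [] x xs)

next, prev :: Zipper a -> Maybe (Zipper a)
next (Zipper ls x (r:rs)) = Just (Zipper (x:ls) r rs)
next _                    = Nothing
prev (Zipper (l:ls) x rs) = Just (Zipper ls l (x:rs))
prev _                    = Nothing

focus :: Zipper a -> a
focus (Zipper _ x _) = x

-- A grid is a zipper of rows, each of which is a zipper of cells
type Grid a = Zipper (Zipper a)

-- Apply the same move to every row, so that the columns stay in sync
moveRows :: (Zipper a -> Maybe (Zipper a)) -> Grid a -> Maybe (Grid a)
moveRows f (Zipper above row below) = do
    above' &lt;- mapM f above
    row'   &lt;- f row
    below' &lt;- mapM f below
    return (Zipper above' row' below')

up, down, east, west :: Grid a -> Maybe (Grid a)
up   = prev
down = next
east = moveRows next
west = moveRows prev

cell :: Grid a -> a
cell = focus . focus

fromLists :: [[a]] -> Maybe (Grid a)
fromLists rows = mapM fromList rows >>= fromList

main :: IO ()
main = do
    let Just g = fromLists ["abc", "def", "ghi"]
    print (cell g)                                           -- 'a'
    print (fmap cell (down g >>= east))                      -- Just 'e'
    print (fmap cell (east g >>= east >>= down >>= down))    -- Just 'i'
    print (fmap cell (west g))                               -- Nothing</pre>
<p>The obvious cost is that a sideways move touches every row, so this is probably only reasonable for small boards.</p>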
<p>Comments and suggestions welcome, I should write a full post on this soon.</p>
]]></content:encoded>
			<wfw:commentRss>http://greenokapi.net/blog/2009/03/10/rough-grids-in-haskell/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
		</item>
		<item>
		<title>More longest paths, and sick folds.</title>
		<link>http://greenokapi.net/blog/2009/01/31/more-longest-paths-and-sick-folds/</link>
		<comments>http://greenokapi.net/blog/2009/01/31/more-longest-paths-and-sick-folds/#comments</comments>
		<pubDate>Fri, 30 Jan 2009 22:08:05 +0000</pubDate>
		<dc:creator>osfameron</dc:creator>
				<category><![CDATA[haskell]]></category>
		<category><![CDATA[perl]]></category>

		<guid isPermaLink="false">http://greenokapi.net/blog/?p=159</guid>
		<description><![CDATA[This week&#8217;s simple longest path exercise seems to have had more mileage in it than I expected. Thanks to everyone&#8217;s comments and suggestions, I&#8217;ve updated it a number of times with, among other things, an improved Haskell version that acts on path elements instead of just characters. But I had intended to do a version [...]]]></description>
			<content:encoded><![CDATA[<p>This week&#8217;s <a href="http://greenokapi.net/blog/2009/01/27/theres-the-nub-snippet-in-perl-and-haskell/">simple longest path exercise</a> seems to have had more mileage in it than I expected.  Thanks to everyone&#8217;s comments and suggestions, I&#8217;ve updated it a number of times with, among other things, an improved Haskell version that acts on path elements instead of just characters.</p>
<p>
But I had intended to do a version of this in Perl.  I mentioned last time that the typical approach in this language would be nested hashes&#8230; which elicited the comment:<br />
<blockquote>Ugh, one of the absolute worst things with perl: nonsensical handling of nested data structures. Every other modern language does this vastly better.</p></blockquote>
<p>Em&#8217;s fightin&#8217; words!  So let&#8217;s try it in Perl and see how it looks&#8230;</p>
<p>
First of all we need to create the multilevel hash.  We want to end up with something like this:</p>
<blockquote><pre>
        {
          '' => {
                  'a' => {
                           'a' => {}
                         },
                  'qux' => {
                             'wibble' => {}
                           },
                  'foo' => {
                             'bar' => {
                                        'baz' => {}
                                      }
                           },
                  'aa' => {}
                }
        };
</pre>
</blockquote>
<p>The usual way to do this is to maintain an idea of the node that we&#8217;re currently in, and follow it to the next item of the path, changing the <tt>$node</tt> variable to the next level down.  So, if we had a path <tt>/foo/bar/baz</tt>, we&#8217;d start off with an empty hash <tt>{}</tt>, create the next node <tt>{ foo=> {} }</tt> and start the process again with the new empty hash <tt>{}</tt> and the remaining path <tt>bar/baz</tt>.</p>
<p>
Remind you of anything?  I thought it looked a little like a fold with an accumulator variable:  the <tt>$node</tt> is the accumulator, which is initialised to {} and which gets set to each corresponding node in turn.  The path is the list that we fold over.  And it tickled me that we can write:</p>
<blockquote><pre>
use List::Util 'reduce';
sub mk_hash {
    my %hash;
    reduce { $a->{$b} ||= {} }
           \%hash,
           split '/'
               for @_;
    return \%hash;
}
</pre>
</blockquote>
<p>In polite company (well, in Perlish polite company at least), calling <tt>map</tt> in a void context is frowned upon, as it&#8217;s abusing a functional operator for its side effects.  Here we are abusing <tt>reduce</tt>, making it destructive on a tree, and discarding its (useless) return value (the final leaf $node element).  But it does work exactly as required (the <tt>for @_</tt> means that we run this sick fold on each path in turn) and returns a multilevel hash.</p>
<p>
I&#8217;m aware that the example above might confirm the original commenter&#8217;s dislike of Perl hackishness.  Personally I think it&#8217;s quite cute, mixing a beautiful functional idiom with pragmatic mutation.  This may be brought on by my dabbling with Perl and FP programming languages, and is probably incurable.  If you are either a) disgusted or b) confused by the Perl code above, you might like to try implementing the traditional algorithm (please let me know!)  We&#8217;ll have a look at the equivalent in Haskell shortly.</p>
<p>
There isn&#8217;t a leaves function built in for multilevel hashes (we do find some in the <tt>Tree::</tt> and <tt>Graph::</tt> namespaces, but let&#8217;s stick with the core datastructures for now), so let&#8217;s build one.  This version is actually very similar to an FP language definition I think.</p>
<blockquote><pre>
sub leaves {
    my ($node, $path) = @_;
    $path ||= [];  <i># the first call starts with an empty path</i>
    if (keys %$node) {
        <i># We still have to descend into all the leaves</i>
        map {
              my ($k,$v)=@$_;
              leaves($v, [@$path, $k] );
            }
        kv_list($node)  <i># parens needed: kv_list is defined further down</i>
    } else {
        <i># the base case - we are at a leaf, so return path!</i>
        join '/', @$path;
    }
}
</pre>
</blockquote>
<p>Sadly, though there&#8217;s an inbuilt iterator <tt>each</tt>, there isn&#8217;t a built-in function that returns the key/value pairs as a list, so we&#8217;ll have to define the function <tt>kv_list</tt> used above.  The easiest way would be:</p>
<blockquote><pre>
sub kv_list {
    my $hashref = shift;
    map { [$_ => $hashref->{$_}] } keys %$hashref;
}
</pre>
</blockquote>
<p>Though we could also create an &#8220;iterator_to_list&#8221; function that acts on <tt>each</tt>.</p>
<p>
And now, the Haskell version: let&#8217;s start with creating the tree.  Instead of a hash, we&#8217;ll use <tt>Data.Map</tt> which is similar, but has an implementation better suited to purely functional usage.  Of course, we can&#8217;t use the same technique as the Perl version: we don&#8217;t have destructive mutation, and if we did a fold, it would merely return the final leaf node of each path.  So we start with a simpler recursive definition.  I say &#8220;simpler&#8221;, but I needed a lot of help with this.  Luckily #haskell came to the rescue: rwbarton pointed out that I&#8217;d need a <tt>newtype</tt> to do a recursive map, Heffalump improved it with <tt>unNode</tt>, sjanssen and quicksilver helped with missing <tt>Node</tt> constructors, etc.  So eventually my naive version of the code looked like:</p>
<blockquote><pre>
import qualified Data.Map as M
import Text.Regex

newtype Node = Node { unNode :: M.Map String Node }
    deriving Show

dive :: Node -> [String] -> Node
dive n     [] = n
dive n (s:ss) = let v  = M.lookup s (unNode n) :: Maybe Node
                    n' = case v of
                            Nothing -> Node M.empty
                            Just v' -> v'
                            :: Node
                in Node $ M.insert s (dive n' ss) (unNode n)

splitpath = splitRegex $ mkRegex "/"

make_tree ps = foldl dive (Node M.empty) $ map splitpath ps
</pre>
</blockquote>
<p>This is quite clumsy, and could be improved by replacing the lookup/insertion with a single insertWith.  The <tt>Data.Map</tt> API is quite large and you need to play with it a bit to get your head around it!  And of course I got a better version, from sjanssen:</p>
<blockquote><pre>
import qualified Data.Map as M
import Text.Regex
import Data.Monoid

newtype Node = Node { unNode :: M.Map String Node }

instance Monoid Node where
    mempty = Node M.empty
    mappend (Node x) (Node y) = Node $ M.unionWith mappend x y

splitPath = splitRegex $ mkRegex "/"

nodeFromPath = foldr (\x n -> Node $ M.singleton x n) mempty . splitPath

nodeFromPaths :: [FilePath] -> Node
nodeFromPaths = mconcat  . map nodeFromPath
</pre>
</blockquote>
<p>There are a number of clever things here!  First of all, he returns the root node using a fold, which I was dubious about being able to do.  That&#8217;s because instead of diving from the root, he starts from the right (<tt>fold<b>r</b></tt>) and constructs the leaf node, then inserts that as the value of the node above, all the way up to the top.  Very cute.  Then note that he&#8217;s using the Data.Map API elegantly: notice how he uses <tt>M.singleton</tt> where I would have naively done a fromList of a single element, for example.  Also, instead of having to descend each tree, updating, he simply creates a set of single path trees, then merges them together at the end with <tt>M.unionWith</tt>.  Finally, it&#8217;s using <tt>Monoid</tt>s, which (as well as being a classic Doctor Who monster) is a fancy way of saying &#8220;they behave like appendable things&#8221; (more or less).  In fact you could write the snippet above without, but it does give us the convenient <tt>mempty</tt> and <tt>mconcat</tt> functions.</p>
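<p>To illustrate the claim that the snippet could be written without <tt>Monoid</tt>, here is a minimal sketch.  The names <tt>emptyNode</tt>, <tt>merge</tt>, <tt>nodeFromParts</tt> and <tt>nodeFromPartLists</tt> are hypothetical ones made up for this illustration (they are not sjanssen&#8217;s), and it assumes the paths have already been split into components, so that nothing beyond <tt>Data.Map</tt> is needed.  It may also be handy on recent GHCs, where a <tt>Monoid</tt> instance needs a matching <tt>Semigroup</tt> instance as well.</p>
<blockquote><pre>
import qualified Data.Map as M

newtype Node = Node { unNode :: M.Map String Node }
    deriving (Eq, Show)

-- hypothetical helper: the tree with no entries
emptyNode :: Node
emptyNode = Node M.empty

-- merge two trees, recursing wherever both contain the same key
merge :: Node -> Node -> Node
merge (Node x) (Node y) = Node (M.unionWith merge x y)

-- build a single-branch tree from path components, leaf first
nodeFromParts :: [String] -> Node
nodeFromParts = foldr (\part child -> Node (M.singleton part child)) emptyNode

-- one single-branch tree per path, all merged into one
nodeFromPartLists :: [[String]] -> Node
nodeFromPartLists = foldr merge emptyNode . map nodeFromParts
</pre>
</blockquote>
<p>Here <tt>merge</tt> plays the role of <tt>mappend</tt> and <tt>emptyNode</tt> the role of <tt>mempty</tt>, so all the <tt>Monoid</tt> instance really buys us is the standard names and <tt>mconcat</tt>.</p>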
<p>
So, which version do you prefer?  I mentioned to sjanssen that I thought the Haskell one had more &#8220;concepts&#8221;, but actually both versions use folds and some sort of mapping data structure.  One is destructive, the other pure.<br />
Of course, as he pointed out, not knowing Perl, my version looked incomprehensible.<br />
But if you didn&#8217;t know how to write the sick fold above in Perl, you could have easily written a simple recursive version.  Whereas the Haskell version does require some knowledge, like the use of recursive newtypes (which I found very confusing &mdash; and hard to compile/debug) and as I&#8217;ve mentioned, the Data.Map API is large, as befits its power.</p>
<p>
I&#8217;m too tired to attempt the leaves function in Haskell &mdash; please do comment if you give it a go!</p>
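<p>For anyone who wants a starting point, here is one possible sketch of <tt>leaves</tt>, offered as a guess at what it could look like rather than a definitive answer.  It assumes the same <tt>Node</tt> newtype as above; <tt>leafPaths</tt> is a hypothetical helper name.</p>
<blockquote><pre>
import qualified Data.Map as M
import Data.List (intercalate)

newtype Node = Node { unNode :: M.Map String Node }

-- every root-to-leaf path, as a list of path components
leafPaths :: Node -> [[String]]
leafPaths (Node m)
    | M.null m  = [[]]      -- the base case: a leaf has one (empty) path below it
    | otherwise = concatMap child (M.toList m)
  where child (k, sub) = map (k :) (leafPaths sub)

-- the same paths, joined back together with "/"
leaves :: Node -> [String]
leaves = map (intercalate "/") . leafPaths
</pre>
</blockquote>
<p>If the tree was built from paths whose first component is an empty string (as happens when a path with a leading slash is split on &#8220;/&#8221;), the rejoined paths get their leading slash back for free.</p>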
]]></content:encoded>
			<wfw:commentRss>http://greenokapi.net/blog/2009/01/31/more-longest-paths-and-sick-folds/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>There&#8217;s the nub (snippet in Perl and Haskell)</title>
		<link>http://greenokapi.net/blog/2009/01/27/theres-the-nub-snippet-in-perl-and-haskell/</link>
		<comments>http://greenokapi.net/blog/2009/01/27/theres-the-nub-snippet-in-perl-and-haskell/#comments</comments>
		<pubDate>Mon, 26 Jan 2009 22:44:27 +0000</pubDate>
		<dc:creator>osfameron</dc:creator>
				<category><![CDATA[haskell]]></category>
		<category><![CDATA[perl]]></category>

		<guid isPermaLink="false">http://greenokapi.net/blog/?p=158</guid>
		<description><![CDATA[Here&#8217;s a simple problem, with solutions in Perl and Haskell. @joel: suppose I have a list of strings (they happen to be paths) - how might I find only the longest instance of each path? @joel: that is give /foo /foo/bar /foo/bar/baz /qux I only want back /foo/bar/baz and /qux @joel: do I need to [...]]]></description>
			<content:encoded><![CDATA[<p>Here&#8217;s a simple problem, with solutions in Perl and Haskell.</p>
<blockquote>
<pre>@joel: suppose I have a list of strings (they happen to be paths) -
       how might I find only the longest instance of each path?
@joel: that is give /foo /foo/bar /foo/bar/baz /qux I only want back
       /foo/bar/baz and /qux
@joel: do I need to build a trie myself? my cpanfu is failing me
</pre>
</blockquote>
<p>As this was a Perl channel, Penfold suggested splitting on &#8220;/&#8221; into a multilevel hash, then<br />
outputting the leaf nodes.  Then pjcj suggested, performance permitting, to go through the<br />
list deleting each entry that is a substring of the previous entry: that is, treating the<br />
strings simply as strings of characters.</p>
<p>
This makes it sound like you have to compare each string with the strings shorter than it,<br />
but it turns out that you can get exactly the right behaviour on an asciibetically sorted<br />
list.  For example, a set of paths similar to the ones Joel quoted might get sorted as:</p>
<ul>
<li> /foo</li>
<li> /foo/bar</li>
<li> /foo/bar/baz</li>
<li> /qux</li>
<li> /qux/wibble</li>
</ul>
<p>You can see that this should reduce to</p>
<ul>
<li> /foo/bar/baz</li>
<li> /qux/wibble</li>
</ul>
<p>as the list collapses the previous element every time we realise that it is a substring<br />
prefix of the following element.</p>
<p>
We can do this in a single pass using a <tt>fold</tt>, which is a pattern that&#8217;s less common in<br />
Perl than other idioms from functional programming like <tt>map</tt> and <tt>grep</tt>, but<br />
which is implemented in <tt>List::Util</tt>&#8217;s <tt>reduce</tt>.  A typical example of reduce<br />
might be this:</p>
<blockquote>
<pre>  use List::Util qw(reduce);

  sub sum {
    reduce { $a+$b } @_;
  }
  print sum(1,2,3); <em># = 6</em></pre>
</blockquote>
<p>Of course there the two elements $a and $b are the same kind of thing (a number) and the<br />
reducing function ($a+$b) returns a number in turn.  But for the longest paths problem,<br />
we need something else: a list of results.  This is a common idiom in Haskell: using an<br />
&#8220;accumulator&#8221; as the initial value.  However Perl&#8217;s <tt>reduce</tt> uses the first element<br />
of the list as the initial value.  That&#8217;s often a very good strategy (Haskell calls it<br />
<tt>foldl1</tt>), but for a long time I thought that the Perl implementation was weak as it<br />
didn&#8217;t allow a more flexible arbitrary value.  I was wrong.  LeoNerd pointed out in #moose<br />
that you can just pass the init value as the first element of the list!  (This works in Perl<br />
of course, as lists are untyped).  For example, this subroutine would be a (rather silly)<br />
way of turning a list into a list reference:</p>
<blockquote>
<pre>  sub as_list {
    reduce { [@$a,$b] } [], @_
  }
</pre>
</blockquote>
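<p>For comparison, here is a small sketch of the same two flavours of fold in Haskell.  The names <tt>total</tt> and <tt>asList</tt> are hypothetical, chosen to mirror the Perl <tt>sum</tt> and <tt>as_list</tt> above.</p>
<blockquote><pre>
-- like Perl's reduce: the first element of the list is the starting value
total :: [Int] -> Int
total = foldl1 (+)

-- an explicit accumulator ([]), whose type differs from the elements' type
asList :: [a] -> [a]
asList = foldl (\acc x -> acc ++ [x]) []
</pre>
</blockquote>
<p>Because Haskell&#8217;s lists are typed, the trick of smuggling the initial value in as the first element isn&#8217;t available, which is why <tt>foldl</tt> takes the accumulator as a separate argument.  (Note that <tt>foldl1</tt> raises an error on an empty list.)</p>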
<p>So, we&#8217;re going to call our function like this:</p>
<blockquote>
<pre>  my $longest_list = longest qw( /foo /qux/wibble /foo/bar/baz /qux /foo/bar );
</pre>
</blockquote>
<p>And we can implement it like this:</p>
<blockquote>
<pre>  sub longest {
    reduce {
        my ($acc, $val) = ($a, $b); <em># 'accumulator' and 'value'</em>
        my $last = pop @$acc || ''; <em># The "last" value is an empty string</em>
                                    <em># the first time around</em>

        <em># We return a list reference, which will be the accumulator </em>
        <em># the next time around</em>
        [ @$acc,
          $val =~ /^\Q$last\E/ ? () : ($last), # collapsed, if substring
          $val ]
      }
      [],      <em># initial accumulator (empty list)</em>
      sort @_; <em># sorted input list</em>
  }</pre>
</blockquote>
<p>OK, so I promised myself I&#8217;d sketch this in Haskell too.  It should be slightly simpler<br />
as the hackish <tt>pop</tt>ping off the end of the list is replaced with a nice<br />
pattern match.</p>
<blockquote>
<pre>  longest :: [String] -&gt; [String]
  longest = foldl aux [] . sort
    where aux []         v = [v]
          aux xss@(x:xs) v = v :
                            (if x `isPrefixOf` v
                                then xs
                                else x:xs)
</pre>
</blockquote>
<p>As often happens when converting an algorithm to Haskell, the result list is<br />
backwards, as we&#8217;re consing the new result rather than appending to it.  But in<br />
fact, it&#8217;s not especially elegant: the conditional concatenation is a bit<br />
clumsy for example.  I asked for feedback on #haskell.  And, as so often<br />
happens on that channel, I got an altogether more compact version: augustss<br />
suggested something like:</p>
<blockquote>
<pre>  longest2 = nubBy (flip isPrefixOf) . sortBy (flip compare)
</pre>
</blockquote>
<p><tt>nubBy</tt> is precisely the pattern that reduces a list based on whether adjacent elements<br />
are to be collapsed or not.  (Apparently you may need to reverse the sense of <tt>nubBy</tt> in<br />
GHC 6.10 &#8211; I&#8217;m using 6.6.1 &#8211; which simplifies to <tt>nubBy isPrefixOf</tt>.  Nothing<br />
like backwards compatibility&#8230;)</p>
<p>
<strong>Update:</strong> <tt>sortBy (flip compare)</tt> is dolio&#8217;s suggested improvement to the original <tt>reverse . sort</tt>.  The original version does read better I think, but flipping the comparison is a bit more efficient than having to reverse the list after sorting it.  It also starts to produce values lazily before finishing the sort.</p>
<p>
<strong>Update:</strong> Pumpkin noted that a trie would be more efficient than nub&#8230; D&#8217;oh!  <tt>nub</tt>, while it <em>looks</em> elegant and compact above, is actually losing the benefit of us having sorted the list in the first place.  It&#8217;s implemented using a set of Chinese box filter functions &#8211; effectively the same as a linked list of comparisons.  If we match the nub, this is very efficient &#8211; that&#8217;s the outermost box already!  However if we&#8217;re not a prefix, then we&#8217;ll uselessly check if we&#8217;re a prefix of every other path, even though we couldn&#8217;t possibly be.</p>
<p>
What we really need is precisely the semantics of Unix&#8217;s <tt>uniq</tt>, which expects a sorted list, and can therefore do a single pass, like our reduce version.  Olathe pointed out a solution with <tt>map head . group</tt> which is almost what we want&#8230; except we need the <tt>-By</tt> version.  Given a new function <tt>uniqBy</tt>, we just need to substitute it for <tt>nubBy</tt> and we&#8217;re suddenly efficient again!</p>
<blockquote>
<pre>  uniqBy eq = map head . groupBy eq
  longest3 = uniqBy (flip isPrefixOf) . sortBy (flip compare)
</pre>
</blockquote>
<p><b>Update 2009-01-27:</b> It occurred to me that instead of all the flipping, we could simplify <tt>uniqBy</tt> by making it take the last element instead of the <tt>head</tt>.</p>
<blockquote><pre>
  longest4 = uniqBy' isPrefixOf . sort
  uniqBy' eq = map last . groupBy eq
</pre>
</blockquote>
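<p>One caveat worth checking here: as far as I can tell, <tt>groupBy</tt> compares each element with the <em>first</em> element of the current group, not with its immediate neighbour.  So for sibling paths such as <tt>/foo/bar</tt> and <tt>/foo/baz</tt> under <tt>/foo</tt>, all three land in one group and <tt>map last</tt> keeps only <tt>/foo/baz</tt>.  A sketch that compares true neighbours instead, using a right fold over the sorted list (<tt>longestAdj</tt> is a hypothetical name, not something suggested on the channel):</p>
<blockquote><pre>
import Data.List (isPrefixOf, sort)

-- drop every element that is a prefix of the next element we kept
longestAdj :: [String] -> [String]
longestAdj = foldr step [] . sort
  where
    step x []             = [x]
    step x acc@(next : _)
        | x `isPrefixOf` next = acc       -- x is extended by its successor
        | otherwise           = x : acc   -- x is a longest path in its own right
</pre>
</blockquote>
<p>This relies on the list being sorted, so that anything extending <tt>x</tt> sits directly after it.  It is still a single pass, and the results come out in sorted order rather than backwards.</p>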
<p><b>Update:</b> X pointed out that there&#8217;s a bug: ["/a", "/aa", "/aaa"] will all get smushed<br />
together as they&#8217;re substrings.  Actually whether this is a bug to the original specification<br />
is debatable &#8211; Joel did mention &#8220;strings&#8221;, however he did also mention that they are file<br />
paths, so let&#8217;s try to make them do the right thing:</p>
<p>
In fact the basic core <tt>uniqBy' isPrefixOf . sort</tt> can remain exactly the same.<br />
But instead of a single string <tt>"/foo/bar/baz"</tt> we&#8217;ll want to be passing<br />
it a list of strings like <tt>["foo","bar","baz"]</tt>.  As Haskell&#8217;s list processing<br />
is completely generic, all the building blocks like <tt>isPrefixOf</tt> and <tt>sort</tt><br />
will Just Work exactly as before!  The only complication is splitting and joining the<br />
strings, something that&#8217;s trivial in Perl.  I<br />
<a href="http://greenokapi.net/blog/2007/11/12/haskell-words-and-perl-split/">wrote about<br />
splitting words in Haskell</a> previously, but in the mean time Brent Yorgey has created<br />
<a href="http://hackage.haskell.org/cgi-bin/hackage-scripts/package/split"><tt>Data.List.Split</tt></a><br />
which should do exactly what we want.  However I can&#8217;t install it on my laptop&#8217;s ubuntu packaged<br />
ghc 6.6.1, so for now I&#8217;ll fall back to <tt>Text.Regex.splitRegex</tt>.  Then, to join the<br />
path separator again we need to <tt>intersperse</tt> it, but that doesn&#8217;t flatten the list,<br />
so we end up with something like <tt>["foo", "/", "bar"]</tt>.  <tt>Control.Monad</tt>&#8217;s<br />
<tt>join</tt> function, generic as it is, actually does the right thing here.  So we end<br />
up with:</p>
<blockquote><pre>
  longest5 :: [String] -> [String]
  longest5 = map rejoin . uniqBy' isPrefixOf . sort . map split
      where split   = splitRegex $ mkRegex "/"
            rejoin  = join . intersperse "/"
</pre>
</blockquote>
<p>We can test this against an input like <tt>[ "/a", "/aa", "/a/a" ]</tt> to check it&#8217;s all<br />
correct.</p>
<p>
The same approach should work in Perl, except that splitting/joining will be slightly simpler,<br />
while the equivalent <tt>isPrefixOf</tt> logic is more complex, as it doesn&#8217;t have the same<br />
generic list comparisons.  I&#8217;ll leave that as an exercise to the reader :-)</p>
<p>
Talking of which, thanks to Will for the Python/Erlang versions in the<br />
comments, &#8220;Anon&#8221; for an Ocaml solution, and &#8220;Programmer&#8221; for a Perl challenge<br />
I&#8217;ll come back to if time allows.</p>
<p>
<b>Update 2009-01-30:</b> Will pointed out that my &#8220;improved&#8221; testcase isn&#8217;t.  Better would be <tt>["/a", "/aa", "/b", "/b/b"]</tt> to check that /a and /aa are distinct, but /b is merged into /b/b.</p>
<p>
He also sent an <a href="http://nanu.appspot.com/Bp">improved Erlang version</a>, thanks!</p>
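<p>To try Will&#8217;s test case without installing any regex or split library, here is a self-contained sketch of the component-wise idea.  It is an assumption-laden variation rather than the code above: <tt>splitPath</tt> and <tt>longestPaths</tt> are hypothetical names, the splitting is done by hand with <tt>break</tt>, and neighbours are compared directly with a fold instead of going through <tt>uniqBy'</tt>.</p>
<blockquote><pre>
import Data.List (intercalate, isPrefixOf, sort)

-- split on '/' by hand: "/b/b" becomes ["", "b", "b"]
splitPath :: String -> [String]
splitPath s = case break (== '/') s of
    (chunk, [])       -> [chunk]
    (chunk, _ : rest) -> chunk : splitPath rest

-- compare whole path components, so "/a" is not a prefix of "/aa"
longestPaths :: [String] -> [String]
longestPaths = map (intercalate "/") . foldr step [] . sort . map splitPath
  where
    step x []             = [x]
    step x acc@(next : _)
        | x `isPrefixOf` next = acc
        | otherwise           = x : acc
</pre>
</blockquote>
<p>On <tt>["/a", "/aa", "/b", "/b/b"]</tt> this should keep <tt>/a</tt> and <tt>/aa</tt> as distinct paths while merging <tt>/b</tt> into <tt>/b/b</tt>, because <tt>isPrefixOf</tt> is now working on lists of components rather than on characters.</p>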
]]></content:encoded>
			<wfw:commentRss>http://greenokapi.net/blog/2009/01/27/theres-the-nub-snippet-in-perl-and-haskell/feed/</wfw:commentRss>
		<slash:comments>7</slash:comments>
		</item>
		<item>
		<title>Crossword puzzles in Haskell</title>
		<link>http://greenokapi.net/blog/2008/12/18/crossword-puzzles-in-haskell/</link>
		<comments>http://greenokapi.net/blog/2008/12/18/crossword-puzzles-in-haskell/#comments</comments>
		<pubDate>Wed, 17 Dec 2008 23:11:41 +0000</pubDate>
		<dc:creator>osfameron</dc:creator>
				<category><![CDATA[haskell]]></category>
		<category><![CDATA[crossword]]></category>

		<guid isPermaLink="false">http://greenokapi.net/blog/?p=156</guid>
		<description><![CDATA[Every year or so I come back to the problem of writing a crossword puzzle compiler/player. I think Javascript would be the most promising for a web-based player, though I&#8217;ve given it a go in Java and Perl too. Modeling the problem is interesting in an Object Oriented language &#8211; I would find myself getting [...]]]></description>
			<content:encoded><![CDATA[<p>
Every year or so I come back to the problem of writing a crossword puzzle<br />
compiler/player.  I think Javascript would be the most promising for a<br />
web-based player, though I&#8217;ve given it a go in Java and Perl too.<br />
Modeling the problem is interesting in an Object Oriented language &#8211; I would<br />
find myself getting bogged down with &#8220;Lines&#8221; and the similarities between<br />
rows (Across) and columns (Down).  I have a suspicion that OO Roles might be<br />
a more expressive way to model this.  Anyway, given that I&#8217;ve not been writing<br />
much about Haskell <sup><a id="nb_1_a" href="#nb_1">1</a></sup>, this is a good time to redress the balance.</p>
<p>
In the OO implementations, Cells would refer to the &#8220;Light&#8221; (group of adjacent<br />
cells running Across/Down) that they&#8217;re in.  And the Light would of course refer<br />
to the cells&#8230; this idea filled me with terror in Haskell, as it involves<br />
&#8220;Tying the Knot&#8221;, which seems terribly clever and confusing.  As it happens, you<br />
can often get away without having mutual references in Functional Programming, as<br />
they just spontaneously turn out not to be necessary.  So far, this seems to be<br />
the case, though I think that if I take it to the point of navigating the grid<br />
from a UI, I may need an explicit structure to manage this, like a <a href="http://www.haskell.org/haskellwiki/Zipper">zipper</a>.</p>
<p>
This is a &#8220;literate Haskell&#8221; post.  You should be able to save it as<br />
&#8220;crossword.lhs&#8221; and run it with <tt>runghc</tt> or <tt>ghci</tt> (calling<br />
the <tt>main</tt> function to test it).</p>
<p>We start off with some imports: the <tt>List</tt> and <tt>Char</tt> modules<br />
(which I&#8217;ve naughtily not specified) and the handy <tt>join</tt><br />
function to flatten a list.</p>
<pre>

> import Data.List
> import Data.Char
> import Control.Monad (join)
</pre>
<p>They say that much of the work in Haskell is defining the types.<br />
<tt>Direction</tt> is an Enum for the direction of a Light.<br />
<tt>Cell</tt> is either a Block (a black square) or a Cell (either blank,<br />
or already filled in).  I guess I could model <tt>Block | Blank | Cell</tt><br />
but don&#8217;t yet see an advantage to that.</p>
<pre>

> data Direction = Across | Down
>     deriving (Show, Eq, Ord)
>
> data Cell = Cell  (Maybe Char) Coord
>           | Block              Coord
>     deriving (Show, Eq)
>
> type Coord  = (Int, Int)
>
> coord (Cell _ c) = c
> coord (Block  c) = c
</pre>
<p>Directions can be sorted (which may come in useful for showing Across<br />
clues before Down ones), and this can be done automatically by deriving<br />
Eq and Ord.  But how would we sort a Cell?  We&#8217;ll do it by the (x,y)<br />
coordinates, which means I can&#8217;t use automatic derivation.  Perhaps<br />
I could swap the order of arguments and still use that, but for now I<br />
defined a custom <tt>compare</tt>.</p>
<pre>

> instance Ord Cell
>     where compare l r = compare (coord l) (coord r)
</pre>
<p><tt>Light</tt>s (distinct from Clues, which may be spread over 1 or more<br />
Lights) are a list of cells in a given direction.  (It would be nice to<br />
specify that the cells really are contiguous, not sure if this is something<br />
that fundeps would be useful for?)</p>
<pre>

> data Light     = Light [Cell] Direction
>     deriving (Show, Eq, Ord)
>
> -- we'll sometimes want to know the first cell in a Light:
> headC (Light (c:_) _) = c
</pre>
<p>Of course Lights are numbered&#8230; but with the algorithm that I&#8217;m using, we<br />
don&#8217;t know the number (like 5 Across) at the time we create it.  I created<br />
a new type, <tt>LightN</tt>, but perhaps I should have modeled with a<br />
<tt>Maybe Int</tt> instead.</p>
<pre>

> data LightN    = LightN Int Light
>     deriving Show
</pre>
<p>Now we need some test data.  I started with a grid from the wonderful<br />
<a href="http://en.wikipedia.org/wiki/John_Galbraith_Graham">Araucaria</a>,<br />
<a href="http://www.guardian.co.uk/crossword/java/new/0,,-22461,00.html">Cryptic<br />
Crossword No. 24298</a>.</p>
<pre>

> grid = textToCells [
>     "TRIPOD#        ",
>     "# # # # # # # #",
>     "PARSIFAL#RESCUE",
>     "# # # # # # # #",
>     "###            ",
>     "# # # # # ### #",
>     "ANNA###        ",
>     "# # # # # # # #",
>     "        ###PARE",
>     "# ### # #U# # #",
>     "         S  ###",
>     "# # # # #E# # #",
>     "      #INFERNAL",
>     "# # # # #U# # #",
>     "FRAGMENT#L     "
>     ]
</pre>
<p>We&#8217;re going to want to output the grid, lights, and other objects, so let&#8217;s<br />
define some functions to do that.</p>
<pre>

> showCell (Block _)         = '#'
> showCell (Cell (Just c) _) = c
> showCell _                 = ' '

> showLightN :: LightN -> String
> showLightN (LightN n l@(Light cs d)) =
>        show n ++ " "
>     ++ show d
>  -- ++ " " ++ show (coord $ headC l)
>     ++ ": "
>     ++ show (map showCell cs)
>     ++ " (" ++ (show $ length cs) ++ ")"
>     
</pre>
<p>Similarly, we want to parse the list of strings above into a list of<br />
crossword cells.  <tt>textToCells</tt> threads the row and column<br />
number with every character in the grid by zipping the list with the<br />
infinite list <tt>[1..]</tt>, which I think is quite cute, though<br />
there are no doubt more elegant versions (list comprehensions?)</p>
<pre>

> charToCell :: Char -> Coord -> Cell
> charToCell '#' = Block
> charToCell ' ' = Cell Nothing
> charToCell  c
>     | isAlpha c = Cell (Just c)
>     | otherwise = error $ "Invalid character '" ++ [c] ++ "'"
>
> textToCells :: [[Char]] -> [[Cell]]
> textToCells                     = zipWith  makeRow       [1..]
>     where makeRow  row          = zipWith (makeCell row) [1..]
>           makeCell row col char = charToCell char (row,col)
</pre>
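<p>To see the zipping trick in isolation, here is a standalone sketch (no bird tracks, so it isn&#8217;t part of the literate program).  <tt>withCoords</tt> is a hypothetical name, and it tags plain characters rather than building <tt>Cell</tt>s.</p>
<pre>

import Data.List ()

-- pair every character of a text grid with its (row, column) position
withCoords :: [String] -> [[((Int, Int), Char)]]
withCoords = zipWith tagRow [1 ..]
  where tagRow r = zipWith (\c ch -> ((r, c), ch)) [1 ..]
</pre>
<p>The outer <tt>zipWith</tt> numbers the rows and the inner one numbers the columns; because <tt>[1 ..]</tt> is infinite, each zip simply stops when the grid data runs out.</p>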
<p>But working out what cells are is the easy part!  We now want to<br />
know which cells form a light &mdash; i.e. groups of more than 1 non-block<br />
cell in either direction Across/Down.  To get data for both directions,<br />
it&#8217;s easiest to run in two passes, one in the normal direction, the<br />
other <tt>transpose</tt>d.  (I did consider trying to do both at the<br />
same time, but it hurt my brain: a one pass solution involving magical fumplicative<br />
arroids or somesuch is left as an exercise to the very clever reader).</p>
<pre>

> lights dir grid = concatMap
>                     (flip lightsInLine dir)
>                     $ rot dir grid
>     where rot Across = id
>           rot Down   = transpose
>
> lightsInLine :: [Cell] -> Direction -> [Light]
> lightsInLine cells dir =
>     let l  = filter isMultiCell
>                $ groupBy areCells cells
>     in  map (\c -> Light c dir) l
>
> areCells x y = isCell x &#038;&#038; isCell y
> isCell (Cell _ _) = True
> isCell  _         = False
>
> isMultiCell (x:y:_) | areCells x y = True
> isMultiCell _ = False
</pre>
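<p>The grouping step may be easier to follow on a plain string.  This standalone sketch (again without bird tracks, and with the hypothetical name <tt>runsInRow</tt>) applies the same <tt>groupBy</tt>-then-<tt>filter</tt> idea to a single row of the text grid, treating <tt>'#'</tt> as a block:</p>
<pre>

import Data.List (groupBy)

-- runs of two or more adjacent non-block squares in one row of text
runsInRow :: String -> [String]
runsInRow = filter isRun . groupBy bothOpen
  where
    open c        = c /= '#'
    bothOpen x y  = all open [x, y]
    isRun (x:_:_) = open x
    isRun _       = False
</pre>
<p>For the first row of the grid, <tt>"TRIPOD#        "</tt>, this picks out the six-letter run and the eight-square blank run, while a row like <tt>"# # # # # # # #"</tt> yields nothing, since none of its open squares are adjacent.</p>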
<p>So&#8230; where have we got with all of this modeling?  Well, we can now<br />
find all the Across and Down lights.  But then we&#8217;ll want to number<br />
them.  To do that, we&#8217;d have to sort them (by the coordinate).  Across and<br />
Down lights can have the same number (like 5 Across and 5 Down in our<br />
example grid) so we want to group by lights that have the same head cell.<br />
Then we can thread the light number again, using the <tt>zipWith ... [1..]</tt><br />
trick:</p>
<pre>

> allLights = join $ zipWith (map . LightN) [1..] gs
>         where gs = groupBy eqHead ls
>               ls = sort $ (lights Across grid)
>                        ++ (lights Down   grid)
>               eqHead l r = (headC l) == (headC r)
</pre>
<p>And finally, we can see the result of all the hard work, with a list of<br />
all the lights, and their current (partial) solutions:</p>
<pre>

> main = mapM_ putStrLn $ map showLightN allLights
</pre>
<p>Obviously this is only a start on the problem.  For modeling, we now need a<br />
concept of a Clue (<tt>Clue String [Light]</tt>) and a solution &#8211; should the<br />
solution belong to the clue? or to the <tt>[Light]</tt>s that it&#8217;s made up of.<br />
How do we link the answer grid (where the lights contain the correct characters)<br />
with the play grid, which contains the current letters that the player believes<br />
to be right?  And how do we update the cells, lights, and grid while playing<br />
(or creating) a crossword?</p>
<p>Suggestions on these questions, and improvements or advice on the current code<br />
are greatly appreciated!</p>
<hr />
<ol>
<li> <a id="nb_1" /><br />
    I had a complaint about this from a Planet Haskell reader: and though<br />
    the <a href="http://planet.haskell.org/faq.html">FAQ</a> does suggest<br />
    that it&#8217;s ok, or even encouraged to write about other things, perhaps<br />
    I should <i>also</i> write a little about Haskell&#8230; ;-)</p>
<p><a href="#nb_1_a">back</a></p>
</ol>
]]></content:encoded>
			<wfw:commentRss>http://greenokapi.net/blog/2008/12/18/crossword-puzzles-in-haskell/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Countdown words game solver in Haskell</title>
		<link>http://greenokapi.net/blog/2008/06/18/countdown-words-game-solver-in-haskell/</link>
		<comments>http://greenokapi.net/blog/2008/06/18/countdown-words-game-solver-in-haskell/#comments</comments>
		<pubDate>Wed, 18 Jun 2008 12:02:12 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[countdown]]></category>
		<category><![CDATA[haskell]]></category>

		<guid isPermaLink="false">http://osfameron.vox.com/library/post/countdown-words-game-solver-in-haskell.html?_c=feed-rss-full</guid>
		<description><![CDATA[Will on #geekup has been working on a Countdown letters and numbers game solver written in Python. I thought it&#39;d be fun to try to do it in Haskell, and started with the letters game (anagram) solver. Starting with a string of jumbled letters, the goal is to make the longest possible anagram. I remember [...]]]></description>
			<content:encoded><![CDATA[<p>
Will on #geekup has been working on a<br />
<a href="http://www.channel4.com/entertainment/tv/microsites/C/countdown/index.html">Countdown</a><br />
<a href="http://countdown.willboyce.com/">letters and numbers game<br />
solver</a> written in Python.  I thought it&#39;d be fun to try to do it in Haskell,<br />
and started with the letters game (anagram) solver.
</p>
<p>
Starting with a string of jumbled letters, the goal is to make the longest possible<br />
anagram.  I remember the first time I tried to solve anagrams I jumped into the<br />
problem without thinking and got mixed up in all kinds of complicated combinatorial<br />
mess.  The actual answer is very simple:  let&#39;s take two words which are anagrams of<br />
each other:</p>
<ul>
<li>monad</li>
<li>nomad</li>
</ul>
<p>Both of them contain the same letters, so they are identical in some form of &quot;canonical<br />
representation&quot;, for example</p>
<ul>
<li><tt>{a:1, d:1, m:1, n:1, o:1}</tt> &#8212; dictionary mapping letter to number of times used</li>
<li><tt>&quot;admno&quot;</tt> &#8212; a string with the letters sorted
    </li>
</ul>
<p>So we just need to consider all the subsets of the original jumbled letters in turn, and<br />
compare them against a map of <tt>canonical representation -&gt; [list of words]</tt>.</p>
<p>
So for example:
</p>
<blockquote><pre>pmqnrdzoa...
&#160;m&#160;n&#160;d&#160;oa...</pre>
</blockquote>
<p>This function is called a <em>powerset</em>.  I&#39;m lazy so I googled<br />
<a href="http://www.haskell.org/pipermail/haskell-cafe/2003-June/004484.html">a definition</a>.<br />
We want the longest words first.  The definition of powerset I found does a<br />
<em>depth first search</em> so it&#39;s not in order of length.  What we want to do<br />
is to work on a list like this</p>
<blockquote><pre>list s = sortBy (flip $ comparing length)  -- longest first
       . nub                -- unique entries only
       . powerset           -- all combinations of
       . canonicalize       -- canonical (sorted) string
       $ s</pre>
</blockquote>
<p>where <em>canonicalize</em> is just <tt>sort . map toLower . filter isLetter</tt>.</p>
<p>
<em>comparing</em> is a nice little utility sub that makes the above effectively the<br />
same as (\a b -&gt; length a `compare` length b).  We then <em>flip</em> it to reverse the<br />
ordering (and this is actually a good use for flip ;-).
</p>
<p>
Ordering by length is potentially inefficient &#8211; it checks the length of each<br />
element twice, and unlike Perl (where a string knows its own length), a string is<br />
just a list, so it has to descend the list to find it out.  This is easy to optimize<br />
by precalculating the lengths, using a technique that in Perl we call the &quot;Schwartzian<br />
transform&quot;, and I&#39;ll probably come back to this.
</p>
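<p>As a rough sketch of what that precalculation could look like in Haskell (<tt>longestFirst</tt> is a hypothetical name, and this is only one way to do it): decorate each string with its length, sort on the cached number, then throw the decoration away.</p>
<blockquote>
<pre>import Data.List (sortBy)
import Data.Ord (comparing)

-- decorate with the length, sort on it (longest first), undecorate
longestFirst :: [String] -> [String]
longestFirst = map snd
             . sortBy (flip (comparing fst))
             . map (\s -> (length s, s))</pre>
</blockquote>
<p>Each length is now computed once, rather than on every comparison.  Newer versions of <tt>Data.List</tt> also provide <tt>sortOn</tt>, which caches the key in the same way.</p>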
<p>
OK, so we have a list of subsets to compare, now we need to find a dictionary of<br />
canonical representations of words.  Luckily most unixy distributions ship with<br />
one, often <tt>/usr/dict/words</tt>, but Ubuntu sticks it elsewhere.
</p>
<p>
I asked on #haskell, and was told I should use a <tt>Data.Map</tt>, Haskell&#39;s<br />
basic equivalent of a hash or associative array, but implemented using a<br />
Functional Programming friendly tree representation.  In actual fact,<br />
quicksilver, mrs, and mmorrow told me the answer straight away, but let&#39;s<br />
pretend for the purpose of this post that we&#39;re working it out now :-)
</p>
<p>
Assuming I load that<br />
module, as is common, as <tt>M</tt>, I&#39;d essentially want to call<br />
<tt>M.insertWith (++)</tt> on each element.  The <tt>(++)</tt> is the<br />
concatenation operator, and it&#39;s the right thing to use because the<br />
dictionary is mapping <tt>String -&gt; [String]</tt>, for example</p>
<blockquote><p><tt>fromList [(&quot;admno&quot;, [&quot;nomad&quot;,&quot;monad&quot;,&quot;Damon&quot;]),...]</tt></p>
</blockquote>
<p><em>insertWith</em> returns a new copy of the Map each time.  It&#39;s like an accumulator<br />
which gradually takes on the entries from the list of words.  And whenever we think<br />
about accumulators, we can think about <em>fold</em>s.</p>
<blockquote>
<pre>foldl&#39; (\m x -&gt; M.insertWith (++) (canonicalize x) [x] m) mempty listOfWords</pre>
</blockquote>
<p><em>mempty</em> is shorthand here for &quot;an empty Data.Map&quot;.  But we can go one better as<br />
apparently fold/insertWith is so common that there is a shorthand, <em>fromListWith</em>!</p>
<blockquote>
<pre>fromListWith (++) . map (canonicalize &amp;&amp;&amp; return)</pre>
</blockquote>
<p>Woah!  That&#39;s quite compact, and I just introduced some new syntax too:  The<br />
<tt>&amp;&amp;&amp;</tt> is basically saying &quot;let&#39;s make a tuple with<br />
the result of calling these 2 functions on my input!&quot; so it&#39;s the same as</p>
<blockquote>
<pre>fromListWith (++) . map (\a -&gt; (canonicalize a, return a))</pre>
</blockquote>
<p>And <em>return</em> just means &quot;wrap this value in the appropriate Monad&quot;.  So it&#39;s<br />
a scary way of saying <tt>[a]</tt>, because we&#39;re &quot;in&quot; the List monad.  (In the same way<br />
that <tt>mempty</tt> above was an empty Data.Map, because it was the &quot;empty&quot; value of the Map Monoid.)</p>
<p>
Whenever I play with Map, I get angry errors about the monomorphism restriction.<br />
The way around that is to add an explicit type signature.  If, like me, you&#39;re not<br />
quite sure what to put there, you can add a compiler directive to quell the error,<br />
then work out what the signature would be by calling <tt>:t my_function</tt> from the<br />
GHCI command line.  (You&#39;ll often find afterwards that you can remove the signatures<br />
if you wanted to, because later on the compiler has more information to work out the<br />
types of things.  It&#39;s only really <em>during</em> incremental development that you<br />
get the problem.)</p>
<blockquote>
<pre>{-# LANGUAGE NoMonomorphismRestriction #-}
-- (that&#39;s the compiler directive, you can comment this out later)

makeAnag s = do
    d &lt;- dict
    return $ take 4 $ getAnagrams s d

dict = do file &lt;- readFile &quot;/etc/dictionaries-common/words&quot;
          return $ mkdict $ lines file

mkdict :: [String] -&gt; M.Map String [String]
mkdict = M.fromListWith (++) . map (canonicalize &amp;&amp;&amp; return) . filter longEnough

longEnough = (&gt;=3) . length</pre>
</blockquote>
<p>As you can see, for all the perceived difficulty of doing IO in a pure language<br />
like Haskell, it doesn&#39;t seem all that hard in this simple case.  <tt>readFile</tt><br />
reads the file, and <tt>lines</tt> splits it into a list of lines.</p>
<p>
The final thing is to check each powerset against the dictionary.<br />
To extract the value, we use <tt>M.lookup</tt>.  This function <tt>fail</tt>s if it<br />
can&#39;t find a value.  So we could do
</p>
<ul>
<li>For each powerset in the list
</li>
<li>Check if it&#39;s present
</li>
<li>And add it to the list if so
</li>
</ul>
<p>Which of course we could do with the <tt>Maybe</tt> type and a <tt>filter</tt>.<br />
But we want a list, and in the List type, failure is represented by an empty list<br />
<tt>[]</tt>.  So we can just map and we&#39;d get something like:</p>
<blockquote><pre>[ [&quot;anagram&quot;], [], [], [&quot;anagram 1&quot;, &quot;anagram 2&quot;], [] ]</pre>
</blockquote>
<p>With an empty list for each failure.  We can use concatMap to join these together.  So it&#39;s:</p>
<blockquote><pre>concatMap (\v -&gt; M.lookup v dict) listOfPowersets</pre>
</blockquote>
<p>Though that actually returns:</p>
<blockquote><pre>[ [&quot;anagram&quot;], [&quot;anagram 1&quot;, &quot;anagram 2&quot;] ]</pre>
</blockquote>
<p>which I hadn&#39;t expected.  (<tt>M.lookup</tt> returned a list like
<tt>[&quot;anagram 1&quot;, &quot;anagram 2&quot;]</tt>.<br />
Quite literally it <tt><strong>return</strong></tt>ed it, which in List context means it<br />
actually passed <tt>[[&quot;anagram 1&quot;, &quot;anagram 2&quot;]]</tt>, which is why the list isn&#39;t completely<br />
flattened by concatMap.)  I get around this by using <tt>join</tt>.  This is another of those<br />
monadic functions: in List context it does exactly what we want here, flattening this list.</p>
<blockquote>
<pre>getAnagrams s d = join
                . concatMap (flip M.lookup $ d)
                $ filter longEnough  -- 3 or more letters
                . sortBy (flip $ comparing length)  -- longest first
                . nub                -- unique entries only
                . powerset           -- all combinations of
                . canonicalize       -- canonical (sorted) string
                $ s</pre>
</blockquote>
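<p>In current versions of the containers library, <tt>M.lookup</tt> returns a <tt>Maybe</tt>
rather than failing in whichever monad you ask for, so the lookup-and-flatten step would probably
look more like the sketch below.  The names <tt>toyDict</tt>, <tt>hits</tt> and <tt>anagramsOf</tt>
are invented for illustration, and <tt>canonicalize</tt> is assumed to simply sort the letters.</p>
<blockquote>
<pre>import Control.Monad (join)
import Data.List (sort)
import qualified Data.Map as M
import Data.Maybe (mapMaybe)

-- Assumption: "canonical" means the letters in sorted order.
canonicalize :: String -&gt; String
canonicalize = sort

-- A toy dictionary keyed by canonical form.
toyDict :: M.Map String [String]
toyDict = M.fromListWith (++) [ (canonicalize w, [w]) | w &lt;- ["tea", "eat", "cup"] ]

-- Misses come back as Nothing and mapMaybe drops them,
-- leaving one list of words per key that was found.
hits :: [String] -&gt; [[String]]
hits = mapMaybe (\k -&gt; M.lookup k toyDict)

-- join (which is concat for lists) flattens the remaining level.
anagramsOf :: [String] -&gt; [String]
anagramsOf = join . hits</pre>
</blockquote>
<p>Here <tt>hits ["aet", "xyz", "cpu"]</tt> gives <tt>[["eat","tea"],["cup"]]</tt>, and
<tt>anagramsOf</tt> flattens that to <tt>["eat","tea","cup"]</tt>.</p>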
<p>You can look at the final <a href="http://greenokapi.net/svn/code/scratch/countdown.hs">Haskell Countdown code</a>.  I&#39;ll look at optimizing the sort and the powersets soon; any comments on<br />
other improvements (including better algorithms) are very welcome.</p>
]]></content:encoded>
			<wfw:commentRss>http://greenokapi.net/blog/2008/06/18/countdown-words-game-solver-in-haskell/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Monads in Perl (take 1)</title>
		<link>http://greenokapi.net/blog/2008/06/13/monads-in-perl-take-1/</link>
		<comments>http://greenokapi.net/blog/2008/06/13/monads-in-perl-take-1/#comments</comments>
		<pubDate>Thu, 12 Jun 2008 22:27:26 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[perl]]></category>
		<category><![CDATA[haskell]]></category>
		<category><![CDATA[monads]]></category>

		<guid isPermaLink="false">http://osfameron.vox.com/library/post/monads-in-perl-take-1.html?_c=feed-rss-full</guid>
		<description><![CDATA[I&#39;ve been away for a while from Haskell so I thought I should do some revision and really get my head around Monads. While I plodded through the wonderful &#34;meet the monads&#34; tutorial, I decided that the best way to learn would be to do. By implementing Monads in Perl. I&#39;d highly recommend trying to [...]]]></description>
			<content:encoded><![CDATA[<p>
I&#39;ve been away for a while from Haskell so I thought I should do some revision<br />
and really get my head around Monads.  While I plodded through the wonderful<br />
<a href="http://www.haskell.org/all_about_monads/html/meet.html">&quot;meet the monads&quot; tutorial</a>, I decided that the best way to learn would be to do.  By implementing Monads in Perl.<br />
I&#39;d highly recommend trying to implement monads in Your Favourite Language, if it<br />
supports lambdas.  Perl has <a href="http://sleepingsquirrel.org/monads/monads.html"> already been done by<br />
Greg Buchholz</a> and rather nicely too, but there&#39;s no Monad library on CPAN<br />
so I thought it would be worth a try.
</p>
<p>
First of all, the question of how to model &quot;types&quot; is easily resolved.  We bless each<br />
monad into the <tt>Monad</tt> class or a subclass.  These can then have methods for<br />
<tt>bind</tt> and <tt>return</tt> etc.
</p>
<p>
Now I do like Haskell&#39;s <tt>&gt;&gt;</tt> operator, and by a stroke of good fortune Perl allows<br />
us to overload that symbol too.</p>
<blockquote>
<pre>use overload &#39;&gt;&gt;&#39;   =&gt; &#39;Bind&#39;;</pre>
</blockquote>
<p>
I use the string <tt>&#39;Bind&#39;</tt> rather than the reference <tt>\&amp;Bind</tt>, so that<br />
the subclasses can easily override it.
</p>
<p>
Add some default bind methods in <tt>Monad.pm</tt>, <tt>Monad::Maybe</tt> etc.<br />
(<a href="http://greenokapi.net/svn/code/scratch/Monads/">available here</a>), and<br />
we have some simple examples like this one (in test.pl):</p>
<blockquote>
<pre>my $result =
      (Writer 2) &gt;&gt;
      L { my $x = shift; (Writer $x*2, &quot;Doubled. &quot;) &gt;&gt;
      L { my $y = shift; (Writer $y+1, &quot;Plus 1. &quot;)  &gt;&gt;
      L { my $z = shift; (Writer $z*3, &quot;Tripled $z. &quot;)
      }}};</pre>
</blockquote>
<p>
Woot!  OK, that&#39;s not entirely beautiful, but it&#39;s been slightly improved by the<br />
overloading of <tt>&gt;&gt;</tt>.
</p>
<p>
The <tt>L</tt> lambda generator is also there for readability.  It&#39;s basically defined as</p>
<blockquote><pre>sub L (&amp;) { shift }</pre>
</blockquote>
<p>i.e. it&#39;s an identity function, but it&#39;s an L (like lambda) and to my mind, lined up on<br />
the left, it looks pleasingly like &quot;and then&quot;.</p>
<h2>Nests</h2>
<p>
This didn&#39;t just fall straight out of the text editor into fully working code, of course.<br />
A blow-by-blow account of me getting confused wouldn&#39;t be especially interesting, but one<br />
big &quot;aha&quot; moment is worth pointing out.  I realised that I was thinking of monads as being<br />
a chain of lambdas, each one passing control to the next, like OO chaining:
</p>
<p style="clear:both">
<img src="http://greenokapi.net/blog/lambda-chain.png" alt="Chain of lambdas?">
</p>
<p style="clear:both">
But that doesn&#39;t work, as of course then the <tt>$x</tt>, <tt>$y</tt>,<br />
<tt>$z</tt> of each scope would be separate, whereas in fact, in &quot;later&quot; sections, you<br />
can refer to <tt>$x</tt> too.  This implies that the model is more like a nest of lambdas:
</p>
<p style="clear:both">
<img src="http://greenokapi.net/blog/lambda-nest.png" alt="Nest of lambdas">
</p>
<p style="clear:both">
This is made fairly clear in the Perl above, with its delimited braces, if you look at<br />
where the closing &quot;<tt>}</tt>&quot; are, and which opening &quot;<tt>{</tt>&quot; they match up with.
</p>
<p>
This is an interesting mind shift, and one that I still haven&#39;t really fully grasped, as<br />
I&#39;ll demonstrate a bit later.
</p>
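<p>The Haskell equivalent makes the nesting explicit once the do block is written out as binds.
This is only a sketch in the Maybe monad (rather than Writer, to stay within the base library),
and <tt>viaDo</tt> and <tt>viaNest</tt> are made-up names:</p>
<blockquote>
<pre>-- do-notation version
viaDo :: Maybe Int
viaDo = do
  x &lt;- Just 2
  y &lt;- Just (x * 2)
  z &lt;- Just (y + 1)
  return (x + y + z)

-- The same thing as explicit binds: each lambda encloses
-- everything after it, so x is still in scope in the innermost body.
viaNest :: Maybe Int
viaNest =
  Just 2       &gt;&gt;= (\x -&gt;
  Just (x * 2) &gt;&gt;= (\y -&gt;
  Just (y + 1) &gt;&gt;= (\z -&gt;
  return (x + y + z))))</pre>
</blockquote>
<p>Both evaluate to <tt>Just 11</tt>; the closing parentheses pile up at the end just like the
<tt>}}}</tt> in the Perl.</p>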
<h2> Polymorphic functions on monads </h2>
<p>
In Haskell, you can call &quot;<tt>return</tt>&quot; in a monadic block to &quot;lift&quot; a value to<br />
the appropriate monad.  Similarly, you can call &quot;<tt>fail</tt>&quot;, and the function<br />
will fail in the right way (returning <tt>Nothing</tt> in a Maybe, throwing an error<br />
in IO).  This is a function call, not a method, so how does it know which monad to<br />
behave as?
</p>
<p>
Of course Haskell does this with its strong inferencing typechecker.  The<br />
compiler &quot;knows&quot; that we are in Maybe, so &quot;fail&quot; will be <tt>fail :: Maybe</tt>.
</p>
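<p>A small sketch of one definition failing &quot;in the right way&quot; depending on the type it is
used at.  In current GHC <tt>fail</tt> lives in a separate <tt>MonadFail</tt> class, so this
uses <tt>empty</tt> from <tt>Alternative</tt> as a stand-in; <tt>safeDiv</tt> is a hypothetical
helper.</p>
<blockquote>
<pre>import Control.Applicative (Alternative, empty)

-- Hypothetical helper: succeeds with 'pure', fails with 'empty'.
-- Nothing in the definition mentions Maybe or lists.
safeDiv :: Alternative f =&gt; Int -&gt; Int -&gt; f Int
safeDiv _ 0 = empty
safeDiv a b = pure (a `div` b)

inMaybe :: Maybe Int
inMaybe = safeDiv 10 0     -- the signature selects Maybe: Nothing

inList :: [Int]
inList = safeDiv 10 0      -- the signature selects lists: []</pre>
</blockquote>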
<p>
Perl on the other hand doesn&#39;t have a strong type-inferencing compiler&#8230;<br />
Right now I&#39;m doing some shonky magic with <tt>caller()</tt> that works in this<br />
very simple test case (and I believe <em>only</em> in this test case).  I think I<br />
could just simplify things and set a dynamic variable &quot;$Monad::current_monad&quot;<br />
on the first occurrence of <tt>Bind</tt>.  Yeah, global variables, yuck.  The<br />
final alternative that occurs to me would be to run the whole thing in a Reader<br />
monad which just passes the name of the monad&#8230; but I&#39;m fairly sure that&#39;s<br />
slightly insane.
</p>
<h2> So what can it do right now? </h2>
<p>
The <a href="http://greenokapi.net/svn/code/scratch/Monads/test.pl">test script</a> shows<br />
the current capabilities.  As of r246, I have Writer, Maybe, and List implemented (the<br />
Monad superclass is effectively Identity).
</p>
<p>
I think Maybe is very useful &#8211; with some wrapper functions that raise Perl functions to<br />
monadic ones using a variety of strategies (fail on undef/0/die etc.) it could be a useful<br />
addition to the toolbox, simplifying a nested set of <tt>if</tt> checks.
</p>
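<p>For comparison, this is roughly the shape of the simplification in Haskell itself.  The data
and the names (<tt>bosses</tt>, <tt>grandBossNested</tt>, <tt>grandBoss</tt>) are invented for
the example:</p>
<blockquote>
<pre>-- Toy data: who reports to whom.
bosses :: [(String, String)]
bosses = [("alice", "bob"), ("bob", "carol")]

-- With explicit checks at every step.
grandBossNested :: String -&gt; Maybe String
grandBossNested name =
  case lookup name bosses of
    Nothing   -&gt; Nothing
    Just boss -&gt; case lookup boss bosses of
      Nothing    -&gt; Nothing
      Just gboss -&gt; Just gboss

-- With Maybe's bind: the first Nothing short-circuits the rest.
grandBoss :: String -&gt; Maybe String
grandBoss name = do
  boss &lt;- lookup name bosses
  lookup boss bosses</pre>
</blockquote>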
<p>
The List monad already does list comprehensions, albeit with a rather yucky syntax.<br />
Which is of course the big problem, &#39;cos Perl programmers (and this statement may surprise<br />
non Perl programmers :-) are often obsessive about syntax.
</p>
<h2> Making it look pretty </h2>
<p>
OK, so we already added a bit of sugar with the <tt>&gt;&gt;</tt> overloading, and the<br />
<tt>L</tt> function for lambda generators, but it&#39;s still rather ugly with the mix of<br />
Perlish argument unpacking (<tt>my $x = shift</tt>), scope delimiters (<tt>}}}</tt>) etc.
</p>
<h3> Source filters! </h3>
<p>
The original <a href="http://sleepingsquirrel.org/monads/monads.html">Perl monad<br />
tutorial</a> used a source filter to give a monadic Do notation.  It&#39;s a fairly<br />
nice one as they go, but I don&#39;t really want to treat my program as a string if<br />
I can help it, so let&#39;s look at some other techniques first!
</p>
<h3> <tt>Devel::Declare</tt> </h3>
<p>
Matt Trout has been working on some crazy parsing magic in<br />
<tt><a href="http://search.cpan.org/%7Emstrout/Devel-Declare/">Devel::Declare</a></tt>.<br />
This isn&#39;t a source filter, but (I think) hooks into Perl&#39;s parser to change the way that<br />
subroutine declarations are parsed.  It&#39;s designed to give us parameter<br />
unpacking, so that we could substitute:
</p>
<blockquote><p><tt>L {my $x = shift; .... }</tt></p>
</blockquote>
<p>with:</p>
<blockquote><p><tt>L ($x) { .... }</tt></p>
</blockquote>
<p>
In the current version this doesn&#39;t work: you can define <tt>L</tt> like that<br />
easily, but the overloaded <tt>&gt;&gt;</tt> exposes a minor parsing bug<br />
(you&#39;d have to put the expression between parentheses to get the precedence<br />
right, which loses the syntactic advantage we gain).
</p>
<p>
Still, hopefully this will be fixed in a future release.
</p>
<h3> Generators </h3>
<p>
&quot;Valued Lessons&quot; has a beautiful post on <a href="http://www.valuedlessons.com/2008/01/monads-in-python-with-nice-syntax.html">Monads<br />
in Python (with nice syntax!)</a>.  The parenthesis is not hyperbole: the post describes<br />
a monadic do block which looks about as pretty as Haskell&#39;s, but which works in a different<br />
way.  We spell &#39;bind&#39; (Haskell&#39;s <tt>&lt;-</tt>) as &#39;<tt>yield</tt>&#39;.  So a control sub<br />
calls the &#39;do&#39; block, gets out monadic values one by one as they are <tt>yield</tt>ed back,<br />
and deals with the nitty gritty of <tt>bind</tt>ing them to the rest of the generator.
</p>
<p>
It took quite a while to understand the Python code; in fact I&#39;m not sure I understand<br />
it fully.  I really don&#39;t buy into the &quot;Python is so easy to read&quot; meme, and certainly<br />
the &quot;<tt>@whatever</tt>&quot; syntax, which seems to be for &#39;decorators&#39; that modify the subroutine<br />
that follows them, is rather confusing at first.  But it&#39;s quite impressive, and it took<br />
me a while to replicate in Perl.
</p>
<p>
First hurdle:  Perl doesn&#39;t have generators.  OK, that shouldn&#39;t be an issue, I thought,<br />
because we have the <a href="http://search.cpan.org/">CPAN</a>.  And yes, I found Brock Wilcox&#39;s<br />
<tt><a href="http://search.cpan.org/%7Eawwaiid/Coro-Generator">Coro::Generator</a></tt>.
</p>
<p>
This doesn&#39;t quite do what I want though.  The yield only works one way, so</p>
<blockquote><p><tt>my $x = yield (Monad 3);</tt></p>
</blockquote>
<p>doesn&#39;t actually bind $x to 3.  I asked Brock on IRC, and apparently this behaviour is<br />
desired (I&#39;m not quite sure why) so I forked his code to play with it :-)<br />
Also, the coroutine restarts immediately it finishes, which is inconvenient.<br />
Brock suggested yielding undef at the end, which is fine, I can do that from the<br />
control sub.<br />
(The Python version deals with finishing by throwing an exception, so perhaps<br />
it has the same semantics?)</p>
<p>
After a lot of ugly pain, I finally got this working, and we can now do:</p>
<blockquote><pre>my $result = Do {
    my $x = yield (Just 3);
    my $y = yield (Nothing);
    my $z = yield (Just 5);
    warn &quot;x=$x, y=$y, z=$z&quot;;
    Just 6;
};</pre>
</blockquote>
<p>
Why the pain?  Failing to understand coroutines while trying to use them to implement monads<br />
(which I understand only very slightly) was a bad start.  I found myself using the Do function<br />
to repeatedly take a value from the generator and bind it with the next value (rather than<br />
letting the monadic bind deal with those details).  And even when I&#39;d realised that the<br />
sub that I needed to bind was a lambda that would abstract the details of invoking the coroutine,<br />
I still ended up flailing around more or less at random till I finally got it working.
</p>
<p>
The current code is ugly (declared inline in <tt>test.pl</tt> rather than modularized) but<br />
the result is pleasantly magical and readable.
</p>
<p>
Props of course to Python for having powerful techniques like <tt>yield</tt> and decorators<br />
in core!
</p>
<h3> Hold the champagne </h3>
<p>
Of course the final test example, in the List monad, doesn&#39;t work.  Why?  The List monad&#39;s<br />
bind strategy is to call the function on every element of the list, so the coroutine will get<br />
called repeatedly.  And every time it&#39;s called, the execution pointer will move on.
</p>
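<p>To make the problem concrete, here is what List&#39;s bind does in Haskell (a sketch; the names
<tt>pairs</tt> and <tt>pairsExplicit</tt> are made up).  Everything after the first bind has to
be run again for each element, which is exactly what a one-shot coroutine can&#39;t replay:</p>
<blockquote>
<pre>-- For lists, bind runs the continuation once per element and
-- concatenates the results.
pairs :: [(Int, Char)]
pairs = do
  n &lt;- [1, 2]
  c &lt;- "ab"        -- this line and everything after it runs for each n
  return (n, c)

-- The same computation with the binds spelled out as concatMap.
pairsExplicit :: [(Int, Char)]
pairsExplicit = concatMap (\n -&gt; concatMap (\c -&gt; [(n, c)]) "ab") [1, 2]</pre>
</blockquote>
<p>Both give <tt>[(1,'a'),(1,'b'),(2,'a'),(2,'b')]</tt>.</p>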
<p>
I wonder whether the Python version has the same problem?  I looked again at<br />
the Coro modules on CPAN, and noted that they are advertised as being able to<br />
implement &quot;(non-clonable) continuations&quot;.  I think this is the problem: I want to be<br />
able to take the point at which the next Bind will be called, and call exactly that same<br />
point multiple times (for the List monad).  I asked various people including Brock again,<br />
and Scott Walters (the authors of <a href="http://continuity.tlt42.org/">Continuity, a<br />
continuation-based web application framework in Perl</a>) and got the answer that Perl<br />
really doesn&#39;t do proper continuations.  (As far as I understood it, they&#39;re more or<br />
less practically impossible, due to the way Perl models its execution context).
</p>
<p>
So, unless I&#39;ve misunderstood (and please let me know if I have!) this technique is<br />
limited to monads that only call the bound function once (i.e. most of them, except List).<br />
That&#39;s a shame though, as the List comprehension semantics would be lovely to express<br />
in a monadic do block.
</p>
<h3> Meta continuations </h3>
<p>
The Valued Lessons post <em>does</em> implement continuations monadically&#8230; Could we do that<br />
and then implement monadic do using these monadic continuations?  I think the answer might be<br />
&quot;Yes but my brain would explode trying to implement it&quot;.
</p>
<h3> Plan <tt>B</tt> </h3>
<p>
I think that the most sensible method may be to take the contents of the monadic do<br />
block and use the <tt>B::</tt> modules to convert them from what <em>looks like</em>
</p>
<blockquote><p><tt>my $x = bind ...;</tt></p>
</blockquote>
<p> to </p>
<blockquote><p><tt>... &gt;&gt; sub { my $x = shift;</tt></p>
</blockquote>
<p>This<br />
is pretty much the approach of Greg Buchholz&#39;s source filter.  But I think a parse<br />
tree transformation may be more elegant.  (This said, I don&#39;t know the Perl source<br />
or understand the opcodes, so it may just be slightly crazy).</p>
<p><strong>Update: </strong>Some <a href="http://www.reddit.com/r/programming/info/6n5wt/comments/">discussion on reddit</a>, as Vox still doesn&#39;t support OpenID</p>
<p style="clear:both;">
]]></content:encoded>
			<wfw:commentRss>http://greenokapi.net/blog/2008/06/13/monads-in-perl-take-1/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Haskell &#039;words&#039; and Perl &#039;split&#039;</title>
		<link>http://greenokapi.net/blog/2007/11/12/haskell-words-and-perl-split/</link>
		<comments>http://greenokapi.net/blog/2007/11/12/haskell-words-and-perl-split/#comments</comments>
		<pubDate>Mon, 12 Nov 2007 13:18:11 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[haskell]]></category>
		<category><![CDATA[perl]]></category>
		<category><![CDATA[split]]></category>

		<guid isPermaLink="false">http://osfameron.vox.com/library/post/haskell-words-and-perl-split.html?_c=feed-rss-full</guid>
		<description><![CDATA[Haskell&#8217;s prelude has a function words that splits a string by spaces. Prelude&#62; words "nice cup of tea" ["nice","cup","of","tea"] Apparently the question comes up quite regularly on irc or haskell-cafe as to why this function is specialised to split only on whitespace. Perl&#8217;s split, for example, can split on any character, or indeed string or [...]]]></description>
			<content:encoded><![CDATA[<p>Haskell&#8217;s prelude has a function words that splits a string by spaces.</p>
<blockquote>
<pre>Prelude&gt; words "nice cup of tea"
["nice","cup","of","tea"]</pre>
</blockquote>
<p>Apparently the question comes up quite regularly on irc or<br />
haskell-cafe as to why this function is specialised to split only on<br />
whitespace. Perl&#8217;s <tt>split</tt>, for example, can split on any character, or indeed string or regular expression.</p>
<p>As quicksilver has suggested, the split function is more complicated than you might think:</p>
<ul>
<li>character/string/regular expression</li>
<li>include the split token in the list?</li>
<li>collapse empty tokens?</li>
</ul>
<p>and therefore perhaps the reason that there is no general function, is that it would require a very complex type.</p>
<p>I thought about this some more, and if I&#8217;ve learned anything at all<br />
about functional programming, it&#8217;s that you can build up progressively<br />
more complicated functions from smaller pieces. I wondered if I could<br />
do the same with split.</p>
<p>After playing a little with <tt>unfoldr</tt>, I decided I was better<br />
off using a simple recursive solution for now until I understand it<br />
better. But the first thing I need is a function to split a string just<br />
once.</p>
<blockquote>
<pre><span style="color: cyan;">&gt;</span> splitOne <span style="color: red;">=</span> splitOne' <span style="color: red;">[</span><span style="color: red;">]</span>
<span style="color: cyan;">&gt;</span>  <span style="color: green;"><span style="text-decoration: underline;">where</span></span> splitOne' acc p <span style="color: red;">[</span><span style="color: red;">]</span>         <span style="color: red;">=</span> <span style="color: cyan;">(</span>acc<span style="color: cyan;">,</span> Nothing<span style="color: cyan;">,</span> Nothing<span style="color: cyan;">)</span>
<span style="color: cyan;">&gt;</span>        splitOne' acc p xs<span style="color: red;">@</span><span style="color: cyan;">(</span>x<span style="color: red;"><strong>:</strong></span>xs'<span style="color: cyan;">)</span> <span style="color: red;">=</span>
<span style="color: cyan;">&gt;</span>            <span style="color: green;"><span style="text-decoration: underline;">let</span></span> m <span style="color: red;">=</span> p xs
<span style="color: cyan;">&gt;</span>            <span style="color: green;"><span style="text-decoration: underline;">in</span></span> <span style="color: green;"><span style="text-decoration: underline;">case</span></span> m <span style="color: green;"><span style="text-decoration: underline;">of</span></span>
<span style="color: cyan;">&gt;</span>                Just <span style="color: cyan;">(</span>s<span style="color: cyan;">,</span> rest<span style="color: cyan;">)</span> <span style="color: red;">-&gt;</span> <span style="color: cyan;">(</span>acc<span style="color: cyan;">,</span> Just s<span style="color: cyan;">,</span> Just rest<span style="color: cyan;">)</span>
<span style="color: cyan;">&gt;</span>                Nothing        <span style="color: red;">-&gt;</span> splitOne' <span style="color: cyan;">(</span>acc<span style="color: cyan;">++</span><span style="color: red;">[</span>x<span style="color: red;">]</span><span style="color: cyan;">)</span> p xs'</pre>
</blockquote>
<p>splitOne takes a function which either returns the separator and the<br />
rest of the string, or Nothing. We iterate the list of characters,<br />
stopping at the first where this function matches. Examples of these<br />
functions:</p>
<blockquote>
<pre><span style="color: cyan;">&gt;</span> onCharP p xs<span style="color: red;">@</span><span style="color: cyan;">(</span>x<span style="color: red;"><strong>:</strong></span>xs'<span style="color: cyan;">)</span> <span style="color: red;">|</span> p x    <span style="color: red;">=</span> Just <span style="color: cyan;">(</span><span style="color: red;">[</span>x<span style="color: red;">]</span><span style="color: cyan;">,</span> xs'<span style="color: cyan;">)</span>
<span style="color: cyan;">&gt;</span>                      <span style="color: red;">|</span> otherwise <span style="color: red;">=</span> Nothing
<span style="color: cyan;">&gt;</span> onChar c <span style="color: red;">=</span> onCharP <span style="color: cyan;">(</span><span style="color: cyan;">==</span>c<span style="color: cyan;">)</span></pre>
</blockquote>
<blockquote>
<pre><span style="color: cyan;">&gt;</span> onSpace <span style="color: red;">=</span> onCharP isSpace
<span style="color: cyan;">&gt;</span> onComma <span style="color: red;">=</span> onChar  <span style="color: magenta;">','</span></pre>
</blockquote>
<p>At which point we can do:</p>
<blockquote>
<pre>*Main&gt; splitOne onSpace "nice cup of tea"
("nice",Just " ",Just "cup of tea")
*Main&gt; splitOne onSpace "nice"
("nice",Nothing,Nothing)</pre>
</blockquote>
<p>So we now need to run for the whole length of the string, which is where the actual split function comes in.</p>
<blockquote>
<pre><span style="color: cyan;">&gt;</span> split t p <span style="color: red;">[</span><span style="color: red;">]</span> <span style="color: red;">=</span> <span style="color: red;">[</span><span style="color: red;">]</span>
<span style="color: cyan;">&gt;</span> split t p xs <span style="color: red;">=</span> <span style="color: green;"><span style="text-decoration: underline;">let</span></span> <span style="color: cyan;">(</span>tok<span style="color: cyan;">,</span>sep<span style="color: cyan;">,</span>rest<span style="color: cyan;">)</span> <span style="color: red;">=</span> splitOne p xs
<span style="color: cyan;">&gt;</span>                    res <span style="color: red;">=</span> t <span style="color: cyan;">(</span>tok<span style="color: cyan;">,</span>sep<span style="color: cyan;">)</span>
<span style="color: cyan;">&gt;</span>                <span style="color: green;"><span style="text-decoration: underline;">in</span></span>  <span style="color: green;"><span style="text-decoration: underline;">case</span></span> rest <span style="color: green;"><span style="text-decoration: underline;">of</span></span>
<span style="color: cyan;">&gt;</span>                      Nothing    <span style="color: red;">-&gt;</span> res
<span style="color: cyan;">&gt;</span>                      Just rest' <span style="color: red;">-&gt;</span> res <span style="color: cyan;">++</span> split t p rest'</pre>
</blockquote>
<p><tt>split</tt> takes a transformation function as well as a<br />
predicate. The transformation receives each (token, separator) pair and turns<br />
it into the list of results to keep.</p>
<blockquote>
<pre><span style="color: cyan;">&gt;</span> onlyToken <span style="color: red;">::</span> <span style="color: cyan;">(</span>t<span style="color: cyan;">,</span> t1<span style="color: cyan;">)</span> <span style="color: red;">-&gt;</span> <span style="color: red;">[</span>t<span style="color: red;">]</span>
<span style="color: cyan;">&gt;</span> onlyToken <span style="color: cyan;">(</span>x<span style="color: cyan;">,</span><span style="color: green;"><span style="text-decoration: underline;">_</span></span><span style="color: cyan;">)</span>  <span style="color: red;">=</span> <span style="color: red;">[</span>x<span style="color: red;">]</span></pre>
</blockquote>
<blockquote>
<pre><span style="color: cyan;">&gt;</span> <span style="color: blue;">-- onlyWord ("",_) = []</span>
<span style="color: cyan;">&gt;</span> onlyWord <span style="color: cyan;">(</span><span style="color: red;">[</span><span style="color: red;">]</span><span style="color: cyan;">,</span><span style="color: green;"><span style="text-decoration: underline;">_</span></span><span style="color: cyan;">)</span> <span style="color: red;">=</span> <span style="color: red;">[</span><span style="color: red;">]</span>
<span style="color: cyan;">&gt;</span> onlyWord <span style="color: cyan;">(</span>x<span style="color: cyan;">,</span> <span style="color: green;"><span style="text-decoration: underline;">_</span></span><span style="color: cyan;">)</span> <span style="color: red;">=</span> <span style="color: red;">[</span>x<span style="color: red;">]</span></pre>
</blockquote>
<blockquote>
<pre><span style="color: cyan;">&gt;</span> tokenAndSep <span style="color: cyan;">(</span>t<span style="color: cyan;">,</span> Nothing<span style="color: cyan;">)</span> <span style="color: red;">=</span> <span style="color: red;">[</span>t<span style="color: red;">]</span>
<span style="color: cyan;">&gt;</span> tokenAndSep <span style="color: cyan;">(</span>t<span style="color: cyan;">,</span> Just s<span style="color: cyan;">)</span>  <span style="color: red;">=</span> <span style="color: red;">[</span>t<span style="color: cyan;">,</span>s<span style="color: red;">]</span></pre>
</blockquote>
<p>This means that you can write the <tt>words</tt> function, as well as a function to split on commas, with different behaviours.</p>
<blockquote>
<pre><span style="color: cyan;">&gt;</span> words  <span style="color: red;">=</span> split onlyWord    onSpace
<span style="color: cyan;">&gt;</span> commas <span style="color: red;">=</span> split tokenAndSep onComma</pre>
</blockquote>
<p>As quicksilver suggested, <tt>split</tt> does indeed have a rather complicated type:</p>
<blockquote>
<pre>*Main&gt; :t split
split :: (([a1], Maybe a2) -&gt; [a]) -&gt; ([a1] -&gt; Maybe (a2, [a1])) -&gt; [a1] -&gt; [a]</pre>
</blockquote>
<p>but the final function is simple enough. I did promise that we&#8217;d be<br />
able to split on words as well as characters, and this is why splitOne<br />
runs the predicate against <tt>xs</tt> instead of just the head of the list.</p>
<blockquote>
<pre><span style="color: cyan;">&gt;</span> onPrefix <span style="color: red;">::</span> Eq a <span style="color: red;">=&gt;</span> <span style="color: red;">[</span>a<span style="color: red;">]</span> <span style="color: red;">-&gt;</span> <span style="color: red;">[</span>a<span style="color: red;">]</span> <span style="color: red;">-&gt;</span> Maybe <span style="color: cyan;">(</span><span style="color: red;">[</span>a<span style="color: red;">]</span><span style="color: cyan;">,</span> <span style="color: red;">[</span>a<span style="color: red;">]</span><span style="color: cyan;">)</span>
<span style="color: cyan;">&gt;</span> onPrefix <span style="color: red;">=</span> onPrefix' <span style="color: red;">[</span><span style="color: red;">]</span>
<span style="color: cyan;">&gt;</span>  <span style="color: green;"><span style="text-decoration: underline;">where</span></span> onPrefix' <span style="color: red;">::</span> Eq a <span style="color: red;">=&gt;</span> <span style="color: red;">[</span>a<span style="color: red;">]</span> <span style="color: red;">-&gt;</span> <span style="color: red;">[</span>a<span style="color: red;">]</span> <span style="color: red;">-&gt;</span> <span style="color: red;">[</span>a<span style="color: red;">]</span> <span style="color: red;">-&gt;</span> Maybe <span style="color: cyan;">(</span><span style="color: red;">[</span>a<span style="color: red;">]</span><span style="color: cyan;">,</span> <span style="color: red;">[</span>a<span style="color: red;">]</span><span style="color: cyan;">)</span>
<span style="color: cyan;">&gt;</span>        onPrefix' acc <span style="color: red;">[</span><span style="color: red;">]</span> s2 <span style="color: red;">=</span> Just <span style="color: cyan;">(</span>acc<span style="color: cyan;">,</span> s2<span style="color: cyan;">)</span>
<span style="color: cyan;">&gt;</span>        onPrefix' acc <span style="color: green;"><span style="text-decoration: underline;">_</span></span>  <span style="color: red;">[</span><span style="color: red;">]</span> <span style="color: red;">=</span> Nothing
<span style="color: cyan;">&gt;</span>        onPrefix' acc s1 s2
<span style="color: cyan;">&gt;</span>           <span style="color: red;">|</span> <span style="color: cyan;">(</span>head s1<span style="color: cyan;">)</span> <span style="color: cyan;">==</span> <span style="color: cyan;">(</span>head s2<span style="color: cyan;">)</span>
<span style="color: cyan;">&gt;</span>                 <span style="color: red;">=</span> onPrefix' <span style="color: cyan;">(</span>acc<span style="color: cyan;">++</span><span style="color: red;">[</span>head s1<span style="color: red;">]</span><span style="color: cyan;">)</span> <span style="color: cyan;">(</span>tail s1<span style="color: cyan;">)</span> <span style="color: cyan;">(</span>tail s2<span style="color: cyan;">)</span>
<span style="color: cyan;">&gt;</span>           <span style="color: red;">|</span> otherwise <span style="color: red;">=</span> Nothing</pre>
</blockquote>
<p>Which gives us:</p>
<blockquote>
<pre>*Main&gt; split onlyWord (onPrefix "and") "sex and drugs and rock and roll"
["sex "," drugs "," rock "," roll"]</pre>
</blockquote>
<p>OK, this is still missing two important things from the Perl function:</p>
<ul>
<li>split on a regexp.  You could use some parser combinator as the splitOne predicate.</li>
<li>split a certain number of fields and then stop, like so:<br />
<blockquote>
<pre>  DB&gt; x split /,/, "red,green,yellow,blue", 3
0  'red'
1  'green'
2  'yellow,blue'</pre>
</blockquote>
<p>I haven&#8217;t really thought enough about how to implement this one without complicating things.</p></li>
</ul>
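<p>For what it&#8217;s worth, a field limit on its own is not too bad if you ignore the
transformation-function machinery.  The sketch below is a hypothetical standalone helper
(<tt>splitLimit</tt> is a made-up name, and it only splits on a single-element predicate, not on
the general matchers above):</p>
<blockquote>
<pre>-- Hypothetical helper, not part of the split API above.
-- Stop splitting once n fields have been produced; the last
-- field keeps whatever is left, separators included.
splitLimit :: Int -&gt; (a -&gt; Bool) -&gt; [a] -&gt; [[a]]
splitLimit n isSep xs
  | n &lt;= 1    = [xs]
  | otherwise = case break isSep xs of
      (tok, [])       -&gt; [tok]
      (tok, _ : rest) -&gt; tok : splitLimit (n - 1) isSep rest</pre>
</blockquote>
<p>With this, <tt>splitLimit 3 (== ',') "red,green,yellow,blue"</tt> gives
<tt>["red","green","yellow,blue"]</tt>, the same fields as the Perl debugger output.</p>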
<p>As usual when I start on Haskell, I&#8217;m missing some obvious way to<br />
make the API not suck horribly, so comments and criticism very welcome.</p>
<p><strong>Update Nov 15: </strong></p>
<p>Thanks to everyone who responded here and on <a href="http://programming.reddit.com/info/60ec6/comments">reddit</a> (it is a constant source of pleasure and amazement that people find these silly posts worth commenting on :-)</p>
<p>Andrew, vincentk, and geezusfreak commented that splitting a string<br />
is really a form of parsing, and therefore they would use a proper<br />
parser library like Parsec. Good point, though split in Perl is often<br />
&#8220;just good enough&#8221; as a lightweight solution. I&#8217;d like to see a<br />
solution in Parsec or similar though.</p>
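<p>Parsec itself is a separate package, so as a stand-in here is a rough sketch of the parsing
approach using <tt>ReadP</tt>, the small parser combinator library that ships with base.  The
names <tt>commaFields</tt> and <tt>splitCommas</tt> are hypothetical, and it only handles
literal commas:</p>
<blockquote>
<pre>import Text.ParserCombinators.ReadP (ReadP, char, eof, munch, readP_to_S, sepBy1)

-- One or more fields separated by commas, up to the end of input.
-- A field is any (possibly empty) run of non-comma characters.
commaFields :: ReadP [String]
commaFields = sepBy1 (munch (/= ',')) (char ',') &lt;* eof

-- Run the parser and keep the result only if it is the single
-- parse that consumed the whole string.
splitCommas :: String -&gt; Maybe [String]
splitCommas s = case readP_to_S commaFields s of
  [(fields, "")] -&gt; Just fields
  _              -&gt; Nothing</pre>
</blockquote>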
<p>Mamie Camacho suggested using Text.Regex.</p>
<p>I had a play with this and it&#8217;s quite simple to do something like:</p>
<blockquote>
<pre> <span style="color: cyan;">&gt;</span> splitRegex <span style="color: cyan;">(</span>mkRegex <span style="color: magenta;">"\\s*,\\s*"</span><span style="color: cyan;">)</span> <span style="color: magenta;">"eggs,ham, whatever"</span></pre>
</blockquote>
<p>Haskell&#8217;s string quoting is less pleasant than Perl&#8217;s (having to<br />
escape the backslash) and this version doesn&#8217;t seem to have the semantics<br />
of keeping the separator when it is in capture brackets (e.g. if you use the<br />
string <tt>&quot;(\\s*,\\s*)&quot;</tt>).</p>
<p>Other good suggestions included starting from the source of <tt>words</tt> and deriving a more general solution from that.  Thanks again!</p>
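<p>As a sketch of that last suggestion: the Haskell Report defines <tt>words</tt> with
<tt>dropWhile</tt> and <tt>break</tt>, and turning the hard-coded <tt>isSpace</tt> into a
parameter gives a more general function.  <tt>wordsOn</tt> is a made-up name, and like
<tt>words</tt> it collapses runs of separators and drops empty fields:</p>
<blockquote>
<pre>import Data.Char (isSpace)

-- The Report's definition of words, with the predicate
-- turned into a parameter.
wordsOn :: (Char -&gt; Bool) -&gt; String -&gt; [String]
wordsOn p s = case dropWhile p s of
  "" -&gt; []
  s' -&gt; w : wordsOn p s''
    where (w, s'') = break p s'</pre>
</blockquote>
<p>So <tt>wordsOn isSpace</tt> behaves like <tt>words</tt>, and
<tt>wordsOn (== ',') "red,,green"</tt> gives <tt>["red","green"]</tt>.</p>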
<p style="clear:both;">
]]></content:encoded>
			<wfw:commentRss>http://greenokapi.net/blog/2007/11/12/haskell-words-and-perl-split/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
	</channel>
</rss>
