[Warning: This is a very long post. You might consider skipping the entire Learning you for great good! section, jumping directly to the Well, what’s a Monad? section]
My functional journey started with Javascript techniques, grew as I briefly explored a few functional languages, then bloomed as I got comfortable with Elixir. Going further has finally defined some long-nebulous terms and shown me the power of both algebraic data types and compiler assurance that a given function is truly pure.
Before you think anything else, let me first say that I still think Elixir is great. The concurrency primitives it inherits from Erlang/OTP are top-notch, and its own innovations like easy macro use via quote
/unquote
are truly powerful. For example: Absinthe, a GraphQL library for Elixir, uses macros for compile-time validation that your schema is sound. I’ve written a good bit of Elixir code now, including custom Ecto and Absinthe data types.
But I am always trying to improve. And in my internet wanderings I encountered this excellent video about OCaml. A couple key points stuck out for me:
These points spoke directly to my real-world experiences. I wanted these features.
I was excited to try Dialyzer, having had a great experience with gradual typing in JavaScript via Flow. Flow’s disjoint unions give you something close to the expressive power of a language like OCaml. And so, I expected Dialyzer to give me a similar level of power. I slowly added type annotations to my entire project, pushed forward by Credo, a linter for Elixir.
The first surprise was that Dialyzer doesn’t work on .exs
files. And Mix understands only .exs
files in your test/
directory when you run mix test
. So, no gradual typing for your tests. At least, not without jumping through some hoops.
Second, errors I expected did not materialize. This Elixir code has a few problems:
@spec test(item :: :one | :two | :three) :: :four | :five
def test(item) do
case item do
:zero -> :four
:one -> :five
:two -> :six
end
end
There are three errors I would like Dialyzer to catch for me here:
path
are three different atoms: :one
, :two
, and :three
. But the case
statement pattern-matching refers to :zero
- that’s not a potential value.case
statement is also missing the potential value :three
in this case.:four
or :five
, but the value :six
is returned in that third case.I am disappoint.
At least those @spec
statements are still useful documentation.
So I guess I’m looking for static typing. What language has that but also universal immutability and a functional design like Elixir?
Isn’t it pretty obscure? Like, not really used by anyone in production? And people who do use it have no problem injecting terms like ‘monad’ into everyday conversation. But I did hear that it is always immutable, like Elixir. And that it will draw nice lines around all of your I/O too.
Let’s dig in.
First, it should be noted that it’s named after Haskell Curry, and currying is a pretty important concept in functional programming. They really reached for the stars in naming this thing! That level of gravity fits, because the language was designed by a working group that evolved out of a 1987 conference called “Functional Programming Languages and Computer Architecture.” In the words of one of the committee members:
The committee’s primary goal was to design a language that satisfied these constraints:
- It should be suitable for teaching, research, and applications, including building large systems.
- It should be completely described via the publication of a formal syntax and semantics.
- It should be freely available. Anyone should be permitted to implement the language and distribute it to whomever they please.
- It should be based on ideas that enjoy a wide consensus.
- It should reduce unnecessary diversity in functional programming languages.
An interesting outcome of #1 is that it is considered to be very academic. And because of #3 there are quite a few compilers available.
The good news is that at this point haskell.org is pretty clear about where you should start:
ghc
to compile, ghci
for REPLstack
cabal
Okay, now what?
I had come across the “Learn You a Haskell for Great Good!” tutorial a couple times before, but never had the motivation to power through it. This time I was on a mission. Here are my notes - things I thought were surprising or particularly important as I went through each section.
From the introduction all the way to 7. Modules:
'
to either denote a strict version of a function (one that isn’t lazy) or a slightly modified version of a function or a variable. Because '
is a valid character in functions, we can make a function like this.”fn 1 2
to 1 \`fn\` 2
:
is the ‘append to head of list’ operator, a cheap operation++
is ‘assemble a new list by combining’ which is a pretty costly operation!!
indexes into a list, zero-indexed<
, <=
, >
and >=
to compare lists, they are compared in lexicographical order. First the heads are compared. If they are equal then the second elements are compared”head
/tail
as expected for working with lists. last
/init
are their opposite.length
, null
(True
if list is empty), reverse
, take
, drop
, maximum
, minimum
, sum
, product
, elem
[1..20]
, ['a'..'z']
, ['A'..'Z']
. Can reverse, can provide more than one number to give you an amount per step, and you can leave out the end for an infinite list!Int
is a bounded integer, Integer
is unboundedcycle
, repeat
, replicate
[x*2 | x <- [1..10], x*2 >= 12]
First section: the value taken, second section (could be more than one): the range, third section (also allows multiple): the filters. Multiple values give you a cross product.(1, 2)
. Note: each tuple size is its own type, so you can’t mix lengths in a list.zip
combines two lists to one list of pairs. It will truncate the longer list.let rightTriangles' = [ (a,b,c) | c <- [1..10], b <- [1..c], a <- [1..b], a^2 + b^2 == c^2, a+b+c == 24]
fn x y = x + y
. fn 1
returns a function that takes one number.:t
or :type
in ghci
is extremely useful!Eq
is for equality, inequalityOrd
is an extension of equality - for ordering of elementsShow
is converting to string for consoleRead
is converting from stringEnum
a set of sequential typesBounded
ends of the range. minBound
and maxBound
are essentially polymorphic constantsNum
anything that standard arithmetic operations (+
, -
, etc.) apply toIntegral
restricts Num
to whole numbersFloating
restricts Num
to types that can represent non-whole numbersread "5" - 2
works, because it knows we’re looking for an integer. but plain read "5"
wouldn’t. Need this: read "5" :: Int
_
in a pattern match is something you don’t care about and won’t use@
syntax all@first:rest
|
) is used for a set of pattern-matching expressions within one function overload.otherwise
is a funny keyword mapped to True
, useful in pattern matchingwhere
keyword gives you an opportunity to massage your parameters a bit before running a bunch of piped guard statements through itlet
/in
expression where you do a bunch of setup, and then the result of the expression is the result of it allcase
expression uses pattern matching across one incoming value(\xs -> length xs > 15)
key is the \
which apparently looks a little bit like a lambda symbol (λ)addThree :: (Num a) => a -> a -> a -> a
addThree x y z = x + y + z
addThree = \x -> \y -> \z -> x + y + z
foldl
is reduce, starting from the beginning of the list. foldr
from the right. foldll
and foldrl
are the same, but they don’t require a separate starting value. The first element taken from the list is the accumulator.scanX
methods, just like foldX
map :: (a -> b) -> [a] -> [b]
. Also, “if you have a function like foo a = bar b a
, you can rewrite it as foo = bar b
”reverse
as foldl (flip (:)) []
$
operator has a very low precedence, so x $ stuff
functions as x (stuff)
: f $ x = f x
. Nicer chaining when doing several steps in a row..
operator composes functions f . g = \x -> f (g x)
(or f . g x = f $ g x
given currying). Nicer chaining without the actual function parameters, to prepare one callable function out of a number of constituent steps.replicate 100 (product (map (*3) (zipWith max [1,2,3,4,5] [4,5,6,7,8])))
becomes
replicate 100 . product . map (*3) . zipWith max [1,2,3,4,5] $ [4,5,6,7,8]
import
pulls everything in globally, but you can add qualified
to ensure that all references in the file retain that namespace (and as
to change that name). You can also use hiding
to provide a list of things in that module NOT to include.Data.List
module has a lot of really useful methods: intersperse
, intercalate
, transpose
, concat
, concatMap
, and
, or
(you’ll probably use any
and all
, since they take a predicate) and so on: https://downloads.haskell.org/~ghc/latest/docs/html/libraries/base-4.9.0.0/Data-List.htmllength
takes an Int
and not an Integer
(or better Integral
) so you can use Data.List’s genericLength
instead. There are other methods like this.Data.Char
for string handling.Data.Map
for key/value behavior. A map is printed out as fromList [("key1",1),("key2",2)]
Data.Set
for sorted, unique list. Implemented with trees like map. Displayed as fromList [1, 2, 3]
Data.Map.lookup
: data Maybe a = Nothing | Just a
module Something.Name (
fn1,
fn2
) where
fn1 x = x^1
fn2 x = x^2
:t
and :i
in the console?
Int
type is not so flexible. Don’t restrict things to Int
. Probably use Integral
or Numeric
.Data.Ratio
is what you get with 1 % 3
(requires an import of Data.Ratio
)Fractional
is what you get with 1 / 3
:t Eq
or :t Ord
give you an error.
:info Eq
(or :i
) gives you information about the typeclassghc --make -Wall
checks for exhaustiveness!Data.Char
deal with unicode? Are toUpper
and toLower
unicode-aware?
ghci
to show unicode characters properly, I had to install unicode-show
and start up with ghci -interactive-print=Text.Show.Unicode.uprint
Data.Text
is good for ‘packed’ (not a linked list) UTF-16 strings. Sadly, it doesn’t pass all of these tests: https://mortoray.com/2013/11/27/the-string-type-is-broken/Prelude Data.Text> noel = pack "noël"
Prelude Data.Text> Data.Text.reverse noel
"l̈eon"
From 8. Types and Typeclasses to a bit of 9. I/O:
data
keyword allows you to create a new type. If it contains types after the name, it creates a function with that arity which constructs the type.deriving (Show)
which is required for your type to be printed to the console.Point(..)
(includes all constructors). Interesting: “We could also opt not to export any value constructors for Shape
by just writing Shape
in the export statement. That way, someone importing our module could only make shapes by using the auxiliary functions” This is how Map.Map
is exported - you can only create one with fromList
data Type
followed by concrete type classes, like Float
or Int
. You can parameterize your type with a lowercase value, usually a
. data Maybe a = Nothing | Just a
can be applied to any type. Maybe
by itself doesn’t make sense, but Maybe Int
does.Map k v
which is almost always Ord k
so the lookup tree can be ordered. What ends up happening is that all the functions that operate on Map k v
specify that the k
must be Ord k
and then there are a couple functions that just don’t care about whether the k
is ordered.[minBound..maxBound] :: Day
after defining this:data Day = Monday | Tuesday | Wednesday | Thursday | Friday | Saturday | Sunday
deriving (Eq, Ord, Show, Read, Bounded, Enum)
type Name = String
to make your function signatures easier to deal with. These can also be parameterized like type AssocList k v = [(k,v)]
. Or use partial application: type IntMap = Map Int
. The key difference is that these don’t produce new functions - the names specified after type
can only be used in type declarations and casts.Either a b
, where a
is some sort of type that can tell us something about the possible failure and b
is the type of a successful computation. Hence, errors use the Left
value constructor while results use Right
.”infixr
and infixl
!data
typesclass
keyword essentially gives you interfaces, like Eq
which provides (==)
and (/=)
. instance
provides an implementation of a class
against a given type.id
is “a standard library function that takes a parameter and returns the same thing”Functor
represents things that can be mapped over: class Functor f where\n fmap :: (a -> b) -> f a -> f b
. f
could be a list, for example. There are some rules about Functor behavior, like: order should be maintained.kind
using :k
in ghci
- *
means ‘a concrete type.’ * -> *
means a concrete type will be generated when a concrete type is provided. Maybe
has that type, but Maybe Int
is a final concrete type.()
- it’s an empty tuple, with a type of ()
.<-
operator, like name <- getLine
- “You can read that piece of code like this: perform the I/O action getLine
and then bind its result value to name
”<-
construct. And if we’re taking data out of an I/O action, we can only take it out when we’re inside another I/O action.”do
keyword allows for a sequence of operations. let
can be used freely.return
makes an I/O action out of a pure value. ”return
is sort of the opposite to <-
”when
from Control.Monad
is essentially “if something then do some I/O action else return ()
”mapM
, forM
, for mapping over I/O actionsgetContents
for a streaming representation of stdio. interact
is similar, but it pulls a line at a time, handing it to you to respond or not.System.IO
- readFile
, writeFile
, appendFile
. Low level, streaming methods are openFile
and withFile
in System.IO
. Can set how files are buffered when streaming with hSetBuffering
hGetContents
, hGetLine
, hPutStr
, etc.System.Environment
has getArgs
and progName
. Looks like we’ll need a tool to combine things like --cows 5
into one object(r, gen) = random (mkStdGen 949494) :: (Int, StdGen)
(from System.Random
)randoms
is useful, an infinite list of random numbers based on an initial generatorrandomR
and randomRs
are the same, but with boundsSystem.Random.getStdGen :: IO StdGen
for that initial seed! newStdGen
updates the global generator so if you call getStdGen
again, you’ll get something different.List
can be onerous when dealing with files, can use Data.ByteString
(not lazy at all) and Data.ByteString.Lazy
(not as lazy as List
). A lot of the same behaviors as list - :
is cons
. You probably want to use the strict version cons'
so the bytes get packed into larger chunks.System.Directory
IO
management? For example, is there a web server out there built on top of Haskell? Perhaps a game loop?
let
and do
. Two choices:let thing1
thing2
thing3
let
thing1
thing2
thing3
From a bit more 9. I/O to 11. Functors and Monoids:
System.IO.Error.catchIOError :: IO a -> (IOError -> IO a) -> IO a
System.IO.Error.isXXXError
IOError
with System.IO.Error.ioeXXX
IO
methods, or you can return something like IO (Either a b)
to have the compiler help you deal with the fact that something went wrong.instance
keyword is key to applying a typeclass (like Functor
) to a new type.fmap
, because it looks prettier.”Functor
’s fmap
is kind of like unboxing, applying a function, then re-boxing. But then you consider that ->
is a functor, and you can fmap
functions themselves: ”fmap
acts just like . for functions” - function composition. Chaining. Sadly the analogy starts to break down. “You can think of fmap as either a function that takes a function and a functor and then maps that function over the functor, or you can think of it as a function that takes a function and lifts that function so that it operates on functors.”Functor
:
Applicative
is a Functor
with an even greater ability to be composed. Just (+3) <*> Just 9
results in Just 12
Functor
-> Applicative
is like going up a ‘meta’ level: (\x -> "hello " ++ show x)
fmap Just 1
, to Just (\x -> "hello " ++ show x) <*> Just 1
. Or: Just (+) <*> Just 3 <*> Just 5
gives Just 8
Applicative
:
pure f <*> x
is equivalent to fmap f x
pure id <*> v = v
pure (.) <*> u <*> v <*> w = u <*> (v <*> w)
pure f <*> pure x = pure (f x)
u <*> pure y = pure ($ y) <*> u
<$>
is pretty much the same thing as fmap
. So we can do this: (++) <$> Just "johntra" <*> Just "volta"
, which is equivalent to this: (++) "johntra" "volta"
(except for the Maybe
wrapper)ZipList
: getZipList $ (+) <$> ZipList [1,2,3] <*> ZipList [100,100,100]
returns [101,102,103]
newtype
keyword is used for simple wrapping of another type, likely for new implementations (with instance
keyword) of different typeclass
methods - like ZipList
. It’s a little bit faster than types created with data
- no wrapping/unwrapping.Monoid
type class. “A monoid is when you have an associative binary function and a value which acts as an identity with respect to that function. When something acts as an identity with respect to a function, it means that when called with that function and some other value, the result is always equal to that other value. 1
is the identity with respect to *
and []
is the identity with respect to ++
”++
does take two lists and append one to the other, *
doesn’t really do any appending, it just multiplies two numbers together”Monoid
:
mappend mempty x = x
mappend x mempty = x
mappend (mappend x y) z = mappend x (mappend y z)
(Notice that monoids don’t require that mappend a b
be equal to mappend b a
)Data.Foldable
exports generic foldXX
methods that work on any Foldable
type.We’re finally talkin’ 12. Monads! Today is also when we complete the tutorial.
>>=
“: (>>=) :: m a -> (a -> m b) -> m b
where m
is Applicative
Monad
is not declared with an Applicative
type constraint, but it is.pure
and return
are exactly the same. Based on that prior point, seems like return
came before pure
>>=
: return (0,0) >>= landLeft 1 >>= banana >>= landRight 1
>>=
with lambdas to preserve context becomes do
:foo :: Maybe String
foo = Just 3 >>= (\x ->
Just "!" >>= (\y ->
Just (show x ++ y)))
Can be rewritten as this. We could even take that last line and turn it into a return
to make it simpler…
foo :: Maybe String
foo = do
x <- Just 3
y <- Just "!"
Just (show x ++ y)
do
expression, the fail
function is called.”do
notation, as long as you write your own guard
function to replicate the filter clausesMonad
:
return x >>= f
is equivalent to f x
m >>= return
is equivalent to m
(m >>= f) >>= g
is equivalent to m >>= (\x -> f x >>= g)
(same thing as Applicative associativity, it’s just that the ‘unboxing’/‘boxing’ makes the syntax a little harder.f <=< g = (\x -> g x >>= f)
then this becomes easier to grok:
f <=< return
is equivalent to f
return <=< f
is equivalent to f
f <=< (g <=< h)
is equivalent to (f <=< g) <=< h
ghc-pkg list
Control.Monad.Writer
- keeping track of some Monoid
along with the primary target of the calculation. Could be used for logging, if String
or [String]
is used for the Monoid
[]
as a Writer
’s Monoid
- it’s slow to append to the end of a list, fast to prepend. So be sure to make things right-associative, use tail-call recursion (essentially: attach the log before calculating the next level of recursion)DiffList
is a higher-performance type than a plain list>>=
implementation will make sure that it all works out.”Control.Monad.State
- newtype State s a = State { runState :: s -> (a,s) }
State
examples exactly as provided! :0(Monad
instead of Applicative
:
fmap
-> liftM
<*>
-> ap
liftA2
-> liftM2
filter
-> filterM
mapM
-> mapM_
foldl
-> foldM
join
is a flatten
call for a Monad
: join (Just (Just 9))
results in Just 9
- but it only works for two layers. m >>= f
is equivalent to join (fmap f m)
Data.Ratio.Rational
for no loss of precision when doing math on non-integers! It’s written as a fraction with a percentage sign: 1 % 4
and it only resolves to something else when you need it toProbability
monad to track the probability of a given outcome along with the value of that outcomex -: f = f x
break
takes a list and a predicate - the first item in the pair is the set of items for which the predicate returned false. When the predicate returns true, the rest of those items are the second item in the pair.State
and StateT
?
type State s = StateT s Identity
” (via http://stackoverflow.com/questions/9697980/the-state-monad-and-learnyouahaskell-com)State
in type hints (::), but state
when calling the function. Confusing.type State s = StateT s Identity
like that’s supposed to help me understand the difference between them. What is Identity
?Writer [String] (Maybe Int)
how would that work? Would the nice do
syntax still apply, or would we need to develop yet another level of meta introspection/execution?
Maybe
and Writer
. I guess that’s my big question - I haven’t seen a good example of truly pure functions used on the stuff inside Monads. I suppose that inside any of these methods I could be calling pure functions instead of simply return (Just x)
.dropFourOrLess :: Maybe Int -> Writer [String] (Maybe Int)
dropFourOrLess Nothing = return Nothing
dropFourOrLess (Just x)
| x > 4 = do
tell ["Keeping " ++ show x]
return (Just x)
| otherwise = do
tell [show x ++ " is too small, dropping"]
return Nothing
dropTenOrMore :: Maybe Int -> Writer [String] (Maybe Int)
dropTenOrMore Nothing = return Nothing
dropTenOrMore (Just x)
| x < 10 = do
tell ["Keeping " ++ show x]
return (Just x)
| otherwise = do
tell [show x ++ " is too large, throwing it away"]
return Nothing
main :: IO ()
main = putStrLn $ show $ return (Just 10) >>= dropFourOrLess >>= dropTenOrMore
Outputs this:
WriterT (Identity (Nothing,["Keeping 10","10 is too large, throwing it away"]))
“Monads are this strange, mysterious, weird but powerful, expressive pattern that we can use for doing programming, and they come with curse. The ‘monadic curse’ is that once someone learns what monads are and how to use them, they lose the ability to explain it to other people.” - Douglas Crockford
I now understand this difficulty. I understand why all the previous articles about Monads I’ve read were pretty much incomprehensible. The truth is that if you want to understand Monads as they really are, and not in metaphor, you need to start from a couple basic building blocks.
The functional concept of mapping an operation over the contents of something is now quite widely known thanks to libraries like Lodash. Lodash has the ability to map over JavaScript objects and arrays. A Functor
is anything that can be mapped over with fmap
. <$
is the simplest form of this, replacing all values inside the Functor
with a single value:
Prelude> fmap (+ 1) (Just 4)
Just 5
Prelude> 5 <$ (Just 4)
Just 5
Applicative
is a Functor
which is more composable. We can create a new Applicative
from a value with the pure
function, for example. And we can apply functions that don’t know anything about the Applicative
to the values inside:
Prelude> pure 5
5
Prelude> pure 5 :: Maybe Int
Just 5
Prelude> (+) <$> Just 1 <*> Just 4
Just 5
Prelude> Just (+) <*> Just 1 <*> Just 4
Just 5
Drumroll please. A Monad
is an Applicative
which implements two new functions. return
is exactly like pure
, assembling a Monad
from a raw value. >>=
is a new operator which takes a Monad
, extracts the value from it and gives it to the provided function, which is then responsible for returning a new Monad
. Take a look at these chained calls using lambda functions:
Prelude> return 0 >>= (\i -> return (i + 1)) >>= (\j -> return (j + 4))
5
Prelude> return 0 >>= (\i -> return (i + 1)) >>= (\j -> return (j + 4)) :: Maybe Int
Just 5
Notice how return
eliminates the need to specify the exact Monad
we want. This then allows us to change the meaning of the expression with a type hint. This expression works for any Monad
whose contents are compatible with +
and can be initialized with 0
.
Now you’re probably wondering what the big deal is. Sure, it’s a collection of traits. And that Maybe
thing is a Monad
, whatever. But there are a whole lot of others: IO
, List
, Either a
, Control.Monad.State
and so on. You can see the available set by typing :i Monad
into ghci
(more are added to this list as you import new modules). Explore these and get a feel for the ideas they encapsulate. Most aren’t as neatly described as Maybe
, a container for another value (or Nothing
). Some are a whole lot more abstract. For example: encapsulating a future state (State
), or side effects (IO
).
This wide applicability one of the reasons people have such a hard time explaining the Monad
concept. But at least now all those claims that JavaScript Promises are Monads make a bit more sense!
You’ll see this everywhere as you explore Haskell, so it’s worth covering. It only seems scary because the term Monad
is pretty scary. Two behaviors are specified: the ability to create an empty Monoid
, and the ability to combine two different Monoid
s into one final Monoid
:
Prelude> mempty :: [Int]
[]
Prelude> mappend [1] [2, 3]
[1,2,3]
Haskell is incredibly powerful, and incredibly complex. I see now why it has the reputation it does. There’s a huge barrier to entry, and it really is vastly different from the set of languages primarily taught and used today. What might pull someone into this world? You can’t get a job doing it if you don’t know it yet!
I’m not really comfortable with it, but I do plan to practice. I’m drawn to the value provided by its type system and the restrictions around I/O, but I’ve also heard about its unpredictability regarding performance. I’m especially looking forward to exploring its concurrency capabilities. We’ll see!
I feel like I’m at a good stopping point in my functional programming journey. At least for now… who knows, maybe Idris is in my future!
A few resources:
I’ve decided to put away my consultant hat for a while, because I’ve joined Open Whisper Systems to work on their Signal Desktop application! I’m really excited to contribute to such an important... Read more »
After five years working with Node.js, I’ve learned a lot. I’ve already shared a few stories, but this time I wanted to focus on the ones I learned the hard way. Bugs, challenges, surprises, and... Read more »