view 7-Haskell/README.txt @ 94:eb868f089bd1

Add notes and examples on data types and monads Still to do: Day 3 exercises
author IBBoard <dev@ibboard.co.uk>
date Sun, 23 Jun 2019 20:13:55 +0100
parents 7e4afb129bef
children
line wrap: on
line source

Install with `zypper install ghc` for the Glasgow Haskell Compiler. The interactive shell (REPL) is `ghci`. Currently on v8.6.5 (was 6.12.1 in the book). Haskell uses strong static types with inference.

Numbers behave like numbers. Strings have double-quotes and characters have single quotes. An array of characters (in square brackets) is a string. "+" is purely numeric addition - concatenation is "++".

Equality is tested with "==" (equals) and "/=" (not equals - not "is divisible by"!). Haskell is strongly typed

Indentation is significant in Haskell (like Python) BUT you can do an "if … then … else …" on a single line (however, it will assume that "if then" is a parse error, probably because functional code isn't imperative and so can't miss the else)

Using functions on the wrong type of arguments seems to give helpful error messages along the lines of "No instance for (arg types) arising from a use of 'function' at X" (and "+" is a function).
":t val" lets you see the type of a variable or value, and ":set +t" lets you see the type of returned values. 

In code, functions are defined as "type function_name param = body". In code, that is always prefixed with "let".
Functions return the result of their last instruction.
Full function definitions are preceded by a type definition line in the form "function_name :: type(s) -> type(s)". You can also have generics and have "function_name :: (Parent_class generic_name) => generic_name -> generic_name".
Code in files needs to be in a module, which starts "module ModuleName where" on the top line.
You can then do `:load filename.hs` in the console. This then puts you in that module on the CLI (the prompt changes).

Haskell functions can use parameter matching on arguments - see factorial.hs.

Invoking parameters doesn't use brackets (because they're used for tuples - except when it's just a single value, when they're ignored).

Lists can be broken up with "head" and "tail" functions (or "fst" and "snd" for tuples) *or* they can be broken up on assignment. It's like Python multi-assignment, but with colon - "(_head:_tail)" (":" is the list construction operator). It can then be used in function definitions, e.g.:
    size [] = 0
    size (h:t) = 1 + size t

Haskell can create ranges with [start..end]. Specifying an invalid range (e.g. decreasing) gives an empty list. Specifying an increment is different to all other languages: [start, next..end] (e.g. "[10, 8..4]" gives "[10, 8, 6, 4]" and it works out to decrement by 2).
You can also do (lazy) infinite ranges by not specifying an end and then using "take" functions etc to pull just the values you need.

List comprehension is slightly pythonic: "[expr | val <- [vals] ]" where "expr" is the expression to calculate in the new list, val is a variable used in expr and [vals] is the source list. It's effectively "expression for value in list".
You can also assign multiple independent variables, possibly from the same source. For example:
    let crew = ["Kirk", "Spock", "McCoy"]
    [(a,b) | a <- crew, b <- crew]
calculates all combinations of crew names. Changing it to:
    [(a,b) | a <- crew, b <- crew, a /= b]
lets you add filtering to stop people being paired with themselves. Or you could use "a < b" to make it return unordered unique pairings.

Haskell has "map" - "map func list" - where "func" can be an anonymous function.

Anonymous functions are defined as:
    (\param_1 … param_n -> function_body)
They can be called in-line, which looks odd:
    (\x -> x) "Logical."
return "Logical." (because the anon function returns the value passed to it).

Alternatively, anonymous functions can be written as locally scoped functions after a main function definition, using "where":
    squareAll list = map square list
        where square x = x * x

Map can also be used with part of a function such as "(+ 1)". The book calls this a "section", but it isn't clear what it means. The wiki says a "section" is a partially applied infix operator (https://wiki.haskell.org/Section_of_an_infix_operator).

Haskell also has the standard functional filter (takes two parameter: a boolean "keep in list" function and a list) and foldl/foldr (take three parameters - a two-value function (value and accumulator), a starting value and a list).
You can even "foldl (+) 0 [1..3]" to sum by using "+" as the two-parameter folding function.

All of this has apparently been a lie, though. Haskell functions only have one argument! If you check the type of a multi-argument method then you get "arg_1_type -> arg_2_type -> arg_3_type -> result" rather than just "arg_1_type, arg_2_type, arg_3_type -> result" (Note: it won't be bracketed on the left because it's not expecting a three-tuple)
This world view can be made more apparent with the following:
    let prod x y = x * y
    let double = prod 2
    let triple = prod 3
double and triple are partially applied functions, and the only (sane) way to be able to do that is if "prod x y" is a two-part function! The book says "When Haskell computes prod 2 4, it is really computing "(prod 2) 4". [insert anonymous functions]". This is called currying.

Haskell has "data types" that use the 'data' keyword to specify an allowed set of values, and "types" that reference data types, e.g.:
    data Suit = Spades | Hearts
    data Rank = Ten | Jack | Queen | King | Ace
    type Card = (Rank, Suit)
    type Hand = [Card]
Suit can be Spades or Hearts, Rank can be Ten through Ace (no ordering implied here because we don't use "deriving Ord"), a Card is a tuple of a rank and a suit, and a Hand is a list of cards.
This will fail to print values in the console unless we add "deriving (Show)" to the end of the "data …" lines.

The book isn't clear what makes a "data" vs a "type", but it appears at first to be about whether it refers to values or other "classes" ('data' types).
Except that on the next page the "Triplet" definition doesn't meet that criteria.
The Haskell wiki (https://wiki.haskell.org/Type) says types are defined using "data" and then "type" introduces synonyms.
The examples are:
    data Maybe a = Just a | Nothing
(Define "Maybe" as a polymorphic of type variable "a" as "Just a thing of that type 'a'" or "Nothing")
    data Tree a = Branch (Tree a) (Tree a) | Leaf a
(Define "Tree" as a polymorphic of type variable "a" as a Branch of two trees of type a or a Leaf of type a - you'd assume Branch and Leaf are defined elsewhere, but apparently this is enough to define them implicitly!)

Note: The wiki defines Card as a "data", not as a "type". It also does it quite differently, using "Card = Card { value :: Rank, suit :: Suit } …", which seems recursive at first but is probably just like "class MyClass { public MyClass(var a) { … } }".
Also from the wiki, if you "deriving" something like Ord or Enum then you need to use "instance [derived thing] [my type] where … [implementation]"

Haskell also includes "classes", but they're not like Object-Oriented classes. They define which operations work on which inputs (e.g. Bool + Bool doesn't work, but Integer + Integer does). It's a bit more like an interface in that it defines functionality
and guarantees functions exist BUT with optional default implementation. For example, in the following definition:

class Eq a where
    (==), (/=) :: a -> a -> Bool
    x /= y = not (x == y)
    x == y = not (x /= y)

This means nothing on its own, but any data type "deriving Eq" will work with "==" and "/=" and you only need to implement one for your type to get the other for free.

Monads. Monads are about function composition. They let you do the equivalent of "v = func1(v); v = func1(v); v = func2(v); return v" from imperative languages.
You can do this pattern with a multi-line "let … in [return_value]" *but* you can't reuse variable names.

Monads have a type constructor that takes a container, a function called "return" to wrap a function in the container, and a bind function called ">>=" that unwraps and chains the function. See monad.hs.

A built-in use of monads is the "do" notation, which gives an imperative style on the surface but is implemented with monads:

    tryIo = do putStr "Enter your name: " ;
               line <- getLine ;
               let { backwards = reverse line } ;
               return ("Hello. Your name backwards is " ++ backwards)

That's a function call "tryIo" that uses "do" to look imperitive. This is just syntactic sugar so that you don't have to type ">>=" so often. The semi-colons are optional, and you can wrap the code in "{}"

A list is already defined as a monad using "instance". The definitions are simple enough:

instance Monad [] where
    m >>= f = concatMap f m
    return x = [x]

So we define a monad on the list (top line) then define the binding, which applies the function to every value in the list, and then the return value is a list of the value.
This allows lists to be used in "do …" blocks to write a "for each item in an array return a value" function, such as:
    let cartesian (xs, ys) = do x <- xs; y <- ys; return (x, y)
Here, we take in two lists (xs and ys) and for each pair x and y we return the tuple BUT the monad definition and the "do" operation means that we get a list returned:
    Main> cartesian ([1..2], [3..4])
    [(1,3), (1,4), (2,3), (2,4)]
The book says this is "an alternate computational strategy" to list comprehension, but it reads almost entirely like list comprehension. The only difference in this example is that the "return" is at the end. c.f.:
    let cartesian_list (xs, ys) = [(x,y) | x <- xs, y <- ys]
The password cracking example is a little more complex but could still probably be done with list comprehension.

"Maybe" is also a monad on the Maybe data type. The type is defined as "Nothing" or "Just a". The Monad returns is an alias for "Just" and so the result is "Just x" and there are two ways of binding the function.
    Nothing >>= f = Nothing
    (Just x) >>= f = f x
i.e. there are no "null pointer exception"-type problems for chaining function calls as applying a function to Nothing safely returns Nothing.