What is Functional Programming?

  • From HaskellWiki: "Functional programming is a style of programming which models computations as the evaluation of expressions."
  • Functional Programming is programming without state
    • No mutability!
    • Referential Transparency!
    • Great for pipelines!
  • Another view: Functional programs describe what things are instead of how to do them

In [1]:
fact 0 = 1
fact n = n * fact (n-1)
-- Pro way of doing this: fact n = product [1..n]

fact 10


3628800

Benefits of Functional Programming, or: why do you care?

  • Clean, modular code
  • Easy to refactor
  • Easy to make abstractions
  • Allows for tons of aggressive compliler optimizations
  • Can look at functions in isolation
  • Often trivially parallizeable
    • No state -> no data races
    • No state -> extremely lightweight threads

In [2]:
-- Parallelization example
import Control.Parallel.Strategies (parMap, rseq) -- from `parallel` package

-- Non-parallel version
incrementAll xs = map (+1) xs

-- Parallel version
incrementAll' xs = parMap rseq (+1) xs

incrementAll [0,1,2,3]
incrementAll' [0,1,2,3]


[1,2,3,4]
[1,2,3,4]

Examples of Functional Languages

Lisp Family

  • Common Lisp
  • Clojure
  • Scheme
  • Racket

ML Family

  • Haskell <- This is what we will be using today
  • OCaml
  • Erlang
  • Scala
  • Elm

You are already using functional programming!

  • Lambdas
  • Recursion
  • Arguably generators!
  • Math!
  • So take the leap! Cast off your imperative chains!

Hello, Haskell


In [3]:
-- "Hello, World!" in Haskell
main = putStrLn "Hello, World!"

:t main
:t putStrLn
:t "Hello, World!"
main


main :: IO ()
putStrLn :: String -> IO ()
"Hello, World!" :: [Char]
Hello, World!

Thinking Functionally

Printing first 100 primes


In [4]:
-- define infinite list of primes
primes :: [Integer]
primes = 2 : filter isPrime [3..] -- the primes are 2, followed by every prime number from 3 on

-- check whether a given number is prime
-- a number is prime if it is not divisible by any prime up to its square root
isPrime :: Integer -> Bool
isPrime n = divisibleByNone n relevantPrimes
    where relevantPrimes = takeWhile relevant primes
          relevant p = p*p <= n

-- check whether an integer is divisible by any number in a list of integers
divisibleByNone :: Integer -> [Integer] -> Bool
divisibleByNone n = not . any (divides n)
    where divides n p = n `mod` p == 0

-- the main function: print the first 100 primes
main :: IO ()
main = print . take 100 $ primes

main

-- Can easily print all primes < 1000 instead 
anotherMain :: IO ()
anotherMain = print . takeWhile (<1000) $ primes

anotherMain

-- See primes.py for imperative comparison!
-- Key take away: functional code has no variables--we were just defining what things were


[2,3,5,7,11,13,17,19,23,29,31,37,41,43,47,53,59,61,67,71,73,79,83,89,97,101,103,107,109,113,127,131,137,139,149,151,157,163,167,173,179,181,191,193,197,199,211,223,227,229,233,239,241,251,257,263,269,271,277,281,283,293,307,311,313,317,331,337,347,349,353,359,367,373,379,383,389,397,401,409,419,421,431,433,439,443,449,457,461,463,467,479,487,491,499,503,509,521,523,541]
[2,3,5,7,11,13,17,19,23,29,31,37,41,43,47,53,59,61,67,71,73,79,83,89,97,101,103,107,109,113,127,131,137,139,149,151,157,163,167,173,179,181,191,193,197,199,211,223,227,229,233,239,241,251,257,263,269,271,277,281,283,293,307,311,313,317,331,337,347,349,353,359,367,373,379,383,389,397,401,409,419,421,431,433,439,443,449,457,461,463,467,479,487,491,499,503,509,521,523,541,547,557,563,569,571,577,587,593,599,601,607,613,617,619,631,641,643,647,653,659,661,673,677,683,691,701,709,719,727,733,739,743,751,757,761,769,773,787,797,809,811,821,823,827,829,839,853,857,859,863,877,881,883,887,907,911,919,929,937,941,947,953,967,971,977,983,991,997]

Map Filter Reduce

  • General approach for solving large classes of problems
  • Lends itself well to parallelization, distributed computing, etc
  • Strategy:
    • Start with a large dataset
    • Apply some transformation to each (map)
    • Cut down to just the interesting results (filter)
    • Summarize (reduce)
  • map, filter, and reduce are the name of the python functions. In Haskell, they are map, filter, and foldl/foldr.

In [5]:
-- map applies a function to each element in a list
:t map
map negate [0..10]


map :: forall a b. (a -> b) -> [a] -> [b]
[0,-1,-2,-3,-4,-5,-6,-7,-8,-9,-10]

In [6]:
-- filter takes a list and cuts it down to only the elements which match some predicate
:t filter
filter even [0..10]


filter :: forall a. (a -> Bool) -> [a] -> [a]
[0,2,4,6,8,10]

In [7]:
-- foldl repeatedly applies a binary function to pairs of elements in a list, starting with a
-- supplied default value. So this is equivalent to ((((0+1)+2)+3)+4)...
-- (The `l` in foldl means that it does this in a left-associative way)
foldl (+) 0 [1..10]


Use sum
Found:
foldl (+) 0
Why Not:
sum
55

In [8]:
import Data.Char (digitToInt)

-- Read a string into an integer.
-- Strategy: start from left, reading each digit. To combine with the next element, multiply
-- the accumulator by 10 and add the next
readInt :: String -> Int
readInt to_read = foldl step 0 to_read
    where step old new = old*10 + digitToInt new
    
readInt "123"
-- this is will calculate (((0*10+1)*10+2)*10+3) = 123


123

Map Filter Reduce for Weak Lensing Pipeline

  • Goal: Take a bunch of raw images, process them, and combine only the high quality ones

In [9]:
-- Weak lensing pipeline

-- process a single image
process :: RawImage -> ProcessedImage
process = undefined

-- is this processed image high quality?
highQuality :: ProcessedImage -> Bool
highQuality = undefined

-- take a stack and a processed image, and get their combination
addToStack :: Stack -> ProcessedImage -> Stack
addToStack = undefined

-- the empty stack
emptyStack :: Stack
emptyStack = undefined

-- Now we are ready to define our pipeline!
-- Note: `.` is function composition, so `(f . g) x` is `f(g(x))`. Hence the right-to-left
--       ordering of map -> filter -> reduce
pipeline :: [RawImage] -> Stack
pipeline = foldl addToStack emptyStack . filter highQuality . map process

-- For fun: let's parallelize the processing
parallel_pipeline :: [RawImage] -> Stack
parallel_pipeline = foldl addToStack emptyStack . filter highQuality . parMap rseq process

-- datatype definitions (empty for this example)
data RawImage = RawImage {
    -- Raw image data here
    }

data ProcessedImage = ProcessedImage {
    -- Process image data here
    }
    
data Stack = Stack {
    -- Stacked image data here
    }

Parsing CSV files


In [10]:
-- CSV Parser example
import qualified Text.Parsec as P -- from `parsec` package
import Text.Parsec.String (Parser) -- ditto
import Control.Applicative -- built in

-- define types
type CSV = [Record]
type Record = [Field]
type Field = String

-- basic CSV parser
-- A CSV file is a list of Records separated by newlines
csv :: Parser CSV
csv = record `P.sepBy` P.endOfLine

-- A Record is a list of Fields separated by commas
record :: Parser Record
record = field `P.sepBy` P.char ','

-- A Field is many characters, none of which are '"' or a newline
field :: Parser Field
field = many (P.noneOf ",\n")

-- Now let's test it:
exampleFile =
    "Haskell,Functional\n\
    \Python,Imperative\n\
    \Prolog,Logical"

P.runParser csv () "example" exampleFile


Right [["Haskell","Functional"],["Python","Imperative"],["Prolog","Logical"]]

In [11]:
-- But wait! We forgot about escaping!

-- Escaped CSV parser

-- This is the same
csv :: Parser CSV
csv = record `P.sepBy` P.endOfLine

-- This is the same
record :: Parser Record
record = field `P.sepBy` P.char ','

-- So all we have to do is redefine what a field is.

-- A quoted field is a quote, followed by many characters which can appear in quotes, followed
-- by another quote
quotedField :: Parser Field
quotedField = P.char '"' *> many inQuoteChar <* P.char '"'

-- Characters in quotes are either escaped quotes or anything that's not a quote.
inQuoteChar :: Parser Char
inQuoteChar =
    P.try (P.string "\"\"" *> return '\"') -- escaped quote: when you read "", return "
    <|> P.noneOf "\"" -- anything that's not a quote

-- This is just our old definition of a field.
regularField :: Parser Field
regularField = many (P.noneOf ",\n")

-- A field is either a quoted field or a regular field
field :: Parser Field
field = quotedField <|> regularField

-- Now lets test it:
exampleFile' =
    "Regular,Regular Field\n\
    \Quoted,\"One, two, three\"\n\
    \Escaped,\"An \"\"escaped\"\" field\""

P.runParser csv () "example'" exampleFile'


Right [["Regular","Regular Field"],["Quoted","\"One"," two"," three\""],["Escaped","\"An \"\"escaped\"\" field\""]]

But Devon! What about inherently state-y things?

Implementing A Solar Simulation


In [12]:
-- Data type which determines the type of our solar simulation
data SolarState = SolarState {
        radius :: Double,
        metallicity :: Double
        -- ... other variables
    }

-- Define how to do a step in our simulation
step :: SolarState -> SolarState
step = undefined

-- Let's say we want to do n steps.
-- `iterate step initial_state` creates an infinite list of states, each one generated by
-- applying step to the previous state.
-- `!! n` takes the nth element
doNsteps :: Int -> SolarState -> SolarState
doNsteps n initial_state = iterate step initial_state !! n

-- Let's say we want to instead look at the first n steps
firstNSteps :: Int -> SolarState -> [SolarState]
firstNSteps n initial_state = take n $ iterate step initial_state

-- Let's say we want to iterate until some condition is true

-- This function tells us whether we are done
done :: SolarState -> Bool
done = undefined

-- Run the simulation until done
runSim :: SolarState -> SolarState
runSim initial_state = until done step initial_state

-- This kind of flexibility is awesome when your advisor inevitably changes their mind about
-- how your simulation should work!

Okay, but what about really, really inherently state-y things?

RNGs

  • Usually implemented as nextVal :: Rng -> (a, Rng)
  • Lots of ways of hiding this, so it's less obnoxious to work with. Example: type Rand g Int.

In [13]:
import Control.Monad.Random as R

-- Plan: define a "Random Int", manipulate that, and finally evaluate the result of these
-- manipulations by supplying an RNG.

-- Single die roll function. This represents a "random" integer between 1 and 6.
dieRoll :: R.RandomGen g => R.Rand g Int
dieRoll = R.getRandomR (1, 6)

-- Roll many dice, and get their results
dieRolls :: RandomGen g => R.Rand g [Int]
dieRolls = sequence (repeat dieRoll) -- The `sequence` turns a [Rand g Int] into Rand g [Int]
                                     -- i.e. It handles the RNG state chaining

-- Take 10 rolls
tenRolls :: RandomGen g => R.Rand g [Int]
tenRolls = take 10 <$> dieRolls

R.evalRand tenRolls (mkStdGen 71889) -- We could have seeded this however we wanted, including
                                     -- using /dev/random or whatever. This was simple though.


[5,5,1,3,2,2,1,1,1,1]

Input / Output

  • Different functional languages deal with this differently
  • In Haskell, all functions that interact with the rest of the world must be marked IO
    • Example: readFile :: Filename -> IO String
    • Usually these functions can be separated nicely from everything else!
    • If not, then it's good to know that your function is (arguably) not referentially transparent (hence the IO marker)

In [14]:
-- Implementation of `tac`
import System.Environment (getArgs)

main :: IO ()
main =
    getArgs -- get file names
    >>= mapM readFile -- get list of file contents
    >>= mapM_ (putStrLn . reverseFile)

-- >>= is a way of chaining IO actions. More on that in optional Typeclasses section.

-- Here we have separated the non-IO parts of our code from the IO parts
reverseFile :: String -> String
reverseFile = concat . reverse . lines

Optional Topic: Functional Error Handling


In [15]:
first :: [a] -> a
first (x:xs) = x

first [1,2,3,4,5]
first "Devon is cool"

-- Error: undefined for empty lists!


1
'D'

In [16]:
-- let's define a safer version.
-- data Maybe a = Nothing | Just a

first' :: [a] -> Maybe a
first' (x:xs) = Just x
first' [] = Nothing

first' [1,2,3]
first' []


Just 1
Nothing

In [17]:
-- Using this, we can force ourselves to deal with edge cases!

-- If we use the bad version of first, we can get into trouble
first_incremented :: Num a => [a] -> a
first_incremented xs = first xs + 1
first_incremented [3,2,1]
-- first_incremented [] -- error

first_incremented' :: Num a => [a] -> Maybe a
first_incremented' xs =
    case first' xs of
        Just n -> Just (n + 1)
        Nothing -> Nothing

first_incremented' [3,2,1]
first_incremented' []

-- We can do this more cleanly as:
first_incremented'' :: Num a => [a] -> Maybe a
first_incremented'' = fmap (+1) . first' --more on this in typeclasses section!
first_incremented'' [3,2,1]
first_incremented'' []


4
Just 4
Nothing
Just 4
Nothing

Optional Topic: Functional Data Structures


In [18]:
-- List
data List a = EmptyList | List a (List a) -- this is a full definition of a singly-linked list

-- example function to define on a list
listContains :: Eq a => List a -> a -> Bool
listContains EmptyList _ = False
listContains (List val rest) x = (val == x) || listContains rest x

-- Tree
data Tree a = EmptyTree | Tree (Tree a) a (Tree a) -- this is a full definition of a tree

-- example function to define on a tree
treeContains :: Eq a => Tree a -> a -> Bool
treeContains EmptyTree _ = False
treeContains (Tree left val right) x = (val == x) || treeContains left x || treeContains right x

Optional Topic: Writing Super General Code With Typeclasses, or: How I Learned to Stop Worrying and Love The Monad

Functors

  • We have seen types [a] and Maybe a
  • For [a] you can define a function (a -> b) -> [a] -> [b]
  • For Maybe a you can define a function (a -> b) -> Maybe a -> Maybe b
  • Let's abstract this!

In [19]:
class Functor' f where
    fmap' :: (a -> b) -> f a -> f b
    -- This is also written as <$>
    -- Must act intuitively with `id`
    
instance Functor' [] where
    fmap' = map

instance Functor' Maybe where
    fmap' f (Just a) = Just (f a)
    fmap' f Nothing = Nothing
    
fmap' show [1,2,3]
fmap' show (Just 1)
fmap' show Nothing


["1","2","3"]
Just "1"
Nothing

Applicative

  • You can make a similar typeclass for when the function itself is encapsulated. This is Applicative.
  • This is useful for building up things that may fail, or applying many functions

In [20]:
class Applicative' f where
    pure' :: a -> f a
    apply' :: f (a -> b) -> f a -> f b
    
-- uses:

-- Constructing thing who's parts may fail:
data Star = Star {
    starName :: String,
    starMass :: Double
    } deriving Show
    
makeStar :: Maybe String -> Maybe Double -> Maybe Star
makeStar maybe_name maybe_mass = pure Star <*> maybe_name <*> maybe_mass

makeStar (Just "Sol") (Just 2e30)
makeStar Nothing (Just 2e10)

-- Applying multiple functions:
[(*2), (^2)] <*> [1,2,3]


Just (Star {starName = "Sol", starMass = 2.0e30})
Nothing
[2,4,6,1,4,9]

(The dreaded) Monads!

  • Could also define something that handles (a -> f b) -> f a -> f b
    • This operator is called >>= in Haskell
    • Also need an operator, similar to pure. This is (unfortunately) called return in Haskell.
  • Useful for handling chains of things which carry some context around with them
  • This is how Haskell handles IO actions!

In [21]:
:opt no-lint -- ignore the lint behind the curtain!

In [22]:
posAdd :: Int -> Int -> Maybe Int
posAdd n x
    | n+x > 0 = Just (n+x)
    | otherwise = Nothing
    
posMul :: Int -> Int -> Maybe Int
posMul n x
    | n*x > 0 = Just (n*x)
    | otherwise = Nothing

return 1 >>= posAdd 1 >>= posMul 2
return 1 >>= posAdd 1 >>= posMul (-2)
return 1 >>= posAdd (-1) >>= posMul 2


Just 4
Nothing
Nothing

Haskell Syntax (For reference!)

Basic expressions


In [23]:
-- this is a comment

-- Basic types:
1   -- Integer
1.0 -- Double
True -- Boolean
'a' -- Char
"abc" -- String
(1,'a',"Alpha") -- Tuple
[1,2,3] -- List

-- Lists
1:[2,3] -- [1,2,3]

-- Pattern matching
(a,b) = (1,2) -- a and b are 1 and 2, respectively
(x:xs) = [1,2,3] -- x is 1, xs is [2,3]

-- Function application
negate 1 -- -1
not True -- False
gcd 15 20 -- 5
1 + 2 -- 3. Operators are just infix functions!
(+) 1 2 -- Prefix version


1
1.0
True
'a'
"abc"
(1,'a',"Alpha")
[1,2,3]
[1,2,3]
-1
False
5
3
3

Definitions and Types


In [24]:
-- Definitions
someVal = 3 -- Constant value
increment x = x + 1 -- Function taking one argument
add x y = x + y -- Function taking multiple arguments

-- Multiple definitions
sum' (x:xs) = x + sum' xs -- Recursively sum up a list
sum' [] = 0 -- Base case for empty list
 
-- Specifying types
someVal :: Int -- SomeVal is an Int
3 :: Float -- Here we want to make sure that 3 is read as a Float

decrement :: Int -> Int -- decrement is a function which takes an Int and returns an Int
decrement n = n - 1

multiply :: Int -> Int -> Int -- multiply takes 2 Ints and returns an Int
multiply x y = x * y

-- another view: multiply takes an Int and returns a function from Int to Int
triple :: Int -> Int
triple = multiply 3

-- Generic functions:
id' :: a -> a -- id' is a function which works for any type `a`. It takes an `a` and returns an `a`
id' x = x -- id' pretty much has to be defined this way, given its type signature

-- Constrained Generics
increment' :: Num a => a -> a -- A more generic signature for increment: will work for any Numeric type
increment' x = x + 1 -- Same function as before here


3.0

Putting all that syntax together


In [25]:
map' :: (a -> b) -> [a] -> [b] -- map' takes a function from `a` to `b` and a list of `a`s, and returns a list of `b`s
map' f (x:xs) = f x : map' f xs -- apply `f` to the first element, and then to the rest
map' _ [] = [] -- mapping over an empty list just returns the empty list.
               -- We don't care what `f` is here, so we write "_" as a placeholder (same as in Python)

:t show
map' show [1,2,3,4]


show :: forall a. Show a => a -> String
["1","2","3","4"]