Skip to content

Latest commit

Β 

History

History
507 lines (353 loc) Β· 8.95 KB

File metadata and controls

507 lines (353 loc) Β· 8.95 KB

Functional Programming in Python

β€”

About me

Data Scientist @ idalab (mainly Python) Used Ruby, JS, Python, Haskell, Swift for nontrivial projects Played with Clojure, Scala, Erlang, Elixir

http://kirelabs.org/fun-js

β€”

About this talk

  • Not a motivation of functional programming
  • How can FP by used in Python

β€”

Disclaimer

There should be one β€” and preferably only one β€” obvious way to do it. -- PEP 20 β€” The Zen of Python

^ Python has a philosophy and many of the things you will see go against this philosopy This goes so far that

β€”

Disclaimer (cont)

The fate of reduce() in Python 3000

not having the choice streamlines the thought process -- Guido van Rossum

^ 2005 Disagree: language should support me Don’t recommend you to use what I’m presenting here Show you the entrance to the rabbit hole Jupyter Notebook! Don’t be afraid to interrupt me!

β€”

Functional Programming (in Python)

  • first class functions
  • higher order functions
  • purity
  • immutability
  • composition
  • partial application & currying
  • recursion

^ Vocabulary of fuctional concepts

β€”

Functional Programming (in Python)

  • first class functions
  • higher order functions
  • purity
  • immutability (not today)
  • composition
  • partial application & currying
  • recursion (neither)

β€”

Purity

functions without side-effects

def add(a, b):
	return a + b
	
additions_made = 0
def add(a, b):
	global additions_made
	additions_made += 1
	return a + b

β€”

First class functions

def add(a, b):
	return a + b
	
add_function = add
	
add = lambda a,b: a + b

β€”

higher order functions

def timer(fn):
	def timed(*args, **kwargs):
		t = time()
		fn(*args, *kwargs)
		print "took {time}".format(time=time()-t)
		
	return timed

def compute():
	#…

timed_compute = timer(compute)
timed_compute()

β€”

Decorators

@timer
def compute():
    sleep(1)
    
compute()

β€”

Partial function application

def add1(num):
	return add(1, num)
add1(1)

# simpler
from functools import partial
add1 = partial(add, 1)
add1(1)

^ Toy example - just building up a toolbox for the fun stuff

β€”

Currying

[…] transforming a function that takes multiple arguments in such a way that it can be called as a chain of functions, each with a single argument (partial application) β€” Wikipedia

β€”

Currying

def curried_add(a):
	def inner(b):
		return add(a,b)
	return inner
		
add(1)    # => <function …>
add(1)(1) # => 2

β€”

Currying

from toolz import curry
add = curry(add)

add(1)    # => <function …>
add(1, 1) # => 2

β€”

Interlude: Closures

def curried_add(a):
	def inner(b):
		return add(a,b)
	return inner
		
add(1)    # => <function …>
add(1)(1) # => 2

β€”

Currying example from the stdlib

from operator import itemgetter, attrgetter, methodcaller

obj.method()

from operator import methodcaller
methodcaller("method")(obj)

^ We will see how this is useful

β€”

[fit] Functional collection transformations

β€”

map

map(f, iter)

[f(el) for el in seq]

β€”

filter

filter(p, seq)

[el for el in seq if p(el)]

β€”

reduce

from functools import reduce
reduce(f, seq, initial)

result = initial
for el in seq:
	result = f(result, el)

β€”

^ almost every time I see a reduce() call with a non-trivial function argument, I need to grab pen and paper to diagram what's actually being fed into that function before I understand what the reduce() is supposed to do

β€”

function composition

[f(x) for x in seq if p(x)]

map(f, filter(p, seq))

from toolz.curried import compose, map, filter
compute = compose(map(f), filter(p))
compute(seq)

^ Illustrational purposes

β€”

Example: A bad CSV parser (1/3)

csv = """firstName;lastName
Jim;Drake
Ben;James
Tim;Banes"""

target = [{'firstName': 'Jim', 'lastName': 'Drake'},
          {'firstName': 'Ben', 'lastName': 'James'},
          {'firstName': 'Tim', 'lastName': 'Banes'}]

β€”

Example: Imperative Python (2/3)

lines = csv.split("\n")
matrix = [line.split(';') for line in lines]
header = matrix.pop(0)
records = []
for row in matrix:
    record = {}
    for index, key in enumerate(header):
        record[key] = row[index]
    records.append(record)

β€”

Example: Functional Python (3/3)

from toolz.curried import compose, map
from functools import partial
from operator import methodcaller

split = partial(methodcaller, 'split')
split_lines = split("\n")
split_fields = split(';')
dict_from_keys_vals = compose(dict, zip)
csv_to_matrix = compose(map(split_fields), split_lines)

matrix = csv_to_matrix(csv)
keys = next(matrix)
records = map(partial(dict_from_keys_vals, keys), matrix)

β€”

Example: PySpark

docker run --rm -v ${PWD}:/home/jovyan/work -p 8888:8888 jupyter/pyspark-notebook
def sample(p):
    x, y = random(), random()
    return 1 if x*x + y*y < 1 else 0

count = sc.parallelize(range(0, NUM_SAMPLES)) \
	.map(sample) \
	.reduce(lambda a, b: a + b)
print("Pi is roughly %f" % (4.0 * count / NUM_SAMPLES))

^ http://spark.apache.org/docs/latest/programming-guide.html#transformations

β€”

Example: K-Means

(Stolen and modified from Joel Grus)

def kmeans(points, k):
	return until_convergence(
		iterate(
			find_new_means(points),
			random.sample(points, k)))

β€”

def until_convergence(it):
    return last(accumulate(no_repeat, it))
    
def no_repeat(prev, curr):
    if prev == curr: raise StopIteration
    else: return curr

β€”

import random
from toolz.curried import iterate, accumulate, curry, groupby, last, compose

def kmeans(k, points):
    return until_convergence(iterate(find_new_means(points), random.sample(points, k)))
    
@curry
def find_new_means(points, old_means):
    k = len(old_means)
    clusters = groupby(compose(str, closest_mean(old_means)), points).values()
    return list(map(cluster_mean, clusters))

β€”

@curry
def closest_mean(means, point):
    return min(means, key=squared_distance(point))

@curry
def squared_distance(p, q):
    return sum((p_i - q_i)**2 for p_i, q_i in zip(p, q))

β€”

def cluster_mean(points):
    num_points = len(points)
    dim = len(points[0]) if points else 0
    sum_points = [sum(point[j] for point in points)
                  for j in range(dim)]
    return [s / num_points for s in sum_points]

β€”

Main takeaways

  • FP is possible in Python (to a degree)
  • small composable functions are good
  • FP == build general tools and compose them

^ Functional programming enables writing small composable functions Decide for yourself and with your team if this is a good idea

β€”

[fit] Whats missing in Python (or what I am missing)

  • More list functions
  • Nicer lambda syntax
  • Automatic currying, composition syntax
  • ADTs (sum types)1
  • Pattern Matching

β€”

Functional libraries

(More list functions)

^ Matthew Rocklin last year’s keynote ALEXEY KACHAYEV

β€”

Nicer lambda syntax

map(lambda x: x**2, range(5)) # => [0, 1, 4, 9, 16]

from fn import _
map(_**2, range(5)) # => [0, 1, 4, 9, 16]

^ Might not work as expected in i.e. PySpark

β€”

Other interesting stuff

β€”

Other talks (where I have stolen material)

^ Matthew Rocklin PyData NYC 2013 - pytoolz ALEXEY KACHAYEV PyCon UA 2012 - fn.py Joel Grus: PyData Seattle 2015

β€”

More FP?

β€”

original

Thank you

Daniel Kirsch daniel.kirsch@idalab.de @kirel https://github.com/kirel/functional-python

Footnotes

  1. Possible but ugly http://stupidpythonideas.blogspot.de/2014/08/adts-for-python.html ↩