Agda by Example: λ-calculus

2013-08-10

In the previous Agda example we saw how we can approach the task of verifying a sorting function. This time, we are going to write a type checker for the simply typed λ-calculus, plus a simple optimisation on said terms that we will prove correct. As in the other post, the bulk of the thinking has been done by other people. The type checking is a modified version of an example given in The View from the Left by Conor McBride and James McKinna,1 while the optimisation is inspired by a talk given by Adam Chlipala at POPL 2013—his Coq book Certified Programming with Dependent Types contains many similar examples.
Let’s get started.

Useful imports

module Lambda where

open import Data.Nat using (ℕ; zero; suc; _+_; _≤?_; _≥_)
open import Data.Fin using (Fin; zero; suc; toℕ)
open import Data.List using (List; []; _∷_; length)
open import Data.Vec using (Vec; []; _∷_; lookup)
open import Relation.Binary.PropositionalEquality using (_≡_; refl; cong; cong₂)
open import Relation.Nullary using (Dec; yes; no)
open import Function using (_∘_; _$_)
open import Data.Product
After the module declaration, we include some useful modules from the Agda standard library:
  • Data.Nat defines natural numbers.
  • Data.Fin defines an inductive family representing the type of all numbers less than a given number. For instance, Fin 3 is inhabited by 0, 1, and 2. Another interpretation is that Fin n is the type inhabited by n elements.
  • Data.List, predictably, defines finite lists.
  • Data.Vec defines lists indexed by their length. This allows, for example, for safe indexing of elements.
  • Relation.Binary.PropositionalEquality defines propositional equality as presented in the previous post. cong₂ is the two-argument version of cong.
  • Relation.Nullary defines a type for decidable relations, Dec:
    data Dec (P : Set) : Set where
      yes : ( p :   P) → Dec P
      no  : (¬p : ¬ P) → Dec P
  • Function exports some common utilities regarding functions that should be familiar to the Haskell programmer, such as function composition (_∘_) and explicit application (_$_).

Simple types and raw terms

The language we are going to define is a simply typed λ-calculus. The types for our language will be of two kinds: natural numbers and functions (arrow types):
infixr 30 _⇒_
data Type : Set where
  nat : Type
  _⇒_ : Type → Type → Type
We will use de Bruijn indices to represent bound variables, where n represents the variable bound by the n-th λ going towards the top of the term. For example, using a Haskell-like notation for λs, we will have that
Named                         de Bruijn
λx → x                        λ 1
λx → λy → x                   λ λ 2
λx → λy → λz → x z (y z)      λ λ λ 3 1 (2 1)
and so on. de Bruijn notation is a terrible device for humans to use directly, but it is often more convenient when dealing with terms mechanically. When parsing a language it is easy to go from names to nameless, so that we can have the best of both worlds.
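To make the names-to-nameless step concrete, here is a minimal Haskell sketch; the Named/Nameless types and toDeBruijn are hypothetical (not part of the Agda development below), and indices are 1-based to match the table above.

import Data.List (elemIndex)

-- Named λ-terms and their nameless counterparts.
data Named    = NVar String | NLam String Named | NApp Named Named
data Nameless = Var Int     | Lam Nameless      | App Nameless Nameless

-- Convert to de Bruijn form, carrying a stack of binders: a variable's
-- index is its 1-based position in the stack; free variables fail.
toDeBruijn :: Named -> Maybe Nameless
toDeBruijn = go []
  where
    go env (NVar x)   = Var . (+ 1) <$> elemIndex x env
    go env (NLam x t) = Lam <$> go (x : env) t
    go env (NApp t u) = App <$> go env t <*> go env u

For example, toDeBruijn (NLam "x" (NLam "y" (NVar "x"))) gives Just (Lam (Lam (Var 2))), matching the second row of the table.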
Thus we will have:
infixl 80 _·_
data Syntax : Set where
  -- A variable, the ℕ being a de Bruijn index.
  var : ℕ → Syntax
  -- A number literal.
  lit : ℕ → Syntax
  -- Addition.
  _⊕_ : Syntax → Syntax → Syntax
  -- Function application.
  _·_ : Syntax → Syntax → Syntax
  -- Lambda abstraction, the type indicates the type of the argument: no
  -- type inference here.
  lam : Type → Syntax → Syntax

Typed terms, and type checking

Our λ-calculus has a very unsurprising type system. Typing judgements will be done in a context, holding the types of the abstracted variables. Calling the context Γ, we can give typing judgements for our syntax:
The type system for our language:

  ───────────────  (Lit)
  Γ ⊢ lit n : nat

  Γ ⊢ t : nat    Γ ⊢ u : nat
  ──────────────────────────  (Add)
  Γ ⊢ t ⊕ u : nat

  Γ(v) = τ
  ─────────────  (Var)
  Γ ⊢ var v : τ

  Γ ⊢ t : τ ⇒ σ    Γ ⊢ u : τ
  ──────────────────────────  (App)
  Γ ⊢ t · u : σ

  σ ∷ Γ ⊢ t : τ
  ───────────────────  (Lam)
  Γ ⊢ lam σ t : σ ⇒ τ
For those unfamiliar with typing rules like the ones above, they can be read as “given what’s above the bar, what’s below holds”. In our case we have judgements of the form Γ ⊢ t : τ, which can be read as “In context Γ, term t has type τ”.
Lit says that number literals have type nat. Add says that given two terms of type nat, we can add them together yielding another term of type nat. Var says that given a variable we can get its type by looking it up in the context—we haven’t defined what a context is but we will soon. App says that given a function of type τ ⇒ σ we can apply it to a term of type τ to get a term of type σ. Finally, if we have a term t : τ in a context with σ “at its head”, we can form a function of type σ ⇒ τ, typed in the context with σ removed (Lam).
Now, taking advantage of Agda’s facilities, we can define a Term data type to represent terms together with their type—a Term will essentially include a derivation that its type is valid.
We will represent the context as a Vec containing Types. The fact that Vecs carry their length in their type is useful, since it indicates how many variables are in scope, and thus what the largest admissible de Bruijn index is.
Ctx : ℕ → Set
Ctx = Vec Type
A type checked Term will be parametrised over a context Γ of some length n. Moreover, it will be indexed by its type. The type is an index because different constructors will be typed differently, just like in the typing derivations.
data Term {n} (Γ : Ctx n) : Type → Set where
  var : ∀ {τ} (v : Fin n) → τ ≡ lookup v Γ → Term Γ τ
  lit : ℕ → Term Γ nat
  _⊕_ : Term Γ nat → Term Γ nat → Term Γ nat
  _·_ : ∀ {σ τ} → Term Γ (σ ⇒ τ) → Term Γ σ → Term Γ τ
  lam : ∀ σ {τ} → Term (σ ∷ Γ) τ → Term Γ (σ ⇒ τ)
Note that in the var case we could have written
var : (v : Fin n) → Term Γ (lookup v Γ)
However this would have caused problems further down the line,2 and thus we quantify over a type τ and use an explicit equality.
Also in var, we use Fin to make sure that the index is indeed “in scope”. If the context is of length n, an index of type Fin n makes sure that we can look up the type of the variable, since Fin n represents the type of all naturals less than n:
lookup : ∀ {n} {A : Set} → Fin n → Vec A n → A
Closed terms will be those living in empty contexts:
Closed : Type → Set
Closed = Term []
For example, we can write a function doubling numbers in Term. The type will predictably be nat ⇒ nat:
double : Closed (nat ⇒ nat)
double = lam nat (var zero refl ⊕ var zero refl)
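For readers more at home in Haskell, roughly the same well-typed-by-construction representation can be sketched with GADTs. Everything here (Ty, Ix, the promoted context list) is a hypothetical rendering rather than part of this post's development, and a typed de Bruijn index plays the role of Fin plus an equality proof.

{-# LANGUAGE DataKinds, GADTs, TypeOperators #-}

data Ty = TNat | Ty :=> Ty

-- A variable is a pointer into the context that remembers its type.
data Ix ctx t where
  Here  :: Ix (t ': ctx) t
  There :: Ix ctx t -> Ix (s ': ctx) t

-- Terms indexed by their context and their type, as in Term above.
data Term ctx t where
  Var :: Ix ctx t -> Term ctx t
  Lit :: Integer -> Term ctx 'TNat
  Add :: Term ctx 'TNat -> Term ctx 'TNat -> Term ctx 'TNat
  App :: Term ctx (s ':=> t) -> Term ctx s -> Term ctx t
  Lam :: Term (s ': ctx) t -> Term ctx (s ':=> t)

-- The analogue of double: λx → x ⊕ x, closed and of type nat ⇒ nat.
double' :: Term '[] ('TNat ':=> 'TNat)
double' = Lam (Add (Var Here) (Var Here))

As in the Agda version, ill-scoped and ill-typed terms are simply not representable.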
We can also write a function that forgets the type information and gets us Syntax from Term:
erase : ∀ {n} {Γ : Ctx n} {τ} → Term Γ τ → Syntax
erase (var v _) = var (toℕ v)
erase (lit n)   = lit n
erase (t ⊕ u)   = erase t ⊕ erase u
erase (t · u)   = erase t · erase u
erase (lam σ t) = lam σ (erase t)

Type equality

Before writing our type checking function, we need some additional tools. We will begin by defining a decision procedure for equality between types. First, two lemmas telling us that if two arrow types are equal, both sides of the arrows are equal too:
≡⇒₁ : ∀ {σ σ′ τ τ′} → σ ⇒ τ ≡ σ′ ⇒ τ′ → σ ≡ σ′
≡⇒₁ refl = refl
≡⇒₂ : ∀ {σ σ′ τ τ′} → σ ⇒ τ ≡ σ′ ⇒ τ′ → τ ≡ τ′
≡⇒₂ refl = refl
Then the actual function. Dec is inhabited by either evidence that the relation holds, using the constructor yes; or that it doesn’t, using no:
_≟_ : (τ σ : Type) → Dec (τ ≡ σ)
nat   ≟ nat     = yes refl
nat   ≟ _ ⇒ _   = no λ()
_ ⇒ _ ≟ nat     = no λ()
σ ⇒ τ ≟ σ′ ⇒ τ′ with σ ≟ σ′ | τ ≟ τ′
σ ⇒ τ ≟ .σ ⇒ .τ | yes refl | yes refl = yes refl
σ ⇒ τ ≟ σ′ ⇒ τ′ | no  σ≢σ′ | _        = no (σ≢σ′ ∘ ≡⇒₁)
σ ⇒ τ ≟ σ′ ⇒ τ′ | _        | no τ≢τ′  = no (τ≢τ′ ∘ ≡⇒₂)
Note that in the cases where different constructors are present we use λ(), which is shorthand for a function with an empty pattern—meaning that its input cannot be inhabited. For example, when faced with nat ≟ σ ⇒ τ, to prove the inequality we need to show that ¬ (nat ≡ σ ⇒ τ), which is shorthand for (nat ≡ σ ⇒ τ) → ⊥—“We can derive falsity given that nat is equal to an arrow type.” But given how _≡_ is defined, Agda knows that its indices must be the same, and thus that nat ≡ σ ⇒ τ cannot be inhabited, allowing us to use an empty pattern.
Moreover, note how we use dependent pattern matching when recursing down the structure of the types: when pattern matching on an equality proof in a yes using refl the two relevant types are constrained to be the same using dotted patterns, which allows us to use refl again to prove equality of the type as a whole.
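The same decision procedure can be approximated in Haskell using a singleton type for types and Data.Type.Equality; this is a sketch under those assumptions (STy and decEq are hypothetical names), with Maybe standing in for Dec, so the evidence carried by no is lost.

{-# LANGUAGE DataKinds, GADTs, TypeOperators #-}

import Data.Type.Equality ((:~:) (Refl))

data Ty = TNat | Ty :=> Ty

-- Value-level witnesses of type-level Tys.
data STy t where
  SNat   :: STy 'TNat
  (:%=>) :: STy s -> STy t -> STy (s ':=> t)

decEq :: STy a -> STy b -> Maybe (a :~: b)
decEq SNat       SNat       = Just Refl
decEq (a :%=> b) (c :%=> d) = do
  Refl <- decEq a c   -- matching Refl equates the domains,
  Refl <- decEq b d   -- much like the dotted patterns above
  Just Refl
decEq _          _          = Nothing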

From ℕ to Fin

The second tool that we will use is a “view” to tell us whether a given ℕ “fits” in a Fin n.
data Fromℕ (n : ℕ) : ℕ → Set where
  yes : (m : Fin n) → Fromℕ n (toℕ m)
  no  : (m : ℕ)     → Fromℕ n (n + m)
Fromℕ is parametrised over the “upper bound” n, and indexed over the ℕ that we are bringing into Fin n. We can then write a function that tries to convert a number m to a Fin n:
fromℕ : ∀ n m → Fromℕ n m
fromℕ zero    m    = no m
fromℕ (suc n) zero = yes zero
fromℕ (suc n) (suc m) with fromℕ n m
fromℕ (suc n) (suc .(toℕ m)) | yes m = yes (suc m)
fromℕ (suc n) (suc .(n + m)) | no  m = no m
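In Haskell, without indexed types, the same view degenerates into an Either, but the recursion is identical; a sketch with a hypothetical name fromN, where Left i means the input fits (with i < n) and Right k means the input was n + k:

-- fromN n m: decide whether m fits below n.
fromN :: Int -> Int -> Either Int Int
fromN 0 m = Right m
fromN _ 0 = Left 0
fromN n m = case fromN (n - 1) (m - 1) of
              Left  i -> Left  (i + 1)
              Right k -> Right k

What the Agda version adds is that the indices force yes and no to return exactly the right numbers; the Haskell sketch merely promises it in a comment.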

Type checking

Finally we can proceed to the actual type checking procedure. As with Fromℕ, we define a data type indexed over what will be the input of our function—some piece of untyped Syntax:
data Check {n} (Γ : Ctx n) : Syntax → Set where
  yes : (τ : Type) (t : Term Γ τ) → Check Γ (erase t)
  no  : {e : Syntax} → Check Γ e
Check is parametrised over contexts. If type checking succeeds, we return a type τ and a term t : τ, the erasure of t being the Syntax we were type checking—the yes constructor. We can fail at any time using the constructor no.3
Our type checking procedure will work with a context and some Syntax:
check : ∀ {n} (Γ : Ctx n) (t : Syntax) → Check Γ t
With var, we make sure that the index is in scope using fromℕ. If it is, the type is determined by what’s in the context at the right index:
check {n} Γ (var v) with fromℕ n v
check {n} Γ (var .(toℕ v)) | yes v = yes (lookup v Γ) (var v refl)
check {n} Γ (var .(n + m)) | no  m = no
Literals are of type nat:
check Γ (lit n) = yes nat (lit n)
For additions, we recursively type check the two sides. If they are both nat, the resulting term is nat too. Otherwise, type checking fails:
check Γ (t ⊕ u) with check Γ t | check Γ u
check Γ (.(erase t) ⊕ .(erase u)) | yes nat t | yes nat u = yes nat (t ⊕ u)
check Γ (_ ⊕ _)                   | _         | _         = no
Note that when checking against the results of the recursive calls, we use dotted matches to show the relationship between the input Syntax terms and the type checked terms: the former is the erasure of the latter. This way we can invoke yes so that the result will be well typed.
The interactive Agda mode is tremendously useful in these situations. If we are in the situation
check Γ (t ⊕ u) with check Γ t | check Γ u
check Γ (t ⊕ u) | p1 | p2 = {!!}
We can pattern match by placing p1 in the hole and then invoking C-c C-c, which will result in
check Γ (.(erase t) ⊕ u) | yes τ t | p2 = {!!}
Agda automatically places the dotted patterns. Then after doing the same for p2 and matching nat, we are left with
check Γ (.(erase t) ⊕ .(erase u)) | yes nat t | yes nat u = {!!}
The goal for that hole, as Agda reminds us, is
Check Γ (erase t ⊕ erase u)
We can type yes and then C-r to refine the hole, and then C-a in each hole to tell Agda to try to search for an appropriate term, which in this case succeeds. In general with strong enough types we only write an outline of the function and then a lot of details can be filled in automatically.
For what concerns application, we check that the term we are applying something to has a function type, and that the argument’s type is equal to the domain of said function:
check Γ (t · u) with check Γ t | check Γ u
check Γ (.(erase t) · .(erase u)) | yes (σ ⇒ τ) t | yes σ′ u with σ ≟ σ′
check Γ (.(erase t) · .(erase u)) | yes (σ ⇒ τ) t | yes .σ u | yes refl = yes τ (t · u)
check Γ (.(erase t) · .(erase u)) | yes (σ ⇒ τ) t | yes σ′ u | no  _    = no
check Γ (t · u)                   | _             | _       = no
Finally, with λs, we check that the body of the function is well typed in the context plus the type of the argument. If that’s the case, the resulting type is an arrow type:
check Γ (lam σ t) with check (σ ∷ Γ) t
check Γ (lam σ .(erase t)) | yes τ t = yes (σ ⇒ τ) (lam σ t)
check Γ (lam σ t)          | no      = no
Type checking the syntax for double will give good results:
> C-c C-n check [] (lam nat (var 0 ⊕ var 0))
yes (nat ⇒ nat) (lam nat (var zero refl ⊕ var zero refl))
While ill-typed examples will fail:
> C-c C-n check [] (lam nat (var 0 · var 0))
no

> C-c C-n check [] (lam (nat ⇒ nat) (var 0 ⊕ var 0))
no
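For comparison, the same algorithm with none of the intrinsic guarantees fits in a few lines of Haskell, returning Maybe Ty instead of a Check value; Syn, Ty and infer are hypothetical stand-ins for Syntax, Type and check.

data Ty  = TNat | Ty :=> Ty deriving (Eq, Show)
data Syn = Var Int | Lit Integer | Syn :+: Syn | Syn :.: Syn | Lam Ty Syn

infer :: [Ty] -> Syn -> Maybe Ty
infer ctx (Var v)                  -- the in-scope check fromℕ performs
  | v < length ctx = Just (ctx !! v)
  | otherwise      = Nothing
infer _   (Lit _)   = Just TNat
infer ctx (t :+: u) = do           -- both sides must be TNat
  TNat <- infer ctx t
  TNat <- infer ctx u
  Just TNat
infer ctx (t :.: u) = do           -- the domain must match the argument
  s :=> r <- infer ctx t
  s'      <- infer ctx u
  if s == s' then Just r else Nothing
infer ctx (Lam s t) = (s :=>) <$> infer (s : ctx) t

Here infer [] (Lam TNat (Var 0 :+: Var 0)) yields Just (TNat :=> TNat), while infer [] (Lam TNat (Var 0 :.: Var 0)) yields Nothing; unlike check, though, nothing ties the result to the input term.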

Embedding terms

Now that we have a type checker, we can embed typed terms into Agda. This technique is a common and cheap way to give semantics to programming languages that we are modelling inside a theorem prover—we borrow the host language semantics.
In our case, we can easily define a function taking Type into Agda’s Set:
⟦_⟧ : Type → Set
⟦ nat ⟧   = ℕ
⟦ σ ⇒ τ ⟧ = ⟦ σ ⟧ → ⟦ τ ⟧
Then, to transport an arbitrary Term Γ τ to ⟦ τ ⟧, we need an environment storing values for the bound variables:
infixr 5 _∷_
data Env : ∀ {n} → Ctx n → Set where
  []  : Env []
  _∷_ : ∀ {n} {Γ : Ctx n} {τ} → ⟦ τ ⟧ → Env Γ → Env (τ ∷ Γ)
Env Γ stores values corresponding to the types stored in Γ. We can extend an existing Env by consing a value of the right type to it, and extending the context.
We can also look up values in an Env, just like we can index into Vecs:
lookupEnv : ∀ {n} {Γ : Ctx n} (m : Fin n) → Env Γ → ⟦ lookup m Γ ⟧
lookupEnv zero    (x ∷ _)   = x
lookupEnv (suc n) (_ ∷ env) = lookupEnv n env
With these tools, we can define a function evaluating a Term into an Agda value of the right type:
_[_] : ∀ {n} {Γ : Ctx n} {τ} → Env Γ → Term Γ τ → ⟦ τ ⟧
env [ var v refl ] = lookupEnv v env
env [ lit n      ] = n
env [ t ⊕ u      ] = env [ t ] + env [ u ]
env [ t · u      ] = (env [ t ]) (env [ u ])
env [ lam _ t    ] = λ x → (x ∷ env) [ t ]
Now we can verify, for example, that double does the right thing:
double′ : ⟦ nat ⇒ nat ⟧
double′ = [] [ double ] -- A closed term, in an empty environment

doubleTest : double′ 3 ≡ 6
doubleTest = refl
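Without dependent types the borrowing trick is not available: a Haskell sketch of evaluation has to tag values and can crash at runtime on ill-typed terms, which is exactly what _[_] rules out by construction. Val and eval are hypothetical, and Syn/Ty repeat the earlier checker sketch.

data Ty  = TNat | Ty :=> Ty
data Syn = Var Int | Lit Integer | Syn :+: Syn | Syn :.: Syn | Lam Ty Syn

-- A tagged value domain, in place of ⟦_⟧.
data Val = VNat Integer | VFun (Val -> Val)

eval :: [Val] -> Syn -> Val
eval env (Var v)   = env !! v
eval _   (Lit n)   = VNat n
eval env (t :+: u) = case (eval env t, eval env u) of
  (VNat m, VNat n) -> VNat (m + n)
  _                -> error "ill-typed addition"
eval env (t :.: u) = case eval env t of
  VFun f -> f (eval env u)
  _      -> error "ill-typed application"
eval env (Lam _ t) = VFun (\x -> eval (x : env) t)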

Constant folding

To conclude, we will write a simple optimisation on our terms, and prove that it is sound by showing that it always preserves the semantics of the original term, using the embedding above.
To represent “optimised” terms, we will use a record parametrised over the original term and holding the optimised version, plus evidence that for all environments evaluating the original and optimised term will yield the same Agda value:
record Optimised {n} {Γ : Ctx n} {σ} (t : Term Γ σ) : Set where
  constructor opt
  field
    optimised : Term Γ σ
    sound     : ∀ {env} → env [ t ] ≡ env [ optimised ]
However, to succeed, we need to postulate an axiom:
postulate ext : ∀ {A B : Set} {f g : A → B} → ({x : A} → f x ≡ g x) → f ≡ g
This axiom says that if we have two functions f and g, together with evidence that for all inputs the functions have equal outputs, then the functions themselves can be considered equal. This is usually known as functional extensionality, and we cannot prove it in Agda or in theorem provers based on similar theories.4 Postulating extensionality is still (hopefully) consistent with Agda’s theory, although obviously we will not be able to compute with it—definitions pattern matching on equality proofs will get stuck on invocations of ext.
With this axiom and our record Optimised, we will write the optimisation and the proof at the same time. The optimisation will consist of reducing constant expressions to number literals, so that for example
lit 3 ⊕ lit 7
will be reduced to
lit 10
So, our function will take a term and produce an optimised version of it:
cfold : ∀ {n} {Γ : Ctx n} {τ} (t : Term Γ τ) → Optimised t
Variables and literals stay untouched, so the proof of equality will be trivial:
cfold (var v p) = opt (var v p) refl
cfold (lit x)   = opt (lit x)   refl
For applications, we call cfold recursively, and then we use cong₂ to combine the resulting proofs of soundness—since evaluating functions means applying them, we use _$_:
cong₂ : {A B C : Set} (f : A → B → C) {x y u v} → x ≡ y → u ≡ v → f x u ≡ f y v
_$_   : ∀ {A : Set} {B : A → Set} → ((x : A) → B x) → ((x : A) → B x)
cong₂ _$_ : {A B : Set} {f g : A → B} {x y : A} → f ≡ g → x ≡ y → (f $ x) ≡ (g $ y)
cfold (t · u) with cfold t | cfold u
... | opt t′ p | opt u′ q = opt (t′ · u′) (cong₂ _$_ p q)
For λs, we need to use ext to prove that the optimised function is equal to the original one:
cfold (lam σ t) with cfold t
... | opt t′ p = opt (lam σ t′) (ext p)
Finally, with additions, we perform the actual optimisation. If both sides after optimisation are number literals we can replace them with another literal. Otherwise we leave the addition in place. We also combine the proofs of equality of the sides with cong₂ _+_, given that the denotation uses _+_.5
cfold (t ⊕ u) with cfold t | cfold u
... | opt (lit n) p | opt (lit m) q = opt (lit (n + m)) (cong₂ _+_ p q)
... | opt t′      p | opt u′      q = opt (t′ ⊕ u′) (cong₂ _+_ p q)
And that’s it.
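As a postscript, the optimisation by itself, with no soundness proof travelling alongside it, is ordinary Haskell over the raw syntax (the hypothetical Syn/Ty from the earlier sketches); the Optimised record is precisely what this version lacks.

data Ty  = TNat | Ty :=> Ty
data Syn = Var Int | Lit Integer | Syn :+: Syn | Syn :.: Syn | Lam Ty Syn

cfold :: Syn -> Syn
cfold (t :+: u) = case (cfold t, cfold u) of
  (Lit m, Lit n) -> Lit (m + n)    -- the actual folding step
  (t', u')       -> t' :+: u'
cfold (t :.: u) = cfold t :.: cfold u
cfold (Lam s t) = Lam s (cfold t)
cfold t         = t                -- variables and literals unchanged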
You can comment this post on reddit.

  1. See also the Agda version by Ulf Norell in his Dependently Typed Programming in Agda tutorial, after which some of the data types in this tutorial are patterned.
  2. See the footnote at the end of the section about constant folding.
  3. Notice that the no constructor does not include evidence that the term is ill-typed. This could be fixed by having a data type representing ill-typed terms.
    Precise types are usually a good idea, not only because they describe the program better to humans, but also because machines can do more with them. For example, when using Agsy with Check, no is always going to be a valid candidate, even if the term we are checking is well typed.
    Here we have chosen a simpler way for brevity, especially since the “error” data type would be fairly boring.
  4. It has been an ongoing challenge to find a workable theory where functional extensionality holds—a recent proposal is observational equality.
    With “workable” we mean among other things a theory where type checking is decidable, something that we lose if for example we “solve” the problem in a naïve way by adding a rule to derive definitional equality (the meta-judgement in the type checker for term equality) from propositional equality.
    More broadly the theme of equality is a very debated one in type theory, with one related development getting a lot of people excited lately.
  5. If we had declared var with lookup v Γ as an index, we would have been unable to pattern match on the optimised term here, since Agda would have got stuck trying to decide whether there should be a case for var.

The Importance of Realism in Startups

I’ve done a lot of video interviews. This is one of my favorites, if not my favorite outright.
It’s only 12 minutes long and if you’re a first-time entrepreneur (or second time, frankly) I encourage you to watch it if for nothing else than to get a sense that your struggles are universal.
TechCrunch interviewed me and asked me to talk about failure. So I spoke for 12 minutes about my own failures. I made many classic first-time mistakes which makes it easier for me to spot when others make similar bad choices.
It serves as a warning signal of which teams to avoid funding, especially if I perceive they will make critical mistakes, often led by hubris or naïveté. An obvious example is if they talk about M&A deals, teams they could just “bolt on” or “doing a rollup in their industry.”
It’s why I will never fund Conference Ho’s. I know that this is driven from an unhealthy ego, self-centrism and lack of regard for running options of one’s startup.
Vomit.
My errors also serve as my source for coaching other teams on more benign mistakes, like over-building functionality, over-complicating the product or hiring people who are too senior.
I think failure is critical for many reasons. Mostly because it makes us better leaders.
We learn from mistakes. We learn from losses.
In part I felt it was important to let people know that we all have failure and make mistakes.
As I’ve said before, all startups need to realize that every other company still has to see itself naked in the mirror in the morning. Stop reading their press releases or hearing their founder talk about how he is crushing it. We all know that people who truly are crushing it rarely talk about it.
Once you realize that we’re all the same, all dealing with the same pressures, fears and struggles – you’ll learn to keep more focused on what you’re doing and not whatever everybody else is doing.
In my “failure interview” with TechCrunch I talked about the biggest stress that really comes from startups – dealing with all the other people with whom you work. Startups are filled with enormously talented people – often product people & engineers. As an industry we’re hardly used to talking openly about feelings or resolving conflicts.
It’s why I believe startup coaches are so important and I wish I knew more great ones. If you have great experiences please leave names in the comments section.

The Success Bias

In the end it’s easy to look back triumphantly at our startup experiences and define every move as heroic.
We of course remember the positive outcomes, the rewards, the press celebrations at key moments or at the finish line. We of course get all of the accolades if at the tape there was a financial pay day.
And with so many acquihires these days you never really know who was financially successful and who was safely landing a plane with no engines.
It’s certainly nice to look at your past accomplishments in your bio.
Yet these only tell the stories from one side of the startup ledger. They are whitewashed falsehoods that mask the struggles. They are only one aspect of the startup experience.
Even along the journey and nowhere near the destination I see many startups with their chests pumped out touting their latest deals, showing off their swish offices funded by millions of venture funding (and not necessarily yet the commensurate business success to afford said offices or perks).
I understand the temptation. In a world with too much tech hype many teams feel the need to constantly spin.
I prefer the opposite.
I prefer realism in startups. It’s part of my stump speech to first-time founders or university students.
Avoid the stupid mistake I made (and talk about in the video):
  • raising too much money too quickly
  • building too many features (a mile wide & an inch deep)
  • getting too much press before we were ready
  • focusing on M&A to fix our problems
  • believing our own hype
Most of the days at a startup are a grind. While you’re in the moment it feels like there are as many failures as there are successes.
Even success feels hollow. I had a friend who was on the front page of the business section of one of the top newspapers in the country while his company was 30 days from running out of cash. And in all seriousness the article prompted his relative to hit him up for money.
Every first-time entrepreneur who has raised millions in VC will know the surrealism of people calling you a millionaire while you are figuring out whether you can really afford to pay for a vacation since your credit card is already a little bit bruised.
We put on our brave faces and turn up everyday hoping that in the end we won’t feel like frauds. In fact, I believe that one of the largest motivators for startups to avoid the ultimate failure is to avoid the humiliation of not having every positive press mention seem like you were a phony.
It’s not that you don’t believe in your ultimate outcome – you have to believe in order to be insane enough to continue the journey against all odds – it’s just that there is nagging self doubt.
There are of course also external factors you can’t control. You think investors will continue to finance you – they promised they would – but you never really know. Until you know.
In the end you don’t always get the answer you had hoped for.

Youth vs. Wisdom

In your youth you have the bravado to face uncertainty with the blind optimism that success is inevitable. In short, you don’t know what you don’t know. I keep coaching an investor friend of mine that this can be a good thing at times and that he shouldn’t be too quick to discount lack of experience from a team with serious talent and full of ambition.
This “naive optimism” is why I believe younger entrepreneurs are more likely to produce insanely big outcomes.
Yet youth often brings a triumphalism that blindsides entrepreneurs into missing the macro picture.
It is why younger entrepreneurs are more likely to drink their own Kool-Aid, which of course is dangerous. In bull markets many credit themselves with brilliance and industry stewardship when perhaps they are merely riding an ephemeral trend fueled by speculative capital.
Age brings wisdom. Timidity, too. And sometimes cynicism. But age brings perspective. If older entrepreneurs are more cautious it’s because life’s experiences have taught them to be so.
Older entrepreneurs tend to spend cash more wisely, for example. They feel less of a rush to keep up with the Joneses since they’ve seen a few boom-and-bust cycles and they know it’s a marathon.
I find older entrepreneurs more willing to have pragmatic debates about competition as well. They realize that there is often more to be gained by attacking the existing market structure than each other.
Older entrepreneurs tend to avoid lawsuits where possible. There is less ego. Younger people still like to fight.
And for the most part they shy away from premature press because they know the consequences of getting over one’s skis.

The Case for Realism

I try my best to blog from a realistic perspective, partly because I believe it’s important for people on the journey to have a realistic perspective and not feel ashamed of their progress or performance.
It’s why I wrote one of my most read posts – Entrepreneurshit.
Because I know many people at big & successful companies or at fast-growing startups I know that even they have struggles, doubt, insecurities, fear of failure.
If you knew that it might help you realize that your failures are not so special.
There is always a tomorrow – even after bankruptcy. A second act. A new career if not a humbled one.
That’s why I loved the TechCrunch interview.
It gave me a chance to make sure that wherever you are in your career path you would know that we’ve all been there.
Even if our bios don’t mention it.
Failure is ok. It’s not the same as losing or being a loser. It’s a set back.
And it’s how you handle your failures that define you more than anything else.
****
p.s. If any of you were at the Foundry Group rock party in San Francisco where Ryan McIntyre and Seth Levine rocked with many other VCs (David Cremin, David Pakman, who am I forgetting?) then you’ll know how heroic this TechCrunch interview really was. It came with one hell of a hangover that I blame on unnamed LPs (ahem) and on my inability to say no to a challenge (and whiskey). Which come to think of it means I’ve really learned nothing at all in my old age. It came after only a few hours of sleep in which I had a meeting in the morning with an LP who will be smiling if he reads this because he knows that I had to stand for my whole meeting with him in order to get through the meeting.
Posted in Startup Advice

Mark Suster is a 2x entrepreneur who has gone to the Dark Side of VC. He joined Upfront Ventures in 2007 as a General Partner after selling his company to Salesforce.com. He focuses on early-stage technology companies. Read more about Mark.

The Evolution of a Haskell Programmer



Fritz Ruehr, Willamette University

See Iavor Diatchki’s page “The Evolution of a Programmer” for the “original” (though he is not the author), and also below for the story behind this version.

   (This page has been translated into the Serbo-Croatian language by Anja Skrba from Webhostinggeeks.com. Thanks, Anja, for all your hard work!)
 

Freshman Haskell programmer

fac n = if n == 0 
           then 1
           else n * fac (n-1)
 

Sophomore Haskell programmer, at MIT

(studied Scheme as a freshman)
fac = (\(n) ->
        (if ((==) n 0)
            then 1
            else ((*) n (fac ((-) n 1)))))
 

Junior Haskell programmer

(beginning Peano player)
fac  0    =  1
fac (n+1) = (n+1) * fac n
 

Another junior Haskell programmer

(read that n+k patterns are “a disgusting part of Haskell” [1]
and joined the “Ban n+k patterns”-movement [2])
fac 0 = 1
fac n = n * fac (n-1)
 

Senior Haskell programmer

(voted for   Nixon   Buchanan   Bush — “leans right”)
fac n = foldr (*) 1 [1..n]
 

Another senior Haskell programmer

(voted for   McGovern   Biafra   Nader — “leans left”)
fac n = foldl (*) 1 [1..n]
 

Yet another senior Haskell programmer

(leaned so far right he came back left again!)
-- using foldr to simulate foldl

fac n = foldr (\x g n -> g (x*n)) id [1..n] 1
 

Memoizing Haskell programmer

(takes Ginkgo Biloba daily)
facs = scanl (*) 1 [1..]

fac n = facs !! n
 

Pointless (ahem) “Points-free” Haskell programmer

(studied at Oxford)
fac = foldr (*) 1 . enumFromTo 1
 

Iterative Haskell programmer

(former Pascal programmer)
fac n = result (for init next done)
        where init = (0,1)
              next   (i,m) = (i+1, m * (i+1))
              done   (i,_) = i==n
              result (_,m) = m

for i n d = until d n i
 

Iterative one-liner Haskell programmer

(former APL and C programmer)
fac n = snd (until ((>n) . fst) (\(i,m) -> (i+1, i*m)) (1,1))
 

Accumulating Haskell programmer

(building up to a quick climax)
facAcc a 0 = a
facAcc a n = facAcc (n*a) (n-1)

fac = facAcc 1
 

Continuation-passing Haskell programmer

(raised RABBITS in early years, then moved to New Jersey)
facCps k 0 = k 1
facCps k n = facCps (k . (n *)) (n-1)

fac = facCps id
 

Boy Scout Haskell programmer

(likes tying knots; always “reverent,” he
belongs to the Church of the Least Fixed-Point [8])
y f = f (y f)

fac = y (\f n -> if (n==0) then 1 else n * f (n-1))
 

Combinatory Haskell programmer

(eschews variables, if not obfuscation;
all this currying’s just a phase, though it seldom hinders)
s f g x = f x (g x)

k x y   = x

b f g x = f (g x)

c f g x = f x g

y f     = f (y f)

cond p f g x = if p x then f x else g x

fac  = y (b (cond ((==) 0) (k 1)) (b (s (*)) (c b pred)))
 

List-encoding Haskell programmer

(prefers to count in unary)
arb = ()    -- "undefined" is also a good RHS, as is "arb" :)

listenc n = replicate n arb
listprj f = length . f . listenc

listprod xs ys = [ i (x,y) | x<-xs, y<-ys ]
                 where i _ = arb

facl []         = listenc  1
facl n@(_:pred) = listprod n (facl pred)

fac = listprj facl
 

Interpretive Haskell programmer

(never “met a language” he didn't like)
-- a dynamically-typed term language

data Term = Occ Var
          | Use Prim
          | Lit Integer
          | App Term Term
          | Abs Var  Term
          | Rec Var  Term

type Var  = String
type Prim = String


-- a domain of values, including functions

data Value = Num  Integer
           | Bool Bool
           | Fun (Value -> Value)

instance Show Value where
  show (Num  n) = show n
  show (Bool b) = show b
  show (Fun  _) = ""

prjFun (Fun f) = f
prjFun  _      = error "bad function value"

prjNum (Num n) = n
prjNum  _      = error "bad numeric value"

prjBool (Bool b) = b
prjBool  _       = error "bad boolean value"

binOp inj f = Fun (\i -> (Fun (\j -> inj (f (prjNum i) (prjNum j)))))


-- environments mapping variables to values

type Env = [(Var, Value)]

getval x env =  case lookup x env of
                  Just v  -> v
                  Nothing -> error ("no value for " ++ x)


-- an environment-based evaluation function

eval env (Occ x) = getval x env
eval env (Use c) = getval c prims
eval env (Lit k) = Num k
eval env (App m n) = prjFun (eval env m) (eval env n)
eval env (Abs x m) = Fun  (\v -> eval ((x,v) : env) m)
eval env (Rec x m) = f where f = eval ((x,f) : env) m


-- a (fixed) "environment" of language primitives

times = binOp Num  (*)
minus = binOp Num  (-)
equal = binOp Bool (==)
cond  = Fun (\b -> Fun (\x -> Fun (\y -> if (prjBool b) then x else y)))

prims = [ ("*", times), ("-", minus), ("==", equal), ("if", cond) ]


-- a term representing factorial and a "wrapper" for evaluation

facTerm = Rec "f" (Abs "n" 
              (App (App (App (Use "if")
                   (App (App (Use "==") (Occ "n")) (Lit 0))) (Lit 1))
                   (App (App (Use "*")  (Occ "n"))
                        (App (Occ "f")  
                             (App (App (Use "-") (Occ "n")) (Lit 1))))))

fac n = prjNum (eval [] (App facTerm (Lit n)))
 

Static Haskell programmer

(he does it with class, he’s got that fundep Jones!
After Thomas Hallgren’s “Fun with Functional Dependencies” [7])
-- static Peano constructors and numerals

data Zero
data Succ n

type One   = Succ Zero
type Two   = Succ One
type Three = Succ Two
type Four  = Succ Three


-- dynamic representatives for static Peanos

zero  = undefined :: Zero
one   = undefined :: One
two   = undefined :: Two
three = undefined :: Three
four  = undefined :: Four


-- addition, a la Prolog

class Add a b c | a b -> c where
  add :: a -> b -> c
  
instance              Add  Zero    b  b
instance Add a b c => Add (Succ a) b (Succ c)


-- multiplication, a la Prolog

class Mul a b c | a b -> c where
  mul :: a -> b -> c

instance                           Mul  Zero    b Zero
instance (Mul a b c, Add b c d) => Mul (Succ a) b d


-- factorial, a la Prolog

class Fac a b | a -> b where
  fac :: a -> b

instance                                Fac  Zero    One
instance (Fac n k, Mul (Succ n) k m) => Fac (Succ n) m

-- try, for "instance" (sorry):
-- 
--     :t fac four
 

Beginning graduate Haskell programmer

(graduate education tends to liberate one from petty concerns
about, e.g., the efficiency of hardware-based integers)
-- the natural numbers, a la Peano

data Nat = Zero | Succ Nat


-- iteration and some applications

iter z s  Zero    = z
iter z s (Succ n) = s (iter z s n)

plus n = iter n     Succ
mult n = iter Zero (plus n)


-- primitive recursion

primrec z s  Zero    = z
primrec z s (Succ n) = s n (primrec z s n)


-- two versions of factorial

fac  = snd . iter (one, one) (\(a,b) -> (Succ a, mult a b))
fac' = primrec one (mult . Succ)


-- for convenience and testing (try e.g. "fac five")

int = iter 0 (1+)

instance Show Nat where
  show = show . int

(zero : one : two : three : four : five : _) = iterate Succ Zero
 

Origamist Haskell programmer

(always starts out with the “basic Bird fold”)
-- (curried, list) fold and an application

fold c n []     = n
fold c n (x:xs) = c x (fold c n xs)

prod = fold (*) 1


-- (curried, boolean-based, list) unfold and an application

unfold p f g x = 
  if p x 
     then [] 
     else f x : unfold p f g (g x)

downfrom = unfold (==0) id pred


-- hylomorphisms, as-is or "unfolded" (ouch! sorry ...)

refold  c n p f g   = fold c n . unfold p f g

refold' c n p f g x = 
  if p x 
     then n 
     else c (f x) (refold' c n p f g (g x))
                         

-- several versions of factorial, all (extensionally) equivalent

fac   = prod . downfrom
fac'  = refold  (*) 1 (==0) id pred
fac'' = refold' (*) 1 (==0) id pred
 

Cartesianally-inclined Haskell programmer

(prefers Greek food, avoids the spicy Indian stuff;
inspired by Lex Augusteijn’s “Sorting Morphisms” [3])
-- (product-based, list) catamorphisms and an application

cata (n,c) []     = n
cata (n,c) (x:xs) = c (x, cata (n,c) xs)

mult = uncurry (*)
prod = cata (1, mult)


-- (co-product-based, list) anamorphisms and an application

ana f = either (const []) (cons . pair (id, ana f)) . f

cons = uncurry (:)

downfrom = ana uncount

uncount 0 = Left  ()
uncount n = Right (n, n-1)


-- two variations on list hylomorphisms

hylo  f  g    = cata g . ana f

hylo' f (n,c) = either (const n) (c . pair (id, hylo' f (c,n))) . f

pair (f,g) (x,y) = (f x, g y)


-- several versions of factorial, all (extensionally) equivalent

fac   = prod . downfrom
fac'  = hylo  uncount (1, mult)
fac'' = hylo' uncount (1, mult)
 

Ph.D. Haskell programmer

(ate so many bananas that his eyes bugged out, now he needs new lenses!)
-- explicit type recursion based on functors

newtype Mu f = Mu (f (Mu f))  deriving Show

in      x  = Mu x
out (Mu x) = x


-- cata- and ana-morphisms, now for *arbitrary* (regular) base functors

cata phi = phi . fmap (cata phi) . out
ana  psi = in  . fmap (ana  psi) . psi


-- base functor and data type for natural numbers,
-- using a curried elimination operator

data N b = Zero | Succ b  deriving Show

instance Functor N where
  fmap f = nelim Zero (Succ . f)

nelim z s  Zero    = z
nelim z s (Succ n) = s n

type Nat = Mu N


-- conversion to internal numbers, conveniences and applications

int = cata (nelim 0 (1+))

instance Show Nat where
  show = show . int

zero = in   Zero
suck = in . Succ       -- pardon my "French" (Prelude conflict)

plus n = cata (nelim n     suck   )
mult n = cata (nelim zero (plus n))


-- base functor and data type for lists

data L a b = Nil | Cons a b  deriving Show

instance Functor (L a) where
  fmap f = lelim Nil (\a b -> Cons a (f b))

lelim n c  Nil       = n
lelim n c (Cons a b) = c a b

type List a = Mu (L a)


-- conversion to internal lists, conveniences and applications

list = cata (lelim [] (:))

instance Show a => Show (List a) where
  show = show . list

prod = cata (lelim (suck zero) mult)

upto = ana (nelim Nil (diag (Cons . suck)) . out)

diag f x = f x x

fac = prod . upto
 

Post-doc Haskell programmer

(from Uustalu, Vene and Pardo’s “Recursion Schemes from Comonads” [4])
-- explicit type recursion with functors and catamorphisms

newtype Mu f = In (f (Mu f))

unIn (In x) = x

cata phi = phi . fmap (cata phi) . unIn


-- base functor and data type for natural numbers,
-- using locally-defined "eliminators"

data N c = Z | S c

instance Functor N where
  fmap g  Z    = Z
  fmap g (S x) = S (g x)

type Nat = Mu N

zero   = In  Z
suck n = In (S n)

add m = cata phi where
  phi  Z    = m
  phi (S f) = suck f

mult m = cata phi where
  phi  Z    = zero
  phi (S f) = add m f


-- explicit products and their functorial action

data Prod e c = Pair c e

outl (Pair x y) = x
outr (Pair x y) = y

fork f g x = Pair (f x) (g x)

instance Functor (Prod e) where
  fmap g = fork (g . outl) outr


-- comonads, the categorical "opposite" of monads

class Functor n => Comonad n where
  extr :: n a -> a
  dupl :: n a -> n (n a)

instance Comonad (Prod e) where
  extr = outl
  dupl = fork id outr


-- generalized catamorphisms, zygomorphisms and paramorphisms

gcata :: (Functor f, Comonad n) =>
           (forall a. f (n a) -> n (f a))
             -> (f (n c) -> c) -> Mu f -> c

gcata dist phi = extr . cata (fmap phi . dist . fmap dupl)

zygo chi = gcata (fork (fmap outl) (chi . fmap outr))

para :: Functor f => (f (Prod (Mu f) c) -> c) -> Mu f -> c
para = zygo In


-- factorial, the *hard* way!

fac = para phi where
  phi  Z             = suck zero
  phi (S (Pair f n)) = mult f (suck n)
  

-- for convenience and testing

int = cata phi where
  phi  Z    = 0
  phi (S f) = 1 + f

instance Show (Mu N) where
  show = show . int
 

Tenured professor

(teaching Haskell to freshmen)
fac n = product [1..n]


 

Background

On 19 June 2001, at the OGI PacSoft Tuesday Morning Seminar Series, Iavor Diatchki presented the paper “Recursion Schemes from Comonads” by Uustalu, Vene and Pardo [4]. I attended Iavor’s excellent presentation and remarked that I found the end of the paper rather anti-climactic: after much categorical effort and the definition of several generalized recursion combinators, the main examples were the factorial and Fibonacci functions. (Of course, I offered no better examples myself, so this was rather unfair carping.) Some time later, I came across Iavor’s "jokes" page, including a funny bit called “The Evolution of a Programmer” in which the traditional imperative "Hello, world" program is developed through several variations, from simple beginnings to a ridiculously complex extreme. A moment’s thought turned up the factorial function as the best functional counterpart of "Hello, world". Suddenly the Muse struck and I knew I must write out these examples, culminating (well, almost) in the heavily generalized categorical version of factorial provided by Uustalu, Vene and Pardo.
I suppose this is what you’d have to call “small-audience” humour.
PS: I’ve put all the code into a better-formatted text file for those who might like to experiment with the different variations (you could also just cut and paste a section from your browser).
PPS: As noted above, Iavor is not the original author of “The Evolution of a Programmer.” A quick web search suggests that there are thousands of copies floating around and it appears (unattributed) in humor newsgroups as far back as 1995. But I suspect some version of it goes back much further than that. Of course, if anyone does know who wrote the original, please let me know so that I may credit them here.


 

But seriously, folks, ...

On a more serious note, I think that the basic idea of the joke (successive variations on a theme, building in complexity) can serve a good pedagogical purpose as well as a humorous one. To that end, and for those who may not be familiar with all of the ideas represented above, I offer the following comments on the variations: The first version (straight recursion with conditionals) is probably familiar to programmers of all stripes; fans of LISP and Scheme will find the sophomore version especially readable, except for the funny spelling of “lambda” and the absence of “define” (or “defun”). The use of patterns may seem only a slight shift in perspective, but in addition to mirroring mathematical notation, patterns encourage the view of data types as initial algebras (or as inductively defined).
The use of more “structural” recursion combinators (such as foldr and foldl) is square in the spirit of functional programming: these higher-order functions abstract away from the common details of different instances of recursive definitions, recovering the specifics through function arguments. The “points-free” style (defining functions without explicit reference to their formal parameters) can be compelling, but it can also be over-done; here the intent is to foreshadow similar usage in some of the later, more stridently algebraic variations.
The accumulating-parameter version illustrates a traditional technique for speeding up functional code. It is the second fastest implementation here, at least as measured in terms of number of reductions reported by Hugs, with the iterative versions coming in third. Although the latter run somewhat against the spirit of functional programming, they do give the flavor of the functional simulation of state as used in denotational semantics or, for that matter, in monads. (Monads are woefully un-represented here; I would be grateful if someone could contribute a few (progressive) examples in the spirit of the development above.) The continuation-passing version recalls a denotational account of control (the references are to Steele’s RABBIT compiler for Scheme and the SML/NJ compiler).
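In the spirit of that request, one plausible “monadic Haskell programmer” entry might thread the accumulator through a State monad, making the simulated mutation explicit; this is a sketch, not part of the original page.

import Control.Monad.State

-- factorial with an explicit state thread for the accumulator
fac :: Integer -> Integer
fac n = evalState (go n) 1
  where
    go 0 = get                        -- the accumulator is the answer
    go k = modify (* k) >> go (k - 1)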
The fixed-point version demonstrates that we can isolate recursion in a general Y combinator. The combinatory version provides an extreme take on the points-free style inspired by Combinatory Logic, isolating dependence on variable names to the definitions of a few combinators. Of course we could go further, defining the Naturals and Booleans in combinatory terms, but note that the predecessor function will be a bit hard to accommodate (this is one good justification for algebraic types). Also note that we cannot define the Y combinator in terms of the others without running into typing problems (due essentially to issues of self-application). Interestingly, this is the fastest of all of the implementations, perhaps reflecting the underlying graph reduction mechanisms used in the implementation.
The list-encoded version exploits the simple observation that we can count in unary by using lists of arbitrary elements, so that the length of a list encodes a natural number. In some sense this idea foreshadows later versions based on recursive type definitions for Peano’s naturals, since lists of units are isomorphic to naturals. The only interesting thing here is that multiplication (numeric product) is seen to arise naturally out of combination (Cartesian product) by way of cardinality. Typing issues make it hard to express this correspondence as directly as we’d like: the following definition of listprod would break the definition of the facl function due to an occurs-check/infinite type:
listprod xs ys = [ (x,y) | x<-xs, y<-ys ]
Of course we could also simplify as follows, but only at the expense of obscuring the relationship between the two kinds of products:
listprod xs ys = [ arb | x<-xs, y<-ys ]
The interpretive version implements a small object language rich enough to express factorial, and then implements an interpreter for it based on a simple environment model. Exercises along these lines run all through the latter half of the Friedman, Wand and Haynes text ([6]), albeit expressed there in Scheme. We used to get flack from students at Oberlin when we made them implement twelve interpreters in a single week-long lab, successively exposing more of the implementation by moving the real work from the meta-language to the interpreter. This implementation leaves a whole lot on the shoulders of the meta-language, corresponding to about Tuesday or Wednesday in their week. Industrious readers are invited to implement a compiler for a Squiggol-like language of polytypic folds and unfolds, targeting (and simulating) a suitable categorical abstract machine (see [9]), and then to implement factorial in that setting (but don't blame me if it makes you late for lunch ...).
The statically-computed version uses type classes and functional dependencies to facilitate computation at compile time (the latter are recent extensions to the Haskell 98 standard by Mark Jones, and are available in Hugs and GHC). The same kinds of techniques can also be used to encode behaviors more often associated with dependent types and polytypic programming, and are thus a topic of much recent interest in the Haskell community. The code shown here is based on an account by Thomas Hallgren (see [7]), extended to include factorial. Prolog fans will find the definitions particularly easy to read, if a bit backwards.
The first of the “graduate” versions gets more serious about recursion, defining natural numbers as a recursive algebraic datatype and highlighting the difference between iteration and primitive recursion. The “origamist” and “cartesian” variations take a small step backwards in this regard, as they return to the use of internal integer and list types. They serve, however, to introduce anamorphic and hylomorphic notions in a more familiar context.
The “Ph.D” example employs the categorical style of BMF/Squiggol in a serious way (we could actually go a bit further, by using co-products more directly, and thus eliminate some of the overt dependence on the “internal sums” of the data type definition mechanism).
By the time we arrive at the “pièce de résistance”, the comonadic version of Uustalu, Vene and Pardo, we have covered most of the underlying ideas and can (hopefully) concentrate better on their specific contributions. The final version, using the Prelude-defined product function and ellipsis notation, is how I think the function is most clearly expressed, presuming some knowledge of the language and Prelude definitions. (This definition also dates back at least to David Turner’s KRC* language [5].)
It is comforting to know that the Prelude ultimately uses a recursion combinator (foldl', the strict version of foldl) to define product. I guess we can all hope to see the day when the Prelude will define gcatamorphic, zygomorphic and paramorphic combinators for us, so that factorial can be defined both conveniently and with greater dignity :) .


* KRC may or may not be a trademark of Research Software, Ltd.,
    but you can bet your sweet bippy that
Miranda ™ is!


 

Revision history

  • 20 August 01: added the interpretive version, based on an environment model of a small object language (no, not in that sense of object ...). I’m thinking about re-arranging the order of the examples, so that longer ones that are not part of the main line of development don't intrude so much. I also advertised the page on the Haskell Café mailing list and requested that a link be added to the Haskell humor page. Finally, I have an interesting new example in the works that may actually have some original research value; more on this soon.
  • 14 August 01 (afternoon): added the combinatory version, now the fastest of the bunch, as measured in number of reductions reported by Hugs.
  • 14 August 01 (morning): adjusted the sophomore/Scheme version to use an explicit "lambda" (though we spell it differently in Haskell land) and added the fixed-point version.
  • 10 August 01: added the list-encoding and static computation versions (the latter uses type classes and functional dependencies to compute factorial during type-checking; it is an extended version of code from Thomas Hallgren’s “Fun with Functional Dependencies” [7]).
  • 1 August 01: added accumulating-parameter and continuation-passing versions (the latter is a revised transliteration from Friedman, Wand and Haynes’ “Essentials of Programming Languages” [6]).
  • 11 July 01: date of the original posting.


 

References

  1. Highlights from nhc - a Space-efficient Haskell Compiler, Niklas Röjemo. In the FPCA ‘95 proceedings. ACM Press, 1995 (see also CiteSeer or Chalmers ftp archive)
  2. n+k patterns, Lennart Augustsson. Message to the haskell mailing list, Mon, 17 May 93 (see the mailing list archive)
  3. Sorting Morphisms, Lex Augusteijn. In Advanced Functional Programming (LNCS 1608). Springer-Verlag, 1999 (see also CiteSeer).
  4. Recursion Schemes from Comonads, T. Uustalu, V. Vene and A. Pardo. Nordic Journal of Computing, to appear (see also Tarmo Uustalu’s papers page).
  5. Recursion Equations as a Programming Language, D. A. Turner. In Functional Programming and its Applications. Cambridge University Press, 1982.
  6. Essentials of Programming Languages, D. Friedman, M. Wand and C. Haynes. MIT Press and McGraw-Hill, 1994.
  7. Fun with Functional Dependencies, Thomas Hallgren. Joint Winter Meeting of the Departments of Science and Computer Engineering, Chalmers University of Technology and Göteborg University, Varberg, Sweden, 2001 (available at the author’s web archive).
  8. The Church of the Least Fixed-Point, author unknown. (A little bit of lambda calculus humor which circulated in the mid-1980’s (at least that’s when I saw it), probably from the comp.lang.functional newsgroup or somesuch. Please write me if you know the author or any other citation information).
  9. Categorical Combinators, Sequential Algorithms and Functional Programming, Pierre-Louis Curien. Springer Verlag (2nd edition), 1993.

Free and open-source general-purpose video player

mpv is a free and open-source general-purpose video player. 

mpv is based on the MPlayer and mplayer2 projects which it greatly improves. Learn about the differences with the former projects. 


