Imagine a bacon-wrapped Ferrari. Still not better than our free technical reports.
See all our reports

Evaluating Lambda Calculus in Scala

In this third post on the untyped lambda calculus, we’ll look at implementing an evaluator in Scala. We already did a parser and the framework for a REPL last time in Parsing Lambda Calculus. Hopefully, the conclusion to this post also has a better answer to the question we asked in What is Lambda Calculus and should you care?

This post will be rather code heavy, and you can also see code for each section in separate commits in a GitHub repo. Some familiarity with Scala is expected to be able to follow the code. In case you don’t care for the implementation details, you can skip to final thoughts (or the section on Recursion) when you start seeing embedded code.

Evaluation by variable substitution

As mentioned in the first post, evaluation in lambda calculus happens by variable substitution — we replace references to the lambda’s argument variable with whatever the lambda is called with.

The main rule of variable substitution is called β-reduction — if we have an application with a lambda on the left hand side, we can substitute the right hand side for the argument to the lambda. This kind of reducible expression is called a “redex”, and such a reduction corresponds to a single computation step.

There are several strategies on picking which exact reduction to do next. The main strategies are call-by-name, call-by-value and call-by-need. We will use call-by-value as that is common in modern programming languages. Under call-by-value evaluation, the outermost redexes are reduced, and then only if the right hand side has already been reduced to a value. Values are expressions that cannot be reduced any further. Usually, we consider lambdas and variables to be values, but applications not. Call-by-value evaluates arguments to functions before evaluating the function.

Substitution

Substitution can be trickier than we might initially think as we have to be careful not to change the meaning of any expressions in case of a naming conflict (and naming conflicts usually happen). For example, with just name-based substitution, we would get the following

(λx.(λx.x)) y → λx.y

changing the meaning of the program. A usual approach to solving such conflicts is to rename variables, because programs that only differ in variable names are still equivalent.

If you don’t care about the implementation details, now is a good point to skip most of the post.

Implementing Evaluation

We decided to try to get by without any renaming of variables. We keep track of a lexical scope of each lambda/variable. Each lambda defines a new scope, which refers to its parent scope. Each variable reference also refers to the scope of the lambda that defines the variable. There is a TOP scope that has no parent. For debugging purposes, we’ll add numeric ID’s to scopes.

object Scope {
  var id = 0
  def nextId = { val i = id; id += 1; i }
  val TOP = new Scope(None, Set())
}
class Scope(val parent: Option[Scope], val boundNames: Set[String]) {
  val id = Scope.nextId
 
  def closestBinding(name: String): Option[Scope] =
    if (boundNames contains name)
      Some(this)
    else
      parent flatMap (_ closestBinding name)
}

Scope also has a set of names that are bound in it, even though we only have one name in each scope for now. There is also a function for finding the closest lexical scope that binds a given name in the parent hierarchy. Here is the modified Var in the AST with references to Scope (other nodes need not change):

object Var {
  def apply(name: String): Var = Var(name, Scope.TOP)
}
case class Var(name: String, scope: Scope) extends Expr

Note: we now also have to change one line the parser: ident ^^ Var -> ident ^^ Var.apply

We don’t create Scopes during parsing, instead we have a “Binder” that makes a copy of the parsed tree while adding Scopes (the trees are immutable so we need to copy to modify).

class Binder() {
  def apply(term: Expr) = bind(term, Scope.TOP)
 
  def bind(term: Expr, parent: Scope): Expr = term match {
    case Lambda(arg, body) =>
      val λScope = new Scope(Some(parent), Set(arg.name))
      Lambda(arg.copy(scope = λScope), bind(body, λScope))
    case v @ Var(name, _) =>
      (parent closestBinding name) match {
        case Some(scope) => v.copy(scope = scope)
        case None        => sys.error("Undefined variable: " + name)
      }
    case Apply(fun, arg) =>
      Apply(bind(fun, parent), bind(arg, parent))
  }
}

For a Lambda, we create a new Scope. For a Var we look up the closest Scope binding its name, and set it’s scope to that. For now, we’ll raise an exception (using sys.error) if the variable is not bound.

We also need to invoke Binder in LambdaREPL, and change the PrettyPrinter because it uses Var in pattern matches and we added an argument to Var. We can also temporarily change the variable printing rule to s"$name${scope.id}" in order to see if our Scopes are assigned correctly.

λ> \x.x
Parsed: λx1.x1
λ> \x.\x.x
Parsed: λx2.(λx3.x3)

Seems correct. Now we can get to actual evaluation. As said, the main rule of evaluation is variable substitution. But when do we apply it? Under call-by-value, we apply it to an application which has a lambda on the left, and a value on the right. So we need a function that knows whether an expression is a value. We’ll consider Lambdas and Vars to be values (they will not be evaluated further if appearing by themselves):

def isValue(term: Expr): Boolean = term match {
  case _: Lambda => true
  case _: Var    => true
  case _         => false
}

Now we can write the pattern for when we should apply substitution:

case Apply(Lambda(argV, body), arg) if isValue(arg) =>
  // substitute arg for all references to argV in body

What if the arg to the lambda is not a value? We will add another pattern, for when the left side is a value, but the right hand side is not:

case Apply(fun, arg) if isValue(fun) =>
  // recursively call evaluation again to evaluate arg to a value

Finally, if the left hand side is not yet a value, we recursively evaluate the left-hand side:

case Apply(fun, arg) =>
  // recursively call evaluation again to evaluate fun to a value

Note that the only things being reduced during evaluation are Applys — a Lambda or Var by itself is never reduced further (that’s why we say they are values). Combining these patterns and assuming we have an implementation for substitution, we could create a function that takes a single computation step if one can be taken.

However, to fully evaluate a program, we need to keep taking single steps until no more can be taken, so we need a wrapper function, which recursively takes a single step, until it gets a MatchError (in case none of our patterns above matched):

def apply(term: Expr): Expr =
  try {
    val term2 = evalStep(term)
    apply(term2)
  } catch {
    case _: MatchError => term
  }

I created the evaluation code in a new package lambda.eval just to have some separation instead of everything in the same package.

package lambda
package eval
// having two package statements like this means the package is still “lambda.eval”,
// but we automatically import everything from the first package “lambda”.
 
// When parameter debug is true, print each evaluation step
class Evaluation(debug: Boolean = false) {
  lazy val pretty = new PrettyPrinter()
 
  def apply(term: Expr): Expr =
    try {
      val term2 = evalStep(term)
      if (debug)
        println(s"step: ${pretty(term)} → ${pretty(term2)}")
      apply(term2)
    } catch {
      case _: MatchError => term
    }
 
  def evalStep(term: Expr): Expr = term match {
    case Apply(Lambda(argDef, body), arg) if isValue(arg) =>
      new Substitution(argDef, arg)(body)
    case Apply(fun, arg) if isValue(fun) =>
      Apply(fun, evalStep(arg))
    case Apply(fun, arg) =>
      Apply(evalStep(fun), arg)
  }
 
  def isValue(term: Expr): Boolean = term match {
    case _: Lambda => true
    case _: Var    => true
    case _         => false
  }
}

For substitution we wrote new Substitution(argV, arg)(body), and let’s implement that in a separate class:

class Substitution(argV: Var, replacement: Expr) {
  val binder = new Binder()
 
  def apply(term: Expr): Expr = term match {
    case Var(argV.name, argV.scope) => binder.bind(replacement, argV.scope.parent.get)
    case Var(_, _)                  => term
    case Apply(fun, arg)            => Apply(apply(fun), apply(arg))
    case Lambda(arg, body)          => Lambda(arg, apply(body))
  }
}

When we apply substitution to the lambda body, we apply it recursively for a Lambda or an Apply, but when we find a Var that is a reference to the same variable we are substituting, then we just replace it with the replacement Expr. We use Binder to create a copy of the replacement, under the parent scope of the lambda which we are eliminating by the substitution — we need to make copies so that each instance of the replacement would have its own Scopes.

Notes on variable renaming

In the code above, we don’t do any variable renaming. It seems that when we have call-by-value evaluation, where the outermost redexes are reduced first, we can’t run into situations where we have a free variable in the replacement.

For example, I don’t think there is a situation where we could be substituting x for y in \y. (\x.y).

In the program (\y. (\x. y)) x the variable x is not bound — but this is not a valid program because we don’t allow unbound variables in a complete program (at least not yet).

Assume the expression is wrapped in another lambda that actually binds x e.g. \x. (\y. (\x. y)) x. This lambda by itself will never be evaluated because it is already a value. So it must be given an argument to get evaluated and for substitution to take place inside it. If the argument is also named x, then we have the same situation as one step earlier — we’d have an unbound x and would need to wrap it in another lambda. And so on.

We can give the lambda an argument that is a value, as in (\x. (\y. (\x.y)) x) (\x.x). In this case, the evaluation (and substitution) starts with the outer lambda:

  1. (\x. (\y. (\x.y)) x) (\x.x) matches the first rule of our evalStep function: Apply(Lambda, arg) if isValue(arg) and thus \x.x is substituted for x:

    (\x. (\y. (\x.y)) x) (\x.x) -> (\y. (\x.y)) (\x.x)
  2. (\y. (\x.y)) (\x.x) matches the same rule, and (\x.x) is substituted for y:

    (\y. (\x.y)) (\x.x) -> \x.(\x.x)
  3. Now we have only a lambda and no further reduction can be done.

The inner reference to x variable is bound correctly (because of how the bound name lookup works) and it seems to me that there is no way we could have an unbound variable x in the replacement during substitution, when we are limited to call by value evaluation. Please let me know in case I’m horribly wrong about all this.


Psst…if you like Scala, check out a full length PDF report by Rebel Labs–just click the button below!

Get the 'Scala 2013' Report