Posted on July 9, 2012.
Last time we saw how we can, in practice, achieve side effects in a functional programming language without sacrificing purity. It seems like a paradox at first, but we discovered how to let the runtime do the imperative work for us to keep our functions referentially transparent.
In our implementation, we modified our impure functions (such as print) to take a universe as an extra argument, and to return a new universe in which the side effect has been performed. We can pipeline these functions together to construct an imperative-looking program that executes instructions in the order we specify (taking advantage of function composition to impose the ordering).
There’s a problem with this, however. What happens if we do something like this:
def main(old_universe):
input1, new_universe1 = get_input(old_universe)
input2, new_universe2 = get_input(old_universe)
final_universe = print(input1 + input2, old_universe)
return final_universe
This seems conceptually valid. From our current universe, we spawn two identical parallel universes. In each, we ask the user for input, then we concatenate the two inputs (which should be the same if you’re a determinist) and print it in a new universe.
In practice, we know this can’t happen. Chances are, you’re running your program on a machine with a few gigabytes of memory and a few gigahertz of processor speed. There’s no way you’d be able to simulate several entire universes on your laptop.
We can, in practice, simulate one universe. The way we accomplish this is by actually performing the computation in our universe—so it looks like a simulation in the programming language, but in reality it is actually modifying the state of our universe. For example, if in our program we describe a universe in which a message has been printed to the screen, the runtime must actually print that message to the screen in our universe to determine what happens next. These side effects are not exposed in the language, which allows us to wave our hands and claim that the language itself is still pure, even if every possible implementation necessarily has the ability to modify the state of the machine.
Now all we need is a language construct that enforces this restriction—that only one universe exists at a time. While we’re at it, let’s modify our type system so that we can enforce this restriction on any type, and the Universe type will be a special case of this.
These peculiar types are called uniqueness types. The basic idea is this: we declare a type as unique, then any variable of this type goes out of scope wherever time it appears in an expression. So if universe are unique, after the line...
input1, new_universe1 = get_input(old_universe)
...the variable old_universe goes out of scope, so we can’t use it on the next line as we did in the example, and we can’t return its value. This means that only one universe can exist at any given time, which fixes the problem mentioned above. Also, we can enforce uniqueness typing at compile time! We now have a type system that allows us to maintain purity and still express side effects.
This is one way to write programs with side effects in functional languages, and it’s used by the programming language Mercury. Next time we’ll talk about another way to manage side effects, exception handling, and more: monads.
Appendix: Impure Conditional Expressions with Uniqueness Types
We just learned how to write programs with side effects as an essentially linear list of operations, each taking a universe and returning a new one. But how can we do impure operations conditionally? First attempt:
def main(universe0):
c, universe1 = get_char(universe0)
return (print("YES", universe1) if c == "y" else print("NO", universe1))
The last line does not compile since universe1 is used twice. One solution is simply to have the conditional syntax a if e else b (also known as e ? a : b) create a new lexical scope for a and b. That way, universe1 is never used more than once in the same scope, and it passes the uniqueness test. However, this may not be desirable—it means that the scoping rules for conditionals are different from those of function calls. In many functional languages, conditional expressions are implemented as function calls, and this allows programmers to define their own control flow constructs on top of this.
Here is another solution. Let’s add a new construct to Fython:
def cond(flag, universe, f, g):
# this is how the function behaves,
# but we can't actually implement this:
return f(universe) if flag else g(universe)
However, we can’t write the body of cond in Fython because universe is used twice—it will have to be an intrinsic language feature. cond takes a boolean flag, a universe, and two closures, f and g. It returns f(universe) or g(universe) depending on flag. For example, we could rewrite our example to use cond as follows:
def main(universe0):
c, universe1 = get_char(universe0)
return (cond(c == "y", universe0, lambda x: print("YES", x), lambda x: print("NO", x)))
Now whenever we want to do impure operations conditionally, we just have to wrap our computations in a couple of closures and use cond.
Tags:
home
theory