To its detriment and yours, the Java language makes no distinction between a pure function and any plain old subroutine. Even in the core libraries, the two are freely intermingled, with no obvious distinguishing characteristic. Yet we can all benefit from striving to make this distinction clear in our own code.
By "pure function" I mean a function in the mathematical sense: it performs a calculation with no observable side effects, and its result depends only on its arguments. Invoke it again on the same instance (or class, if static) with the same arguments in the same states, and you must always get the same answer.
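To make the definition concrete, here is a minimal sketch contrasting the two kinds of method. The class and method names (PurityDemo, sumOfSquares, addToRunningTotal) are made up for illustration.

```java
import java.util.List;

final class PurityDemo {
    // Hypothetical mutable state that outlives any single call.
    private static int runningTotal = 0;

    // Pure: the result depends only on the argument, and nothing observable changes.
    static int sumOfSquares(List<Integer> values) {
        int total = 0;
        for (int v : values) {
            total += v * v;
        }
        return total;
    }

    // Impure: it mutates shared state (a side effect), and its result
    // depends on every call that came before it.
    static int addToRunningTotal(List<Integer> values) {
        runningTotal += sumOfSquares(values);
        return runningTotal;
    }
}
```

Call sumOfSquares twice with the list 1, 2, 3 and you get 14 both times. Call addToRunningTotal twice with that same list and the second answer differs from the first, even though the argument did not change.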
What are some advantages of pure functions?
- They're testable
- They're thread-safe (though not necessarily "thread-correct", more on this later)
- They're deterministic
- They never need to be mocked out*
- They're easier to understand and reason about
- They're "referentially transparent," so they can be "memoized" (more on this later)
They're the easy kind of functions to work with, just like immutables are the easy variety of data objects.
(*About this particular claim. Have you ever felt compelled to test how your class behaves if the implementation of integer addition were to change? I doubt it, unless you're just plain batshit crazy, or a mathematician (but I repeat myself). In rare cases, if a pure function is very expensive, you may want to mock it anyway just to make your test runs faster. But you didn't "need" to do it.)
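On the memoization point in the list above: because a pure function always gives the same answer for the same argument, you can cache that answer and skip the recomputation. Here's a rough sketch of the idea. Memoizer and its computations() counter are hypothetical names invented for this example, and the sketch assumes the argument type has sensible equals() and hashCode(). It is deliberately not synchronized, so treat it as single-threaded only.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical wrapper that caches the results of a function assumed to be pure.
final class Memoizer<A, R> implements Function<A, R> {
    private final Function<A, R> pureFunction;
    private final Map<A, R> cache = new HashMap<>();
    private int computations = 0; // how many times the wrapped function actually ran

    Memoizer(Function<A, R> pureFunction) {
        this.pureFunction = pureFunction;
    }

    @Override
    public R apply(A argument) {
        if (cache.containsKey(argument)) {
            return cache.get(argument); // safe only because the function is pure
        }
        computations++;
        R result = pureFunction.apply(argument);
        cache.put(argument, result);
        return result;
    }

    int computations() {
        return computations;
    }
}
```

Wrap something like x -> x * x, apply it to 7 twice, and the wrapped function runs only once. Try the same trick on a function that reads the clock and you'll happily serve stale answers forever, which is exactly why purity matters here.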
When is a function pure?
All its dependencies must be pure functions themselves (or constants, which are basically just pure functions that have no arguments). Impurity, just like it sounds, is a contaminant. If your method calls eight other methods, and just one of those calls a method which sometimes calls a method which uses System.currentTimeMillis(), kaboom: your function is not pure.
So a method which invokes new Random(5) may still be pure (as guaranteed by that class's specification), while one that invokes new Random() certainly is not.
Likewise, the two-argument form of Collections.shuffle(), which takes the Random as an argument, is pure, while the one-argument form is not. (Wait, duh, neither is pure, because they mutate the passed-in list! But maybe you see the point anyway?) Now you see the "intermingling" I was bemoaning before!
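Here's a small sketch of the seeded-versus-unseeded difference. The class and method names are invented for the example, and shuffledCopy is a hypothetical helper that sidesteps the mutation problem by shuffling a private copy instead of the caller's list.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

final class SeededRandomDemo {
    // Deterministic: a Random built from a fixed seed produces the same
    // sequence every time, so this returns the same value on every call.
    static int firstRollSeeded() {
        return new Random(5).nextInt(6) + 1;
    }

    // Not deterministic: the no-argument constructor picks its own seed,
    // so two calls may disagree.
    static int firstRollUnseeded() {
        return new Random().nextInt(6) + 1;
    }

    // Shuffles a copy using a Random seeded by the caller, leaving the
    // input list untouched.
    static <T> List<T> shuffledCopy(List<T> input, long seed) {
        List<T> copy = new ArrayList<>(input);
        Collections.shuffle(copy, new Random(seed));
        return copy;
    }
}
```

Two calls to shuffledCopy with the same list and the same seed return equal lists, and the original list comes back in its original order.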
What are the most common sources of impurity in my code?
Some I can think of:
- mutable state
- the system clock
- I/O
I'm sure there are more. Help me out here: what others can you think of?
Are impure functions evil?
No, of course not. If they were, I would never be able to write any, as it would be against company policy. They're simply very different from their pure cousins, and more challenging to work with and to test. Keeping your functions pure, like keeping your value objects immutable, just gives you less to worry about. (Remember that hit song "Mo' Mutatin', Mo' Problems?" Toootally analogous to that. Listen to Biggie, he knew.)
How to deal with impurity?
I've told you that the system clock is a contaminant that makes everything it touches impure. But, of course, some of your business logic probably needs to know the current time. Are you just hopelessly contaminated as well?
No! You have at your disposal a chlorine tablet called dependency injection! (You just knew it would come to that, didn't you?)
Before:
    public class SignUtils {
      public static String getCurrentMessage() {
        Instant now = new Instant(); // automatically set to now
        return someCalculation(now) ? "OPEN" : "CLOSED";
      }
    }
After (simplified):
    public class SignController {
      @Inject Clock clock;

      // An instance method: a static method could not read the injected field.
      public String getCurrentMessage() {
        Instant now = clock.now();
        return someCalculation(now) ? "OPEN" : "CLOSED";
      }
    }
The result is a function which can be either pure or impure depending on what dependencies are provided for it. In "real life", you need it to be impure, and return a different result at 9:01 than it did at 8:59. But this nondeterminism has now been walled off behind an interface. Because the result of getCurrentMessage() itself now depends only on the states of its arguments (none) and the state of its instance, it will always be just as pure as its provided clock instance is. Now the code is testable, because we properly isolated the impurity.
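To show what "testable" buys you, here is one way a test might pin the clock. Everything here is a sketch under assumptions: Clock is a hypothetical one-method interface standing in for whatever clock abstraction you inject, it returns java.time.Instant for convenience, the dependency arrives through the constructor rather than an @Inject field, and the opening rule (09:00 to 17:00 UTC) is invented to stand in for someCalculation.

```java
import java.time.Instant;
import java.time.ZoneOffset;

// Hypothetical stand-in for the injected clock dependency.
interface Clock {
    Instant now();
}

final class SignController {
    private final Clock clock;

    // Constructor injection; a DI framework or a test can both call this.
    SignController(Clock clock) {
        this.clock = clock;
    }

    String getCurrentMessage() {
        return isOpenAt(clock.now()) ? "OPEN" : "CLOSED";
    }

    // Assumed business rule, for illustration only: open from 09:00
    // (inclusive) until 17:00 (exclusive), UTC. This part is pure.
    static boolean isOpenAt(Instant instant) {
        int hour = instant.atZone(ZoneOffset.UTC).getHour();
        return hour >= 9 && hour < 17;
    }
}
```

In a test, you can hand it a fixed clock such as () -> Instant.parse("2024-01-01T08:59:00Z") and expect "CLOSED", then one fixed at 09:01 and expect "OPEN". In production you would supply an implementation that returns Instant.now(), and the impurity stays confined to that one small class.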
In summary:
- Pay attention to the difference between your pure and impure functions.
- Use dependency injection to limit the damage radius of impure functions.
- If you're designing the Next Great Language, ferchrissakes handle these two things differently. Don't make the system time available via a simple static method call.
Thanks for reading. Let me know if this kind of post is helpful to you!