November 2021 - jpintelli

I have been a fan of functional programming for a while now. The reasons are plenty, but most of them come down to referential transparency. There is one feature of the FP approach that I have found hard to explain to others. Thanks to a recent project I finally have a better grasp of it, and it comes down to using expressions instead of statements.

When writing Haskell, or code in another FP language, the expression approach is forced on the user, but it can also be used in Java. Streams with lambdas are a great example of this.

What is the difference? It is best shown with an example:

Java Statements:

public String someMethodStatement() {
  var usernameList = getUserNames();

  var username = select(usernameList);
 
  var modifiedUsername = doSomething(username);
 
  // LOG :)
  System.out.println(modifiedUsername);
 
  return modifiedUsername;
}

Java Expressions (with a minor modification; note that it can be written in several different ways):

public String someMethodExpression() {
  return getUserNames().stream()
          .filter(this::select)
          .findFirst()
          .map(this::doSomething)
          .stream()
          .peek(System.out::println)
          .findAny()
          .get();
}

For anyone who hasn’t been living under a rock in the Java community, the second example should be perfectly understandable.

So why would I argue that the second example is potentially better code than the first one? One might say that it is actually less readable. Due to language limitations, it also changes from an Optional monad back to a Stream in the middle of the execution. Those are valid concerns, but they miss one aspect: scope.

Take the third statement of the first example:

var modifiedUsername = doSomething(username);

This operation has access not only to the username variable but also to usernameList. Even though the list is not used here, a programmer reading the code still has to keep a mental checklist of every variable in scope (just like a compiler 😉), even those that are no longer needed. In the second example, by the time doSomething is called the code no longer has access to that list. The reader can focus only on the things that matter.
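
To make the scope argument concrete, here is a minimal, self-contained sketch of both styles. The helpers getUserNames, isSelected and doSomething are hypothetical stand-ins with made-up bodies, not the code from the project.

```java
import java.util.List;

// All helpers below are hypothetical stand-ins used only for illustration.
class ScopeDemo {
    static List<String> getUserNames() {
        return List.of("alice", "bob", "carol");
    }

    static boolean isSelected(String name) {
        return name.startsWith("b");
    }

    static String doSomething(String name) {
        return name.toUpperCase();
    }

    // Statement style: every local variable stays visible until the closing brace.
    static String statementStyle() {
        var usernameList = getUserNames();
        var username = usernameList.stream()
                .filter(ScopeDemo::isSelected)
                .findFirst()
                .orElseThrow();
        // usernameList is still in scope here, although this step does not need it.
        return doSomething(username);
    }

    // Expression style: each step only receives the value produced by the previous one.
    static String expressionStyle() {
        return getUserNames().stream()
                .filter(ScopeDemo::isSelected)
                .findFirst()
                // At this point the list has no name, so this step cannot refer to it.
                .map(ScopeDemo::doSomething)
                .orElseThrow();
    }

    public static void main(String[] args) {
        System.out.println(statementStyle());
        System.out.println(expressionStyle());
    }
}
```

Both methods return the same value. The only difference is how many names the reader has to keep in mind at each step.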

Since this approach is somewhat clunky in Java, it might still be preferable to simply use statements. I will shamelessly plug Kotlin here, where we can have this in a fluent and expressive form.

fun someMethodExpression() =
   getUserNames()
       .let(::select)
       .let(::doSomething)
       .also(::println)

What we see here are scope functions. It almost looks like a mathematical equation and I love it. It takes less code than the original Java statement version while still giving the benefits of an expression.

Scope is not the only advantage of using expressions. With this approach the function or operation has a very clear start and a SINGLE end. Having one exit from a function has long been considered good practice (unless you have performance reasons to do otherwise). Writing expressions forces this. No more multiple returns flying around a 200+ line method.
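
A small sketch of the single-exit point, using a made-up shipping rule (the method names and thresholds are assumptions chosen only for illustration). The first method has three exits, the second is one expression with one return.

```java
// Hypothetical example: the shipping rule and its thresholds are invented.
class SingleExitDemo {
    // Statement style: three separate exits that a reader has to track.
    static String shippingStatement(int orderTotal) {
        if (orderTotal <= 0) {
            return "invalid";
        }
        if (orderTotal >= 100) {
            return "free";
        }
        return "standard";
    }

    // Expression style: the whole body is one expression, so there is exactly one return.
    static String shippingExpression(int orderTotal) {
        return orderTotal <= 0 ? "invalid"
                : orderTotal >= 100 ? "free"
                : "standard";
    }

    public static void main(String[] args) {
        System.out.println(shippingStatement(50));
        System.out.println(shippingExpression(50));
    }
}
```

In a method this short the difference is cosmetic. The benefit shows up as methods grow, because the expression form leaves no room for an early return hidden in the middle.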

Last but not least, expressions guide us to decompose our code better. Instead of having one chunk of code after another, we have to split them into separate, clearly defined functions (or run the risk of deep indentation). This also helps to keep each function on one level of abstraction. Jumping between levels is harder when you do not have access to all the variables.

These are my reasons for preferring to write expressions over statements. They limit the cognitive load on the reader, encourage better practices and help keep the code modular.

(Image: man in a striped shirt looking at papers pinned to a wall. Photo by Startup Stock Photos on Pexels.com)

We decided to build our current product using Kotlin and, as with any new project, to do everything by the book (and to try to keep it that way as long as possible :D). This meant a number of automated safeguards and checks, including good test coverage from the start. We went with Jacoco, as it seemed to have the best compatibility with Kotlin, and set the bar as high as possible: 100% branch coverage. Yes, branch coverage, not line coverage. It is too easy to have good line coverage and useless tests. This topic has been covered many times by others, so I will not go into details. For more info you can check out e.g. https://linearb.io/blog/what-is-branch-coverage/. The check is part of the automated build and runs on each Pull Request. No code is merged if the coverage is not high enough. These are CI/CD basics, but I wanted to reiterate the point.

Of course it is still possible to write tests that have high branch coverage and test nothing (although it is harder than with line coverage). This can be tackled with mutation tests. Mutation testing changes (mutates) the code and checks whether any test fails. If a mutation survives, meaning all the tests still pass, the tests are of poor quality. This is slow, so we run it only once a day, but that is enough to keep us in check.
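
As a rough illustration of the idea, here is a hand-rolled sketch in Java. It does not use any real mutation testing tool. The function isAdult, its mutant and both "tests" are hypothetical, and the mutant is written by hand where a real tool would generate it automatically.

```java
import java.util.function.IntPredicate;

// Hypothetical, hand-written illustration of a mutant surviving a weak test.
class MutationDemo {
    // Code under test.
    static boolean isAdult(int age) {
        return age >= 18;
    }

    // Mutant: the boundary operator >= has been changed to >.
    static boolean isAdultMutant(int age) {
        return age > 18;
    }

    // Weak test: exercises both outcomes of the comparison but never checks the boundary.
    static boolean weakTestPasses(IntPredicate impl) {
        return impl.test(30) && !impl.test(5);
    }

    // Stronger test: additionally checks the values right at the boundary.
    static boolean strongTestPasses(IntPredicate impl) {
        return impl.test(30) && !impl.test(5) && impl.test(18) && !impl.test(17);
    }

    public static void main(String[] args) {
        System.out.println("weak test, original:   " + weakTestPasses(MutationDemo::isAdult));
        System.out.println("weak test, mutant:     " + weakTestPasses(MutationDemo::isAdultMutant));
        System.out.println("strong test, original: " + strongTestPasses(MutationDemo::isAdult));
        System.out.println("strong test, mutant:   " + strongTestPasses(MutationDemo::isAdultMutant));
    }
}
```

The weak test passes for both the original and the mutant, so the mutant survives and the test is flagged as too weak. The stronger test passes for the original and fails for the mutant, which is the outcome we want.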

100% coverage sounds extreme, but the idea was to see how long we could maintain it and to lower it if the need arose. If the coverage had to be dropped, we needed a good reason for it. We have worked in this setup for over half a year now and a few interesting outcomes have come of it.

Kotlin and Jacoco work well together, but not perfectly. The coverage limit had to be dropped to 98% due to some edge cases. Unfortunately, because of how percentages work, this means that as the codebase grows, the number of branches that can fall into the uncovered 2% also grows. Some critical branches might go unnoticed. We need to actively check the reports to see which branches are currently not covered. Hopefully we can improve on this in the future.
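
The arithmetic behind this can be sketched in a few lines of Java. The branch counts below are made-up numbers, not figures from our codebase, and allowedUncovered is a hypothetical helper.

```java
// Hypothetical helper: how many branches may stay uncovered under a percentage threshold.
class CoverageBudget {
    // Largest number of uncovered branches that still satisfies the threshold
    // (integer division rounds down, so the threshold is never undershot).
    static long allowedUncovered(long totalBranches, int thresholdPercent) {
        return totalBranches * (100 - thresholdPercent) / 100;
    }

    public static void main(String[] args) {
        // Made-up codebase sizes, all checked against the same 98% threshold.
        for (long total : new long[] {500, 1_000, 10_000}) {
            System.out.println(total + " branches -> up to "
                    + allowedUncovered(total, 98) + " uncovered");
        }
    }
}
```

With the same 98% limit, a codebase of 1,000 branches tolerates 20 uncovered ones, while one of 10,000 branches tolerates 200. The threshold stays the same, but the room for an important branch to hide in keeps growing.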

Having said that, here is the kicker. None of the bugs reported so far came from a regression introduced into the code. I had this realization quite recently and it made me curious as to what could be the cause.

First is the void/null safety enforced by Kotlin. Few other popular JVM languages offer this out of the box. It is a huge boost to productivity and helps tremendously in keeping the code correct. If a variable or field can be null, the programmer is forced to handle it. This means that a visible branch is created. Jacoco can report that branch and force us to write a test that covers it.

This has another effect. Since humans (developers included) are lazy by nature, they want to write as little code as possible. If you push a nullable parameter through multiple layers and perform a number of operations on it, Jacoco will force you to write a test for each case. Some cases might not be possible to test at all.

data class User(
    val firstName: String?
)

fun someUseCase1(user: User) {
    placeOrder1(user.firstName)
    sendEmail1(user.firstName)
}

fun placeOrder1(firstName: String?) {
    if (firstName == null) {
        throw Exception("First Name is missing!")
    }
    // Do something
}

fun sendEmail1(firstName: String?) {
    if (firstName == null) {
        throw Exception("First Name is missing!")
    }
    // Do something
}

No one wants to do that. It forced us to resolve the nullable value as early as possible.

// The nullable field is resolved once, at the boundary; everything below works with String.
fun someUseCase1FewerBranches(user: User) =
    (user.firstName ?: throw Exception("First Name is missing!"))
        .let { firstName ->
            placeOrder1FewerBranches(firstName)
            sendEmail1FewerBranches(firstName)
        }

fun placeOrder1FewerBranches(firstName: String) {
    // Do something
}

fun sendEmail1FewerBranches(firstName: String) {
    // Do something
}

We end up with code that is easier to test, more readable and safer. And there is less of it. It is also just good design that should have been there from the start. Now we have an automated tool that makes us do it.

A related aspect is wider usage of the null object pattern. Say we have a list of strategies to select from, or an optional function parameter.

fun someUseCase2(users: List<User>?) =
    users?.forEach(::someUseCase1)

Instead of having separate tests to handle cases where the selected strategy is missing or the parameter has not been passed, we can introduce sane default behaviour (e.g. use an empty list instead of a nullable list).

fun someUseCase2FewerBranches(users: List<User> = emptyList()) =
    users.forEach(::someUseCase1)

All of these worked in tandem to guide us towards a better quality solution and allow us to keep introducing new features safely. Although there are other reasons for it as well, our productivity per developer has not dropped since the beginning of the project (and it is always highest in the beginning, when there is no code :D). Bugs still happen and always will. There are, however, fewer of them and they are of a different nature. It has only been a few months on this project, so I am curious how this will evolve, but so far it looks very promising.

Month: November 2021

Expressions Over Statements

Building regression free (or close enough) service using Kotlin, Jacoco and Mutation tests