Expressions Over Statements

I have been a fan of functional programming for a while now. The reasons are plenty, but most of them come down to referential transparency. There is one feature of the FP approach that has been hard for me to explain to others. Thanks to a recent project, I have finally started to get a better grasp of the subject, and it comes down to using expressions instead of statements.

When writing Haskell, or code in another FP language, the expression approach is forced on the user, but it can also be used in Java. Streams with lambdas are a great example of this.

What is the difference? It is best shown with an example:

Java Statements:

public String someMethodStatement() {
  var usernameList = getUserNames();

  var username = select(usernameList);
 
  var modifiedUsername = doSomething(username);
 
  // LOG :)
  System.out.println(modifiedUsername);
 
  return modifiedUsername;
}

Java Expressions (with minor modification, note: it can be written in multiple different ways):

public String someMethodExpression() {
  return getUserNames().stream()
          .filter(this::select)
          .findFirst()
          .map(this::doSomething)
          .stream()
          .peek(System.out::println)
          .findAny()
          .get();
}

For anyone who hasn’t been living under a rock in the Java community, the second example should be totally understandable.

So why would I argue that the second example is potentially better code than the first one? One might say that it is actually less readable. Due to language limitations, it also changes from an Optional monad back to a Stream in the middle of the execution. Those are valid concerns, but they miss one aspect: scope.

In the third statement:

var modifiedUsername = doSomething(username);

The operation has access not only to the username variable but also to usernameList. Even though the list is not used there, a programmer reading this still has to keep a mental checklist of all the variables that are in the scope of the operation (just like a compiler 😉), even the ones that are no longer needed. In the second example, when calling doSomething, the code no longer has access to that list. The reader can focus only on the things that matter.
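
To make the scope point concrete, here is a small runnable sketch. The bodies of getUserNames, select and doSomething are invented stand-ins (the post never shows them), and the expression version uses one of the alternative spellings: a single stream pipeline that skips the hop through Optional.

```java
import java.util.List;

// Hypothetical stand-ins for the helpers used in the examples above;
// their bodies are invented purely so the sketch can run.
class ScopeDemo {
    static List<String> getUserNames() {
        return List.of("bob", "alice", "carol");
    }

    // Predicate form of select, as in the stream example: keep names starting with "a".
    static boolean select(String name) {
        return name.startsWith("a");
    }

    static String doSomething(String name) {
        return name.toUpperCase();
    }

    // Statement style: usernameList stays in scope until the closing brace,
    // so the later statements could still read it by accident.
    static String statementStyle() {
        var usernameList = getUserNames();
        var username = usernameList.stream().filter(ScopeDemo::select).findFirst().orElseThrow();
        var modifiedUsername = doSomething(username);
        return modifiedUsername;
    }

    // Expression style: each stage sees only the value handed to it.
    // Streams are lazy and findFirst short-circuits, so doSomething runs
    // only for the first matching name.
    static String expressionStyle() {
        return getUserNames().stream()
                .filter(ScopeDemo::select)
                .map(ScopeDemo::doSomething)
                .findFirst()
                .orElseThrow();
    }

    public static void main(String[] args) {
        System.out.println(statementStyle());  // ALICE
        System.out.println(expressionStyle()); // ALICE
    }
}
```

Both methods return the same value; the difference is only in how many names are visible at each step.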

Since this approach is somewhat clunky in Java, it might still be preferable to simply use statements. I will shamelessly plug Kotlin here, where we can have this in a fluent and expressive form.

fun someMethodExpression() =
   getUserNames()
       .let(::select)
       .let(::doSomething)
       .also(::println)

What we can see here are scope functions. It almost looks like a mathematical equation and I love it. It takes less code than the original Java statement version while still giving the benefit of an expression.

Scope is not the only advantage of using expressions. With this approach the code has a very clear start and a SINGLE end of the function/operation. Having one exit from a function has long been said to be a good practice (unless you have performance reasons). Writing expressions forces this. No more multiple returns flying around a 200+ line method.
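
A minimal sketch of the single-exit idea in Java, assuming Java 14 or later for switch expressions. The Plan enum and the seat limits are made up for illustration.

```java
class SingleExit {
    // Hypothetical domain type, invented for this sketch.
    enum Plan { FREE, PRO, ENTERPRISE }

    // Statement style: three separate exits from the method.
    static int seatLimitStatements(Plan plan) {
        if (plan == Plan.FREE) {
            return 1;
        }
        if (plan == Plan.PRO) {
            return 10;
        }
        return 100;
    }

    // Expression style: the switch yields one value and there is one return.
    // The compiler also rejects the switch if an enum constant is left unhandled.
    static int seatLimitExpression(Plan plan) {
        return switch (plan) {
            case FREE -> 1;
            case PRO -> 10;
            case ENTERPRISE -> 100;
        };
    }

    public static void main(String[] args) {
        for (Plan plan : Plan.values()) {
            System.out.println(plan + " -> " + seatLimitExpression(plan));
        }
    }
}
```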

Last but not least, expressions guide us to decompose our code better. Instead of having one chunk of code after another, we have to split them into separate, clearly defined functions (or run the risk of deep indentation). This also helps to keep each function at one level of abstraction. Jumping between levels is harder when you do not have access to all the variables.

These are my reasons for preferring to write expressions over statements. They limit the cognitive load on the reader, encourage better practices and help keep the code modular.

Building regression free (or close enough) service using Kotlin, Jacoco and Mutation tests

We have decided to build our current product using Kotlin and, as with any new project, do everything by the book (and try to keep it that way as long as possible :D). This meant a number of automated safeguards and checks, including having good test coverage from the start. We decided to go with Jacoco, as it seemed to have the best compatibility with Kotlin, and set the bar as high as possible: 100% branch coverage. Yes, branch coverage. Not line coverage. It is too easy to have good line coverage and useless tests. This topic has been covered many times by others so I will not go into details. For more info you can check out e.g. https://linearb.io/blog/what-is-branch-coverage/. The check is part of the automated build and runs on each pull request. No code will be merged if the coverage is not high enough. These are CI/CD basics but I wanted to reiterate this point.

Of course it is still possible to write tests that have high branch coverage and test nothing (although it is harder than with line coverage). This can be tackled with mutation tests. Mutation tests change (mutate) the code and check whether any of the tests fail. If a mutation passes all the tests, it means that the tests are of poor quality. This is slow, so we run it only once a day, but that is enough to keep us in check.
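
As a rough illustration of what a mutation test looks for, here is a hand-rolled sketch in Java. Everything in it is made up: a real mutation tool generates and runs the mutants automatically, while here the mutant is written by hand and the "test suites" are plain methods.

```java
import java.util.function.IntFunction;

class MutationDemo {
    // Code under test (a made-up example).
    static String ticket(int age) {
        if (age >= 18) {
            return "adult";
        }
        return "child";
    }

    // The kind of change a mutation tool makes: >= became >.
    static String ticketMutant(int age) {
        if (age > 18) {
            return "adult";
        }
        return "child";
    }

    // Weak suite: takes both branches of the if, so branch coverage is
    // complete, but it never checks the boundary value.
    static boolean weakSuitePasses(IntFunction<String> impl) {
        return impl.apply(30).equals("adult") && impl.apply(5).equals("child");
    }

    // Stronger suite: the same checks plus the boundary.
    static boolean strongSuitePasses(IntFunction<String> impl) {
        return weakSuitePasses(impl) && impl.apply(18).equals("adult");
    }

    public static void main(String[] args) {
        System.out.println(weakSuitePasses(MutationDemo::ticket));         // true
        System.out.println(weakSuitePasses(MutationDemo::ticketMutant));   // true: the mutant survives
        System.out.println(strongSuitePasses(MutationDemo::ticket));       // true
        System.out.println(strongSuitePasses(MutationDemo::ticketMutant)); // false: the mutant is killed
    }
}
```

The weak suite has full branch coverage and still cannot tell the original from the mutant, which is exactly the gap a mutation run exposes.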

100% coverage sounds extreme, but the idea was to see how long we could maintain it and to lower it if the need arose. If the coverage had to be dropped, we needed a good reason for it. We have worked in this setup for over half a year now and a few interesting outcomes have come from it.

Kotlin and Jacoco work well together, but not perfectly. The coverage limit had to be dropped to 98% due to some edge cases. Unfortunately, because the limit is a percentage, the number of branches that can fall into the uncovered 2% grows as the codebase grows. Some critical branches might go unnoticed. We need to actively check the reports and see which branches are actually not covered at the moment. Hopefully in the future we can improve on this.
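
The arithmetic behind that concern, as a tiny sketch. The branch counts are invented round numbers, not figures from our project.

```java
class CoverageBudget {
    // Largest number of uncovered branches that still satisfies a
    // coverage threshold given in whole percent.
    static int allowedUncovered(int totalBranches, int thresholdPercent) {
        return totalBranches * (100 - thresholdPercent) / 100;
    }

    public static void main(String[] args) {
        System.out.println(allowedUncovered(500, 98));  // 10
        System.out.println(allowedUncovered(5000, 98)); // 100
    }
}
```

The threshold stays the same, but ten times more code means ten times more branches that can slip through unnoticed.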

Having said that, here is the kicker. None of the bugs reported so far came from a regression introduced into the code. I had this realization quite recently and it made me curious as to what could be the cause.

First is the null safety enforced by Kotlin. Among popular JVM languages it is rare to have this built into the type system by default. This is a huge boost to productivity and helps tremendously in keeping the code correct. If a variable/field can be null, the programmer is forced to handle it. This means that a visible branch is created. Jacoco can report that and force us to write a test that covers it.

This has another effect. Since humans (that includes developers) are lazy by nature, they want to write as little code as possible. If you push an optional parameter through multiple layers and execute a number of operations on it, Jacoco will force you to write a test for each case. Some cases might not be possible to test at all.

data class User(
    val firstName: String?
)

fun someUseCase1(user: User) {
    placeOrder1(user.firstName)
    sendEmail1(user.firstName)
}

fun placeOrder1(firstName: String?) {
    if (firstName == null) {
        throw Exception("First Name is missing!")
    }
    // Do something
}

fun sendEmail1(firstName: String?) {
    if (firstName == null) {
        throw Exception("First Name is missing!")
    }
    // Do something
}

No one wants to do that. It forced us to resolve the nullable value as early as possible.

fun someUseCase1FewerBranches(user: User) =
    (user.firstName ?: throw Exception("First Name is missing!"))
        .let { firstName ->
            placeOrder1FewerBranches(firstName)
            sendEmail1FewerBranches(firstName)
        }


fun placeOrder1FewerBranches(firstName: String) {
    // Do something
}

fun sendEmail1FewerBranches(firstName: String) {
    // Do something
}

We end up with code that is easier to test, more readable and safer. And there is less of it. It is also just good design that should have been there from the start. Now we have an automated tool that makes us do it.
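
The branch-counting part of this argument is not specific to Kotlin. Here is a rough Java sketch of the same shape, assuming Java 16 or later for records. The method names and return values are invented so the result can be checked. Java will not enforce the non-null parameters at compile time the way Kotlin does, so there the contract is only a convention.

```java
import java.util.List;

class ResolveEarly {
    // firstName may be null, mirroring the nullable Kotlin field.
    record User(String firstName) {}

    // The single place where a missing name is handled: one branch to cover.
    static String requireFirstName(User user) {
        if (user.firstName() == null) {
            throw new IllegalStateException("First Name is missing!");
        }
        return user.firstName();
    }

    // By convention these expect a non-null name, so they contain no null branch.
    static String placeOrder(String firstName) {
        return "order for " + firstName;
    }

    static String sendEmail(String firstName) {
        return "email to " + firstName;
    }

    static List<String> someUseCase(User user) {
        String firstName = requireFirstName(user);
        return List.of(placeOrder(firstName), sendEmail(firstName));
    }

    public static void main(String[] args) {
        System.out.println(someUseCase(new User("Ann"))); // [order for Ann, email to Ann]
    }
}
```

One null check means one extra test, no matter how many operations use the name afterwards.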

Another aspect of this is wider use of the null object pattern. Say we have a list of strategies to select from or an optional function parameter.

fun someUseCase2(users: List<User>?) =
    users?.forEach(::someUseCase1)

Instead of having separate tests to handle cases where the selected strategy is missing or the parameter has not been passed, we can introduce sane default behaviour (e.g. use empty list instead of optional list).

fun someUseCase2FewerBranches(users: List<User> = emptyList()) =
    users.forEach(::someUseCase1)
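
The strategy case mentioned above can be sketched along the same lines. This is a hypothetical Java example (strategy names and behaviour are invented) where a do-nothing strategy stands in for "no strategy selected", so callers never have to branch on a missing one.

```java
import java.util.Map;
import java.util.function.UnaryOperator;

class StrategyDefault {
    // Null object: a strategy that leaves its input unchanged.
    static final UnaryOperator<String> NO_OP = UnaryOperator.identity();

    // Hypothetical registry of named strategies.
    static final Map<String, UnaryOperator<String>> STRATEGIES = Map.of(
            "upper", s -> s.toUpperCase(),
            "trim", s -> s.strip());

    // getOrDefault falls back to NO_OP for unknown names, so there is
    // no separate "strategy is missing" branch to write or to test.
    static String format(String strategyName, String input) {
        return STRATEGIES.getOrDefault(strategyName, NO_OP).apply(input);
    }

    public static void main(String[] args) {
        System.out.println(format("upper", "abc"));   // ABC
        System.out.println(format("missing", "abc")); // abc
    }
}
```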

All of these worked in tandem to guide us towards a better quality solution and allow us to keep safely introducing new features. Although there are also other reasons for it, our productivity per developer has not dropped since the beginning of the project (and it is always highest in the beginning when there is no code :D). Bugs still happen and always will. There are, however, fewer of them and they are of a different nature. It has only been a few months on this project so I am curious how this will evolve, but so far it looks very promising.

ADT with Java: Sealed Classes, Pattern Matching, Records

I am a big fan of Algebraic Data Types. They allow us to declaratively specify data model grammar. Many modern statically typed languages deliver this functionality out of the box, allowing for writing very expressive code. Since I primarily work with Java I have tried to use ADTs in Java on a number of occasions. That has not been a pleasant experience. Java simply did not provide the proper tools. The last time I tried, it was with Java 11. Now that we are at Java 15 I have decided to give it another go, using the new features.

One of the basic data structures that we work with is the list. When learning a functional language, lists are usually the first hurdle that takes a while to get your head around. In Haskell a List is pretty much defined as:

data List a = Empty | Cons a (List a)

This means that a List is either Empty (no elements) or has an element and a pointer to the next “object”. In short, this is the definition of a linked list. Pretty neat, right? So as not to come across as some Haskell snob: the same can be done in TypeScript:

type List<T> = null | {value: T, next: List<T>}

I tried to recreate that in Java 15 and came up with this:

public sealed interface LinkedList<T> permits LinkedList.Nil, LinkedList.Cons {
    record Nil<T>() implements LinkedList<T> {}
    record Cons<T>(T value, LinkedList<T> next) implements LinkedList<T> {}
}

There are a few new things here that were not possible before.

First we have sealed classes (https://openjdk.java.net/jeps/360). Those are classes/interfaces that strictly define what classes can inherit from them. This means that when we check the type of the object we can do it exhaustively. One of the major critiques of using instanceof is the fact that we never truly know what implementations we can encounter. Until now. This allows us to safely deliver more logic through the type system, allowing the compiler to verify it for us.

Second are records (https://openjdk.java.net/jeps/384). Those allow us to declare immutable data models with far less boilerplate. Would be great if we didn’t need those curly brackets at the end :).

So this is the definition of the LinkedList using the type system in Java 15. Let’s see it in action:

LinkedList<String> emptyList = new LinkedList.Nil<>();
System.out.println(emptyList);
LinkedList<String> oneElementList = new LinkedList.Cons<>("Test", new LinkedList.Nil<>());
System.out.println(oneElementList);
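
For reference, here is a self-contained sketch of what those println calls rely on, assuming Java 17 or later, where records and sealed types are no longer preview features. The wrapping RecordDemo class exists only to keep everything in one file.

```java
class RecordDemo {
    sealed interface LinkedList<T> permits LinkedList.Nil, LinkedList.Cons {
        record Nil<T>() implements LinkedList<T> {}
        record Cons<T>(T value, LinkedList<T> next) implements LinkedList<T> {}
    }

    public static void main(String[] args) {
        LinkedList<String> emptyList = new LinkedList.Nil<>();
        LinkedList<String> oneElementList = new LinkedList.Cons<>("Test", new LinkedList.Nil<>());

        // Records come with a generated toString...
        System.out.println(emptyList);      // Nil[]
        System.out.println(oneElementList); // Cons[value=Test, next=Nil[]]

        // ...and value-based equals/hashCode, so structurally equal lists are equal.
        System.out.println(oneElementList.equals(
                new LinkedList.Cons<>("Test", new LinkedList.Nil<>()))); // true
    }
}
```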

Let’s try to build a bigger list. To do that we need a util method:

static <T> LinkedList<T> addElement(LinkedList<T> list, T element) {
    if (list instanceof Nil<T>) {
        return new Cons<>(element, new Nil<>());
    } else if (list instanceof Cons<T> cons) {
        return new Cons<>(cons.value, addElement(cons.next, element));
    } else {
        throw new IllegalArgumentException("Unknown type");
    }
}

Here we yet again take advantage of a new Java feature: instanceof pattern matching (https://openjdk.java.net/jeps/375). This allows us to skip the type cast after the instanceof check, making for more readable code. Actually, once more work is done in this area and we get the planned pattern matching in switch expressions, we will end up with something akin to:

static <T> LinkedList<T> addElement(LinkedList<T> list, T element) {
    return switch (list) {
        case Nil<T> nil -> new Cons<>(element, new Nil<>());
        case Cons<T> cons -> new Cons<>(cons.value, addElement(cons.next, element));
    };
}

Which will finally be quite pleasant to the eye. We can use this code as simply as:

LinkedList<Integer> list = new LinkedList.Nil<>();
for (int i = 0; i < 10; i++) {
    list = addElement(list, i);
}

I have added several more functions to the solution and the complete code can be found here.
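
To give a flavour of what such functions can look like, here is a sketch of two of them, size and map. These are illustrative guesses written for this post's LinkedList definition, not the code from the linked solution, and they assume Java 17 or later. The wrapping ListFunctions class is only there to keep the example in one file.

```java
import java.util.function.Function;

class ListFunctions {
    sealed interface LinkedList<T> permits LinkedList.Nil, LinkedList.Cons {
        record Nil<T>() implements LinkedList<T> {}
        record Cons<T>(T value, LinkedList<T> next) implements LinkedList<T> {}
    }

    // Number of elements; Nil is the base case of the recursion.
    static <T> int size(LinkedList<T> list) {
        return list instanceof LinkedList.Cons<T> cons ? 1 + size(cons.next()) : 0;
    }

    // New list with f applied to every element; the input list is not modified.
    static <T, R> LinkedList<R> map(LinkedList<T> list, Function<T, R> f) {
        return list instanceof LinkedList.Cons<T> cons
                ? new LinkedList.Cons<>(f.apply(cons.value()), map(cons.next(), f))
                : new LinkedList.Nil<>();
    }

    public static void main(String[] args) {
        LinkedList<Integer> list =
                new LinkedList.Cons<>(1, new LinkedList.Cons<>(2, new LinkedList.Nil<>()));
        System.out.println(size(list));             // 2
        System.out.println(map(list, x -> x * 10)); // Cons[value=10, next=Cons[value=20, next=Nil[]]]
    }
}
```

Both functions are single expressions, and both recurse once per element, so very long lists would need an iterative version to avoid deep call stacks.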

Conclusion

So there it is. An immutable LinkedList written using the type system. There is still space for improvement but I feel like Java is on the right track. Although those features are still in preview, I have high hopes that when we reach the next LTS (Java 17?) we will be able to truly take advantage of ADT techniques. Of course Java is playing a sort of catch-up with other JVM languages like Kotlin and Scala, but I hope that its implementation will be better since Java can play with the JVM as it sees fit. Next time someone asks you to implement a Linked List in an interview, you can just use this :P.
