Expressions Over Statements

I have been a fan of functional programming for a while now. The reasons are plenty, but they mostly come down to referential transparency. There is one feature of the FP approach that has been hard for me to explain to others. Thanks to a recent project I have finally started to get a better grasp on the subject, and it comes down to using expressions instead of statements.

When writing Haskell, or code in another FP language, the expression-based approach is forced on you, but it can also be used in Java. Streams with lambdas are a great example of this.

What is the difference? It is best shown with an example:

Java Statements:

public String someMethodStatement() {
  var usernameList = getUserNames();

  var username = select(usernameList);
 
  var modifiedUsername = doSomething(username);
 
  // LOG :)
  System.out.println(modifiedUsername);
 
  return modifiedUsername;
}

Java Expressions (with a minor modification; note that it can be written in multiple different ways):

public String someMethodExpression() {
  return getUserNames().stream()
          .filter(this::select)
          .findFirst()
          .map(this::doSomething)
          .stream()
          .peek(System.out::println)
          .findAny()
          .get();
}

For anyone who hasn’t been living under a rock in the Java community, the second example should be perfectly understandable.
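
The helper methods are not part of the examples, but the signatures they imply are roughly as follows (a hypothetical sketch, assuming they live in the same class as both methods; the "minor modification" mentioned above is that select changes from picking one element out of the list into a per-element predicate):

// hypothetical helpers implied by the two examples (uses java.util.List)
private List<String> getUserNames() {
    return List.of("alice", "bob", " ");
}

// statement version: picks a single username out of the whole list
private String select(List<String> usernames) {
    return usernames.get(0);
}

// expression version: decides, per username, whether it is the one we want
private boolean select(String username) {
    return !username.isBlank();
}

private String doSomething(String username) {
    return username.toUpperCase();
}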

So why would I argue that the second example is potentially better code than the first one? One might say that it is actually less readable. Due to language limitations, it also switches from an Optional monad back to a Stream in the middle of the execution. Those are valid concerns, but they miss one aspect: scope.

Take the third statement of the first example:

var modifiedUsername = doSomething(username);

The operation has access not only to the username variable but also to usernameList. Even though the list is not used, a programmer reading this still has to keep a mental checklist of all the variables that are in scope of the operation (just like a compiler 😉 ), even the ones that are no longer needed. In the second example, when doSomething is called the code no longer has access to that list. The reader can focus only on the things that matter.

Since this approach is somewhat clunky in Java, it might still be preferable to simply use statements there. I will shamelessly plug that in Kotlin we can have this in a fluent and expressive form.

fun someMethodExpression() =
   getUserNames()
       .let(::select)
       .let(::doSomething)
       .apply(::println)

What we can see here are scope functions. It almost looks like a mathematical equation, and I love it. It is less code than the original Java statement version while still giving the benefits of an expression.

Scope is not the only advantage of using expressions. With this approach the code has a very clear start and a SINGLE end of the function/operation. Having one exit point from a function has long been considered good practice (unless you have performance reasons), and writing expressions enforces it. No more multiple returns flying around a 200+ line method.

Last but not least, expressions guide us to decompose our code better. Instead of having one chunk of code after another, we have to separate them out into small, clearly defined functions (or run the risk of deep indentation). This also helps keep each function at a single level of abstraction. Jumping between levels is harder when you do not have access to all the variables.

These are my reasons for preferring to write expressions over statements. They limit the cognitive load on the reader, encourage better practices and help keep the code modular.

ADT with Java: Sealed Classes, Pattern Matching, Records

I am a big fan of Algebraic Data Types. They allow us to specify a data model’s grammar declaratively. Many modern statically typed languages deliver this functionality out of the box, allowing for very expressive code. Since I primarily work with Java, I have tried to use ADTs in Java on a number of occasions. That has not been a pleasant experience; Java simply does not provide the proper tools. The last time I tried was with Java 11. Now that we are at Java 15, I have decided to give it another go using the new features.

One of the basic data structures that we work with is the list. When learning a functional language, lists are usually the first hurdle that takes a while to get your head around. In Haskell a list is pretty much defined as:

data List a = Empty | Cons a (List a)

This means that a List is either Empty (no elements) or has an element and a pointer to the next “object”. In short, this is the definition of a linked list. Pretty neat, right? So as not to come across as some Haskell snob, the same can be done in TypeScript:

type List<T> = null | {value: T, next: List<T>}

I tried to recreate that in Java 15 and came up with this:

public sealed interface LinkedList<T> permits LinkedList.Nil, LinkedList.Cons {
    record Nil<T>() implements LinkedList<T> {}
    record Cons<T>(T value, LinkedList<T> next) implements LinkedList<T> {}
}

A few new things are happening here that were not possible before.

First, we have sealed classes (https://openjdk.java.net/jeps/360). These are classes/interfaces that strictly define which classes can inherit from them. This means that when we check the type of an object, we can do it exhaustively. One of the major criticisms of using instanceof is the fact that we never truly know what implementations we can encounter. Until now. This allows us to safely deliver more logic through the type system and let the compiler verify it for us.

Second are records (https://openjdk.java.net/jeps/384). These allow us to declare immutable data models with far less boilerplate. It would be great if we didn’t need those curly brackets at the end :).
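
As a quick illustration (not part of the original snippet), everything below comes for free with the Cons record declaration: the accessor, a value-based equals and a readable toString:

LinkedList.Cons<String> cons = new LinkedList.Cons<>("a", new LinkedList.Nil<>());
System.out.println(cons.value());   // "a" -- accessor generated from the component name
System.out.println(cons);           // Cons[value=a, next=Nil[]] -- generated toString
System.out.println(cons.equals(new LinkedList.Cons<>("a", new LinkedList.Nil<>()))); // true -- value-based equals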

So this is the definition of a LinkedList using the Java 15 type system. Let’s see it in action:

LinkedList<String> emptyList = new LinkedList.Nil<>();
System.out.println(emptyList);
LinkedList<String> oneElementList = new LinkedList.Cons<>("Test", new LinkedList.Nil<>());
System.out.println(oneElementList);

Let’s try to build a bigger list. To do that we need a util method:

static <T> LinkedList<T> addElement(LinkedList<T> list, T element) {
    if (list instanceof Nil<T>) {
        return new Cons<>(element, new Nil<>());
    } else if (list instanceof Cons<T> cons) {
        // append at the end by rebuilding the spine recursively
        return new Cons<>(cons.value(), addElement(cons.next(), element));
    } else {
        throw new IllegalArgumentException("Unknown type");
    }
}

Here we yet again take advantage of a new Java feature: pattern matching for instanceof (https://openjdk.java.net/jeps/375). This allows us to skip the type cast after the instanceof check, making for more readable code. Once more work is done in this area and we get the planned pattern matching for switch, we will end up with something akin to:

static <T> LinkedList<T> addElement(LinkedList<T> list, T element) {
    return switch (list) {
        case Nil<T> nil -> new Cons<>(element, new Nil<>());
        case Cons<T> cons -> new Cons<>(cons.value(), addElement(cons.next(), element));
    };
}

Which will finally be quite pleasant to the eye. We can use this code as simply as:

LinkedList<Integer> list = new LinkedList.Nil<>();
for (int i = 0; i < 10; i++) {
    list = addElement(list, i);
}

I have added several more functions to the solution and the complete code can be found here.
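
As a taste of what those extra functions look like (a sketch in the same style, not necessarily identical to the linked code), a recursive size fits in a few lines:

static <T> int size(LinkedList<T> list) {
    if (list instanceof Cons<T> cons) {
        return 1 + size(cons.next());
    }
    return 0; // the only other permitted implementation is Nil
}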

Conclusion

So there it is: an immutable LinkedList written using the type system. There is still room for improvement, but I feel like Java is on the right track. Although these features are still in preview, I have high hopes that by the time we reach the next LTS (Java 17?) we will be able to truly take advantage of ADT techniques. Of course Java is playing a sort of catch-up with other JVM languages like Kotlin and Scala, but I hope that its implementation will be better, since Java can shape the JVM as it sees fit. Next time someone asks you to implement a linked list in an interview, you can just use this :P.

The Mythical Modular Monolith

What is it and how to build one?

We all know the Microservice trend that has been around for years now. Recently voices have started to rise that maybe it is not a solve-all solution and that there are other, better-suited approaches. Even if, eventually, by evolution, we end up with a Microservice architecture, there are intermediate steps that let us get there safely. The most prominent, and a bit controversial in some circles, is the modular monolith. If you follow tech trends you will have already seen a chart like this:

I will not get into the details of this diagram, since there are other great resources that cover it. The main idea is that if we are at the bottom left (Big Ball of Mud), then we want to move up and to the right through the modular monolith instead of the distributed ball of mud. Ideally we want to start with a modular monolith and, if our product is successful enough, potentially move towards Microservices.

The problem I found with those articles/talks is that they talk about the modular monolith but rarely go into detail about what that actually means, as if it were self-explanatory. In this piece I will outline some patterns that can help when building a modular monolith.


What is a modular monolith?

Before we can talk about how to build a modular monolith we need to answer this question. After all, what makes it different from a regular monolith? Is it just a “correctly” written monolith? In a sense, yes, but that is too vague. We need a better definition.

The key is in the word modular. What is a module? It is a collection of functionalities with high cohesion in an isolated environment (loosely coupled with other functionality). There are various techniques that can be used to gather such functionalities and build a boundary around them, e.g. Domain Driven Design (DDD).

Breaking it down:

  • Clear responsibility / high cohesion: Each module has a clearly defined business responsibility and handles its implementation from top to bottom. From a DDD perspective: a single domain.
  • Loosely coupled: there should be little to no coupling between modules. If I change something in one module, it should affect other modules in a minimal way (or, even better, not at all).
  • Encapsulation: the business logic and domain model should not be visible from outside of the module (linked with loose coupling).

A good rule of thumb to check whether a module is well written is to analyse how difficult it would be to extract it into a separate microservice. If it’s easy, the module is well written. If you need to make changes in multiple modules to do it, then it needs some work. A typical ball of mud might also have modules, but they will break the aforementioned guidelines.

Having defined what a module is, defining a Modular Monolith is straightforward. A Modular Monolith is a collection of modules that adhere to those rules. It differs from a Microservice architecture in that all the modules are deployed as one deployment unit and often reside in one repository (aka a mono-repo).


Integration Patterns

There are a number of integration patterns one can employ when building a modular monolith.

Each one has its strengths and weaknesses and should be used depending on the need. I have ranked them by level of maturity.

Level 1: One compilation unit/module

The codebase has one compilation unit. The modules communicate with each other using exposed services or internal events. This approach is the fastest to implement initially; however, as the product grows it becomes progressively harder to add new functionality, as coupling tends to be high in such systems. The same applies to ease of reasoning. At first it will be very easy to “understand” the system. As time goes on, the number of cases to keep in mind grows, because it is very hard to determine what the relations between domains are. The benefit of this approach is that we can quickly deliver initial value while refining our development practices (especially with new teams). From a practical point of view, the build/test times of such a system keep growing, slowing down development.
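
A minimal sketch of what this level looks like in practice (all names are made up): two domains living side by side in one compilation unit, wired together with plain constructor injection and direct method calls.

// billing domain -- the "exposed service"
class BillingService {
    void charge(String orderId) {
        System.out.println("charging for " + orderId);
    }
}

// orders domain -- calls billing directly; there is no physical boundary between the two
class OrderProcessor {
    private final BillingService billing;

    OrderProcessor(BillingService billing) {
        this.billing = billing;
    }

    void completeOrder(String orderId) {
        billing.charge(orderId); // an in-process method call, nothing more
    }
}

Nothing in the build stops OrderProcessor from depending on any other billing class, which is exactly why coupling tends to grow at this level.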

Recommendation: Use with a single small team (2–3 people). Ideal for Proof of Concept and MVPs.

Level 2: Multiple compilation units/modules

The codebase has multiple compilation units, e.g. multiple Maven modules, one per domain. Each module exposes a clearly defined API. This approach allows for better encapsulation, as there is a clear boundary between the modules. You can even split the team and distribute responsibility for each module, allowing for independent development. Readability also benefits, since it is easy to determine what the dependencies between modules are. In addition, we can build and test only the module that has changed, which speeds up development significantly. It requires a bit more fiddling with build tools, but nothing a regular developer couldn’t handle.
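
A sketch of how the boundary can look at this level (module, package and type names are made up): the api package is the only thing other modules compile against, while the implementation stays package-private inside the billing module.

// billing module -- package com.example.billing.api (each type in its own file in practice)
public record Invoice(String orderId, long amountCents) {}

public interface BillingApi {
    Invoice createInvoice(String orderId);
}

// billing module -- package com.example.billing.internal, never referenced by other modules
class DefaultBillingApi implements BillingApi {   // package-private: invisible outside its package
    @Override
    public Invoice createInvoice(String orderId) {
        return new Invoice(orderId, 0);           // pricing logic would live here
    }
}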

Recommendation: Good for a typical product team. Team members can work fairly independently. It could also work with 2–3 small teams. This approach will take you far, as the implementation overhead is small while it remains easy to maintain consistency throughout the code.

Level 3: Multiple compilation module groups

Each domain is split into two or more modules. This is an expansion of the previous approach: we extract an API module that other domains depend on, which further enforces encapsulation. You can even employ static analysis tools that ban other modules from depending on anything but the API modules. This approach could also benefit from the Java module system (Project Jigsaw).
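
With the Java module system this rule can even be enforced by the compiler; a hypothetical module descriptor for such a domain could look like this:

// billing/src/main/java/module-info.java -- hypothetical names
module com.example.billing {
    exports com.example.billing.api;   // only the API package is visible to other domains
    // com.example.billing.internal is not exported, so nobody else can depend on it
}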

Recommendation: This is ideal when moving from a medium-sized product to a large product where 2+ full-size product teams are needed. Each team exposes an API module that the others can consume.

Level 4: Going web: Communicating through network

Same as Level 3, but the modules are totally independent (no shared API module) and communicate over the network (REST/SOAP, queues, etc.). This is an extreme step, one that should not be taken lightly. You lose compile-time checking on the APIs and gain multiple problems related to networking. In exchange you get very high decoupling of the modules, as there is no shared code (apart from some utils, etc.). Deciding to take this step means we are nearing a Microservice architecture.

Option A: A single deployment unit. It might seem weird to call a REST API when everything is deployed in one unit, but this approach does allow for better load distribution, especially when using a queue like Kafka for communication. I agree that in most cases this is a redundant approach; it is, however, a good stepping stone when moving to Option B.

Option B: Separate deployment units. This is pretty much the final move from a modular monolith to a microservice architecture.

Level 4.5: Using separate repositories (aka moving away from the monorepo), CI/CD pipelines, etc.

Moving to a full-blown Microservice approach. I will not go into detail here, as this article is not about Microservices.

Recommendation: Large/Multiple Products, many development teams, efficient DevOps culture, mature company.

Summary:

You might have noticed that Cost of Maintenance is never low. That is correct.

You can not hide complexity. You can only change its location.


Contracts and Testing

A very important aspect of modular monoliths is treating the APIs, or domain boundaries (call them what you wish), as contracts between domains/modules that need to be respected. This might seem obvious, but in a monolith it is easy to fall into the trap of treating module APIs as second-class citizens. When designing and maintaining them we should approach them as if we were designing a REST API. Changes to them should be made carefully, and they should have a proper set of tests. This is key when multiple teams are cooperating.

One of the more common issues is the lack of a clear distinction of which team is responsible for which module. Responsibility for APIs becomes blurred and their quality drops rapidly. Each module should have one team responsible for it. Ownership is key. The API of a module should be a contract between the team that owns it and the teams that use it. That API MUST BE covered by tests. Only the team responsible for a module can introduce changes to its API. This does increase communication overhead and extends the time needed to introduce changes, but it keeps those factors constant instead of letting them spin out of control.
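
As an illustration of what such a contract test could look like (a sketch assuming JUnit 5 and the hypothetical BillingApi/Invoice types from the Level 2 example):

import org.junit.jupiter.api.Test;
import static org.junit.jupiter.api.Assertions.assertEquals;

// owned by the billing team; consuming teams rely on the behaviour pinned down here
class BillingApiContractTest {

    // a trivial in-memory implementation, just enough for the sketch
    private final BillingApi billing = orderId -> new Invoice(orderId, 0);

    @Test
    void createsInvoiceForTheRequestedOrder() {
        Invoice invoice = billing.createInvoice("order-42");
        assertEquals("order-42", invoice.orderId());
    }
}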


Conclusion

I hope this makes the Modular Monolith a tiny bit less mythical. I think we developers like to over-complicate things, while in truth most software engineering comes down to a few basic principles. There are a few more topics worth discussing regarding the Modular Monolith (tooling, architecture as code), but I think this gives a good starting point. The most important takeaways are:

  • Encapsulated modules with high cohesion and low coupling
  • Ownership is key

Keep those in mind and you will get far.
