Why Abstraction is Really Important

Abstraction
Abstraction is one of the key elements of good software design.
It helps encapsulate behavior, decouple software elements and keep modules self-contained, among other benefits.

Abstraction makes the application much easier to extend and to refactor.
When you develop at a higher level of abstraction, you communicate the behavior rather than the implementation.

General
In this post, I want to introduce a simple scenario that shows how, by choosing a simple solution, we can get into a situation of hard coupling and rigid design.

Then I will briefly describe how we can avoid situations like this.

Case study description
Let’s assume that we have a domain object called RawItem.

public class RawItem {
    private final String originator;
    private final String department;
    private final String division;
    private final Object[] moreParameters;
    
    public RawItem(String originator, String department, String division, Object... moreParameters) {
        this.originator = originator;
        this.department = department;
        this.division = division;
        this.moreParameters = moreParameters;
    }
}

The first three parameters represent the item’s key,
i.e. an item comes from an originator, a department and a division.
The “moreParameters” field is there only to emphasize that the item has more parameters.

This triplet has two basic usages:
1. As key to store in the DB
2. As key in maps (key to RawItem)

Storing in DB based on the key
The DB tables are sharded in order to evenly distribute the items.
Sharding is done by a hash key modulo function.
This function works on a string.

Suppose we have N shard tables (RAW_ITEM_REPOSITORY_00, RAW_ITEM_REPOSITORY_01, ..., RAW_ITEM_REPOSITORY_NN);
then we’ll distribute the items based on some function and modulo:

String rawKey = originator + "_" + department + "_" + division;
// func is a String -> Integer function, N = number of shards
// How the key is represented is discussed below
int shard = func(rawKey) % N;
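
A minimal sketch of the table selection, assuming that String.hashCode() stands in for func and that the table suffix has two digits. The names ShardResolver, shardFor and tableFor are hypothetical. Math.floorMod is used because hashCode() may be negative, in which case Java's % operator would return a negative shard index.

```java
// Hypothetical helper: picks the shard table for a distribution key.
final class ShardResolver {
    private final int shardCount; // N = number of shard tables

    ShardResolver(int shardCount) {
        if (shardCount <= 0) {
            throw new IllegalArgumentException("shardCount must be positive");
        }
        this.shardCount = shardCount;
    }

    // Assumption: String.hashCode() plays the role of func (String -> Integer).
    int shardFor(String rawKey) {
        // floorMod keeps the result in [0, shardCount) even for negative hashes.
        return Math.floorMod(rawKey.hashCode(), shardCount);
    }

    // Builds the table name from the shard index, zero-padded to two digits.
    String tableFor(String rawKey) {
        return String.format("RAW_ITEM_REPOSITORY_%02d", shardFor(rawKey));
    }
}
```

The same key always maps to the same table, which is why the key representation must stay stable once data has been stored.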

Using the key in maps
The second usage for the triplet is mapping the items for fast lookup.
So, when NOT using abstraction, the maps will usually look like:

Map<String, RawItem> mapOfItems = new HashMap<>();
// Fill the map...

“Improving” the class
We see that we have common usage for the key as string, so we decide to put the string representation in the RawItem.

// new member
private final String key;

// in the constructor:
this.key = this.originator + "_" + this.department + "_"  + this.division;

// and a getter
public String getKey() {
  return key;
}

Assessment of the design
There are two flaws here:
1. Coupling between the sharding distribution and the items’ mapping
2. The mapping key is rigid. Any change to the item’s identity forces a change in the key, which might introduce hard-to-find bugs

And then comes a new requirement
Up until now, the triplet: originator, department and division made up a key of an item.
But now, a new requirement comes in.
A division can have subdivision.
It means that, unlike before, we can have two different items from the same triplet. The items will differ by the subdivision attribute.

Difficult to change
Regarding the DB distribution, we’ll need to keep the concatenated key of the triplet.
We must keep the modulo function the same. So distribution will keep using the triplet, but the schema will change and have a ‘subdivision’ column as well.
We’ll change the queries to use the subdivision together with original key.

In regard to the mapping, we’ll need to do a massive refactoring and to pass an ItemKey (see below) instead of just String.

Abstraction of the key
Let’s create ItemKey

public class ItemKey {
    private final String originator;
    private final String department;
    private final String division;
    private final String subdivision;

    public ItemKey(String originator, String department, String division, String subdivision) {
        this.originator = originator;
        this.department = department;
        this.division = division;
        this.subdivision = subdivision;
    }

    public String asDistribution() {
        return this.originator + "_" + this.department + "_"  + this.division;
    }
}

And the map and the RawItem constructor become:

Map<ItemKey, RawItem> mapOfItems = new HashMap<>();
// Fill the map...

// new constructor for RawItem
public RawItem(ItemKey itemKey, Object... moreParameters) {
    // fill the fields
}
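
One detail worth spelling out: a class used as a HashMap key must override equals and hashCode, otherwise two ItemKey instances with the same field values are treated as different keys. Here is a minimal sketch, assuming all four fields take part in equality (getters are omitted for brevity):

```java
import java.util.Objects;

// Sketch of ItemKey as a value object that is safe to use as a map key.
final class ItemKey {
    private final String originator;
    private final String department;
    private final String division;
    private final String subdivision;

    ItemKey(String originator, String department, String division, String subdivision) {
        this.originator = originator;
        this.department = department;
        this.division = division;
        this.subdivision = subdivision;
    }

    // The sharding key deliberately ignores the subdivision.
    String asDistribution() {
        return originator + "_" + department + "_" + division;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof ItemKey)) return false;
        ItemKey that = (ItemKey) other;
        return Objects.equals(originator, that.originator)
                && Objects.equals(department, that.department)
                && Objects.equals(division, that.division)
                && Objects.equals(subdivision, that.subdivision);
    }

    @Override
    public int hashCode() {
        return Objects.hash(originator, department, division, subdivision);
    }
}
```

On Java 16 or later, declaring ItemKey as a record would generate equals and hashCode automatically.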

Lesson Learned and conclusion
I wanted to show how a simple decision can really hurt.

And how, with a small change, we made the key abstract.
In the future the key can have even more fields, but we’ll need to change only the inner implementation of it.
The logic and mapping usage should not be changed.

Regarding the change process,
I haven’t described how to do the refactoring, as it really depends on what the code looks like and how well it is tested.
In our case, some parts were easy, while others were really hard. The hard parts were around code that looked deep into the implementation of the key (a string) and the item.

This situation was real
We actually had this flaw in our design.
Everything was fine for two years, until we had to change the key (add the subdivision).
Luckily all of our code is tested so we could see what breaks and fix it.
But it was painful.

There are two abstractions that we could have implemented from the start:
1. The more obvious one is using a key class (as described above), even if it only has one String field
2. Any map usage needs to be examined to see whether we’ll benefit from hiding it behind an abstraction

The second abstraction is harder to grasp and to fully understand and implement.
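
A sketch of the second idea, assuming a hypothetical ItemRegistry class that owns the map. It is generic so that the example stands alone; in the case study the type parameters would be ItemKey and RawItem.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical wrapper: callers register and look up items,
// but never see which collection holds them.
final class ItemRegistry<K, V> {
    private final Map<K, V> items = new HashMap<>();

    void register(K key, V item) {
        items.put(key, item);
    }

    // Optional makes the "not found" case explicit instead of returning null.
    Optional<V> find(K key) {
        return Optional.ofNullable(items.get(key));
    }

    int size() {
        return items.size();
    }
}
```

With a wrapper like this, replacing the HashMap with another structure touches one class instead of every caller.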

So,
use abstraction: tell a story through interfaces, and don’t get into the details while telling it.


Agile Mindset During Programming

I’m Stuck

Recently I found myself in several situations where I just couldn’t write code. Or at least, “good code”
First, I had “writer’s block”. I just could not see what was going to be my next test to write.
I could not find the name for the class / interface I needed.
Second, I just couldn’t simplify my code. Each time I tried to change something (a class or a method) to a simpler construction, things got worse. Sometimes they even broke.

I was stuck.

The Tasks

Refactor to Patterns

One of the situations we had was to refactor a certain piece of the code.
This piece of code is the manual wiring part. We use the DI pattern throughout our system, but due to some technical constraints we must do the injection by hand. We can live with that.
Refactoring the wiring part would give us a nice option to swap some of the implementations during boot.
Some of the concrete classes should differ based on some flags.
The design patterns we understood we would need were Factory Method and Abstract Factory.
The last remark is important to understand why I had those difficulties.
I will get to it later.
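
To make the goal concrete, here is a rough sketch of flag-driven manual wiring. All names (Parser, Sender, ComponentFactory, Wiring) are hypothetical and not taken from our code base, and a simple static creation method stands in for the Factory Method.

```java
// All names here are hypothetical; they only illustrate the shape of the wiring.
interface Parser {
    String name();
}

interface Sender {
    String name();
}

// Abstract Factory: one family of collaborators per configuration.
interface ComponentFactory {
    Parser createParser();
    Sender createSender();
}

final class DefaultFactory implements ComponentFactory {
    public Parser createParser() { return () -> "default-parser"; }
    public Sender createSender() { return () -> "default-sender"; }
}

final class LegacyFactory implements ComponentFactory {
    public Parser createParser() { return () -> "legacy-parser"; }
    public Sender createSender() { return () -> "legacy-sender"; }
}

final class Wiring {
    // The only place that looks at the boot flag and picks a family.
    static ComponentFactory factoryFor(boolean legacyMode) {
        return legacyMode ? new LegacyFactory() : new DefaultFactory();
    }
}
```

The rest of the wiring code asks the chosen factory for its collaborators and never checks the flag again.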

New Module

Another task was to create a new module that gets some input items, extracts data from them, sends it to a service, parses the response, modifies the data accordingly and returns the items with the modified data.
While talking about it with a peer, we understood we needed several classes.
As always we wanted to have high quality code by using the known OOD principles wherever we could apply them.

So What Went Wrong?

In the case of refactoring the wiring part, I constantly tried to immediately create the end result of the abstract factory and the factory method that would call it.
There are a lot of details in that wiring code. Some are common and some needed to be separated by the factory.
I just couldn’t find the correct places to extract into methods and then into another class.
Each time I had to move code from one location and dependency to another.
I couldn’t tell what exactly the factory’s signature and methods would be.

In the case of the new module, I knew that I wanted several classes, each with one responsibility. I knew I wanted some level of abstraction and good encapsulation.
So I kept trying to create this great encapsulated abstract data structure. And the code kept being extremely complicated.
Important note: I always take a test-first approach.
Each time I tried to create a test for a certain behavior, it was really really complicated.

I stopped

I went to have a cup of coffee.
I went to read some unrelated stuff.
And I talked to one of my peers.
We both understood what we needed to do.
I went home…

And then it hit me

The problem I had was that I knew where I needed to go, but instead of taking small steps, I kept trying to take one big leap.
Which brings me to the analogy of Agile to good programming habits (and TDD would be one of them).

Agile and Programming Analogy

One of the advantages in Agile development that I really like is the small steps (iteration) we do in order to reach our goal.
Check the two pictures below.
One shows how we aim towards a far away goal and probably miss.
The other shows how we divide to iterations and aim incrementally.

[Figure: Aiming From Far]

[Figure: Aiming Iterative and Incremental]

Develop in Small Incremental Iterations

This is the moral of the story.
Even if you know exactly what the structure of the classes should look like.
Even if you know exactly which design pattern to use.
Even if you know what to do.
Even if you know exactly what the end result should look like.

Keep on using the methods and practices that bring you to the goal in the safest and fastest way.
Do small steps.
Test each step.
Increment the functionality of the code in small chunks.
TDD.
Pair.
Keep calm.

[Figure: Refactor Big Leap]

[Figure: Refactor Small Steps]


Law of Demeter

Reduce coupling and improve encapsulation…

General
In this post I want to go over Law of Demeter (LoD).
I find this topic extremely important for keeping the code clean, well-designed and maintainable.

In my experience, seeing it broken is a huge smell for bad design.
Following the law, or refactoring based on it, leads to much improved, readable and more maintainable code.

So what is Law of Demeter?
I will start by mentioning the 4 basic rules:

Law of Demeter says that a method M of object O can access / invoke methods of:

  1. O itself
  2. M’s input arguments
  3. Any object created in M
  4. O’s own fields / dependencies (its direct component objects)

These are fairly simple rules.
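
Here is a small sketch that marks each rule in code. The classes (AuditLog, Order, OrderProcessor) are hypothetical and exist only for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical classes used only to illustrate the four rules.
final class AuditLog {
    final List<String> lines = new ArrayList<>();

    void log(String line) {
        lines.add(line);
    }
}

final class Order {
    private final int amount;

    Order(int amount) {
        this.amount = amount;
    }

    int amount() {
        return amount;
    }
}

final class OrderProcessor {                  // this is O
    private final AuditLog auditLog;          // a dependency of O

    OrderProcessor(AuditLog auditLog) {
        this.auditLog = auditLog;
    }

    int process(Order order) {                // this is M
        int amount = order.amount();          // rule 2: M's input argument
        StringBuilder message = new StringBuilder(); // rule 3: created in M
        message.append("processed ");
        message.append(amount);
        auditLog.log(message.toString());     // rule 4: O's dependency
        return withTax(amount);               // rule 1: O itself
    }

    // Adds 10% using integer arithmetic, just to have a method on O to call.
    private int withTax(int amount) {
        return amount + amount / 10;
    }
}
```

Every call in process is made on something the method was given, created, or owns. Nothing is reached through another object's internals.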

Let’s put this in other words:
Each unit (method) should have limited knowledge about other units.

Metaphors
The most common one is: Don’t talk to strangers

How about this:
Suppose I buy something at 7-11.
When I need to pay, will I give my wallet to the clerk so she will open it and get the money out?
Or will I give her the money directly?

How about this metaphor:
When you take your dog out for a walk, do you tell the dog to walk, or do you tell its legs?

Why do we want to follow this rule?

  • We can change a class without having a ripple effect of changing many others.
  • We can change called methods without changing anything else.
  • Using LoD makes our tests much easier to construct. We don’t need to write so many ‘when‘ for mocks that return and return and return.
  • It improves the encapsulation and abstraction (I’ll show in the example below).
    But basically, we hide “how things work”.
  • It makes our code less coupled. A caller method is coupled to only one object, and not to all of its inner dependencies.
  • It will usually model better the real world.
    Take as an example the wallet and payment.

Counting Dots?
Although usually many dots imply LoD violation, sometimes it doesn’t make sense to “merge the dots”.
Does:
getEmployee().getChildren().getBirthdays()
suggest that we do something like:
getEmployeeChildrenBirthdays() ?
I am not entirely sure.

Too Many Wrapper Classes
This is another outcome of trying to avoid LoD violations.
In this particular situation, I strongly believe that it’s another design smell which should be taken care of.

As always, we must have common sense while coding, cleaning and / or refactoring.

Example
Suppose we have a class: Item
The item can hold multiple attributes.
Each attribute has a name and values (it’s a multiple value attribute)

The simplest implementation would use a Map.

Let’s have a class ItemsSaver that uses the Item and attributes:
(please ignore the unstructured methods. This is an example for LoD, not SRP 🙂 )
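
A minimal sketch of such a pair of classes, assuming the attributes are held as a Map from name to a Set of values and are passed in through the constructor. The method toRows is hypothetical and stands in for the real saving logic.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Item exposes its internal map directly.
final class Item {
    private final Map<String, Set<String>> attributes;

    Item(Map<String, Set<String>> attributes) {
        this.attributes = attributes;
    }

    Map<String, Set<String>> getAttributes() {
        return attributes;
    }
}

// ItemsSaver reaches through Item into the map and then into each value set.
final class ItemsSaver {
    // Flattens the item into "name=value" rows, standing in for a real save.
    List<String> toRows(Item item) {
        List<String> rows = new ArrayList<>();
        for (Map.Entry<String, Set<String>> entry : item.getAttributes().entrySet()) {
            for (String value : entry.getValue()) {
                rows.add(entry.getKey() + "=" + value);
            }
        }
        return rows;
    }
}
```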

Suppose I know that it’s a single value (from the context of the application).
And I want to take it. Then the code would look like:
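
Something along these lines, again as a sketch: Item is repeated so the snippet stands alone, and SingleValueReader is a hypothetical name for whichever client class needs the value.

```java
import java.util.Map;
import java.util.Set;

// Same Item as before: the map is handed out as-is.
final class Item {
    private final Map<String, Set<String>> attributes;

    Item(Map<String, Set<String>> attributes) {
        this.attributes = attributes;
    }

    Map<String, Set<String>> getAttributes() {
        return attributes;
    }
}

final class SingleValueReader {
    // The caller has to know that attributes live in a Map of Sets
    // and that "single value" means "the first element of the set".
    // Throws if the attribute is missing or has no values.
    String singleValue(Item item, String name) {
        return item.getAttributes().get(name).iterator().next();
    }
}
```

The chain item.getAttributes().get(name).iterator().next() walks through three objects that the caller never received directly.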

I think it is clear that we have a problem.
Wherever we use the attributes of the Item, we know how it works. We know the inner implementation of it.
It also makes our test much harder to maintain.

Let’s see an example of a test using mock (Mockito):
You can imagine how much effort it takes to change and maintain it.

We can use real Item instead of mocking, but we’ll still need to create lots of pre-test data.

Let’s recap:

  • We exposed the inner implementation of how Item holds Attributes
  • In order to use attributes, we needed to ask the item and then to ask for inner objects (the values).
  • If we ever want to change the attributes implementation, we will need to make changes in the classes that use Item and the attributes. Probably a lot of classes.
  • Constructing the test is tedious, cumbersome, error-prone and lots of maintenance.

Improvement
The first improvement would be to let Item delegate the access to its attributes.
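
A sketch of the delegating version, assuming the same Map of Sets inside Item. The method names getAttributeValues, getSingleValue and describe are my own choice for this illustration.

```java
import java.util.Collections;
import java.util.Map;
import java.util.Set;

// Item still stores a Map of Sets, but no longer hands it out.
final class Item {
    private final Map<String, Set<String>> attributes;

    Item(Map<String, Set<String>> attributes) {
        this.attributes = attributes;
    }

    // Returns the values of the attribute, or an empty set when it is absent.
    Set<String> getAttributeValues(String name) {
        return attributes.getOrDefault(name, Collections.emptySet());
    }

    // Returns one value of the attribute, or null when it has none.
    String getSingleValue(String name) {
        Set<String> values = getAttributeValues(name);
        return values.isEmpty() ? null : values.iterator().next();
    }
}

// The client asks Item a question and gets an answer, nothing more.
final class ItemsSaver {
    String describe(Item item, String name) {
        return name + "=" + item.getSingleValue(name);
    }
}
```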

And the test becomes much simpler.

We are (almost) totally hiding the implementation of attributes from other classes.
The client classes are not aware of the implementation, except in two cases:

  1. Item still knows how attributes are built.
  2. The class that creates Item (whichever it is), also knows the implementation of attributes.

The two points above mean that if we change the implementation of the attributes (to something other than a map), at least two other classes will need to be changed. This is a great example of high coupling.

The Next Step Improvement
The solution above will sometimes (usually?) be enough.
As pragmatic programmers, we need to know when to stop.
However, let’s see how we can even improve the first solution.

Create a class Attributes:

And the Item that uses it:
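
A sketch of both classes, assuming Attributes keeps a Set of values per name and that Item keeps the same delegating methods as in the first improvement. The method names are illustrative.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Attributes owns the data structure; nobody else knows it is a Map of Sets.
final class Attributes {
    private final Map<String, Set<String>> values = new HashMap<>();

    void add(String name, String value) {
        values.computeIfAbsent(name, key -> new HashSet<>()).add(value);
    }

    // Read-only view of the values, empty when the attribute is absent.
    Set<String> valuesOf(String name) {
        return Collections.unmodifiableSet(
                values.getOrDefault(name, Collections.emptySet()));
    }

    // One value of the attribute, or null when it has none.
    String singleValueOf(String name) {
        Set<String> found = valuesOf(name);
        return found.isEmpty() ? null : found.iterator().next();
    }
}

// Item keeps the same delegating methods and simply forwards them.
final class Item {
    private final Attributes attributes;

    Item(Attributes attributes) {
        this.attributes = attributes;
    }

    Set<String> getAttributeValues(String name) {
        return attributes.valuesOf(name);
    }

    String getSingleValue(String name) {
        return attributes.singleValueOf(name);
    }
}
```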

(Did you notice? The implementation of attributes inside Item was changed, but the test did not need to change. This is thanks to the small change of delegation.)

In the second solution we improved the encapsulation of Attributes.
Now even Item does not know how it works.
We can change the implementation of Attributes without touching any other class.
We can make different implementations of Attributes:
– An implementation that holds a Set of values (as in the example).
– An implementation that holds a List of values.
– A totally different data structure that we can think of.

As long as all of our tests pass, we can be sure that everything is OK.

What did we get?

  • The code is much more maintainable.
  • Tests are simpler and more maintainable.
  • It is much more flexible. We can change implementation of Attributes (map, set, list, whatever we choose).
  • Changes in Attributes do not affect any other part of the code, not even the classes that use it directly.
  • Modularization and code reuse. We can use Attributes class in other places in the code.

Recommended Books

I have a list of books, which I highly recommend.
Each book taught me something different.

It all began years ago, when I was interviewing for my second workplace.
I was a junior Java developer, a coder. I didn’t have much experience and, more importantly, I did not have a mentor or someone who would direct me. I learned on my own, after a CS Java course. Java 1.4 had just come out.

One of my first interviewers was a great mentor. We met for an hour (probably). I don’t remember the company.  I don’t remember the job position. I don’t remember his name.
But I DO remember a few things he asked me.
He asked me if I know what TDD was. He asked me about XP.
He also recommended a book: Effective Java by Joshua Bloch

He didn’t even know what a great gift he gave me.

So I went on and bought Effective Java, 1st edition. And TDD by Kent Beck.
That was my first step towards being a craftsman.

Effective Java and Refactoring
These two books may look as if they are not entirely related.
However, both of them taught me a lot about design and patterns.
I started to understand how to write code using patterns (Refactoring), and how to do it in Java (Effective).
These books gave me the grounds for best practice in Java and Design Patterns and OOD.

Test Driven Development
I can’t say enough about this book.
At first, I really didn’t understand what it was all about.
But it was part of XP!! (which I didn’t understand either).
The TDD was left on the shelf until I was ready for it.

Clean Code and The Pragmatic Programmer
Should I say more?
If you haven’t read both, stop everything and go read them.
They are a MUST for anyone who wants to be a craftsman and takes his / her profession seriously.
These books are also lots of fun to read. Especially the Pragmatic book.

The Clean Coder
If you want to take the next step of being a professional, read it.
I was sometimes frustrated while reading it. I kept asking myself how I could pass all of this material on to my teammates…

Dependency Injection
Somewhat not related, but as I see it, if you don’t use DI, you can’t write clean, testable code.
If you can’t write clean, testable code, you are missing the point of craftsmanship.
The book covers some injector frameworks, but also describes what DI is all about.

Below is a table with the books I have mentioned.

One last remark,
This list does not contain the only books I read.
During the years I have read more technical / professional books, but these made the most difference for me.

Name                     | Author(s)                 | ISBN
Effective Java           | Joshua Bloch              | 978-032-135-668-0
Test-Driven Development  | Kent Beck                 | 978-032-114-653-3
Refactoring              | Martin Fowler             | 978-020-148-567-7
Dependency Injection     | Dhanji R. Prasanna        | 978-193-398-855-9
Clean Code               | Robert C. Martin          | 978-013-235-088-4
The Clean Coder          | Robert C. Martin          | 978-013-708-107-3
The Pragmatic Programmer | Andrew Hunt, David Thomas | 978-020-161-622-4