Continuous Deployment: CircleCI, AWS (Elastic Beanstalk), Docker


We run some of our services in Docker containers, under Elastic Beanstalk (EB).
We use CircleCI for our CI cycle.
EB, Docker and CircleCI integrate really nicely for automatic deployment.

It’s fairly easy to set up all the services to work together.
In this post, I am summarising the steps to do it.

About EB Applications and Versions

Elastic Beanstalk has the concepts of applications, environments and application versions.
The automatic steps that I describe here go up to the point of creating a new application version in EB.
The actual deployment is done manually using the Elastic Beanstalk management UI. I describe that as well.

Making that final step automatic is easy, and I will add a post about it in the future.

I am not going to describe the CI cycle (test, automation, etc.).
It’s a completely different, very important topic.
But out of scope for this post.
Connecting GitHub to circleci is out of scope of this post as well.

The Architecture

There are four different services that I need to integrate: CircleCI, Docker Hub, AWS S3 and AWS Elastic Beanstalk.

Basic Flow

Everything starts with a push to GitHub
(which I didn’t include in the list above).
Once we push something to GitHub, CircleCI is triggered and runs based on the circle.yml file.
The CI creates the Docker image and uploads it to Docker Hub. We use a private repository.
In the next step, the CI uploads a special JSON file to S3. This file tells EB where to get the image from, which image to use and other parameters.
As the last step of the delivery, it creates a new application version in EB.

Process Diagram

CI Docker EB Deployment High Level Architecture

The description and diagram above cover the deployment part from CI (GitHub) to AWS (EB).
They don’t describe the last part, deploying a new application revision in EB.
I will describe that later in this post.


This post describes how to work with a private repository in Docker Hub.
In order to work with the private repository, there are several permissions we need to set.

  • CircleCI needs to be able to:
    1. Upload an image to Docker Hub
    2. Upload a JSON file to a bucket in S3
    3. Call an AWS command on Elastic Beanstalk (create a new application version)
  • AWS EB needs to be able to:
    1. Pull (get/list) data from the S3 bucket
    2. Pull an image from Docker Hub

I am omitting the part of creating users in GitHub, CircleCI, Docker Hub and AWS.

Docker authentication

Before we set up authentication, we need to log in to Docker and create a dockercfg file.

dockercfg file

Docker has a special configuration file, usually named .dockercfg.
We need to produce this file for the user who has permissions to upload images to Docker Hub and to download them.
In order to create it, you need to run the following command:
docker login
This command will create the file in ~/.docker/.dockercfg
If you want to create this file for a different email (user), use the -e option.
Check the docker login documentation.
The format of the file is different between Docker version 1.6 and 1.7.
Currently, we need to use the 1.6 format; otherwise AWS will not be able to connect to the repository.

“Older” Version, Docker 1.6

  "": {
    "auth": "AUTH_KEY",
    "email": "DOCKER_EMAIL"

Newer (Docker 1.7) version of the configuration file

This will probably be the file that was generated in your computer.

  "auths": {
    "": {
      "auth": "AUTH_KEY",
      "email": "DOCKER_EMAIL"

The correct format is based on the Docker version EB uses.
We need to add it to an accessible S3 bucket. This is explained later in the post.

Uploading from Circleci to Docker Hub

Setting up a user in Docker Hub

  1. In Docker Hub, create a team (for your organisation).
  2. In the repository, click ‘Collaborators’ and add this team with write permission.
  3. Under the organisation, click on Teams. Add the “deployer” user to the team. This is the user that has the dockercfg file previously described.

I created a special user, with a specific email, specifically for that.
The user in that team (with write permission) needs to have a dockercfg file.

Setting up circle.yml file with Docker-Hub Permissions

The documentation explains how to set the permissions, but we did it differently.
In the deployment part, we manipulated the dockercfg file.
Here’s the relevant part in our circle.yml file:

  - |
    cat > ~/.dockercfg << EOF
    {
      "": {
        "auth": "$DOCKER_AUTH",
        "email": "$DOCKER_EMAIL"
      }
    }
    EOF

CircleCI uses environment variables, so we need to set them as well.
We need to set the Docker authentication key and email.
Later we’ll set more.

Setting Environment Variables in circleci

Under the project’s settings in CircleCI, click Environment Variables.

Settings -> Environment Variables

Add two environment variables: DOCKER_AUTH and DOCKER_EMAIL.
The values should be the ones from the dockercfg file that was created previously.

Upload a JSON file to a bucket in S3

Part of the deployment cycle is to upload a JSON descriptor file to S3.
So Circleci needs to have permissions for this operation.
We’ll use the IAM permission policies of AWS.
I decided to have one S3 bucket for all deployments of all projects.
It will make my life much easier because I will be able to use the same user, permissions and policies.
Each project / deployable part will be in a different directory.

Following are the steps to set up the AWS environment.

  1. Create the deployment bucket
  2. Create a user in AWS (or decide to use an existing one)
  3. Keep the user’s credentials provided by AWS (downloaded) at hand
  4. Create a policy in AWS that allows the user to:
    1. access the bucket
    2. create an application version in EB
  5. Add this policy to the user (that is set in CircleCI)
  6. Set environment variables in CircleCI with the credentials provided by AWS

Creating the Policy

In AWS, go to IAM and click Policies in left navigation bar.
Click Create Policy.
You can use the policy manager, or you can create the following policy:

    "Version": "2012-10-17",
    "Statement": [
            "Sid": "Stmt1443479777000",
            "Effect": "Allow",
            "Action": [
            "Resource": [
            "Sid": "Stmt1443479924000",
            "Effect": "Allow",
            "Action": [
            "Resource": [

As mentioned above, this policy allows access to a specific bucket (MY_DEPLOY_BUCKET) and its sub-directory.
It also allows triggering the creation of a new application version in EB.
This policy will be used by the user who is registered in CircleCI.

AWS Permissions in Circleci

CircleCI has a special setting for AWS integration.
In the left navigation bar, click AWS Permissions.
Put the access key and secret in the correct fields.
You should have these keys from the credentials file that was produced by AWS.

Pull (get/list) data from S3 bucket

We now need to give the EB instances access to get some data from S3.
The EB instance will need to get the dockercfg file (described earlier).
In EB, you can set an instance profile. This profile gives the instance permissions.
But first, we need to create a policy, the same way we did earlier.

Create a Policy in AWS

    "Version": "2012-10-17",
    "Statement": [
            "Sid": "Stmt1443508794000",
            "Effect": "Allow",
            "Action": [
            "Resource": [

This policy gives read access to the deployment bucket and its sub-directories.
The EB instance needs access to the root directory of the bucket because this is where I put the dockercfg file.
It needs the sub-directory access because this is the location where CircleCI uploads the JSON descriptor files.

Set this policy for the EB instance

In the EB dashboard:

  1. Go to Application Dashboard (click the application you are setting) ➜
  2. Click the environment you want to automatically deploy ➜
  3. Click Configuration in the left navigation bar ➜
  4. Click the settings button of the instances ➜
  5. You will see Instance profile.
    You need to set a role.
    Make sure that this role has the policy you created in the previous step. ➜
  6. Apply changes

Pull an image from Docker-Hub

In order to let the EB instance download images from Docker Hub, we need to give it permissions.
EB uses the dockercfg file for that.
Upload the dockercfg file (described above) to the bucket that EB has permission to access (in my example: MY_DEPLOY_BUCKET).
Put it in the root directory of the bucket.
Later, you will set environment variables in CircleCI with this file name.
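For example, uploading it with the AWS CLI could look like the following (the bucket name is just the placeholder used in this post, and the file is assumed to be in the current directory):

aws s3 cp .dockercfg s3://MY_DEPLOY_BUCKET/.dockercfg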

Setting Up Circleci Scripts

After setting up all permissions and environments, we are ready to set up the CircleCI scripts.
Circleci uses circle.yml file to configure the steps for building the project.
In this section, I will explain how to configure this file for continuous deployment using Docker and EB.
Other elements in that file are out of scope.
I added the sample scripts to GitHub.

circle.yml File

Following are the relevant parts in the circle.yml file

# This is a Docker deployment
    - docker
# Setting the tag for Docker-hub
# MY_IMAGE_NAME is hard coded in this file. The project’s environment variables are not passed at this stage.

# An example for one environment
# The ‘automatic-.*’ is a hook so we can automatically deploy from different branches.
# Usually we deploy automatically after a pull-request is merged to master.
    branch: [master, /automatic-.*/]
# This is our way of setting docker cfg credentials. We set the project’s environment variables with the values.
      - |
          cat > ~/.dockercfg << EOF
          {
              "": {
                  "auth": "$DOCKER_AUTH",
                  "email": "$DOCKER_EMAIL"
              }
          }
          EOF
# Sample for RoR project. Not relevant specifically to Docker.
      - bundle package --all
# Our Dockerfile is located under the directory: docker-images
      - docker build -t $DOCKER_IMAGE -f docker-images/ .
      - docker push $DOCKER_IMAGE
# Calling script for uploading JSON descriptor file
      - sh ./ $TAG
# Calling script for setting new application version in AWS EB
      - sh ./ $TAG 

Template Descriptor File

AWS EB uses a JSON file in order to have the information about the Docker Hub image.
It needs to know where the image is (organisation, image, tag).
It also needs to know where to get the dockercfg file from.
Put this file in your root directory of the project.

  "AWSEBDockerrunVersion": "1",
  "Authentication": {
    "Bucket": "<DEPLOYMENT_BUCKET>",
  "Image": {
    "Update": "true"
  "Ports": [
      "ContainerPort": "<EXPOSED_PORTS>"

The first script we run will replace the tags and create a new file.
The environment variables list is described below.

Script that manipulates the descriptor template file

Put this file in your root directory of the project.

#! /bin/bash

# Prefix of file name is the tag.

# Replacing tags in the file and creating a file.

# Uploading json file to $S3_PATH
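The script body itself is not reproduced here. The following is only a minimal sketch of such a script, based on the comments above and on the environment variables listed later in this post; the template file name (Dockerrun.aws.json.template) and the use of sed for the substitutions are my assumptions:

#! /bin/bash
# Sketch only - file names and the exact substitutions are assumptions.
TAG=$1

# Prefix of file name is the tag.
DOCKERRUN_FILE=$TAG-Dockerrun.aws.json

# Replacing tags in the template file and creating a new file.
sed -e "s|<DEPLOYMENT_BUCKET>|$DEPLOYMENT_BUCKET|g" \
    -e "s|<AUTHENTICATION_KEY>|$AUTHENTICATION_KEY|g" \
    -e "s|<IMAGE_NAME>|$IMAGE_NAME:$TAG|g" \
    -e "s|<EXPOSED_PORTS>|$EXPOSED_PORTS|g" \
    Dockerrun.aws.json.template > $DOCKERRUN_FILE

# Uploading json file to S3
S3_PATH="s3://$DEPLOYMENT_BUCKET/$BUCKET_DIRECTORY"
aws s3 cp $DOCKERRUN_FILE $S3_PATH/$DOCKERRUN_FILE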

Script that adds a new application version to EB

The last automated step is to trigger AWS EB with a new application version.
Using a label and a different image per commit (in master) helps track which version is on which environment.
Even if we use a single environment (“real” continuous deployment), it’s easier to track and also to roll back.
Put this file in your root directory of the project.

#! /bin/bash


# Run aws command to create a new EB application with label
aws elasticbeanstalk create-application-version --region=$REGION --application-name $AWS_APPLICATION_NAME \
    --version-label $DOCKER_TAG --source-bundle S3Bucket=$DEPLOYMENT_BUCKET,S3Key=$BUCKET_DIRECTORY/$DOCKERRUN_FILE

Setting up environment variables in circleci

In order to make the scripts and configuration files reusable, I used environment variables all over the place.
Following are the environment variables I am using in the configuration file and scripts.

AUTHENTICATION_KEY – The name of the dockercfg file, which is in the S3 bucket.
AWS_APPLICATION_NAME – Name of the application in EB
BUCKET_DIRECTORY – The directory where we upload the JSON descriptor files
DEPLOYMENT_BUCKET – S3 bucket name
DOCKER_AUTH – The auth key to connect to dockerhub (created using docker login)
DOCKER_EMAIL – The email of the auth key
EXPOSED_PORTS – Docker ports
IMAGE_NAME – Every Docker image has a name. Then it is: Organisation:Image-Name
REGION – AWS region of the EB application

Some of the environment variables in the script/configuration files are provided by circleci (such as CIRCLE_SHA1 and CIRCLE_BRANCH)

Deploying in AWS EB

Once an application version is uploaded to EB, we can decide to deploy it to an environment in EB.
Follow these steps:

  1. In EB, in the application dashboard, click Application Versions in the left nav bar
  2. You will see a table with all labeled versions. Check the version you want to deploy (the SHA1 helps identify the commit and the content of the deployment)
  3. Click deploy
  4. Select environment
  5. You’re done
AWS EB Application Versions


Once you do the setup for one project, it is easy to reuse the scripts and permissions for other projects.
Having this CD procedure makes deployment and version tracking an easy task.
The next step, deploying the new version to an EB environment, is very easy, and I will add a different post for that.

Sample files in GitHub

Edit: This is helpful for setting AWS permissions –


SRP as part of SOLID

Clean Code Alliance organized a meetup about SOLID principles.
I had the opportunity to talk about the Single Responsibility Principle as part of SOLID.
It’s a presentation I gave several times in the past.

It was fun talking about it.
There were many interesting and challenging questions, which gave me lots of things to think of.

SRP as part of SOLID

Single Responsibility Principle (SRP) is part of the SOLID acronym. The SOLID principles help us design better code. Applying those principles helps us have maintainable code, fewer bugs and easier testing. SRP is the foundation of better designed code. In this session I will introduce the SOLID principles and explain in more detail what SRP is all about. Applying those principles is not sci-fi; it is real, and I will demonstrate it.
Yesterday I gave a talk in a meetup about the SRP in SOLID.

Eyal Golan is a Senior Java developer and agile practitioner. Responsible for building the high throughput, low latency server infrastructure. Manages the continuous integration and deployment of the system. Leading the coding practices. Practicing TDD and clean code. On the path to software craftsmanship.

Following me, Hayim Makabee gave a really interesting talk about The SOLID Principles Illustrated by Design Patterns

Here are the slides.

And the video (in Hebrew)

Thanks to the organizers, Boris and Itzik, and mostly to the audience, who seemed very interested.


dropwizard-jobs – My First Open Source Contribution

I am very excited today.
Today I made an actual contribution to the open source community.
I helped publish Java libraries to Maven Central.

The library we published is a plugin for dropwizard that uses quartz:

You can check it out. The README explains how to use it.
In this post I will not explain the plugin, but I will share my contribution experience to an open source project.

Why Even Contribute

There are so many reasons. Google is full of them.
I did it because I really wanted to help the community (in this case, the originators of the code).
It improves my skill set. I now know more than I knew before.
I was exposed to technologies and processes which I usually don’t use.
It is part of my digital signature and branding.

Why This Project

I have known about Dropwizard for more than a year.
I didn’t have the chance to use it at work.
I did some experiments with Dropwizard to get the feel of it.

In one of my POCs, I wanted to create a scheduling mechanism in the micro-service I created.
By searching Google, I found this project.
First of all, I liked what it does and how.
I also liked the explanation (how to use it). It’s clear and I could work with it immediately.
I think the developers did a good job.

How It (my contribution) All Started

But one thing was missing: it wasn’t in a Maven repository (central or any other public repository).
So I asked whether the developers plan to publish it.

Issue #10 in the repository shows my question and the beginning of the conversation.
Issue 10, question from 2015/02/24

Basically the problem was the time needed to comply with the requirements. The code itself was working.

My Contribution

I took it upon myself to publish it to a public Maven repository.
I had never done something like that, so I wasn’t sure what to do.
I thought of using Bintray by JFrog.
Eventually I decided to use Sonatype. It felt more comfortable. So I started reading about OSSRH (Open Source Project Repository Hosting).
There’s an explanation of that below.

I forked the code to my GitHub account and used pull requests in order to merge the code I pushed.
I mostly modified the pom files to comply with Sonatype requirements, as explained in the tutorials.

Once we were all set, I did the actual publishing.
And now it’s there. Everyone can use it.

At first I was extra careful with any change. After all, “it’s not my code”…
Over time, I felt more comfortable modifying and pull requesting.

How To Upload to Sonatype

I used the tutorials, which explain clearly what to do.

  1. Create a user at OSSRH
  2. Open an issue with links to GitHub, the group ID and the artifact ID
  3. Follow the instructions (in our case, I had to modify the Maven group ID)
  4. Add the correct plugins to the pom file
    pgp – read it carefully
  5. Deploy



Fedora Installation

Aggregate Installation Tips

One of the reasons I am writing this blog is to keep a “log” for myself of how I resolved issues.

In this post I will describe how I installed several basic development tools on a Fedora OS.
I want this laptop to be my workstation for out-of-work projects.

Almost everything in this post can be found elsewhere in the web.
Actually, most of what I am writing here is from other links.

However, this post is intended to aggregate several installations together.

If you’re new to Linux (or not an expert, as I am not), you can learn some basic stuff here.
How to install (yum), how to build from source code, how to set up environment variables and maybe other stuff.

First, we’ll start with how I installed Fedora.

Installing Fedora

I downloaded the Fedora ISO from
It is the Gnome distribution.
I then used to create a self bootable USB. It’s very easy to use.
I switched to KDE by running: sudo yum install @kde-desktop

Installing Java

Download the rpm package from the Oracle site.

Under /etc/profile.d/, create a file with the following content:
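The content of that file isn't preserved above; a minimal sketch, assuming the Oracle JDK rpm installed itself under /usr/java/latest and the file is named java.sh (both are assumptions), would be:

# /etc/profile.d/java.sh - the JDK location is an assumption
export JAVA_HOME=/usr/java/latest
export PATH=$PATH:$JAVA_HOME/bin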

I used the following link; here’s how to install the JDK

Installing Intellij



After installation, you can go to /opt/idea/latest/bin and run
Once you run it, you will be prompted to create a desktop entry.
You can create a command line launcher later on as well.

Installing eclipse


Create executable /usr/bin/eclipse
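A common form of that launcher, assuming eclipse was extracted to /opt/eclipse (adjust the path to your installation), is a small wrapper script:

#!/bin/sh
# /usr/bin/eclipse - assumes eclipse was extracted to /opt/eclipse
export ECLIPSE_HOME=/opt/eclipse
$ECLIPSE_HOME/eclipse "$@"

Don't forget to make it executable: chmod +x /usr/bin/eclipse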

Create Desktop Launcher

See also

Installing Maven


Setting maven environment
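The environment setup isn't preserved above; a minimal sketch, assuming Maven was extracted to /usr/local/maven and the file is placed under /etc/profile.d (the location and file name are assumptions):

# /etc/profile.d/maven.sh - the Maven location is an assumption
export M2_HOME=/usr/local/maven
export PATH=$PATH:$M2_HOME/bin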

Installing git

I wanted to have the latest git client.
Using yum install did not give me the latest version, so I decided to install from source code.
I found a great blog explaining how to do it.
Note: in the compile part, he uses export to /etc/bashrc.
Don’t do that. Instead, create a file under /etc/profile.d
Installation commands
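The commands themselves aren't preserved above; building git from source on Fedora generally looks something like this sketch (the package list is the usual set of build prerequisites; the version number and install prefix are only examples):

# build prerequisites
sudo yum install curl-devel expat-devel gettext-devel openssl-devel zlib-devel perl-ExtUtils-MakeMaker gcc
# download and extract a release (version is only an example)
curl -O https://www.kernel.org/pub/software/scm/git/git-2.5.0.tar.gz
tar xzf git-2.5.0.tar.gz
cd git-2.5.0
# build and install (prefix is only an example)
make prefix=/usr/local/git all
sudo make prefix=/usr/local/git install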

git Environment
Create an ‘sh’ file under /etc/profile.d
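A sketch of such a file, assuming git was installed with the /usr/local/git prefix used in the sketch above:

# /etc/profile.d/git.sh - the prefix is an assumption
export PATH=/usr/local/git/bin:$PATH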


Working with Legacy Test Code

Legacy Code and Smell by Tests

Working with unit tests can help in many ways to improve the code-base.
One of the aspects, which I mostly like, is that tests can point us to code smell in the production code.
For example, if a test needs a large setup or asserts many outputs, it can indicate that the unit under test doesn’t follow good design, such as SRP and other OOD principles.

But sometimes the tests themselves are poorly structured or designed.
In this post I will give two examples of such cases, and show how I solved them.

Test Types

(or layers)
There are several types, or layers, of tests.

  • Unit Tests
    Unit test should be simple to describe and to understand.
    Those tests should run fast. They should test one thing. One unit (method?) of work.
  • Integration Tests
    Integration tests are more vague in definition.
    What kind of modules do they check?
    Integration of several modules together? Dependency-Injector wiring?
    Test using real DB?
  • Behavioral Tests
    Those tests will verify the features.
    They may be the interface between the PM / PO to the dev team.
  • End2End / Acceptance / Staging / Functional
    High level tests. May run on production or production-like environment.

Complexity of Tests

Basically, the “higher level” the test, the more complex it is.
Also, the ratio between the possible number of tests and the production code increases dramatically per test level.
Unit tests grow linearly as the code grows.
But starting with integration tests and higher-level ones, the options start to grow at an exponential rate.
A simple calculation:
if two classes interact with each other, and each has 2 methods, how many options should we check if we want to cover everything? And imagine that those methods have some control flow, like an if statement.

Sporadically Failing Tests

There are many reasons for a test to be “problematic”.
One of the worst is a test that sometimes fails and usually passes.
The team ignores the CI’s mails. It creates noise in the system.
You can never be sure if there’s a bug or something was broken or it’s a false alarm.
Eventually we’ll disable the CI because “it doesn’t work and it’s not worth the time”.

Integration Test and False Alarm

Any type of test is subject to false alarms if we don’t follow basic rules.
The higher the test level, the greater the chance of false alarms.
In integration tests, there’s a higher chance of false alarms due to external resource issues:
no internet connection, no DB connection, random misses and many more.

Our Test Environment

Our system is “quasi legacy”.
It’s not exactly legacy because it has tests. Those tests even have good coverage.
It is legacy because of the way it is (un)structured and the way the tests are built.
It used to be covered only by integration tests.
In the past few months we started implementing unit tests. Especially on new code and new features.

All of our integration tests inherit from BaseTest, which inherits Spring’s AbstractJUnit4SpringContextTests.
The test’s context wires everything, about 95% of the production code.
It takes time, but even worse, it connects to real external resources, such as MongoDB and services that connect to the internet.

In order to improve test speed, a few weeks ago I changed MongoDB to an embedded one. It improved the running time of the tests by an order of magnitude.

This type of setup makes testing much harder.
It’s very difficult to mock services. The environment is not isolated from the internet and DB and much more.

After this long introduction, I want to describe two problematic tests and the way I fixed them.
Their common failing attribute was that they sometimes failed and usually passed.
However, each failed for different reason.

Case Study 1 – Creating Internet Connection in the Constructor

The first example shows a test, which sometimes failed because of connection issues.
The tricky part was that a service was created in the constructor.
That service got an HttpClient, which was also created in the constructor.

Another issue was that I couldn’t modify the test to use mocks instead of Spring wiring.
Here’s the original constructor (modified for the example):

private HttpClient httpClient;
private MyServiceOne myServiceOne;
private MyServiceTwo myServiceTwo;

public ClassUnderTest(PoolingClientConnectionManager httpConnenctionManager, int connectionTimeout, int soTimeout) {
	HttpParams httpParams = new BasicHttpParams();
	HttpConnectionParams.setConnectionTimeout(httpParams, connectionTimeout);
	HttpConnectionParams.setSoTimeout(httpParams, soTimeout);
	HttpConnectionParams.setTcpNoDelay(httpParams, true);
	httpClient = new DefaultHttpClient(httpConnenctionManager, httpParams);

	myServiceOne = new MyServiceOne(httpClient);
	myServiceTwo = new MyServiceTwo();
}

The tested method used myServiceOne, and the test sometimes failed because of connection problems in that service.
Another problem was that the result (from the web) wasn’t always deterministic, and therefore the test sometimes failed.

The way the code is written does not enable us to mock the services.

In the test code, the class under test was injected using @Autowired annotation.

The Solution – Extract and Override Call

Idea was taken from Working Effectively with Legacy Code.

  1. Identify what I need to fix.
    In order to make the test deterministic, without a real connection to the internet, I need access to the services’ creation.
  2. Introduce protected methods that create those services.
    Instead of creating the services in the constructor, the constructor will call those methods.
  3. In the test environment, create a class that extends the class under test.
    This class will override those methods and return fake (mocked) services.

Solution’s Code

public ClassUnderTest(PoolingClientConnectionManager httpConnenctionManager, int connectionTimeout, int soTimeout) {
	HttpParams httpParams = new BasicHttpParams();
	HttpConnectionParams.setConnectionTimeout(httpParams, connectionTimeout);
	HttpConnectionParams.setSoTimeout(httpParams, soTimeout);
	HttpConnectionParams.setTcpNoDelay(httpParams, true);
	this.httpClient = createHttpClient(httpConnenctionManager, httpParams);
	this.myServiceOne = createMyServiceOne(httpClient);
	this.myServiceTwo = createMyServiceTwo();
}

protected HttpClient createHttpClient(PoolingClientConnectionManager httpConnenctionManager, HttpParams httpParams) {
	return new DefaultHttpClient(httpConnenctionManager, httpParams);
}

protected MyServiceOne createMyServiceOne(HttpClient httpClient) {
	return new MyServiceOne(httpClient);
}

protected MyServiceTwo createMyServiceTwo() {
	return new MyServiceTwo();
}
private MyServiceOne mockMyServiceOne = mock(MyServiceOne.class);
private MyServiceTwo mockMyServiceTwo = mock(MyServiceTwo.class);
private HttpClient mockHttpClient = mock(HttpClient.class);

private class ClassUnderTestForTesting extends ClassUnderTest {

	private ClassUnderTestForTesting(int connectionTimeout, int soTimeout) {
		super(null, connectionTimeout, soTimeout);
	}

	@Override
	protected HttpClient createHttpClient(PoolingClientConnectionManager httpConnenctionManager, HttpParams httpParams) {
		return mockHttpClient;
	}

	@Override
	protected MyServiceOne createMyServiceOne(HttpClient httpClient) {
		return mockMyServiceOne;
	}

	@Override
	protected MyServiceTwo createMyServiceTwo() {
		return mockMyServiceTwo;
	}
}

Now, instead of wiring the class under test, I created it in the @Before method.
It accepts other services (not described here). I got those services using @Autowired.

Another note: before creating the special class-for-test, I ran all the integration tests of this class in order to verify that the refactoring didn’t break anything.
I also restarted the server locally and verified everything works.
It’s important to do those verifications when working with legacy code.

Case Study 2 – Statistical Tests for Random Input

The second example describes a test that failed due to random results and statistical assertion.

The code did a randomized selection between objects with similar attributes (I am simplifying the scenario here).
The Random object was created in the class’s constructor.

Simplified Example:

private Random random;

public ClassUnderTest() {
	random = new Random();
	// more stuff
}

//The method is package protected so we can test it
MyPojo select(List<MyPojo> pojos) {
	// do something
	int randomSelection = random.nextInt(pojos.size());
	// do something
	return pojos.get(randomSelection);
}

The original test did a statistical analysis.
I’ll just explain it, as it is too complicated and verbose to write it.
It had a loop of 10K iterations. Each iteration called the method under test.
It had a Map that counted the number of occurrences (returned result) per MyPojo.
Then it checked whether each MyPojo was selected about (10K / Number-Of-MyPojos) times, with some kind of deviation (0.1).
Say we have 4 MyPojo instances in the list.
Then the assertion verified that each instance was selected between 2400 and 2600 times (10K / 4) with deviation of 10%.

You can expect, of course, that sometimes the test failed. Increasing the deviation would only reduce the number of false failures.

The Solution – Overload a Method

  1. Overload the method under test.
    The overloaded method gets an additional parameter of the same type as the class’s field.
  2. Move the code from the original method to the new one.
    Make sure you use the parameter of the method and not the class’s field. Different names can help here.
  3. Test the newly created method with a mock.

Solution Code

private Random random;

// Nothing changed in the constructor
public ClassUnderTest() {
	random = new Random();
	// more stuff
}

// Overloaded method
private MyPojo select(List<MyPojo> pojos) {
	return select(pojos, this.random);
}

//The method is package protected so we can test it
MyPojo select(List<MyPojo> pojos, Random inRandom) {
	// do something
	int randomSelection = inRandom.nextInt(pojos.size());
	// do something
	return pojos.get(randomSelection);
}

Working with legacy code can be challenging and fun.
Working with legacy test code can be fun as well.
It feels really good to stop receiving annoying emails about failing tests.
It also increases the team’s trust in the CI process.


Dropwizard, MongoDB and Gradle Experimenting


I created a small project using Dropwizard, MongoDB and Gradle.
It actually started as an experiment with Guava cache as a buffer for sending counters to MongoDB (or any other DB).
I wanted to try Gradle with a MongoDB plugin as well.
Next, I wanted to create some kind of interface to check this framework, and I decided to try out Dropwizard.
And this is how this project was created.

This post is not a tutorial of using any of the chosen technologies.
It is a small showcase, which I did as an experimentation.
I guess there are some flaws and maybe I am not using all “best practices”.
However, I do believe that the project, with the help of this post, can be a good starting point for the different technologies I used.
I also tried to show some design choices, which help achieving SRP, decoupling, cohesion etc.

I decided to begin the post with the use-case description and how I implemented it.
After that, I will explain what I did with Gradle, MongoDB (and embedded) and Dropwizard.

Before I begin, here’s the source code:

The Use-Case: Counters With Buffer

We have some input requests into our servers.
During the process of a request, we choose to “paint” it with some data (decided by some logic).
Some requests will be painted by Value-1, some by Value-2, etc. Some will not be painted at all.
We want to limit the number of painted requests (per paint value).
In order to have a limit, for each paint-value we know the maximum, but we also need to count (per paint value) the number of painted requests.
As the system has several servers, the counters should be shared by all servers.

The latency is crucial. Normally we get 4-5 milliseconds per request processing (for the whole flow, not just the painting).
So we don’t want the counter increments to increase the latency.
Instead, we keep a buffer; the client sends ‘increase’ to the buffer.
The buffer periodically updates the repository with a “bulk increment”.

I know it is possible to use Hazelcast or Couchbase or some other similar fast in-memory DB directly.
But for our use-case, this was the best solution.

The principle is simple:

  • The dependent module will call a service to increase a counter for some key
  • The implementation keeps a buffer of counters per key
  • It is thread safe
  • The writing happens in a separate thread
  • Each write will do a bulk increase
Counters High Level Design


For the buffer, I used Google Guava cache.

Buffer Structure

private final LoadingCache<Counterable, BufferValue> cache;

this.cache = CacheBuilder.newBuilder()
	.expireAfterWrite(bufferConfiguration.getExpireAfterWriteInSec(), TimeUnit.SECONDS)
	.expireAfterAccess(bufferConfiguration.getExpireAfterAccessInSec(), TimeUnit.SECONDS)
	.removalListener((notification) -> increaseCounter(notification))
	.build(new BufferValueCacheLoader());

(Counterable is described below)

BufferValueCacheLoader extends the abstract class CacheLoader.
When we call increase (see below), we first get the value from the cache by key.
If the key does not exist, the loader returns a new BufferValue.

public class BufferValueCacheLoader extends CacheLoader<Counterable, BufferValue> {
	@Override
	public BufferValue load(Counterable key) {
		return new BufferValue();
	}
}

BufferValue wraps an AtomicInteger (I would need to change it to Long at some point)

Increase the Counter

public void increase(Counterable key) {
	BufferValue meter = cache.getUnchecked(key);
	int currentValue = meter.increment();
	if (currentValue > threashold) {
		if (meter.compareAndSet(currentValue, currentValue - threashold)) {
			increaseCounter(key, threashold);
		}
	}
}

When increasing a counter, we first get the current value from the cache (with the help of the loader, as described above).
compareAndSet atomically checks whether the value is the same (not modified by another thread).
If so, it updates the value and returns true.
If it succeeded (returned true), the buffer calls the updater.

View the buffer

After developing the service, I wanted a way to view the buffer.
So I implemented the following method, which is used by the front-end layer (Dropwizard’s resource).
It’s a small example of Java 8 streams and lambda expressions.

return ImmutableMap.copyOf(cache.asMap())
		.entrySet().stream()
		.collect(Collectors.toMap((entry) -> entry.getKey().toString(),
				(entry) -> entry.getValue().getValue()));


I chose MongoDB for two reasons:

  1. We have a similar implementation in our system, for which we decided to use MongoDB as well.
  2. It is easy to use with an embedded server.

I tried to design the system so it is possible to choose any other persistence implementation and swap it in.

I used Morphia as the MongoDB client layer instead of using the Java client directly.
With Morphia you create a DAO, which is the connection to a MongoDB collection.
You also declare a simple Java bean (POJO) that represents a document in a collection.
Once you have the DAO, you can do operations on the collection the “Java way”, with a fairly easy API.
You can have queries and any other CRUD operations, and more.

I had two operations: increasing counter and getting all counters.
The services implementations do not extend Morphia’s BasicDAO, but instead have a class that inherits from it.
I used composition (over inheritance) because I wanted to have more behavior in both services.

In order to be consistent with the key representation, and to hide the way it is implemented from the dependent code, I used an interface: Counterable with a single method: counterKey().

public interface Counterable {
	String counterKey();
}

final class MongoCountersDao extends BasicDAO<Counter, ObjectId> {
	MongoCountersDao(Datastore ds) {
		super(Counter.class, ds);
	}
}

Increasing the Counter

protected void increaseCounter(String key, int value) {
	Query<Counter> query = dao.createQuery();
	UpdateOperations<Counter> ops = dao.getDs().createUpdateOperations(Counter.class).inc("count", value);
	dao.getDs().update(query, ops, true);
}

Embedded MongoDB

In order to run tests on the persistence layer, I wanted to use an in-memory database.
There’s a MongoDB Gradle plugin for that.
With this plugin you can run a server by just creating it at runtime, or run it as a goal in Maven / a task in Gradle.

Embedded MongoDB on Gradle

I will elaborate more on Gradle later, but here’s what I needed to do in order to set up the embedded Mongo.

dependencies {
	// More dependencies here
	testCompile 'com.sourcemuse.gradle.plugin:gradle-mongo-plugin:0.4.0'
}

Setup Properties

mongo {
	//	logFilePath: The desired log file path (defaults to 'embedded-mongo.log')
	logging 'console'
	mongoVersion 'PRODUCTION'
	port 12345
	//	storageLocation: The directory location from where embedded Mongo will run, such as /tmp/storage (defaults to a java temp directory)
}

Embedded MongoDB Gradle Tasks

startMongoDb will just start the server. It will keep running until you stop it.
stopMongoDb will stop it.
startManagedMongoDb test: two tasks which will start the embedded server before the tests run. The server will shut down when the JVM finishes (when the tests finish).
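For example, running the tests together with the managed embedded server from the command line could look like this (assuming a plain gradle installation on the path):

gradle startManagedMongoDb test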

Although I have only touched the tip of the iceberg, I have started to see the strength of Gradle.
It wasn’t even that hard to set up the project.

Gradle Setup

First, I created a Gradle project in Eclipse (after installing the plugin).
I needed to set up the dependencies. Very simple. Just like Maven.

One Big JAR Output

When I want to create one big jar from all the libraries in Maven, I use the shade plugin.
I was looking for something similar, and found the gradle-one-jar plugin.
I added that plugin:
apply plugin: 'gradle-one-jar'
Then I added one-jar to the buildscript classpath:

buildscript {
	repositories { mavenCentral() }
	dependencies {
		classpath 'com.sourcemuse.gradle.plugin:gradle-mongo-plugin:0.4.0'
		classpath 'com.github.rholder:gradle-one-jar:1.0.4'
	}
}

And added a task:

mainClassName = 'org.eyalgo.server.dropwizard.CountersBufferApplication'
task oneJar(type: OneJar) {
	mainClass = mainClassName
	archiveName = 'counters.jar'
	mergeManifestFromJar = true
}

Those were the necessary actions I needed to do in order to make the application run.


Dropwizard is a stack of libraries that makes it easy to create web servers quickly.
It uses Jetty for HTTP and Jersey for REST. It has other mature libraries for creating complicated services.
It can be used to easily develop a microservice.

As I explained in the introduction, I will not cover all of Dropwizard features and/or setup.
There are plenty of sites for that.
I will briefly cover the actions I did in order to make the application run.

Gradle Run Task

run { args 'server', './src/main/resources/config/counters.yml' }
The first argument is server. The second argument is the location of the configuration file.
If you don’t give Dropwizard the first argument, you will get a nice error message listing the possible options.

positional arguments:
  {server,check}         available commands

I already showed how to create one jar in the Gradle section.


In Dropwizard, you set up the application using a class that extends Configuration.
The fields in the class should align with the properties in the yml configuration file.

It is a good practice to put the properties in groups, based on their usage/responsibility.
For example, I created a group for mongo parameters.

In order for the configuration class to read the sub-groups correctly, you need to create a class that aligns with the properties in the group.
Then, in the main configuration, add this class as a member and mark it with the @JsonProperty annotation.

@JsonProperty
private MongoServicesFactory servicesFactory = new MongoServicesFactory();
@JsonProperty
private BufferConfiguration bufferConfiguration = new BufferConfiguration();

Example: Changing the Ports

Here’s part of the configuration file that sets the ports for the application.

  adminMinThreads: 1
  adminMaxThreads: 64
    - type: http
      port: 9090
    - type: http
      port: 9091

Health Check

Dropwizard gives a basic admin API out of the box. I changed its port to 9091.
I created a health check for the MongoDB connection.
You need to extend HealthCheck and implement the check method.

private final MongoClient mongo;

@Override
protected Result check() throws Exception {
	try {
		// access MongoDB here (for example, list the database names)
		return Result.healthy();
	} catch (Exception e) {
		return Result.unhealthy("Cannot connect to " + mongo.getAllAddress());
	}
}

Other features are pretty much self-explanatory or as simple as in any getting-started tutorial.

Ideas for Enhancement

There are some things I may try to add.

  • Add tests to the Dropwizard section.
    This project started as a PoC, so, unlike usual, I skipped the tests in the server part.
    Dropwizard has Testing Dropwizard, which I want to try.
  • A different persistence implementation (Couchbase? Hazelcast?).
  • Injection using Google Guice, and with the help of that, injecting different persistence implementations.

That’s all.
Hope that helps.

Source code:
