dropwizard-jobs – My First Open Source Contribution

I am very excited today.
Today I made an actual contribution to the open source community.
I helped publish a Java library to Maven Central.

The library we published is a plugin for Dropwizard that uses Quartz:
https://github.com/spinscale/dropwizard-jobs

You can check it out; the README explains how to use it.
In this post I will not explain the plugin itself, but I will share my experience contributing to an open source project.

Why Even Contribute

There are so many reasons; Google is full of them.
I did it because I really wanted to help the community (in this case, the originators of the code).
It improves my skill set; I now know more than I knew before.
It exposed me to technologies and processes I don't usually use.
And it's part of my digital signature and branding.

Why This Project

I have known about Dropwizard for more than a year, but I haven't had the chance to use it at work.
I did some experiments with Dropwizard to get the feel of it.

In one of my POCs, I wanted to create a scheduling mechanism in the micro-service I was building.
Searching Google, I found this project.
First of all, I liked what it does and how it does it.
I also liked the explanation of how to use it. It's clear, and I could work with it immediately.
I think the developers did a good job.

How My Contribution Started

But one thing was missing: it wasn't in a Maven repository (Central or any other public repository).
So I asked whether the developers planned to publish it.

Issue #10 in the repository shows my question and the beginning of the conversation.
(Issue #10, question from 2015/02/24)

Basically, the problem was the time required to comply with the publishing requirements. The code itself was working.

My Contribution

I took it upon myself to publish it to a public Maven repository.
I had never done anything like that, so I wasn't sure what to do.
I thought of using Bintray by JFrog.
Eventually I decided to use Sonatype; it felt more comfortable. So I started reading about OSSRH (Open Source Software Repository Hosting).
There's an explanation of the process below.

I forked the code to my GitHub account and used pull requests to merge the code I pushed.
I mostly modified the pom files to comply with Sonatype's requirements, as explained in the tutorials.

Once we were all set, I did the actual publishing.
And now it's there. Everyone can use it.

At first I was extra careful with every change. After all, "it's not my code"…
Over time, I felt more comfortable modifying the code and opening pull requests.

How To Upload to Sonatype

I used the tutorials, which clearly explain what to do:
http://central.sonatype.org/pages/ossrh-guide.html

  1. Create a user at OSSRH
  2. Open an issue with links to GitHub, plus the group ID and artifact ID
  3. Follow the instructions (in our case, I had to modify the Maven group ID)
  4. Add the required plugins to the pom file (see the sketch below):
    maven – sources, javadoc and deploy/release plugins
    pgp – read that part carefully
  5. Deploy
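
For reference, here is the kind of addition the pom needed. This is a sketch of the Sonatype requirements rather than the project's actual pom; version numbers are illustrative:

<!-- a sketch; see the OSSRH guide for the full requirements -->
<distributionManagement>
  <snapshotRepository>
    <id>ossrh</id>
    <url>https://oss.sonatype.org/content/repositories/snapshots</url>
  </snapshotRepository>
</distributionManagement>

<build>
  <plugins>
    <!-- sources and javadoc jars are mandatory for Central -->
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-source-plugin</artifactId>
      <version>2.2.1</version>
      <executions>
        <execution>
          <id>attach-sources</id>
          <goals><goal>jar-no-fork</goal></goals>
        </execution>
      </executions>
    </plugin>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-javadoc-plugin</artifactId>
      <version>2.9.1</version>
      <executions>
        <execution>
          <id>attach-javadocs</id>
          <goals><goal>jar</goal></goals>
        </execution>
      </executions>
    </plugin>
    <!-- every artifact must be GPG-signed -->
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-gpg-plugin</artifactId>
      <version>1.5</version>
      <executions>
        <execution>
          <id>sign-artifacts</id>
          <phase>verify</phase>
          <goals><goal>sign</goal></goals>
        </execution>
      </executions>
    </plugin>
  </plugins>
</build>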


Fedora Installation

Aggregate Installation Tips

One of the reasons I am writing this blog is to keep a “log” for myself of how I resolved issues.

In this post I will describe how I installed several basic development tools on Fedora.
I want this laptop to be my workstation for out-of-work projects.

Almost everything in this post can be found elsewhere on the web.
Actually, most of what I am writing here comes from other links.

However, this post is intended to aggregate several installations together.

If you're new to Linux (or not an expert, as I am not), you can learn some basic stuff here:
how to install with yum, how to build from source code, how to set up environment variables, and maybe more.

First, we’ll start with how I installed Fedora.

Installing Fedora

I downloaded the Fedora ISO from https://getfedora.org/en/workstation/.
It is the GNOME distribution.
I then used http://www.linuxliveusb.com/ to create a bootable USB; it's very easy to use.
I switched to KDE by running: sudo yum install @kde-desktop

Installing Java

Download the RPM package from the Oracle site.

Under /etc/profile.d/ , create a file (jdk_home.sh) that sets JAVA_HOME and updates the PATH.
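A minimal sketch of that file (assuming the Oracle RPM put the JDK under /usr/java/latest; adjust the path to your installation):

# /etc/profile.d/jdk_home.sh -- path is an assumption
export JAVA_HOME=/usr/java/latest
export PATH=$JAVA_HOME/bin:$PATH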

I used the following link, which explains how to install the JDK:
http://www.if-not-true-then-false.com/2014/install-oracle-java-8-on-fedora-centos-rhel/

Installing IntelliJ

Location: https://www.jetbrains.com/idea/download/

Check https://www.jetbrains.com/idea/help/basics-and-installation.html

After installation, you can go to /opt/idea/latest/bin and run idea.sh.
Once you run it, you will be prompted to create a desktop entry.
You can also create a command line launcher later on.

Installing Eclipse

Location: http://www.eclipse.org/downloads/

Create an executable launcher at /usr/bin/eclipse.
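A sketch of such a launcher, assuming Eclipse was extracted to /opt/eclipse:

#!/bin/sh
# launcher script -- adjust ECLIPSE_HOME to the extraction directory
export ECLIPSE_HOME=/opt/eclipse
$ECLIPSE_HOME/eclipse "$@"

Then make it executable: chmod 755 /usr/bin/eclipse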

Create a desktop launcher.
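A sketch of the desktop entry (e.g. /usr/share/applications/eclipse.desktop; the icon path is an assumption):

[Desktop Entry]
Name=Eclipse
Comment=Eclipse IDE
Exec=eclipse
Icon=/opt/eclipse/icon.xpm
Type=Application
Categories=Development;IDE;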

See also http://www.if-not-true-then-false.com/2010/linux-install-eclipse-on-fedora-centos-red-hat-rhel/

Installing Maven

Download Maven from https://maven.apache.org/download.cgi and extract the archive.

Setting the Maven environment:
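A minimal sketch of /etc/profile.d/maven.sh; the M2_HOME path is an assumption and should match wherever the archive was extracted:

# /etc/profile.d/maven.sh -- adjust M2_HOME to the extracted location
export M2_HOME=/usr/local/apache-maven
export PATH=$M2_HOME/bin:$PATH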

Installing git

I wanted to have the latest git client.
Using yum install did not get me the latest version, so I decided to install from source code.
I found a great blog explaining how to do it:
http://tecadmin.net/install-git-2-0-on-centos-rhel-fedora/
Note: in the compile part, the author exports to /etc/bashrc .
Don't do that. Instead, create a file under /etc/profile.d .
Installation commands:
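Roughly the commands I ran, based on the linked tutorial (the version number is illustrative; pick whatever is current):

# build dependencies
sudo yum install curl-devel expat-devel gettext-devel openssl-devel zlib-devel gcc perl-ExtUtils-MakeMaker
# download, build and install under /usr/local/git
wget https://www.kernel.org/pub/software/scm/git/git-2.0.4.tar.gz
tar xzf git-2.0.4.tar.gz
cd git-2.0.4
make prefix=/usr/local/git all
sudo make prefix=/usr/local/git install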

git Environment
Create an ‘sh’ file under /etc/profile.d
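For example, /etc/profile.d/git.sh (assuming the /usr/local/git prefix above):

# /etc/profile.d/git.sh
export PATH=/usr/local/git/bin:$PATH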


Java 8 Stream and Lambda Expressions – Parsing File Example

Recently I wanted to extract certain data from an output log.
Here’s part of the log file:

2015-01-06 11:33:03 b.s.d.task [INFO] Emitting: eVentToRequestsBolt __ack_ack [-6722594615019711369 -1335723027906100557]
2015-01-06 11:33:03 c.s.p.d.PackagesProvider [INFO] ===---> Loaded package com.foo.bar
2015-01-06 11:33:04 b.s.d.executor [INFO] Processing received message source: eventToManageBolt:2, stream: __ack_ack, id: {}, [-6722594615019711369 -1335723027906100557]
2015-01-06 11:33:04 c.s.p.d.PackagesProvider [INFO] ===---> Loaded package co.il.boo
2015-01-06 11:33:04 c.s.p.d.PackagesProvider [INFO] ===---> Loaded package dot.org.biz

I decided to do it using the Java 8 Stream and lambda expression features.

Read the file
First, I needed to read the log file and put the lines in a Stream:

Stream<String> lines = Files.lines(Paths.get(args[1]));

Filter relevant lines
I needed to get the package names and write them into another file.
Not all lines contained the data I needed, so I filtered only the relevant ones.

lines.filter(line -> line.contains("===---> Loaded package"))

Parsing the relevant lines
Then I needed to parse the relevant lines.
I did it by first splitting each line into an array of Strings and then taking the last element of that array.
In other words, I did a double mapping: first a line to an array, and then an array to a String.

.map(line -> line.split(" "))
.map(arr -> arr[arr.length - 1])

Writing to the output file
The last part was taking each string and writing it to a file. That was the terminal operation.

.forEach(packageName -> writeToFile(fw, packageName));

writeToFile is a method I created.
The reason is that FileWriter's write throws IOException, and you can't let a checked exception escape a lambda expression passed to forEach. So writeToFile wraps it in a RuntimeException.

Here's the full example (note: I don't validate the input).

import java.io.FileWriter;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Stream;

public class App {
	public static void main(String[] args) throws IOException {
		Stream<String> lines = null;
		if (args.length == 2) {
			lines = Files.lines(Paths.get(args[1]));
		} else {
			String s1 = "2015-01-06 11:33:03 b.s.d.task [INFO] Emitting: adEventToRequestsBolt __ack_ack [-6722594615019711369 -1335723027906100557]";
			String s2 = "2015-01-06 11:33:03 b.s.d.executor [INFO] Processing received message source: eventToManageBolt:2, stream: __ack_ack, id: {}, [-6722594615019711369 -1335723027906100557]";
			String s3 = "2015-01-06 11:33:04 c.s.p.d.PackagesProvider [INFO] ===---> Loaded package com.foo.bar";
			String s4 = "2015-01-06 11:33:04 c.s.p.d.PackagesProvider [INFO] ===---> Loaded package co.il.boo";
			String s5 = "2015-01-06 11:33:04 c.s.p.d.PackagesProvider [INFO] ===---> Loaded package dot.org.biz";
			List<String> rows = Arrays.asList(s1, s2, s3, s4, s5);
			lines = rows.stream();
		}
		
		new App().parse(lines, args[0]);

	}
	
	private void parse(Stream<String> lines, String output) throws IOException {
		final FileWriter fw = new FileWriter(output);
		
		//@formatter:off
		lines.filter(line -> line.contains("===---> Loaded package"))
		.map(line -> line.split(" "))
		.map(arr -> arr[arr.length - 1])
		.forEach(packageName -> writeToFile(fw, packageName));
		//@formatter:on
		fw.close();
		lines.close();
	}

	private void writeToFile(FileWriter fw, String packageName) {
		try {
			fw.write(String.format("%s%n", packageName));
		} catch (IOException e) {
			throw new RuntimeException(e);
		}
	}

}


Playing With Java Concurrency

Recently I needed to transform some files, each containing a list (array) of objects in JSON format, into files with separate lines of the same data (one object per line).

It was a one-time task and a simple one.
I did the reading and writing using some features of Java NIO.
I used Gson in the simplest way.
One thread runs over the files, converts, and writes.

The whole operation finished in a few seconds.

However, I wanted to play a little bit with concurrency.
So I enhanced the tool to work concurrently:

Threads
A Runnable for reading a file.
The reader runnables are submitted to an ExecutorService.
The output, which is a list of objects (User in the example), is put in a BlockingQueue.

A Runnable for writing a file.
Each writer runnable polls from the blocking queue
and writes lines of data to a file.
I don't submit the writer Runnable to the ExecutorService; instead I just start a plain thread with it.
The runnable has a while (someBoolean is true) {...} pattern.
More about that below…

Synchronizing Everything
The BlockingQueue is the interface between the two types of threads.

As the writer runnable runs in a while loop (a consumer), I wanted to be able to make it stop so the tool would terminate.
So I used two objects for that:

Semaphore
The loop that traverses the input files counts them (numberOfFiles).
The semaphore starts with zero permits. Once I finished traversing the input files and submitting the readers, I blocked the main thread on the semaphore:
semaphore.acquire(numberOfFiles);

Each writer runnable releases the semaphore after it finishes writing a file:
semaphore.release();

AtomicBoolean
The while loop of the writers checks an AtomicBoolean.
As long as the AtomicBoolean is true, the writer continues.

In the main thread, just after acquiring the semaphore, I set the AtomicBoolean to false.
This lets the writer threads terminate.

Using Java NIO
In order to scan, read and write the file system, I used some features of Java NIO:

Scanning: Files.newDirectoryStream(inputFilesDirectory, "*.json");
Deleting the output directory before starting: Files.walkFileTree(...)
BufferedReader and BufferedWriter: Files.newBufferedReader(filePath); Files.newBufferedWriter(fileOutputPath, Charset.defaultCharset());

One note: in order to generate random input files for this example, I used Apache commons-lang: RandomStringUtils.randomAlphabetic.
All the code is in GitHub.

import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.File;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.file.DirectoryStream;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Consumer;

import com.google.gson.Gson;

// the User entity class is in the GitHub repo
public class JsonArrayToJsonLines {
	private final static Path inputFilesDirectory = Paths.get("src\\main\\resources\\files");
	private final static Path outputDirectory = Paths
			.get("src\\main\\resources\\files\\output");
	private final static Gson gson = new Gson();
	
	private final BlockingQueue<EntitiesData> entitiesQueue = new LinkedBlockingQueue<>();
	
	private AtomicBoolean stillWorking = new AtomicBoolean(true);
	private Semaphore semaphore = new Semaphore(0);
	int numberOfFiles = 0;

	private JsonArrayToJsonLines() {
	}

	public static void main(String[] args) throws IOException, InterruptedException {
		new JsonArrayToJsonLines().process();
	}

	private void process() throws IOException, InterruptedException {
		deleteFilesInOutputDir();
		final ExecutorService executorService = createExecutorService();
		DirectoryStream<Path> directoryStream = Files.newDirectoryStream(inputFilesDirectory, "*.json");
		
		for (int i = 0; i < 2; i++) {
			new Thread(new JsonElementsFileWriter(stillWorking, semaphore, entitiesQueue)).start();
		}

		directoryStream.forEach(new Consumer<Path>() {
			@Override
			public void accept(Path filePath) {
				numberOfFiles++;
				executorService.submit(new OriginalFileReader(filePath, entitiesQueue));
			}
		});
		
		semaphore.acquire(numberOfFiles);
		stillWorking.set(false);
		shutDownExecutor(executorService);
	}

	private void deleteFilesInOutputDir() throws IOException {
		Files.walkFileTree(outputDirectory, new SimpleFileVisitor<Path>() {
			@Override
			public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) throws IOException {
				Files.delete(file);
				return FileVisitResult.CONTINUE;
			}
		});
	}

	private ExecutorService createExecutorService() {
		int numberOfCpus = Runtime.getRuntime().availableProcessors();
		return Executors.newFixedThreadPool(numberOfCpus);
	}

	private void shutDownExecutor(final ExecutorService executorService) {
		executorService.shutdown();
		try {
			if (!executorService.awaitTermination(120, TimeUnit.SECONDS)) {
				executorService.shutdownNow();
			}

			if (!executorService.awaitTermination(120, TimeUnit.SECONDS)) {
				// give up -- the pool did not terminate even after shutdownNow()
			}
		} catch (InterruptedException ex) {
			executorService.shutdownNow();
			Thread.currentThread().interrupt();
		}
	}


	private static final class OriginalFileReader implements Runnable {
		private final Path filePath;
		private final BlockingQueue<EntitiesData> entitiesQueue;

		private OriginalFileReader(Path filePath, BlockingQueue<EntitiesData> entitiesQueue) {
			this.filePath = filePath;
			this.entitiesQueue = entitiesQueue;
		}

		@Override
		public void run() {
			Path fileName = filePath.getFileName();
			// try-with-resources closes the reader when parsing is done or fails
			try (BufferedReader br = Files.newBufferedReader(filePath)) {
				User[] entities = gson.fromJson(br, User[].class);
				System.out.println("---> " + fileName);
				entitiesQueue.put(new EntitiesData(fileName.toString(), entities));
			} catch (IOException | InterruptedException e) {
				throw new RuntimeException(filePath.toString(), e);
			}
		}
	}

	private static final class JsonElementsFileWriter implements Runnable {
		private final BlockingQueue<EntitiesData> entitiesQueue;
		private final AtomicBoolean stillWorking;
		private final Semaphore semaphore;

		private JsonElementsFileWriter(AtomicBoolean stillWorking, Semaphore semaphore,
				BlockingQueue<EntitiesData> entitiesQueue) {
			this.stillWorking = stillWorking;
			this.semaphore = semaphore;
			this.entitiesQueue = entitiesQueue;
		}

		@Override
		public void run() {
			while (stillWorking.get()) {
				try {
					EntitiesData data = entitiesQueue.poll(100, TimeUnit.MILLISECONDS);
					if (data != null) {
						try {
							String fileOutput = outputDirectory.toString() + File.separator + data.fileName;
							Path fileOutputPath = Paths.get(fileOutput);
							// try-with-resources closes (and flushes) the writer even on failure
							try (BufferedWriter writer = Files.newBufferedWriter(fileOutputPath, Charset.defaultCharset())) {
								for (User user : data.entities) {
									writer.append(gson.toJson(user));
									writer.newLine();
								}
							}
							System.out.println("=======================================>>>>> " + data.fileName);
						} catch (IOException e) {
							throw new RuntimeException(data.fileName, e);
						} finally {
							// signal the main thread that one more file is done
							semaphore.release();
						}
					}
				} catch (InterruptedException e1) {
					// interrupted while polling -- go around the loop and re-check the flag
				}
			}
		}
	}

	private static final class EntitiesData {
		private final String fileName;
		private final User[] entities;

		private EntitiesData(String fileName, User[] entities) {
			this.fileName = fileName;
			this.entities = entities;
		}
	}
}


Using Groovy for Bash (shell) Operations

Recently I needed to create a Groovy script that deletes some directories on a Linux machine.
Here's why:
1.
We have a server for running scheduled jobs,
jobs such as ETL from one DB to another, file to DB, etc.
The server activates clients (agents), which are located on the machines we want to act on.
Most (almost all) of the jobs are written as Groovy scripts.

2.
Part of our CI process is deploying a WAR to a dedicated server.
Then we have a script that, among other things, points a 'webapps' soft link to the newly created directory.
This deployment happens once an hour, which fills up the dedicated server quickly.

So I needed to create a script that checks all the directories in that location and deletes the old ones.
I decided to keep the latest 4 directories.
It's currently a magic number in the script. If I want / need to, I can make it an input parameter. But I decided to start simple.

The approach is very simple:
1. List all directories with the prefix webapps_ in a known location
2. Sort them by time, descending, and delete all of them starting at index 4

def numberOfDirectoriesToKeep = 4
def webappsDir = new File('/usr/local/tomcat/tomcat_aps')
def webDirectories = webappsDir.listFiles().grep(~/.*webapps_.*/)
def numberOfWebappsDirectories = webDirectories.size();

if (numberOfWebappsDirectories >= numberOfDirectoriesToKeep) {
  webDirectories.sort { it.lastModified() }.reverse()[numberOfDirectoriesToKeep..numberOfWebappsDirectories-1].each {
    // logger is provided by the job framework that runs the script
    logger.info("Deleting ${it}");
    // here we'll delete the file. First try was a Java/groovy command for deleting directories
  }
} else {
  logger.info("Too few web directories")
}

It didn't work.
Files were not deleted.
It turned out that the agent runs as a different user than the one that runs Tomcat,
and so it did not have permission to remove the directories.

My solution was to run a shell command with sudo.

I found references at:
http://www.joergm.com/2010/09/executing-shell-commands-in-groovy/
and
http://groovy.codehaus.org/Executing+External+Processes+From+Groovy

To make a long story short, the final script runs the delete as a shell command with sudo. Here's the essence of it:
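
A sketch of the final script (as above, logger comes from the job framework, and the agent user is assumed to have a sudo rule for rm):

def numberOfDirectoriesToKeep = 4
def webappsDir = new File('/usr/local/tomcat/tomcat_aps')
def webDirectories = webappsDir.listFiles().grep(~/.*webapps_.*/)
def numberOfWebappsDirectories = webDirectories.size();

if (numberOfWebappsDirectories >= numberOfDirectoriesToKeep) {
  webDirectories.sort { it.lastModified() }.reverse()[numberOfDirectoriesToKeep..numberOfWebappsDirectories-1].each {
    logger.info("Deleting ${it}")
    // run the delete as an external process with sudo, since the agent
    // user lacks permission to remove tomcat-owned directories
    def process = ['sudo', 'rm', '-rf', it.absolutePath].execute()
    process.waitFor()
    if (process.exitValue() != 0) {
      logger.info("Failed deleting ${it}: ${process.err.text}")
    }
  }
} else {
  logger.info("Too few web directories")
}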

BTW,
there's a minor bug with the indexes, which I decided not to fix (for now), as we always have more directories than we keep.


JUnit Rules

Introduction
In this post I would like to show an example of how to use JUnit Rule to make testing easier.

Recently I inherited a rather complex system, in which not everything is tested, and even the tested code is complex.
Mostly I see a lack of test isolation.
(I will write a different blog post about working with legacy code.)

One of the tests (and the code under it) that I am fixing actually tests several components together.
It also connects to the DB. It tests some logic and the intersection between components.
When the code did not compile in a totally different location, the test could not run, because it loaded the whole Spring context.
The structure was that before testing (any class), the whole Spring context was initialized.
The tests extend BaseTest, which loads the whole Spring context.

BaseTest also cleans the DB in the @After method.

Important note: this article is about changing tests that are not structured entirely correctly.
When creating new code and tests, they should be isolated, test one thing, etc.
Better tests should use a mock DB / mock dependencies, etc.
After I fix the tests and refactor, I'll have the confidence to make more changes.

Back to our topic…
So, what I got was a slow test suite, no isolation, and even problems running tests due to unrelated compilation problems.

So I decided to separate the context loading and the DB connection, and to separate both of them from the cleanup of the database.

Approach
In order to achieve that I did three things:
The first was to change the inheritance of the test class.
It stopped inheriting BaseTest.
Instead it inherits AbstractJUnit4SpringContextTests.
Now I can create my own context per test and not load everything.

The other two were a @ClassRule and a @Rule:
The @ClassRule is responsible for the DB connection.
The @Rule cleans up the DB before / after each test.

But first, what are JUnit Rules?
A short explanation would be that they provide the possibility to intercept test methods, similar to the AOP concept.
A @Rule allows us to intercept the method before and after its actual run.
A @ClassRule intercepts the test class run.
A well-known @Rule is JUnit's TemporaryFolder.

(Similar to @Before, @After and @BeforeClass.)

Creating @Rule
The easy part was to create a Rule that cleans up the DB before and after a test method.
You need to implement TestRule, which has one method: Statement apply(Statement base, Description description);
You can do a lot with it.
I found that usually I will have an inner class that extends Statement.
The rule I created did not create the DB connection, but got it in the constructor.
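Here's the idea in code. This is a minimal sketch rather than the original rule; the TRUNCATE-based cleanup and the table name are purely illustrative:

import java.sql.Connection;
import java.sql.SQLException;

import org.junit.rules.TestRule;
import org.junit.runner.Description;
import org.junit.runners.model.Statement;

public class DbCleanupRule implements TestRule {
	private final Connection connection;

	public DbCleanupRule(Connection connection) {
		this.connection = connection;
	}

	@Override
	public Statement apply(final Statement base, Description description) {
		return new Statement() {
			@Override
			public void evaluate() throws Throwable {
				cleanDb(); // clean before the test method
				try {
					base.evaluate(); // run the actual test
				} finally {
					cleanDb(); // clean after, even when the test fails
				}
			}
		};
	}

	private void cleanDb() throws SQLException {
		// illustrative cleanup; java.sql.Statement is fully qualified to
		// avoid a clash with org.junit.runners.model.Statement
		try (java.sql.Statement stmt = connection.createStatement()) {
			stmt.execute("TRUNCATE TABLE some_table");
		}
	}
}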

Creating @ClassRule
A ClassRule is actually also a TestRule.
The only difference from Rule is how we use it in our test code.
I'll show that below.

The challenge in creating this rule was that I wanted to use a Spring context to get the correct connection.
(ExternalResource is itself a TestRule.)
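A minimal sketch, assuming a hypothetical db-context.xml that defines a DataSource bean; the real rule loaded whatever minimal context was needed for the connection:

import java.sql.Connection;
import java.sql.SQLException;

import javax.sql.DataSource;

import org.junit.rules.ExternalResource;
import org.springframework.context.support.ClassPathXmlApplicationContext;

public class DbConnectionRule extends ExternalResource {
	private ClassPathXmlApplicationContext context;
	private Connection connection;

	@Override
	protected void before() throws Throwable {
		// load only the beans needed for the DB connection, not everything
		context = new ClassPathXmlApplicationContext("db-context.xml");
		DataSource dataSource = context.getBean(DataSource.class);
		connection = dataSource.getConnection();
	}

	@Override
	protected void after() {
		try {
			connection.close();
		} catch (SQLException e) {
			throw new RuntimeException(e);
		} finally {
			context.close();
		}
	}

	public Connection getConnection() {
		return connection;
	}
}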

(Did you see that I could make DbCleanupRule inherit ExternalResource?)

Using it
The last part is how we use the rules.
A @Rule must be a public field.
A @ClassRule must be a public static field.

And there it is:
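A hypothetical sketch of wiring the two rules above into a test class (the class and test names are illustrative):

import org.junit.ClassRule;
import org.junit.Rule;
import org.junit.Test;

public class ComponentIntersectionTest {

	@ClassRule
	public static DbConnectionRule connectionRule = new DbConnectionRule();

	@Rule
	public DbCleanupRule cleanupRule = new DbCleanupRule(connectionRule.getConnection());

	@Test
	public void testIntersection() {
		// the test body can use the DB, knowing it was cleaned before the run
	}
}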

That’s all.
Hope it helps.

Eyal

[Edit]
I got some good remarks from Logan Mzz at DZone: http://java.dzone.com/articles/junit-rules#comment-125673

  1. Link to JUnit Rules: https://github.com/junit-team/junit/wiki/Rules
  2. There's the ErrorCollector rule, which avoids annoying test-fail-fix cycles for a single test.
  3. And RuleChain, which is described in the comment.
