Parse Elasticsearch Results Using Ruby

One of the modules in our project is an Elasticsearch cluster.
To fine-tune the configuration (shards, replicas, mapping, etc.) and the queries, we created a JMeter environment.

I wanted to test a simple query with many different input parameters, each of which returns results.
That is, every query should be for documents that exist.

The setup for JMeter is simple.
I created the query I wanted to check as a POST parameter.
Instead of hard-coding one specific value in that query, which would send the same values over and over, I used a parameter.
I directed JMeter to read the parameter values from a CSV file.
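
The effect of the CSV-driven parameter can be sketched in Ruby: one request body per value in the data file. This only illustrates the substitution idea and is not how JMeter works internally. The query template, the `${FIELD_1}` placeholder name and the `render_requests` method are assumptions made for the example.

```ruby
require 'json'

# Hypothetical query template. ${FIELD_1} marks the spot where a
# per-request value from the data file is substituted.
TEMPLATE = '{"query":{"term":{"FIELD_1":"${FIELD_1}"}}}'

# Returns one request body per value, with the placeholder filled in.
def render_requests(template, values)
  # The block form of gsub inserts the value literally.
  values.map { |value| template.gsub('${FIELD_1}') { value } }
end

requests = render_requests(TEMPLATE, %w[123 12345 4456])
puts requests
```

Each rendered string is a different query, so the cluster is not asked the same thing on every request.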

The next step was to create that data file: a file consisting of rows with real values from the cluster.

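For that I used another query, which I ran against the cluster using curl.
It asks for up to 10,000 documents where LIVE is true and FIELD_1 exists, and returns only FIELD_1.
(I changed some of the parameter names.)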

{
   "fields":[
      "FIELD_1"
   ],
   "size":10000,
   "query":{
      "constant_score":{
         "filter":{
            "bool":{
               "must":[
                  {
                     "term":{
                        "LIVE":true
                     }
                  },
                  {
                     "exists":{
                        "field":"FIELD_1"
                     }
                  }
               ]
            }
         }
      }
   }
}
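
If the field name or the size changes often, the same request body can be built from a Ruby hash instead of being edited by hand. This is a minimal sketch; the method name `build_exists_query` and its parameters are made up for the example.

```ruby
require 'json'

# Builds the request body shown above for an arbitrary field name.
# The method name and its parameters are illustrative only.
def build_exists_query(field, size = 10_000)
  {
    'fields' => [field],
    'size'   => size,
    'query'  => {
      'constant_score' => {
        'filter' => {
          'bool' => {
            'must' => [
              { 'term'   => { 'LIVE' => true } },
              { 'exists' => { 'field' => field } }
            ]
          }
        }
      }
    }
  }
end

body = JSON.generate(build_exists_query('FIELD_1'))
puts body
```

The resulting string can be saved to a file and passed to curl as the request data.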

I piped the result into a file.
Here’s a sample of the file (I changed the names of the index, document type and values for this example):

{
  "took" : 586,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 63807792,
    "max_score" : 1.0,
    "hits" : [ {
      "_index" : "my_index",
      "_type" : "the_document",
      "_id" : "1111111",
      "_score" : 1.0,
      "fields" : {
        "FIELD_1" : "123"
      }
    }, {
      "_index" : "my_index",
      "_type" : "the_document",
      "_id" : "22222222",
      "_score" : 1.0,
      "fields" : {
        "FIELD_1" : "12345"
      }
    }, {
      "_index" : "my_index",
      "_type" : "the_document",
      "_id" : "33333333",
      "_score" : 1.0,
      "fields" : {
        "FIELD_1" : "4456"
      }
    } ]
  }
}

The next step was to parse this JSON file, take only FIELD_1 from each hit, and write the values to a new file.
For that I used Ruby:

#!/usr/bin/ruby

require 'rubygems' # only needed on Ruby 1.8 to load gems; harmless on later versions
require 'json'

input_file = ARGV[0]
output_file = ARGV[1]

# Parse the whole response and drill down to the array of hits.
obj = JSON.parse(File.read(input_file))
actual_hits = obj['hits']['hits']

# The block form of File.open closes the file even if an exception is raised,
# and errors are reported instead of being silently swallowed.
File.open(output_file, "w") do |file|
  actual_hits.each do |hit|
    # Each hit carries the requested fields under 'fields'.
    file.puts(hit['fields']['FIELD_1'])
  end
end
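
The extraction step can also be wrapped in a method, which makes it easy to try on a small sample without reading or writing files. This is a sketch under assumptions: the method name `extract_field` is made up, and skipping hits that lack the field is a defensive choice of the example (the query above should only return hits that have it).

```ruby
require 'json'

# Returns the value of `field` for every hit in an Elasticsearch response.
# Hits without that field are skipped.
def extract_field(response_json, field)
  hits = JSON.parse(response_json)['hits']['hits']
  hits.map { |hit| (hit['fields'] || {})[field] }.compact
end

# A trimmed-down response in the same shape as the sample above.
sample = <<-EOS
{
  "hits" : {
    "total" : 3,
    "hits" : [
      { "_id" : "1111111", "fields" : { "FIELD_1" : "123" } },
      { "_id" : "22222222", "fields" : { "FIELD_1" : "12345" } },
      { "_id" : "33333333" }
    ]
  }
}
EOS

values = extract_field(sample, 'FIELD_1')
puts values
```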

Important note:
There’s a shorter, better way to write to a file in Ruby:

File.write(output_file, field1)

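Unfortunately I can’t use it: File.write was added in Ruby 1.9.3, and I have an older Ruby version that I can’t upgrade in our sandbox environment.
Note also that File.write replaces the file’s content on every call, so all the values would first have to be joined into one string.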
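
A script that has to run on both old and new Ruby versions can check for File.write at runtime and fall back to File.open. This is a minimal sketch; the helper name `write_lines` and the temporary file name are assumptions made for the example.

```ruby
require 'tmpdir'

# Writes one value per line. Uses File.write when the running Ruby provides
# it and falls back to the block form of File.open otherwise.
def write_lines(path, values)
  # Build the whole content first, because File.write replaces the file.
  content = values.map { |value| "#{value}\n" }.join
  if File.respond_to?(:write)
    File.write(path, content)
  else
    File.open(path, 'w') { |file| file.write(content) }
  end
end

# Hypothetical output location, used only to demonstrate the helper.
path = File.join(Dir.tmpdir, 'field_1_values.txt')
write_lines(path, %w[123 12345 4456])
```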


Learn Ruby

I decided to learn a new language that is different from Java, and thought that Ruby would be a good candidate.
After searching the web for good tutorials that go beyond a basic ‘Hello World’, I found this one:
http://rubykoans.com/

I liked that the learning is step by step, and it feels like there’s a lot of thought behind it.
What I liked most was the TDD approach of the learning method.

At the beginning the tests are simple and teach the basic syntax and semantics.
Gradually the tests become more interesting and complex.
Each test file starts with ‘about_’.

The first interesting and more challenging test was the triangle, not because of the “algorithm” but because it uses another file. I tried to write a separate class for validation, just to make it a habit.
Then came ‘calculate score’. The basic solution was simple, but it took me several iterations to make the code cleaner and to get familiar with Ruby Hash usage.
I still wonder whether I really made it better and cleaner.
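
To show where a Hash helps in the score exercise, here is a minimal sketch that counts the dice with a Hash and then applies the Greed scoring rules from the koan. It is one possible approach written for illustration, not necessarily the solution I ended up with.

```ruby
# Scores a roll of dice using the Greed rules described in the koan:
#   three 1s                  => 1000 points
#   three of any other number => 100 * number
#   each remaining 1          => 100 points
#   each remaining 5          => 50 points
def score(dice)
  counts = Hash.new(0) # missing keys default to 0
  dice.each { |die| counts[die] += 1 }

  total = 0
  counts.each do |number, count|
    if count >= 3
      total += (number == 1 ? 1000 : number * 100)
      count -= 3 # the triple is used up; score what is left
    end
    total += count * 100 if number == 1
    total += count * 50 if number == 5
  end
  total
end

puts score([1, 5, 5, 1])
```

Hash.new(0) removes the need to check whether a key exists before incrementing it, which is most of what made the counting code shorter.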

The inheritance exercise was simple, although I needed to grasp the meaning of “inheritance and cross-methods”.
Modules: at first I didn’t really understand the use of modules in a class. What are they good for?
“About scope” helped me understand it.
I really like the idea that the tests are built one on top of the other.
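
As for what a module in a class is good for: a common use is sharing behaviour between classes that have no common superclass. The sketch below is a small illustration with made-up names (Speaker, Dog, Robot); it is not taken from the koans.

```ruby
# A module groups behaviour that unrelated classes can share.
module Speaker
  # Relies on the including class to provide `name` and `sound`.
  def speak
    "#{name} says #{sound}"
  end
end

class Dog
  include Speaker # mixes the module's methods into Dog instances
  attr_reader :name

  def initialize(name)
    @name = name
  end

  def sound
    'woof'
  end
end

class Robot
  include Speaker
  attr_reader :name

  def initialize(name)
    @name = name
  end

  def sound
    'beep'
  end
end

puts Dog.new('Fido').speak
puts Robot.new('R2').speak
```

Dog and Robot do not inherit from each other, yet both get speak from the module, which is roughly what an abstract base class or a default interface method would provide in Java.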

‘About Proxy’ is really interesting! It took me the longest time to solve.
It builds on many previous assignments and was a real challenge. I had to revisit classes, symbols, arrays and many other tests.
There’s an extra credit assignment, which I’ll do next.
In the meantime, here’s the code on GitHub:

https://github.com/eyalgo/ruby-koans-exercise
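
The general technique behind a proxy object, forwarding calls with method_missing while recording them, can be sketched as follows. This is an illustrative sketch with hypothetical names (RecordingProxy, messages), not the solution from the repository above.

```ruby
# Forwards every call it does not define itself to a target object and
# remembers the names of the forwarded messages.
class RecordingProxy
  attr_reader :messages

  def initialize(target)
    @target = target
    @messages = []
  end

  # Called by Ruby when the proxy itself has no method with this name.
  def method_missing(name, *args, &block)
    if @target.respond_to?(name)
      @messages << name
      @target.send(name, *args, &block)
    else
      super # keep the usual NoMethodError for unknown methods
    end
  end

  # Keeps respond_to? consistent with what method_missing forwards.
  def respond_to_missing?(name, include_private = false)
    @target.respond_to?(name, include_private) || super
  end
end

proxy = RecordingProxy.new([3, 1, 2])
proxy.push(4)
proxy.size
```

After these two calls, proxy.messages holds the symbols :push and :size, and the wrapped array contains four elements.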

Resources
http://rubykoans.com/
http://www.ruby-doc.org/
https://github.com/bbatsov/ruby-style-guide