Arelastic for your Elasticsearch Queries pt.1

Update 08/2015: Elasticsearch-ruby has its own DSL. It’s officially supported by Elasticsearch so be sure to check it out.

When doing more than just a simple search with Elasticsearch-rails, a naive approach will lead you to this mess:

response = Article.search query:     { match:  { title: "Fox Dogs" } },
                          highlight: { fields: { title: {} } }

Don’t be a sucker! Read on to see how Arelastic can save you pain.

If you don’t know already, the Elasticsearch (ES) API takes a JSON hash when performing queries, filters, or both. Elasticsearch’s DSL is incredibly flexible, but it can quickly lead you down the path of unmaintainable hashes.

Here is the JSON for the always-popular filtered query, which searches for a string among only the documents that pass the filter. In this case, the filter keeps documents created on or after yesterday:

{
  "filtered": {
    "query": {
      "match": { "tweet": "full text search" }
    },
    "filter": {
      "range": { "created": { "gte": "now-1d/d" }}
    }
  }
}

Now let’s say you want to change the query from “full text search” to “Rihanna” (obviously). One might be tempted to have a method that drops in a variable:

def filtered_gte_query(query_term, filter_term)
  {
    filtered: {
      query: {
        match: { tweet: query_term }
      },
      filter: {
        range: { created: { gte: filter_term }}
      }
    }
  }
end

And now we’re walking down the path to hell. Heaven forbid we want a filter that does less than or equal to, supports AND queries, or is chainable. The number of methods and hashes will spiral out of control.

Introducing Arelastic

Modeled after Rails’ Arel, which is a SQL Abstract Syntax Tree (AST) manager, Arelastic is an AST manager for Elasticsearch queries.

Rather than working with hashes, you work with objects that represent nodes in the AST:

range = Arelastic::Builders::Filter['book_id'].gteq(filter_term)
filter = Arelastic::Searches::Filter.new(range)

query = Arelastic::Queries::Match.new "tweet", query_term

dsl = Arelastic::Searches::Query.new(
  Arelastic::Queries::Filtered.new(query, filter)).as_elastic

Article.search dsl

Here’s one interesting line of code:

Arelastic::Builders::Filter['book_id'].gteq(...)

The gteq method exists alongside many other range filters, which makes it possible to build a chainable API much like ActiveRecord’s. Funnily enough, there is an ElasticRecord built on top of this, and it predates the new official Elasticsearch-rails.
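To make the idea of "objects that represent nodes" concrete, here is a minimal sketch of a query AST. It is not Arelastic's source: the `TinyAst` module and every class in it are hypothetical names invented for illustration. The only convention borrowed from the examples above is that each node answers `as_elastic` with its own piece of the hash.

```ruby
# Hypothetical mini-AST, for illustration only; this is not Arelastic's code.
module TinyAst
  # Leaf node: a full-text match on one field.
  class Match
    def initialize(field, text)
      @field = field
      @text = text
    end

    def as_elastic
      { "match" => { @field => @text } }
    end
  end

  # Leaf node: a range comparison (gte, lte) on one field.
  class RangeFilter
    def initialize(field, bounds)
      @field = field
      @bounds = bounds
    end

    def as_elastic
      { "range" => { @field => @bounds } }
    end
  end

  # Composite node: a query narrowed by a filter. It delegates to its children.
  class Filtered
    def initialize(query, filter)
      @query = query
      @filter = filter
    end

    def as_elastic
      {
        "filtered" => {
          "query"  => @query.as_elastic,
          "filter" => @filter.as_elastic
        }
      }
    end
  end

  # Builder: FilterBuilder["created"].gteq(value) returns a RangeFilter node.
  class FilterBuilder
    def self.[](field)
      new(field)
    end

    def initialize(field)
      @field = field
    end

    def gteq(value)
      RangeFilter.new(@field, { "gte" => value })
    end

    def lteq(value)
      RangeFilter.new(@field, { "lte" => value })
    end
  end
end

range = TinyAst::FilterBuilder["created"].gteq("now-1d/d")
query = TinyAst::Match.new("tweet", "Rihanna")
dsl   = TinyAst::Filtered.new(query, range).as_elastic
```

Supporting "less than or equal to" is one more small method on the builder rather than another copy of the whole hash, which is the property that makes the AST approach scale.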

The community is behind Elasticsearch-rails, but it has yet to introduce an Arel equivalent, and we all want to move towards that. Here’s hoping we can take Arelastic and incorporate it into Elasticsearch-rails so we no longer have to tame unwieldy hashes.

The implementation might look cryptic, but if you are familiar with ES, you can see all the available query and filter nodes on the ES DSL’s page, and know that each one has an equivalent in the Arelastic library. This would be the building block for the ORM-like interface we’re far more comfortable with.

Then we can write Elasticsearch queries like this:

Gift.search_builder.filter('color' => 'red').match('flower')

After my first bold attempt at creating a chainable API using monads, you can see how much simpler it is to create complicated Elasticsearch queries:

Consumption

class GiftsController < ApplicationController
  def index
    search_params = SearchBuilder.
      filter(organization_id: current_organization.id).
      filter(user_id: current_user.id).
      filter(created_at: { gte: 5.days.ago }).
      sort('id desc')

    @gifts = Gift.search(search_params).page(params[:page]).records
  end
  ...
end
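For readers wondering what could sit behind a chain like that, here is a rough sketch of one way to do it. Everything in it is an assumption made for illustration: `SketchSearchBuilder` and its methods are hypothetical and are not the builder used above, nor part of Arelastic or Elasticsearch-rails. The hash it emits follows the 1.x-era `filtered` query shape used throughout this post.

```ruby
# Hypothetical chainable builder; names and behavior are illustrative only.
class SketchSearchBuilder
  def initialize(filters: [], query: nil, sorts: [])
    @filters = filters
    @query = query
    @sorts = sorts
  end

  # Hash values become range filters, anything else becomes a term filter.
  # Returns a new builder, so a partial chain can be reused safely.
  def filter(conditions)
    nodes = conditions.map do |field, value|
      if value.is_a?(Hash)
        { range: { field => value } }
      else
        { term: { field => value } }
      end
    end
    copy(filters: @filters + nodes)
  end

  def match(field, text)
    copy(query: { match: { field => text } })
  end

  # Accepts "field direction", e.g. "id desc"; direction defaults to "asc".
  def sort(spec)
    field, direction = spec.split
    copy(sorts: @sorts + [{ field => direction || "asc" }])
  end

  # Renders the accumulated state as a search body hash.
  def to_h
    filtered = { query: @query || { match_all: {} } }
    filtered[:filter] = combined_filter unless @filters.empty?
    body = { query: { filtered: filtered } }
    body[:sort] = @sorts unless @sorts.empty?
    body
  end

  private

  # A single filter is used as-is; several are ANDed with a bool filter.
  def combined_filter
    @filters.length == 1 ? @filters.first : { bool: { must: @filters } }
  end

  def copy(overrides)
    state = { filters: @filters, query: @query, sorts: @sorts }
    self.class.new(**state.merge(overrides))
  end
end

base   = SketchSearchBuilder.new.filter(organization_id: 42)
recent = base.filter(created_at: { gte: "now-5d/d" }).sort("id desc")
```

Because every call returns a fresh builder, `base` is left untouched when `recent` is derived from it, which is what allows chains to be shared and extended the way ActiveRecord scopes are.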

In part 2, we’ll take the next step: supporting even more Elasticsearch queries and seeing what it would take to make this bad boy a gem.
