
Starting with Elasticsearch Query DSL

Query DSL is a language for sending queries to an Elasticsearch database. It is based on JSON.
By default, Elasticsearch uses a mechanism called the relevance score, which expresses how well a document matches a query. It is most useful for unstructured data, such as social media posts. Data sent to Energy Logserver usually comes from IT environments and is mostly structured. That is why this paper ignores scoring and relevance evaluation, although it is entirely possible to use this mechanism in Energy Logserver.
A Query DSL request can be built with two types of search context: query and filter.
  • Query context is used for full-text searches. We use it to find specific values or strings in documents. It accepts regexp syntax or Lucene syntax, which makes query context universal and easy to use.
  • Filter context is available in the compound form of a query. It is generally faster, because it does not operate on full text and does not calculate a relevance score. Instead, documents are returned based on matching/boolean requests, to which Elasticsearch answers “Yes” or “No”.

For example:
Does the field “response_code” have a value of “200”?
Is “timestamp” between the years 2021 and 2022?
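
The two yes/no questions above could be expressed in filter context roughly as follows. This is a minimal Python sketch that only builds and prints a request body. The field names come from the questions above, while the helper name build_filter_query, the date strings, and the assumption that “timestamp” is mapped as a date field are illustrative, not taken from a real index.

```python
import json


def build_filter_query(response_code, start, end):
    """Build a bool query whose clauses run in filter context (no scoring)."""
    return {
        "query": {
            "bool": {
                "filter": [
                    # Does "response_code" have exactly this value?
                    {"term": {"response_code": response_code}},
                    # Is "timestamp" inside [start, end)?
                    {"range": {"timestamp": {"gte": start, "lt": end}}},
                ]
            }
        }
    }


# Assumed reading of "between 2021 and 2022": both years in full.
body = build_filter_query("200", "2021-01-01", "2023-01-01")
print(json.dumps(body, indent=2))
```

The printed JSON can be pasted as the body of a “_search” request. Every clause in the “filter” list must answer “Yes” for a document to be returned.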

Query DSL can be created in one of two forms: basic or compound.

  • The basic form is (as the name suggests) a simple request for documents based on a query.
  • The compound form is used to create more advanced, combined queries.

In this paper we will focus only on the basic form.

Examples of basic query structure

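Queries are built from multiple clauses and sent to the database through the “_search” API.
A basic query that searches anywhere in the document for the string “admin” in indices matching “logs*” (for example logs-05.2022) can be as simple as the request below. The # annotations explain each clause and are not part of the query itself.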

GET logs*/_search?pretty
{
  "query": {                #every query starts with this clause
    "query_string" : {      #this is a type of query
      "query" : "admin"     #this is parameter name and searched value
    }
  }
}

“query” is always the opening section. Inside it we can use different query types, such as Term, Regexp, Range, Wildcard, Match, and more. Each of them searches the data in a different way.
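
Match and Wildcard are named above but not covered in the rest of this text. As a rough sketch, their request bodies follow the same shape as the other basic queries. The Python below only builds and prints the bodies. The field name “message” and the searched values are hypothetical examples.

```python
import json

# Match: full-text search on one field; the text is analyzed before matching.
# The field name "message" is a hypothetical example.
match_body = {"query": {"match": {"message": {"query": "login failed"}}}}

# Wildcard: pattern matching on a term.
# * matches zero or more characters, ? matches exactly one character.
wildcard_body = {"query": {"wildcard": {"login": {"value": "adm*"}}}}

for body in (match_body, wildcard_body):
    print(json.dumps(body, indent=2))
```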

 

Term – returns documents that contain an exact term in a provided field.
GET logs*/_search
{
  "query": {
    "term": {
      "login": {
        "value": "admin"
      }
    }
  }
}

After “term” we open a section with the field name, in this case “login”, and enter the value we are looking for in the “value” field, here “admin”.

Range – returns documents that contain values within a provided range.
GET logs*/_search
{
  "query": {
    "range": {
      "attempts": {
        "gte": 1,
        "lte": 3
      }
    }
  }
}

After the “range” clause we open a section with the field name, in this case “attempts”, and enter the limits of the range. Providing the second limit is optional.
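
To illustrate the optional second limit, here is a small Python sketch that builds a range query body and leaves out any limit that was not given. The helper name build_range_query is hypothetical, and refusing a query with no limits at all is a choice made for this sketch.

```python
import json


def build_range_query(field, gte=None, lte=None):
    """Build a range query body, leaving out any limit that was not given."""
    bounds = {}
    if gte is not None:
        bounds["gte"] = gte
    if lte is not None:
        bounds["lte"] = lte
    if not bounds:
        # Choice made for this sketch: require at least one limit.
        raise ValueError("at least one of gte or lte is required")
    return {"query": {"range": {field: bounds}}}


# Both limits, as in the request above.
print(json.dumps(build_range_query("attempts", gte=1, lte=3)))
# Only a lower limit: documents with 3 or more attempts.
print(json.dumps(build_range_query("attempts", gte=3)))
```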

Regexp – returns documents that contain terms matching a regular expression.
GET logs*/_search
{
  "query": {
    "regexp": {
      "login": {
        "value": "a[a-z]+n"
      }
    }
  }
}

After the “regexp” clause we open a section with the field name, in this case “login”, and enter the regular expression to search for.
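
One detail worth knowing is that the regular expression has to match the whole term, not just a part of it. The Python sketch below imitates this with re.fullmatch to show which values the pattern above would accept. The login values are made up for illustration, and the sketch assumes the field stores each login unchanged as a single term. Python and Lucene regular expression syntax differ in general, but for this simple pattern they behave the same.

```python
import re

# Same pattern as in the request above. The regexp query anchors the pattern
# to the whole term, which re.fullmatch imitates here.
pattern = re.compile(r"a[a-z]+n")

# Hypothetical "login" values, for illustration only.
logins = ["admin", "adrian", "an", "administrator", "Admin"]

matching = [login for login in logins if pattern.fullmatch(login)]
print(matching)  # ['admin', 'adrian']
```

“administrator” is rejected because the term does not end in “n”, “an” because at least one letter is required between “a” and “n”, and “Admin” because the pattern expects a lowercase “a”.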

What’s next?
There are a lot of different query options and clauses that you can use. Experience and curiosity make learning much faster and easier, so stay curious about your own data!
Remember that if you have any questions, feel free to use our community, where our engineers and developers are also present!