Adding documents and querying Elasticsearch

Elasticsearch is an open-source text search engine. It is accessible through RESTful APIs and stores data as JSON documents. It allows users to search very large amounts of data at very high speed. Because it is written in Java, it can run on many platforms. See the post below on how to install, configure, and start Elasticsearch 8.5.0.

In this blog post I will show you how to create an index, insert data into the index, and query the data in Elasticsearch. In the rest of this post, I assume that you are executing the curl commands from the host that is running the Elasticsearch server, hence all the curl queries are directed to localhost. If you are running the curl commands from a client machine, replace localhost with the actual hostname (or IP address) of the server running Elasticsearch.

Inserting a single document into an Index

curl -k -X POST https://localhost:9200/messages/_doc -H "Content-Type: application/json" -d'
{
  "msg_id" : 1,
  "msg_case_id" : 55,
  "msg_message" : "Hi This is xyz from def media company"
}' --user "elastic:Password1"

In the command shown above, “messages” is the index that we are inserting the document into. We also provide the Elasticsearch username and password to authenticate the POST request. The -k flag tells curl to accept the self-signed certificate that Elasticsearch generates by default.
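If you prefer to issue the same request from Python, the standard library is enough. The following is a minimal sketch, assuming the same localhost endpoint, index name, and elastic/Password1 credentials as the curl example; the request is only sent when you call index_document.

```python
# Sketch: single-document insert via Python's stdlib instead of curl.
# The URL, index name, and credentials are assumptions mirroring the
# curl example above -- adjust them for your cluster.
import base64
import json
import ssl
import urllib.request

ES_URL = "https://localhost:9200/messages/_doc"  # assumed endpoint
USER, PASSWORD = "elastic", "Password1"          # assumed credentials

doc = {
    "msg_id": 1,
    "msg_case_id": 55,
    "msg_message": "Hi This is xyz from def media company",
}
body = json.dumps(doc).encode("utf-8")

def index_document(url=ES_URL):
    """POST one JSON document; skips certificate checks like curl -k."""
    ctx = ssl.create_default_context()
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE  # equivalent of curl's -k flag
    auth = base64.b64encode(f"{USER}:{PASSWORD}".encode()).decode()
    req = urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {auth}",
        },
    )
    with urllib.request.urlopen(req, context=ctx) as resp:
        return json.load(resp)
```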

Bulk inserting documents into an Index

First create a file named msg.json with the following lines

{"index":{"_id":"1"}}
{ "msg_id" : 1, "msg_case_id" : 55, "msg_message" : "Hi This is xyz from def media company" }
{"index":{"_id":"2"}} 
{ "msg_id" : 2, "msg_case_id" : 55, "msg_message" : "We provide targeted advertising to different platforms" }
{"index":{"_id":"3"}}
{ "msg_id" : 3, "msg_case_id" : 55, "msg_message" : "Includes TV, Radio, Online, Social Media etc" }
{"index":{"_id":"4"}}
{ "msg_id" : 4, "msg_case_id" : 55, "msg_message" : "Our conversion ratios are very high" }
{"index":{"_id":"5"}}
{ "msg_id" : 5, "msg_case_id" : 55, "msg_message" : "provides search engine optimization" }

You can bulk insert all the documents in the file above into the Elasticsearch index named “messages” using the command below

curl -k -X POST https://localhost:9200/messages/_bulk?pretty -H "Content-Type: application/x-ndjson" --user "elastic:Password1" --data-binary @msg.json

Note that the Content-Type here is application/x-ndjson (newline-delimited JSON), which the _bulk API requires.
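The NDJSON file above can also be generated programmatically, which keeps the action line and document line pairs in sync. A minimal sketch, using the same five messages and the same msg.json file name as the bulk command:

```python
# Sketch: build the msg.json bulk payload from plain Python data.
import json

messages = [
    "Hi This is xyz from def media company",
    "We provide targeted advertising to different platforms",
    "Includes TV, Radio, Online, Social Media etc",
    "Our conversion ratios are very high",
    "provides search engine optimization",
]

def to_bulk_ndjson(msgs, case_id=55):
    """Emit one action line and one document line per message."""
    lines = []
    for i, msg in enumerate(msgs, start=1):
        lines.append(json.dumps({"index": {"_id": str(i)}}))
        lines.append(json.dumps(
            {"msg_id": i, "msg_case_id": case_id, "msg_message": msg}))
    # the _bulk API requires a trailing newline after the last line
    return "\n".join(lines) + "\n"

payload = to_bulk_ndjson(messages)
with open("msg.json", "w") as f:
    f.write(payload)
```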

You can delete all the documents in the “messages” index using the command below

curl -k -XPOST 'https://localhost:9200/messages/_delete_by_query?conflicts=proceed&pretty' -H 'Content-Type: application/json' --user "elastic:Password1" -d'
{
    "query": {
        "match_all": {}
    }
}'

Querying documents

You can query all the documents from the “messages” index using the command below

curl -k -X GET 'https://localhost:9200/messages/_search' --user "elastic:Password1"

You can query for documents that match specific criteria using the command below

curl -k -X POST "https://localhost:9200/messages/_search?pretty" -H 'Content-Type: application/json' --user "elastic:Password1" -d'
{
  "query": {
    "bool": {
      "filter": [
        {
          "term": {
            "msg_message": "ratios"
          }
        },
        {
          "term": {
            "msg_id": "4"
          }
        }
      ]
    }
  }
}
'

The above query will display the documents where the token “ratios” occurs in the msg_message property and the msg_id property is 4. Note that a term query matches exact indexed tokens, so searching for “ratio” would not match the indexed token “ratios”; for analyzed (stemmed or partial) matching, use a match query instead.
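The AND semantics of the filter array can be illustrated locally. The following is a purely illustrative Python sketch, not how Elasticsearch executes the query: it applies two clauses to two of the sample documents, with simple whitespace tokenization standing in for the analyzer.

```python
# Illustration only: every clause in a bool "filter" array must match,
# so the two conditions below are ANDed together, as in the query above.
docs = [
    {"msg_id": 4, "msg_case_id": 55,
     "msg_message": "Our conversion ratios are very high"},
    {"msg_id": 5, "msg_case_id": 55,
     "msg_message": "provides search engine optimization"},
]

def matches(doc, token, msg_id):
    """Both filter clauses must hold for a document to be a hit."""
    tokens = doc["msg_message"].lower().split()  # crude analyzer stand-in
    return token in tokens and doc["msg_id"] == msg_id

# Filter on the token "high" and msg_id 4: only the first document matches.
hits = [d for d in docs if matches(d, "high", 4)]
```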

In this blog post I have shown you how to insert documents into Elasticsearch indexes and query them.

Installing Elasticsearch and Kibana 8.5.0

Elasticsearch is an open-source search engine based on the Lucene library, and Kibana is an open-source data visualization dashboard for Elasticsearch. In this post I will show you how to install Elasticsearch and Kibana on a virtual machine (in this case, one running the Amazon Linux 2 operating system).

Install the required packages

sudo yum update -y
sudo yum group install "Development Tools" -y
sudo yum install wget -y
sudo yum install readline-devel -y
sudo yum install openssl-devel -y

Install Elasticsearch

cd /home/ec2-user
mkdir -p elastic && cd elastic
curl -O "https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-8.5.0-linux-x86_64.tar.gz"
tar -xzf elasticsearch-8.5.0-linux-x86_64.tar.gz
chown -R ec2-user:ec2-user /home/ec2-user/elastic
cd /home/ec2-user/elastic/elasticsearch-8.5.0/config

Add the following lines to elasticsearch.yml

transport.host: localhost
transport.port: 9300
http.port: 9200
network.host: 0.0.0.0

Install Kibana

cd /home/ec2-user/
curl -O "https://artifacts.elastic.co/downloads/kibana/kibana-8.5.0-linux-x86_64.tar.gz"
tar -xzf kibana-8.5.0-linux-x86_64.tar.gz
chown -R ec2-user:ec2-user /home/ec2-user/kibana-8.5.0
cd /home/ec2-user/kibana-8.5.0/config

Add the following line to kibana.yml

server.host: "<your-host-ip-address>"

In the line shown above, make sure to replace <your-host-ip-address> with your host’s actual IP address.

Start Elasticsearch and Kibana

cd /home/ec2-user/elastic/elasticsearch-8.5.0/bin
nohup ./elasticsearch -Epath.data=data -Epath.logs=log &
cd /home/ec2-user/kibana-8.5.0/bin
nohup ./kibana &

Set up a new password for the Elasticsearch superuser

cd /home/ec2-user/elastic/elasticsearch-8.5.0
bin/elasticsearch-reset-password -u elastic --interactive

Follow the prompts to set up a new password for the user elastic.

At this point, Elasticsearch has been installed and started, and you can begin creating your indexes and running your queries.

Listing errors from CloudWatch logs using the AWS CLI

The following commands can be used to list the error messages from CloudWatch logs produced by AWS DMS (Database Migration Service).

First list the log groups

aws logs describe-log-groups

Next list the log streams in your log group

aws logs describe-log-streams --log-group-name <YourLogGroupNameHere>

Next, list the error messages. Within the DMS log, errors are flagged with the pattern “E:” inside the message string, so that is the pattern we search for.

aws logs filter-log-events --log-group-name <YourLogGroupNameHere> --log-stream-names <YourLogStreamNameHere> --filter-pattern "[message = \"*E:*\"]" --query 'events[*].message'

If you are searching CloudWatch logs produced by other services, replace E: with the pattern that flags error messages for that service.
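The same “E:” filtering can also be done client-side, for example on messages you have already fetched with the AWS CLI or boto3. A minimal sketch, where the sample log lines and their I:/W:/E: markers are made up for illustration:

```python
# Sketch: keep only log messages containing an error marker.
# The sample lines below are invented; real DMS log lines differ.
def error_messages(messages, marker="E:"):
    """Return only the messages that contain the error marker."""
    return [m for m in messages if marker in m]

sample = [
    "2023-01-01T00:00:01 I: Task started",
    "2023-01-01T00:00:02 E: Failed to connect to source endpoint",
    "2023-01-01T00:00:03 W: Retrying connection",
]
errors = error_messages(sample)
```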