How to Launch AWS Elasticsearch using Terraform (Terraform aws elasticsearch)

Machine-generated data is growing exponentially, and extracting insights from it matters to your business, whether you are searching unstructured or semi-structured data on your site or analyzing logs. For that you need a search and analytics solution that offers speed, scalability, flexibility, and real-time search, which is what Amazon OpenSearch Service (the successor to Amazon Elasticsearch Service) provides.

In this tutorial, you will learn about Amazon OpenSearch Service, Amazon Elasticsearch, and how to create an Amazon Elasticsearch domain using Terraform.

Let’s get started.

Table of Contents

  1. What Is Amazon Elasticsearch Service?
  2. Features of Amazon Elasticsearch Service
  3. What is Amazon OpenSearch Service?
  4. Prerequisites
  5. Terraform files and Terraform directory structure
  6. Building Terraform Configuration for AWS Elasticsearch
  7. Verify AWS Elasticsearch in Amazon Account
  8. Conclusion

What Is Amazon Elasticsearch Service?

Amazon Elasticsearch Service is a distributed search and analytics engine mainly used for log analytics, full-text search, business analytics, and operational intelligence. It performs real-time application monitoring and log analytics.

With Amazon Elasticsearch Service, you send data in JSON format using the API or Logstash. Elasticsearch then stores each document automatically and adds a searchable reference to it in the cluster’s index, after which you can search it using the Elasticsearch API.

AWS Elastic search Working

Amazon Elasticsearch service creates the AWS Elasticsearch clusters and nodes. If the nodes fail in the cluster, then the failed Elasticsearch nodes are automatically replaced.

Features of Amazon Elasticsearch Service

  • Amazon Elasticsearch service can scale up to 3 PB of attached storage and works with various instance types.
  • Amazon Elasticsearch Service integrates easily with other AWS services, such as IAM for security, Amazon VPC for network isolation, Amazon S3 for loading data, Amazon CloudWatch for monitoring, and Amazon SNS for alert notifications.

What is Amazon OpenSearch Service?

Amazon OpenSearch Service is a managed service that lets you deploy, operate, and scale OpenSearch clusters in the AWS Cloud. When you create a cluster, you can select the search engine of your choice.

OpenSearch is a fully open-source search and analytics engine for use cases such as log analytics, real-time application monitoring, and clickstream analysis.

At the time of writing, the latest OpenSearch version available is 1.1, and the service also supports legacy Elasticsearch versions such as 7.10 and 7.9.
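
As a rough sketch of how the engine choice surfaces in Terraform: newer releases of the AWS provider include an aws_opensearch_domain resource whose engine_version argument takes either an OpenSearch or a legacy Elasticsearch version string. The domain name, instance type, and volume size below are placeholder values that do not come from this tutorial, and the resource is only available in provider versions that ship it.

```hcl
# Sketch only: an OpenSearch domain declared with the newer resource type.
# "example-opensearch" and the sizing values are illustrative placeholders.
resource "aws_opensearch_domain" "example" {
  domain_name = "example-opensearch"

  # Use "OpenSearch_<version>" for OpenSearch, or "Elasticsearch_<version>"
  # (for example "Elasticsearch_7.10") for a legacy Elasticsearch engine.
  engine_version = "OpenSearch_1.1"

  cluster_config {
    # OpenSearch instance types end in ".search" rather than ".elasticsearch".
    instance_type = "t3.small.search"
  }

  ebs_options {
    ebs_enabled = true
    volume_size = 10
  }
}
```

The rest of this tutorial keeps using the older aws_elasticsearch_domain resource.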

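Prerequisites

  • An Ubuntu machine with Terraform installed; the commands in this tutorial are run from it.
  • An AWS account with credentials configured on that machine so that Terraform can authenticate with AWS.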

Terraform files and Terraform directory structure

Now that you know what Amazon Elasticsearch and Amazon OpenSearch Service are, let’s dive into the Terraform files and directory structure that will help you write the Terraform configuration files later in this tutorial.

Terraform code, that is, Terraform configuration files, is organized in a tree-like structure to make it easier to follow, and is written in files with the .tf, .tf.json, or .tfvars extension. These configuration files are placed inside Terraform modules.

Terraform modules sit at the top level of the hierarchy, where the configuration files reside. A module can in turn call child modules from local directories, from elsewhere on disk, or from the Terraform Registry.

A Terraform module typically contains five main files, main.tf, vars.tf, providers.tf, output.tf, and terraform.tfvars, plus a .terraform directory. The file names are conventions: Terraform loads every .tf file in the directory.

  1. main.tf: The main.tf file contains the main code, where you define the resources you need to build, update, or manage.
  2. vars.tf: The vars.tf file declares the input variables, which are customizable and referenced in the main.tf configuration file.
  3. output.tf: The output.tf file is where you declare the output parameters you wish to fetch after Terraform has run, that is, after the terraform apply command.
  4. .terraform: This directory contains the cached provider plugins and modules, as well as the last known backend configuration. It is managed by Terraform and created when you run the terraform init command.
  5. terraform.tfvars: This file contains the values passed to the variables that are referenced in main.tf and declared in vars.tf.
  6. providers.tf: The providers.tf file is one of the most important files. It is where you define your Terraform providers, such as the AWS provider or the Azure provider, which authenticate with the cloud provider.
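
To make the relationship between these files concrete, here is a minimal sketch of how one value flows from declaration to output. It is shown as a single block for brevity; each commented section would live in the file named in the comment. The variable name project and the value search-demo are hypothetical and are not used elsewhere in this tutorial.

```hcl
# vars.tf: declare the input variable (name and type only).
variable "project" {
  type        = string
  description = "Hypothetical project name used to build a label."
}

# terraform.tfvars would then supply the value, for example:
#   project = "search-demo"

# main.tf: reference the variable with the var.<name> syntax.
locals {
  domain_label = "${var.project}-domain"
}

# output.tf: expose a value that is printed after terraform apply.
output "domain_label" {
  value = local.domain_label
}
```

With project set to search-demo, the domain_label output would be search-demo-domain. The sketch declares no provider or resources, so it only illustrates the wiring between the files.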

Building Terraform Configuration for AWS Elasticsearch

Now that you know what Terraform configuration files look like and how each of them is declared, in this section you will learn how to build the Terraform configuration files for AWS Elasticsearch before running Terraform commands. Let’s get into it.

  • Log in to the Ubuntu machine using your favorite SSH client.
  • Create a folder named terraform-Elasticsearch in the /opt directory and switch to that folder.
mkdir /opt/terraform-Elasticsearch
cd /opt/terraform-Elasticsearch
  • Create a file named main.tf inside the /opt/terraform-Elasticsearch directory and copy/paste the content below. The file creates the following components:
    • An Elasticsearch domain. Domains are clusters with the settings, instance types, instance counts, and storage resources that you specify.
    • The AWS Elasticsearch domain policy.
# Creating the Elasticsearch domain

resource "aws_elasticsearch_domain" "es" {
  domain_name           = var.domain
  elasticsearch_version = "7.10"

  cluster_config {
    instance_type = var.instance_type
  }
  snapshot_options {
    automated_snapshot_start_hour = 23
  }
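  vpc_options {
    # Replace with the ID of a subnet in your own VPC.
    subnet_ids = ["subnet-0d8c53ffee6d4c59e"]
  }
  ebs_options {
    # Attach EBS storage only when a positive volume size is supplied.
    ebs_enabled = var.ebs_volume_size > 0
    volume_size = var.ebs_volume_size
    volume_type = var.volume_type
  }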
  tags = {
    Domain = var.tag_domain
  }
}

# Creating the AWS Elasticsearch domain policy

resource "aws_elasticsearch_domain_policy" "main" {
  domain_name = aws_elasticsearch_domain.es.domain_name
  access_policies = <<POLICIES
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Action": "es:*",
            "Principal": "*",
            "Effect": "Allow",
            "Resource": "${aws_elasticsearch_domain.es.arn}/*"
        }
    ]
}
POLICIES
}
  • Create one more file named vars.tf inside the /opt/terraform-Elasticsearch directory and copy/paste the content below. This file declares all the variables that are referenced in the main.tf configuration file.
variable "domain" {
    type = string
}
variable "instance_type" {
    type = string
}
variable "tag_domain" {
    type = string
}
variable "volume_type" {
    type = string
}
variable "ebs_volume_size" {
    type = number
}

  • Create one more file named outputs.tf inside the /opt/terraform-Elasticsearch directory and copy/paste the content below. This file contains all the output variables that are displayed after running the terraform apply command.
output "arn" {
    value = aws_elasticsearch_domain.es.arn
} 
output "domain_id" {
    value = aws_elasticsearch_domain.es.domain_id
} 
output "domain_name" {
    value = aws_elasticsearch_domain.es.domain_name
} 
output "endpoint" {
    value = aws_elasticsearch_domain.es.endpoint
} 
output "kibana_endpoint" {
    value = aws_elasticsearch_domain.es.kibana_endpoint
}

  • Create another file and name it provider.tf. This file allows Terraform to interact with the AWS cloud using the AWS API.
provider "aws" {
  region = "us-east-2"
}
  • Create one more file named terraform.tfvars inside the same folder and copy/paste the content below. This file contains the values of the variables that you declared in the vars.tf file and referenced in the main.tf file.
domain = "newdomain" 
instance_type = "r4.large.elasticsearch"
tag_domain = "NewDomain"
volume_type = "gp2"
ebs_volume_size = 10
  • Your folder should now contain all the files, as shown below.
Terraform-elasticsearch folder
  • Now your files and code are ready for execution. Initialize Terraform using the terraform init command.
terraform init
Initializing the terraform using the terraform init command.
  • Terraform initialized successfully. Now it’s time to run the plan command, which shows you the details of the deployment. Run the terraform plan command to confirm that the correct resources are going to be provisioned or deleted.
terraform plan
Running the terraform plan command
The output of the terraform plan command
  • After verification, it’s time to actually deploy the code using the terraform apply command.
terraform apply
Running the terraform apply command
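
The domain policy in main.tf earlier in this section allows every principal to perform every es: action. If you want something tighter, one possible approach is sketched below. It assumes you are happy to delegate access to IAM identities in the account that runs Terraform; the resource names es_access and restricted and the chosen list of actions are illustrative. It would take the place of the aws_elasticsearch_domain_policy.main resource rather than sit alongside it.

```hcl
# Look up the AWS account that Terraform is authenticated against.
data "aws_caller_identity" "current" {}

# Build the policy JSON with a data source instead of a heredoc.
data "aws_iam_policy_document" "es_access" {
  statement {
    effect    = "Allow"
    actions   = ["es:ESHttpGet", "es:ESHttpPost", "es:ESHttpPut"]
    resources = ["${aws_elasticsearch_domain.es.arn}/*"]

    # Assumption: only identities in this account should reach the domain.
    principals {
      type        = "AWS"
      identifiers = ["arn:aws:iam::${data.aws_caller_identity.current.account_id}:root"]
    }
  }
}

# Replaces aws_elasticsearch_domain_policy.main from main.tf.
resource "aws_elasticsearch_domain_policy" "restricted" {
  domain_name     = aws_elasticsearch_domain.es.domain_name
  access_policies = data.aws_iam_policy_document.es_access.json
}
```

With a policy like this, requests to the domain need to be signed with AWS credentials, so an unsigned request from a browser would be rejected.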

Verify AWS Elasticsearch in Amazon Account

The Terraform commands terraform init → terraform plan → terraform apply all executed successfully. It is still important to verify the AWS Elasticsearch domain manually in the AWS Management Console.

  • Open your favorite web browser and navigate to the AWS Management Console and log in.
  • While in the Console, click on the search bar at the top, search for ‘Elasticsearch’, and click on the Elasticsearch menu item.
Search for ‘Elasticsearch’ in AWS console
  • Now you will see that the newdomain domain that you specified in the Terraform configuration file has been created successfully.
AWS Elasticsearch domain created successfully
  • Next, click on newdomain to check the details of the newly created domain.
Check the details of the newly created domain.

In the new Amazon OpenSearch Service console, you should see something like the following.

Amazon Open search domain

Conclusion

In this tutorial, you learned about Amazon Elasticsearch and how to create an Amazon Elasticsearch domain using Terraform.

Now that you have a strong basic understanding of AWS Elasticsearch, which documents will you upload for indexing and searching?

Learn ELK Stack from Scratch: Elasticsearch, Logstash, Kibana dashboard, and AWS Elasticsearch

If you want to analyze data for your website or applications, consider learning the ELK Stack, or Elastic Stack, which comprises Elasticsearch, Logstash, and the Kibana dashboard.

Elasticsearch is a powerful analytics search engine that allows you to store, index, and search documents of all types of data in real time. But if you need your search engine to scale automatically and be load balanced, then AWS Elasticsearch (Amazon OpenSearch) is for you.

In this tutorial, you will learn from scratch what the Elastic Stack, Elasticsearch, Logstash, the Kibana dashboard, and finally AWS Elasticsearch are, and believe me, this tutorial will be helpful for you.

Let’s get into it.

Related: Install ELK Stack on Ubuntu: Elasticsearch, Logstash, and Kibana Dashboard.

Table of Contents

  1. What is ELK Stack or Elastic Stack?
  2. What is Elasticsearch?
  3. QuickStart Kibana Dashboard
  4. What is Logstash?
  5. Features of Logstash
  6. What is AWS Elasticsearch or Amazon OpenSearch Service?
  7. Creating the Amazon Elasticsearch Service domain or OpenSearch Service domain
  8. Uploading data in AWS Elasticsearch
  9. Search documents in Kibana Dashboard
  10. Conclusion

What is ELK Stack or Elastic Stack?

The ELK stack or Elastic Stack is used to describe a stack that contains: Elasticsearch, Logstash, and Kibana. The ELK stack allows you to aggregate logs from all your systems and applications, analyze these logs, and create visualizations for application and infrastructure monitoring, faster troubleshooting, security analytics, and more.

  • E = Elasticsearch: Elasticsearch is a distributed search and analytics engine built on Apache Lucene.
  • L = Logstash: Logstash is an open-source data ingestion tool that allows you to collect data from various sources, transform it, and send it to your desired destination.
  • K = Kibana: Kibana is a data visualization and exploration tool for reviewing logs and events.
ELK Stack architecture

What is Elasticsearch?

Elasticsearch is an analytics and full-text search engine built on the Apache Lucene search engine library where the indexing, search, and analysis operations occur. Elasticsearch is a powerful analytics search engine that allows you to store, index, and search the documents of all types of data in real-time.

Whether you have structured or unstructured text or numerical data, Elasticsearch can efficiently store and index it in a way that supports fast searches. Some of the features and uses of Elasticsearch are:

  • Powers the search box on a website, web page, or application.
  • Stores and analyzes data and metrics.
  • Logstash and Beats help with collecting and aggregating the data and storing it in Elasticsearch.
  • Elasticsearch is used in machine learning.
  • Elasticsearch stores complex data structures that have been serialized as JSON documents.
  • If you have multiple Elasticsearch nodes in an Elasticsearch cluster, documents are distributed across the cluster and can be accessed immediately from any node.
  • Elasticsearch can also be schema-less, which means that documents can be indexed without explicitly specifying how to handle each of the different fields.
  • The Elasticsearch REST APIs support structured queries, full-text queries, and complex queries that combine the two. You can access all of these search capabilities using Elasticsearch’s comprehensive JSON-style query language (Query DSL).
  • An Elasticsearch index can be thought of as an optimized collection of documents, and each document is a collection of fields, which are the key-value pairs that contain your data.
  • An Elasticsearch index is really just a logical grouping of one or more physical shards, where each shard is actually a self-contained index.
  • There are two types of shards: primaries and replicas. Each document in an index belongs to one primary shard. The number of primary shards in an index is fixed at the time that the index is created, but the number of replica shards can be changed at any time.
  • Sharding splits an index into smaller pieces. It allows more documents to be stored at the index level, makes it easier to fit large indices onto nodes, and improves query throughput. By default an index has one primary shard, and you can configure more when you create the index.
Elasticsearch Cluster

Elasticsearch provides a REST API for managing your cluster and for indexing and searching your data. For testing purposes, you can easily submit requests directly from the command line or through the Kibana dashboard by running a GET request in the Kibana console under Dev Tools, as shown below.

<IP-address-of-elasticsearch>/app/dev_tools#/console
Kibana console with Dev tools
  • You can check the Elasticsearch cluster health by running the request below, where _cluster is the API and health is the command.
GET _cluster/health
Checking the health of the Elasticsearch cluster
  • To check the Elasticsearch node details, run the request below.
GET _cat/nodes?v
Checking the health of the elasticsearch node
  • To check the configured Elasticsearch indices, run the request below. You will notice that Kibana is also listed among the indices, because Kibana data is also stored in Elasticsearch.
GET _cat/indices
Checking the Elasticsearch indices on the elasticsearch cluster
  • To check the primary and replica shards from the Kibana console, run the request below.
GET _cat/shards
Checking all the primary shards and replica shards in the elasticsearch cluster

QuickStart Kibana Dashboard

Kibana allows you to search the documents, observe the data and analyze the data, visualize in charts, maps, graphs, and more for the Elastic Stack in the form of a dashboard. Your data can be structured or unstructured text, numerical data, time-series data, geospatial data, logs, metrics, security events.

Kibana also lets you manage your data, monitor the health of your Elastic Stack cluster, and control which users have access to the Kibana dashboard.

Kibana also allows you to upload the data into the ELK stack by uploading your file and optionally importing the data into an Elasticsearch index. Let’s learn how to import the data in the kibana dashboard.

  • Create a file named shanky.txt and copy/paste the below content.
[    6.487046] kernel: emc: device handler registered
[    6.489024] kernel: rdac: device handler registered
[    6.596669] kernel: loop0: detected capacity change from 0 to 51152
[    6.620482] kernel: loop1: detected capacity change from 0 to 113640
[    6.636498] kernel: loop2: detected capacity change from 0 to 137712
[    6.668493] kernel: loop3: detected capacity change from 0 to 126632
[    6.696335] kernel: loop4: detected capacity change from 0 to 86368
[    6.960766] kernel: audit: type=1400 audit(1643177832.640:2): apparmor="STATUS" operation="profile_load" profile="unconfined" name="lsb_release" pid=394 comm="apparmor_parser"
[    6.965983] kernel: audit: type=1400 audit(1643177832.644:3): apparmor="STATUS" operation="profile_load" profile="unconfined" name="nvidia_modprobe" pid=396 comm="apparmor_parser"
  • Once the file is uploaded successfully, you will see the details of the data that you uploaded.
Data uploaded in the kibana
Details of the data uploaded in the kibana
  • Next create the elasticsearch index and click on import.
Creating the elasticsearch index on the Elasticsearch cluster
  • After import is successful you will see the status of your elasticsearch index as below.
Status of file upload in kibana
  • Next, click View index in Discover as shown in the previous image. Now you should be able to see the logs within the Elasticsearch index (shankyindex).
Checking the logs in kibana with newly created index

Kibana allows you to perform actions such as the following:

  • Refresh, flush, and clear the cache of your indices or index.
  • Define the lifecycle of an index as it ages.
  • Define a policy for taking snapshots of your Elasticsearch cluster.
  • Roll up data from one or more indices into a new, compact index.
  • Replicate indices on a remote cluster and copy them to a local cluster.
  • Detect conditions in different Kibana apps and trigger actions when those conditions are met, using alerting.

What is Logstash?

Logstash allows you to collect data with real-time pipelining capabilities. It can collect data from various sources, such as Beats, and push it to the Elasticsearch cluster. With Logstash, any type of event is transformed using an array of input, filter, and output plugins, further simplifying the ingestion process.

Working of Logstash

Features of Logstash

Now that you have a basic idea about Logstash, let’s look at some of the benefits of Logstash, such as:

  • Logstash handles all types of logging data and easily ingests web logs such as Apache logs and application logs such as log4j for Java.
  • Logstash captures other log formats, such as syslog, networking, and firewall logs.
  • One of the main benefits of Logstash is that it securely ingests logs with Filebeat.

What is AWS Elasticsearch or Amazon OpenSearch Service?

Amazon Elasticsearch Service or OpenSearch is a managed service that deploys and scales the Elasticsearch clusters in the cloud. Elasticsearch is an open-source analytical and search engine that performs real-time application monitoring and log analytics.

Amazon Elasticsearch Service provisions all the resources for Elasticsearch clusters and launches them. It also automatically replaces failed Elasticsearch nodes in the cluster. Let’s look at some of the key features of the Amazon Elasticsearch Service.

  • AWS Elasticsearch or Amazon OpenSearch can scale up to 3 PB of attached storage and works with various instance types.
  • AWS Elasticsearch or Amazon OpenSearch integrates easily with other AWS services, such as IAM for security, Amazon VPC, Amazon S3 for loading data, Amazon CloudWatch for monitoring, and Amazon SNS for alert notifications.

Creating the Amazon Elasticsearch Service domain or OpenSearch Service domain

Now that you have a basic idea about the Amazon Elasticsearch Service or OpenSearch Service, let’s create an Amazon Elasticsearch Service domain or OpenSearch Service domain using the AWS Management Console.

  • While in the Console, click on the search bar at the top, search for ‘Elasticsearch’, and click on the Elasticsearch menu item.

The Elasticsearch service has now been replaced by the OpenSearch service.

Searching for Elasticsearch service
  • Creating an Amazon Elasticsearch domain is the same as creating an Elasticsearch cluster; that is, domains are clusters with the settings, instance types, instance counts, and storage resources that you specify. Click on Create a new domain.
Creating an Amazon Elasticsearch domain
  • Next, select the deployment type as Development and testing.
Choosing the deployment type.

Next, choose the following settings:

  • For Configure domain, provide the Elasticsearch domain name as “firstdomain”. A domain is the collection of resources needed to run Elasticsearch. The domain name will be part of your domain endpoint.
  • For Data nodes, choose t3.small.elasticsearch, ignore the rest of the settings, and click on NEXT.
  • For Network configuration, choose Public access.
  • For Fine-grained access control, choose Create master user and provide the username as user and the password as Admin@123. Fine-grained access control keeps your data safe.
  • For Domain access policy, choose Allow open access to the domain. Access policies control whether a request is accepted or rejected when it reaches the Amazon Elasticsearch Service domain.
  • Keep clicking the NEXT button and create the domain. It takes a few minutes for the domain to launch.
Viewing the Elasticsearch domain or Elasticcluster endpoint
  • After the Elasticsearch domain is created successfully, click on the firstdomain Elasticsearch domain.
Elasticsearch domain (first domain)
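
For comparison, the console choices in this section could be expressed in Terraform roughly as follows. This is a sketch under several assumptions rather than an export of what the console creates: the Elasticsearch version (7.10) and the EBS volume size are not stated in this section, the resource names first and first_open are illustrative, and the master password is passed through a hypothetical sensitive variable instead of being hard-coded. Leaving out vpc_options is what gives the domain a public endpoint.

```hcl
# Hypothetical variable so the master password is not stored in the code.
variable "master_user_password" {
  type      = string
  sensitive = true
}

resource "aws_elasticsearch_domain" "first" {
  domain_name           = "firstdomain"
  elasticsearch_version = "7.10" # assumption: version not stated above

  cluster_config {
    instance_type = "t3.small.elasticsearch"
  }

  ebs_options {
    ebs_enabled = true
    volume_size = 10 # assumption: size not stated above
  }

  # Fine-grained access control requires encryption at rest,
  # node-to-node encryption, and HTTPS on the endpoint.
  encrypt_at_rest {
    enabled = true
  }

  node_to_node_encryption {
    enabled = true
  }

  domain_endpoint_options {
    enforce_https = true
  }

  # Equivalent of "Create master user" in the console.
  advanced_security_options {
    enabled                        = true
    internal_user_database_enabled = true

    master_user_options {
      master_user_name     = "user"
      master_user_password = var.master_user_password
    }
  }
}

# Equivalent of "Allow open access to the domain"; fine-grained access
# control then decides what an authenticated user may do.
resource "aws_elasticsearch_domain_policy" "first_open" {
  domain_name = aws_elasticsearch_domain.first.domain_name
  access_policies = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect    = "Allow"
      Principal = "*"
      Action    = "es:*"
      Resource  = "${aws_elasticsearch_domain.first.arn}/*"
    }]
  })
}
```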

Uploading data in AWS Elasticsearch

You can load streaming data into your Amazon Elasticsearch Service (Amazon ES) domain from many different sources. Some, such as Amazon Kinesis Data Firehose and Amazon CloudWatch Logs, have built-in support; others, such as Amazon S3, Amazon Kinesis Data Streams, and Amazon DynamoDB, use AWS Lambda functions as event handlers.

  • In this tutorial, you will upload sample data. To do so, open the Elasticsearch domain URL, log in with the username user and the password Admin@123, and then click on Add data.
Adding data in Elasticsearch
  • Now use sample data and add e-commerce orders.
sample data to add e-commerce orders in Elasticsearch cluster

Search documents in Kibana Dashboard

Kibana is a popular open-source visualization tool that works with the AWS Elasticsearch service. It provides an interface to monitor and search the indexes. Let’s use Kibana to search the sample data you just uploaded in AWS ES.

  • Now, in the Elasticsearch domain URL itself, click on the Discover option on the left side to search the data.
Click on the Discover option.
  • Now you will notice that Kibana has the data that got uploaded. You can modify the timelines and many other fields accordingly.
Viewing the data in the Kibana dashboard

Kibana returned the sample data you uploaded when you searched for it in the dashboard.

Conclusion

In this tutorial, you learned what the Elastic Stack, Elasticsearch, Logstash, and the Kibana dashboard are, and how to create an AWS Elasticsearch domain from scratch using the AWS Management Console. You also learned how to upload sample data to AWS ES.

Now that you have a strong understanding of ELK Stack, Elasticsearch, kibana, and AWS Elasticsearch, which site are you planning to monitor using ELK Stack and components?