If you want to analyze data from your websites or applications, consider learning the ELK Stack (also called the Elastic Stack), which consists of Elasticsearch, Logstash, and Kibana.
Elasticsearch is a powerful search and analytics engine that lets you store, index, and search documents of all types in near real time. If you need your search engine to scale automatically and be load-balanced, then AWS Elasticsearch (Amazon OpenSearch) is for you.
In this tutorial, you will learn what the Elastic Stack, Elasticsearch, Logstash, and the Kibana dashboard are, and finally how to set up AWS Elasticsearch from scratch. Believe me, this tutorial will be helpful for you.
Let’s get into it.
Related: Install ELK Stack on Ubuntu: Elasticsearch, Logstash, and Kibana Dashboard.
Table of Contents
- What is ELK Stack or Elastic Stack?
- What is Elasticsearch?
- QuickStart Kibana Dashboard
- What is Logstash?
- Features of Logstash
- What is AWS Elasticsearch or Amazon OpenSearch Service?
- Creating the Amazon Elasticsearch Service domain or OpenSearch Service domain
- Uploading data in AWS Elasticsearch
- Search documents in Kibana Dashboard
What is ELK Stack or Elastic Stack?
ELK Stack, or Elastic Stack, is the name for a stack that contains Elasticsearch, Logstash, and Kibana. The ELK stack allows you to aggregate logs from all your systems and applications, analyze these logs, and create visualizations for application and infrastructure monitoring, faster troubleshooting, security analytics, and more.
- E = Elasticsearch: Elasticsearch is a distributed search and analytics engine built on Apache Lucene.
- L = Logstash: Logstash is an open-source data ingestion tool that collects data from various sources, transforms it, and sends it to your desired destination.
- K = Kibana: Kibana is a data visualization and exploration tool for reviewing logs and events.
What is Elasticsearch?
Elasticsearch is an analytics and full-text search engine built on the Apache Lucene search library, and it is where the indexing, search, and analysis operations of the Elastic Stack occur. It lets you store, index, and search documents of all types in near real time.
Whether your data is structured or unstructured text or numerical data, Elasticsearch can efficiently store and index it in a way that supports fast searches. Some of the features of Elasticsearch are:
- Powers the search box on a website, web page, or application.
- Stores and analyzes data and metrics.
- Logstash and Beats help collect and aggregate data and store it in Elasticsearch.
- Elasticsearch is also used in machine learning.
- Elasticsearch stores complex data structures that have been serialized as JSON documents.
- If an Elasticsearch cluster has multiple nodes, documents are distributed across the cluster and can be accessed immediately from any node.
- Elasticsearch can also be schema-less, which means that documents can be indexed without explicitly specifying how to handle each of the different fields.
- The Elasticsearch REST APIs support structured queries, full-text queries, and complex queries that combine the two. You can access all of these search capabilities using Elasticsearch’s comprehensive JSON-style query language (Query DSL).
- An Elasticsearch index can be thought of as an optimized collection of documents, and each document is a collection of fields, which are the key-value pairs that contain your data.
- An Elasticsearch index is really just a logical grouping of one or more physical shards, where each shard is a self-contained index.
- There are two types of shards: primaries and replicas. Each document in an index belongs to one primary shard. The number of primary shards in an index is fixed at the time the index is created, but the number of replica shards can be changed at any time.
- Sharding splits an index into smaller pieces. It lets an index hold more documents, makes large indices easier to fit onto nodes, and improves query throughput. By default, an index has one primary shard, and you can configure more when you create the index.
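To make the points above about JSON documents, shards, and Query DSL a little more concrete, here is a minimal Python sketch that only builds the JSON bodies you would send to Elasticsearch. The field names (description, price), the sample document, and the shard counts are hypothetical and chosen purely for illustration.

```python
import json


def index_settings(primary_shards: int, replicas: int) -> dict:
    """Body for creating an index; the primary shard count is fixed from here on."""
    return {
        "settings": {
            "number_of_shards": primary_shards,
            "number_of_replicas": replicas,
        }
    }


def replica_update(replicas: int) -> dict:
    """Body for updating index settings later; replicas can change at any time."""
    return {"index": {"number_of_replicas": replicas}}


def combined_query(text: str, max_price: float) -> dict:
    """Query DSL body that combines a full-text query with a structured filter."""
    return {
        "query": {
            "bool": {
                "must": [{"match": {"description": text}}],          # full-text part
                "filter": [{"range": {"price": {"lte": max_price}}}],  # structured part
            }
        }
    }


# A document is a collection of fields (key-value pairs) serialized as JSON.
document = {"name": "trail shoes", "description": "lightweight running shoes", "price": 79.5}

if __name__ == "__main__":
    print(json.dumps(index_settings(3, 1), indent=2))
    print(json.dumps(replica_update(2), indent=2))
    print(json.dumps(combined_query("running shoes", 100), indent=2))
    print(json.dumps(document))
```

The first body would typically go with a PUT request to the index name, the second with a PUT to the index's _settings endpoint, and the third with a request to its _search endpoint.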
Elasticsearch provides a REST API for managing your cluster and for indexing and searching your data. For testing purposes, you can submit requests directly from the command line or from the Kibana dashboard by running GET requests in the Kibana console under Dev Tools.
- To find the Elasticsearch cluster health, run GET _cluster/health, where _cluster is the API and health is the command.
- To check the Elasticsearch node details, run a request such as GET _cat/nodes?v.
- To check the configured Elasticsearch indices, run a request such as GET _cat/indices?v. You will notice that Kibana’s indices are listed too, because Kibana data is also stored in Elasticsearch.
- To check the primary and replica shards from the Kibana console, run a request such as GET _cat/shards?v.
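The checks described above can also be run from a script instead of the Kibana console. The following is a minimal sketch using only the Python standard library. It assumes an unsecured test cluster at http://localhost:9200 (an assumption, not something from this tutorial) and uses the cluster health API plus the cat APIs for nodes, indices, and shards.

```python
import json
import urllib.request

# Assumption: a local, unsecured test cluster; replace with your own endpoint.
BASE_URL = "http://localhost:9200"

# The same paths you would type after GET in the Kibana Dev Tools console.
ENDPOINTS = {
    "cluster health": "_cluster/health",
    "nodes": "_cat/nodes?v",
    "indices": "_cat/indices?v",
    "shards": "_cat/shards?v",
}


def build_url(base_url: str, path: str) -> str:
    """Join the cluster address and an API path into a full URL."""
    return base_url.rstrip("/") + "/" + path.lstrip("/")


def fetch(base_url: str, path: str, timeout: float = 5.0) -> str:
    """Send a GET request and return the response body as text."""
    with urllib.request.urlopen(build_url(base_url, path), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


def summarize_health(body: str) -> str:
    """Reduce the _cluster/health JSON response to a one-line summary."""
    health = json.loads(body)
    return f"{health['cluster_name']}: {health['status']} ({health['number_of_nodes']} nodes)"
```

With a cluster running, `print(summarize_health(fetch(BASE_URL, ENDPOINTS["cluster health"])))` prints a one-line status, and `print(fetch(BASE_URL, ENDPOINTS["indices"]))` prints the table of indices.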
QuickStart Kibana Dashboard
Kibana lets you search documents, observe and analyze your data, and visualize it in charts, maps, graphs, and more, all in the form of a dashboard for the Elastic Stack. Your data can be structured or unstructured text, numerical data, time-series data, geospatial data, logs, metrics, or security events.
Kibana also lets you manage your data, monitor the health of your Elastic Stack cluster, and control which users have access to the Kibana Dashboard.
Kibana also lets you load data into the ELK stack by uploading a file and optionally importing the data into an Elasticsearch index. Let’s learn how to import data in the Kibana dashboard.
- Create a file named shanky.txt and paste in the content below.
[ 6.487046] kernel: emc: device handler registered
[ 6.489024] kernel: rdac: device handler registered
[ 6.596669] kernel: loop0: detected capacity change from 0 to 51152
[ 6.620482] kernel: loop1: detected capacity change from 0 to 113640
[ 6.636498] kernel: loop2: detected capacity change from 0 to 137712
[ 6.668493] kernel: loop3: detected capacity change from 0 to 126632
[ 6.696335] kernel: loop4: detected capacity change from 0 to 86368
[ 6.960766] kernel: audit: type=1400 audit(1643177832.640:2): apparmor="STATUS" operation="profile_load" profile="unconfined" name="lsb_release" pid=394 comm="apparmor_parser"
[ 6.965983] kernel: audit: type=1400 audit(1643177832.644:3): apparmor="STATUS" operation="profile_load" profile="unconfined" name="nvidia_modprobe" pid=396 comm="apparmor_parser"
- Once the file is uploaded to Kibana successfully, you will see a summary of the content you uploaded.
- Next, create the Elasticsearch index (shankyindex in this example) and click Import.
- After the import succeeds, Kibana shows the status of your Elasticsearch index.
- Next, click View index in Discover. You should now be able to see the logs in the Elasticsearch index (shankyindex).
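Kibana does the parsing and indexing for you during the upload. If you wanted to do the same from a script, one possible approach is sketched below: parse each line of shanky.txt into a JSON document and build a request body for the Elasticsearch bulk API. The field names (uptime_seconds, source, message) are hypothetical and are not necessarily what Kibana's importer produces.

```python
import json
import re

# One line of shanky.txt looks like:
#   [ 6.487046] kernel: emc: device handler registered
LINE_PATTERN = re.compile(
    r"^\[\s*(?P<uptime>\d+\.\d+)\]\s+(?P<source>\w+):\s+(?P<message>.*)$"
)


def parse_line(line: str):
    """Turn one log line into a document (dict), or None if it does not match."""
    match = LINE_PATTERN.match(line.strip())
    if match is None:
        return None
    return {
        "uptime_seconds": float(match.group("uptime")),
        "source": match.group("source"),
        "message": match.group("message"),
    }


def bulk_body(lines, index_name: str = "shankyindex") -> str:
    """Build a newline-delimited JSON body for the _bulk API."""
    out = []
    for line in lines:
        doc = parse_line(line)
        if doc is None:
            continue  # skip blank or unrecognized lines
        out.append(json.dumps({"index": {"_index": index_name}}))  # action line
        out.append(json.dumps(doc))                                # document line
    return "\n".join(out) + "\n"  # the bulk API expects a trailing newline
```

The resulting string would be sent as the body of a POST request to the _bulk endpoint with the Content-Type header set to application/x-ndjson.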
Kibana also allows you to perform actions such as:
- Refresh, flush, and clear the cache of your indices or index.
- Define the lifecycle of an index as it ages.
- Define a policy for taking snapshots of your Elasticsearch cluster.
- Roll up data from one or more indices into a new, compact index.
- Replicate indices on a remote cluster and copy them to a local cluster.
- Set up alerting to detect conditions in different Kibana apps and trigger actions when those conditions are met.
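The first action in the list above (refresh, flush, and clear the cache) corresponds to index APIs that you can also call directly. As a small illustration, the sketch below maps each action to its HTTP method and path. The helper name action_request is made up for this example, and shankyindex is the index created earlier.

```python
# Maps the index maintenance actions to the REST endpoints that perform them.
ACTIONS = {
    "refresh": "_refresh",
    "flush": "_flush",
    "clear_cache": "_cache/clear",
}


def action_request(index_name: str, action: str):
    """Return the (HTTP method, path) pair for an index maintenance action."""
    if action not in ACTIONS:
        raise ValueError(f"unknown action: {action}")
    return ("POST", f"/{index_name}/{ACTIONS[action]}")


if __name__ == "__main__":
    for name in ACTIONS:
        method, path = action_request("shankyindex", name)
        print(method, path)
```

Each printed line can be pasted into the Kibana console as is.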
What is Logstash?
Logstash allows you to collect data with real-time pipelining capabilities. It can collect data from various sources, such as Beats, and push it to the Elasticsearch cluster. With Logstash, any type of event is transformed using an array of input, filter, and output plugins, further simplifying the ingestion process.
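The input, filter, and output stages can be pictured as three small functions chained together. The sketch below is a conceptual illustration in plain Python, not Logstash itself or its configuration format. The level field and the in-memory list standing in for Elasticsearch are assumptions made for the example.

```python
from typing import Dict, Iterable, Iterator, List

Event = Dict[str, object]


def input_stage(lines: Iterable[str]) -> Iterator[Event]:
    """Input: wrap each raw line in an event."""
    for line in lines:
        yield {"message": line.rstrip("\n")}


def filter_stage(events: Iterable[Event]) -> Iterator[Event]:
    """Filter: drop empty events and enrich the rest with a 'level' field."""
    for event in events:
        message = str(event["message"])
        if not message:
            continue
        event["level"] = "error" if "error" in message.lower() else "info"
        yield event


def output_stage(events: Iterable[Event], sink: List[Event]) -> None:
    """Output: deliver each event to a destination (a list stands in for Elasticsearch)."""
    for event in events:
        sink.append(event)


def run_pipeline(lines: Iterable[str], sink: List[Event]) -> None:
    """Chain the three stages, the way a pipeline chains its plugins."""
    output_stage(filter_stage(input_stage(lines)), sink)
```

Because the stages are generators, each event flows through the whole chain one at a time, which loosely mirrors how a pipeline processes events as they arrive.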
Features of Logstash
Now that you have a basic idea about Logstash, let’s look at some of its benefits:
- Logstash handles all types of logging data and easily ingests web logs like Apache and application logs like log4j for Java.
- Logstash captures other log formats like syslog, networking and firewall logs.
- One of the main benefits of Logstash is to securely ingest logs with Filebeat.
What is AWS Elasticsearch or Amazon OpenSearch Service?
Amazon Elasticsearch Service, now Amazon OpenSearch Service, is a managed service that deploys and scales Elasticsearch clusters in the cloud. Elasticsearch is an open-source search and analytics engine used for real-time application monitoring and log analytics.
Amazon Elasticsearch Service provisions all the resources for your Elasticsearch clusters and launches them. It also replaces failed Elasticsearch nodes in the cluster automatically. Let’s look at some of the key features of the Amazon Elasticsearch Service.
- AWS Elasticsearch or Amazon OpenSearch can scale up to 3 PB of attached storage and works with various instance types.
- AWS Elasticsearch or Amazon OpenSearch easily integrates with other services such as IAM for security, VPC, Amazon S3 for loading data, Amazon CloudWatch for monitoring, and Amazon SNS for alert notifications.
Creating the Amazon Elasticsearch Service domain or OpenSearch Service domain
Now that you have a basic idea about the Amazon Elasticsearch Service (OpenSearch Service), let’s create a domain using the AWS Management Console.
- Open your favorite web browser and navigate to the AWS Management Console and log in.
- While in the Console, click on the search bar at the top, search for ‘Elasticsearch’, and click on the Elasticsearch menu item.
Note that the Elasticsearch service has now been replaced by the OpenSearch Service.
- An Amazon Elasticsearch domain is the equivalent of an Elasticsearch cluster: domains are clusters with the settings, instance types, instance counts, and storage resources that you specify. Click Create a new domain.
- Next, select the deployment type as Development and testing.
Next, configure the following settings:
- For Configure domain, provide the Elasticsearch domain name as “firstdomain”. A domain is the collection of resources needed to run Elasticsearch. The domain name will be part of your domain endpoint.
- For Data nodes, choose t3.small.elasticsearch, leave the rest of the settings unchanged, and click Next.
- For Network configuration, choose Public access.
- For Fine-grained access control, choose Create master user and provide the username as user and the password as Admin@123. Fine-grained access control keeps your data safe.
- For Domain access policy, choose Allow open access to the domain. Access policies control whether a request is accepted or rejected when it reaches the Amazon Elasticsearch Service domain.
- Keep clicking Next and then create the domain. It takes a few minutes for the domain to launch.
- After the Elasticsearch domain is created successfully, click the firstdomain Elasticsearch domain.
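The console steps above can also be approximated in code. The following is a rough sketch using the third-party boto3 library and its legacy "es" client. It mirrors the choices made in the console (firstdomain, t3.small.elasticsearch, a master user, open access). The Elasticsearch version, EBS volume size, region, and account ID are assumptions for illustration, so check them against your own account before running anything.

```python
import json


def open_access_policy(region: str, account_id: str, domain_name: str) -> str:
    """Access policy document that allows open access to the domain (testing only)."""
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": "*"},
                "Action": "es:*",
                "Resource": f"arn:aws:es:{region}:{account_id}:domain/{domain_name}/*",
            }
        ],
    }
    return json.dumps(policy)


def domain_request(region: str, account_id: str, master_password: str,
                   domain_name: str = "firstdomain", master_user: str = "user") -> dict:
    """Keyword arguments for create_elasticsearch_domain, mirroring the console choices."""
    return {
        "DomainName": domain_name,
        "ElasticsearchVersion": "7.10",  # assumption: pick the version you need
        "ElasticsearchClusterConfig": {
            "InstanceType": "t3.small.elasticsearch",
            "InstanceCount": 1,
        },
        "EBSOptions": {"EBSEnabled": True, "VolumeType": "gp2", "VolumeSize": 10},  # assumed size (GiB)
        "AccessPolicies": open_access_policy(region, account_id, domain_name),
        # Fine-grained access control needs HTTPS plus both kinds of encryption.
        "EncryptionAtRestOptions": {"Enabled": True},
        "NodeToNodeEncryptionOptions": {"Enabled": True},
        "DomainEndpointOptions": {"EnforceHTTPS": True},
        "AdvancedSecurityOptions": {
            "Enabled": True,
            "InternalUserDatabaseEnabled": True,
            "MasterUserOptions": {
                "MasterUserName": master_user,
                "MasterUserPassword": master_password,
            },
        },
    }


def create_domain(region: str, account_id: str, master_password: str) -> dict:
    """Create the domain; needs boto3 installed and AWS credentials configured."""
    import boto3  # third-party; imported here so the rest of the sketch loads without it

    client = boto3.client("es", region_name=region)
    return client.create_elasticsearch_domain(**domain_request(region, account_id, master_password))
```

Keeping the request in a plain dictionary makes it easy to inspect before anything is created. The password is passed in as a parameter so it can come from an environment variable or a secrets store rather than being written into the script.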
Uploading data in AWS Elasticsearch
You can load streaming data into your Amazon Elasticsearch Service (Amazon ES) domain from many different sources. Some, like Amazon Kinesis Data Firehose and Amazon CloudWatch Logs, have built-in support, while others, like Amazon S3, Amazon Kinesis Data Streams, and Amazon DynamoDB, use AWS Lambda functions as event handlers.
- In this tutorial, you will upload sample data. Open the Elasticsearch domain URL, log in with the username user and the password Admin@123, and then click Add data.
- Now choose the sample data and add the e-commerce orders data set.
Search documents in Kibana Dashboard
Kibana is a popular open-source visualization tool that works with the AWS Elasticsearch service. It provides an interface to monitor and search the indexes. Let’s use Kibana to search the sample data you just uploaded in AWS ES.
- On the same Elasticsearch domain URL, click the Discover option on the left side to search the data.
- You will notice that Kibana now has the data that was uploaded. You can adjust the time range and many other fields as needed.
Kibana returned the sample data you uploaded when you searched in the dashboard.
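The same kind of search can be issued against the domain endpoint from a script. The sketch below uses only the Python standard library and the master user's credentials for HTTP basic authentication. The index name kibana_sample_data_ecommerce and the order_date field are assumptions about how the sample e-commerce data is stored, so confirm them in your own domain first. The endpoint URL is a placeholder that you supply.

```python
import base64
import json
import urllib.request

# Assumption: the index name given to the sample e-commerce data set.
SAMPLE_INDEX = "kibana_sample_data_ecommerce"


def basic_auth_header(username: str, password: str) -> str:
    """Build the value of the Authorization header for HTTP basic auth."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return f"Basic {token}"


def recent_orders_query(days: int, size: int = 5) -> dict:
    """Query DSL body: orders from the last `days` days, newest first."""
    return {
        "size": size,
        "sort": [{"order_date": {"order": "desc"}}],
        "query": {"range": {"order_date": {"gte": f"now-{days}d/d"}}},
    }


def search(endpoint: str, username: str, password: str, body: dict) -> dict:
    """POST the query to the domain's _search API and return the parsed response."""
    request = urllib.request.Request(
        url=f"{endpoint.rstrip('/')}/{SAMPLE_INDEX}/_search",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": basic_auth_header(username, password),
        },
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as resp:
        return json.loads(resp.read())
```

The range filter plays the role of the time picker in Discover: changing days narrows or widens the window of orders that comes back.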
In this tutorial, you learned what the Elastic Stack, Elasticsearch, Logstash, and the Kibana dashboard are, and how to set up AWS Elasticsearch from scratch using the AWS Management Console. You also learned how to upload sample data to AWS ES.
Now that you have a strong understanding of the ELK Stack, Elasticsearch, Kibana, and AWS Elasticsearch, which site are you planning to monitor using the ELK Stack and its components?