Plenty of storage services can hold large amounts of data, but analyzing that data with good performance has always been a challenge. Typical problems include queries that fail to return data in time and storage leakage.
To solve these issues, AWS provides its own managed service for storing gigabytes to terabytes of data and then analyzing it: Amazon Redshift.
In this tutorial you will learn about Amazon’s data warehouse and analytics service, Amazon Redshift: what a Redshift cluster is and how to create one using the AWS Management Console.
Table of Contents
- What is Amazon Redshift?
- What is Amazon Redshift Cluster?
- Amazon Redshift Cluster overview
- How to Create a basic Redshift Cluster using AWS Management console
What is Amazon Redshift?
Amazon Redshift is an AWS analytics service. It lets you store massive amounts of data and analyze it by running queries against a database. It is a fully managed service, which means you don’t need to worry about scalability or infrastructure.
The first step before uploading data is to create a set of nodes, known as an Amazon Redshift cluster. Once the cluster is created, you can upload large volumes of data (gigabytes and beyond) and start analyzing it.
Amazon Redshift manages everything required at the infrastructure end for you, such as monitoring, scaling, applying patches and upgrades, and capacity.
What is Amazon Redshift Cluster?
An Amazon Redshift cluster can contain a single node or more than one node, depending on your requirements. A cluster with more than one node contains one leader node, and the other nodes are known as compute nodes.
You can create an Amazon Redshift cluster in several ways:
- AWS Command Line Interface (AWS CLI)
- AWS Management Console
- AWS SDKs (Software Development Kits)
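For the SDK route, a minimal Python sketch might look like the following. It assumes the boto3 library is installed and AWS credentials are already configured. The helper names `build_cluster_request` and `create_cluster`, the `dc2.large` node type, the region, and every identifier in it are illustrative assumptions, not values taken from the console walkthrough in this tutorial.

```python
def build_cluster_request(cluster_id, master_user, master_password,
                          node_type="dc2.large", node_count=1,
                          db_name="dev", iam_role_arns=None):
    """Build keyword arguments for boto3's Redshift create_cluster call.

    node_type and db_name defaults are assumptions chosen for illustration.
    """
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    request = {
        "ClusterIdentifier": cluster_id,
        "NodeType": node_type,
        "MasterUsername": master_user,
        "MasterUserPassword": master_password,
        "DBName": db_name,
    }
    if node_count == 1:
        # A single node acts as both leader and compute node.
        request["ClusterType"] = "single-node"
    else:
        # NumberOfNodes is only sent for multi-node clusters.
        request["ClusterType"] = "multi-node"
        request["NumberOfNodes"] = node_count
    if iam_role_arns:
        request["IamRoles"] = list(iam_role_arns)
    return request


def create_cluster(request, region="us-east-1"):
    """Send the request to AWS. This creates billable resources."""
    import boto3  # third-party library, imported lazily on purpose

    client = boto3.client("redshift", region_name=region)
    return client.create_cluster(**request)
```

Building the request separately from sending it makes it easy to inspect or log the parameters before anything is provisioned. Calling `create_cluster(build_cluster_request("demo-cluster", "awsuser", "<password>"))` would then start a single-node cluster under these assumptions.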
Amazon Redshift Cluster overview
Let’s look at some of the key concepts of an Amazon Redshift cluster.
- Redshift cluster snapshots can be created either manually or automatically and are stored in Amazon S3.
- An administrator assigns IAM permissions on the Redshift cluster to any user who needs access to it.
- Amazon CloudWatch is primarily used to capture the health and performance of the Amazon Redshift cluster.
- As soon as you create an Amazon Redshift cluster, one database is created along with it. This database is used to query and analyze the data. When you provision the cluster you must provide a master user, which is the superuser for the database and has all rights.
- When a client queries a Redshift cluster, the leader node receives every request, parses it, and develops a query execution plan. The leader node then coordinates with the compute nodes and returns the final results to the client.
Before creating a cluster, make sure you meet the following prerequisites:
- You must have an AWS account in order to set up an Amazon Redshift cluster. If you don’t have one, create an AWS account first.
- You must have permission to create an IAM role and an Amazon Redshift cluster.
- (Optional) AWS administrator rights are helpful if you have them.
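To picture the leader node and compute node split described in the overview, here is a toy model in plain Python. It is only an illustration of the idea: the `LeaderNode` and `ComputeNode` classes, the round-robin `distribute` helper, and the sample rows are all hypothetical, and this is not how Redshift is actually implemented.

```python
from dataclasses import dataclass, field


@dataclass
class ComputeNode:
    """Holds one slice of the data and answers partial queries on it."""
    rows: list = field(default_factory=list)

    def partial_sum(self, column):
        return sum(row[column] for row in self.rows)


class LeaderNode:
    """Receives the client request and coordinates the compute nodes."""

    def __init__(self, compute_nodes):
        self.compute_nodes = compute_nodes

    def total(self, column):
        # "Execution plan": ask every compute node for a partial result...
        partials = [node.partial_sum(column) for node in self.compute_nodes]
        # ...then combine the partial results into the final answer.
        return sum(partials)


def distribute(rows, node_count):
    """Spread rows across compute nodes in round-robin order."""
    nodes = [ComputeNode() for _ in range(node_count)]
    for index, row in enumerate(rows):
        nodes[index % node_count].rows.append(row)
    return nodes


rows = [{"qty": n} for n in range(1, 11)]
leader = LeaderNode(distribute(rows, node_count=3))
print(leader.total("qty"))  # same total as summing all rows on one machine
```

The client only ever talks to the leader, which is why adding compute nodes changes how the work is split but not how the query is written.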
How to Create a basic Redshift Cluster using AWS Management console
Before creating a Redshift cluster, we need an IAM role that Redshift will assume to work with other services such as Amazon S3. So let’s get started.
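For reference, the role built in the console steps that follow boils down to a trust policy that lets the Redshift service assume the role, plus the AmazonS3ReadOnlyAccess managed policy. A scripted equivalent might look like the sketch below. It assumes boto3 is installed and the caller is allowed to create IAM roles; the function names and the role name you pass in are hypothetical.

```python
import json

# AWS managed policy attached in the console walkthrough.
S3_READ_ONLY_POLICY_ARN = "arn:aws:iam::aws:policy/AmazonS3ReadOnlyAccess"


def redshift_trust_policy():
    """Trust policy that allows the Redshift service to assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "redshift.amazonaws.com"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


def create_redshift_role(role_name):
    """Create the role, attach S3 read-only access, and return the role ARN."""
    import boto3  # third-party library, imported lazily on purpose

    iam = boto3.client("iam")
    role = iam.create_role(
        RoleName=role_name,
        AssumeRolePolicyDocument=json.dumps(redshift_trust_policy()),
    )
    iam.attach_role_policy(RoleName=role_name,
                           PolicyArn=S3_READ_ONLY_POLICY_ARN)
    return role["Role"]["Arn"]
```

The returned ARN is the same value the console shows once the role exists, and it is what gets associated with the cluster later.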
- Open your browser, go to the AWS Management Console, search for IAM at the top of the page, and open it.
- Next, click on Create Role.
- Next, select Redshift as the service.
- Now scroll down to the bottom to “Select your use case”, choose Redshift – Customizable, and then choose Next: Permissions.
- Now attach the AmazonS3ReadOnlyAccess policy and click Next.
- Next, skip tagging for now: click on Next: Tags, then Review, and finally Create Role.
- The IAM role is now created. Keep the IAM role ARN handy.
- Now, in the AWS Management Console, search for Redshift at the top of the page.
- Click on Create Cluster and provide a name for the cluster. As this is a demo, we will use the free trial cluster.
- Provide the database details and save them for future use. Also associate the IAM role we created earlier.
- Finally, click on Create cluster.
- The Redshift cluster is now created and available for use.
- Let’s validate the database connection by running a simple query. Click on Query data.
- Enter the database credentials to connect to the Redshift cluster (the dev database was created by default).
- Some of the tables inside the database, such as events and date, were created by default.
- Now run a query such as the one below:
select * from date
This confirms that the Redshift cluster was created successfully and that we can run queries against it.
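The same check can also be scripted instead of using the console query editor. The sketch below uses the Redshift Data API through boto3. It assumes boto3 is installed, the caller has permission to use the Data API with temporary database credentials, and the cluster identifier and database user are whatever you chose when creating the cluster. The helper names `build_statement` and `run_query` are hypothetical.

```python
# Statuses after which the Data API will not change the statement any further.
TERMINAL_STATES = {"FINISHED", "FAILED", "ABORTED"}


def build_statement(cluster_id, database, db_user, sql):
    """Build keyword arguments for the Data API execute_statement call."""
    return {
        "ClusterIdentifier": cluster_id,
        "Database": database,
        "DbUser": db_user,
        "Sql": sql,
    }


def run_query(params, poll_seconds=1.0, region="us-east-1"):
    """Submit the statement, wait for it to finish, return the first page of rows."""
    import time

    import boto3  # third-party library, imported lazily on purpose

    client = boto3.client("redshift-data", region_name=region)
    statement_id = client.execute_statement(**params)["Id"]

    # The Data API is asynchronous, so poll until the statement settles.
    while True:
        description = client.describe_statement(Id=statement_id)
        if description["Status"] in TERMINAL_STATES:
            break
        time.sleep(poll_seconds)

    if description["Status"] != "FINISHED":
        raise RuntimeError(description.get("Error", "query did not finish"))
    return client.get_statement_result(Id=statement_id)["Records"]
```

With a cluster named as in your setup, `run_query(build_statement("<cluster-id>", "dev", "<db-user>", "select * from date"))` would return the first page of rows from the date table. Larger results are paginated, which this sketch does not handle.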
In this tutorial we learned about Amazon’s data warehouse and analytics service, Amazon Redshift: what a Redshift cluster is and how to create one using the AWS Management Console.
With this service in hand, you are ready to work with gigabytes and terabytes of data and analyze it with good performance.