Comprehensive course on Amazon Athena 2020
Data analysis is a complex process, and there have always been attempts to make it easier. There are many tools for analytics, and Amazon offers one of its own as an AWS service: Amazon Athena. This Amazon Athena tutorial will guide you through the basics and advanced usage of Amazon Athena.
Amazon Athena is an interactive query service used to run complex queries in relatively little time. It is serverless, so there is nothing to set up and no infrastructure to manage. It is not a database service that you provision and keep running, so you pay only for the queries you run. You point Athena at your data in S3, define the schema, and query it with standard SQL.
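To make the "point at S3, define the schema, run SQL" flow concrete, here is a minimal sketch in Python that only builds the table definition as text. The helper name `build_create_table_ddl`, the bucket `example-bucket`, and the `web_logs.requests` table are hypothetical and chosen purely for illustration; the generated statement assumes comma-delimited text files.

```python
def build_create_table_ddl(database, table, columns, s3_location, delimiter=","):
    """Return a CREATE EXTERNAL TABLE statement for delimited text files in S3.

    `columns` is a list of (name, type) pairs, e.g. [("status", "int")].
    """
    if not s3_location.startswith("s3://"):
        raise ValueError("s3_location must be an s3:// URI")
    # The location should be a folder-style prefix, not a single object.
    location = s3_location if s3_location.endswith("/") else s3_location + "/"
    column_defs = ",\n".join(f"  `{name}` {col_type}" for name, col_type in columns)
    return (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS `{database}`.`{table}` (\n"
        f"{column_defs}\n"
        ")\n"
        "ROW FORMAT DELIMITED\n"
        f"  FIELDS TERMINATED BY '{delimiter}'\n"
        "STORED AS TEXTFILE\n"
        f"LOCATION '{location}'"
    )


if __name__ == "__main__":
    # Hypothetical schema for web server logs stored as CSV.
    ddl = build_create_table_ddl(
        database="web_logs",
        table="requests",
        columns=[("request_time", "string"), ("path", "string"), ("status", "int")],
        s3_location="s3://example-bucket/logs",
    )
    print(ddl)
    # Once the table exists, ordinary SQL works against it, for example:
    print("SELECT status, COUNT(*) AS hits FROM web_logs.requests GROUP BY status")
```

The statement only registers a schema over files that already sit in S3; no data is copied or loaded, which is why there is nothing to provision first.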
On November 30, 2016, Amazon launched Athena as one of its services. As described earlier, Amazon Athena is a serverless query service that makes it simpler to analyze data stored in Amazon S3 using standard SQL. With a few clicks in the AWS Management Console, customers can point Amazon Athena at their data stored in Amazon S3 and run queries using standard SQL to get results in seconds.
With Amazon Athena, there is no infrastructure to set up or manage, and the customer pays only for the queries they run. Amazon Athena scales automatically, executing queries in parallel, which gives fast results even with large datasets and complex queries. Now that you know what Amazon Athena is, let me take you through how it differs from SQL Server.
What is AWS Athena?
· Is a query service that uses standard SQL
· Uses data stored as objects on Amazon S3
· Has no infrastructure to manage
· You pay only for the queries you run
· Based on Presto, an open-source distributed SQL query engine originally developed at Facebook
· Supports CSV and JSON files (including Gzip-compressed ones) and columnar formats like Apache Parquet
· Performance scales automatically based on query profiling
Use Of Amazon Athena
If you are a data analyst with experience analysing data stored on S3, you will relate to this:
Data Analysts/Developers: Do you offer Storage?
Data Analysts/Developers: Do you have tools for Analytics?
AWS: Not sure.
Amazon worked on this and came up with Amazon Athena. Now you have a tool to play with your data. Athena helps you analyze unstructured, semi-structured and structured data that is stored in Amazon S3. Using Athena you can create dynamic queries for your dataset. Athena also works with AWS Glue to give you a better way to store the metadata that describes your data in S3.
Supported Business Intelligence Tools
AWS Athena also integrates with sophisticated BI tools like Tableau, Looker, Mode Analytics, Amazon QuickSight, and others for advanced reports and visualisations, so it should be in your consideration set. This is particularly true for businesses that want the simplicity of using Athena for spot or ad hoc data analysis.
AWS Athena uses a pay-for-usage pricing model. This can be attractive to those who thought the power of this kind of querying system was out of their budget or required complex systems and DevOps support.
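To get a feel for what pay-for-usage means in practice, here is a rough cost sketch in Python. The figures are assumptions, not values from this article: a price of 5 USD per TB of data scanned, billing rounded up to the nearest megabyte, and a 10 MB minimum per query. Check the current Athena pricing page for your region before relying on any of them. The function name `estimate_query_cost` is hypothetical.

```python
import math

# Assumptions for illustration only; verify against current AWS pricing.
PRICE_PER_TB_USD = 5.00
MIN_BILLED_MB = 10
BYTES_PER_MB = 1024 ** 2
MB_PER_TB = 1024 ** 2


def estimate_query_cost(bytes_scanned):
    """Estimate the cost in USD of one query that scans `bytes_scanned` bytes."""
    if bytes_scanned < 0:
        raise ValueError("bytes_scanned must be non-negative")
    # Round up to whole megabytes, then apply the assumed per-query minimum.
    billed_mb = max(math.ceil(bytes_scanned / BYTES_PER_MB), MIN_BILLED_MB)
    return billed_mb / MB_PER_TB * PRICE_PER_TB_USD


if __name__ == "__main__":
    one_tb = 1024 ** 4
    print(f"Full scan of 1 TB: ${estimate_query_cost(one_tb):.2f}")
    # A query that only needs to read a tenth of the bytes costs a tenth as much.
    print(f"Scan of 0.1 TB:    ${estimate_query_cost(one_tb // 10):.2f}")
```

Because the charge follows bytes scanned rather than query count or runtime, anything that lets a query read fewer bytes lowers the bill directly.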
It can also add value and reduce costs in multi-cloud deployments. We worked with a customer that would send Adobe event data to an AWS data lake to support an enterprise Oracle Cloud environment. Using a query engine was an efficient and cost-effective data consumption pattern for the Oracle BI environment.
Features Of Athena
Athena is one of the many services provided by Amazon. It has many features that make it suitable for data analysis. Let’s take a look at the different features one by one.
1. Easy Implementation: Athena doesn’t require installation. It can be accessed directly from the AWS Console or through the AWS CLI.
2. Serverless: It is serverless, so the end-user doesn’t need to worry about infrastructure, configuration, scaling or failure. Athena takes care of everything on its own.
3. Pay per query: Athena charges you only for the queries you run, based on the amount of data scanned per query. You can save a lot by compressing your data and storing it in a suitable format.
4. Fast: Athena is a very fast analytics tool. It can perform complex queries in less time by breaking them into simpler ones, running those in parallel, and then combining the results to give the desired output.
5. Secure: With the help of AWS Identity and Access Management (IAM) policies, Athena gives you complete control over the data set. As the data is stored in S3 buckets, IAM policies can help you manage which users have access.
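6. Highly available: Athena runs on AWS infrastructure across multiple facilities, so the user can execute queries round the clock. The exact availability commitment is defined in the Amazon Athena service level agreement, not by a single blanket figure for AWS as a whole.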
7. Integration: One of the best features of Athena is that it can be integrated with AWS Glue. AWS Glue helps the user create a unified metadata repository. This helps you create better versioning of data, better tables, views, etc.
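The security point in the list above can be sketched as an IAM policy document. The following Python builds one as a plain dictionary; it is an illustrative, deliberately incomplete sketch, not a recommended production policy. The function name `build_athena_read_policy` and the bucket names `example-data` and `example-results` are hypothetical, and the permissions a real setup needs for the AWS Glue Data Catalog are omitted.

```python
import json


def build_athena_read_policy(data_bucket, results_bucket):
    """Return an IAM policy dict: run queries, read source data, write results."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "RunAthenaQueries",
                "Effect": "Allow",
                "Action": [
                    "athena:StartQueryExecution",
                    "athena:GetQueryExecution",
                    "athena:GetQueryResults",
                ],
                # Assumption: not narrowed to a specific workgroup in this sketch.
                "Resource": "*",
            },
            {
                "Sid": "ReadSourceData",
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:ListBucket"],
                "Resource": [
                    f"arn:aws:s3:::{data_bucket}",
                    f"arn:aws:s3:::{data_bucket}/*",
                ],
            },
            {
                "Sid": "WriteQueryResults",
                "Effect": "Allow",
                "Action": [
                    "s3:GetObject",
                    "s3:PutObject",
                    "s3:ListBucket",
                    "s3:GetBucketLocation",
                ],
                "Resource": [
                    f"arn:aws:s3:::{results_bucket}",
                    f"arn:aws:s3:::{results_bucket}/*",
                ],
            },
        ],
    }


if __name__ == "__main__":
    policy = build_athena_read_policy("example-data", "example-results")
    print(json.dumps(policy, indent=2))
```

Note that the source data statement grants read access only, so a user holding this policy can query the data set without being able to change it.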
Accessing Amazon Athena
Accessing Athena is very easy and it can be done through any of the following:
o AWS Console
o AWS CLI
o A JDBC connection to Athena
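The same API operations behind these access routes are also exposed through the AWS SDKs. The sketch below shows the usual start-then-poll pattern in Python. It assumes a client object shaped like the one returned by `boto3.client("athena")`; the helper name `run_query`, the database name, and the results location in the usage comment are hypothetical. The client is passed in as a parameter so the function itself has no AWS dependency.

```python
import time

TERMINAL_STATES = ("SUCCEEDED", "FAILED", "CANCELLED")


def run_query(client, sql, database, output_location, poll_seconds=1.0, max_polls=60):
    """Start a query and poll until it finishes; return (query_id, final_state)."""
    started = client.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": database},
        ResultConfiguration={"OutputLocation": output_location},
    )
    query_id = started["QueryExecutionId"]
    for _ in range(max_polls):
        response = client.get_query_execution(QueryExecutionId=query_id)
        state = response["QueryExecution"]["Status"]["State"]
        if state in TERMINAL_STATES:
            return query_id, state
        # Still QUEUED or RUNNING: wait before asking again.
        time.sleep(poll_seconds)
    raise TimeoutError(f"query {query_id} did not finish after {max_polls} polls")


# Usage sketch (requires boto3 and AWS credentials, so it is left commented out):
#
#   import boto3
#   client = boto3.client("athena")
#   query_id, state = run_query(
#       client,
#       "SELECT status, COUNT(*) FROM requests GROUP BY status",
#       database="web_logs",
#       output_location="s3://example-results/athena/",
#   )
```

Queries run asynchronously, which is why the function polls for a terminal state instead of expecting rows back from the first call.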
These are a few of the ways to access Amazon Athena. By now, you know most of what is important about Amazon Athena. Let me walk you through the different features of Athena.