The latest addition to Amazon Web Services (AWS) is a Business Intelligence (BI) tool called Amazon QuickSight. Built directly into AWS, QuickSight makes it easy to build visualizations, perform ad-hoc analysis and quickly get business insights from your data. We think QuickSight will take the market by storm because of its feature set, pricing and integration. This is the first in a series of posts on how to use Amazon QuickSight.
QuickSight is unique in that it allows you to join structured and unstructured data to create quick, meaningful visualizations from data that is stored in across many data sources such as databases, data warehouses and flat files. It enables the correlation of items like logs to user information and much more.
Pricing for Amazon QuickSight is unbelievably reasonable given it’s power. At US $9 per month per user, QuickSight creates a low barrier to entry for Business Intelligence. QuickSight requires a yearly commitment; however, there is a 60 day free trial for up to four users.
Getting Started with QuickSight
First things first: Unlike any other AWS service, you have to register for QuickSight (QS). QS is not available for role-based access and therefore federated user accounts don’t have access. You must use an IAM User account to access QS. While you can use the root account, we only suggest doing so in dire situations.
QuickSight requires read access to your resources. When registering for QS you are prompted to allow access to one or all resources that QS can connect to natively. These resources are Relational Database Service (RDS), [pullquote]Would you like our help getting started with QuickSight? Call us at (800) 931-8354.[/pullquote]Redshift, IAM and S3. You can limit S3 access to particular buckets. RDS and Redshift access is used to list instances, enabling you to choose RDS or Redshift instances to connect to. IAM access allows you to send invites to other users in your account to view QS. You can add permissions to QS after the initial prompting by clicking on your name in the top-right corner and selecting from the dropdown menu Manage QuickSight, then Manage permissions.
QuickSight allows you to ingest 1GB of data by default. If you need more storage, which is likely, you must request a limit increase. Simply request a larger SPICE limit from the AWS support center.
Creating a Data Set in QuickSight
You must create a data set in QS before you can visualize anything. There are many types and ways to ingest data sets. In this first part I describe working with S3.
We took a small but interesting data set from the Internet in CSV format and uploaded it to our data lake bucket. From there we built a QuickSight manifest file. Many file formats are available such as the Redshift manifest file format. However, we chose to use the QuickSight manifest file format. The following is an example JSON file that we used to tell QS to ingest all the data from our bucket directory:
This manifest tells QS to ingest the CSV data from the S3 bucket rbn-datalake within the directory interesting-data. It also instructs QS that there is a header in these files, and sets our text qualifier to default and the delimiter to a comma. You can ingest multiple files at once by adding to the array or by using the URIprefixes object within fileLocations for QuickSight to scan and ingest all objects that match the format.
To create a data set in QuickSight, select the Manage data button at the top right-hand side of the dashboard. You will be prompted for a data set name and a manifest file. You can specify a URL or upload the manifest file. Once your input is entered you can edit or visualize your data. Once our data was ingested we validated that the data looked correct with the data set editor. To get to the data set editor after data has been entered, select your data set from the Manage data sets view and select Edit data set.
Visualizing Data Sets in QuickSight
You can visualize your data directly after ingestion from the link provided or create an analysis from the main dashboard. In the upper left-hand corner select New analysis and choose your data set. You are prompted to either refresh, edit, delete or share your data, as well as the option to create an analysis. Select Create an analysis. You are taken to the analysis screen, and QuickSight automatically starts to graph your data. You can select a particular graph type and your metrics.
To create meaningful graphs you need some type of metric. QS will automatically determine which fields are metrics and which are measures. If your data does not have any metrics, you can easily turn any of the data types, strings, dates, integers or decimals into a metric with QuickSight. On the left-hand side of the analysis screen there is a field list, which represents columns of data. On each field name you can select the field and also a dropdown. Within the dropdown you have the option to convert to metric. Numerical or date-based measures allow you to graph aggregates of the data. When converting a string to a metric you only get a count.
At the top of the analysis screen is the Field Well. Clicking on the white space or labels of this field will expand or collapse this area of the analysis tool. The Field Well shows the fields applied for the measure, metric and dimensions of the graph. You can select multiple fields for metrics and measures on some graph types where applicable.
The visualization area can be modified to add a title and description, as well as more visualizations. In the upper left-hand corner of the console there’s an add sign, which opens a dropdown, enabling you to add these items. If you add another visualization you’ll be able to organize and resize the tiles as you see fit. Below is a example of the interface with multiple visualizations (click to enlarge).
In my next blog post I’ll cover:
Making alterations to data
Ingesting from RDS and Redshift