Reduce downtime and move from reactive to proactive monitoring.
December 9, 2020
Amazon Redshift is a cloud-based data warehousing solution that makes it easy to collect and analyze large quantities of data within the cloud.
Cloud data warehouse services like Redshift can remove some of the performance and availability pain points associated with on-premises data warehousing, but they are not a silver bullet. Getting the most out of Redshift requires carefully monitoring your clusters to identify stability issues and performance bottlenecks.
Below, we take a look at the tools and processes that you can use to monitor Redshift, as well as some best practices for working with the monitoring data that you collect from Redshift clusters.
Redshift is a data warehouse service that is part of the Amazon cloud. Redshift clusters serve as central repositories where organizations can store different types of data, then analyze it using SQL queries. Using Redshift, you could collect all of the invoicing and sales data for your business, for example, and analyze it to identify relevant trends that stretch across different data sets.
Redshift is similar to traditional relational databases like MySQL in that it stores data in a structured way. However, a major difference between Redshift and most conventional databases is that Redshift uses a column-oriented rather than a row-oriented structure. This approach can improve I/O rates and, in turn, lead to faster database performance.
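To make the row-versus-column distinction concrete, here is a toy sketch in Python. It is not how Redshift is implemented internally; the invoice data and function names are made up for illustration. It only counts how many stored values a scan has to touch in order to sum a single column under each layout.

```python
# Toy comparison of row-oriented and column-oriented layouts.
# All data and names here are hypothetical; this only counts the
# stored values a scan must touch to sum one column.

rows = [
    {"invoice_id": 1, "customer": "acme", "region": "east", "amount": 120.0},
    {"invoice_id": 2, "customer": "globex", "region": "west", "amount": 80.0},
    {"invoice_id": 3, "customer": "initech", "region": "east", "amount": 200.0},
]


def sum_row_store(rows, column):
    """Sum one column when each record is stored (and read) as a whole row."""
    total = 0.0
    touched = 0
    for row in rows:
        touched += len(row)  # every field of the row is read
        total += row[column]
    return total, touched


def to_column_store(rows):
    """Pivot the rows into one list of values per column."""
    return {name: [row[name] for row in rows] for name in rows[0]}


def sum_column_store(columns, column):
    """Sum one column when each column is stored contiguously."""
    values = columns[column]  # only the requested column is read
    return sum(values), len(values)


row_total, row_touched = sum_row_store(rows, "amount")
col_total, col_touched = sum_column_store(to_column_store(rows), "amount")

print(f"row store:    total={row_total}, values touched={row_touched}")
print(f"column store: total={col_total}, values touched={col_touched}")
```

Both layouts produce the same total, but the column layout touches only the values in the one column the query asks for. On wide tables with many rows, that difference is where the I/O savings come from.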
If you want a traditional database for storing individual data sets, your best solution is to set up a platform like MySQL. Redshift and other data warehousing solutions, however, are better for use cases where you need to store multiple types of data in a single location while still keeping it structured and enabling fast queries.
To get the best value out of Redshift, it’s important to optimize the performance of your Redshift clusters. Your goal should be to maximize the number of queries you can run in a given period of time while minimizing latency, which slows query responses. Redshift monitoring can also help to identify underperforming nodes that are dragging down your overall cluster.
By using effective Redshift monitoring to optimize query speed, latency, and node health, you will achieve a better experience for your end-users while also simplifying the management of your Redshift clusters for your IT team.
Cost is a factor worth considering for Redshift monitoring, too. Redshift pricing is based largely on the volume of data you store and the amount of compute and memory resources assigned to your clusters. As a result, poorly performing clusters will cost the same amount of money as those that achieve optimal performance. This means that you’ll effectively end up paying more for each query on a cluster that does not respond as quickly as you’d like than you would on one that is properly monitored for performance issues.
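The arithmetic behind that point can be sketched in a few lines of Python. The hourly price and query counts below are invented for illustration and are not actual Redshift pricing; the only assumption is that the cluster costs the same per hour regardless of how many queries it completes.

```python
# Effective cost per query for a cluster billed at a fixed hourly rate.
# The price and throughput figures are hypothetical, not real Redshift pricing.


def cost_per_query(hourly_cluster_cost, queries_per_hour):
    """Return the effective cost of one query at a given throughput."""
    if queries_per_hour <= 0:
        raise ValueError("queries_per_hour must be positive")
    return hourly_cluster_cost / queries_per_hour


HOURLY_COST = 1.00  # hypothetical cluster price in dollars per hour

tuned = cost_per_query(HOURLY_COST, queries_per_hour=400)
slow = cost_per_query(HOURLY_COST, queries_per_hour=100)

print(f"tuned cluster: ${tuned:.4f} per query")
print(f"slow cluster:  ${slow:.4f} per query")
print(f"the slow cluster pays {slow / tuned:.1f}x as much per query")
```

The bill is identical in both cases. Only the amount of work you get for it changes.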
The default approach to Redshift monitoring is to use CloudWatch, Amazon’s native monitoring tool, to track metrics associated with your Redshift clusters. CloudWatch automatically collects a variety of metrics from your clusters and makes them viewable through a Web-based monitoring interface. You can also use the AWS CLI or the AWS SDKs to view CloudWatch data.
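As a rough sketch of the SDK route, the Python below uses boto3 (the AWS SDK for Python, a third-party package) to pull average CPU utilization for one cluster. The cluster identifier, time window, and helper names are assumptions made for this example. The request is built in a separate function so that it can be inspected without AWS credentials; only `fetch_cpu_datapoints` actually calls AWS.

```python
from datetime import datetime, timedelta, timezone


def build_cpu_request(cluster_id, end_time, hours=1, period_seconds=300):
    """Build the arguments for a CloudWatch get_metric_statistics call."""
    return {
        "Namespace": "AWS/Redshift",
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "ClusterIdentifier", "Value": cluster_id}],
        "StartTime": end_time - timedelta(hours=hours),
        "EndTime": end_time,
        "Period": period_seconds,  # one datapoint per 5 minutes by default
        "Statistics": ["Average"],
    }


def fetch_cpu_datapoints(cluster_id):
    """Fetch the last hour of average CPU utilization, oldest first.

    Requires boto3 and valid AWS credentials, so it is not called here.
    """
    import boto3  # third-party: pip install boto3

    client = boto3.client("cloudwatch")
    request = build_cpu_request(cluster_id, datetime.now(timezone.utc))
    response = client.get_metric_statistics(**request)
    return sorted(response["Datapoints"], key=lambda point: point["Timestamp"])


# "my-cluster" is a placeholder cluster identifier.
example = build_cpu_request(
    "my-cluster", datetime(2020, 12, 9, 12, 0, tzinfo=timezone.utc)
)
print(example["MetricName"], example["StartTime"], example["EndTime"])
```

With credentials configured, `fetch_cpu_datapoints("my-cluster")` would return the datapoints that CloudWatch holds for that cluster, each carrying a `Timestamp` and an `Average` value.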
At a high level, the Redshift metrics available in CloudWatch can be broken down into four main categories:
CloudWatch allows you to track these metrics in real time. In addition, you can configure CloudWatch alarms, which will send you an alert when a metric crosses whichever threshold you define. That way, you’ll be notified if CPU utilization exceeds a certain amount or the number of queries handled per second declines below a certain level, for example.
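One way to define such an alarm is with boto3's `put_metric_alarm` call, sketched below. The alarm name, the 90% threshold, the three-period evaluation window, and the optional SNS topic are all illustrative assumptions, not recommended values. As in any boto3 script, the function that contacts AWS needs credentials and is not invoked here.

```python
def build_cpu_alarm(cluster_id, threshold_percent=90.0, sns_topic_arn=None):
    """Build the arguments for a CloudWatch put_metric_alarm call.

    The alarm fires when average CPU utilization stays above the
    threshold for three consecutive 5-minute periods (assumed values).
    """
    alarm = {
        "AlarmName": f"redshift-{cluster_id}-high-cpu",  # hypothetical name
        "Namespace": "AWS/Redshift",
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "ClusterIdentifier", "Value": cluster_id}],
        "Statistic": "Average",
        "Period": 300,
        "EvaluationPeriods": 3,
        "Threshold": threshold_percent,
        "ComparisonOperator": "GreaterThanThreshold",
    }
    if sns_topic_arn:
        # Notify this SNS topic when the alarm goes into the ALARM state.
        alarm["AlarmActions"] = [sns_topic_arn]
    return alarm


def create_alarm(alarm):
    """Create or update the alarm. Requires boto3 and AWS credentials."""
    import boto3  # third-party: pip install boto3

    boto3.client("cloudwatch").put_metric_alarm(**alarm)


# "my-cluster" is a placeholder cluster identifier.
alarm = build_cpu_alarm("my-cluster", threshold_percent=85.0)
print(alarm["AlarmName"], alarm["ComparisonOperator"], alarm["Threshold"])
```

An alarm for a metric that should not drop too low, such as query throughput, would follow the same shape with `"ComparisonOperator": "LessThanThreshold"` and the relevant metric name.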
Although CloudWatch is the primary Redshift monitoring tool available from Amazon, Amazon also provides cluster-level monitoring metrics directly in the Redshift console.
Although it is possible to monitor Redshift using only Amazon’s native monitoring tools, you can monitor Redshift more effectively by taking advantage of a third-party solution like Sumo Logic.
Sumo Logic’s Redshift monitoring solution provides several key benefits:
In short, Sumo Logic makes it faster and easier to monitor Redshift in a comprehensive way, without having to juggle multiple monitoring tools or figure out how to analyze the data manually.
To see Sumo Logic’s Redshift monitoring features for yourself, sign up for a free trial.