Working with zero-ETL integrations
This topic includes prerelease documentation for Aurora PostgreSQL and RDS for MySQL zero-ETL integrations with Amazon Redshift,
which are in preview release. The documentation and the features are both subject to change.
We recommend that you use RDS for MySQL and Aurora PostgreSQL zero-ETL integrations only in test environments and not in production environments.
For preview terms and conditions, see Betas and Previews in the AWS Service Terms.
Zero-ETL integration is a fully managed solution that makes transactional or operational data available in Amazon Redshift in near real time. With this solution, you can configure an integration from your source to an Amazon Redshift data warehouse. You don't need to maintain an extract, transform, and load (ETL) pipeline. We take care of the ETL for you by automating the creation and management of data replication from the data source to the Amazon Redshift cluster or Redshift Serverless namespace. You can continue to update and query your source data while simultaneously using Amazon Redshift for analytic workloads, such as reporting and dashboards.
With zero-ETL integration, you have fresher data for analytics, AI/ML, and reporting. You get more accurate and timely insights for use cases such as real-time dashboards, optimized gaming experiences, data quality monitoring, and customer behavior analysis. You can make data-driven predictions with more confidence, improve customer experiences, and promote data-driven insights across the business.
The following sources are currently supported for zero-ETL integrations:
- Aurora MySQL-Compatible Edition
- Aurora PostgreSQL-Compatible Edition (preview)
- RDS for MySQL (preview)
To create a zero-ETL integration, you specify an integration source and an Amazon Redshift data warehouse as the target. The integration replicates data from the source to the target data warehouse. The data becomes available in Amazon Redshift within seconds. The integration monitors the health of the data pipeline and recovers from issues when possible. You can create integrations from sources of the same type into a single Amazon Redshift data warehouse to derive holistic insights across multiple applications.
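The step of specifying a source and a target can be sketched in Python as follows. This is a minimal illustration, not the only way to create an integration: it assumes the AWS SDK for Python (boto3) and the `create_integration` operation of its RDS client. The function name and its arguments are hypothetical, and the client is passed in as a parameter so that the sketch does not depend on AWS credentials.

```python
def create_zero_etl_integration(rds_client, source_arn, target_arn, name):
    """Request a zero-ETL integration from a source to a Redshift target.

    rds_client is assumed to be a boto3 RDS client, for example
    boto3.client("rds"). source_arn and target_arn are the ARNs of the
    integration source and the Amazon Redshift data warehouse.
    """
    if not (source_arn and target_arn and name):
        raise ValueError("source_arn, target_arn, and name are all required")

    # The call returns the service's description of the new integration.
    # Creation is asynchronous, so the integration is not active right away.
    return rds_client.create_integration(
        SourceArn=source_arn,
        TargetArn=target_arn,
        IntegrationName=name,
    )
```

A caller would typically build the client with `boto3.client("rds")` and pass real ARNs. The integration can also be created from the console or the AWS CLI.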
With the data in Amazon Redshift, you can use the analytics capabilities that Amazon Redshift provides, such as built-in machine learning (ML), materialized views, data sharing, and direct access to multiple data stores and data lakes. A zero-ETL integration keeps your compute resources isolated from your data resources, so you're using the most efficient tools to process data. For data engineers, zero-ETL integration provides access to time-sensitive data that can otherwise be delayed by intermittent errors in complex data pipelines. You can run analytical queries and ML models on transactional data to derive near real-time insights for time-sensitive events and business decisions.
You can create an Amazon Redshift event notification subscription so you can be notified when an event occurs for a given zero-ETL integration. To view the list of integration-related event notifications, see Zero-ETL integration event notifications with Amazon EventBridge. The simplest way to create a subscription is with the Amazon SNS console. For information on creating an Amazon SNS topic and subscribing to it, see Getting started with Amazon SNS in the Amazon Simple Notification Service Developer Guide.
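For readers who prefer to script the Amazon SNS part rather than use the console, the following Python sketch creates a topic and subscribes an email address to it. It assumes a boto3 SNS client and its `create_topic` and `subscribe` operations; the function name, topic name, and email address are hypothetical. It covers only the SNS topic and subscription, not the Amazon Redshift event notification subscription itself.

```python
def create_notification_topic(sns_client, topic_name, email_address):
    """Create an SNS topic and subscribe an email endpoint to it.

    sns_client is assumed to be a boto3 SNS client, for example
    boto3.client("sns"). Returns the ARN of the topic.
    """
    # create_topic returns the ARN of an existing topic with the same name
    # and attributes instead of creating a duplicate.
    topic_arn = sns_client.create_topic(Name=topic_name)["TopicArn"]

    # Email subscriptions stay pending until the recipient confirms them.
    sns_client.subscribe(
        TopicArn=topic_arn,
        Protocol="email",
        Endpoint=email_address,
    )
    return topic_arn
```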
As you get started with zero-ETL integrations, consider the following concepts:
- A source database is the database from which data is replicated into Amazon Redshift.
- A target data warehouse is the Amazon Redshift provisioned cluster or Redshift Serverless workgroup to which data is replicated.
- A destination database is the database that you create from a zero-ETL integration in the target data warehouse.
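To make the destination database concept concrete, the following Python sketch builds the SQL statement that creates a database from an integration. It assumes the `CREATE DATABASE ... FROM INTEGRATION` form of the Amazon Redshift statement and an integration ID made up of hexadecimal digits and hyphens; the function name and the validation rules are hypothetical, and some source types may require additional clauses that are not shown here.

```python
import re

# Conservative patterns, assumed for this sketch only.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")
_INTEGRATION_ID = re.compile(r"^[0-9A-Fa-f-]+$")


def create_destination_database_sql(database_name, integration_id):
    """Return a CREATE DATABASE statement for a zero-ETL integration."""
    if not _IDENTIFIER.match(database_name):
        raise ValueError(f"unsupported database name: {database_name!r}")
    if not _INTEGRATION_ID.match(integration_id):
        raise ValueError(f"unexpected integration ID: {integration_id!r}")
    # Both values are validated above, so they are safe to interpolate.
    return f"CREATE DATABASE {database_name} FROM INTEGRATION '{integration_id}';"
```

The resulting statement is run in the target data warehouse. The integration ID can be looked up in the SVV_INTEGRATION system view.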
You can monitor your zero-ETL integrations by querying the following system views in Amazon Redshift.
- SVV_INTEGRATION provides information about configuration details of zero-ETL integrations.
- SYS_INTEGRATION_ACTIVITY provides information about completed integration runs.
- SVV_INTEGRATION_TABLE_STATE provides information about table-level integration state.
- SYS_INTEGRATION_TABLE_STATE_CHANGE provides a log of table state changes for integrations.
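The system views listed above can be queried like any other relation. The following Python sketch reads all rows from one of them through a Python DB-API 2.0 cursor. The helper name is hypothetical, and the cursor is assumed to come from whichever Amazon Redshift driver you already use. The sketch selects every column rather than naming specific ones.

```python
INTEGRATION_VIEWS = (
    "SVV_INTEGRATION",
    "SYS_INTEGRATION_ACTIVITY",
    "SVV_INTEGRATION_TABLE_STATE",
    "SYS_INTEGRATION_TABLE_STATE_CHANGE",
)


def fetch_integration_view(cursor, view_name):
    """Return the rows of a zero-ETL system view as a list of dicts.

    cursor is assumed to be a DB-API 2.0 cursor connected to the target
    data warehouse.
    """
    normalized = view_name.upper()
    if normalized not in INTEGRATION_VIEWS:
        raise ValueError(f"not a zero-ETL system view: {view_name!r}")

    # The view name comes from the fixed tuple above, never from raw input.
    cursor.execute(f"SELECT * FROM {normalized.lower()};")

    # DB-API cursors expose column names as the first item of each
    # entry in cursor.description.
    columns = [column[0] for column in cursor.description]
    return [dict(zip(columns, row)) for row in cursor.fetchall()]
```

Restricting the helper to the four known views keeps the query text predictable and avoids building SQL from arbitrary strings.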
For pricing information for zero-ETL integrations, see the pricing page for your integration source (Amazon Aurora or Amazon RDS).
For more information about zero-ETL integration sources, see the following topics:
- For Aurora zero-ETL integrations, see Benefits, Key concepts, Limitations, Quotas, and Supported Regions of zero-ETL integrations in the Amazon Aurora User Guide.
- For RDS zero-ETL integrations, see Benefits, Key concepts, Limitations, Quotas, and Supported Regions of zero-ETL integrations in the Amazon RDS User Guide.
Topics
- Considerations when using zero-ETL integrations with Amazon Redshift
- Getting started with zero-ETL integrations
- Creating destination databases in Amazon Redshift
- Querying and creating materialized views with replicated data
- Managing zero-ETL integrations
- Metrics for zero-ETL integrations
- Troubleshooting zero-ETL integrations