Streaming to an Amazon S3 destination

Streaming pipelines can use Amazon S3 as a direct storage destination. This page describes the prerequisites, process, and other considerations of using Amazon S3 as a streaming pipeline destination.

You should have arrived here by first reading about the Create streaming pipeline wizard.


Prerequisites

  • You need an AWS account.
  • You need permissions to create and manage S3 buckets in AWS.
  • The IAM role used by the streaming agent needs PutObject permission on the S3 bucket and prefix that the pipeline will use as its destination.
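
As a rough sketch of the last prerequisite, an identity-based policy attached to the agent's IAM role could look like the following. The <bucket-arn> and <prefix> placeholders and the Sid value are illustrative assumptions rather than values from your environment, and your agent may need further permissions that this page does not list.

```json
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowPutObjectUnderPrefix",
            "Effect": "Allow",
            "Action": [
                "s3:PutObject"
            ],
            "Resource": "<bucket-arn>/<prefix>/*"
        }
    ]
}
```

Scoping the Resource to the prefix, rather than to the whole bucket, limits the agent to writing under its own location in the bucket.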

Destination configuration

Refer to this section to complete the Destination configuration section of the Create streaming pipeline wizard.

Bucket = string

The name of the Amazon S3 bucket you want to use as a destination. Find your bucket name in the AWS Management Console under Services → S3.

Note

This must be the base directory of the S3 bucket. It is not currently possible to use a subdirectory.


Prefix = string

The prefix is the name of the "folder", or location within the S3 bucket, where all streaming data for this pipeline is saved. Multiple agents can share the same bucket as long as each uses a different prefix.

Click Continue to advance to setting up your source database connection.


Schema drift

Schema drift is supported for this destination. Read Schema drift to learn more.


Cross-account S3 access

If your streaming agent is set up in one AWS account (Account A), you can load your change data into an S3 bucket in another account (Account B). To do this, the bucket policy in Account B must grant the following permissions to your agent task role:

  • Allow GetBucketLocation and ListBucket on your bucket.
  • Allow PutObject, GetObject, and DeleteObject on the contents of your bucket.

An example bucket policy can be seen below:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "Bucket Permissions",
            "Effect": "Allow",
            "Principal": {
                "AWS": "<account-a-agent-task-role-arn>"
            },
            "Action": [
                "s3:GetBucketLocation",
                "s3:ListBucket"
            ],
            "Resource": "<account-b-bucket-arn>"
        },
        {
            "Sid": "Bucket Content Permissions",
            "Effect": "Allow",
            "Principal": {
                "AWS": "<account-a-agent-task-role-arn>"
            },
            "Action": [
                "s3:PutObject",
                "s3:GetObject",
                "s3:DeleteObject"
            ],
            "Resource": "<account-b-bucket-arn>/*"
        }
    ]
}
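
For cross-account access, AWS generally requires both sides to allow the request: the bucket policy in Account B, as above, and an identity-based policy on the agent task role in Account A. If the task role does not already grant these actions, a policy along the following lines could be attached to it. This is a minimal sketch that mirrors the bucket policy, and the Sid values are illustrative.

```json
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "BucketPermissions",
            "Effect": "Allow",
            "Action": [
                "s3:GetBucketLocation",
                "s3:ListBucket"
            ],
            "Resource": "<account-b-bucket-arn>"
        },
        {
            "Sid": "BucketContentPermissions",
            "Effect": "Allow",
            "Action": [
                "s3:PutObject",
                "s3:GetObject",
                "s3:DeleteObject"
            ],
            "Resource": "<account-b-bucket-arn>/*"
        }
    ]
}
```

Unlike the bucket policy, an identity-based policy has no Principal element, because it applies to the role it is attached to.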

To find your task role ARN, sign in to the AWS Management Console for Account A and follow these steps:

  1. In the console search bar, type "ECS" and choose Elastic Container Service.
  2. Select your ECS cluster.

  3. In the Services tab, locate and click your cluster's task definition.

  4. In the Overview panel, locate and click the task role.

  5. In the Summary panel, locate and copy your ARN.

Similarly, to find your bucket ARN, go to your S3 buckets list in Account B, select the radio button next to the bucket, and click Copy ARN.
