S3 Buckets

The Matillion CDC agent requires cloud storage to write its output to. On AWS, this means setting up an S3 bucket with the correct permissions. Keep the following in mind:

  • The bucket should be in the same AWS account and region in which the agent will be installed.
  • The agent must have permissions to access the bucket.
  • Some installation templates ask for a single Role ARN and assume that the role carries both the storage and the Secrets Manager permissions. See the IAM Role with Secrets Manager section for more information.
  • There are many ways to configure an S3 bucket, and it is possible to use an existing resource. This article outlines best practice and assumes you will create a new bucket specifically for CDC.

Creating an S3 bucket

  1. Log in to your AWS account and browse to the S3 service.
  2. Click Create bucket.
  3. Configure your bucket details:
    • Bucket name: Name your bucket whatever you wish.
    • AWS Region: We highly recommend choosing the region in which you plan to install your CDC agent.
    • Object Ownership: We highly recommend leaving ACLs disabled and ensuring the bucket is in the same AWS account as the CDC agent.
    • Block Public Access: We recommend leaving public access blocked.
    • Bucket Versioning: Given the streaming nature of CDC, it is recommended to disable Bucket Versioning.
    • Tags: Include tags if you wish; they will not affect your CDC process.
    • Default encryption: Bucket encryption is handled automatically when an Amazon-managed key is used (SSE-S3, or SSE-KMS with the aws/s3 managed key). If you wish to use your own KMS key, you must also ensure that the Task Role has the following permissions:
      • kms:GenerateDataKey
      • kms:Decrypt
  4. Click Create bucket.
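
The console steps above can also be scripted. The following is a minimal sketch, assuming Python with the boto3 SDK installed and AWS credentials already configured. The helper names, bucket name, and region are illustrative placeholders and are not part of any Matillion tooling. Bucket Versioning is off by default on new buckets, so the sketch makes no call for it.

```python
def create_bucket_params(bucket_name: str, region: str) -> dict:
    """Build keyword arguments for the S3 create_bucket call."""
    params = {
        "Bucket": bucket_name,
        # BucketOwnerEnforced disables ACLs, matching the Object Ownership advice.
        "ObjectOwnership": "BucketOwnerEnforced",
    }
    # us-east-1 is the default region and must not be sent as a LocationConstraint.
    if region != "us-east-1":
        params["CreateBucketConfiguration"] = {"LocationConstraint": region}
    return params


def public_access_block_config() -> dict:
    """Return a configuration that keeps all public access blocked."""
    return {
        "BlockPublicAcls": True,
        "IgnorePublicAcls": True,
        "BlockPublicPolicy": True,
        "RestrictPublicBuckets": True,
    }


def create_cdc_bucket(bucket_name: str, region: str) -> None:
    """Create the bucket. Requires boto3 and valid AWS credentials."""
    import boto3  # imported lazily so the helpers above work without the SDK

    s3 = boto3.client("s3", region_name=region)
    s3.create_bucket(**create_bucket_params(bucket_name, region))
    s3.put_public_access_block(
        Bucket=bucket_name,
        PublicAccessBlockConfiguration=public_access_block_config(),
    )
```

Calling create_cdc_bucket("my-cdc-staging", "eu-west-1") would create the bucket in the chosen region; use the same region you plan to install the agent in.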

Permissions

Your CDC agent requires the following S3 permissions in the Task Role. Access must be granted both to the Task Role and by the bucket itself. The Task Role's permissions are covered in the IAM Roles article.

  • s3:PutObject
  • s3:PutObjectAcl
  • s3:GetObject
  • s3:GetObjectAcl
  • s3:DeleteObject

Bucket Policy

In addition to the Task Role permissions above, your S3 bucket must allow access to that same role. This is done via a bucket policy. Replace the placeholder values in the policy with your own, including the AWS account ID, the bucket name, and the Task Role ARN.

It is advisable to consult your cloud administrator if you are not comfortable with AWS Policies.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AgentAccessPolicy",
            "Effect": "Allow",
            "Principal": { "AWS": "TaskRoleARN" },
            "Action": [
                "s3:GetObject",
                "s3:GetObjectVersion",
                "s3:ListBucket",
                "s3:PutObject",
                "s3:GetBucketLocation"
            ],
            "Resource": [
                "arn:aws:s3:::StagingBucketName",
                "arn:aws:s3:::StagingBucketName/*"
            ]
        },
        {
            "Sid": "RootAccessPolicy",
            "Effect": "Allow",
            "Principal": { "AWS": "arn:aws:iam::AWSAccountId:root" },
            "Action": [
                "s3:*"
            ],
            "Resource": [
                "arn:aws:s3:::StagingBucketName",
                "arn:aws:s3:::StagingBucketName/*"
            ]
        }
    ]
}
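
To avoid hand-editing the placeholders, the policy above can be generated from your own values. This is a minimal sketch, assuming Python and, for the optional apply step, the boto3 SDK with credentials configured. The function names and example values are hypothetical; the statements mirror the policy shown above.

```python
import json


def build_bucket_policy(account_id: str, bucket_name: str, task_role_arn: str) -> dict:
    """Fill the bucket policy template with account, bucket, and role values."""
    resources = [
        f"arn:aws:s3:::{bucket_name}",    # the bucket itself (e.g. for s3:ListBucket)
        f"arn:aws:s3:::{bucket_name}/*",  # every object in the bucket
    ]
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AgentAccessPolicy",
                "Effect": "Allow",
                "Principal": {"AWS": task_role_arn},
                "Action": [
                    "s3:GetObject",
                    "s3:GetObjectVersion",
                    "s3:ListBucket",
                    "s3:PutObject",
                    "s3:GetBucketLocation",
                ],
                "Resource": resources,
            },
            {
                "Sid": "RootAccessPolicy",
                "Effect": "Allow",
                "Principal": {"AWS": f"arn:aws:iam::{account_id}:root"},
                "Action": ["s3:*"],
                "Resource": resources,
            },
        ],
    }


def apply_bucket_policy(bucket_name: str, policy: dict) -> None:
    """Attach the policy to the bucket. Requires boto3 and valid AWS credentials."""
    import boto3  # imported lazily so build_bucket_policy works without the SDK

    boto3.client("s3").put_bucket_policy(Bucket=bucket_name, Policy=json.dumps(policy))


if __name__ == "__main__":
    # Example values only; substitute your own account ID, bucket, and role ARN.
    policy = build_bucket_policy(
        "123456789012", "my-cdc-staging", "arn:aws:iam::123456789012:role/cdc-task-role"
    )
    print(json.dumps(policy, indent=4))
```

Printing the policy before applying it gives you or your cloud administrator a chance to review it first.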