
Sizing CDC agents

Here we review and explain the factors that can affect pipeline performance. Please review the information below before making your choices. If the requirements of the pipeline change, you can resize the agent by redeploying it with a different resource configuration. For example, if snapshot processing is CPU limited, the agent can initially be deployed with a larger vCPU allocation and downsized to handle the ongoing changes once the snapshot has completed.

There are three key determinants of pipeline performance:

  • CPU: The number of CPU cores.
  • Memory: The available memory to process pipeline data.
  • Network: The quality, speed, and capacity of your network connection.

Please note that the pipeline only reaches a point where it can be resumed without restarting another full snapshot some time after the pipeline status has changed to streaming.


CPU

Insufficient CPU availability will limit the maximum throughput of the pipeline. While the pipeline can continue to stream changes, computational delays in writing changes from the data source to the storage platform mean the CDC agent will gradually fall behind the source. In these circumstances the true rate of changes becomes unclear: the observed rate appears constant, capped at the pipeline's maximum throughput, while the agent shows constant maximum CPU usage.

The longer this delay persists, the further behind the pipeline will fall. Both the pipeline and the data source can experience errors and may even fail. For example, Postgres retains transaction logs back to the point at which the replication slot is positioned. As the pipeline falls further behind, the retained transaction logs will grow, compounding the issue.
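One way to keep this failure mode visible is to track how much WAL the source is retaining for the pipeline's replication slot. The sketch below is a minimal example against a PostgreSQL source using psycopg2; the connection string is a placeholder, and the query simply reports retained WAL for every slot rather than anything specific to the CDC agent.

```python
import psycopg2

# Placeholder connection string; substitute your own source database details.
DSN = "host=localhost dbname=source user=postgres password=postgres"

# Amount of WAL retained for each replication slot: the gap between the
# current WAL write position and the slot's restart_lsn. A steadily growing
# value suggests the pipeline is falling behind the source.
QUERY = """
SELECT slot_name,
       pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)) AS retained_wal
FROM pg_replication_slots;
"""

with psycopg2.connect(DSN) as conn:
    with conn.cursor() as cur:
        cur.execute(QUERY)
        for slot_name, retained_wal in cur.fetchall():
            print(f"{slot_name}: {retained_wal} of WAL retained")
```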


Memory

Insufficient memory will typically cause the pipeline to fail. In these circumstances you would need to increase the available memory or reduce the volume of data, or number of tables, being processed by the pipeline.

The required memory scales directly with the number of tables being processed. The CDC agent runs parallel tasks for each table, each with its own independent memory footprint, because each table retains its own data buffer for writing changes out to cloud storage.
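As a rough back-of-envelope illustration of that scaling (not the agent's actual allocation logic), the sketch below estimates memory as a fixed baseline plus one buffer per table. The baseline and per-table figures are made-up assumptions; calibrate them against the observed utilization of your own pipeline.

```python
def estimate_memory_mb(num_tables: int,
                       base_mb: int = 512,
                       buffer_mb_per_table: int = 16) -> int:
    """Rough memory estimate: a fixed baseline plus one buffer per table.

    The baseline and per-table buffer sizes are illustrative assumptions,
    not figures published for the CDC agent.
    """
    return base_mb + num_tables * buffer_mb_per_table

# Example: with these assumed figures, 83 tables need roughly 1.8 GB.
print(estimate_memory_mb(83))
```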


Results summary

To help illustrate the earlier points, the table below shows the results of testing a pipeline capturing evenly distributed changes across 83 tables, run across different services and configurations. It shows the snapshot and streaming rates for each service provider at different CPU and memory allocation levels.

:::info{title='Note'}
Absolute values will vary depending on the structure and content of the data being captured, and values across different services cannot be directly compared as the resource abstractions are not necessarily comparable.
:::

| Service | CPU Allocation | Memory Allocation | Snapshot Rate | Streaming Rate |
|---|---|---|---|---|
| AWS Fargate | 1 vCPU | 2GB | 22k | 11k |
| AWS Fargate | 2 vCPU | 4GB | 27k | 29k |
| AWS Fargate | 4 vCPU | 8GB | 28k | 30k |
| Google Compute Engine | 2 vCPU | 4GB | 25k | 15k |
| Google Compute Engine | 4 vCPU | 8GB | 26k | 17k |
| Azure Container Instances | 1 vCPU | 2GB | 19k | 9k |
| Azure Container Instances | 2 vCPU | 4GB | 25k | 13k |
| Azure Container Instances | 4 vCPU | 8GB | 29k | 16k |

Sizing recommendations

The tables below show a general recommendation for CPU and memory allocation according to your anticipated or actual change rate and the number of tables. Please be aware that each pipeline has a unique performance profile and it's recommended that you monitor the agent resource utilization and ensure that the pipeline is keeping up with the incoming changes.

| Change Rate | CPU Allocation |
|---|---|
| Up to 5k/s | 1 vCPU |
| Up to 10k/s | 2 vCPU |
| Up to 20k/s | 4 vCPU |

| Number of Tables | Memory Allocation |
|---|---|
| Up to 100 | 2GB |
| Up to 200 | 4GB |
| Up to 400 | 8GB |

:::info{title='Note'}
Cloud container services don't allow arbitrary vCPU and memory allocations. When making a choice of resources, the recommendation is to take the lowest configuration that satisfies both requirements.
:::
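As an illustration of that rule, the sketch below encodes the two recommendation tables above and returns the smallest configuration that satisfies both the change-rate and table-count requirements. The pairing of CPU and memory into fixed tiers is an assumption based on the configurations shown in the results summary; check the configurations your container service actually offers.

```python
# (max change rate per second, max table count, CPU, memory) for each tier,
# ordered smallest to largest, taken from the recommendation tables above.
TIERS = [
    (5_000, 100, "1 vCPU", "2GB"),
    (10_000, 200, "2 vCPU", "4GB"),
    (20_000, 400, "4 vCPU", "8GB"),
]

def recommend(change_rate_per_s: int, num_tables: int):
    """Return the lowest (CPU, memory) tier that satisfies both requirements."""
    for max_rate, max_tables, cpu, memory in TIERS:
        if change_rate_per_s <= max_rate and num_tables <= max_tables:
            return cpu, memory
    raise ValueError("Requirements exceed the documented sizing guidance")

# Example: ~8k changes/s across 150 tables -> the 2 vCPU / 4GB configuration.
print(recommend(8_000, 150))
```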