Sizing Streaming agents

This guide provides best practices for:

  • Agent sizing: Determining the optimal resources for Streaming agents based on pipeline complexity, table and column count, and data volume.
  • Agent balancing: Distributing workloads effectively across multiple Streaming agents to enhance performance and avoid bottlenecks.
  • Scaling for growth: Reviewing Streaming agent distribution periodically, and adjusting assignments to ensure your system scales to meet increasing demands.

Following these recommendations will help ensure efficient, scalable, and reliable streaming pipelines, even as data demands grow.


Agent sizing

Correctly sizing agents is essential to ensure smooth performance of streaming pipelines. When an agent has the appropriate resources, it can handle data changes efficiently, minimising latency and avoiding resource bottlenecks.

When determining the optimal agent size, consider the following factors:

  • Table count and data volume. Agents managing high numbers of tables or high-volume data streams need greater capacity to prevent performance issues.
  • Data change frequency. Higher change rates increase agent workload, so pipelines with frequent updates may need more robust sizing.
  • Cloud platform and proximity to source. AWS ECS offers high performance when the agent is co-located with the database in the same region and account. Configurations further from the source database may experience slower throughput and higher latency.

General sizing recommendations

To help size agents effectively, use these general guidelines.

| Configuration type | Number of tables | Snapshot volume (records) | Changes per second | CPU (cores) | Memory |
|--------------------|------------------|---------------------------|--------------------|-------------|--------|
| Small workloads    | 100              | 100,000 – 100 million     | < 5,000            | 1           | 2 GB   |
| Medium workloads   | 200              | 100,000 – 10 billion      | 5,000 – 10,000     | 2           | 4 GB   |
| Large workloads    | 400              | 100,000 – 10 billion      | 10,000 – 20,000    | 4           | 8 GB   |

In Azure ACI and other remote configurations, smaller setups may experience a CPU bottleneck during streaming, leading to lower throughput.
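As a rough illustration, the tiers above can be expressed as a small lookup that returns a starting point for CPU and memory. This is a minimal Python sketch, not part of any agent API or tooling; the helper name and the example workload figures are placeholders, and the boundaries simply mirror the table.

```python
from dataclasses import dataclass

@dataclass
class AgentSize:
    cpu_cores: int
    memory_gb: int

def recommend_size(table_count: int, changes_per_second: float) -> AgentSize:
    """Return a starting CPU/memory recommendation based on the general sizing table."""
    if table_count <= 100 and changes_per_second < 5_000:
        return AgentSize(cpu_cores=1, memory_gb=2)   # small workload
    if table_count <= 200 and changes_per_second <= 10_000:
        return AgentSize(cpu_cores=2, memory_gb=4)   # medium workload
    if table_count <= 400 and changes_per_second <= 20_000:
        return AgentSize(cpu_cores=4, memory_gb=8)   # large workload
    # Beyond the large tier, consider balancing across multiple agents
    # (see the agent balancing section below).
    return AgentSize(cpu_cores=4, memory_gb=8)

print(recommend_size(table_count=150, changes_per_second=7_500))
# AgentSize(cpu_cores=2, memory_gb=4)
```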

Some configurations require nuanced resourcing adjustments. For example, if there is a high table count with a low record count per table, memory requirements may need to be adjusted to avoid over-allocation:

  • High table count, low record count per table: Use lower memory if snapshot volume is < 100,000 records.
  • High volume with changes concentrated in the top five tables: Where the five most frequently changing tables drive most updates, consider targeted resource allocation.

| Outlier configuration                  | Number of tables | Snapshot volume (records) | Changes per second | CPU (cores) | Memory      |
|----------------------------------------|------------------|---------------------------|--------------------|-------------|-------------|
| High table count, low snapshot volume  | > 500            | < 100,000                 | 5,000 – 10,000     | 2           | 4 GB – 8 GB |
| Targeted streaming (top 5 tables)      | > 500            | 100,000 – 10 billion      | 5,000 – 10,000     | 2           | 4 GB        |
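Continuing the sketch above, the outlier rows can be treated as overrides on the base recommendation. The top5_dominant flag is hypothetical; in practice it would come from your own per-table change metrics.

```python
def adjust_for_outliers(base: AgentSize, table_count: int,
                        snapshot_volume: int, top5_dominant: bool) -> AgentSize:
    """Apply the outlier rows above as overrides to a base recommendation."""
    # High table count, low snapshot volume: keep memory in the 4-8 GB range
    # rather than scaling it up on table count alone.
    if table_count > 500 and snapshot_volume < 100_000:
        return AgentSize(cpu_cores=2, memory_gb=min(max(base.memory_gb, 4), 8))
    # Targeted streaming: the top five tables drive most changes, so
    # 2 CPU / 4 GB is usually sufficient despite the large table count.
    if table_count > 500 and top5_dominant:
        return AgentSize(cpu_cores=2, memory_gb=4)
    return base
```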

Agent balancing

For high-volume or complex streaming workflows, distributing workloads across multiple agents can improve performance, reduce latency, and prevent resource strain. Agent balancing can help to ensure that data moves efficiently, even in environments with large table counts or high data-change frequency.

Best practices for agent balancing

  • Separate high-volume tables: Distribute high-transaction or high-volume tables across different agents to avoid bottlenecks (see the assignment sketch after this list).
  • Group by data source or table type: Where feasible, group similar tables or data sources on the same agent. This approach can help streamline data flow and reduce the load on individual agents.
  • Monitor for load imbalance: Regularly monitor agent performance to spot any overloaded agents. Adjust assignments as needed to keep data moving smoothly across all agents. Note that remote agents may show more CPU and RAM headroom, but with slower throughput due to latency.
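
To make the first two practices concrete, the following minimal sketch assigns tables to agents greedily by change rate, so the heaviest tables land on separate agents. The table names and rates are invented, and a real assignment should also respect grouping by data source or table type.

```python
import heapq

def balance_tables(change_rates: dict[str, float], agent_count: int) -> dict[int, list[str]]:
    """Assign each table (heaviest first) to the currently least-loaded agent."""
    # Min-heap of (current load, agent id).
    heap = [(0.0, agent) for agent in range(agent_count)]
    heapq.heapify(heap)
    assignments: dict[int, list[str]] = {agent: [] for agent in range(agent_count)}
    for table, rate in sorted(change_rates.items(), key=lambda kv: kv[1], reverse=True):
        load, agent = heapq.heappop(heap)
        assignments[agent].append(table)
        heapq.heappush(heap, (load + rate, agent))
    return assignments

# Example: the two busiest tables end up on different agents.
print(balance_tables(
    {"orders": 8_000, "events": 6_000, "customers": 300, "audit": 100},
    agent_count=2,
))
```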

Scaling for growth

As data demands increase, it may be necessary to scale agents to maintain performance.

Signs that scaling is needed

  • High CPU or memory utilisation: Persistent high utilisation on an agent can indicate it's nearing capacity.
  • Event lag: Delays in data changes reaching target systems suggest an agent may be overloaded. If lag persists, evaluate scaling options.
  • Increased data volume or change frequency: If new tables or higher change frequencies are added to your pipeline, scaling is often necessary to prevent performance impacts.

Approaches to scaling

  • Horizontal scaling (agent balancing): For workloads with high table counts or diverse data sources, consider adding additional agents to distribute the load. Horizontal scaling can be particularly effective for spreading out high-frequency tables and improving throughput.
  • Vertical scaling (agent sizing): If adding agents isn't an option or the workload isn't complex enough to split, you can increase CPU, memory, or storage on existing agents to handle higher demands.
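
As a rough way of combining these signals, the sketch below chooses between the two approaches. The 80% utilisation and 60-second lag thresholds, and the 400-tables-per-agent cut-off, are assumptions to tune against your own baselines.

```python
def scaling_action(cpu_pct: float, mem_pct: float, event_lag_s: float,
                   table_count: int, max_tables_per_agent: int = 400) -> str:
    """Suggest a scaling approach from utilisation, lag, and table count."""
    overloaded = cpu_pct > 80 or mem_pct > 80 or event_lag_s > 60
    if not overloaded:
        return "no action"
    if table_count > max_tables_per_agent:
        # Many tables or diverse sources: split the workload across more agents.
        return "scale horizontally (add an agent and rebalance)"
    # A hot workload that cannot usefully be split: grow the existing agent.
    return "scale vertically (add CPU/memory to this agent)"
```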

Additional considerations

  • Column count: Anecdotally, 10,000 columns across all tables is considered a soft limit before potential performance degradation. If your pipeline approaches this limit, monitor closely for signs of latency or resource strain.
  • Table distribution and change frequency:

    • Consider the distribution of table sizes across your workload; a few very large or very active tables can dominate an agent's load in a way that an evenly spread workload does not.
    • Workloads with tables that change frequently (i.e. result in continuous streaming updates) require more resources. It's helpful to monitor both the frequency of changes per table and the overall rate of source changes to determine if adjustments in sizing or balancing are necessary.
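
A small summary along these lines can flag both situations at review time: whether the top five tables dominate the change stream (pointing at the targeted-streaming sizing above) and whether the anecdotal 10,000-column soft limit is close. The 0.8 share threshold in the comment is an assumption.

```python
def workload_summary(changes_per_table: dict[str, int], total_columns: int) -> dict[str, object]:
    """Summarise change distribution and column count for sizing/balancing reviews."""
    total = sum(changes_per_table.values()) or 1
    top5 = sorted(changes_per_table.values(), reverse=True)[:5]
    return {
        # A share above roughly 0.8 suggests the targeted-streaming outlier sizing.
        "top5_share": round(sum(top5) / total, 2),
        # Anecdotal soft limit of 10,000 columns across all tables.
        "near_column_soft_limit": total_columns >= 10_000,
    }
```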