
Tech note - AWS Redshift RingBuffer exceeding expected limits

Overview

An instability issue has been detected between the AWS Redshift JDBC driver version 2.1.0.9 and Matillion ETL for Redshift versions:

  • 1.68.x
  • 1.69.x
  • 1.70.x
  • 1.71.x

This tech note provides a workaround for Matillion ETL for Redshift users who are experiencing either or both of the following when processing large data volumes:

  • Out-of-memory (OOM) exception errors
  • High CPU usage
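
As a quick way to tell whether an instance falls in the affected range listed above, the check can be sketched in Python. This is only an illustration: the function name `is_affected` is hypothetical, and it assumes the Matillion ETL version is available as a `major.minor.patch` string. How you obtain that string is not covered here.

```python
# Hypothetical helper, not part of Matillion ETL.
# The affected release lines (1.68.x to 1.71.x) are taken from this tech note.
AFFECTED_RELEASE_LINES = {(1, 68), (1, 69), (1, 70), (1, 71)}


def is_affected(version: str) -> bool:
    """Return True if a 'major.minor.patch' version is in an affected release line."""
    parts = version.strip().split(".")
    if len(parts) < 2:
        return False
    try:
        major, minor = int(parts[0]), int(parts[1])
    except ValueError:
        # Not a numeric version string, so it cannot be matched.
        return False
    return (major, minor) in AFFECTED_RELEASE_LINES


if __name__ == "__main__":
    for candidate in ("1.67.9", "1.69.4", "1.72.0"):
        print(candidate, is_affected(candidate))
```

The check covers only the Matillion ETL version. The issue also depends on the JDBC driver version (2.1.0.9), which this sketch does not inspect.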

Cause

The AWS Redshift JDBC driver does not release memory held by its RedshiftRowsBlockingQueue object queue. As a result, the queue grows exponentially and an out-of-memory exception occurs after more than 1 million rows have been loaded.


Workarounds

Two workarounds are available:

Workaround 1: Tune the ring buffer size

From the Matillion ETL UI:

  1. Navigate to the Environments panel (bottom left).
  2. Right-click your environment and click Edit environment.
  3. On the Redshift connection page of the wizard, click Manage on the Advanced Connection Settings.
  4. Click + to add a row, then set the following parameter and value.
    1. fetchRingBufferSize=200M
  5. Click OK.
  6. Click Next.
  7. Click Finish.
  8. Rerun the jobs causing high CPU usage and/or out-of-memory exceptions.

Note

While we recommend a fetchRingBufferSize of 200M, we cannot guarantee that this size will work for all use cases, as the value relates to the number of rows being processed, not to memory size.

Workaround 2: Disable the ring buffer

From the Matillion ETL UI:

  1. Navigate to the Environments panel (bottom left).
  2. Right-click your environment and click Edit environment.
  3. On the Redshift connection page of the wizard, click Manage on the Advanced Connection Settings.
  4. Click + to add a row, then set the following parameter and value.
    1. enableFetchRingBuffer=false
  5. Click OK.
  6. Click Next.
  7. Click Finish.
  8. Rerun the jobs causing high CPU usage and/or out-of-memory exceptions.

Warning

Disabling the ring buffer can cause additional issues in environments with large numbers of tables and/or schemas. With the ring buffer disabled, the current driver has no caching mechanism and returns metadata only 1000 rows at a time. For this reason, workaround 1 is advised.
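
For reference, the two settings used in the workarounds are options of the Redshift JDBC driver, so each workaround comes down to a single key/value pair. The Python sketch below only illustrates that mapping. The helper names `ring_buffer_options` and `as_key_value_lines` are hypothetical and are not part of Matillion ETL or the driver. Only the option names and values come from this note.

```python
from typing import Dict, List

# Hypothetical helpers for illustration only.
# Option names and values are the ones given in the workarounds in this note.


def ring_buffer_options(workaround: int, buffer_size: str = "200M") -> Dict[str, str]:
    """Return the advanced connection setting for the chosen workaround."""
    if workaround == 1:
        # Workaround 1: keep the ring buffer enabled but set its size.
        return {"fetchRingBufferSize": buffer_size}
    if workaround == 2:
        # Workaround 2: disable the ring buffer entirely.
        return {"enableFetchRingBuffer": "false"}
    raise ValueError("workaround must be 1 or 2")


def as_key_value_lines(options: Dict[str, str]) -> List[str]:
    """Render options in the parameter=value form used in the steps of this note."""
    return [f"{name}={value}" for name, value in options.items()]


if __name__ == "__main__":
    print(as_key_value_lines(ring_buffer_options(1)))
    print(as_key_value_lines(ring_buffer_options(2)))
```

Keeping the two workarounds as separate, mutually exclusive choices mirrors the note: apply one setting or the other, with workaround 1 preferred.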


Additional information

A GitHub issue submitted to AWS can be found here.