
Agent overview

The Data Productivity Cloud agent is a component within the Data Productivity Cloud ecosystem. It serves as a bridge between the user's data plane and the centralized platform, enabling the execution and scheduling of compatible pipelines.

Here is a summary of the Data Productivity Cloud agent's key characteristics:

  • Execution and scheduling: The agent enables the execution of pipelines within the Data Productivity Cloud. It acts as the engine that carries out the tasks defined in the pipelines.
  • Data locality: Your data will remain in the region your agent is running in, allowing you to conform to data locality limitations set by your organization or law.
  • Scalability: The agent can scale up by running multiple instances concurrently, allowing for increased workload handling based on the user's requirements.

Architecture

Agent deployment architecture

The Data Productivity Cloud agent is required for Designer to schedule and execute the pipelines you create. Agents can be installed manually on your cloud platform, but you don't need to install your own: a Matillion Full SaaS deployment is provided when you create your first Designer project. This agent is hosted by Matillion and can be seen on the Manage Agents page in the Hub.


What is a data plane?

A data plane is an abstract concept that represents an execution environment. This execution environment, such as AWS Fargate, runs the agent, which Matillion provides as a Docker image.

The data plane serves as the infrastructure or environment where the agent instances operate. The agent instances, when deployed within the data plane, collectively form the agent. Each agent instance is responsible for executing individual pipeline tasks.

These pipeline tasks typically involve interactions with various data sources, including a data warehouse. The data plane provides the necessary resources and infrastructure for the agent instances to execute the assigned pipeline tasks and perform data-related operations.
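The relationship between an agent, its instances, and individual pipeline tasks can be sketched as follows. This is a minimal illustration only, not Matillion's implementation: `PipelineTask`, `run_task`, and `run_on_agent` are hypothetical names, threads stand in for agent instances, and the round-robin assignment is an assumption made to keep the example deterministic.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass(frozen=True)
class PipelineTask:
    """Hypothetical stand-in for one unit of pipeline work."""
    task_id: str
    operation: str


def run_task(instance_name: str, task: PipelineTask) -> str:
    # A real instance would interact with data sources or a data warehouse here.
    return f"{instance_name} ran {task.task_id} ({task.operation})"


def run_on_agent(tasks: list[PipelineTask], instance_count: int) -> list[str]:
    """Spread tasks across `instance_count` simulated agent instances."""
    instances = [f"instance-{i}" for i in range(instance_count)]
    with ThreadPoolExecutor(max_workers=instance_count) as pool:
        # Assumed round-robin assignment: task i goes to instance i mod N.
        futures = [
            pool.submit(run_task, instances[i % instance_count], task)
            for i, task in enumerate(tasks)
        ]
        # Results are collected in submission order.
        return [future.result() for future in futures]


tasks = [
    PipelineTask("t1", "extract"),
    PipelineTask("t2", "transform"),
    PipelineTask("t3", "load"),
]
results = run_on_agent(tasks, instance_count=2)
```

Here the two simulated instances together play the role of the agent, and each task is handled by exactly one instance.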


What is the Data Productivity Cloud agent?

The agent is a software component responsible for executing the pipeline tasks within the specified environment. It serves as the execution engine for the pipelines and interacts with the agent gateway to receive the pipeline tasks that need to be executed. The agent communicates with the agent gateway using an egress-only method, meaning it can send requests and receive responses but can't accept incoming connections.

The pipeline tasks that are sent to the agent originate from the Data Productivity Cloud platform. These tasks define the specific operations and transformations to be performed on the data within the pipeline.
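The egress-only pattern described above can be illustrated with a small simulation. This is a sketch under stated assumptions: `FakeGateway`, `next_task`, `report_result`, and `poll_until_idle` are hypothetical names, and the gateway is an in-memory object rather than a network service. The point it shows is that every exchange is started by the agent, and the gateway only ever responds.

```python
from collections import deque
from typing import Optional


class FakeGateway:
    """In-memory stand-in for the agent gateway (hypothetical interface)."""

    def __init__(self, tasks: list[str]) -> None:
        self._pending = deque(tasks)
        self.completed: list[str] = []

    def next_task(self) -> Optional[str]:
        # Response to an agent-initiated "give me work" request.
        return self._pending.popleft() if self._pending else None

    def report_result(self, task: str) -> None:
        # Response to an agent-initiated "here is the result" request.
        self.completed.append(task)


def poll_until_idle(gateway: FakeGateway) -> int:
    """Agent-side loop: every exchange starts with an outbound call."""
    executed = 0
    while (task := gateway.next_task()) is not None:
        # The task would be executed locally in the data plane (simulated here).
        gateway.report_result(task)
        executed += 1
    return executed


gateway = FakeGateway(["run pipeline A", "run pipeline B"])
count = poll_until_idle(gateway)
```

Note that the agent side exposes no method for the gateway to call, which mirrors the absence of inbound connections.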


What is the Data Productivity Cloud agent manager?

The agent manager is an application or tool that facilitates the creation, configuration, and management of agents within the Data Productivity Cloud ecosystem. It is typically accessed through the Hub, which serves as a centralized platform for managing various aspects of the data pipelines.

With the agent manager, users can create and define configurations for installing and deploying agents. This includes specifying settings such as the agent image, environment variables, resource allocation, and any other required parameters. The agent manager provides a user-friendly interface or set of commands to simplify the process of configuring and setting up agents.
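As a rough illustration of the kind of settings involved, the sketch below models an agent configuration as a plain data structure with basic validation. Every name in it (`AgentConfig`, `cpu_units`, `memory_mib`, the environment variable names, and the image string) is a hypothetical placeholder, not a real Matillion setting.

```python
from dataclasses import dataclass, field


@dataclass
class AgentConfig:
    """Hypothetical agent settings; field names are illustrative only."""
    image: str
    environment: dict[str, str] = field(default_factory=dict)
    cpu_units: int = 1024
    memory_mib: int = 2048

    def problems(self) -> list[str]:
        """Return human-readable validation problems (empty if none)."""
        found = []
        if not self.image:
            found.append("image must not be empty")
        if self.cpu_units <= 0:
            found.append("cpu_units must be positive")
        if self.memory_mib <= 0:
            found.append("memory_mib must be positive")
        for name, value in self.environment.items():
            if not value:
                found.append(f"environment variable {name} is empty")
        return found


config = AgentConfig(
    image="example.invalid/agent:placeholder",  # placeholder, not a real image
    environment={"EXAMPLE_ACCOUNT_ID": "1234", "EXAMPLE_REGION": ""},
)
issues = config.problems()
```

Checking a configuration before deployment in this way surfaces missing values early; in this example, `issues` reports the empty `EXAMPLE_REGION` variable.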

Note

The agent manager can manage agents in compatible environments such as AWS ECS Fargate. This includes automatic updates: the agent manager updates agents to newer versions, so agents deployed in compatible environments stay up to date with the latest features, bug fixes, and security patches. Agent updates are therefore controlled by Matillion, and you don't need to update the agent manually.


Matillion Full SaaS vs Hybrid SaaS

The Data Productivity Cloud provides two deployment models: Full SaaS and Hybrid SaaS.

  • With Full SaaS, Matillion manages the entire infrastructure, including agent deployment and security measures. Matillion ensures seamless updates and robust security protocols. The Matillion-hosted agent handles the execution of tasks, and customer secrets are securely stored in a Matillion-hosted secrets manager.
  • With Hybrid SaaS, you deploy and manage your own agents within your own cloud infrastructure. This gives you full control over security measures, network isolation, access control, and where your secrets are stored. This option is available for Enterprise edition customers only.

Regardless of the deployment model, your data remains secure at all times and is never visible to Matillion. Read Security overview for a more detailed explanation of Matillion's security framework.

Each deployment model offers distinct features and benefits, allowing organizations to choose the option that best aligns with their needs and infrastructure preferences. Each deployment model also comes with its own security considerations. For a full discussion of the deployment architecture and security considerations, read Deployment options.

Every Data Productivity Cloud account includes a Full SaaS agent, hosted by Matillion, offering a zero-install and zero-maintenance experience. By starting with Full SaaS, you can quickly leverage the Data Productivity Cloud's capabilities without any setup. As your requirements become more sophisticated, you can move to the Hybrid SaaS model and customize it to your unique infrastructure, if you wish.

Due to considerations such as security and data flow, some pipeline components can only be used in a Hybrid SaaS environment. This will be clearly stated in the component documentation. Components that are currently only available in Hybrid SaaS include:

Matillion Full SaaS

Every Matillion Data Productivity Cloud account is provided with an agent hosted by Matillion, providing a true zero-install experience.

If your projects are set up to use Matillion's Full SaaS infrastructure, you don't need to manage the agent; this is handled by Matillion.

There's no setup needed for a Full SaaS solution. It's available for you to use straight away.

If you need to allow any access from the Matillion Full SaaS agent to your data plane, you'll find the IP addresses here.

Warning

Certain pipeline components, listed above, are not available for use in a Full SaaS deployment. In addition, some cloud-specific components (for example, AWS SNS Message and SQS Message), as well as data ingestion and staging operations, may require extra work in setting up cloud provider credentials and ensuring that your cloud data warehouse has the appropriate access permissions.

Hybrid SaaS

If your account is on our Enterprise plan, you also have the option to install our agent software within your own cloud infrastructure and data plane. This may be beneficial if:

  • Your data locality requirements need the agent to run in a specific region that Matillion doesn't currently provide agents in.
  • You aim to achieve faster processing speeds by locating the agent close to your applications, databases, and cloud data warehouse.
  • You need to access systems (such as databases or file storage) that only have network access from within your VPC/VNet.
  • You have specific agent scaling requirements that a Matillion Full SaaS agent doesn't support.
  • You need to use any of the Hybrid SaaS pipeline components listed above.
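The considerations above can be expressed as a simple checklist. The following is an informal sketch, not an official decision tool: `Requirements`, `hybrid_reasons`, and `suggest_deployment` are hypothetical names, and the logic only restates the listed criteria plus the Enterprise plan requirement.

```python
from dataclasses import dataclass


@dataclass
class Requirements:
    """Hypothetical yes/no answers matching the considerations above."""
    region_not_offered_by_matillion: bool = False
    needs_agent_near_data: bool = False
    needs_private_network_access: bool = False
    needs_custom_scaling: bool = False
    needs_hybrid_only_components: bool = False


def hybrid_reasons(req: Requirements) -> list[str]:
    """Collect the reasons that point towards Hybrid SaaS."""
    checks = [
        (req.region_not_offered_by_matillion, "data locality"),
        (req.needs_agent_near_data, "processing speed"),
        (req.needs_private_network_access, "private network access"),
        (req.needs_custom_scaling, "agent scaling"),
        (req.needs_hybrid_only_components, "Hybrid SaaS-only components"),
    ]
    return [label for applies, label in checks if applies]


def suggest_deployment(req: Requirements, enterprise_plan: bool) -> str:
    reasons = hybrid_reasons(req)
    if not reasons:
        return "Full SaaS"
    if not enterprise_plan:
        # Hybrid SaaS is described as an Enterprise plan option.
        return "Full SaaS (Hybrid SaaS requires the Enterprise plan)"
    return "Hybrid SaaS: " + "; ".join(reasons)


suggestion = suggest_deployment(
    Requirements(needs_private_network_access=True, needs_custom_scaling=True),
    enterprise_plan=True,
)
```

With these inputs, `suggestion` names Hybrid SaaS and lists private network access and agent scaling as the reasons.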

Currently, the Data Productivity Cloud allows agents to be hosted in either AWS or Azure infrastructure. Instructions for hosting an agent in your own cloud infrastructure are given in Create an agent in your infrastructure.


Migrating from Full SaaS to Hybrid SaaS

Projects are either Full SaaS or Hybrid SaaS, and can't switch between the two. If you want to move Full SaaS workloads to Hybrid SaaS infrastructure, you need to perform the following steps:

  1. Install the Hybrid SaaS agent in your infrastructure.
  2. Create a new project that uses your Hybrid SaaS agent.
  3. Recreate any secrets in the new project.
  4. Export the pipelines from the Full SaaS project and import them into your new Hybrid SaaS project.
  5. Recreate any schedules in the new project.
