Data Productivity Cloud architecture
This document provides a detailed explanation of the architecture behind the Data Productivity Cloud, focusing on key components such as Designer, scheduling, agent gateway, Git and the API gateway. Most importantly, we will illustrate how these services interact with Matillion-hosted agents and Customer-hosted agents that typically define Full SaaS and Hybrid SaaS deployments, respectively.
Full SaaS architecture
In Full SaaS deployments, Matillion hosts the required agent and project metadata. The agent calls out to data warehouses and source services in the customer cloud.
Hybrid SaaS architecture
In Hybrid SaaS deployments, the agent is hosted in the customer's cloud (Amazon Web Services or Microsoft Azure) and can connect to services such as sources and data warehouses from there.
Detailed architecture diagram
In the Data Productivity Cloud, users have the option to choose between two deployment models: the Full SaaS model and the Hybrid SaaS model, each offering distinct setups for agent deployment and workflow execution.
In the Full SaaS model, Matillion provides and manages the hosted agent, relieving users of concerns related to deployment, upgrades, and monitoring. The Matillion-hosted agent interfaces directly with the Hub and the Matillion-hosted vault, so secrets can be defined directly from within your project.
In the Hybrid SaaS model, users have more control over their infrastructure as they deploy and manage their own agent within their cloud environment. They must also manage their own secret vault and secrets on their chosen cloud platform.
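The split of responsibilities between the two models can be summarized in a small data structure. The following is a minimal sketch for illustration only: the names `Owner`, `DeploymentModel`, and `customer_responsibilities` are hypothetical and are not part of any Matillion API.

```python
from dataclasses import dataclass
from enum import Enum


class Owner(Enum):
    """Who is responsible for a given part of the deployment."""
    MATILLION = "Matillion"
    CUSTOMER = "Customer"


@dataclass(frozen=True)
class DeploymentModel:
    """Hypothetical summary of a deployment model (not a Matillion type)."""
    name: str
    agent_host: Owner    # who deploys, upgrades, and monitors the agent
    secret_vault: Owner  # who manages the vault that stores secrets


# The two models as described in the text above.
FULL_SAAS = DeploymentModel("Full SaaS", Owner.MATILLION, Owner.MATILLION)
HYBRID_SAAS = DeploymentModel("Hybrid SaaS", Owner.CUSTOMER, Owner.CUSTOMER)


def customer_responsibilities(model: DeploymentModel) -> list[str]:
    """List the tasks that fall to the customer under the given model."""
    tasks = []
    if model.agent_host is Owner.CUSTOMER:
        tasks.append("deploy and manage the agent")
    if model.secret_vault is Owner.CUSTOMER:
        tasks.append("manage the secret vault and secrets")
    return tasks


if __name__ == "__main__":
    for model in (FULL_SAAS, HYBRID_SAAS):
        print(model.name, "->", customer_responsibilities(model) or "nothing to manage")
```

Under these assumptions, Full SaaS leaves the customer with no agent or vault tasks, while Hybrid SaaS assigns both to the customer.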
The Data Productivity Cloud is built on top of Git, which provides version control, collaboration, branching and merging, and a distributed architecture. Users can also integrate their own Git repository with the Designer for version control and collaboration on pipelines.
In the Data Productivity Cloud, Git works without users needing to install or maintain local copies of the application code. Instead, all Git interactions are supported natively in the Designer application. For details, read our GitHub app overview.
Workflow expanded
Referring to the architecture diagram above, the following notes can be made on the high-level workflow used by the Designer and its task execution in the Data Productivity Cloud:
Authentication and secret management
- Users authenticate themselves through the Hub to access the Data Productivity Cloud.
- Secrets, including API keys and other credentials, are managed securely. This ensures that only authorized users can access sensitive data.
Pipeline design and management
- Users design and configure data pipelines using the Designer.
- The Designer integrates with the Component Information service to provide metadata and handle design-time requests.
- You can also integrate Git with the Designer for version control and collaboration on pipelines. This allows you to track changes, revert to previous versions, and work together on pipelines as a team.
Agent management and monitoring
- The Agent Manager deploys, upgrades, and monitors agents within the Data Productivity Cloud.
- It also queries connection statuses to ensure seamless operation.
Workflow orchestration and observability
- The Workflow Execution Engine orchestrates pipeline execution.
- The Data Productivity Cloud offers pipeline observability features for monitoring and performance tracking.
Task execution and scheduling
- Task requests are sent to the Agent Gateway for direct communication with customer-hosted agents.
- The scheduler coordinates pipeline executions based on schedules.
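The relationship between the scheduler and the Agent Gateway can be pictured with a toy model. This is a sketch under stated assumptions, not Matillion's implementation: the classes `TaskRequest`, `AgentGateway`, and `Scheduler`, the minute-based schedule, and the per-agent queue are all hypothetical. The source does not describe how agents actually receive requests from the gateway.

```python
from dataclasses import dataclass, field


@dataclass
class TaskRequest:
    """Hypothetical request to run one pipeline on one agent."""
    pipeline: str
    agent_id: str


@dataclass
class AgentGateway:
    """Toy gateway: holds task requests per agent (assumed design)."""
    queues: dict[str, list[TaskRequest]] = field(default_factory=dict)

    def send(self, request: TaskRequest) -> None:
        # Route the request to the queue of the agent it targets.
        self.queues.setdefault(request.agent_id, []).append(request)

    def pending_for(self, agent_id: str) -> list[TaskRequest]:
        # Hand over and clear everything waiting for this agent.
        return self.queues.pop(agent_id, [])


@dataclass
class Scheduler:
    """Toy scheduler: each entry is (minute_of_hour, pipeline, agent_id)."""
    gateway: AgentGateway
    schedules: list[tuple[int, str, str]] = field(default_factory=list)

    def tick(self, minute: int) -> int:
        # Send a task request for every schedule due at this minute.
        due = [entry for entry in self.schedules if entry[0] == minute]
        for _, pipeline, agent_id in due:
            self.gateway.send(TaskRequest(pipeline, agent_id))
        return len(due)


if __name__ == "__main__":
    gateway = AgentGateway()
    scheduler = Scheduler(gateway, [(0, "daily_load", "agent-1")])
    scheduler.tick(0)
    print(gateway.pending_for("agent-1"))
```

The point of the sketch is the separation of concerns: the scheduler decides when a pipeline should run, and the gateway is the single path by which the resulting task request reaches an agent.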
API gateway for public exposure
- The API gateway grants controlled access to data pipelines via secure, public API endpoints.
- External applications can interact with your pipeline data programmatically.
- Authentication and authorization mechanisms provide robust API access control.
Agent communication and secret access
- Matillion-hosted agents communicate with the Hub and the hosted vault for secure access to customer secrets.
- They also use the Connector Service to retrieve data from various sources such as Salesforce, SAP, and databases.
Customer secret management and agent responsibilities
- Customer Secret Vaults securely store and retrieve customer secrets.
- Customer-hosted agents are responsible for running the components in data processing pipelines.
To understand the flow of secrets in the Data Productivity Cloud, read Secret overview.
Comparison of deployment models
Deployment Components | Full SaaS | Hybrid SaaS |
---|---|---|
Agent Deployment | Matillion provides and manages the hosted agent. | Users deploy and manage their own agent within their cloud environment or on-premises data center. |
Secret Management | Matillion manages the secret vault and access control. | Users manage their own secret vault within their cloud environment or on-premises data center. |
Data Security | Matillion ensures the security of the hosted agent, secrets, and data in its environment. | Users have complete ownership and control over data security, potentially requiring additional expertise and resources. |
Concerns Addressed | Alleviates concerns related to deployment, upgrades, maintenance, and monitoring of the agent. Simplifies setup and ongoing management. | Provides more control over infrastructure and data security, allowing for customization and compliance with specific regulations. |
Central Interface | Designer remains the central interface for designing and managing pipelines. | Designer remains the central interface for designing and managing pipelines. |
API Interface | Users can interact programmatically with the Data Productivity Cloud using the API interface. | Users can interact programmatically with the Data Productivity Cloud using the API interface. |
Scalability | Highly scalable as Matillion manages the infrastructure. | May require manual scaling of the user-managed agent infrastructure. |
Performance | Matillion optimizes infrastructure for performance. | Performance depends on user-managed infrastructure and configuration. |
Compliance | Matillion ensures compliance with industry standards and regulations. | Users are responsible for ensuring compliance with their own requirements. |