Maia glossary

A reference guide to key terms and concepts in Maia, based on official documentation.


A

Account regions: The geographic region selected when creating a Maia account, which determines where data and resources are stored and processed. The region cannot be changed after account creation.

Account roles: Roles assigned at the account level that determine what a user can do across their entire Maia account.

ActiveCampaign: A Flex connector that loads data from ActiveCampaign into your data warehouse.

Add Partition: An orchestration component (Amazon Redshift only) that defines the Amazon S3 directory structure for partitioned external table data by attributing values to each partition.

Agent: A lightweight service installed within your infrastructure that executes pipeline tasks and securely connects Maia to your data sources and warehouses.

Agent logs: Records of events and errors generated by agents during pipeline execution, used for monitoring and troubleshooting.

Agent versions: Different releases of the agent software with specific features and capabilities.

Aggregate: A transformation component that groups data and applies summary functions (SUM, COUNT, AVG, MIN, MAX) to produce calculated results.
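
The grouping behavior can be sketched in plain Python (the rows and column meanings here are illustrative, not part of the component's API):

```python
from collections import defaultdict

def aggregate(rows):
    """Group (key, value) rows and apply SUM, COUNT, and AVG per group."""
    groups = defaultdict(list)
    for key, value in rows:
        groups[key].append(value)
    return {k: {"sum": sum(v), "count": len(v), "avg": sum(v) / len(v)}
            for k, v in groups.items()}

rows = [("books", 10.0), ("books", 15.0), ("toys", 7.5)]
print(aggregate(rows)["books"])  # {'sum': 25.0, 'count': 2, 'avg': 12.5}
```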

AI Analyze Sentiment: An AI component (Databricks only) that uses the Databricks ai_analyze_sentiment() function to perform sentiment analysis on input text.

AI Classify: An AI component (Databricks only) that uses the Databricks ai_classify() function to classify input text according to labels you provide.

AI Extract: An AI component (Databricks only) that uses the Databricks ai_extract() function to extract entities from a given text input according to labels you provide.

AI Fix Grammar: An AI component (Databricks only) that uses the Databricks ai_fix_grammar() function to correct grammatical errors in text in your input table.

AI Mask: An AI component (Databricks only) that uses the Databricks ai_mask() function to identify and mask specified entities in unstructured text.

AI Note: A feature that enables Maia to automatically annotate pipeline components with contextual notes using Markdown, making collaboration clearer and easier.

AI Query: An AI component (Databricks only) that uses the Databricks ai_query() function to obtain answers to natural-language questions from a chat model endpoint.

AI Similarity: An AI component (Databricks only) that uses the Databricks ai_similarity() function to compute the semantic similarity score between two strings.

AI Summarize: An AI component (Databricks only) that uses the Databricks ai_summarize() function to generate a summary of a given input text.

AI Translate: An AI component (Databricks only) that uses the Databricks ai_translate() function to translate text to a specified target language.

Alter Table: An orchestration component that modifies the structure of an existing table in the data warehouse.

Alter Warehouse: An orchestration component (Snowflake only) that modifies the configuration of an existing Snowflake warehouse from within Designer.

Amazon Bedrock Prompt: An AI component that uses a large language model (LLM) via Amazon Bedrock to provide responses to user-composed prompts by combining inputs from a source table with configured prompts.

Amazon OpenSearch Upsert: A Modular connector that inserts or updates documents in Amazon OpenSearch Service.

Amazon Redshift Load: A Modular connector that loads data into Amazon Redshift.

Amazon S3 streaming destination: A streaming pipeline destination that writes captured change events directly to Amazon S3 for storage.

Amazon Textract Input: A Modular connector that extracts text and structured data from documents using Amazon Textract.

Amazon Transcribe Input: A Modular connector that transcribes audio to text using Amazon Transcribe.

Amplitude: A Flex connector that loads data from Amplitude into your data warehouse.

Analyze Tables: An orchestration component (Amazon Redshift only) that runs the Amazon Redshift ANALYZE command on a list of tables to rebuild statistical metadata for optimized query performance.

Anaplan: A Modular connector that loads data from Anaplan into your data warehouse.

And: An orchestration component that waits for all of its input components to complete before continuing the pipeline.

API: Programmatic interfaces that allow external systems to interact with Maia, including running pipelines, managing projects, and retrieving execution status.

Append to Grid: An orchestration component that appends or prepends rows to an existing grid variable.

Artifacts: Deployable packages containing pipelines and configurations that can be scheduled and executed via the API.

Asana: A Flex connector that loads data from Asana into your data warehouse.

Assert External Table: An orchestration component that validates that an external table has been created with the correct metadata, and asserts a row count using comparison operators.

Assert Scalar Variables: An orchestration component that validates variable values against expected conditions during pipeline execution.

Assert Table: An orchestration component that validates that a table exists, and optionally checks its structure.

Assert View: A transformation component that verifies a dataset meets specified conditions, including metadata, values, and row count.

Attio: A Flex connector that loads data from Attio into your data warehouse.

Audit Service: A service that records user actions and system events for security monitoring and compliance.

Authentication: The process of setting up and authorizing API credentials for accessing the Maia REST API.

AVRO Load: An orchestration component that loads data into a table from one or more AVRO files, automatically inferring the schema and creating the corresponding table.

AWS IAM roles: AWS Identity and Access Management roles required for agent deployment and operation on AWS.

AWS PrivateLink: A network configuration for secure, private connectivity between Maia and AWS services.

Azure Blob Storage Load: An orchestration component that loads data into an existing table from objects stored in Azure Blob Storage.

Azure Blob Storage streaming destination: A streaming pipeline destination that writes captured change events directly to Azure Blob Storage for storage.

Azure Blob Storage Unload: An orchestration component that creates files on a specified Azure Blob Storage account and populates them by copying data from a designated table or view.

Azure Cosmos DB for NoSQL: A Flex connector that loads data from Azure Cosmos DB for NoSQL into your data warehouse.

Azure Document Intelligence: A Modular connector that extracts text and structured data from documents using Azure Document Intelligence.

Azure OpenAI Prompt: An AI component that uses a large language model (LLM) via Azure OpenAI to provide responses to user-composed prompts by combining inputs from a source table with configured prompts.

Azure Queue Storage Message: An orchestration component that posts a message to Azure Queue Storage for processing by other applications.

Azure Speech Transcribe: A Modular connector that transcribes audio to text using Azure Speech Services.

Azure SQL: A Modular connector that loads data from Azure SQL Database into your data warehouse.


B

Bash Pushdown: An orchestration component that uses SSH connections to run Bash scripts on your own instances, displaying script output in task messages.

Begin: An orchestration component that starts a new transaction within the database.

Bing Ads Query: A query connector that loads data from Bing Ads into your data warehouse.

Box: A Flex connector that loads data from Box into your data warehouse.

Branches: Parallel versions of your project that allow you to develop and test changes independently before merging into the main codebase.

Braze: A Flex connector that loads data from Braze into your data warehouse.

Brevo: A Flex connector that loads data from Brevo into your data warehouse.

Bugcrowd: A Flex connector that loads data from Bugcrowd into your data warehouse.

Buildkite: A Flex connector that loads data from Buildkite into your data warehouse.


C

Calculator: A transformation component that creates new columns using SQL expressions and calculations.

Chargebee: A Flex connector that loads data from Chargebee into your data warehouse.

Chunk Text: An orchestration component that splits text or Markdown data into smaller chunks using a Python UDF in Snowflake, for use in AI pipelines.
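
A minimal character-based chunker illustrates the idea (a sketch only; the component itself runs as a Python UDF in Snowflake and its actual parameters may differ):

```python
def chunk_text(text, size, overlap=0):
    """Split text into fixed-size chunks, optionally overlapping neighbours."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, len(text), step)]

print(chunk_text("abcdefghij", size=4, overlap=1))
# ['abcd', 'defg', 'ghij', 'j']
```

Overlap preserves context across chunk boundaries, which typically improves retrieval quality in AI pipelines.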

CircleCI: A Flex connector that loads data from CircleCI into your data warehouse.

ClickUp: A Flex connector that loads data from ClickUp into your data warehouse.

Cloud Pub/Sub: An orchestration component that publishes messages to a topic on Google Cloud Platform for other applications to receive via subscriptions.

CloudWatch Publish: An orchestration component that publishes metrics to Amazon CloudWatch, enabling you to attach alarms and alerts for metric values.

Coalesce: A Flex connector that loads data from the Coalesce data platform into your data warehouse.

Commit: An orchestration component that ends a database transaction and makes all changes visible to other database users.

Commit changes (Git action): Saving your pipeline changes to the Git repository with a descriptive message.

Compare changes (Git action): Viewing differences between branches or commits in your project.

Components: The fundamental building blocks of pipelines—configurable units that perform specific operations.

Concord: A Flex connector that loads data from Concord into your data warehouse.

Confluence: A Flex connector that loads data from Confluence into your data warehouse.

Construct Variant: A transformation component (Snowflake only) that converts rows of data into key:value pair arrays, outputting them as variant-type data for storing semi-structured formats such as JSON.

Context Files: Custom files that provide Maia AI Agents with project-specific context and instructions.

Contributor (environment role): An environment role that allows users to validate, sample, run, and publish pipelines, view executions and lineage, and manage schedules within an environment.

Contributor (project role): A project role that allows users to create and manage pipelines, branches, secrets, and OAuth connections, and perform Git operations, but cannot manage project settings or users.

Convert String To Struct: A transformation component (Databricks only) that converts JSON-format data stored as a STRING data type into a Databricks STRUCT data type for querying and manipulation.

Convert Type: A transformation component that changes the data type of columns.

Cortex Completions: An AI component (Snowflake only) that uses Snowflake Cortex to receive a prompt and generate a response using a supported large language model.

Cortex Embed: An AI component (Snowflake only) that converts English-language text into a Cortex vector embedding of 768 or 1024 dimensions for use in AI pipelines.

Cortex Extract Answer: An AI component (Snowflake only) that uses Snowflake Cortex to answer a natural-language question by extracting the answer from your dataset.

Cortex Finetune: An AI component (Snowflake only) that uses Snowflake Cortex to fine-tune large language models (LLMs) on your organization's data for domain-specific use cases.

Cortex Multi Prompt: An AI component (Snowflake only) that uses Snowflake Cortex to process multiple prompts and generate responses using a supported large language model.

Cortex Parse Document: An AI component (Snowflake only) that uses Snowflake Cortex to extract content from PDFs and image files stored on an internal or external stage.

Cortex Sentiment: An AI component (Snowflake only) that uses Snowflake Cortex to analyze English-language input text and return a sentiment score.

Cortex Summarize: An AI component (Snowflake only) that uses Snowflake Cortex to summarize the input text of one or more columns and write the result to new output columns.

Cortex Translate: An AI component (Snowflake only) that uses Snowflake Cortex to translate the input text of one or more columns from one supported language to another.

Create External Table: An orchestration component that defines a table referencing data stored outside the data warehouse. External tables don't store data themselves but can be queried like regular tables.

Create File Format: An orchestration component (Snowflake only) that creates a named file format for use in bulk loading and unloading data in Snowflake tables.

Create Stream: An orchestration component (Snowflake only) that creates a new Snowflake Stream in a specified schema and table to capture data changes.

Create Table: An orchestration component that creates a new table in the data warehouse.

Create View: A transformation component that outputs a virtual table (SQL view) to your cloud data warehouse based on the upstream dataset, without storing data itself.

CSV Load: An orchestration component that loads data into a table from one or more CSV files, automatically inferring the schema and creating the corresponding table.

Custom Connector: A user-defined connector that extracts data from REST APIs returning JSON responses.


D

Data Cleanse: A transformation component that profiles data, identifies data quality issues, and applies configurable rules to cleanse a dataset.

Data Loader: A service for easily loading data from various sources into your cloud data warehouse.

Data quality framework: A set of approaches and components in Maia for identifying and addressing data quality issues across pipelines.

Data Transfer: An orchestration component that transfers files from a chosen source to a chosen target using common network protocols, supporting files of any type.

Database Query: An orchestration component that runs SQL queries on an accessible cloud or on-premises database and copies the results to a table.

Databricks: A Modular connector that loads data from Databricks into your data warehouse.

Databricks Jobs Compute: A configuration in Maia that enables transformation pipelines to run on a Databricks Jobs Compute cluster by specifying the cluster details and associating the configuration with an environment.

Databricks Vector Search: An AI component (Databricks only) that performs a vector similarity search on an input table to find content that best answers specific questions using embeddings in your Databricks account.

Datadog: A Flex connector that loads data from Datadog into your data warehouse.

Db2 for IBM i streaming connector: A streaming connector that monitors and captures row-level changes within Db2 for IBM i schemas using journal entries for tracked tables.

dbt Cloud: A Flex connector that loads data from dbt Cloud into your data warehouse.

dbt Core: An orchestration component that runs dbt commands as part of a pipeline.

Delete Tables: An orchestration component that removes tables from the data warehouse.

Delighted: A Flex connector that loads data from Delighted into your data warehouse.

Deployment: The way Maia is configured and hosted. The available agents include the Streaming agent and the Maia agent.

Designer: The visual, drag-and-drop interface for building data pipelines in Maia.

Detect Changes: A transformation component that compares two tables and adds a column indicating whether each row was inserted, deleted, changed, or unchanged.
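
The comparison logic can be sketched in Python over two datasets keyed by a hypothetical primary key (the flag letters here are illustrative):

```python
def detect_changes(old, new):
    """Flag each key as I(nserted), D(eleted), C(hanged), or U(nchanged)."""
    flags = []
    for key in sorted(set(old) | set(new)):
        if key not in old:
            flags.append((key, "I"))
        elif key not in new:
            flags.append((key, "D"))
        elif old[key] != new[key]:
            flags.append((key, "C"))
        else:
            flags.append((key, "U"))
    return flags

old = {1: "ann", 2: "bob", 3: "cat"}
new = {2: "bob", 3: "cap", 4: "dee"}
print(detect_changes(old, new))  # [(1, 'D'), (2, 'U'), (3, 'C'), (4, 'I')]
```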

Distinct: A transformation component that removes duplicate rows from data.

Dixa: A Flex connector that loads data from Dixa into your data warehouse.

Document AI Predict: An AI component (Snowflake only) that extracts data from documents such as PDFs and images using Snowflake's Document AI model prediction function.

Maia Flex connector: A Flex connector that loads data from the Maia API into your data warehouse.

Dropbox: A Flex connector that loads data from Dropbox into your data warehouse.

Dynamics 365 Business Central: A Flex connector that loads data from Microsoft Dynamics 365 Business Central into your data warehouse.

Dynamics 365 Query: A query connector that loads data from Microsoft Dynamics 365 into your data warehouse.

DynamoDB Query: A query connector that loads data from Amazon DynamoDB into your data warehouse.


E

Editions: Different subscription tiers of Maia with varying features and capabilities.

Email Query: A query connector that loads data from email services into your data warehouse.

End Failure: An orchestration component that ends a pipeline execution with a failure status.

End Success: An orchestration component that ends a pipeline execution with a success status.

Environment roles: Roles assigned at the environment level that control what users can do within specific environments. The four environment roles are Owner, Contributor, Operator, and Viewer.

Environments: Configured connections to cloud data warehouses (Snowflake, Databricks, Redshift) with associated settings.

Eventbrite: A Flex connector that loads data from Eventbrite into your data warehouse.

Excel Query: A query connector that loads data from Microsoft Excel into your data warehouse.

Except: A transformation component that compares two datasets and returns only the rows that exist in the primary dataset but not the comparison dataset. Equivalent to the SQL EXCEPT operator.
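
The set-difference semantics, including EXCEPT's de-duplication of the result, can be sketched in Python (rows shown as tuples for illustration):

```python
def except_rows(primary, comparison):
    """SQL EXCEPT: distinct rows of primary that are absent from comparison."""
    comp, seen, out = set(comparison), set(), []
    for row in primary:
        if row not in comp and row not in seen:  # EXCEPT also de-duplicates
            out.append(row)
            seen.add(row)
    return out

primary = [(1, "a"), (2, "b"), (2, "b"), (3, "c")]
print(except_rows(primary, [(2, "b")]))  # [(1, 'a'), (3, 'c')]
```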

Extract Nested Data: A transformation component that unpacks JSON-formatted data stored in variant-type columns into structured columns and rows.
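
The unpacking step is analogous to parsing JSON documents into named columns, sketched here in Python (the field names are hypothetical):

```python
import json

def extract_nested(variant_column, fields):
    """Unpack JSON documents from a variant-style column into named columns."""
    out = []
    for doc in variant_column:
        parsed = json.loads(doc)  # parse each document once
        out.append({f: parsed.get(f) for f in fields})
    return out

docs = ['{"id": 1, "user": "ann"}', '{"id": 2, "user": "bob"}']
print(extract_nested(docs, ["id", "user"]))
# [{'id': 1, 'user': 'ann'}, {'id': 2, 'user': 'bob'}]
```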

Extract Structured Data: A transformation component (Databricks only) that unpacks arrays of structured data in STRUCT or ARRAY data types into columns and rows of data in a table.


F

Facebook Ads Query: A query connector that loads data from Facebook Ads into your data warehouse.

Facebook Query: A query connector that loads data from Facebook into your data warehouse.

Fanatics: A Flex connector that loads data from Fanatics into your data warehouse.

File Iterator: An orchestration component that loops through files in cloud storage, executing operations for each file.

Filter: A transformation component that selects rows based on specified conditions.

FireHydrant: A Flex connector that loads data from FireHydrant into your data warehouse.

First/Last: A transformation component that groups data, sorts the groups, and returns only the first or last row of each group using SQL window functions.
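
The "first row per group" pattern, equivalent to filtering on ROW_NUMBER() OVER (PARTITION BY ... ORDER BY ...) = 1, can be sketched in Python (the customer/day/amount rows are hypothetical):

```python
def first_per_group(rows, key, order):
    """Sort rows, then keep only the first row seen for each group key."""
    best = {}
    for row in sorted(rows, key=order):
        best.setdefault(key(row), row)  # only the first row per key is kept
    return list(best.values())

# Hypothetical (customer, order_day, amount) rows.
orders = [("ann", 2, 30), ("bob", 1, 10), ("ann", 1, 20)]
print(first_per_group(orders, key=lambda r: r[0], order=lambda r: r[1]))
# [('bob', 1, 10), ('ann', 1, 20)]
```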

Fixed Flow: A transformation component that generates a table of data for use downstream in the pipeline, with values defined at design time or from variables at runtime.

Fixed Iterator: An orchestration component that loops through a predefined list of values.

Flatten Variant: A transformation component (Snowflake only) that unpacks arrays of semi-structured data, such as JSON, into structured columns and rows.

Flex Connectors: Advanced custom connectors with extended capabilities for complex API integrations.

Float: A Flex connector that loads data from Float into your data warehouse.

Freshdesk: A Flex connector that loads data from Freshdesk into your data warehouse.


G

Generate Sequence: An orchestration component that creates a sequence of numbers or dates for iteration.

Git in Designer: Version control integration for tracking changes to pipelines and enabling collaboration.

GitHub: A Flex connector that loads data from GitHub into your data warehouse.

Gmail: A Modular connector that loads data from Gmail into your data warehouse.

Gong: A Flex connector that loads data from Gong into your data warehouse.

Google Ads Query: A query connector that loads data from Google Ads into your data warehouse.

Google Analytics Query: A query connector that loads data from Google Analytics into your data warehouse.

Google BigQuery Query: A query connector that loads data from Google BigQuery into your data warehouse.

Google Cloud Storage Load: An orchestration component (Snowflake only) that loads data from Google Cloud Storage into an existing table.

Google Cloud Storage Unload: An orchestration component that writes data from a table or view to a specified Google Cloud Storage bucket.

Google Sheets: A Modular connector that loads data from Google Sheets into your data warehouse.

Google Vertex AI Prompt: An AI component that uses a Google Gemini large language model (LLM) to provide responses to user-composed prompts by combining text, image, audio, or video inputs with configured prompts.

Grid Iterator: An orchestration component that loops through rows in a grid variable, executing operations for each row.

Grid Variables: Variables that store tabular (two-dimensional) data for use in pipelines.


H

HubSpot Load: A Modular connector that loads data from HubSpot into your data warehouse.


I

IBM Db2 Load: A Modular connector that loads data from IBM Db2 into your data warehouse.

If: An orchestration component that evaluates an expression involving variables and routes pipeline execution depending on whether the expression is true, false, or produces a runtime error.

Infobip: A Flex connector that loads data from Infobip into your data warehouse.

Intercom: A Flex connector that loads data from Intercom into your data warehouse.

Intersect: A transformation component that compares two datasets and returns only the rows that are identical in both. Equivalent to the SQL INTERSECT operator.


J

JDBC: A Modular connector that runs SQL queries against any JDBC-compatible database and loads the results into your data warehouse.

JDBC Load: A Modular connector that loads data from any JDBC-compatible database into your data warehouse.

JDBC Table Metadata to Grid: An orchestration component that retrieves metadata from a JDBC table and uses it to populate a grid variable.

Jira Load: A Modular connector that loads data from Jira into your data warehouse.

Join: A transformation component that combines data from multiple sources based on matching columns.

Jotform: A Flex connector that loads data from Jotform into your data warehouse.

JSON Load: An orchestration component that loads data into a table from one or more JSON files, automatically inferring the schema and creating the corresponding table.


K

Kafka: A Modular connector that loads data from Apache Kafka into your data warehouse.

Klaviyo: A Flex connector that loads data from Klaviyo into your data warehouse.


L

LaunchDarkly: A Flex connector that loads data from LaunchDarkly into your data warehouse.

LDAP Query: A query connector that loads data from LDAP directory services into your data warehouse.

Lead/Lag: A transformation component that retrieves values from rows preceding or following the current row, using the SQL LEAD and LAG window functions.
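
A Python sketch of the window semantics: each row is paired with its LAG (previous) and LEAD (next) values, with a default filling the gaps at either end:

```python
def lead_lag(values, offset=1, default=None):
    """Return (lag, current, lead) triples, like the LAG/LEAD window functions."""
    lag = [default] * offset + values[:-offset]
    lead = values[offset:] + [default] * offset
    return list(zip(lag, values, lead))

print(lead_lag([10, 20, 30]))
# [(None, 10, 20), (10, 20, 30), (20, 30, None)]
```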

Lineage (Security): Security aspects of data lineage tracking and visibility.

Lineage (Maia): Tracking of data flow from source to destination, showing how data is transformed and where it originates.

LinkedIn Ads: A Flex connector that loads data from LinkedIn Ads into your data warehouse.

Lob: A Flex connector that loads data from Lob into your data warehouse.

Loop Iterator: An orchestration component that repeats operations a specified number of times or until a condition is met.


M

Maia AI Agents: Matillion's AI-powered assistants that help design, build, and optimize data pipelines.

Mailchimp: A Modular connector that loads data from Mailchimp into your data warehouse.

Mailgun: A Flex connector that loads data from Mailgun into your data warehouse.

Mandrill: A Flex connector that loads data from Mandrill into your data warehouse.

Map Values: A transformation component that replaces values based on a mapping table.
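
The replacement behavior can be sketched in Python (hypothetical column and lookup values; here, unmapped values pass through unchanged):

```python
def map_values(rows, column, mapping):
    """Replace values in one column via a lookup table."""
    return [{**row, column: mapping.get(row[column], row[column])} for row in rows]

rows = [{"country": "UK"}, {"country": "US"}, {"country": "FR"}]
lookup = {"UK": "United Kingdom", "US": "United States"}
print(map_values(rows, "country", lookup))
# [{'country': 'United Kingdom'}, {'country': 'United States'}, {'country': 'FR'}]
```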

MariaDB Load: A Modular connector that loads data from MariaDB into your data warehouse.

Marketo Load: A Modular connector that loads data from Marketo into your data warehouse.

MCP Server: A Model Context Protocol server that provides a standardized way for large language models (LLMs) to interact with Maia resources such as pipeline executions, credit usage, and pipeline runs.

Merge from branch (Git action): Combining changes from one Git branch into another.

Microsoft Dataverse: A Flex connector that loads data from Microsoft Dataverse into your data warehouse.

Microsoft Exchange: A Modular connector that loads data from Microsoft Exchange into your data warehouse.

Microsoft SQL Server Load: A Modular connector that loads data from Microsoft SQL Server into your data warehouse.

Microsoft SQL Server streaming connector: A streaming connector that monitors and captures row-level changes within Microsoft SQL Server schemas by taking a consistent snapshot and tracking change events.

Mixpanel: A Flex connector that loads data from Mixpanel into your data warehouse.

ML Classification: An AI component (Snowflake only) that uses the Snowflake Classification machine learning function to sort data into classes using patterns detected in training data.

MongoDB Query: A query connector that loads data from MongoDB into your data warehouse.

Multi-factor Authentication: An additional security layer that requires multiple verification methods to access Maia.

Multi Table Input: A transformation component that reads data from multiple tables simultaneously.

MySQL streaming connector: A streaming connector that monitors and captures row-level changes within MySQL schemas in a non-intrusive and performant manner.


N

NetSuite Query: A query connector that loads data from Oracle NetSuite into your data warehouse.

NetSuite SuiteAnalytics Load: A Modular connector that loads data from NetSuite SuiteAnalytics into your data warehouse.

New Relic: A Flex connector that loads data from New Relic into your data warehouse.

Notion: A Flex connector that loads data from Notion into your data warehouse.


O

OAuth: An authentication protocol for securely accessing third-party APIs in custom connectors.

OData Query: A query connector that loads data from OData protocol services into your data warehouse.

Ongoing WMS: A Flex connector that loads data from Ongoing WMS into your data warehouse.

Open Exchange Rates: A Flex connector that loads data from Open Exchange Rates into your data warehouse.

OpenAI Prompt: An AI component that uses a large language model (LLM) via OpenAI to provide responses to user-composed prompts by combining inputs from a source table with configured prompts.

Operator (environment role): An environment role that allows users to view executions and lineage, and manage schedules, but cannot validate, sample, run, or publish pipelines.

Optimize: An orchestration component (Databricks only) that optimizes the layout of Databricks table data using bin-packing or Z-order colocation.

Or: An orchestration component that continues the pipeline as soon as any one of its input components completes.

Oracle Fusion Cloud Financials Load: A Modular connector that loads data from Oracle Fusion Cloud Financials into your data warehouse.

Oracle Fusion Cloud Procurement Load: A Modular connector that loads data from Oracle Fusion Cloud Procurement into your data warehouse.

Oracle Fusion Cloud Project Management Load: A Modular connector that loads data from Oracle Fusion Cloud Project Management into your data warehouse.

Oracle Load: A Modular connector that loads data from Oracle Database into your data warehouse.

Oracle streaming connector: A streaming connector that monitors and records row-level changes in Oracle databases using Oracle's native LogMiner database package.

Oracle Unload: An orchestration component (Snowflake only) that writes data from a Snowflake source table or view to a target Oracle database.

Orbit: A Flex connector that loads data from Orbit into your data warehouse.

ORC Load: An orchestration component that loads data into a table from one or more ORC files, automatically inferring the schema and creating the corresponding table.

Ortto: A Flex connector that loads data from Ortto into your data warehouse.

Owner (environment role): The highest environment role with full permissions including validating, sampling, running, and publishing pipelines, viewing all data, and managing schedules.

Owner (project role): The highest project role with full permissions including managing project settings, environments, secrets, pipelines, Git operations, and user access.


P

PagerDuty: A Flex connector that loads data from PagerDuty into your data warehouse.

Pagination: The handling of paginated API responses in custom connectors, requesting successive pages until all records are retrieved.
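
The typical cursor-following loop can be sketched in Python; the page shape (`items`, `next_cursor`) below is a hypothetical API response, not a fixed Maia contract:

```python
def fetch_all(get_page):
    """Follow cursor pagination until the API returns no next cursor."""
    items, cursor = [], None
    while True:
        page = get_page(cursor)           # stands in for one HTTP request
        items.extend(page["items"])
        cursor = page.get("next_cursor")  # absent/None on the last page
        if cursor is None:
            return items

# Simulated three-page API response.
pages = {None: {"items": [1, 2], "next_cursor": "p2"},
         "p2": {"items": [3], "next_cursor": "p3"},
         "p3": {"items": [4]}}
print(fetch_all(lambda c: pages[c]))  # [1, 2, 3, 4]
```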

Parquet Load: An orchestration component that loads data into a table from one or more Parquet files, automatically inferring the schema and creating the corresponding table.

PayPal: A Flex connector that loads data from PayPal into your data warehouse.

Pendo: A Flex connector that loads data from Pendo into your data warehouse.

Persona: A Flex connector that loads data from Persona into your data warehouse.

Pinecone Vector Query: An AI component that ingests text search strings and returns data associated with similar vectors in a Pinecone vector database.

Pinecone Vector Upsert: An AI component that converts data from your cloud data warehouse into embeddings and stores them as vectors in a Pinecone vector database.

Pingdom: A Flex connector that loads data from Pingdom into your data warehouse.

Pipedrive: A Flex connector that loads data from Pipedrive into your data warehouse.

Pipeline notifications: A feature that allows Maia users to subscribe to email or Slack alerts for pipeline failures within projects and environments they have access to.

Pipeline Observability: A dashboard in Maia that lists all pipelines and their execution history, including those run manually, on a schedule, or via the API.

Pipeline quality review: A feature in Designer that checks pipelines against a defined set of quality rules to ensure they align with organizational standards.

Pipelines: Sequences of connected components that perform data operations. There are two types: orchestration pipelines (for loading data from sources to warehouses) and transformation pipelines (for transforming data within the warehouse).

Pivot: A transformation component that converts row values into columns, reducing the number of rows and increasing the number of columns. Equivalent to an SQL PIVOT function.
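
The reshaping can be sketched in Python, turning distinct values of one column into output columns keyed by an index column (the year/quarter/sales rows are hypothetical):

```python
def pivot(rows, index, column, value):
    """Turn distinct values of `column` into output columns, keyed by `index`."""
    out = {}
    for row in rows:
        out.setdefault(row[index], {})[row[column]] = row[value]
    return out

sales = [{"year": 2024, "q": "Q1", "amount": 100},
         {"year": 2024, "q": "Q2", "amount": 120}]
print(pivot(sales, "year", "q", "amount"))  # {2024: {'Q1': 100, 'Q2': 120}}
```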

PivotalTracker: A Flex connector that loads data from Pivotal Tracker into your data warehouse.

Postgres Vector Upsert: An orchestration component that converts text data from your cloud data warehouse into embeddings and stores them as vectors in a Postgres vector database.

PostgreSQL Load: A Modular connector that loads data from PostgreSQL into your data warehouse.

PostgreSQL streaming connector: A streaming connector that monitors and captures row-level changes within PostgreSQL schemas using the pgoutput decoding plugin.

Print Variables: An orchestration component that outputs variable values to the execution log.

Private Link: Secure communication between the control plane and Maia agents using private network connectivity, avoiding the public internet.

Productboard: A Flex connector that loads data from Productboard into your data warehouse.

Project roles: Roles assigned at the project level controlling what users can do within specific projects.

Projects: Git-backed containers for pipelines, variables, and related files.

Pull remote changes (Git action): Fetching and merging changes from the remote Git repository.

Pull requests (Git action): Requests to merge changes between branches with review workflows.

Push local changes (Git action): Uploading committed changes to the remote Git repository.

Python Pushdown: A transformation component that executes Python code directly in the data warehouse for efficient processing.

Python Script: An orchestration component that executes Python scripts for tasks such as file handling, API calls, or complex logic.


Q

Query Result to Grid: An orchestration component that queries a table and loads the returned rows into a predefined grid variable for use elsewhere in the pipeline.

Query Result to Scalar: An orchestration component that captures a single query result value into a scalar variable.


R🔗

Rank: A transformation component that adds a ranking column to data using SQL window functions such as RANK, DENSE_RANK, PERCENT_RANK, CUME_DIST, or ROW_NUMBER.
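
The difference between RANK and DENSE_RANK semantics is easy to miss; this illustrative Python sketch (not Maia's implementation) computes both over a pre-sorted score list:

```python
# RANK vs DENSE_RANK over scores sorted descending:
# ties share a rank; RANK leaves gaps afterwards, DENSE_RANK does not.
scores = [50, 40, 40, 30]

rank, dense = [], []
for i, s in enumerate(scores):
    if i and s == scores[i - 1]:
        rank.append(rank[-1])    # tie: reuse the previous RANK value
        dense.append(dense[-1])  # tie: reuse the previous DENSE_RANK value
    else:
        rank.append(i + 1)                            # RANK skips past ties
        dense.append(dense[-1] + 1 if dense else 1)   # DENSE_RANK never skips

print(rank)   # [1, 2, 2, 4]
print(dense)  # [1, 2, 2, 3]
```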

RDS Bulk Output: An orchestration component that loads the contents of a table or view into a table in an Amazon RDS database.

RDS Query: A query connector that loads data from Amazon RDS into your data warehouse.

Recurly: A Flex connector that loads data from Recurly into your data warehouse.

Refresh External Table: An orchestration component that refreshes an external table, re-syncing it with the target data.

Refresh Materialized View: An orchestration component (Amazon Redshift only) that refreshes a materialized view by applying changes from its underlying tables.

Refresh Table: An orchestration component that refreshes table metadata or data.

Registration: The process of creating a Maia account.

Remove from Grid: An orchestration component that removes rows from a grid variable based on matching criteria.

Rename: A transformation component that changes column names.

Reset branch (Git action): Resetting a branch to a previous commit state.

Retry: An orchestration component that repeatedly attempts to run a connected component until it succeeds or a maximum number of attempts is reached.
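
The retry-until-success-or-exhausted pattern this component applies can be sketched as follows; the `retry` helper and `flaky` task are hypothetical names for illustration:

```python
import time

def retry(task, max_attempts=3, delay=0.0):
    """Run `task` until it succeeds or attempts are exhausted."""
    for attempt in range(1, max_attempts + 1):
        try:
            return task()
        except Exception:
            if attempt == max_attempts:
                raise                # give up after the final attempt
            time.sleep(delay)        # pause before retrying

# A task that fails twice, then succeeds on the third attempt.
calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("transient failure")
    return "done"

print(retry(flaky))  # done
```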

Revert changes (Git action): Undoing changes by creating a new commit that reverses previous changes.

Rewrite Table: A transformation output component that replaces a table's contents with new data.

Role-Based Access Control: A security model in which permissions are assigned through roles at the account, project, and environment levels.

Rollback: An orchestration component that ends a database transaction and reverts any changes it made, restoring the data to its previous state.

Rootly: A Flex connector that loads data from Rootly into your data warehouse.

Run Notebook: An orchestration component that executes a Databricks notebook from within a pipeline.

Run Orchestration: An orchestration component that executes another orchestration pipeline.

Run Shared Pipeline: An orchestration component that executes a shared pipeline to completion.

Run Transformation: An orchestration component that executes a transformation pipeline from within an orchestration pipeline.


S🔗

S3 Load: An orchestration component that loads data into an existing table from objects stored in Amazon S3.

S3 Unload: An orchestration component that creates files on a specified Amazon S3 bucket and populates them by copying data from a designated table or view.

Salesforce Load: A Modular connector that loads data from Salesforce into your data warehouse.

Salesforce Marketing Cloud Query: A query connector that loads data from Salesforce Marketing Cloud into your data warehouse.

Salesforce Output: An orchestration component that writes the contents of a source table or view back into a Salesforce table using the Salesforce API.

Salesforce Pardot: A Flex connector that loads data from Salesforce Pardot into your data warehouse.

Salesloft: A Flex connector that loads data from Salesloft into your data warehouse.

SAP NetWeaver Load: A Modular connector that loads data from SAP NetWeaver into your data warehouse.

SAP ODP: A Modular connector that loads data from SAP via Operational Data Provisioning into your data warehouse.

SAP SuccessFactors: A Flex connector that loads data from SAP SuccessFactors into your data warehouse.

Schedules: Automated triggers that run pipelines at specified times or intervals.

Schema Copy: An orchestration component that copies one or more tables from one schema to another.

SeatGeek: A Flex connector that loads data from SeatGeek into your data warehouse.

Secrets: Securely stored credentials referenced by name in pipelines.

Security: The security measures protecting Maia, including encryption standards, certificate validation, and secure communication protocols.

Send Email: An orchestration component that sends emails, with or without attachments, from within a pipeline.

SendGrid: A Flex connector that loads data from SendGrid into your data warehouse.

ServiceNow Query: A query connector that loads data from ServiceNow into your data warehouse.

Services: The various services available within Maia.

Shared pipelines: Pipelines that can be reused across multiple projects.

SharePoint Query: A query connector that loads data from Microsoft SharePoint into your data warehouse.

Shopify: A Flex connector that loads data from Shopify into your data warehouse.

Shopify Load: A Modular connector that loads data from Shopify into your data warehouse.

Show to Grid: An orchestration component (Snowflake only) that saves the results of a Snowflake SHOW command into a grid variable.

Single Sign-On (SSO): Authentication allowing users to access Maia using organizational identity providers.

Slack: A Flex connector that loads data from Slack into your data warehouse.

Smartsheet: A Flex connector that loads data from Smartsheet into your data warehouse.

Snapchat: A Flex connector that loads data from Snapchat into your data warehouse.

SNS Message: An orchestration component that sends a message to an Amazon SNS topic to initiate downstream processing in a business workflow.

Snowflake Load: A Modular connector that loads data from Snowflake into your data warehouse.

Snowflake streaming destination: A streaming pipeline destination that writes captured change events directly to a Snowflake data warehouse.

Snowflake Vector Upsert: An AI component (Snowflake only) that converts data from your cloud data warehouse into embeddings and stores them as vectors in Snowflake.

Snowpark Container Prompt: An orchestration component (Snowflake only) that hosts and queries large language models running within Snowpark Container Services.

Snyk: A Flex connector that loads data from Snyk into your data warehouse.

Split Field: A transformation component that splits a column's contents at a specified delimiter and outputs the resulting parts as new columns.
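
The split-at-delimiter behaviour can be illustrated in plain Python; the `full_name` column and the two output column names are hypothetical examples:

```python
# Split a delimited column into two new columns, keeping the original.
row = {"full_name": "Ada Lovelace"}

# Split at the first space only, yielding exactly two parts.
first, last = row["full_name"].split(" ", 1)
row.update(first_name=first, last_name=last)

print(row)
# {'full_name': 'Ada Lovelace', 'first_name': 'Ada', 'last_name': 'Lovelace'}
```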

SQL: A transformation component for executing custom SQL queries.

SQL Script: An orchestration component for executing SQL scripts.

SQS Message: An orchestration component that posts a message to an Amazon Simple Queue Service (SQS) queue for processing by other applications.

Square: A Flex connector that loads data from Square into your data warehouse.

SSH Tunneling: Secure connectivity to data sources through SSH tunnels.

Start: An orchestration component that marks the starting point of an orchestration pipeline. All orchestration pipelines require at least one Start component.

Start Compute: An orchestration component (Databricks only) that ensures the environment's default Databricks cluster is running before the pipeline continues.

Streaming agent logs: Logs specific to streaming agent execution.

Streaming pipelines: Pipelines that process data continuously in near real-time from source databases.

Stripe Query: A query connector that loads data from Stripe into your data warehouse.

SugarCRM Load: A Modular connector that loads data from SugarCRM into your data warehouse.

Super Admin: The highest account role that inherits all User Admin permissions and is automatically assigned the Owner role for all projects and environments in the account. Only another Super Admin can modify or remove a user with this role.

SurveyMonkey: A Modular connector that loads data from SurveyMonkey into your data warehouse.

Sybase ASE Load: A Modular connector that loads data from Sybase ASE into your data warehouse.

Sync All Tables: Pre-built pipelines that process Avro files from Amazon S3 or Azure Blob Storage streaming destinations and load them into a cloud data platform.

System variables: Built-in variables provided by the system (job ID, timestamps, etc.).


T🔗

Table Delete Rows: A transformation component (Amazon Redshift only) that deletes rows from a target table by matching them to key values from an input flow.

Table Input: A transformation component that reads data from an existing table.

Table Iterator: An orchestration component that loops through tables matching specified criteria.

Table Metadata to Grid: An orchestration component that retrieves table structure information into a grid variable.

Table Output: A transformation component that writes data to a destination table.

Table Update: A transformation component that updates existing rows in a table.

Ticketmaster: A Flex connector that loads data from Ticketmaster into your data warehouse.

TikTok: A Flex connector that loads data from TikTok into your data warehouse.

Toggl: A Flex connector that loads data from Toggl into your data warehouse.

Transactions: A series of database operations performed as a single unit of work, ensuring data integrity through ACID properties. Managed in Maia using the Begin, Commit, and Rollback components.
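
The Begin/Commit/Rollback lifecycle can be sketched with Python's built-in sqlite3 module (which opens transactions implicitly before data-modifying statements); the accounts table is a hypothetical example, not a Maia API:

```python
import sqlite3

# Set up a table and commit the initial state.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT, balance INTEGER)")
conn.execute("INSERT INTO accounts VALUES ('alice', 100)")
conn.commit()

try:
    # This UPDATE runs inside an implicit transaction.
    conn.execute("UPDATE accounts SET balance = balance - 500 "
                 "WHERE name = 'alice'")
    raise ValueError("insufficient funds")  # a mid-transaction failure
except ValueError:
    conn.rollback()  # undo everything since the last commit

balance = conn.execute("SELECT balance FROM accounts").fetchone()[0]
print(balance)  # 100 -- the debit was rolled back
```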

Transpose Columns: A transformation component that converts columns into rows, reshaping data from a wide format into a longer format.

Transpose Rows: A transformation component that combines multiple input rows into a single output row, reshaping data from a long format into a wider format.
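
The two reshapes above are inverses; this plain-Python sketch (with hypothetical `id`/month columns) shows wide-to-long and back:

```python
# Wide-to-long ("Transpose Columns"): one output row per (id, measure) pair.
wide = {"id": 1, "jan": 100, "feb": 120}
long_rows = [(wide["id"], month, value)
             for month, value in wide.items() if month != "id"]
print(long_rows)  # [(1, 'jan', 100), (1, 'feb', 120)]

# Long-to-wide ("Transpose Rows"): fold the pairs back into a single row.
rebuilt = {"id": long_rows[0][0]}
for _, month, value in long_rows:
    rebuilt[month] = value
print(rebuilt)  # {'id': 1, 'jan': 100, 'feb': 120}
```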

Truncate Table: An orchestration component that removes all rows from a table while preserving its structure.

Twilio: A Flex connector that loads data from Twilio into your data warehouse.


U🔗

Unite: A transformation component that combines two or more datasets, optionally including all columns or only shared columns.

Unpivot: A transformation component that converts columns into rows.

Unstructured.io: A Flex connector that processes and loads data from unstructured documents into your data warehouse.

Update Scalar: An orchestration component that modifies the value of a scalar variable.

Usage Dashboard: A monitoring interface that provides a detailed daily breakdown of credit consumption, showing usage by project and pipeline. Available to users with the Billing service enabled.

User Admin: An account role that allows users to edit basic account settings, manage permissions, and administer other users including inviting users and assigning roles.


V🔗

Vacuum: An orchestration component (Amazon Redshift only) that reorganizes table data according to its sort key and reclaims space from deleted rows.

Variables: Named values that can be referenced throughout pipelines using ${variable_name} syntax.
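
The ${variable_name} interpolation pattern can be illustrated with Python's string.Template, which happens to use the same syntax; the schema, table, and run_id names are hypothetical:

```python
from string import Template

# Substitute named values into a ${...} template, as pipeline
# variable references are resolved at runtime.
query = Template("SELECT * FROM ${schema}.${table} WHERE run_id = ${run_id}")
resolved = query.substitute(schema="analytics", table="orders", run_id=42)

print(resolved)
# SELECT * FROM analytics.orders WHERE run_id = 42
```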

Vector Search: An AI component (Snowflake only) that performs a vector similarity search on two input tables to find content that best answers specific questions using Cortex vector embeddings.

Viewer (environment role): An environment role that allows users to validate and sample pipelines, view executions, lineage, and schema, but cannot run or publish pipelines, and can only view (not manage) schedules.

Viewer (project role): A project role with read-only access that allows users to view projects, pipelines, environments, and other resources, but cannot create, modify, or delete anything.

Volume to Delta Table: An orchestration component (Databricks only) that transfers data from a Databricks volume into a Delta Lake table without replacing or deleting existing data.


W🔗

Webhook Post: An orchestration component that posts a payload to a specified webhook URL, triggering an automated message in an external application.

Window Calculation: A transformation component that performs calculations across rows using SQL window functions.

Workday Custom Reports: A Modular connector that loads data from Workday Custom Reports into your data warehouse.

Workday Load: A Modular connector that loads data from Workday into your data warehouse.


X🔗

X Ads Load: A Modular connector that loads advertising data from X (formerly Twitter) into your data warehouse.


Y🔗

Yelp: A Flex connector that loads data from Yelp into your data warehouse.


Z🔗

Zendesk Talk: A Flex connector that loads data from Zendesk Talk into your data warehouse.

Zendesk Ticketing: A Flex connector that loads data from Zendesk Ticketing into your data warehouse.

Zoom: A Flex connector that loads data from Zoom into your data warehouse.

Zuora: A Flex connector that loads data from Zuora into your data warehouse.