API Profiles Overview
Overview
API Profiles is a collective term for API Query Profiles and API Extract Profiles, which are used by the API Query and API Extract components, respectively. API Profiles define how data is taken from a JSON- or XML-based REST API and can be considered a collection of schemas that define how this data is ingested to become table data.
API Profiles are a global resource, shared across Matillion ETL groups and projects, that can be found via the Project menu: Project → Manage API Profiles.
Query vs Extract
API Query Profiles and API Extract Profiles are configured similarly; the primary difference between them is how they handle the response data. Below are some general concepts you will need to be aware of when creating API Profiles. Most of these concepts are common to REST APIs, so anyone wanting to use API Profiles should first be comfortable working with REST APIs.
API Query Profiles
- Incoming data is flattened and then staged (ETL).
- A filter is defined that selects which fields to load.
- This option can require some advanced custom scripting.
- Used by the API Query component.
API Extract Profiles
- Incoming data is staged raw. The loaded data must then be flattened and filtered via transformations in the data warehouse (ELT).
- This is often the simpler option and requires no code.
- Used by the API Extract component.
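The difference between the two approaches can be illustrated outside Matillion with a short Python sketch. This is not how either component is implemented; the response document, the `flatten` helper, and the `selected_fields` list are all hypothetical and exist only to show "flatten and filter before staging" next to "stage raw, transform later".

```python
import json

# Hypothetical nested API response, invented for this illustration.
response_text = '{"id": 7, "user": {"name": "Ada", "address": {"city": "Leeds"}}}'


def flatten(record, parent_key="", sep="_"):
    """Collapse nested objects into a single-level dict of column -> value."""
    flat = {}
    for key, value in record.items():
        column = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            # Recurse into nested objects, prefixing child keys with the parent key.
            flat.update(flatten(value, column, sep))
        else:
            flat[column] = value
    return flat


# Query-style (ETL): flatten first, then keep only the selected fields.
selected_fields = ["id", "user_name"]  # hypothetical field filter
flat = flatten(json.loads(response_text))
query_row = {field: flat[field] for field in selected_fields}

# Extract-style (ELT): stage the raw document untouched; flattening and
# filtering happen later, inside the data warehouse.
extract_row = {"raw": response_text}
```

In the first case the staged row already has one column per selected field; in the second, the staged row holds the whole document and the shaping work is deferred.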
Limitations
Matillion's API Query Profiles CANNOT load data in the following cases:
- Data other than valid JSON or XML.
- Binary or unstructured data, such as PDF or Word documents, or images.
- HTML.
- APIs where the data contains embedded CSVs.
- APIs which return XML embedded inside JSON, or vice versa.
- APIs in which the returned content is encoded or encrypted.
It may be possible to use the API Extract component for many of the cases above, because handling the format of the response is left to the user.
API Profiles concepts
Configuring an API Profile endpoint involves setting properties grouped under the following headings, each displayed as a tab of the same name within the wizard.
Detailed explanations of specific fields in the wizards can be found in their respective articles, API Query Profiles and API Extract Profiles.
Auth
In the endpoint creation wizard, authentication details are used to test your endpoint configuration and generate a response. These details are not stored for the API Query or API Extract component that uses this profile, so they must be re-entered in the component, ideally using a Managed Password. In some cases, the authentication details used in the wizard are stored in the Connection Options for an API Query Profile; users can remove these as desired after creating their endpoint.
For examples of authentication, see API Profiles - Authentication.
Params
Parameters are parts of your API call that can be abstracted as variables. These are declared within the API Profile wizard and can then be defined by the component using that profile. In this way, a single API profile could be used in many different ways. For example, if part of the URL path is declared as a parameter, it can be specified at the component instead of the API Profile, allowing that API Profile to hit multiple different endpoints with different components.
For a full discussion on parameters with examples, see API Profiles - Parameters.
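As a rough illustration of the idea, the sketch below treats part of a URL path as a parameter that each caller supplies. It is plain Python, not Matillion's parameter syntax; the `api.example.com` host, the `resource` parameter name, and the `build_url` helper are assumptions made up for the example.

```python
from string import Template
from urllib.parse import urlencode

# Hypothetical endpoint template: ${resource} stands in for a URL path parameter
# that is declared once and given a value by whoever uses the template.
url_template = Template("https://api.example.com/v1/${resource}")


def build_url(resource, query=None):
    """Fill in the path parameter and append optional query parameters."""
    url = url_template.substitute(resource=resource)
    if query:
        url += "?" + urlencode(query)
    return url


# Two "components" reusing the same template to reach different endpoints.
users_url = build_url("users", {"limit": 50})
orders_url = build_url("orders")
```

The template is defined once, while each caller decides which endpoint it resolves to, which mirrors how one profile can serve several components.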
Body
The plaintext body to be sent with the request. This is only required for some POST method requests. See your target API's documentation for details on how the body should be formed.
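For orientation, a JSON body attached to a POST request might look like the following Python sketch. The URL and payload fields are hypothetical, and the request is only constructed, never sent; the body format your target API expects may differ.

```python
import json
from urllib.request import Request

# Hypothetical payload; a real API defines its own required fields.
payload = {"query": "status:open", "limit": 10}
body = json.dumps(payload).encode("utf-8")

# Build (but do not send) a POST request carrying the body as plain text.
request = Request(
    "https://api.example.com/v1/search",  # hypothetical endpoint
    data=body,
    headers={"Content-Type": "application/json"},
    method="POST",
)
```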
Response
After configuring your API request in the wizard, you can see the generated response and decide if you want to continue editing the request. If you're happy with the response, you can move on to defining how that data is used.
Logs
The wizard provides logs for every API request, complete with any internal errors due to misconfiguration, HTTP response status codes, request information such as headers, and any other information returned in the response.
Pagination
Pagination is the process by which large API response data can be returned as multiple parts and iterated through, instead of dealing with potentially huge amounts of data in a single response.
Expanded explanations of paging strategies are available in API Profiles - Pagination.
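A minimal sketch of one common strategy, page-number paging, is shown below in Python. It assumes an API that returns an empty page once the data is exhausted; `fetch_page` is a hypothetical in-memory stand-in for a real HTTP call, and the strategies Matillion supports may work differently.

```python
# In-memory stand-in for a paged API: 7 records served 3 at a time.
RECORDS = [{"id": n} for n in range(1, 8)]
PAGE_SIZE = 3


def fetch_page(page):
    """Return one page of records (1-indexed); an empty list means no more data."""
    start = (page - 1) * PAGE_SIZE
    return RECORDS[start:start + PAGE_SIZE]


def fetch_all():
    """Request successive pages until an empty page signals the end."""
    rows, page = [], 1
    while True:
        batch = fetch_page(page)
        if not batch:
            break
        rows.extend(batch)
        page += 1
    return rows


all_rows = fetch_all()
```

Here the loop makes four requests: three that return data and a final one that returns an empty page and ends the iteration.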
RSD files
An RSD file holds the complete configuration of an API Query Profile endpoint. It is written in its own syntax and defines all of the endpoint's properties, such as authorization, headers, syntax and expected data. RSD files are only present in API Query Profiles because, unlike API Extract Profiles, these profiles must provide a definition for transforming response data.
We do not recommend users manually edit these files as most uses of API Query Profiles can be configured entirely using the wizard. For more information on RSD files and their properties, see API Query Profiles - RSD files.
Using API Profiles
- API Query Profiles are used through the API Query component.
- API Extract Profiles are used through the API Extract component.
By selecting the Profile property in those components, an API Profile of the appropriate type can be selected. Some choices made during the API Profile endpoint configuration may affect the component's properties once a profile is selected, such as available parameters.
API Profiles examples
For worked examples and specific applications of API Profiles, see the articles below: