Obtaining an API token and passing it to an API Query profile
Overview
Some API profiles require a refresh token to be passed in as part of the header, such as "Volo API", "Hubspot API" and "Salesforce API". Some of these refresh tokens expire regularly, even hourly, and will need to be refreshed as part of the Matillion ETL job. This document using an example endpoint to look at how to refresh an API token and how to pass this into an API profile.
The document is organized into following sections:
- Example API
- Passing the access token to the endpoint URI
- Obtaining an access token at runtime
- API Query component
Example API
The example API requires an authorization token to be passed through with every API call to authenticate the user. The authorization token expires every hour, so to ensure Matillion ETL is making the API call with a valid token, a separate API call must be made to get the the token required.
This is done using the API token endpoint, which requires a "Bearer" Authorisation Token. This token is only accepted by the token endpoint. No other endpoint in the API, not even an API key required to query all other endpoints will accept this token.
Some token API endpoints need authorization to generate the access token through clientId, client secret and refresh token as query parameters in the URL, such as Salesforce API.
Generate Access Token
Example Endpoint API token URI: https://InstanceName/services/oauth2/token
An example response from the example endpoint URI to generate access token shown below:

Please Note
Please provide required query parameters in the endpoint URL, if needed. In the example above, it is Salesforce API token URI that needs query parameter such as grant_type, client_id, client_secret, refresh_token
Passing the access token to the endpoint URI
You can pass the access token at an API endpoint URI through Manage Query profiles interface in the process of generating RSD file. By doing that, the generated RSD file will automatically fetch the access token you have provided in the interface.
Manage query profile interface
1. Click Project → Manage API Profiles → Manage Query Profiles to open the Manage Query Profiles interface window. Click to add a new query profile.
2. In the Add Query Profile pop-up window, enter "ExampleAPI" into the Profile Name field and click OK.
3. Return to the Manage Query Profiles pop-up window, find and click on "ExchangeRateAPI" in the API profiles list. This should open the Configure Query Profile pop-up window.
4. On the Configure Query Profile window, click New Endpoint to create a new query profile using API endpoint URI.
5. On the next window, provide the Source details. You'll need the Endpoint Name of what you want the table to be called. You can also provide a description. Once done, click Next.
6. In the URI field, enter the URI of the API call you want to make. At this point, you can configure Parameters, any required body, and/or a header and Authentication. In the example we'll be passing bearer access token to the endpoint URI. For that, click Auth tab, select Bearer Token authentication types and enter the access token. Use the same value, you received in generating an access token. Once done, click Next.
Please note
In the above example, the RSD will attempt to contact the real API endpoint, so it’s necessary to also add authentication parameters. For the example REST API, it’s the access token in the form of authentication bearer token. Again, use exactly the same value, you got in generating access token.
7. On the Response Configuration window, you may only want specific items such as those nested in an object of the JSON. The repeating group in this example is an array element named "records". "Set Repeating Element" using right-click. Enable the "Paging" feature and select the required strategy from the dropdown (in this case Paging is "Full Path"), then click Next.
8. You'll see a sample of the data in Data Preview tab. You can also review your configurations through the Config Review tab when you click Finish.
9. Returning to the Configure Query Profile window, the file "Example.rsd" will now be open in editable mode.
Important Information
The RSD script is fully functional as-is, but still contains a hardcoded access token. It’s bad practice to leave it this way, for two reasons:
- It’s insecure: owning the access token is equivalent to knowing the Salesforce username and password, so it needs to be properly protected.
- Access tokens expire after few hours, so the existing API Profile has a very short shelf life.
To remedy these things, the RSD needs to be parameterized as part of production.
Manage Connection Options
When passing authentication or parameters in the Manage Query interface window, the attribute values will be automatically added in Manage Connection Options. You can change or add any additional attribute value of any parameter using this window.
Normally after access tokens expire, the API call may stop working and you’ll need to get another access token.
Two steps remain to produce the Query Profile:
1. Remove the hardcoded access token from the "Manage Connection Option" window.
2. Obtain the access token at runtime and supply the value to the CustomeHeader Parameter in Manage Connection Option as "Authentication : Bearer <token> by clicking at Configure Query Profile window.
Test the API profile again with the parameter in place to make sure it’s being picked up correctly.
After the API profile has been parameterized, you can package the components into an Orchestration job. The next section will explain how to obtain access token at runtime using Python Script component and use Python component with API Query component to extract the access token at runtime.
Obtaining an access token at runtime
There are a number of different ways of achieving this, as outlined in this document.
For this example, we’ll replace the token access URL (https://InstanceName/services/oauth2/token) command shown earlier with a Python script in a Matillion "Python Script" component. The script is available for download at the end of this document.
1. Search for the Python Script component in the "Search bar".
2. Click on Properties of the component.
3. Open the Script window by clicking ....
4. On the Script window, enter the Jython Script for the "Token Access" URL.
The Orchestration job
There are three aspects to this:
1. Job variables, accepting all of the necessary parameters including access_token, client_id, client_secret, and refresh_token.
2. The Python Script component, which gets a new Salesforce access token using the client_id, client_secret and refresh_token.
3. The API Query component, which uses the access token to extract the API account data and load it into a database table.
In the API Query component, go to Connection Options and add a new record:
- Parameter: CustomHeaders
- Value: Bearer ${access_token}
Once done, check the response data in the Sample tab after adding Python Script to get the access token at runtime.
Warning
For security reasons it’s highly recommended that you do not hardcode the "client_id", "client_secret" and "refresh_token" anywhere, or store them as environment variable defaults. Instead, implement password management functionality to protect them - for example as described in this article.
API Query Component
Once an API Query profile (RSD file) is created and has been tested and returns data, it's ready to be used in the API Query component. This can be done as follows:
1. Create a new Orchestration job.
2. Drag an API Query component onto the job canvas.
3. Click on the component icon to open the Properties panel.
4. In the Basic/Advanced Mode, please keep the Mode of the properties as "Basic" which is a preferred mode. In the "Basic" mode, you need to choose a data source and column. In "Advanced" mode you need to make specific SQL queries in editor.
5. Next, select the Authentication Method from the dropdown and click OK.
6. Next, select the Profile from the dropdown, then click OK. Please note, the Profile is the name of the API Query profile you created. Example: Salesforce-API
.
7. In the Connection Options, you can specify the param attributes value same as in Manage Connection Options while generating Query Profile and also dynamically add any params or change the param values . Please note that when you configure Connection Options property in the API Query component. Its value, will not be persisted.
8. Select the valid Data Source from the dropdown. The Data Source is the name of the "Endpoint" provided in the Source Details window.
9. Select the items or variables in Data Selection and click OK. The Data Selection are the items or variables you wanted the data to be collected from endpoint.
10. Configure the rest of the API Query component by providing a target table name and selecting a valid staging area. Once all details are provided and validated as OK, the icon of the API Query component coverts to green.
11. Finally, you can run the component by right-clicking on the API Query Component and then selecting Run Component. If all properties are configured correctly, the job will run successfully. You can view the details in the Tasks Info.
12. Next, you click Sample and then click Data. It should fetch the response as expected.