Dataset API
Open Specification
The Open Data Blend Dataset API implements and extends the Frictionless Data Specifications. You can learn more about this in the Frictionless Data Compatibility section.
Terminology such as 'Package' and 'Resource' is inherited from the Frictionless Data specifications. In dataset metadata, you can translate 'Package' as meaning 'Dataset' and 'Resource' as meaning 'Data File'. The exception is the Open Data Blend Catalogue metadata, where 'Package' refers to the 'Catalogue' and 'Resource' refers to a 'Dataset'.
High Availability
If you have an Analytics Plan subscription, your usage of the Open Data Blend Dataset API is covered by a service-level agreement (SLA) that includes a 99.5% up-time guarantee. You can check the Status Page for current and historic up-time statuses at any time.
Fair Usage
We regularly monitor usage and take reasonable actions to ensure that the Open Data Blend Dataset API service is used fairly and sustainably. We only impose hard limits on data file endpoints (see the Usage Limits section below).
Usage Limits
A data file can be downloaded without providing an access key, but requests without an access key are limited to 15 per month. Requests that exceed this limit will receive an HTTP 401 ('Unauthorized') response.
We always aim to strike a careful balance between openness, in terms of accessibility, and the ongoing sustainability of the service. Many of our data files can be quite large. Imposing the limit of 15 requests per month on free downloads helps to keep the associated bandwidth and compute costs at a sustainable level.
There is no limit to the number of requests that can be made with a valid access key. All requests above the number included in your subscription will incur an additional cost. The cost per additional data file request can be found on the pricing page.
Preview CSV data files do not count towards the limit.
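As a rough client-side illustration of the limits above, the sketch below keeps a local count of keyless requests so that a script can stop before it would trigger an HTTP 401. The class name `KeylessQuota` and the rule that a path ending in '.csv' identifies a preview file are assumptions made for this example. The service remains the authority on what has actually been counted, and the monthly reset is not modelled.

```python
FREE_MONTHLY_LIMIT = 15  # keyless data file requests per month, per the text above


class KeylessQuota:
    """Hypothetical local counter; it does not query the service."""

    def __init__(self, limit=FREE_MONTHLY_LIMIT):
        self.limit = limit
        self.used = 0

    def try_consume(self, path):
        """Return True if a keyless request for `path` is expected to be allowed."""
        # Assumption: preview files are the ones whose path ends in '.csv'.
        if path.endswith(".csv"):
            return True  # previews do not count towards the limit
        if self.used >= self.limit:
            return False  # a further request would likely receive HTTP 401
        self.used += 1
        return True
```

A script could call `try_consume` before each keyless download and switch to an access key, or stop, once it returns `False`.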
Rate Limits
All data file requests are limited to 30 requests per minute. Exceeding this limit will result in an HTTP 429 ('Too Many Requests') response being returned for up to an hour.
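Because exceeding the rate limit can lead to HTTP 429 responses for up to an hour, it may be worth throttling on the client side. The following is a minimal sketch of a rolling-window limiter, not part of the API itself. The class name `SlidingWindowLimiter` and its methods are hypothetical, and the injectable `clock` exists only to make the logic easy to test.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `max_calls` within any rolling `period` seconds."""

    def __init__(self, max_calls=30, period=60.0, clock=time.monotonic):
        self.max_calls = max_calls
        self.period = period
        self.clock = clock
        self._stamps = deque()  # send times of recent requests, oldest first

    def wait_time(self):
        """Return the seconds to wait before the next request is allowed."""
        now = self.clock()
        # Drop timestamps that have left the rolling window.
        while self._stamps and now - self._stamps[0] >= self.period:
            self._stamps.popleft()
        if len(self._stamps) < self.max_calls:
            return 0.0
        # The window is full: wait until the oldest request expires.
        return self.period - (now - self._stamps[0])

    def record(self):
        """Record that a request was just sent."""
        self._stamps.append(self.clock())
```

In use, a script would call `time.sleep(limiter.wait_time())` before each data file request and `limiter.record()` immediately after sending it.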
API Reference
The Open Data Blend Dataset API is simple and has three types of requests:
Catalogue
Dataset
Data File
All responses are in JSON with the following exceptions:
Requests for data files: the backend response is a data file.
Requests that result in a pass-through error: the backend response is presented verbatim and may not be JSON (it could be XML, for example).
Each request type is described in this section with examples, where applicable.
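Since a pass-through error may arrive in a format other than JSON, client code for the metadata endpoints should not assume that every body parses. The sketch below shows one defensive approach: try JSON first and fall back to the raw text. The function name `decode_metadata_body` and the returned labels are hypothetical.

```python
import json


def decode_metadata_body(body: bytes):
    """Return ("json", parsed) when the body is JSON, else ("raw", text)."""
    text = body.decode("utf-8", errors="replace")
    try:
        return "json", json.loads(text)
    except json.JSONDecodeError:
        # Likely a pass-through error, presented verbatim (for example XML).
        return "raw", text
```

Keeping the raw text makes it possible to log the backend error unchanged instead of masking it with a parsing exception.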
Request Methods
Unless specified otherwise, requests should be made using the HTTP GET method.
Catalogue
The Catalogue endpoint returns a set of metadata for the published Open Data Blend Datasets. The request can be made using any of three endpoints.
Non-versioned Endpoint
The non-versioned endpoint is provided for convenience and will always point to the latest versioned endpoint of the Open Data Blend Dataset API. We recommend that you use the standard or Frictionless Data Specification endpoint for any production use. This will ensure that your code is not repointed to a new versioned endpoint if one is introduced in the future.
Standard Endpoint (Recommended)
Frictionless Data Specification Endpoint (Recommended)
Response Example
All variants of the Open Data Blend Catalogue endpoint will return the same JSON response.
Metadata Definitions
Open Data Blend Catalogue metadata is split into two main parts:
Package Properties: The collection of properties at the root level that describe the catalogue.
Resource Properties: The collection of properties within the 'resources' property that describe the published datasets.
Package Properties
Name | Description |
profile | Name of the Data Package profile. The value will always be 'data-package-catalog' which indicates that it is compatible with the Frictionless Data pattern for Data Package Catalogues. |
name | Name of the catalogue in slug form |
title | Name of the catalogue in title case |
description | Brief description of the catalogue |
terms_of_service | URL of the Open Data Blend terms of service |
updated | Last date that the catalogue was updated |
contributors | Details of the catalogue contributors |
resources | List of published datasets and their properties. See the ‘Resource Properties’ section below for resource object definitions. |
Resource Properties
Name | Description |
name | Name of the dataset in slug form |
path | URL of the dataset metadata |
title | Name of the dataset in title case |
description | Brief description of the dataset |
updated | Last date that the dataset was updated |
is_beta | Beta status of the dataset. |
format | File format of dataset metadata. This is always |
image | URL of the dataset icon |
theme | Theme of the dataset, which is used to group similarly themed datasets together |
blend_classes | Blend Class of the dataset. Indicates the maturity of the dataset in terms of refinement. |
resource_group_count | Number of data file groups. Indicates the number of logical tables. |
resource_count | Number of data files. |
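To illustrate how the catalogue metadata might be consumed, the sketch below indexes the 'resources' list by dataset name. The sample values, including the `example.invalid` URL, are placeholders rather than real catalogue content, and treating 'is_beta' as a boolean is an assumption.

```python
# Hypothetical, heavily abbreviated catalogue; only the property names
# come from the tables above.
sample_catalogue = {
    "profile": "data-package-catalog",
    "name": "example-catalogue",
    "resources": [
        {
            "name": "open-data-blend-date",
            "path": "https://example.invalid/open-data-blend-date/datapackage.json",
            "is_beta": False,
        },
        {
            "name": "example-beta-dataset",
            "path": "https://example.invalid/example-beta-dataset/datapackage.json",
            "is_beta": True,
        },
    ],
}


def index_datasets(catalogue, include_beta=True):
    """Map each dataset slug to the URL of its dataset metadata."""
    return {
        res["name"]: res["path"]
        for res in catalogue.get("resources", [])
        if include_beta or not res.get("is_beta")
    }
```

Calling `index_datasets(sample_catalogue, include_beta=False)` on the sample keeps only the 'open-data-blend-date' entry.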
Dataset
The Dataset endpoint returns the complete set of metadata for an Open Data Blend Dataset. The request can be made using either of two endpoints. In both endpoint variants, 'name' is the name of the dataset in slug form, e.g. 'open-data-blend-date'.
Standard Endpoint
Frictionless Data Specification Endpoint
Response Example
The below response example has been abbreviated for clarity.
Metadata Definitions
Open Data Blend Dataset metadata is split into two main parts:
Package Properties: The collection of properties at the root level that describe the dataset.
Resource Properties: The collection of properties within the 'resources' property that describe the data files.
Package Properties
Name | Description |
profile | Name of the Data Package profile. The value will always be 'data-package' which indicates that it is compatible with the Frictionless Data specification for Data Packages. |
name | Name of the dataset in slug form |
title | Name of the dataset in title case |
description | Brief description of the dataset |
is_beta | Beta status of the dataset. |
terms_of_service | URL of the Open Data Blend terms of service |
contributors | Details of the dataset contributors |
theme | Theme of the dataset which is used to group like themed datasets together. |
homepage | URL for the dataset homepage |
image | URL for the dataset icon |
updated | Last date that the dataset was updated |
path | Location of the dataset metadata |
snapshot_path | Location of the specific version of the dataset metadata |
keywords | Keywords that describe the dataset |
blend_classes | Blend Class of the dataset. Indicates the maturity of the dataset in terms of refinement. |
resources | List of data files in the dataset. See the ‘Resource Properties’ section below for resource object definitions. |
sources | The data sources and information sources that were used to create the dataset |
reuse_ideas | Use cases for the dataset |
showcases | List of showcases that demonstrate how the dataset could be used |
useful_resources | List of useful resources such as relationship diagrams and data source documentation |
related_packages | List of related or relevant datasets |
Resource Properties
Name | Description |
group_name | Name of the logical group of data files. Except for preview data files, all data files with the same group name have the same table schema and contain the same data. |
group_title | Name of the data file group in title-case |
group_description | Description for the data file group |
group_row_count | Row count of each data file in the data file group. Except for preview data files, all data files in the same group will have matching row counts. |
is_preview | Indicates whether the data file is a preview containing only the top 100 rows or contains all of the rows. |
name | Name of the data file in slug form |
path | URL of the data file |
title | Name of the data file in title case |
description | Brief description of the data file |
format | File format of the data file. This will be either |
licenses | Details of the licences under which the data file is made available |
profile | Name of the Data Resource profile. The value will always be 'data-resource' which indicates that it is compatible with the Frictionless Data specification for Data Resources. |
row_count | Number of rows in the data file excluding headers |
updated | Last date that the data file was updated |
schema | Definition of the data file schema. The definition conforms to the Table Schema specification. |
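The resource properties above lend themselves to simple selection logic. The sketch below picks the full (non-preview) data files of a given format for each group, and flags files whose 'row_count' disagrees with 'group_row_count'. The sample metadata is invented for illustration. The 'format' strings and the boolean type of 'is_preview' are assumptions, as are both function names.

```python
from collections import defaultdict

# Hypothetical, abbreviated dataset metadata using the documented property names.
sample_dataset = {
    "name": "open-data-blend-date",
    "resources": [
        {"group_name": "date", "name": "date-preview", "format": "csv",
         "is_preview": True, "row_count": 100, "group_row_count": 5000,
         "path": "https://example.invalid/date.csv"},
        {"group_name": "date", "name": "date-parquet", "format": "parquet",
         "is_preview": False, "row_count": 5000, "group_row_count": 5000,
         "path": "https://example.invalid/date.parquet"},
        {"group_name": "date", "name": "date-orc", "format": "orc",
         "is_preview": False, "row_count": 4999, "group_row_count": 5000,
         "path": "https://example.invalid/date.orc"},
    ],
}


def full_files_by_group(dataset, fmt):
    """Map each group name to the paths of its non-preview files in `fmt`."""
    groups = defaultdict(list)
    for res in dataset.get("resources", []):
        if not res.get("is_preview") and res.get("format") == fmt:
            groups[res["group_name"]].append(res["path"])
    return dict(groups)


def row_count_mismatches(dataset):
    """Names of non-preview files whose row_count differs from group_row_count."""
    return [
        res["name"]
        for res in dataset.get("resources", [])
        if not res.get("is_preview")
        and res.get("row_count") != res.get("group_row_count")
    ]
```

The ORC entry in the sample has a deliberately wrong row count so that `row_count_mismatches` has something to report.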
Data File
The Data File endpoint is the bulk data API and returns the requested data file. Both public (i.e. unauthenticated) and authenticated requests can be made through the same endpoint.
Standard Endpoint
datafile: The path of the data file, excluding the file format extension.
extension: The file format extension of the data file, which can be one of the following:
.csv
.csv.gz
.orc
.parquet
The '.csv' version of a data file is for preview purposes and only contains the first 100 rows.
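The relationship between 'datafile' and 'extension' can be sketched as a pair of helpers that join and split the two parts and validate the extension. This assumes that the two values are simply concatenated. The function names and the sample path are hypothetical.

```python
ALLOWED_EXTENSIONS = (".csv", ".csv.gz", ".orc", ".parquet")
PREVIEW_EXTENSION = ".csv"  # the '.csv' version holds only the first 100 rows


def join_data_file(datafile, extension):
    """Combine the extension-less path with one of the documented extensions."""
    if extension not in ALLOWED_EXTENSIONS:
        raise ValueError(f"unsupported extension: {extension!r}")
    return f"{datafile}{extension}"


def split_data_file(path):
    """Split a data file path into (datafile, extension)."""
    for ext in ALLOWED_EXTENSIONS:
        if path.endswith(ext):
            return path[: -len(ext)], ext
    raise ValueError(f"unrecognised data file extension in: {path!r}")


def is_preview(path):
    """True when the path refers to the 100-row preview version."""
    return split_data_file(path)[1] == PREVIEW_EXTENSION
```

Note that a path ending in '.csv.gz' is not treated as a preview, because it does not end in '.csv'.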
Authenticated Requests
Access keys are used to authenticate data file requests. Without an access key, the number of requests you can make will be limited.
You can provide your access key in one of the following ways:
Using the 'Open-Data-Blend-Access-Key' header with the POST method
Using the 'accesskey' query parameter with the GET method
Using the 'accesskey' body parameter with the POST method
The order of precedence for applying your access key matches the listed order above.
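The stated precedence (header, then query parameter, then body parameter) can be modelled as follows. This is only an illustration of the documented ordering, not the service's implementation, and the function name is hypothetical.

```python
def effective_access_key(header_key=None, query_key=None, body_key=None):
    """Return the key that would take effect under the documented precedence."""
    # Checked in the listed order: header, query parameter, body parameter.
    for key in (header_key, query_key, body_key):
        if key:
            return key
    return None  # no key supplied: the request is treated as unauthenticated
```

For example, if a key is supplied in both the header and the query string, this model selects the header value.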
Below are examples of each authentication method.
Access Key in a Header
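A minimal sketch using Python's standard library is shown below. The URL passed in is up to the caller; the `example.invalid` address in the usage note is a placeholder, not the real endpoint, and the function name is hypothetical. The request is only constructed here, not sent.

```python
import urllib.request


def header_request(url, access_key):
    """Build a POST request carrying the access key in a header."""
    # urllib normalises the capitalisation of header names when storing them;
    # HTTP header names are case-insensitive, so this is harmless.
    return urllib.request.Request(
        url,
        method="POST",
        headers={"Open-Data-Blend-Access-Key": access_key},
    )
```

A request built with `header_request("https://example.invalid/data/date.parquet", "YOUR-ACCESS-KEY")` could then be passed to `urllib.request.urlopen`.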
Access Key in a Query Parameter
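One way to append the 'accesskey' query parameter without disturbing any existing query string is sketched below. The function name is hypothetical and the `example.invalid` address in the usage note is a placeholder, not the real endpoint.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_access_key(url, access_key):
    """Return `url` with an 'accesskey' query parameter appended."""
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append(("accesskey", access_key))
    # urlencode percent-encodes the key so special characters survive.
    return urlunsplit(parts._replace(query=urlencode(query)))
```

The resulting URL can be fetched with a plain GET, for example `with_access_key("https://example.invalid/data/date.parquet", "YOUR-ACCESS-KEY")`. Bear in mind that query strings often end up in logs and browser history, which is a general reason to prefer the header where possible.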
Access Key in the Request Body
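The sketch below builds a POST request with 'accesskey' in the body. The body encoding is not specified in this section, so form encoding (`application/x-www-form-urlencoded`) is an assumption here. The function name is hypothetical, and the request is only constructed, not sent.

```python
import urllib.request
from urllib.parse import urlencode


def body_request(url, access_key):
    """Build a POST request carrying the access key as a body parameter."""
    # Assumption: the service accepts a form-encoded body.
    data = urlencode({"accesskey": access_key}).encode("ascii")
    return urllib.request.Request(
        url,
        data=data,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )
```

If the service expects a different body format, only the `data` and `Content-Type` lines would need to change.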