Dataset API
The Open Data Blend Dataset API implements and extends the Frictionless Data specifications.
Terminology such as 'Packages' and 'Resources' is inherited from the Frictionless Data specifications. You can loosely translate these terms as referring to dataset and data file metadata, respectively.
If you have a subscription, your usage of the Open Data Blend Dataset API is covered by a service level agreement that includes a 99.5% uptime guarantee. You can check the status page for current and historic uptime statuses at any time.
We regularly monitor usage and take reasonable actions to ensure that the Open Data Blend Dataset API service is used fairly and sustainably. We only impose hard limits on data file endpoints (see the Usage Limits section below).
A data file can be downloaded without providing an access key. Requests without an access key are limited to 15 per month. Requests that exceed this limit will receive an HTTP 401 ('Unauthorized') response.
All data file requests are limited to 30 requests per minute. Exceeding this limit will result in an HTTP 429 ('Too Many Requests') response being returned for up to an hour.
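One way to stay within the per-minute limit is to throttle requests on the client side. The following is a minimal sketch rather than an official client: the class `MinuteRateLimiter` and the helper `describe_limit_error` are hypothetical names, and the sliding-window approach is only one possible design.

```python
import time
from collections import deque


class MinuteRateLimiter:
    """Client-side sliding-window limiter (hypothetical helper, not an official client)."""

    def __init__(self, max_requests=30, window_seconds=60.0, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock
        self._sent = deque()  # timestamps of requests still inside the window

    def seconds_until_allowed(self):
        """Return 0.0 if a request may be sent now, otherwise the wait in seconds."""
        now = self.clock()
        # Discard timestamps that have aged out of the window.
        while self._sent and now - self._sent[0] >= self.window_seconds:
            self._sent.popleft()
        if len(self._sent) < self.max_requests:
            return 0.0
        return self.window_seconds - (now - self._sent[0])

    def record_request(self):
        """Call this immediately after sending a data file request."""
        self._sent.append(self.clock())


def describe_limit_error(status_code):
    """Map the limit-related status codes described above to a short explanation."""
    if status_code == 401:
        return "unauthenticated monthly limit exceeded; an access key is needed"
    if status_code == 429:
        return "per-minute limit exceeded; requests may be rejected for up to an hour"
    return None
```

Before each data file request, a caller would sleep for `seconds_until_allowed()` seconds and then call `record_request()` once the request has been sent.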
The Open Data Blend Dataset API is simple and has three types of requests:
Catalogue
Dataset
Data File
All responses are in JSON with the following exceptions:
Requests for data files: the backend response is a data file.
Requests that result in a pass-through error: the backend response is returned verbatim and may be in another format, such as XML.
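Because not every response is JSON, client code may want to check the response type before parsing it. The sketch below assumes that JSON responses carry a `Content-Type` of `application/json` (this page does not state that) and uses a hypothetical helper name, `decode_response`.

```python
import json


def decode_response(content_type, body):
    """Decode a response body according to its Content-Type header.

    Returns ("json", parsed_object) for JSON bodies and ("raw", body_bytes)
    for anything else, such as a data file or a pass-through XML error.
    Assumption: JSON responses are labelled application/json.
    """
    media_type = (content_type or "").split(";")[0].strip().lower()
    if media_type == "application/json" or media_type.endswith("+json"):
        return "json", json.loads(body.decode("utf-8"))
    return "raw", body
```

Returning the raw bytes for non-JSON bodies leaves the caller free to write a data file to disk or to log a pass-through error as it was received.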
Each request type is described in this section with examples, where applicable.
Unless specified otherwise, requests should be made using the HTTP GET method.
The Catalogue endpoint returns a set of metadata for the published Open Data Blend Datasets. The request can be made using any of three endpoints.
The non-versioned endpoint is provided for convenience and will always point to the latest versioned endpoint of the Open Data Blend Dataset API. We recommend that you use the standard or Frictionless Data Specification endpoint for any production use. This will ensure that your code is not repointed to a new versioned endpoint if one is introduced in the future.
All variants of the Open Data Blend Catalogue endpoint will return the same JSON response.
Open Data Blend Catalogue metadata is split into two main parts:
Package Properties: The collection of properties at the root level that describe the catalogue.
Resource Properties: The collection of properties within the 'resources' property that describe the published datasets.
Package Properties
| Name | Description |
| --- | --- |
| profile | Name of the metadata profile that the catalogue conforms to |
| name | Name of the catalogue in slug form |
| title | Name of the catalogue in title case |
| description | Brief description of the catalogue |
| terms_of_service | URL of the Open Data Blend terms of service |
| updated | Last date that the catalogue was updated |
| contributors | Details of the catalogue contributors |
| resources | List of published datasets and their properties. See the 'Resource Properties' section below for resource object definitions. |
Resource Properties
| Name | Description |
| --- | --- |
| name | Name of the dataset in slug form |
| path | URL of the dataset metadata |
| title | Name of the dataset in title case |
| description | Brief description of the dataset |
| updated | Last date that the dataset was updated |
| is_beta | Beta status of the dataset: true means in beta and false means stable |
| format | File format of the dataset metadata. This is always .json. |
| image | URL of the dataset icon |
| theme | Theme of the dataset, used to group like-themed datasets together |
| blend_classes | Blend Class of the dataset. Indicates the maturity of the dataset in terms of refinement. |
| resource_group_count | Number of data file groups. Indicates the number of logical tables. |
| resource_count | Number of data files |
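To illustrate how the catalogue's resource properties might be used, the sketch below filters a catalogue for stable datasets and collects their metadata URLs. The property names come from the tables above, but the sample values, the `example.invalid` URLs, and the function name `stable_dataset_paths` are made up for the example.

```python
import json

# Illustrative catalogue fragment: property names follow the tables above,
# but every value here is invented for the example.
SAMPLE_CATALOGUE = json.loads("""
{
  "name": "example-catalogue",
  "resources": [
    {"name": "example-dataset-a", "path": "https://example.invalid/a.json",
     "is_beta": false, "theme": "transport", "resource_count": 8},
    {"name": "example-dataset-b", "path": "https://example.invalid/b.json",
     "is_beta": true, "theme": "health", "resource_count": 4}
  ]
}
""")


def stable_dataset_paths(catalogue):
    """Return {dataset name: metadata URL} for datasets that are not in beta."""
    return {
        resource["name"]: resource["path"]
        for resource in catalogue.get("resources", [])
        if not resource.get("is_beta", False)
    }
```

Each returned URL is the `path` of a dataset, which is where that dataset's own metadata would be requested.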
The Dataset endpoint returns the complete set of metadata for an Open Data Blend Dataset. The request can be made using either of two endpoints. In both endpoint variants, 'name' is the name of the dataset in slug form, e.g. 'open-data-blend-date'.
Open Data Blend Dataset metadata is split into two main parts:
Package Properties: The collection of properties at the root level that describe the dataset.
Resource Properties: The collection of properties within the 'resources' property that describe the data files.
Package Properties
| Name | Description |
| --- | --- |
| profile | Name of the metadata profile that the dataset conforms to |
| name | Name of the dataset in slug form |
| title | Name of the dataset in title case |
| description | Brief description of the dataset |
| is_beta | Beta status of the dataset: true means in beta and false means stable |
| terms_of_service | URL of the Open Data Blend terms of service |
| contributors | Details of the dataset contributors |
| theme | Theme of the dataset, used to group like-themed datasets together |
| homepage | URL of the dataset homepage |
| image | URL of the dataset icon |
| updated | Last date that the dataset was updated |
| path | Location of the dataset metadata |
| snapshot_path | Location of the specific version of the dataset metadata |
| keywords | Keywords that describe the dataset |
| blend_classes | Blend Class of the dataset. Indicates the maturity of the dataset in terms of refinement. |
| resources | List of data files in the dataset. See the 'Resource Properties' section below for resource object definitions. |
| sources | The data sources and information sources that were used to create the dataset |
| reuse_ideas | Use cases for the dataset |
| showcases | List of showcases that demonstrate how the dataset could be used |
| useful_resources | List of useful resources such as relationship diagrams and data source documentation |
| related_packages | List of related or relevant datasets |
Resource Properties
| Name | Description |
| --- | --- |
| group_name | Name of the logical group of data files. Except for preview data files, all data files with the same group name have the same table schema and contain the same data. |
| group_title | Name of the data file group in title case |
| group_description | Description of the data file group |
| group_row_count | Row count of each data file in the data file group. Except for preview data files, all data files in the same group will have matching row counts. |
| is_preview | Indicates whether the data file is a top 100 row preview or contains all of the rows: true means the data file is a preview and false means it has all the rows |
| name | Name of the data file in slug form |
| path | URL of the data file |
| title | Name of the data file in title case |
| description | Brief description of the data file |
| format | File format of the data file. This will be either .csv, .csv.gz, .orc, or .parquet. |
| licenses | Details of the licences under which the data file has been licensed |
| profile | Name of the metadata profile that the data file conforms to |
| row_count | Number of rows in the data file, excluding headers |
| updated | Last date that the dataset was updated |
| schema | Definition of the data file schema |
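Since each group contains the same table in several formats, plus preview files, a client usually needs to pick one data file per group. The sketch below shows one way to do that from a dataset's 'resources' list. The function name `pick_data_file`, the sample values, and the assumption that 'format' values include the leading dot are all illustrative.

```python
# Illustrative resources list: property names follow the table above,
# but the values are invented for the example.
SAMPLE_RESOURCES = [
    {"group_name": "date", "name": "date-csv-preview", "format": ".csv",
     "is_preview": True, "path": "https://example.invalid/date-preview.csv"},
    {"group_name": "date", "name": "date-csv", "format": ".csv",
     "is_preview": False, "path": "https://example.invalid/date.csv"},
    {"group_name": "date", "name": "date-parquet", "format": ".parquet",
     "is_preview": False, "path": "https://example.invalid/date.parquet"},
]


def pick_data_file(resources, group_name, file_format=".parquet"):
    """Return the first full (non-preview) data file in a group with the given format.

    Assumption: 'format' values include the leading dot, e.g. '.parquet'.
    Returns None when no data file matches.
    """
    for resource in resources:
        if (
            resource.get("group_name") == group_name
            and resource.get("format") == file_format
            and not resource.get("is_preview", False)
        ):
            return resource
    return None
```

For example, `pick_data_file(SAMPLE_RESOURCES, "date")["path"]` gives the URL of the full Parquet file in the sample list.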
The Data File endpoint is the bulk data API and returns the requested data file. Both public (i.e. unauthenticated) and authenticated requests can be made through the same endpoint.
| Name | Description |
| --- | --- |
| datafile | The path of the data file, excluding the file format extension |
| extension | The file format extension of the data file, which can be one of the following: .csv, .csv.gz, .orc, .parquet |
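The two parameters above can be recovered from a data file path by splitting off one of the four supported extensions. This is a small sketch with a hypothetical helper name, `split_data_file_path`. It does not reproduce the endpoint URL itself, which is not shown here.

```python
ALLOWED_EXTENSIONS = (".csv", ".csv.gz", ".orc", ".parquet")


def split_data_file_path(path):
    """Split a data file path into (datafile, extension).

    Raises ValueError if the path does not end in a supported extension.
    """
    for extension in ALLOWED_EXTENSIONS:
        if path.endswith(extension):
            return path[: -len(extension)], extension
    raise ValueError(f"unsupported data file extension in {path!r}")
```

Validating the extension before making a request avoids spending a rate-limited call on a path that cannot succeed.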
You can provide your access key in one of the following ways:
Using the Open-Data-Blend-Access-Key header with the POST method
Using the accesskey query parameter with the GET method
Using the accesskey body parameter with the POST method
Below are examples of each authentication method.
Access Key in a Header
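A minimal sketch of the header method using Python's standard library. The URL is a hypothetical placeholder, not a real endpoint, and `build_header_request` is an illustrative name. The request is built but not sent.

```python
import urllib.request

# Hypothetical placeholder: substitute the real data file URL.
DATA_FILE_URL = "https://example.invalid/path/to/datafile.parquet"


def build_header_request(url, access_key):
    """Build (but do not send) a POST request carrying the key in a header."""
    return urllib.request.Request(
        url,
        method="POST",
        headers={"Open-Data-Blend-Access-Key": access_key},
    )
```

Passing the returned object to `urllib.request.urlopen` would send the request. Note that `urllib` normalises the capitalisation of header names, which is harmless because HTTP header names are case-insensitive.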
Access Key in a Query Parameter
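A minimal sketch of the query parameter method using Python's standard library. The URL is a hypothetical placeholder and `build_query_request` is an illustrative name. The request is built but not sent.

```python
import urllib.parse
import urllib.request

# Hypothetical placeholder: substitute the real data file URL.
DATA_FILE_URL = "https://example.invalid/path/to/datafile.csv"


def build_query_request(url, access_key):
    """Build (but do not send) a GET request with the key in the query string."""
    parts = urllib.parse.urlsplit(url)
    # Preserve any query parameters already present on the URL.
    query = urllib.parse.parse_qsl(parts.query, keep_blank_values=True)
    query.append(("accesskey", access_key))
    new_url = urllib.parse.urlunsplit(
        parts._replace(query=urllib.parse.urlencode(query))
    )
    return urllib.request.Request(new_url, method="GET")
```

Keys placed in a URL can end up in logs and browser history, so the header or body methods may be preferable where that matters.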
Access Key in the Request Body
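A minimal sketch of the request body method using Python's standard library. The body encoding is not specified here, so this example assumes a form-encoded body, which may not match what the service expects. The URL is a hypothetical placeholder and `build_body_request` is an illustrative name. The request is built but not sent.

```python
import urllib.parse
import urllib.request

# Hypothetical placeholder: substitute the real data file URL.
DATA_FILE_URL = "https://example.invalid/path/to/datafile.orc"


def build_body_request(url, access_key):
    """Build (but do not send) a POST request with the key in the body.

    Assumption: the 'accesskey' body parameter is form-encoded.
    """
    body = urllib.parse.urlencode({"accesskey": access_key}).encode("ascii")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )
```

If the service expects a different encoding, such as JSON, only the `body` and `Content-Type` lines would need to change.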
There is no limit to the number of requests that can be made with a valid access key. All requests above the number included in your subscription will incur an additional cost. The cost per additional data file request can be found on the pricing page.
Catalogue 'profile': Name of the profile. The value will always be 'data-package-catalog', which indicates that the catalogue is compatible with the corresponding Frictionless Data pattern.
Dataset 'profile': Name of the profile. The value will always be 'data-package', which indicates that the dataset is compatible with the corresponding Frictionless Data specification.
Data file 'profile': Name of the profile. The value will always be 'data-resource', which indicates that the data file is compatible with the corresponding Frictionless Data specification.
Data file 'schema': Definition of the data file schema. The definition conforms to the relevant Frictionless Data specification.
Access keys are used to authenticate data file requests. Without an access key, the number of requests you can make will be limited.