Dataset API

Open Specification

The Open Data Blend Dataset API implements and extends the Frictionless Data Specifications. You can learn more about this in the Frictionless Data Compatibility section.

Terminology such as 'Package' and 'Resource' is inherited from the Frictionless Data specifications. In dataset metadata, 'Package' means 'Dataset' and 'Resource' means 'Data File'. The exception is the Open Data Blend Catalogue metadata, where 'Package' refers to the 'Catalogue' and 'Resource' refers to a 'Dataset'.

High Availability

If you have an Analytics Plan subscription, your usage of the Open Data Blend Dataset API is covered by a service-level agreement (SLA) that includes a 99.5% uptime guarantee. You can check the Status Page for current and historic uptime statuses at any time.

Fair Usage

We regularly monitor usage and take reasonable actions to ensure that the Open Data Blend Dataset API service is used fairly and sustainably. We only impose hard limits on data file endpoints (see the Usage Limits section below).

Usage Limits

A data file can be downloaded without providing an access key. Requests without an access key are limited to 8 per month. Requests that exceed this limit will receive an HTTP 401 ('Unauthorized') response.

We always aim to strike a careful balance between openness, in terms of accessibility, and the ongoing sustainability of the service. Many of our data files can be quite large. Imposing the limit of 8 requests per month on free downloads helps to keep the associated bandwidth and compute costs at a sustainable level.

There is no limit to the number of requests that can be made with a valid access key. All requests above the number included in your subscription will incur an additional cost. The cost per additional data file request can be found on the pricing page.

Preview CSV data files do not count towards the limit.

Rate Limits

All data file requests are limited to 10 requests per minute. Exceeding this limit will result in an HTTP 429 ('Too Many Requests') response being returned for up to an hour.
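One way to stay under the limit of 10 requests per minute is to throttle on the client side. The following is a minimal sketch, not part of the API: the class name, its methods, and the injectable clock are all illustrative assumptions, and only the default values (10 requests, 60 seconds) come from the limit described above.

```python
import time
from collections import deque


class SlidingWindowThrottle:
    """Client-side guard that keeps calls within a per-window request budget."""

    def __init__(self, max_requests=10, window_seconds=60.0, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._clock = clock
        self._sent = deque()  # timestamps of requests inside the current window

    def seconds_until_allowed(self):
        """Return 0.0 if a request may be sent now, else the wait in seconds."""
        now = self._clock()
        # Forget requests that have left the sliding window.
        while self._sent and now - self._sent[0] >= self.window_seconds:
            self._sent.popleft()
        if len(self._sent) < self.max_requests:
            return 0.0
        return self.window_seconds - (now - self._sent[0])

    def record_request(self):
        """Note that a request is being sent now."""
        self._sent.append(self._clock())
```

Before each data file request, a caller could sleep for seconds_until_allowed() and then call record_request(). The server-side limit may be measured differently, so this only reduces the chance of an HTTP 429 response rather than ruling it out.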

API Reference

The Open Data Blend Dataset API is simple and has three types of requests:

  • Catalogue

  • Dataset

  • Data File

All responses are in JSON with the following exceptions:

  • Requests for data files: the backend response is a data file.

  • Requests that result in a pass-through error: the backend response is presented verbatim and could be in XML, for example.
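Because a response body is not always JSON, client code may want to handle both cases. The sketch below is a heuristic only: the function name is hypothetical, and it inspects the body itself because this section does not state which headers accompany each response type.

```python
import json


def decode_response(body: bytes):
    """Return ("json", parsed) when the body is JSON, else ("raw", body).

    Data files and pass-through errors are not JSON, so they are handed
    back untouched for the caller to save or inspect.
    """
    try:
        return "json", json.loads(body.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError):
        return "raw", body
```

For data file requests, where the expected result is already known to be a file, it is simpler to skip this check and write the bytes straight to disk.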

Each request type is described in this section with examples, where applicable.

Request Methods

Unless specified otherwise, requests should be made using the HTTP GET method.

Catalogue

The Catalogue endpoint returns a set of metadata for the published Open Data Blend Datasets. The request can be made using any of three endpoints.

Non-versioned Endpoint

https://packages.opendatablend.io

The non-versioned endpoint is provided for convenience and will always point to the latest versioned endpoint of the Open Data Blend Dataset API. We recommend that you use the standard or Frictionless Data Specification endpoint for any production use. This will ensure that your code is not repointed to a new versioned endpoint if one is introduced in the future.

Standard Endpoint

https://packages.opendatablend.io/v1

Frictionless Data Specification Endpoint

https://packages.opendatablend.io/v1/open-data-blend-catalogue/datapackage.json

Response Example

All variants of the Open Data Blend Catalogue endpoint will return the same JSON response.

{
"profile": "data-package-catalog",
"name": "open-data-blend-catalogue",
"title": "Open Data Blend Catalogue",
"description": "A catalogue of the released Open Data Blend Packages",
"terms_of_service": "https://www.opendatablend.io/terms",
"updated": "2020-11-18T15:45:31Z",
"contributors": [
{
"title": "Open Data Blend Team (Nimble Learn Ltd)",
"path": "https://www.opendatablend.io",
"role": "author"
}
],
"resources": [
{
"name": "open-data-blend-anonymised-mot",
"path": "https://packages.opendatablend.io/v1/open-data-blend-anonymised-mot/datapackage.json",
"title": "Anonymised MOT",
"description": "MOT tests and results since the MOT system was computerised in 2005",
"updated": "2020-10-19T08:17:52Z",
"is_beta": true,
"format": "json",
"image": "https://packages.opendatablend.io/image/automotive.svg",
"theme": {
"name": "automotive",
"title": "Automotive",
"image": "https://packages.opendatablend.io/image/automotive.svg"
},
"blend_classes": [
{
"name": "class-1",
"title": "Class I",
"image": "https://packages.opendatablend.io/image/class-1.svg",
"description": "Carefully structured and optimised for data analytics"
},
{
"name": "class-2",
"title": "Class II",
"image": "https://packages.opendatablend.io/image/class-2.svg",
"description": "Cleansed and enriched using domain knowledge"
},
{
"name": "class-3",
"title": "Class III",
"image": "https://packages.opendatablend.io/image/class-3.svg",
"description": "Supports drilling across with one or more other blends"
}
],
"resource_group_count": 37,
"resource_count": 148
},
{
"name": "open-data-blend-date",
"path": "https://packages.opendatablend.io/v1/open-data-blend-date/datapackage.json",
"title": "Date",
"description": "A collection of sequential dates with several levels of roll-up",
"updated": "2020-10-19T08:17:52Z",
"is_beta": false,
"format": "json",
"image": "https://packages.opendatablend.io/image/temporal.svg",
"theme": {
"name": "temporal",
"title": "Temporal",
"image": "https://packages.opendatablend.io/image/temporal.svg"
},
"blend_classes": [
{
"name": "class-1",
"title": "Class I",
"image": "https://packages.opendatablend.io/image/class-1.svg",
"description": "Carefully structured and optimised for data analytics"
},
{
"name": "class-3",
"title": "Class III",
"image": "https://packages.opendatablend.io/image/class-3.svg",
"description": "Supports drilling across with one or more other blends"
}
],
"resource_group_count": 1,
"resource_count": 4
}
]
}
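As an illustration of how the catalogue response might be consumed, the sketch below downloads the catalogue and lists the datasets it describes. The function names are hypothetical; the URL and the property names ('resources', 'name', 'title', 'path', 'is_beta') are taken from the response example above.

```python
import json
import urllib.request

CATALOGUE_URL = "https://packages.opendatablend.io/v1/open-data-blend-catalogue/datapackage.json"


def fetch_catalogue(url=CATALOGUE_URL):
    """Download and parse the catalogue metadata (performs a network call)."""
    with urllib.request.urlopen(url) as response:
        return json.load(response)


def list_datasets(catalogue, include_beta=True):
    """Return (name, title, path) for each dataset listed in the catalogue."""
    return [
        (resource["name"], resource["title"], resource["path"])
        for resource in catalogue["resources"]
        if include_beta or not resource["is_beta"]
    ]
```

Calling list_datasets(fetch_catalogue(), include_beta=False) would return only the datasets marked as stable.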

Metadata Definitions

Open Data Blend Catalogue metadata is split into two main parts:

  • Package Properties: The collection of properties at the root level that describe the catalogue.

  • Resource Properties: The collection of properties within the 'resources' property that describe the published datasets.

Package Properties

Name

Description

profile

Name of the Data Package profile. The value will always be 'data-package-catalog' which indicates that it is compatible with the Frictionless Data pattern for Data Package Catalogues.

name

Name of the catalogue in slug form

title

Name of the catalogue in title case

description

Brief description of the catalogue

terms_of_service

URL of the Open Data Blend terms of service

updated

Last date that the catalogue was updated

contributors

Details of the catalogue contributors

resources

List of published datasets and their properties. See the ‘Resource Properties’ section below for resource object definitions.

Resource Properties

Name

Description

name

Name of the dataset in slug form

path

URL of the dataset metadata

title

Name of the dataset in title case

description

Brief description of the dataset

updated

Last date that the dataset was updated

is_beta

Beta status of the dataset. true means in beta and false means stable

format

File format of the dataset metadata. This is always 'json'.

image

URL of the dataset icon

theme

Theme of the dataset, which is used to group similarly themed datasets together

blend_classes

Blend Classes of the dataset. These indicate the maturity of the dataset in terms of refinement.

resource_group_count

Number of data file groups. Indicates the number of logical tables.

resource_count

Number of data files.

Dataset

The Dataset endpoint returns the complete set of metadata for an Open Data Blend Dataset. The request can be made using either of two endpoints. In both endpoint variants, name is the name of the dataset in slug form, e.g. 'open-data-blend-date'.

Standard Endpoint

Request
https://packages.opendatablend.io/v1/:name
Example
https://packages.opendatablend.io/v1/open-data-blend-date

Frictionless Data Specification Endpoint

Request
https://packages.opendatablend.io/v1/:name/datapackage.json
Example
https://packages.opendatablend.io/v1/open-data-blend-date/datapackage.json
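The two Dataset endpoint variants differ only in the trailing 'datapackage.json'. A small helper along the following lines could build either URL from a dataset slug; the function name and its 'frictionless' flag are illustrative and not part of the API.

```python
BASE_URL = "https://packages.opendatablend.io/v1"


def dataset_url(name, frictionless=True):
    """Build a Dataset endpoint URL from a dataset slug."""
    url = f"{BASE_URL}/{name}"
    # The Frictionless Data Specification variant appends datapackage.json.
    return f"{url}/datapackage.json" if frictionless else url
```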

Response Example

The response example below has been abbreviated for clarity.

{
"profile": "data-package",
"name": "open-data-blend-date",
"title": "Date",
"description": "A collection of sequential dates with several levels of roll-up",
"is_beta": false,
"terms_of_service": "https://www.opendatablend.io/terms",
"contributors": [
{
"title": "Open Data Blend Team (Nimble Learn Ltd)",
"path": "https://www.opendatablend.io",
"role": "author"
}
],
"theme": {
"name": "temporal",
"title": "Temporal",
"image": "https://packages.opendatablend.io/image/temporal.svg"
},
"homepage": "https://www.opendatablend.io/package/?name=open-data-blend-date",
"image": "https://packages.opendatablend.io/image/temporal.svg",
"updated": "2021-02-12T23:56:39Z",
"path": "https://packages.opendatablend.io/v1/open-data-blend-date/datapackage.json",
"snapshot_path": "https://packages.opendatablend.io/v1/open-data-blend-date/20210212T235639Z/datapackage.json",
"keywords": [
"Time Intelligence",
"Trend Analysis"
],
"blend_classes": [
{
"name": "class-1",
"title": "Class I",
"image": "https://packages.opendatablend.io/image/class-1.svg",
"description": "Carefully structured and optimised for data analytics"
},
{
"name": "class-3",
"title": "Class III",
"image": "https://packages.opendatablend.io/image/class-3.svg",
"description": "Supports drilling across with one or more other blends"
}
],
"resources": [
{
"group_name": "date",
"group_title": "Date",
"group_description": "Collection of sequential dates starting from the 1800s and running well into the future, with several levels of roll-up",
"group_row_count": 80720,
"is_preview": true,
"name": "date-csv",
"path": "https://packages.opendatablend.io/data/dimension/date/20201018T195237Z/date_preview.csv",
"title": "Date (.csv)",
"description": "Collection of sequential dates starting from the 1800s and running well into the future, with several levels of roll-up",
"format": "csv",
"licenses": [
{
"name": "ODbL-1.0",
"title": "Open Data Commons Open Database License 1.0",
"path": "https://www.opendatablend.io/open-database-licence"
}
],
"profile": "data-resource",
"row_count": 100,
"updated": "2020-10-18T19:52:37Z",
"schema": {
"fields": [
{
"name": "nll_licence_code",
"type": "string",
"format": "default",
"title": "Licence Code",
"description": "Open Definition licence ID (http://licenses.opendefinition.org/)."
},
{
"name": "nll_licence_name",
"type": "string",
"format": "default",
"title": "Licence Name",
"description": "Open Definition Licence name."
},
{
"name": "nll_licence_url",
"type": "string",
"format": "default",
"title": "Licence URL",
"description": "URL where the licence details can be found."
},
{
"name": "drv_date_key",
"type": "integer",
"format": "default",
"title": "Date Key",
"description": "Primary key."
},
{
"name": "drv_date",
"type": "string",
"format": "default",
"title": "Date",
"description": "Date in ISO format yyyy-mm-dd. The value is always a valid date."
},
...
]
}
},
{
"group_name": "date",
"group_title": "Date",
"group_description": "Collection of sequential dates starting from the 1800s and running well into the future, with several levels of roll-up",
"group_row_count": 80720,
"is_preview": false,
"name": "date-csv-gz",
"path": "https://packages.opendatablend.io/data/dimension/date/20201018T195237Z/date.csv.gz",
"title": "Date (.csv.gz)",
"description": "Collection of sequential dates starting from the 1800s and running well into the future, with several levels of roll-up",
"format": "csv",
"compression": "gz",
"licenses": [
{
"name": "ODbL-1.0",
"title": "Open Data Commons Open Database License 1.0",
"path": "https://www.opendatablend.io/open-database-licence"
}
],
"profile": "data-resource",
"row_count": 80720,
"updated": "2020-10-18T19:52:37Z",
"schema": {
"fields": [
{
"name": "nll_licence_code",
"type": "string",
"format": "default",
"title": "Licence Code",
"description": "Open Definition licence ID (http://licenses.opendefinition.org/)."
},
{
"name": "nll_licence_name",
"type": "string",
"format": "default",
"title": "Licence Name",
"description": "Open Definition Licence name."
},
{
"name": "nll_licence_url",
"type": "string",
"format": "default",
"title": "Licence URL",
"description": "URL where the licence details can be found."
},
{
"name": "drv_date_key",
"type": "integer",
"format": "default",
"title": "Date Key",
"description": "Primary key."
},
{
"name": "drv_date",
"type": "string",
"format": "default",
"title": "Date",
"description": "Date in ISO format yyyy-mm-dd. The value is always a valid date."
},
...
]
}
},
{
"group_name": "date",
"group_title": "Date",
"group_description": "Collection of sequential dates starting from the 1800s and running well into the future, with several levels of roll-up",
"group_row_count": 80720,
"is_preview": false,
"name": "date-orc",
"path": "https://packages.opendatablend.io/data/dimension/date/20201018T195237Z/date.orc",
"title": "Date (.orc)",
"description": "Collection of sequential dates starting from the 1800s and running well into the future, with several levels of roll-up",
"format": "orc",
"licenses": [
{
"name": "ODbL-1.0",
"title": "Open Data Commons Open Database License 1.0",
"path": "https://www.opendatablend.io/open-database-licence"
}
],
"profile": "data-resource",
"row_count": 80720,
"updated": "2020-10-18T19:52:37Z",
"schema": {
"fields": [
{
"name": "nll_licence_code",
"type": "string",
"format": "default",
"title": "Licence Code",
"description": "Open Definition licence ID (http://licenses.opendefinition.org/)."
},
{
"name": "nll_licence_name",
"type": "string",
"format": "default",
"title": "Licence Name",
"description": "Open Definition Licence name."
},
{
"name": "nll_licence_url",
"type": "string",
"format": "default",
"title": "Licence URL",
"description": "URL where the licence details can be found."
},
{
"name": "drv_date_key",
"type": "integer",
"format": "default",
"title": "Date Key",
"description": "Primary key."
},
{
"name": "drv_date",
"type": "string",
"format": "default",
"title": "Date",
"description": "Date in ISO format yyyy-mm-dd. The value is always a valid date."
},
...
]
}
},
{
"group_name": "date",
"group_title": "Date",
"group_description": "Collection of sequential dates starting from the 1800s and running well into the future, with several levels of roll-up",
"group_row_count": 80720,
"is_preview": false,
"name": "date-parquet",
"path": "https://packages.opendatablend.io/data/dimension/date/20201018T195237Z/date.parquet",
"title": "Date (.parquet)",
"description": "Collection of sequential dates starting from the 1800s and running well into the future, with several levels of roll-up",
"format": "parquet",
"licenses": [
{
"name": "ODbL-1.0",
"title": "Open Data Commons Open Database License 1.0",
"path": "https://www.opendatablend.io/open-database-licence"
}
],
"profile": "data-resource",
"row_count": 80720,
"updated": "2020-10-18T19:52:37Z",
"schema": {
"fields": [
{
"name": "nll_licence_code",
"type": "string",
"format": "default",
"title": "Licence Code",
"description": "Open Definition licence ID (http://licenses.opendefinition.org/)."
},
{
"name": "nll_licence_name",
"type": "string",
"format": "default",
"title": "Licence Name",
"description": "Open Definition Licence name."
},
{
"name": "nll_licence_url",
"type": "string",
"format": "default",
"title": "Licence URL",
"description": "URL where the licence details can be found."
},
{
"name": "drv_date_key",
"type": "integer",
"format": "default",
"title": "Date Key",
"description": "Primary key."
},
{
"name": "drv_date",
"type": "string",
"format": "default",
"title": "Date",
"description": "Date in ISO format yyyy-mm-dd. The value is always a valid date."
},
...
]
}
}
],
"sources": [
{
"name": "open-data-blend",
"title": "Open Data Blend",
"path": "https://www.opendatablend.io/about-us"
}
],
"reuse_ideas": [
"Time Intelligence",
"Trend Analysis"
],
"showcases": [
{
"name": "placeholder-showcase-1",
"title": "Placeholder Showcase 1",
"image": "https://packages.opendatablend.io/image/showcase-placeholder.jpg",
"description": "Placeholder showcase 1 caption."
},
{
"name": "placeholder-showcase-2",
"title": "Placeholder Showcase 2",
"image": "https://packages.opendatablend.io/image/showcase-placeholder.jpg",
"description": "Placeholder showcase 2 caption."
},
{
"name": "placeholder-showcase-3",
"title": "Placeholder Showcase 3",
"image": "https://packages.opendatablend.io/image/showcase-placeholder.jpg",
"description": "Placeholder showcase 3 caption."
}
],
"useful_resources": [
{
"name": "open-data-blend-feedback",
"title": "Open Data Blend Feedback",
"path": "https://github.com/opendatablend/feedback",
"description": "Report data quality issues, provide feedback, share domain knowledge, and suggest new datasets."
},
{
"name": "open-data-blend-help-centre",
"title": "Open Data Blend Help Centre",
"path": "https://www.opendatablend.io/help-centre",
"description": "Covers topics/questions relating to Open Data Blend including etiquette, licensing, and tooling."
}
],
"related_packages": [
{
"name": "open-data-blend-age",
"title": "Open Data Blend Age",
"homepage": "https://www.opendatablend.io/dataset?name=open-data-blend-age",
"path": "https://packages.opendatablend.io/v1/open-data-blend-age/datapackage.json"
},
{
"name": "open-data-blend-mileage",
"title": "Open Data Blend Mileage",
"homepage": "https://www.opendatablend.io/dataset?name=open-data-blend-mileage",
"path": "https://packages.opendatablend.io/v1/open-data-blend-mileage/datapackage.json"
},
{
"name": "open-data-blend-time-of-day",
"title": "Open Data Blend Time of Day",
"homepage": "https://www.opendatablend.io/dataset?name=open-data-blend-time-of-day",
"path": "https://packages.opendatablend.io/v1/open-data-blend-time-of-day/datapackage.json"
}
]
}
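A common task with the dataset response is picking out the full data files and ignoring the previews. The sketch below shows one possible approach using the 'is_preview' and 'format' properties from the response example above; the function name is hypothetical.

```python
def full_data_files(dataset, file_format=None):
    """Return non-preview resources, optionally restricted to one format."""
    return [
        resource
        for resource in dataset["resources"]
        if not resource["is_preview"]
        and (file_format is None or resource["format"] == file_format)
    ]
```

For example, full_data_files(dataset, "parquet") would return only the Parquet resource, whose 'path' property holds the download URL.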

Metadata Definitions

Open Data Blend Dataset metadata is split into two main parts:

  • Package Properties: The collection of properties at the root level that describe the dataset.

  • Resource Properties: The collection of properties within the 'resources' property that describe the data files.

Package Properties

Name

Description

profile

Name of the Data Package profile. The value will always be 'data-package' which indicates that it is compatible with the Frictionless Data specification for Data Packages.

name

Name of the dataset in slug form

title

Name of the dataset in title case

description

Brief description of the dataset

is_beta

Beta status of the dataset. true means in beta and false means stable

terms_of_service

URL of the Open Data Blend terms of service

contributors

Details of the dataset contributors

theme

Theme of the dataset, which is used to group similarly themed datasets together

homepage

URL for the dataset homepage

image

URL for the dataset icon

updated

Last date that the dataset was updated

path

Location of the dataset metadata

snapshot_path

Location of the specific version of the dataset metadata

keywords

Keywords that describe the dataset

blend_classes

Blend Classes of the dataset. These indicate the maturity of the dataset in terms of refinement.

resources

List of data files in the dataset. See the ‘Resource Properties’ section below for resource object definitions.

sources

The data sources and information sources that were used to create the dataset

reuse_ideas

Use cases for the dataset

showcases

List of showcases that demonstrate how the dataset could be used

useful_resources

List of useful resources such as relationship diagrams and data source documentation

related_packages

List of related or relevant datasets

Resource Properties

Name

Description

group_name

Name of the logical group of data files. Except for preview data files, all data files with the same group name have the same table schema and contain the same data.

group_title

Name of the data file group in title case

group_description

Description for the data file group

group_row_count

Row count of each data file in the data file group. Except for preview data files, all data files in the same group will have matching row counts.

is_preview

Indicates whether the data file is a preview containing only the top 100 rows or contains all of the rows. true means the data file is a preview and false means the data file has all the rows.

name

Name of the data file in slug form

path

URL of the data file

title

Name of the data file in title case

description

Brief description of the data file

format

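File format of the data file. This will be 'csv', 'orc', or 'parquet'. Gzip-compressed CSV data files have the format 'csv' together with a separate 'compression' property set to 'gz'.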

licenses

Details of the licences that the data file has been licensed with

profile

Name of the Data Resource profile. The value will always be 'data-resource' which indicates that it is compatible with the Frictionless Data specification for Data Resources.

row_count

Number of rows in the data file excluding headers

updated

Last date that the data file was updated

schema

Definition of the data file schema. The definition conforms to the Table Schema specification.
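To illustrate how the 'schema' property might be used, the sketch below maps each field name to a Python type. The mapping table and function name are assumptions: only 'string' and 'integer' appear in the response example, 'number' and 'boolean' are other Table Schema types, and falling back to str for anything else is a simplification.

```python
# Assumed mapping from Table Schema field types to Python types; only the
# types that appear in the response example are certain to occur.
PYTHON_TYPES = {"string": str, "integer": int, "number": float, "boolean": bool}


def column_types(resource):
    """Map each field name in a resource's schema to a Python type."""
    return {
        field["name"]: PYTHON_TYPES.get(field["type"], str)
        for field in resource["schema"]["fields"]
    }
```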

Data File

The Data File endpoint returns the data file that has been requested. Both unauthenticated and authenticated requests can be made through the same endpoint.

Standard Endpoint

Request
https://packages.opendatablend.io/v1/:datafile:extension
Example
https://packages.opendatablend.io/v1/data/dimension/date/20201018T195237Z/date.parquet

datafile: The path of the data file, excluding the file format extension.

extension: The file format extension of the data file, which can be one of the following:

  • .csv

  • .csv.gz

  • .orc

  • .parquet

The '.csv' version of a data file is only for preview purposes and only contains the top 100 rows.
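The datafile and extension parts can be separated with a simple suffix check. This is a minimal sketch with hypothetical function names; the extensions are the four listed above, and the preview check relies on the statement that the '.csv' version is the preview.

```python
# The four file format extensions listed above.
EXTENSIONS = (".csv.gz", ".parquet", ".orc", ".csv")


def split_data_file(path):
    """Split a data file path into its (datafile, extension) parts."""
    for extension in EXTENSIONS:
        if path.endswith(extension):
            return path[: -len(extension)], extension
    raise ValueError(f"unsupported data file extension: {path}")


def is_preview(path):
    """Plain .csv files are the 100-row previews."""
    return split_data_file(path)[1] == ".csv"
```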

Authenticated Requests

Access keys are used to authenticate data file requests. Without an access key, the number of requests you can make will be limited.

You can provide your access key in one of the following ways:

  1. Using the Open-Data-Blend-Access-Key header with the POST method

  2. Using the accesskey query parameter with the GET method

  3. Using the accesskey body parameter with the POST method

If you provide your access key in more than one way, the order of precedence matches the order listed above.

Below are examples of each authentication method.

Access Key in a Header

Open-Data-Blend-Access-Key: YOUR_ACCESS_KEY

Access Key in a Query Parameter

https://packages.opendatablend.io/data/dimension/date/20201018T195237Z/date.parquet?accesskey=YOUR_ACCESS_KEY

Access Key in the Request Body

{
"accesskey" : "YOUR_ACCESS_KEY"
}
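The three authentication methods could be prepared in Python roughly as follows. This is a sketch using only the standard library: the function name and its 'how' parameter are hypothetical, and sending the body as JSON with an 'application/json' Content-Type is an assumption based on the request body example above. The function builds the request without sending it.

```python
import json
import urllib.parse
import urllib.request


def build_request(url, access_key, how="header"):
    """Build (but do not send) an authenticated data file request."""
    if how == "header":
        return urllib.request.Request(
            url, headers={"Open-Data-Blend-Access-Key": access_key}, method="POST"
        )
    if how == "query":
        separator = "&" if "?" in url else "?"
        query = urllib.parse.urlencode({"accesskey": access_key})
        return urllib.request.Request(url + separator + query, method="GET")
    if how == "body":
        # Assumption: the body is sent as JSON with a matching Content-Type.
        body = json.dumps({"accesskey": access_key}).encode("utf-8")
        return urllib.request.Request(
            url, data=body, headers={"Content-Type": "application/json"}, method="POST"
        )
    raise ValueError(f"unknown authentication method: {how}")
```

The returned object can be passed to urllib.request.urlopen() to perform the download. The header variant keeps the access key out of URLs, which are more likely to end up in logs.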