Dataset API

Open Specification

The Open Data Blend Dataset API implements and extends the Frictionless Data Specifications. You can learn more about this in the Frictionless Data Compatibility section.

Terminology such as 'Package' and 'Resource' is inherited from the Frictionless Data specifications. In dataset metadata, 'Package' means 'Dataset' and 'Resource' means 'Data File'.

The exception is the Open Data Blend Catalogue metadata, where 'Package' refers to the 'Catalogue' and 'Resource' refers to a 'Dataset'.

High Availability

If you have an Analytics Plan subscription, your usage of the Open Data Blend Dataset API is covered by a service-level agreement (SLA) that includes a 99.5% uptime guarantee. You can check the Status Page for current and historical uptime at any time.

Fair Usage

We regularly monitor usage and take reasonable actions to ensure that the Open Data Blend Dataset API service is used fairly and sustainably. We only impose hard limits on data file endpoints (see the Usage Limits section below).

Usage Limits

A data file can be downloaded without providing an access key, but requests made without an access key are limited to 15 per month. Requests that exceed this limit will receive an HTTP 401 ('Unauthorized') response.

We always aim to strike a careful balance between openness, in terms of accessibility, and the ongoing sustainability of the service. Many of our data files are quite large. Imposing the limit of 15 requests per month on free downloads helps to keep the associated bandwidth and compute costs at a sustainable level.

There is no limit to the number of requests that can be made with a valid access key. All requests above the number included in your subscription will incur an additional cost. The cost per additional data file request can be found on the pricing page.

Preview CSV data files do not count towards the limit.
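The free-download rule above can be mirrored on the client so that a script stops before it reaches the limit. The following is a minimal sketch only: the class name `FreeDownloadBudget` and the calendar-month reset are assumptions made for illustration, and the count kept by the service remains authoritative.

```python
from datetime import date

FREE_REQUESTS_PER_MONTH = 15  # limit stated in the Usage Limits section


class FreeDownloadBudget:
    """Client-side tally of downloads made without an access key (hypothetical helper)."""

    def __init__(self) -> None:
        self._month = None  # (year, month) the tally applies to
        self._used = 0

    def try_spend(self, today: date, is_preview: bool = False) -> bool:
        """Return True if a keyless request should still be within the limit."""
        month = (today.year, today.month)
        if month != self._month:
            # Assumption: the allowance resets at the start of each calendar month.
            self._month, self._used = month, 0
        if is_preview:
            return True  # preview CSV files do not count towards the limit
        if self._used >= FREE_REQUESTS_PER_MONTH:
            return False  # a further request would be expected to return HTTP 401
        self._used += 1
        return True
```

A script could call `try_spend(date.today())` before each keyless download and switch to an authenticated request when it returns `False`.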

Rate Limits

Data file requests are limited to 30 per minute. Exceeding this limit will result in an HTTP 429 ('Too Many Requests') response being returned for up to an hour.
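Because exceeding the rate limit can lead to 429 responses for up to an hour, it may be worth throttling on the client side. The sketch below uses a sliding window; the class name `SlidingWindowLimiter` and its methods are hypothetical, and how the service itself measures a minute is not specified here.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `max_calls` calls in any `period` seconds (client-side sketch)."""

    def __init__(self, max_calls: int = 30, period: float = 60.0, clock=time.monotonic):
        self.max_calls = max_calls
        self.period = period
        self._clock = clock
        self._calls = deque()  # timestamps of recent calls, oldest first

    def wait_time(self) -> float:
        """Seconds to wait before the next call is allowed (0.0 if allowed now)."""
        now = self._clock()
        # Forget calls that have left the window.
        while self._calls and now - self._calls[0] >= self.period:
            self._calls.popleft()
        if len(self._calls) < self.max_calls:
            return 0.0
        return self.period - (now - self._calls[0])

    def record(self) -> None:
        """Note that a request has just been sent."""
        self._calls.append(self._clock())
```

Before each data file request, a caller would sleep for `wait_time()` seconds if it is non-zero and then call `record()`. Passing a smaller `max_calls` leaves some headroom below the documented limit.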

API Reference

The Open Data Blend Dataset API is simple and has three types of requests:

  • Catalogue

  • Dataset

  • Data File

All responses are in JSON with the following exceptions:

  • Requests for data files: the backend response is a data file.

  • Requests that result in a pass-through error: the backend response is presented verbatim and may be in another format, such as XML.
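Since a Catalogue or Dataset request can occasionally return a non-JSON error body, client code should not assume that parsing always succeeds. A minimal sketch, assuming the body is UTF-8 text (the function name `decode_body` is hypothetical, and data file responses should be treated as binary instead):

```python
import json


def decode_body(body: bytes):
    """Return parsed JSON when possible, otherwise the raw text unchanged."""
    text = body.decode("utf-8", errors="replace")
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return text  # for example, an XML error passed through from the backend
```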

Each request type is described in this section with examples, where applicable.

Request Methods

Unless specified otherwise, requests should be made using the HTTP GET method.

Catalogue

The Catalogue endpoint returns a set of metadata for the published Open Data Blend Datasets. The request can be made using any of three endpoints.

Non-versioned Endpoint

https://packages.opendatablend.io

The non-versioned endpoint is provided for convenience and will always point to the latest versioned endpoint of the Open Data Blend Dataset API. We recommend that you use the standard or Frictionless Data Specification endpoint for any production use. This will ensure that your code is not repointed to a new versioned endpoint if one is introduced in the future.

Standard Endpoint

https://packages.opendatablend.io/v1

Frictionless Data Specification Endpoint

https://packages.opendatablend.io/v1/open-data-blend-catalogue/datapackage.json

Response Example

All variants of the Open Data Blend Catalogue endpoint will return the same JSON response.

{
    "profile": "data-package-catalog",
    "name": "open-data-blend-catalogue",
    "title": "Open Data Blend Catalogue",
    "description": "A catalogue of the released Open Data Blend Packages",
    "terms_of_service": "https://www.opendatablend.io/terms",
    "updated": "2020-11-18T15:45:31Z",
    "contributors": [
        {
            "title": "Open Data Blend Team (Nimble Learn Ltd)",
            "path": "https://www.opendatablend.io",
            "role": "author"
        }
    ],
    "resources": [
        {
            "name": "open-data-blend-anonymised-mot",
            "path": "https://packages.opendatablend.io/v1/open-data-blend-anonymised-mot/datapackage.json",
            "title": "Anonymised MOT",
            "description": "MOT tests and results since the MOT system was computerised in 2005",
            "updated": "2020-10-19T08:17:52Z",
            "is_beta": true,
            "format": "json",
            "image": "https://packages.opendatablend.io/image/automotive.svg",
            "theme": {
                "name": "automotive",
                "title": "Automotive",
                "image": "https://packages.opendatablend.io/image/automotive.svg"
            },
            "blend_classes": [
                {
                    "name": "class-1",
                    "title": "Class I",
                    "image": "https://packages.opendatablend.io/image/class-1.svg",
                    "description": "Carefully structured and optimised for data analytics"
                },
                {
                    "name": "class-2",
                    "title": "Class II",
                    "image": "https://packages.opendatablend.io/image/class-2.svg",
                    "description": "Cleansed and enriched using domain knowledge"
                },
                {
                    "name": "class-3",
                    "title": "Class III",
                    "image": "https://packages.opendatablend.io/image/class-3.svg",
                    "description": "Supports drilling across with one or more other blends"
                }
            ],
            "resource_group_count": 37,
            "resource_count": 148
        },
        {
            "name": "open-data-blend-date",
            "path": "https://packages.opendatablend.io/v1/open-data-blend-date/datapackage.json",
            "title": "Date",
            "description": "A collection of sequential dates with several levels of roll-up",
            "updated": "2020-10-19T08:17:52Z",
            "is_beta": false,
            "format": "json",
            "image": "https://packages.opendatablend.io/image/temporal.svg",
            "theme": {
                "name": "temporal",
                "title": "Temporal",
                "image": "https://packages.opendatablend.io/image/temporal.svg"
            },
            "blend_classes": [
                {
                    "name": "class-1",
                    "title": "Class I",
                    "image": "https://packages.opendatablend.io/image/class-1.svg",
                    "description": "Carefully structured and optimised for data analytics"
                },
                {
                    "name": "class-3",
                    "title": "Class III",
                    "image": "https://packages.opendatablend.io/image/class-3.svg",
                    "description": "Supports drilling across with one or more other blends"
                }
            ],
            "resource_group_count": 1,
            "resource_count": 4
        }
    ]
}

Metadata Definitions

Open Data Blend Catalogue metadata is split into two main parts:

  • Package Properties: The collection of properties at the root level that describe the catalogue.

  • Resource Properties: The collection of properties within the 'resources' property that describe the published datasets.
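To illustrate the two parts, the sketch below separates the root-level properties from the dataset entries in a trimmed copy of the catalogue response. The helper name `split_catalogue` is hypothetical, and most properties are omitted from the sample for brevity.

```python
import json

# Trimmed copy of the catalogue response shown above (most properties omitted).
CATALOGUE_JSON = """
{
    "name": "open-data-blend-catalogue",
    "title": "Open Data Blend Catalogue",
    "resources": [
        {
            "name": "open-data-blend-anonymised-mot",
            "path": "https://packages.opendatablend.io/v1/open-data-blend-anonymised-mot/datapackage.json",
            "is_beta": true
        },
        {
            "name": "open-data-blend-date",
            "path": "https://packages.opendatablend.io/v1/open-data-blend-date/datapackage.json",
            "is_beta": false
        }
    ]
}
"""


def split_catalogue(catalogue: dict):
    """Separate root-level (package) properties from the dataset entries."""
    package_properties = {k: v for k, v in catalogue.items() if k != "resources"}
    datasets = catalogue.get("resources", [])
    return package_properties, datasets


package_properties, datasets = split_catalogue(json.loads(CATALOGUE_JSON))
# Map each dataset name to its metadata URL, and list the non-beta datasets.
dataset_paths = {d["name"]: d["path"] for d in datasets}
stable_datasets = [d["name"] for d in datasets if not d["is_beta"]]
```

Each value in `dataset_paths` is the Frictionless Data Specification endpoint for that dataset, so it can be requested directly to obtain the dataset metadata.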

Package Properties

Resource Properties

Dataset

The Dataset endpoint returns the complete set of metadata for an Open Data Blend Dataset. The request can be made using either of two endpoints. In both variants, :name is the name of the dataset in slug form, e.g. 'open-data-blend-date'.

Standard Endpoint

https://packages.opendatablend.io/v1/:name

Frictionless Data Specification Endpoint

https://packages.opendatablend.io/v1/:name/datapackage.json
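As a rough illustration of the two variants, the sketch below builds either URL from a dataset slug and fetches the metadata with Python's standard library. The function names and the timeout value are assumptions made for this example.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

BASE_URL = "https://packages.opendatablend.io/v1"


def dataset_url(name: str, frictionless: bool = False) -> str:
    """Build the Dataset endpoint URL for a dataset slug."""
    url = f"{BASE_URL}/{quote(name, safe='')}"
    return f"{url}/datapackage.json" if frictionless else url


def fetch_dataset_metadata(name: str, timeout: float = 30.0) -> dict:
    """GET the dataset metadata and parse it as JSON (performs a network call)."""
    with urlopen(dataset_url(name, frictionless=True), timeout=timeout) as response:
        return json.load(response)
```

For example, `fetch_dataset_metadata("open-data-blend-date")` would request the Frictionless Data Specification endpoint for the Date dataset.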

Response Example

The response example below has been abbreviated for clarity.

{
    "profile": "data-package",
    "name": "open-data-blend-date",
    "title": "Date",
    "description": "A collection of sequential dates with several levels of roll-up",
    "is_beta": false,
    "terms_of_service": "https://www.opendatablend.io/terms",
    "contributors": [
        {
            "title": "Open Data Blend Team (Nimble Learn Ltd)",
            "path": "https://www.opendatablend.io",
            "role": "author"
        }
    ],
    "theme": {
        "name": "temporal",
        "title": "Temporal",
        "image": "https://packages.opendatablend.io/image/temporal.svg"
    },
    "homepage": "https://www.opendatablend.io/package/?name=open-data-blend-date",
    "image": "https://packages.opendatablend.io/image/temporal.svg",
    "updated": "2021-02-12T23:56:39Z",
    "path": "https://packages.opendatablend.io/v1/open-data-blend-date/datapackage.json",
    "snapshot_path": "https://packages.opendatablend.io/v1/open-data-blend-date/20210212T235639Z/datapackage.json",
    "keywords": [
        "Time Intelligence",
        "Trend Analysis"
    ],
    "blend_classes": [
        {
            "name": "class-1",
            "title": "Class I",
            "image": "https://packages.opendatablend.io/image/class-1.svg",
            "description": "Carefully structured and optimised for data analytics"
        },
        {
            "name": "class-3",
            "title": "Class III",
            "image": "https://packages.opendatablend.io/image/class-3.svg",
            "description": "Supports drilling across with one or more other blends"
        }
    ],
    "resources": [
        {
            "group_name": "date",
            "group_title": "Date",
            "group_description": "Collection of sequential dates starting from the 1800s and running well into the future, with several levels of roll-up",
            "group_row_count": 80720,
            "is_preview": true,
            "name": "date-csv",
            "path": "https://packages.opendatablend.io/data/dimension/date/20201018T195237Z/date_preview.csv",
            "title": "Date (.csv)",
            "description": "Collection of sequential dates starting from the 1800s and running well into the future, with several levels of roll-up",
            "format": "csv",
            "licenses": [
                {
                    "name": "ODbL-1.0",
                    "title": "Open Data Commons Open Database License 1.0",
                    "path": "https://www.opendatablend.io/open-database-licence"
                }
            ],
            "profile": "data-resource",
            "row_count": 100,
            "updated": "2020-10-18T19:52:37Z",
            "schema": {
                "fields": [
                    {
                        "name": "nll_licence_code",
                        "type": "string",
                        "format": "default",
                        "title": "Licence Code",
                        "description": "Open Definition licence ID (http://licenses.opendefinition.org/)."
                    },
                    {
                        "name": "nll_licence_name",
                        "type": "string",
                        "format": "default",
                        "title": "Licence Name",
                        "description": "Open Definition Licence name."
                    },
                    {
                        "name": "nll_licence_url",
                        "type": "string",
                        "format": "default",
                        "title": "Licence URL",
                        "description": "URL where the licence details can be found."
                    },
                    {
                        "name": "drv_date_key",
                        "type": "integer",
                        "format": "default",
                        "title": "Date Key",
                        "description": "Primary key."
                    },
                    {
                        "name": "drv_date",
                        "type": "string",
                        "format": "default",
                        "title": "Date",
                        "description": "Date in ISO format yyyy-mm-dd. The value is always a valid date."
                    },
                    ...
                ]
            }
        },
        {
            "group_name": "date",
            "group_title": "Date",
            "group_description": "Collection of sequential dates starting from the 1800s and running well into the future, with several levels of roll-up",
            "group_row_count": 80720,
            "is_preview": false,
            "name": "date-csv-gz",
            "path": "https://packages.opendatablend.io/data/dimension/date/20201018T195237Z/date.csv.gz",
            "title": "Date (.csv.gz)",
            "description": "Collection of sequential dates starting from the 1800s and running well into the future, with several levels of roll-up",
            "format": "csv",
            "compression": "gz",
            "licenses": [
                {
                    "name": "ODbL-1.0",
                    "title": "Open Data Commons Open Database License 1.0",
                    "path": "https://www.opendatablend.io/open-database-licence"
                }
            ],
            "profile": "data-resource",
            "row_count": 80720,
            "updated": "2020-10-18T19:52:37Z",
            "schema": {
                "fields": [
                    {
                        "name": "nll_licence_code",
                        "type": "string",
                        "format": "default",
                        "title": "Licence Code",
                        "description": "Open Definition licence ID (http://licenses.opendefinition.org/)."
                    },
                    {
                        "name": "nll_licence_name",
                        "type": "string",
                        "format": "default",
                        "title": "Licence Name",
                        "description": "Open Definition Licence name."
                    },
                    {
                        "name": "nll_licence_url",
                        "type": "string",
                        "format": "default",
                        "title": "Licence URL",
                        "description": "URL where the licence details can be found."
                    },
                    {
                        "name": "drv_date_key",
                        "type": "integer",
                        "format": "default",
                        "title": "Date Key",
                        "description": "Primary key."
                    },
                    {
                        "name": "drv_date",
                        "type": "string",
                        "format": "default",
                        "title": "Date",
                        "description": "Date in ISO format yyyy-mm-dd. The value is always a valid date."
                    },
                    ...
                ]
            }
        },
        {
            "group_name": "date",
            "group_title": "Date",
            "group_description": "Collection of sequential dates starting from the 1800s and running well into the future, with several levels of roll-up",
            "group_row_count": 80720,
            "is_preview": false,
            "name": "date-orc",
            "path": "https://packages.opendatablend.io/data/dimension/date/20201018T195237Z/date.orc",
            "title": "Date (.orc)",
            "description": "Collection of sequential dates starting from the 1800s and running well into the future, with several levels of roll-up",
            "format": "orc",
            "licenses": [
                {
                    "name": "ODbL-1.0",
                    "title": "Open Data Commons Open Database License 1.0",
                    "path": "https://www.opendatablend.io/open-database-licence"
                }
            ],
            "profile": "data-resource",
            "row_count": 80720,
            "updated": "2020-10-18T19:52:37Z",
            "schema": {
                "fields": [
                    {
                        "name": "nll_licence_code",
                        "type": "string",
                        "format": "default",
                        "title": "Licence Code",
                        "description": "Open Definition licence ID (http://licenses.opendefinition.org/)."
                    },
                    {
                        "name": "nll_licence_name",
                        "type": "string",
                        "format": "default",
                        "title": "Licence Name",
                        "description": "Open Definition Licence name."
                    },
                    {
                        "name": "nll_licence_url",
                        "type": "string",
                        "format": "default",
                        "title": "Licence URL",
                        "description": "URL where the licence details can be found."
                    },
                    {
                        "name": "drv_date_key",
                        "type": "integer",
                        "format": "default",
                        "title": "Date Key",
                        "description": "Primary key."
                    },
                    {
                        "name": "drv_date",
                        "type": "string",
                        "format": "default",
                        "title": "Date",
                        "description": "Date in ISO format yyyy-mm-dd. The value is always a valid date."
                    },
                    ...
                ]
            }
        },
        {
            "group_name": "date",
            "group_title": "Date",
            "group_description": "Collection of sequential dates starting from the 1800s and running well into the future, with several levels of roll-up",
            "group_row_count": 80720,
            "is_preview": false,
            "name": "date-parquet",
            "path": "https://packages.opendatablend.io/data/dimension/date/20201018T195237Z/date.parquet",
            "title": "Date (.parquet)",
            "description": "Collection of sequential dates starting from the 1800s and running well into the future, with several levels of roll-up",
            "format": "parquet",
            "licenses": [
                {
                    "name": "ODbL-1.0",
                    "title": "Open Data Commons Open Database License 1.0",
                    "path": "https://www.opendatablend.io/open-database-licence"
                }
            ],
            "profile": "data-resource",
            "row_count": 80720,
            "updated": "2020-10-18T19:52:37Z",
            "schema": {
                "fields": [
                    {
                        "name": "nll_licence_code",
                        "type": "string",
                        "format": "default",
                        "title": "Licence Code",
                        "description": "Open Definition licence ID (http://licenses.opendefinition.org/)."
                    },
                    {
                        "name": "nll_licence_name",
                        "type": "string",
                        "format": "default",
                        "title": "Licence Name",
                        "description": "Open Definition Licence name."
                    },
                    {
                        "name": "nll_licence_url",
                        "type": "string",
                        "format": "default",
                        "title": "Licence URL",
                        "description": "URL where the licence details can be found."
                    },
                    {
                        "name": "drv_date_key",
                        "type": "integer",
                        "format": "default",
                        "title": "Date Key",
                        "description": "Primary key."
                    },
                    {
                        "name": "drv_date",
                        "type": "string",
                        "format": "default",
                        "title": "Date",
                        "description": "Date in ISO format yyyy-mm-dd. The value is always a valid date."
                    },
                    ...
                ]
            }
        }
    ],
    "sources": [
        {
            "name": "open-data-blend",
            "title": "Open Data Blend",
            "path": "https://www.opendatablend.io/about-us"
        }
    ],
    "reuse_ideas": [
        "Time Intelligence",
        "Trend Analysis"
    ],
    "showcases": [
        {
            "name": "placeholder-showcase-1",
            "title": "Placeholder Showcase 1",
            "image": "https://packages.opendatablend.io/image/showcase-placeholder.jpg",
            "description": "Placeholder showcase 1 caption."
        },
        {
            "name": "placeholder-showcase-2",
            "title": "Placeholder Showcase 2",
            "image": "https://packages.opendatablend.io/image/showcase-placeholder.jpg",
            "description": "Placeholder showcase 2 caption."
        },
        {
            "name": "placeholder-showcase-3",
            "title": "Placeholder Showcase 3",
            "image": "https://packages.opendatablend.io/image/showcase-placeholder.jpg",
            "description": "Placeholder showcase 3 caption."
        }
    ],
    "useful_resources": [
        {
            "name": "open-data-blend-feedback",
            "title": "Open Data Blend Feedback",
            "path": "https://github.com/opendatablend/feedback",
            "description": "Report data quality issues, provide feedback, share domain knowledge, and suggest new datasets."
        },
        {
            "name": "open-data-blend-help-centre",
            "title": "Open Data Blend Help Centre",
            "path": "https://www.opendatablend.io/help-centre",
            "description": "Covers topics/questions relating to Open Data Blend including etiquette, licensing, and tooling."
        }
    ],
    "related_packages": [
        {
            "name": "open-data-blend-age",
            "title": "Open Data Blend Age",
            "homepage": "https://www.opendatablend.io/dataset?name=open-data-blend-age",
            "path": "https://packages.opendatablend.io/v1/open-data-blend-age/datapackage.json"
        },
        {
            "name": "open-data-blend-mileage",
            "title": "Open Data Blend Mileage",
            "homepage": "https://www.opendatablend.io/dataset?name=open-data-blend-mileage",
            "path": "https://packages.opendatablend.io/v1/open-data-blend-mileage/datapackage.json"
        },
        {
            "name": "open-data-blend-time-of-day",
            "title": "Open Data Blend Time of Day",
            "homepage": "https://www.opendatablend.io/dataset?name=open-data-blend-time-of-day",
            "path": "https://packages.opendatablend.io/v1/open-data-blend-time-of-day/datapackage.json"
        }
    ]
}

Metadata Definitions

Open Data Blend Dataset metadata is split into two main parts:

  • Package Properties: The collection of properties at the root level that describe the dataset.

  • Resource Properties: The collection of properties within the 'resources' property that describe the data files.
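A common use of the resource properties is picking the data file to download. The sketch below works on trimmed copies of the resource entries from the response example and relies only on the 'format', 'is_preview', 'name', and 'path' properties; the helper name `find_data_file` is hypothetical.

```python
# Trimmed copies of the resource entries from the response example above.
RESOURCES = [
    {"name": "date-csv", "format": "csv", "is_preview": True,
     "path": "https://packages.opendatablend.io/data/dimension/date/20201018T195237Z/date_preview.csv"},
    {"name": "date-csv-gz", "format": "csv", "compression": "gz", "is_preview": False,
     "path": "https://packages.opendatablend.io/data/dimension/date/20201018T195237Z/date.csv.gz"},
    {"name": "date-orc", "format": "orc", "is_preview": False,
     "path": "https://packages.opendatablend.io/data/dimension/date/20201018T195237Z/date.orc"},
    {"name": "date-parquet", "format": "parquet", "is_preview": False,
     "path": "https://packages.opendatablend.io/data/dimension/date/20201018T195237Z/date.parquet"},
]


def find_data_file(resources, file_format, include_preview=False):
    """Return the first resource with the given format, or None if absent."""
    for resource in resources:
        if resource.get("is_preview", False) and not include_preview:
            continue  # skip the 100-row preview unless it was asked for
        if resource.get("format") == file_format:
            return resource
    return None
```

Here `find_data_file(RESOURCES, "parquet")["path"]` gives the URL of the full Parquet file, while `find_data_file(RESOURCES, "csv")` skips the preview and returns the gzip-compressed CSV entry.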

Package Properties

Resource Properties

Data File

The Data File endpoint is the bulk data API and returns the requested data file. Both public (i.e. unauthenticated) and authenticated requests can be made through the same endpoint.

Standard Endpoint

https://packages.opendatablend.io/v1/:datafile:extension

:datafile is the path of the data file, excluding the file format extension.

:extension is the file format extension of the data file, which can be one of the following:

  • .csv

  • .csv.gz

  • .orc

  • .parquet

The '.csv' version of a data file is for preview purposes and only contains the first 100 rows.
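The split between the data file path and its extension can be sketched as follows. The function names are hypothetical, and the preview check simply applies the rule stated above that the plain '.csv' variant is the preview.

```python
# The four file format extensions listed above.
EXTENSIONS = (".csv.gz", ".parquet", ".orc", ".csv")


def split_data_file(path: str):
    """Split a data file path into (datafile, extension)."""
    for extension in EXTENSIONS:
        if path.endswith(extension):
            return path[: -len(extension)], extension
    raise ValueError(f"unsupported data file extension: {path!r}")


def is_preview(path: str) -> bool:
    """True when the path points at the plain '.csv' (preview) variant."""
    return split_data_file(path)[1] == ".csv"
```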

Authenticated Requests

Access keys are used to authenticate data file requests. Without an access key, the number of requests you can make will be limited.

You can provide your access key in one of the following ways:

  1. Using the Open-Data-Blend-Access-Key header with the POST method

  2. Using the accesskey query parameter with the GET method

  3. Using the accesskey body parameter with the POST method

If the access key is supplied in more than one way, the order of precedence follows the order listed above.

Below are examples of each authentication method.

Access Key in a Header

Open-Data-Blend-Access-Key: YOUR_ACCESS_KEY

Access Key in a Query Parameter

https://packages.opendatablend.io/data/dimension/date/20221022T074643Z/date.csv.gz?accesskey=YOUR_ACCESS_KEY

Access Key in the Request Body

{
    "accesskey" : "YOUR_ACCESS_KEY"
}
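The three methods could be prepared with Python's standard library roughly as follows. This is a sketch under stated assumptions: the function names are hypothetical, the HTTP methods follow the list above, and sending the body with a JSON content type is an assumption rather than a documented requirement. The functions only build the request objects; nothing is sent.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

DATA_FILE_URL = (
    "https://packages.opendatablend.io/data/dimension/date/"
    "20221022T074643Z/date.csv.gz"
)


def request_with_header(url: str, access_key: str) -> Request:
    """Method 1: access key in the Open-Data-Blend-Access-Key header (POST)."""
    return Request(url, headers={"Open-Data-Blend-Access-Key": access_key}, method="POST")


def request_with_query(url: str, access_key: str) -> Request:
    """Method 2: access key in the accesskey query parameter (GET)."""
    return Request(f"{url}?{urlencode({'accesskey': access_key})}", method="GET")


def request_with_body(url: str, access_key: str) -> Request:
    """Method 3: access key in a JSON request body (POST)."""
    body = json.dumps({"accesskey": access_key}).encode("utf-8")
    # Assumption: the body is sent with a JSON content type.
    return Request(url, data=body, headers={"Content-Type": "application/json"}, method="POST")
```

Any of the returned objects can be passed to `urllib.request.urlopen`, with the response streamed to disk. The header and body variants keep the access key out of the URL, which may matter where URLs are logged.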
