Dataset Versions

Dataset Immutability

Changes to Open Data Blend Datasets are always published as new dataset versions. Once a dataset version has been created, it is immutable: it never changes.

Every data file change is versioned. An update to a data file always produces a new data file version, and every new data file version results in a new dataset version.

The URLs for data files always point to a specific data file version. This allows you to share these links knowing that they will always download the same data. The dataset version URLs are currently only surfaced through the Open Data Blend Dataset API and are visible in the snapshot_path property.

Example of the Snapshot Path for a Dataset

{
    "profile": "data-package",
    "name": "open-data-blend-prescribing",
    "title": "Prescribing",
    "description": "NHS England prescriptions that have been dispensed in the UK",
    ...
    "snapshot_path": "https://packages.opendatablend.io/v1/open-data-blend-prescribing/20210715T154432Z/datapackage.json",
    ...
}
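
As a minimal sketch, the version identifier can be read back out of a snapshot_path value. This assumes, based only on the example above, that the second-to-last path segment is a UTC timestamp in the form YYYYMMDDTHHMMSSZ. The function name parse_snapshot_timestamp is hypothetical and is not part of any Open Data Blend library.

```python
from datetime import datetime, timezone
from urllib.parse import urlparse


def parse_snapshot_timestamp(snapshot_path: str) -> datetime:
    """Return the version timestamp embedded in a snapshot_path URL.

    Assumption: the second-to-last path segment looks like 20210715T154432Z.
    """
    segments = urlparse(snapshot_path).path.strip("/").split("/")
    if len(segments) < 2:
        raise ValueError(f"Unexpected snapshot_path layout: {snapshot_path}")
    version_segment = segments[-2]
    # strptime raises ValueError if the segment is not in the assumed format.
    parsed = datetime.strptime(version_segment, "%Y%m%dT%H%M%SZ")
    return parsed.replace(tzinfo=timezone.utc)


# The snapshot_path value from the example response above.
snapshot_path = (
    "https://packages.opendatablend.io/v1/open-data-blend-prescribing/"
    "20210715T154432Z/datapackage.json"
)
version = parse_snapshot_timestamp(snapshot_path)
print(version.isoformat())
```

Recording this timestamp alongside an analysis makes it easy to see at a glance which dataset version was used.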

Reproducible Analysis and Research

A pleasant side effect of dataset immutability is that you can produce analysis or research using an Open Data Blend Dataset knowing that others can reproduce the same result later.

Bundling the data with the analysis or research output isn't always practical, especially when the data files are very large. Recording the versioned URLs is a lighter alternative, because they will always download the same data. If you are using the Open Data Blend Dataset UI, you can do the following to get the dataset and data file version URLs:

Click the 'Get metadata' button on the Dataset page.

Then save the Open Data Blend Dataset API response (datapackage.json).

Note that the versioned URL value can be seen in the snapshot_path property.
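
Once the datapackage.json file has been saved, the versioned URLs can be pulled out of it programmatically. The following is a sketch only. It assumes the data files are listed under a resources array whose entries carry a path property (the general Data Package convention), which the truncated example above does not show. The function name read_versioned_urls is hypothetical.

```python
import json
from pathlib import Path


def read_versioned_urls(datapackage_file: str) -> dict:
    """Collect the versioned URLs recorded in a saved datapackage.json."""
    metadata = json.loads(Path(datapackage_file).read_text(encoding="utf-8"))
    data_file_urls = []
    # Assumption: data files are listed under "resources", each with a "path"
    # that is either a single URL or a list of URLs.
    for resource in metadata.get("resources", []):
        path = resource.get("path")
        if isinstance(path, str):
            data_file_urls.append(path)
        elif isinstance(path, list):
            data_file_urls.extend(path)
    return {
        # The dataset version URL, as described in this section.
        "dataset_version_url": metadata["snapshot_path"],
        "data_file_urls": data_file_urls,
    }
```

Calling read_versioned_urls("datapackage.json") on the saved file returns the dataset version URL and any data file URLs it lists, ready to cite in a report or notebook.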

If you are using the Open Data Blend Dataset API, make a dataset request as usual and write the response to a long-term data store for future reference.
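
The API route can be sketched as below. This is an illustration under stated assumptions, not an official client: a local directory stands in for the long-term data store, the file naming scheme is invented for the example, and the version is taken from the second-to-last segment of snapshot_path as in the example response. Both function names are hypothetical.

```python
import json
import urllib.request
from pathlib import Path
from urllib.parse import urlparse


def fetch_metadata(url: str) -> dict:
    """Request dataset metadata and decode the JSON response."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))


def archive_metadata(metadata: dict, archive_dir: str) -> Path:
    """Write the metadata to a file named after the dataset and its version."""
    # Assumption: the version is the second-to-last segment of snapshot_path.
    segments = urlparse(metadata["snapshot_path"]).path.strip("/").split("/")
    version = segments[-2]
    target = Path(archive_dir) / f"{metadata['name']}_{version}_datapackage.json"
    if target.exists():
        # A dataset version is immutable, so an existing copy is left untouched.
        return target
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(json.dumps(metadata, indent=4), encoding="utf-8")
    return target


# Usage (performs a network request, so it is left commented out):
# metadata = fetch_metadata(dataset_request_url)  # the URL you normally request
# print(archive_metadata(metadata, "metadata-archive"))
```

Naming the file after the version means repeated runs against the same dataset version do not create duplicates, while a new dataset version produces a new file.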
