# Modelling Conventions

## Star Schemas

All Open Data Blend Datasets are created by using [dimensional modelling techniques](https://www.kimballgroup.com/data-warehouse-business-intelligence-resources/kimball-techniques/dimensional-modeling-techniques/) to optimise the data for data analysis. We carefully transform the source data into facts (i.e. the tables with columns containing measurable values such as the number of prescription items) and dimensions (i.e. tables with the columns containing descriptive values such as the name of a prescribed medicine).

There are a vast number of resources that explain what these modelling practices are and why they exist. Rather than reinventing the wheel, we recommend looking through this comprehensive [glossary](http://www.kimballgroup.com/data-warehouse-business-intelligence-resources/kimball-techniques/dimensional-modeling-techniques) by the Kimball Group.

## Naming Conventions

### Column Prefixes

We use the following column prefixes in our data files:

| Prefix | Description                                 |
| ------ | ------------------------------------------- |
| nlm\_  | Metadata column\*                           |
| nll\_  | Licence information column\*                |
| src\_  | Source column                               |
| srk\_  | Source column that is also a natural key    |
| drv\_  | Derived column\*                            |
| drk\_  | Derived column that is also a natural key\* |

\*The values in these columns either originate from Open Data Blend or are derived from source columns by Open Data Blend.

### Column Names

All column names are in lower-case, alpha-numeric (no special characters), and spaced using underscores. This is to ensure the broadest level of compatibility across all tools and platforms.

### Acronym Handling

Acronyms are expanded to their corresponding word parts except for when the acronym is very well known and non-ambiguous. Even then, it may still be expanded.

### Foreign Key References

Foreign key references are implied. It's safe to assume that whenever two tables each have a column name that starts with `drv_` and ends with `_key` (e.g. `drv_prescribing_practice_key`), these columns are intended to be joined.


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://docs.opendatablend.io/open-data-blend-datasets/modelling-conventions.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
