Analytics
Last updated
Last updated
Open Data Blend Analytics is a hosted business intelligence service that you connect to from analytical tools such as Excel, Power BI, and Tableau. The catalogue of available data is the same as the Open Data Blend Datasets but limited at most to four years of history plus the latest available year.
Once connected to Open Data Blend Analytics from supported analytical tools, you can instantly analyse the data and build insightful dashboards and reports using convenient drag-and-drop interfaces. There is no need to download data files or write any code.
Analysis Services is the in-memory analytics engine that powers Open Data Blend Analytics. The highly optimised Open Data Blend Analytics Model enables you to flexibly summarise 100s of millions of rows in seconds. This makes it possible to analyse large data volumes interactively.
Most tools that can connect to Analysis Services models will be compatible with Open Data Blend Analytics.
The following short clips demonstrate the Open Data Blend Analytics drag-and-drop experience in three of the most popular BI tools that are supported.
Supported by using Get Data in Excel 2016 and later.
Supported by all recent versions of Power BI Desktop but we recommend using the latest version whenever possible. Power BI Desktop is free.
Supported by Tableau Desktop Professional Edition 10 or later.
The Open Data Blend Analytics Model is a business intelligence semantic data model. It already 'knows' how the tables are related and what meaningful aggregations can be applied. The model hides the complexity of the underlying relationships and calculations, enabling data analysts to ask complex analytical questions through simple client tool interfaces.
There are five significant types of object in the Open Data Blend Analytics Model:
Tables
Columns
Relationships
Hierarchies
Measures
All table names are aligned with the friendly names that can be seen in the corresponding Open Data Blend Datasets. Look-up tables (i.e. dimension tables) always contain the same data as the data files in the corresponding Open Data Blend Dataset.
For tables that contain a set of measurements or observations over time (i.e. fact tables), the amount of history available in the model depends on the total number of rows in the table.
Number of Rows | History |
Up to 100 million | All years |
Over 100 million | At least four full years plus the latest year |
All column names are aligned with the friendly names that can be seen in the corresponding Open Data Blend Datasets. Columns, such as keys, are hidden in the model because they are not useful for data analysis. Some client tools like Power BI Desktop can show the hidden columns.
The columns of each table are grouped into two top-level folders:
Source: Columns in this folder contain the same values as in the original source data.
Derived: Columns in this folder contain derived values. These are cleansed versions of the source values, values that have been inferred, or other enrichments that have been made by the Open Data Blend Team.
This has been done to make it clear where additional value has been added and to make it possible to tie back derived column values to their source values.
Being able to clearly distinguish between columns that originated from the source and derived columns helps to keep the data transparent and trustworthy. You can always reason about the derived values and provide us with feedback if you spot any issues.
Hierarchies have been included in the model to save you time and effort when building your dashboards and reports. Imagine a scenario where you need to break down a set of numbers (i.e. measures) by years, months, and dates.
Without a hierarchy, you would need to find and add the columns to your report one at a time and ensure that the levels are arranged in the right order:
Year
Month Name
Date
The convenience of a hierarchy allows you to use a single object named 'Year > Month Name > Date' that will break down the numbers in the same way with much less effort. For less obvious hierarchies, such as those for prescription medicines, it takes the guesswork out of what the hierarchies should be and the ordering of the levels.
You are free to use the hierarchies that have been pre-defined, or create your own using the manual method described above. The choice is yours.
The Open Data Blend Analytics Model includes pre-defined relationships for tables where one or more known relationships exists. You can get a high-level view of the relationships between tables by viewing the relationship diagrams in the relevant Open Data Blend Datasets or by using the Model View Power BI Desktop. The power of these relationships comes from the ability to break down the numbers (i.e. measures) from one table by the descriptive columns in one or more related tables.
You'll find predefined calculations in the Open Data Blend Analytics Model that can be used to dynamically aggregate the data. These calculations are called 'measures' and they can do things as simple as summing a single column or more complicated calculations like year-on-year percentage change.
The Calculation table is a special purpose table that only contains measures. The measures are organised into the following folder structure:
|---- {Subject Area Name}
|---- {Logical Table Name}
|---- General
|---- Time Intelligence
|---- Month to Date
|---- Previous Year
|---- Previous Year Month to Date
....
For example, you will find measures belonging to the ‘Anonymised MOT’ Open Data Blend Dataset under the following folder structure:
|---- Anonymised MOT
|---- Test Item
|---- General
|---- Time Intelligence
|---- Month to Date
|---- Previous Year
|---- Previous Year Month to Date
|---- ....
|---- Test Result
|---- General
|---- Time Intelligence
|---- Month to Date
|---- Previous Year
|---- Previous Year Month to Date
|---- ...
Below is a summary of what the General and Time Intelligence folders contain:
General: All measures except for time intelligence measures.
Time Intelligence: Measures that apply some form of time intelligence to a base measure.
‘Base measure' refers to the version of a measure without any time intelligence applied.
All the time intelligence calculations are based on calendar dates, months, quarters, and years. In the below table, 'base measure' refers to the version of a measure without any time intelligence applied.
Calculation Type | Measure Name Suffix | Calculation Description |
Previous Year | PY | The value of the base measure in the same period a year ago. |
Month to Date | MTD | The value of the base measure after applying a running total starting from the beginning of the month. |
Quarter to Date | QTD | The value of the base measure after applying a running total starting from the beginning of the quarter. |
Year to Date | YTD | The value of the base measure after applying a running total starting from the beginning of the year. |
Year on Year | YOY | The difference between the base measure value in the current year compared to the same period a year ago. |
Year on Year % | YOY % | The difference between the base measure value in the current year compared to the same period a year ago, expressed as a percentage. |
Previous Year Month to Date | PYMTD | The value of the base measure after applying a running total starting from the beginning of the month and shifted back in time by a year. |
Previous Year Quarter to Date | PYQTD | The value of the base measure after applying a running total starting from the beginning of the quarter and shifted back in time by a year. |
Previous Year Year To Date | PYYTD | The value of the base measure after applying a running total starting from the beginning of the year and shifted back in time by a year. |
Over time we may introduce more time intelligence calculation types. As we do, we will update the above table to reflect the new additions.
You will find versions of base measures (e.g. 'English Prescriptions') that have a ' %' suffix such as 'English Prescriptions %'. This type of measure will return the percentage of the total that is visible for the base measure in your report.
If you try to slice (i.e. break down) a measure by a column from a table that is not related, you will see a result like the following.
Notice that the measure's value is repeated for every value in the column. This means there is no relationship between the table that the measure is based on and the table that the column belongs to.
To help guide you on which table columns can be used to slice which measures, we recommend checking the relationship diagram of the corresponding Open Data Blend Dataset. If you are connected from Power BI Desktop, you can view the relationships for the entire model in the Model View.