
Data validation


Validating data gives us confidence it is in the right format.

It does not guarantee that the data is accurate: the data could still be factually wrong. But it does ensure that the data we publish will work in systems designed to use it, and that it can be combined with other services’ data.

Instructions


These instructions use Good Tables, a tool developed by the Open Knowledge Foundation.

  1. In your web browser, navigate to https://try.goodtables.io
  2. In the Source field, choose the option to ‘Upload File’
  3. Using the Browse button, upload your CSV file
  4. In the Schema field, enter the URL of the schema that you want to validate the data against. The available schemas are listed in the table further down this page
  5. Select the Validate button to start the validation process

The tool will report whether the table is valid.

A screenshot of the Good Tables tool, reporting successful data validation
An example of a successful data validation, using Plymouth library listings.

Possible schemas to use

Schema                 URL
Events                 https://raw.githubusercontent.com/LibrariesHacked/schema-librarydata/master/events.json
Libraries              https://raw.githubusercontent.com/LibrariesHacked/schema-librarydata/master/libraries.json
Loans                  https://raw.githubusercontent.com/LibrariesHacked/schema-librarydata/master/loans.json
Membership             https://raw.githubusercontent.com/LibrariesHacked/schema-librarydata/master/membership.json
Mobile library stops   https://raw.githubusercontent.com/LibrariesHacked/schema-librarydata/master/mobile-library-stops.json
Physical visits        https://raw.githubusercontent.com/LibrariesHacked/schema-librarydata/master/physical_visits.json
Stock summary          https://raw.githubusercontent.com/LibrariesHacked/schema-librarydata/master/stock_summary.json
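
Each schema listed above is a JSON file. As a rough illustration, and assuming the schemas follow the Table Schema format (a top-level `fields` list in which each entry has a `name`), the sketch below compares the header row of a CSV with the column names a schema expects. The schema, column names, and sample data here are hypothetical and are not copied from the real schema files, so treat this as a quick local pre-check rather than a replacement for Good Tables.

```python
import csv
import io
import json

# Hypothetical, cut-down schema in Table Schema style. The real schema
# files linked above may define different or additional fields.
EXAMPLE_SCHEMA = json.loads("""
{
  "fields": [
    {"name": "Library name", "type": "string"},
    {"name": "Postcode", "type": "string"}
  ]
}
""")


def compare_headers(csv_text, schema):
    """Return (missing, unexpected) column names for a CSV against a schema."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])  # an empty file yields an empty header
    expected = [field["name"] for field in schema.get("fields", [])]
    missing = [name for name in expected if name not in header]
    unexpected = [name for name in header if name not in expected]
    return missing, unexpected


# Made-up sample data: the second column is named "Post code", not "Postcode".
sample = "Library name,Post code\nExample Library,AB1 2CD\n"
missing, unexpected = compare_headers(sample, EXAMPLE_SCHEMA)
print("Missing columns:", missing)
print("Unexpected columns:", unexpected)
```

To check a real file, the hypothetical schema could be replaced with the contents of one of the downloaded schema files, and `sample` with the text of your own CSV.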

Dealing with problems


The data validation may tell you the table isn’t valid.

In these cases, it will display what is wrong. In the example shown below, the data has failed validation because some required columns have no values in them (reported as a ‘required constraint’). Another error reported is an ‘enumerable constraint’, which means a column value is not one of the values the schema allows. In this case, ‘Mon’ has been used instead of ‘Monday’. Although this can seem pedantic, the data is only useful if values are used consistently.

It may not be clear what is wrong with the data. This documentation should give guidance on the data expected. The sample files have all passed validation, and can be used as a reference.

A screenshot of the Good Tables tool reporting an error.
An example of invalid data, with errors displayed.
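
To make the two error types described above more concrete, here is a minimal sketch of how a required constraint and an enumerable constraint might be checked. It assumes Table Schema style field definitions with `required` and `enum` entries under `constraints`. The field names, allowed values, and sample rows are hypothetical, and this is not how Good Tables itself is implemented.

```python
import csv
import io

# Hypothetical field definitions in Table Schema style; not taken from
# the real schemas.
EXAMPLE_FIELDS = [
    {"name": "Library name", "constraints": {"required": True}},
    {
        "name": "Day",
        "constraints": {
            "required": True,
            "enum": ["Monday", "Tuesday", "Wednesday", "Thursday",
                     "Friday", "Saturday", "Sunday"],
        },
    },
]


def find_constraint_errors(csv_text, fields):
    """Return a list of (row number, column name, problem) tuples."""
    errors = []
    reader = csv.DictReader(io.StringIO(csv_text))
    # Row 1 is the header, so the first data row is row 2.
    for row_number, row in enumerate(reader, start=2):
        for field in fields:
            name = field["name"]
            constraints = field.get("constraints", {})
            value = (row.get(name) or "").strip()
            if not value:
                # An empty cell only matters if the column is required.
                if constraints.get("required"):
                    errors.append((row_number, name, "required constraint"))
                continue
            allowed = constraints.get("enum")
            if allowed is not None and value not in allowed:
                errors.append((row_number, name, "enumerable constraint"))
    return errors


# Made-up sample: one missing library name and one abbreviated day.
sample = (
    "Library name,Day\n"
    "Central Library,Monday\n"
    ",Tuesday\n"
    "Branch Library,Mon\n"
)
for row_number, column, problem in find_constraint_errors(sample, EXAMPLE_FIELDS):
    print(f"Row {row_number}, column '{column}': {problem}")
```

Note that the comparison is exact, so ‘Mon’ or ‘monday’ would both be rejected when the schema lists ‘Monday’.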
