Data validation
Validating data gives us confidence it is in the right format.
It does not guarantee the data is accurate: the data could still be factually wrong. But it does ensure that the data we publish will work in systems designed to use it, and that it can be combined with data from other services.
Instructions
These instructions use Good Tables, a tool developed by the Open Knowledge Foundation.
- In your web browser, navigate to https://try.goodtables.io
- In the Source field, choose the option to ‘Upload File’
- Using the Browse button, upload your CSV file
- In the Schema field, enter the URL of the schema that you want to validate the data against. The available schemas are listed in the table further down this page
- Select the Validate button to start the validation process
The tool will report whether the table is valid.
Possible schemas to use
Schema | URL |
---|---|
Events | https://raw.githubusercontent.com/LibrariesHacked/schema-librarydata/master/events.json |
Libraries | https://raw.githubusercontent.com/LibrariesHacked/schema-librarydata/master/libraries.json |
Loans | https://raw.githubusercontent.com/LibrariesHacked/schema-librarydata/master/loans.json |
Membership | https://raw.githubusercontent.com/LibrariesHacked/schema-librarydata/master/membership.json |
Mobile library stops | https://raw.githubusercontent.com/LibrariesHacked/schema-librarydata/master/mobile-library-stops.json |
Physical visits | https://raw.githubusercontent.com/LibrariesHacked/schema-librarydata/master/physical_visits.json |
Stock summary | https://raw.githubusercontent.com/LibrariesHacked/schema-librarydata/master/stock_summary.json |
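If you prefer to look up a schema URL in a script rather than copy it from the table, the mapping can be sketched as below. The base URL and file names are taken from the table above; the helper names `schema_url` and `load_schema` are hypothetical and not part of Good Tables. Note that the file names do not follow one pattern (some use hyphens, some underscores), so they are listed explicitly rather than derived from the dataset name.

```python
import json
from urllib.request import urlopen

# Base URL and file names copied from the table above.
BASE_URL = "https://raw.githubusercontent.com/LibrariesHacked/schema-librarydata/master/"

SCHEMA_FILES = {
    "Events": "events.json",
    "Libraries": "libraries.json",
    "Loans": "loans.json",
    "Membership": "membership.json",
    "Mobile library stops": "mobile-library-stops.json",
    "Physical visits": "physical_visits.json",
    "Stock summary": "stock_summary.json",
}


def schema_url(name):
    """Return the schema URL for a dataset name (hypothetical helper)."""
    try:
        return BASE_URL + SCHEMA_FILES[name]
    except KeyError:
        raise ValueError(f"Unknown schema: {name!r}") from None


def load_schema(name):
    """Download and parse the schema JSON (hypothetical helper, needs network access)."""
    with urlopen(schema_url(name)) as response:
        return json.load(response)
```

For example, `schema_url("Loans")` returns the Loans URL from the table, which can be pasted into the Schema field.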
Dealing with problems
The data validation may tell you the table isn’t valid.
In these cases, it will display what is wrong. In the example shown below, the data has failed validation because some required columns have no values in them (reported as a ‘required constraint’). Another error reported is an ‘enumerable constraint’, which means a column value is not one of the values the schema allows. In this case, ‘Mon’ has been used instead of ‘Monday’. Although this can seem pedantic, it’s important we use consistent values for the data to be useful.
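To make the two error types concrete, here is a minimal sketch of what a ‘required constraint’ and an ‘enumerable constraint’ check amount to. It is not how Good Tables is implemented. It assumes the schema follows the Table Schema layout, with a `fields` list where each field may have `constraints` containing `required` and `enum`. The schema, the column names and the sample rows are invented for illustration and are not taken from the schemas listed on this page.

```python
import csv
import io


def check_rows(csv_text, schema):
    """Return a list of (row_number, column, error) tuples for failed checks."""
    errors = []
    reader = csv.DictReader(io.StringIO(csv_text))
    # Row 1 is the header, so data rows are numbered from 2.
    for row_number, row in enumerate(reader, start=2):
        for field in schema["fields"]:
            name = field["name"]
            constraints = field.get("constraints", {})
            value = (row.get(name) or "").strip()
            if not value:
                # An empty cell only fails if the column is required.
                if constraints.get("required"):
                    errors.append((row_number, name, "required constraint"))
                continue
            allowed = constraints.get("enum")
            if allowed is not None and value not in allowed:
                errors.append((row_number, name, "enumerable constraint"))
    return errors


# Hypothetical schema and data, for illustration only.
example_schema = {
    "fields": [
        {"name": "Library", "constraints": {"required": True}},
        {"name": "Day", "constraints": {"enum": ["Monday", "Tuesday", "Wednesday"]}},
    ]
}
example_csv = "Library,Day\nCentral,Monday\n,Mon\n"

errors = check_rows(example_csv, example_schema)
# errors == [(3, "Library", "required constraint"),
#            (3, "Day", "enumerable constraint")]
```

The second data row fails twice: the Library cell is empty, and ‘Mon’ is not in the list of allowed values for Day.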
It may not be clear what is wrong with the data. This documentation should give guidance on the data expected. The sample files have all passed validation, and can be used as a reference.