Bad Data
Real-world examples of how not to do data
Bad Data is a site providing real-world examples of how not to prepare or provide data. It showcases the poorly structured, the mis-formatted, or the just plain ugly.
Its primary purpose is to serve as an educational tool for governments and other organizations – though there may also be some aspect of entertainment.
It is also a good source of practice material for budding data wranglers, those tasked with cleaning and transforming data (in fact, the repo began as a place to keep practice data for Data Explorer).
New examples are wanted and welcome – see "Add an Example" below.
History
- [Bad Data: real-world examples of how not to do data (2013)](https://okfnlabs.org/blog/2013/11/19/bad-data-examples-how-not-to-do-data.html)
- Tracking Issues with Data the Simple Way (2013)
- Original project website: https://okfnlabs.org/projects/bad-data/
Examples
- Plain Text (ASCII) Spreadsheet from US Government Bureau of Labor Statistics
- Cairo Transport Authority Data
- Greater London Authority Spending
- List of Mexican Towns With Population Under 5000
- Nature Magazine supplementary information
- Russian foreign trade statistics
- Passenger Numbers for Humans Only
Add an Example
New examples are wanted! Here's how to submit one.
What information to provide
What we'd like to see in a good example:
- Original data URL (and name of the organization providing the data)
- File format (e.g. CSV, Shapefile)
- A description of what's wrong
- A nice illustrative image or screenshot
Optional (but desirable): a backup of the data in case it goes away! (If the file is more than ~100 KB, please provide a chopped-down version illustrating the main "badness".)
Also don't forget to credit yourself in an appropriate way (if you want to be credited!).
Lastly, note that your contribution will be licensed under this site's general CC Attribution License.
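For the optional backup suggested above, here is a minimal Python sketch of one way to produce a chopped-down copy of a large file. It assumes the data is line-oriented text such as CSV (it is not suitable for binary formats like Shapefile), and the function name `chop_file` and the 100,000-byte default are illustrative choices, not part of this site. Reading in binary mode keeps any odd encoding in the original bytes intact, which is often part of the "badness" worth preserving.

```python
def chop_file(src, dest, max_bytes=100_000):
    """Copy whole lines from src to dest, stopping before max_bytes is exceeded.

    Returns the number of bytes written. Lines are copied as raw bytes,
    so the original encoding and line endings are left untouched.
    """
    written = 0
    with open(src, "rb") as fin, open(dest, "wb") as fout:
        for line in fin:
            # Stop rather than write a partial line.
            if written + len(line) > max_bytes:
                break
            fout.write(line)
            written += len(line)
    return written
```

Calling `chop_file("original.csv", "sample.csv")` (both filenames hypothetical) would keep the first rows of the file, which is usually enough when the problem shows up in the header or the first few records. If the badness sits deeper in the file, pick the relevant rows by hand instead.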
How to Contribute
Option 1: Fork and Pull
This website is stored in a GitHub repo, which you can fork to add your example and then submit as a pull request. Here are detailed instructions:
- Choose a "slug" for your example, e.g. `my-bad-data-example`
- Copy the `ex/template/` directory to a new directory `ex/{your-slug}`
- Edit the `ex/{your-slug}/index.md` file
  - Change the frontmatter attributes (i.e. key/value items at the very top of the file) as appropriate
  - Add the description (markdown formatted) to the main section of that page
- Commit the new file
- Then submit the pull request
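The slug and copy steps above can be sketched in Python as follows. This is only an illustration, assuming it is run against a local clone of your fork: the function name `create_example` is hypothetical, and the slug pattern (lowercase letters and digits joined by hyphens) is an assumption inferred from the `my-bad-data-example` example rather than a rule stated by the site.

```python
import re
import shutil
from pathlib import Path

# Assumed slug shape: lowercase words or digits joined by single hyphens.
SLUG_PATTERN = re.compile(r"^[a-z0-9]+(?:-[a-z0-9]+)*$")


def create_example(repo_root, slug):
    """Copy ex/template/ to ex/{slug}/ and return the path of the new index.md."""
    if not SLUG_PATTERN.match(slug):
        raise ValueError(f"slug should be lowercase words joined by hyphens: {slug!r}")
    ex_dir = Path(repo_root) / "ex"
    target = ex_dir / slug
    # copytree raises FileExistsError if the slug is already taken.
    shutil.copytree(ex_dir / "template", target)
    return target / "index.md"
```

After `create_example(".", "my-bad-data-example")`, the returned `index.md` is the file to edit. Editing the frontmatter, committing, and opening the pull request are still done by hand as described in the steps.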
Option 2: Open an Issue
If the thought of forks and pulls gives you the jitters, there's an ultra-simple alternative: just open an issue in the issue tracker and add the information requested!