Collections pages improvements

Summary

Recommendation:

  • /collections landing page: Replace the markdown file with a Next.js page for greater UI flexibility. In the future this page could also support direct dataset lookup with filters such as category, region, or industry.
  • Individual /collections/xxx pages: Give every page the same markdown format and include only essential information, so pages are clear and easy to navigate.

Situation

  • The /collections page and individual /collections/xxx pages are key discovery points for core datasets on datahub.io.
  • These pages are generated from the /datasets/awesome-data repository and published with DHC.

Problem & Opportunity

Problems:

  1. The main /collections page appears basic and unrefined, relying on a simple markdown bullet list.

assets/Pasted image 20241020220732.png

  2. Individual collection pages lack a standardised format. They look messy and inconsistent, which makes it harder to navigate them and find relevant data.

For example:

assets/Pasted image 20241020220743.png

assets/Pasted image 20241020220753.png

assets/Pasted image 20241020220829.png

assets/Pasted image 20241020220913.png

assets/Pasted image 20241020220919.png

Opportunity:

Enhancing the presentation of the /collections pages could increase dataset discoverability, improve visitor retention, and enhance the overall user experience.

Proposed Solution

/collections landing page

Option 1: Stick to pure markdown (Easy and quick improvement) 👍

Replace the current collections list with something like this for now:

## Collection 1 Title

Collection 1 short description ...

[5 datasets](https://datahub.io/collections/collection-1) 
// or [Discover ->](https://datahub.io/collections/collection-1) or similar

## Collection 2 Title

Collection 2 short description ...

[10 datasets](https://datahub.io/collections/collection-2)
// or [Discover ->](https://datahub.io/collections/collection-2) or similar

...
 

Note: an "X datasets" indicator is nice to have but hard to maintain, because it has to be updated manually and is too easy to forget.
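One way to avoid the manual update would be to derive the count from the collection page itself. The sketch below is only an illustration: `countCoreDatasets` and `datasetLabel` are hypothetical helpers, and it assumes every core dataset is linked as `https://datahub.io/core/<name>`, as in the page format proposed later in this document.

```typescript
// Hypothetical helper, not part of any existing DataHub tooling.
// Counts the distinct https://datahub.io/core/<name> links in a collection
// page's markdown, so the "X datasets" label does not need manual upkeep.
function countCoreDatasets(markdown: string): number {
  // Assumption: dataset links use the form [text](https://datahub.io/core/<name>).
  const linkPattern = /\]\(https:\/\/datahub\.io\/core\/([a-z0-9-]+)\/?\)/gi;
  const seen: Record<string, true> = {};
  let match: RegExpExecArray | null;
  while ((match = linkPattern.exec(markdown)) !== null) {
    // Lower-case the name so the same dataset linked twice is counted once.
    seen[match[1].toLowerCase()] = true;
  }
  return Object.keys(seen).length;
}

// Builds the link text shown on the landing page.
function datasetLabel(count: number): string {
  return count === 1 ? "1 dataset" : count + " datasets";
}
```

A build step could run this over each collection file and write the label into the landing page, but where that step would live is left open here.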

Option 2: Use a Tailwind component

For example, something like this:

assets/Pasted image 20241020220937.png

that could be used like this in markdown:

<Collections items={[
  { title: "Air Pollution", href: "/collections/air-pollution" },
  { title: "Education", href: "/collections/education" },
  ...
]}/>

Note: If we're not going to reuse this component anywhere else or include it in our components suite available to the users, it's probably better to just create a Next.js page instead. (see option 3 below)

Option 3: Replace the current README.md with a pure JSX Next.js page

Benefits:

  • no need to implement a new component that would only be used in one place, the main /collections page
  • much greater UI flexibility
  • sets the stage for dynamic dataset population and filtering
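To make the filtering idea more concrete, here is a minimal sketch of the kind of pure function a Next.js page could call before rendering its list. The `DatasetSummary` shape, its `category`, `region` and `industry` fields, and the `filterDatasets` name are all assumptions for illustration; the real data source and schema are not defined in this document.

```typescript
// Hypothetical record shape; field names are assumptions, not an existing schema.
interface DatasetSummary {
  title: string;
  href: string;
  category?: string;
  region?: string;
  industry?: string;
}

// Any subset of the three filter fields may be set.
type DatasetFilters = Partial<Pick<DatasetSummary, "category" | "region" | "industry">>;

// Returns the datasets that match every filter that is set.
// Unset or empty filters are ignored; comparison is case-insensitive.
function filterDatasets(items: DatasetSummary[], filters: DatasetFilters): DatasetSummary[] {
  const keys: (keyof DatasetFilters)[] = ["category", "region", "industry"];
  return items.filter((item) =>
    keys.every((key) => {
      const wanted = filters[key];
      if (wanted === undefined || wanted === "") return true;
      return (item[key] ?? "").toLowerCase() === wanted.toLowerCase();
    })
  );
}
```

Keeping the filter logic separate from the page markup means it could later run against db results without changing the UI.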

assets/Pasted image 20241022141615.png

Individual collections pages

Make sure this date is updated regularly or get rid of it completely:

assets/Pasted image 20241020220948.png

Also, make sure all collection pages follow the same standard format, e.g.:

---
title: Collection A Title
description: Collection A Description
---

## Core Datasets

### Dataset 1 Title

Dataset 1 Description

[core/dataset-1](https://datahub.io/core/dataset-1)

### Dataset 2 Title

Dataset 2 Description

[core/dataset-2](https://datahub.io/core/dataset-2)

...

## External Datasets (?)

...

## Useful Information (?)

...
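A format like the one above stays consistent more easily if it is checked automatically. The following is a rough sketch of such a check, assuming the front matter fields `title` and `description` and the `## Core Datasets` heading are the required parts. `checkCollectionPage` is a hypothetical name, and the optional sections marked with (?) are deliberately not enforced.

```typescript
interface FormatCheck {
  ok: boolean;
  problems: string[];
}

// Hypothetical check for the proposed collection page format.
// Assumes "title" and "description" front matter plus a "## Core Datasets" section.
function checkCollectionPage(markdown: string): FormatCheck {
  const problems: string[] = [];

  // Front matter is the block between the leading "---" line and the next "---" line.
  const frontMatter = /^---\r?\n([\s\S]*?)\r?\n---\r?\n/.exec(markdown);
  if (!frontMatter) {
    problems.push("missing front matter block");
  } else {
    for (const field of ["title", "description"]) {
      // The field must be present and have a non-empty value on the same line.
      const fieldPattern = new RegExp("^" + field + ":[ \\t]*\\S", "m");
      if (!fieldPattern.test(frontMatter[1])) {
        problems.push('front matter is missing "' + field + '"');
      }
    }
  }

  if (!/^## Core Datasets[ \t]*$/m.test(markdown)) {
    problems.push('missing "## Core Datasets" section');
  }

  return { ok: problems.length === 0, problems };
}
```

Such a check could run in CI on the /datasets/awesome-data repository, although that wiring is outside the scope of this note.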

Rabbit holes

  • Do we really need all the extra info about each collection (descriptions, external sources, etc.)? Or could we have a single /collections page that queries the db based on selected filters such as category, industry, or region, like https://www.marketplace.spglobal.com/en/datasets ?
    Answer: Yes, we want to keep the collection pages even if the /collections page gets db-populated data and filtering options. The collection pages allow us to add additional, unstructured information about each collection, which is often useful.

No-go

  • Implementing and incorporating a new component for in-markdown use that would effectively be needed in only one place, the main /collections page
