Introducing Datahub pages for data publishing

We are excited to announce Datahub pages! In this article, we explain what it is and show you how to use it.

What is Datahub pages

Datahub pages is a tool for generating and deploying data-driven pages from static data. The data can live either in a local folder or in an online repository on a service like Github, Gitlab or Bitbucket.

That is, given a local or online data directory with the following files:


├── README.md
├── data
│   └── data.csv
└── datapackage.json

You can generate a data page that looks like this:

Datahub pages provides an array of tools that make page generation, deployment, and sharing easy. In the following sections, we'll introduce these tools and show you how to use them.

Datahub CLI tool

The Datahub CLI tool helps you instantly build and deploy data-driven pages. It doesn't require you to know any frontend technologies (HTML, CSS, JS), so you can focus on your data. Before you install the CLI tool, you need the following installed on your system:

  • Node.js v14 or later.
  • A package manager like npm or yarn.
  • Git for version control, and optionally a Github, Gitlab or Bitbucket account. These are used for deployment.
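Before installing, it can help to confirm that these prerequisites are in place. The following is a minimal sketch, not part of the Datahub tooling: the helper names have_tool and node_version_ok are made up for illustration, and the script only relies on the standard command -v lookup and on node --version printing a string such as v14.17.0.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; they are not part of the Datahub CLI.

# Succeed if a command is available on PATH.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

# Succeed if a Node.js version string such as "v14.17.0"
# has a major version of 14 or higher.
node_version_ok() {
    major=${1#v}          # strip the leading "v"
    major=${major%%.*}    # keep only the text before the first dot
    case $major in
        ''|*[!0-9]*) return 1 ;;   # empty or not a plain number
    esac
    [ "$major" -ge 14 ]
}

# Report each required tool.
for tool in node git; do
    if have_tool "$tool"; then
        echo "found:   $tool"
    else
        echo "missing: $tool"
    fi
done

# Either npm or yarn is enough as a package manager.
if have_tool npm || have_tool yarn; then
    echo "found:   a package manager (npm or yarn)"
else
    echo "missing: a package manager (npm or yarn)"
fi

# Only check the version when Node.js is actually installed.
if have_tool node; then
    if node_version_ok "$(node --version)"; then
        echo "Node.js version is v14 or later"
    else
        echo "Node.js is older than v14"
    fi
fi
```

The script only reports what it finds and never installs anything, so it is safe to run repeatedly.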

Installing the CLI tool

To install the Datahub CLI tool, open a terminal/command prompt and run the following command:

npm install -g 'https://gitpkg.now.sh/datopian/portal.js/bin?main'

Note: It is recommended you install the CLI tool globally so you can run it from anywhere on your machine.

After successful installation, you can confirm that the datahub command is available on your system by running the command:

which datahub
# should print path to the executable

Using the CLI tool

After installation, you're now ready to create a datahub page from your data. The simplest way to create a page is to first prepare a data directory.

A data directory is simply a folder that contains:

  • Dataset: A tabular dataset, e.g. a CSV file.
  • datapackage.json: This is required, and is used to describe your dataset. See the frictionless spec on how to create one.
  • README.md: This is optional, and contains markdown with extra text describing your dataset.

You can clone/download this sample data directory which we will use in the following examples.
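If you would rather build a data directory by hand, the sketch below creates the layout described above. Everything in it is illustrative: the directory name my-dataset, the two-row CSV, and the field names are placeholders, and the descriptor follows the general Frictionless Data Package layout (a resources list with name, path, and schema). Datahub pages may expect additional fields that are not shown here.

```shell
#!/bin/sh
# Hypothetical directory name; choose any name you like.
DATA_DIR="my-dataset"

mkdir -p "$DATA_DIR/data"

# A tiny placeholder CSV (made-up values, for illustration only).
cat > "$DATA_DIR/data/data.csv" <<'EOF'
name,value
alpha,1
beta,2
EOF

# A minimal descriptor in the Frictionless Data Package layout.
# The field names and types must match the CSV header above.
cat > "$DATA_DIR/datapackage.json" <<'EOF'
{
  "name": "my-dataset",
  "title": "My dataset",
  "resources": [
    {
      "name": "data",
      "path": "data/data.csv",
      "format": "csv",
      "schema": {
        "fields": [
          { "name": "name", "type": "string" },
          { "name": "value", "type": "integer" }
        ]
      }
    }
  ]
}
EOF

# Optional README with extra text describing the dataset.
printf '# My dataset\n\nA short description of the data.\n' > "$DATA_DIR/README.md"

# Show the resulting layout.
ls -R "$DATA_DIR"
```

The path in each resource is relative to the folder that holds datapackage.json, which is why it reads data/data.csv rather than a full path.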

Preview your datahub page locally

Once your data directory is ready (or you have cloned the example above), you can build and preview your Datahub page locally. Open a terminal in the data directory and run the following command:

datahub show

Running this command immediately starts generating a Datahub page from your data. Once the build completes, the page is available at http://localhost:3000:

![[DataHub_Show_Demo.webm]]

That's it! You now have a nice page for your dataset. The generated page is built using Portal.js, an open source, next-generation Javascript framework for rapidly building rich data portals.

Also, if you look in your data directory, you will find a new folder called portal. This folder contains the generated build files. It uses tools like React and Next.js, so if you have basic frontend development skills, you can customize the generated page.

Now that you have generated a Datahub page, you can deploy it yourself or use the Datahub CLI tool. In the next sections, we'll show you how to do both.

Deploying Datahub pages to Datahub Cloud from your local machine

Datahub Cloud lets you host your data pages. It is fully integrated with the Datahub CLI, so you can deploy and share your pages with a single command.

To publish your data pages, you need to create an account on Datahub Cloud and get an API token from your dashboard. Follow the steps below:

  • Step 1: Go to the signup page and create an account.
  • Step 2: After your account has been created and verified, go to your dashboard, and copy your API token.
  • Step 3: In your terminal or command prompt, run the following command to activate your token:

    datahub activate <API TOKEN>

  • Step 4: With the API token activated, you can start deploying your data pages to Datahub cloud by running the command:

    datahub deploy
    

After a successful build and deployment, a URL is generated for your page, similar to the one below:

You can customize your domain from the dashboard.
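If you deploy often, the activate and deploy steps above can be wrapped in a small shell function. This is only a sketch under stated assumptions: publish_page is a made-up helper name, the DATAHUB_API_TOKEN variable in the usage comment is hypothetical, and we assume that datahub deploy, like datahub show, is meant to be run from inside the data directory.

```shell
#!/bin/sh
# Hypothetical wrapper around the two documented commands,
# "datahub activate <API TOKEN>" and "datahub deploy".
# Usage: publish_page <data-dir> <api-token>
publish_page() {
    dir=$1
    token=$2

    # Refuse to continue without a token.
    if [ -z "$token" ]; then
        echo "error: no API token given" >&2
        return 1
    fi

    # A data directory must contain a datapackage.json.
    if [ ! -f "$dir/datapackage.json" ]; then
        echo "error: $dir has no datapackage.json" >&2
        return 1
    fi

    datahub activate "$token" || return 1

    # Assumption: deploy runs from inside the data directory.
    # The subshell leaves the caller's working directory unchanged.
    ( cd "$dir" && datahub deploy )
}

# Example call (not executed here); the variable name is hypothetical:
#   publish_page ./my-dataset "$DATAHUB_API_TOKEN"
```

Checking for datapackage.json before calling the CLI catches the most common mistake, running the deploy from the wrong folder, before any build starts.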

Deploying Datahub pages to Datahub Cloud from a Github repository

If you have a data directory on Github, something similar to this, then you can turn it into a Datahub page using the following steps:

  • Step 1: Go to your Datahub-cloud dashboard, and click on the button Deploy from Github repository
  • Step 2: You will be redirected to Github, and asked to authorize access to the repository
  • Step 3: From the dropdown list, select the repository you want to deploy, and authorize access
  • Step 4: You can monitor the build and deploy process from the build logs

After a successful build, a URL is generated for your data page, which you can then share or customize.

Self deployment of Datahub pages

Datahub pages are built using familiar frameworks like React and Next.js, so you can deploy your generated pages yourself.

If you have run datahub show, a build folder called portal is created in the current directory. Inside this folder, you will find a Next.js application which you can deploy.

You can use any of the following guides:

  • Deployment on Vercel, see official guide
  • Deployment on Github pages, see this guide
  • Deployment on Google Cloud Run, see guide

Conclusion

With Datahub pages you can take a static dataset and convert it into an interactive page which you can share publicly with anyone.

For suggestions, bug reports, and feature requests, see the Github repo.

© 2024 All rights reserved
