Rufus NTS

#done/process consolidated into Planning Stream etc


Outflow

  • What is current SCQH? 🔑 Old Feb 2021 brainstorm that is unfinished
    • What is our current consolidated plan of work?
  • State of existing datahub.io
    • Existing user database
  • Discord server and users there
    • ASIDE: switching to element from discord (v low priority => leave for now)

Current thought tree

Where does this issue tree fit in https://docs.google.com/spreadsheets/d/1sprrkUeMRa3nrma4HvY_k-T0zGYttXJikqe9XPPmSiA/edit#gid=755438186 (think we deprecate it …)

Documents to process

Team Meeting - Anu, Leo, Rufus

Gdoc: https://docs.google.com/document/d/1JNtVUCVmMhrko-HeB–dPus4WJWKqpPn85Jo9cJ3RUg/edit#heading=h.zcn9xydnmw7v

Present: Leo, Rufus, Anu

Leo Email

I've been going through portal.js, datahub and the product documents we have.

Basically, I've been trying to see how to make real this sentence I said last year: "Make sharing new datasets stupidly easy". After a few trials, and after trying to see how things work, the current state of what we have is really far from it. This is something we already discussed some time ago.

My first goal is to do the following:

  1. Have a CSV of the data we want to publish and, optionally, a markdown file with the text description
  2. Run a single command/script that creates a static webpage
  3. Be able to git push to github pages and have it just work
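As a rough illustration of step 2, here is a minimal sketch of what such a single script could do, using only the Python standard library. The function name `build_page`, its parameters and the HTML layout are all hypothetical, not an existing portal.js or datahub tool, and the description is shown as escaped plain text rather than rendered markdown.

```python
import csv
import html
from pathlib import Path


def build_page(csv_path, out_dir, readme_path=None, max_rows=50):
    """Render a CSV (plus optional description text) into out_dir/index.html.

    Hypothetical helper for illustration only.
    """
    csv_path = Path(csv_path)
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    with csv_path.open(newline="", encoding="utf-8") as f:
        rows = list(csv.reader(f))
    header, body = (rows[0], rows[1:]) if rows else ([], [])

    # Assumption: show the description as escaped plain text.
    # A real tool would render the markdown properly.
    description = ""
    if readme_path is not None:
        text = Path(readme_path).read_text(encoding="utf-8")
        description = f"<pre>{html.escape(text)}</pre>"

    head_html = "".join(f"<th>{html.escape(c)}</th>" for c in header)
    body_html = "\n".join(
        "<tr>" + "".join(f"<td>{html.escape(c)}</td>" for c in row) + "</tr>"
        for row in body[:max_rows]  # preview only the first rows
    )
    title = html.escape(csv_path.stem)
    page = f"""<!doctype html>
<html>
<head><meta charset="utf-8"><title>{title}</title></head>
<body>
<h1>{title}</h1>
{description}
<p><a href="{html.escape(csv_path.name)}">Download the CSV</a></p>
<table>
<thead><tr>{head_html}</tr></thead>
<tbody>
{body_html}
</tbody>
</table>
</body>
</html>
"""
    # Copy the data next to the page so the download link works once published.
    (out_dir / csv_path.name).write_bytes(csv_path.read_bytes())
    index = out_dir / "index.html"
    index.write_text(page, encoding="utf-8")
    return index
```

Calling `build_page("data.csv", "docs", "README.md")` would leave a `docs/` folder containing `index.html` and a copy of the CSV. That folder could then be committed and pushed, assuming the repository is configured to serve github pages from it.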

To publish a dataset with what we currently have, we need to:

  • study and understand what Frictionless data is and how to create it (the end user shouldn't have to; they should be able to start without knowing anything about it)

  • understand what portal.js is, how to install it and that there are different ways of publishing with it, including a single dataset option

  • understand that portal.js is built on next.js, which makes us lose focus, and that next.js in turn runs on react.js, which has its own complexity

  • understand how to deploy a react application to github pages

Just installing the npm dependencies takes about 1 minute on a fiber connection, which from a user's point of view can already be a problem. The most frustrating thing, though, is all the knowledge and work needed just to set things up before I can start building my data page.

This is quite a challenge for somebody who just wants to publish some data, with some text and maybe some graphics.

So my goal for the next few days is to understand how to make this much simpler, and from there build the scripts (the first version might or might not use all of portal.js).

Some notes:

  • the user does NOT need to know frictionless, nextjs, react, portaljs
  • the user should be able to run a simple script (the first one should be text based) that creates a simple static webpage that can be pushed to github pages (maybe even create a script that does the push for them)
  • the user should be able to install the tool with a simple shell script and/or pip (python). Note that I chose python because it is one of the most used languages in the data and scientific domains, which means that most people would be able to run a python script.

For this, some individual goals are:

  • automatic generation of the frictionless json file
  • templated UI (as for datahub)
  • static website generation
  • some default graphs that can be chosen (possibly generated by Vega-Lite)
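To make the first goal concrete, here is a minimal sketch of generating the frictionless json file (a `datapackage.json` descriptor) straight from a CSV. It is a hand-rolled, standard-library approximation with deliberately naive type guessing. The helper names `infer_type`, `describe_csv` and `write_datapackage` are hypothetical, and a real tool would probably rely on the Frictionless tooling for inference instead.

```python
import csv
import json
import re
from pathlib import Path


def infer_type(values):
    """Guess a Table Schema type name from a column's string values (naive)."""
    values = [v for v in values if v != ""]  # ignore missing cells
    if not values:
        return "string"
    for type_name, cast in (("integer", int), ("number", float)):
        try:
            for v in values:
                cast(v)
            return type_name
        except ValueError:
            continue
    return "string"


def describe_csv(csv_path):
    """Build a Data Package style descriptor (a dict) for one CSV file."""
    csv_path = Path(csv_path)
    with csv_path.open(newline="", encoding="utf-8") as f:
        rows = list(csv.reader(f))
    if not rows:
        raise ValueError(f"{csv_path} is empty")
    header, body = rows[0], rows[1:]
    fields = [
        {"name": name, "type": infer_type([r[i] for r in body if i < len(r)])}
        for i, name in enumerate(header)
    ]
    # Package names are conventionally lowercase with ".", "_" or "-" only.
    name = re.sub(r"[^a-z0-9._-]+", "-", csv_path.stem.lower()).strip("-")
    return {
        "name": name,
        "resources": [
            {
                "name": name,
                "path": csv_path.name,
                "format": "csv",
                "schema": {"fields": fields},
            }
        ],
    }


def write_datapackage(csv_path):
    """Write datapackage.json next to the CSV and return its path."""
    descriptor = describe_csv(csv_path)
    out = Path(csv_path).with_name("datapackage.json")
    out.write_text(json.dumps(descriptor, indent=2), encoding="utf-8")
    return out
```

With something like this, the user only ever touches the CSV. The descriptor is produced for them, and power users can still edit the resulting file by hand.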

Power users should then be able to take advantage of the whole setup (frictionless, portal.js and so on) to make more complex and personalized modifications.
