Pages 2022-02 v0.0.1 README + CSV with table preview Project
Plan
Purpose and principles
Have a running "core system" that takes README (+ local csv) and generates a website. Sufficient functionality to be usable by us and perhaps the hardest core of users (and provides base for future development)
- Local only
- OK to assume usage of npm, node etc
- Table view only
- Need not be that elegant
Outcome visioning
We have a data literate README with data in CSV or inline. We can install (and configure) something locally (e.g. a nextjs template), run it, and get a site running locally that displays the data literate document
- Blog post about this (on datahub.io?)
- Table only (no need even for graphing yet - though bonus if we have that)
Acceptance
- Goal 1: have a template nextjs app that works locally (with a demo mdx file in it to show off functionality)
- Goal 2: walk through of running locally with my own project of README.md plus a CSV. Imagine something like:
- Install the template + Configure it to point at the README (and CSV??)
- Run => You have a website
- Goal 3: document and publicize
- Document / blog
- Identify any pain points on this and fix
Tasks
Goal 1 Tasks
Prep / background: Review the data literate work so far
- What do we mean by "data literate"? 🔑 A Markdown-like document where one can add special tags that target CSV data and display it inline as tables, graphs, and other data representations.
- What is the current data literate builder/app? 🔑 https://github.com/datopian/portal.js/tree/main/site - note this is also building the portaljs.org site, so we should probably split it out. It's currently feasible but difficult to use.
- What is the MDX background and approach? 🔑 MDX is an extension to markdown that allows you to embed React components into markdown … In short: Markdown + JSX.
- Which MDX processor do we use? 🔑 We are using https://github.com/hashicorp/next-mdx-remote. NB: nextjs now bundles this out of the box as an extension
- What is the data parsing / access approach? 🔑 for now we directly load csv off disk, so it is done in the frontend but not centralized or library-ised - e.g. it's here https://github.com/datopian/portal.js/blob/4be1c1aecf9f2ca7cd9f5c3927e16eea374a1629/site/components/Table.js#L73
- Do we have standard code for this? 🔑 no, at least not in portal.js - https://github.com/datopian/portal.js/tree/main/src has no lib. However, we do have frictionless-js which sort of does some of this. What we really want is a clear API definition and a plugin model.
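As a rough idea of what the smallest piece of such a data loading API could look like, here is a sketch of a CSV-to-rows helper. It is hypothetical: parseCsv is not part of portal.js or frictionless-js, and it only handles simple comma-separated text with optional double-quoted fields.

```javascript
// Hypothetical helper, not part of portal.js or frictionless-js.
// Parses simple CSV text (comma delimiter, optional double-quoted fields,
// "" as an escaped quote) into an array of row objects keyed by header.
function parseCsv(text) {
  const rows = [];
  let row = [];
  let field = "";
  let inQuotes = false;

  for (let i = 0; i < text.length; i++) {
    const ch = text[i];
    if (inQuotes) {
      if (ch === '"' && text[i + 1] === '"') {
        field += '"'; // escaped quote inside a quoted field
        i++;
      } else if (ch === '"') {
        inQuotes = false;
      } else {
        field += ch;
      }
    } else if (ch === '"') {
      inQuotes = true;
    } else if (ch === ",") {
      row.push(field);
      field = "";
    } else if (ch === "\n" || ch === "\r") {
      if (ch === "\r" && text[i + 1] === "\n") i++; // treat CRLF as one break
      row.push(field);
      field = "";
      rows.push(row);
      row = [];
    } else {
      field += ch;
    }
  }
  // Flush the last line when the text does not end with a newline.
  if (field !== "" || row.length > 0) {
    row.push(field);
    rows.push(row);
  }

  const [header = [], ...records] = rows;
  return records
    .filter((r) => r.length > 1 || r[0] !== "") // skip blank lines
    .map((r) => Object.fromEntries(header.map((h, i) => [h, r[i] ?? ""])));
}
```

A Table component could then take the returned array directly as its rows. A real library would more likely wrap an existing CSV parser than hand-roll one; the point here is only the shape of the API (text in, row objects out).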
Build
- Refactor the current data literate setup in site/... into a standalone example, data-literate, so that we can install it directly into e.g. our demo ✅ it is done and working https://github.com/datopian/portal.js/tree/main/examples/data-literate
- copy across the key code
- test it works with a local example ✅ works with the bundled demo.mdx example
- What table component does it currently use? 🔑 hand-crafted one here https://github.com/datopian/portal.js/blob/main/site/components/Table.js
- What about remote files e.g. you want to paste in a CSV hosted somewhere online? 🔑 right now let's keep the proxy since it's nice for local preview, and we can warn that it won't work on static export
- Verify static export works reasonably well ✅ you have to run python3 -m http.server (obviously)
- set up auto-build of this on vercel (?) ✅2022-02-22 https://portal-js.pages.dev/demo.
- 🚩 we are at our limit on our vercel plan (Free apparently) for deployments per repo. In any case, it seems we need to upgrade (either to Hobby or Pro). Will do on cloudflare pages but note this only does static export (may not be a bad thing though to only do static for this)
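The proxy for remote CSV files mentioned in the list above could look roughly like the API route below. This is a sketch, not the existing portal.js proxy: the file location (pages/api/proxy.js), the url query parameter and the optional fetchImpl argument (there only to make the function easy to test) are all assumptions. API routes are not part of a static export, which is why this only helps during local preview.

```javascript
// Sketch of a local-preview proxy for remote CSV files.
// Assumptions (not from portal.js): the route lives at pages/api/proxy.js
// and the remote address is passed as ?url=...
async function handler(req, res, fetchImpl = globalThis.fetch) {
  const target = req.query && req.query.url;

  let parsed;
  try {
    parsed = new URL(target);
  } catch {
    return res.status(400).json({ error: "Missing or invalid ?url= parameter" });
  }
  if (parsed.protocol !== "http:" && parsed.protocol !== "https:") {
    return res.status(400).json({ error: "Only http(s) URLs are supported" });
  }

  try {
    // Fetch server-side so the browser is not blocked by CORS.
    const upstream = await fetchImpl(parsed.href);
    const body = await upstream.text();
    res.setHeader("Content-Type", "text/csv; charset=utf-8");
    return res.status(upstream.status).send(body);
  } catch (err) {
    return res.status(502).json({ error: `Upstream fetch failed: ${err.message}` });
  }
}

// In a Next.js project this would end with: export default handler;
```

A component would then request /api/proxy?url=... (with the remote address URL-encoded) instead of fetching the remote file directly.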
Goal 1.1: Improvements / Fixes to Basic Template
- ⏫ Upgrade to next v12 https://nextjs.org/docs/upgrading @risenw ✅2022-02-24 see https://github.com/datopian/portal.js/issues/667
- ⏫ Move to the new Next.js mdx support https://nextjs.org/docs/advanced-features/using-mdx @risenW ✅2022-02-24
- WONTFIX Add support for frontmatter parsing e.g. via gray-matter as nextjs does not support this by default (https://nextjs.org/docs/advanced-features/using-mdx#frontmatter)
- Summarize pros/cons of using nextjs default module
- Do we break SSG and have some API routes e.g. for proxy (ok for local development … and a nice feature to have)? ✅
- 🔽 display default content (currently demo.mdx) at the root location so that when users try out this example they have something at the default url when they open their browser. LOW atm because there are various ways to do this and we're not sure which is good
- Option 1: Redirect option (issue is that this is hardcoded)
- Option 2: Turn current dynamic route into a catch-all route e.g. add slug: false to the list & make demo / index what we render for slug: false
- Option 3: (KISS) just move demo.mdx to index.mdx
- Switch to our standard components for the Table?
- Fix whatever bug there is with "Table from Raw CSV" as not working …
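Option 2 above (the catch-all route) could be sketched as follows. Assumptions: the dynamic route is renamed to an optional catch-all page (pages/[[...slug]].js is a hypothetical file name), and listMdxSlugs and ROOT_DOC are stand-ins for however the template discovers its MDX files and picks a default. Next.js documents slug: false as the way to generate the root-most path for an optional catch-all route.

```javascript
// Hypothetical: which document to show at "/" (the template currently ships demo.mdx).
const ROOT_DOC = "demo";

// Stand-in for however the template lists its MDX files.
function listMdxSlugs() {
  return ["demo", "other-page"];
}

// Builds the `paths` array for getStaticPaths on an optional catch-all page.
function buildPaths(slugs) {
  const paths = slugs.map((slug) => ({ params: { slug: slug.split("/") } }));
  // `slug: false` asks Next.js to also generate the root route "/".
  paths.push({ params: { slug: false } });
  return paths;
}

// Maps route params back to the document to render.
// At the root route there is no slug array, so fall back to ROOT_DOC.
function docForParams(params) {
  return params && Array.isArray(params.slug) ? params.slug.join("/") : ROOT_DOC;
}

// In the page file both of these would be exported.
async function getStaticPaths() {
  return { paths: buildPaths(listMdxSlugs()), fallback: false };
}

async function getStaticProps({ params }) {
  return { props: { doc: docForParams(params) } };
}
```

Compared with Option 1, nothing is hardcoded in a redirect, but the default document still has to be named somewhere (here ROOT_DOC). Option 3 avoids even that.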
Goal 2: works with my own content
- Create example project (and store in a github repo) that has README and data that we can test this process with ✅ https://github.com/datopian/portal.js/tree/main/examples/data-literate
- Questions ✅2022-02-22
- How many markdown files? 🔑 Just one for now
- Is markdown file always at README.md or do we allow flexibility? 🔑 Let's hardcode README.md for now
- How are data files resolvable (how) from the markdown file? 🔑 assuming the paths are correct in source folder we can just copy data into public and markdown into pages or content folder (if using content folder)
- Create README.md with sample content
- Create a csv with sample content and use from README
- Write out planned steps for an end user ✅2022-02-22 see steps for a user below
- Make data-literate configurable so that it works with markdown in any location ✅2022-02-22 crude version working based on file copy - see commit https://github.com/datopian/portal.js/commit/7a5130af190b8d5fbdfd9d167392415a7d0874de
- Add sample user readme + data
- Create mini-script
- Copy over data files (using script)
How do we use data literate with my own markdown and data?
- Do we copy the content and data into this repo (build time) or do we link it in during the static site generation (nextjs build)? 🔑 let's just copy data for now, rather than doing anything fancy
- How do we copy the static export back somewhere useful (if we are doing static build)?
Improvement
- Symlink rather than copy? 🔑 don't do this as complex and we prob don't even want to copy stuff over - it's a temporary hack
- copy all data and markdown files (not just readme and data.csv)
- Opening root http://localhost:3000/ should show the README (rather than 404) => Render the README at the root directory (or similar fix e.g. redirect)
- Make template app creation faster (??)
- maybe we do a git checkout and then can pull for fixes … - need to only checkout a directory (see https://stackoverflow.com/questions/600079/how-do-i-clone-a-subdirectory-only-of-a-git-repository)
- anyway that's an optimization
- aside: the simplest thing to start with is to symlink in fact
Other ideas
- Could offer a tutorial on using the data-literate app not just as a template but as the full app you want to use, and moving your data into it … (or at least in a subfolder e.g. .datahub/data-literate)
Steps for a User
Simplest possible
Crudest approach: brute install next.js app into a subfolder and configure with links to the relevant content (this is what we did last year in our demos)
node datahub-portal-local-cli.js your-own-data-example/
npm run dev
open localhost:3000
Better version
https://excalidraw.com/#room=58cad8f74b6f1be3ea2f,p5RM_ybIKDOP6L832JrTfw
# get the template
npx create-next-app@latest --example https://github.com/datopian/portal.js/tree/main/examples/data-literate my-app
cd my-app
# run the script to connect your content
# put the path to your project directory
node datahub-portal-local-cli.js your-project-directory
# start the preview portal
npm run dev
# open a browser
open localhost:3000
As a proper tutorial (#someday)
- Create a new project:
- Copy and paste this example markdown/folder
- All you need to start is a single markdown file
- Create a README.md in your folder
- Run the portal command:
datahub portal
- Open your browser to localhost:3000
Goal 3: Market
- Set up datahub.io for publishing posts (do we need to tweak anything)
- Choice about whether this is "launchable" 🔑 don't think this is launchable but it is bloggable
- within Datopian (e.g. we try this for something and with some team members)
- Any alpha users we could recruit
- Do we put this on e.g. datahub.io
- Do we market more generally
Future
- Data loading library
- Reflect on plugin model
Analysis
How to add Data Literate support to a Next project
What's involved?
- Markdown parsing with support for component references
- Linker system to indicate where you get components from
- Add default components that MDX system automatically knows about
- Ability for a user to explicit import e.g.
import MyComponent from 'my-component'
- System for dereferencing data/content references in the components e.g.
<Table data="mydata.csv" />
- How does it locate mydata.csv? This is especially an issue because nextjs (for example) won't parse stuff in the same directory as the markdown (data probably needs to be in the public directory, or moved there in the build step … which then requires rewriting paths)
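One way to handle the path rewriting mentioned above, assuming data files are copied into public/ at build time with their folder structure preserved: resolve each reference relative to the markdown file, then turn it into a root-relative URL. resolveDataRef and its parameters are hypothetical names, not an existing portal.js API.

```javascript
const path = require("path");

// Hypothetical helper: turns a data reference found in a markdown file
// (e.g. <Table data="mydata.csv" />) into the URL it would have once the
// project's files are copied into the app's public/ directory.
//   markdownPath: path of the markdown file, relative to the project root
//   dataRef:      the value of the `data` attribute
function resolveDataRef(markdownPath, dataRef) {
  // Remote URLs are left alone.
  if (/^https?:\/\//i.test(dataRef)) return dataRef;

  const baseDir = path.posix.dirname(markdownPath);
  const resolved = path.posix.normalize(path.posix.join(baseDir, dataRef));
  if (resolved === ".." || resolved.startsWith("../")) {
    throw new Error(`Data reference escapes the project root: ${dataRef}`);
  }
  return "/" + resolved;
}
```

For a README.md at the project root, a reference to mydata.csv becomes /mydata.csv, which is where a file copied to public/mydata.csv is served. The rewrite could run either in the copy step or inside the component that loads the data.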