Website for DataHub (Pages) v0.1
Summary
NB: 2022-03-14 we decided it is simpler for now to stick with a separate domain and worry about "taking over" datahub.io later => publish at pages.datahub.io
Plan
Acceptance
- next.datahub.io tidied and data-literate webapp in `site` directory and deploying to next.datahub.io
- last 2 blog posts from datahub.io/blog are on next.datahub.io/blog with same url scheme
- ⏫ Where do we deploy the DataHub Pages site? Does it take over datahub.io a bit or have its own domain? ✅ stick with a separate domain for now
- (1) its own subdomain e.g. next.datahub.io or pages.datahub.io
- (2) OR take over datahub.io front page and blog etc
- KISS is option (1) but option (2) may have better traffic etc.
Tasks
- Create the repo ✅2022-03-08 all done
- where do we create the repo? `site` subdirectory of next.datahub.io
- do we reuse next.datahub.io or boot a new repo? ✅ reuse next.datahub.io but move the site to a subdirectory `site`
- Pros: it's the simplest thing we can do. monorepo style …
- Cons: we have the docs and the new site in one place … maybe that's annoying … but monorepos are nice
- Do we keep docs in for now (or move out)? ✅ keep them in
- What happens to existing next.datahub.io webapp? ✅ Delete it and start from data-literate template
- What do we do with next.datahub.io front page? ✅ save it and we'll copy it into the next site (as mdx (?))
- Is the repo private? ✅ Yes
- Deploy it ✅2022-03-08 https://next-datahub-io.pages.dev & https://next.datahub.io
- What preliminary url does it have (even if later we proxy over datahub.io)? ✅ next.datahub.io
- Do we deploy with cloudflare pages or netlify or vercel? ✅ cloudflare
- Configure the domain ✅2022-03-08
- Bootstrap it ✅2022-03-08
- do we template off data-literate? ✅ Yes
- Remove old site content (but preserve home page)
Next steps - 2022-03-08
- 🔼 DataHub.io blog posts migrated
- All the blog posts moved over so that we can test out the new system and switch the datahub.io blog over to the new setup. That lets us publish the new blog post we have on datahub.io
- first 2 blog posts ✅ https://next.datahub.io/blog/covid19-and-compartmental-models-in-epidemiology & https://next.datahub.io/blog/frictionless-specs-european-commission
- ⏫ Fixes for images
- ⏫ front matter fixes (see below in technical) ✅2022-03-09
- Support for authors and their images (?)
- Migrate the rest of the blog posts ✅2022-03-08
- 🚩 2022-03-10 haven't found a good solution for the frontmatter stuff and realise this is a real issue, so we are now pursuing another option (see summary in 2022-03-09) => switch to contentlayer approach => actions are below
- Take over blog section on datahub.io so that we have the new post live and we know this approach of incrementally taking over urls works … ❌ WONTFIX
Technical questions
- ⏫ Frontmatter support so that we can have proper rendering of the blog post (4-8h) ✅2022-03-08
- How do we process mdx files not in pages directory? WONTFIX
- How do we build the CMS layer? ✅2022-03-14 use contentlayer.dev for now. KISS: let's just put markdown files and hand code stuff that is remote OR look into Tina type stuff aka Content Layer API. ✅2022-03-10 we do want a Content Layer API and, as per notes there, we are going with contentlayer.dev for now
Moving to contentlayer 2022-03-10
- Move content out of pages into data/blog
- Created pages/blog/[slug].jsx
- Basic working
- Make it look nicer
- Create pages/blog/index.jsx - list of all the blogs
- Check images and their paths … (should be easy)
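The move above can be sketched roughly as follows. This is a minimal sketch in plain TypeScript, not the contentlayer API: `BlogDoc`, `slugFromPath`, `blogStaticPaths` and `blogIndex` are hypothetical names, and the sample documents are placeholders. It only illustrates how files under data/blog could map to the slugs served by pages/blog/[slug].jsx and to the list rendered by pages/blog/index.jsx.

```typescript
// Hypothetical shape of a blog document loaded from data/blog.
interface BlogDoc {
  filePath: string; // path relative to the content root, e.g. "blog/my-post.mdx"
  title: string;
  date: string; // ISO date taken from the frontmatter
}

// Derive the URL slug from the file name: "blog/my-post.mdx" -> "my-post".
function slugFromPath(filePath: string): string {
  const fileName = filePath.split("/").pop() ?? "";
  return fileName.replace(/\.mdx?$/, "");
}

// The `paths` value that getStaticPaths in pages/blog/[slug].jsx could return.
function blogStaticPaths(docs: BlogDoc[]): { params: { slug: string } }[] {
  return docs.map((doc) => ({ params: { slug: slugFromPath(doc.filePath) } }));
}

// Data for pages/blog/index.jsx: newest post first, one link per post.
function blogIndex(docs: BlogDoc[]): { title: string; href: string }[] {
  return [...docs]
    .sort((a, b) => b.date.localeCompare(a.date))
    .map((doc) => ({
      title: doc.title,
      href: `/blog/${slugFromPath(doc.filePath)}`,
    }));
}

// Placeholder documents (not real posts), for illustration only.
const sampleDocs: BlogDoc[] = [
  { filePath: "blog/first-post.mdx", title: "First post", date: "2022-01-01" },
  { filePath: "blog/second-post.md", title: "Second post", date: "2022-02-01" },
];
```

Keeping the slug equal to the file name is what preserves the existing url scheme when posts are copied over unchanged.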
Next
- Get demo working again
- Create `allOtherPages` in contentlayer.config.ts
- Create catch-all route and generate all paths from that. Standard route works. Thoughts in the PR https://github.com/datopian/next.datahub.io/pull/29
- How do you render MDX (not markdown)? ✅ Use contentlayer/useMDXComponent like https://github.com/leerob/leerob.io/blob/main/pages/blog/%5Bslug%5D.tsx#L1 or roll our own renderer using post.body.raw (?) ==> We settled on contentlayer and used it
- What is our solution to files in the main tree (i.e. not in a specific category/directory)? ✅ https://github.com/leerob/leerob.io/blob/main/contentlayer.config.ts#L74-L82 seems to show a solution for loading allOtherPages - he only has one so renders it with a single explicit page https://github.com/leerob/leerob.io/blob/main/pages/uses.tsx but we can just reinstate the [...slug].js approach for now - see https://github.com/datopian/nextjs-tailwind-mdx/blob/main/pages/%5B...slug%5D.js (though obviously we now have the list of pages from contentlayer)
- How do we pass in specific components to the mdx system e.g. our DataTable? Done: we wrap the MDX parsed by contentlayer with our custom MDX provider component.
- How do we add extra remark/rehype plugins? Done: configure all remark or rehype plugins in the contentlayer.config.js file
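The catch-all route for files in the main tree can be sketched as below. This is an illustration under assumptions, not the code from the PR: `TreePage`, `catchAllPaths` and `findTreePage` are hypothetical names, the `flattenedPath` field is modelled on the path contentlayer exposes for each document, and the sample pages are placeholders. The one fixed point is that a Next.js `[...slug]` route receives `params.slug` as an array of path segments.

```typescript
// Hypothetical shape of a page living in the main content tree.
interface TreePage {
  flattenedPath: string; // e.g. "about" or "docs/getting-started"
  title: string;
}

// Build the `paths` list for a [...slug] page. Blog posts are skipped
// because they already have their own route under pages/blog.
function catchAllPaths(pages: TreePage[]): { params: { slug: string[] } }[] {
  return pages
    .filter(
      (page) =>
        page.flattenedPath !== "" && page.flattenedPath.indexOf("blog/") !== 0
    )
    .map((page) => ({ params: { slug: page.flattenedPath.split("/") } }));
}

// Look up the page for the slug segments Next.js passes to getStaticProps.
function findTreePage(
  pages: TreePage[],
  slugParts: string[]
): TreePage | undefined {
  const target = slugParts.join("/");
  return pages.filter((page) => page.flattenedPath === target)[0];
}

// Placeholder pages, for illustration only.
const samplePages: TreePage[] = [
  { flattenedPath: "about", title: "About" },
  { flattenedPath: "docs/getting-started", title: "Getting started" },
  { flattenedPath: "blog/first-post", title: "First post" },
];
```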
Misc
- ContentLayer references? ✅ this https://github.com/contentlayerdev/contentlayer/issues/86 answers that
Motivation & Content: SCQH
Situation:
- Want to post up stuff (e.g. updates re DataHub Pages, even Leo stuff) => need somewhere => most natural place is datahub.io blog
- Have existing site datahub.io with good traffic and existing blog etc
- Have existing content on datahub.io
- docs: separate repo: https://github.com/datopian/datahub-content
- collections: from https://github.com/datasets/awesome-data (IIRC)
- blog: https://github.com/datopian/frontend/tree/master/blog
- Have a new approach to building data literate sites
- And it needs to be tried out
- cloudflare allows us to rapidly layer in new urls using cloudflare workers (or could do at nginx level of the frontend routing k8s)
- Or we have the nginx reverse proxy running the k8s app
Complication:
- If we create a separate site we need to drive traffic to it (vs using datahub.io) => would like to use datahub.io
- But we don't want to replace whole of datahub.io at once (quite a lot running there)
- DataHub.io has a blog, docs and collections content already (that is quite inactive atm)
- ❗ current datahub.io blog heavily coupled into old datahub frontend ❌
Question: how could we build a new site for the new DataHub (Pages) that potentially takes over / replaces url space on datahub.io?
H: let's create a new frontend site for datahub.io where we "eat our own dogfood" in the sense of using our new framework to build the site
- new nextjs based data literate content site
- "take over" (incrementally) the content pages on datahub.io starting with the blog and the front page …
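The incremental takeover idea can be sketched as a routing decision. This is a hedged illustration only: it is not an actual Cloudflare Worker or nginx config, and `Upstream`, `TakeoverRules`, `takeoverRules` and `chooseUpstream` are hypothetical names. It assumes the blog and the front page are the first urls taken over, as the hypothesis above suggests, and that everything else stays with the existing frontend.

```typescript
// Which backend should serve a given datahub.io path.
type Upstream = "new-site" | "legacy-frontend";

// Hypothetical takeover rules: exact paths plus whole url sections.
interface TakeoverRules {
  exact: string[];
  prefixes: string[];
}

// Assumed starting point: front page and blog go to the new site.
const takeoverRules: TakeoverRules = {
  exact: ["/"],
  prefixes: ["/blog"],
};

// Decide per request; a proxy layer would forward to the chosen upstream.
function chooseUpstream(pathname: string, rules: TakeoverRules): Upstream {
  if (rules.exact.indexOf(pathname) !== -1) {
    return "new-site";
  }
  // Match "/blog" and "/blog/...", but not unrelated paths like "/blogger".
  const inTakenSection = rules.prefixes.some(
    (prefix) => pathname === prefix || pathname.indexOf(prefix + "/") === 0
  );
  return inTakenSection ? "new-site" : "legacy-frontend";
}
```

Taking over another section would then be a matter of adding one entry to `prefixes`, which keeps each step small and easy to revert.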