DataHub Next standup

Agenda

Blockers and updates
- Rufus: conversation with David Gasquez
  - Pushed me to draw an architecture diagram that may be helpful https://app.excalidraw.com/l/9u8crB2ZmUo/RQX7IPEY2G 👈
Reminder of our goal: …
- Where are we at with loading remote files?
AOB
- 🐛 doubled headings on datahub.io/notes e.g. https://datahub.io/notes/markdown-pipeline
- Explain the options for a Flowershow-like experience, i.e. "Pages"-like or "Cloud"-like

Next steps

Ola: create a new render pipeline for content using new markdown pages/test/[...slug].ts
Make a choice to try vercel if edge does not seem likely
Joao: deploy on vercel

Re markdown pipelines

https://datahub.io/notes/markdown-pipeline - this contains a write-up of most of our work so far.

Goals

Can run against content pulled remotely (not on local disk)
Can easily enhance (what does that mean?)

Questions

What plugins currently work w/o filesystem ✅2023-03-09 almost all of them, maybe not wiki-link-plus
What works with the V8 runtime rather than the Node runtime? 🚧2023-03-09 unlikely to work given experience so far
What does a test for working look like for this pipeline?

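function testIt() {
  // `parse` is the pipeline entry point and `document` a markdown string (both placeholders)
  const out = parse(document)
  // TODO: need to test the render in some way, e.g. assert on `out` against an expected result
}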
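
One possible shape for such a test, as a rough sketch rather than the actual pipeline: markdown comes from an injected loader (so nothing touches the local disk), goes through a `parse` step, and the test asserts on the result. `Loader`, `memoryLoader`, `Parsed`, `parse` and `loadAndParse` are all hypothetical stand-ins; the toy `parse` only extracts the first heading and is not the real markdown pipeline.

```typescript
// Hypothetical: a loader returns raw markdown for a path. The pipeline never
// needs to know whether it came from disk, memory, or a remote URL.
type Loader = (path: string) => Promise<string>;

// In-memory loader, standing in for a remote source in tests.
function memoryLoader(files: Record<string, string>): Loader {
  return async (path: string): Promise<string> => {
    const content = files[path];
    if (content === undefined) throw new Error(`not found: ${path}`);
    return content;
  };
}

interface Parsed {
  title: string | null;
  body: string;
}

// Toy stand-in for the real pipeline: it only ever sees a string.
function parse(markdown: string): Parsed {
  const match = /^# (.+)$/m.exec(markdown);
  const title = match?.[1]?.trim() ?? null;
  return { title, body: markdown };
}

async function loadAndParse(load: Loader, path: string): Promise<Parsed> {
  return parse(await load(path));
}

// The test itself: no filesystem access anywhere.
async function testIt(): Promise<void> {
  const load = memoryLoader({ "notes/hello.md": "# Hello\n\nSome text." });
  const out = await loadAndParse(load, "notes/hello.md");
  if (out.title !== "Hello") throw new Error(`unexpected title: ${out.title}`);
}
```

Because the loader is injected, the same `loadAndParse` could later be handed a loader that fetches from a remote repo, and a plugin that needs the filesystem would show up as a failure in this kind of test.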

Re MarkdownDB

Goal: we can replace contentlayer for local stuff.

Questions

Ola's questions

the upgrade of Flowershow template in datahub-next could probably be even cleaner, i.e. we could get rid of some more components that are currently included in @flowershow/core, but it would require a bit more time so I decided to first confirm if this is even needed ✅2023-03-09 maybe … if very fast and clear benefits
- How much time? ✅2023-03-09 around 2h
- What would be the benefits? any bugs fixed and latest improvements to components installed with latest versions of core, no need to copy-paste stuff
- Q: are we going to use @flowershow/core as a dependency in the future? ✅2023-03-09 probably yes? is there a reason we wouldn't?
  - Why / Why not?
    - not sure, I thought we were considering somehow merging the two? Also, it depends on how much we want to customize the stuff we import from core. If we need to customize components significantly we will end up copy-pasting and adjusting code anyway. If only slight modifications are needed, then this can be done on the @flowershow/core side, i.e. by extending the component API etc.
- it would require some slight refactoring as the custom layout was built using an old component (in the current Flowershow template we're importing it from core but here we need to use a custom one)
- also, some custom datahub components are using components from the original template code (which are now included in the core package) and I thought maybe it's better to leave them, as this way we are actually able to customize them (e.g. Card.jsx used in CollectionItem)

João's questions

Are we going to support querying based on frontmatter fields? ✅2023-03-10 yes
Are tags frontmatter fields? ✅2023-03-10 i would split them out to their own table
- What is a tag?
  - Obsidian: Tags are keywords or topics that help you quickly find the notes you want.
  - In Obsidian, tags are defined in the metadata, so in our case they would end up in the frontmatter field. We could probably use the "LIKE" operator to select rows based on tags listed in the frontmatter field, but I think it would be better if tags had their own DB field.
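
To make the "tags in their own table" idea concrete, here is a small in-memory sketch. It assumes tags arrive as a `tags` array in frontmatter, as described above; the row shapes (`FileRow`, `TagRow`) and the helpers `splitTags` and `pathsWithTag` are hypothetical and not MarkdownDB's actual schema.

```typescript
// Hypothetical input: one parsed file with its frontmatter.
interface ParsedFile {
  path: string;
  frontmatter: Record<string, unknown>;
}

// Hypothetical rows: one per file, and one per (file, tag) pair.
interface FileRow {
  path: string;
  frontmatter: string; // JSON-encoded, with tags removed
}
interface TagRow {
  path: string;
  tag: string;
}

function splitTags(files: ParsedFile[]): { fileRows: FileRow[]; tagRows: TagRow[] } {
  const fileRows: FileRow[] = [];
  const tagRows: TagRow[] = [];
  for (const file of files) {
    const { tags, ...rest } = file.frontmatter;
    fileRows.push({ path: file.path, frontmatter: JSON.stringify(rest) });
    if (Array.isArray(tags)) {
      // Keep string tags only, and de-duplicate so a repeated tag yields one row.
      const unique = tags.filter(
        (t, i): t is string => typeof t === "string" && tags.indexOf(t) === i
      );
      for (const tag of unique) tagRows.push({ path: file.path, tag });
    }
  }
  return { fileRows, tagRows };
}

// Querying by tag becomes an exact match instead of LIKE on a text blob.
function pathsWithTag(tagRows: TagRow[], tag: string): string[] {
  return tagRows.filter((row) => row.tag === tag).map((row) => row.path);
}
```

With a separate table, a tag such as `data` no longer accidentally matches `database`, which is the main risk of a `LIKE`-based lookup on the raw frontmatter.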

What does obsidian have?

Read https://datahub.io/notes/obsidian

https://flowershow.app/notes/obsidian-database-research (read it raw in the repo 😬)

class MarkdownFile extends File {

  links
  tags
  frontmatter
  metadata
}
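
A slightly more concrete version of this shape, as a sketch only: the notes above name the fields but not their types, so every type here is a guess, and it uses a plain interface rather than extending `File`. `extractWikiLinks` and `toMarkdownFile` are hypothetical helpers, and `metadata.size` is a placeholder for whatever file-level info ends up there.

```typescript
// Hypothetical field types; only the field names come from the notes.
interface MarkdownFile {
  path: string;
  links: string[]; // targets of [[wiki links]] found in the body
  tags: string[]; // taken from frontmatter in this sketch
  frontmatter: Record<string, unknown>;
  metadata: { size: number }; // placeholder for file-level info
}

// Collect link targets from [[Target]] and [[Target|alias]] in a markdown body.
function extractWikiLinks(body: string): string[] {
  const links: string[] = [];
  const pattern = /\[\[([^\]|]+)(?:\|[^\]]*)?\]\]/g;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(body)) !== null) {
    const target = match[1];
    if (target !== undefined) links.push(target.trim());
  }
  return links;
}

function toMarkdownFile(
  path: string,
  body: string,
  frontmatter: Record<string, unknown>
): MarkdownFile {
  const rawTags = frontmatter["tags"];
  const tags = Array.isArray(rawTags)
    ? rawTags.filter((t): t is string => typeof t === "string")
    : [];
  return {
    path,
    links: extractWikiLinks(body),
    tags,
    frontmatter,
    metadata: { size: body.length },
  };
}
```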

Plan (Rufus)

Close out ../projects/product-plan-2023

Scratch (Rufus)

something about the decision tree for what you want e.g. if i want custom javascript components then …, if i want server side routing then …

Why? helps explain our choices.

e.g. datahub pages makes sense if xxx

../notes/content-monorepos idea again

Summary: obsidian is already so awesome locally. what is the add for us? why do you want to publish? or why do you want cloud?

Possible answers:

Collaboration in a team (or shared source of truth e.g. like a common github repo)
Scale: e.g. you grow to something where cloud is easier
Asset storage and processing
- e.g. images
- e.g. data - both storage and stuff like APIs
Workflows: this relates to previous point. 🔥 personally think this is the strongest item.

Came out of a train of thought after noticing:

https://www.reddit.com/r/ObsidianMD/comments/11ma41d/tables_in_obsidian/
- https://forum.obsidian.md/t/csv-backed-data-table-idea/645/3 - someone moving off notion to obsidian and thinking about csv backed table

Aside: notice others already looking at obsidian and markdown as alternative to notion. i think this direction of travel is totally right. notes/markdown-is-eating-the-world

dropbase.io

Source: https://forum.obsidian.md/t/csv-backed-data-table-idea/645/5 (see previous item)

#links https://www.dropbase.io/ - just raised a $1.75M VC round in Feb 2023. Seems related to ../notes/data-import-tooling and e.g. flatfile etc.

We Raised a $1.75M Round Led by Gradient Ventures to Help Companies Unlock CSV/Excel Data and Automate Data Sharing Between Incompatible Systems

More about it:

Dropbase is a collaborative data import and data management platform. Dropbase helps companies import, validate, and manage all their data from CSV and Excel files inside cloud databases optimized to handle large amounts of data, with no technical help required. We’ll use the funding to expand our data platform capabilities and continue closing the gap between spreadsheets and analytical databases.