Git DMS
Vision
🚚 MOVED to datahub-next/notes/vision
Refinements to vision for v1
- GitHub first (rather than command line), i.e. push to GitHub (+ storage?) and then connect to DataHub
- Why?
- eliminates meta storage, (some) user management and access control
- piggybacks off the popularity of GitHub ("turn GitHub into a datahub")
- follows where Vercel/Cloudflare have gone (i.e. publishing directly from GitHub), which indicates the best UX and demand
- provides versioning etc.
- most of our users are already using GitHub
- Cloud first (ignore self-publishing). Why? a) easier UX for avg user b) this is what you can monetize
- Standard presentation: rather than a full data-driven website that you can modify as in DataHub Pages. Why? b/c that makes for faster build, upgrading and better UX
- On DataHub.io (not subdomains): publish on datahub.io, e.g. datahub.io/@xyz/abc, rather than on a subdomain
- Data API comes later: do we provide a data API from the get-go? ✅2023-02-24 No, we will add that in v0.2 b/c it is a bit complex to do. That does mean we want a "direct loading" approach for the data explorer to start with
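A minimal sketch of what "direct loading" could look like, assuming it means the explorer reads the data file straight from the GitHub repo and parses it itself rather than querying a Data API. The function names, the `main` branch and the `data.csv` path are hypothetical; only the `raw.githubusercontent.com` URL pattern is GitHub's own.

```python
import csv
import io


def raw_github_url(owner: str, repo: str, branch: str, path: str) -> str:
    """Build the raw-content URL GitHub serves for a file in a public repo."""
    return f"https://raw.githubusercontent.com/{owner}/{repo}/{branch}/{path}"


def parse_csv(text: str) -> list[dict[str, str]]:
    """Parse CSV text into a list of row dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


# Hypothetical dataset location; an explorer would fetch the text from this URL.
url = raw_github_url("myusername", "my-dataset", "main", "data.csv")

# Inline sample standing in for the fetched file contents (assumption).
rows = parse_csv("country,population\nFR,68\nDE,84\n")
```

The trade-off is that the whole file is downloaded and parsed by the client, which is simple to ship but only suits small files; that limit is presumably part of what a later Data API would address.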
Job Stories version
When I have a data-literate document I want to share it with colleagues or the world so they can quickly explore or use it
When I have a dataset I want to share/showcase it with colleagues or the world in a useable way so they can quickly explore or use it
- Great "Data Developer eXperience" (DDX): clean, smooth and easy experience. bonus for checks / corrections e.g. data validation.
- [Focus] Deploy a dataset on GitHub and it gets a published URL online that you can share with others
- [Future] Can deploy a local dataset to a URL online so that you can share it with others, e.g. running `cd my-dataset` followed by `data deploy` results in a showcase page at https://datahub.io/@myusername/my-dataset (or, more Vercel-like, https://my-dataset.myusername.datahub.io/)
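The two showcase URL shapes mentioned above can be compared with a small sketch. The helper names are hypothetical and this is only an illustration of the two schemes, not an existing DataHub API.

```python
def path_style_url(username: str, dataset: str) -> str:
    """Path-style URL on the main domain (the v1 choice in these notes)."""
    return f"https://datahub.io/@{username}/{dataset}"


def subdomain_style_url(username: str, dataset: str) -> str:
    """Vercel-like alternative: one subdomain per dataset and user."""
    return f"https://{dataset}.{username}.datahub.io/"


path_url = path_style_url("myusername", "my-dataset")
subdomain_url = subdomain_style_url("myusername", "my-dataset")
```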
Milestones & OKRs
OKRs
- Active Users.
- Metric: MAUs
- Launch + 3m: 100 active users
- Revenue generation:
- Metric: subscribers.
- Key result: M6. 10 paying users.
- DataHub has 100s of paying customers using it to publish, present and share data.
- Publish our own data so that we generate traffic and can try out DaaS
- Metric: datasets from Datopian/DataHub team on DataHub.
- Key result: Launch + 1m. 50 core datasets on datahub v3 by end of March e.g. at datahub.io/@datasets/xyz
Features, Flows and Job Stories
To Process
- https://coggle.it/diagram/XwM2fshG0AglV4rq/t/datahub-job-stories-focused-on-the-power-users (Feb 2021) - list of job stories in coggle form
- [From DataHub Pages] There are many ways to schedule features into a roadmap and this will probably change rapidly. Thus, rather than set out an explicit sequence see the full feature list: https://coggle.it/diagram/YgpxWTL-yUb82LfW/t/datahub-pages-feature-tree-push-to-github-have-graphs-etc [#todo migrate to a spreadsheet]
Questions
- ➕2023-02-26 How do we test / dev deployment?
- What is "Kubernetes for Cloudflare Workers"? i.e. how do you orchestrate deployments? 🚧2023-02-26 Cloudflare has what are called services: https://developers.cloudflare.com/workers/learning/using-services/ (introduction post: https://blog.cloudflare.com/introducing-worker-services/)
Appendix: Business model 30%
Monetize via a freemium model where features like the Data API are paid for, plus an enterprise version.
What could people pay for
- Data API
- private stuff
- Teams
- Build minutes
NB: can build on / reuse last year's work for DataHub Pages, as it is quite similar, especially:
- pages#SCQH - Hypothesis and its coggle
- Business plan section in spreadsheet in drive