DataHub Cloud Design
- Features
- Architecture …
- User journeys
Features
Essential
- Showcase page for a dataset or data story
- Nice layout
- Nice tables
- (Later?) List of data files
- (Later?) Visualizations
- Single-site URLs, i.e. `datahub.io/@{team}/{project}`
- Editor guide (assuming the editing flow happens outside our app, e.g. via GitHub)
- Scales, i.e. works for a reasonable number of users ⏭️ convert to authenticated requests
- Landing page and other marketing materials
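The single-site URL scheme above could be routed with a small parser along these lines. This is only a sketch: the slug rule (lowercase letters, digits, hyphens) and the names `SiteRoute` and `parse_site_path` are assumptions for illustration, since the allowed characters for team and project names are not decided here.

```python
import re
from typing import NamedTuple, Optional

# Hypothetical slug rule: lowercase letters, digits and hyphens.
# The design does not yet define which characters are allowed.
_SITE_PATH = re.compile(
    r"^/@(?P<team>[a-z0-9-]+)/(?P<project>[a-z0-9-]+)(?P<rest>/.*)?$"
)


class SiteRoute(NamedTuple):
    team: str
    project: str
    page: str  # path inside the project site, "" for the project root


def parse_site_path(path: str) -> Optional[SiteRoute]:
    """Split a request path such as /@team/project/some/page into its parts."""
    match = _SITE_PATH.match(path)
    if match is None:
        return None  # not a user site URL (e.g. landing or marketing pages)
    rest = match.group("rest") or ""
    return SiteRoute(match.group("team"), match.group("project"), rest.strip("/"))
```

Returning `None` for anything that does not start with `/@` leaves the rest of the URL space free for the landing page and other marketing materials.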
Nice to have
- Catalog
Next
- Own domain
- Private repos
Backlog
- User site configurations e.g. nav items, logo, site title, and maybe custom CSS
- Large files
- Data API
- Drag-and-drop creation from a data file
- Built-in editor (maybe)
- Issues integration?
Questions:
- what is the plan/sequence of implementation?
- what do we want by the beginning of March for Open Data Day?
Parking lot:
- it would be good to add support for relative paths to dataset files
- what is a "nice layout"?
  - for now, similar to the current datahub.io dataset pages
  - navigation components (nav, footer, ToC, sidebar with sitemap)
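For the relative-paths item above, one possible approach is to resolve each reference against the location of the Markdown page that contains it. The sketch below assumes repo-relative POSIX paths; the function name `resolve_dataset_path` and the choice to reject paths that escape the repository root are illustrative assumptions, not agreed behaviour.

```python
import posixpath
from urllib.parse import urlsplit


def resolve_dataset_path(page_path: str, reference: str) -> str:
    """Resolve a file reference found in a page to a repo-relative path.

    page_path is the repo-relative path of the Markdown file,
    e.g. "docs/README.md". Absolute URLs are returned unchanged.
    """
    parts = urlsplit(reference)
    if parts.scheme or parts.netloc:
        return reference  # already an absolute URL, nothing to resolve
    if reference.startswith("/"):
        # Assumption: a leading slash means "relative to the repo root".
        candidate = reference.lstrip("/")
    else:
        candidate = posixpath.join(posixpath.dirname(page_path), reference)
    resolved = posixpath.normpath(candidate)
    if resolved in (".", "..") or resolved.startswith("../"):
        raise ValueError(f"reference escapes the repository: {reference!r}")
    return resolved
```

For example, `resolve_dataset_path("docs/README.md", "../data/gdp.csv")` yields `data/gdp.csv`, which the renderer could then map to wherever the file content is stored.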
User journeys
- Project creation and management
- Create project
- Edit project
- Delete
- Dashboard
- View project
Workflows
- Data storage: data storage, indexing and caching
- Renderer: user site styles, layouts, and configurations thereof, data catalogs, preview components etc.
- Editor: built-in WYSIWYG Markdown editor with support for PortalJS components and dataset uploads
Architecture
Caching and cache invalidation strategies (preventing hitting GitHub API rate limits)
See #145
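One strategy that might fit here (not necessarily what #145 proposes) is a short TTL cache combined with ETag revalidation: fresh entries are served without any request, and stale ones are revalidated with a conditional request. GitHub's documentation on conditional requests indicates that a `304 Not Modified` response to an authorized request does not count against the primary rate limit. The class name `RevalidatingCache`, the injected `fetch` callable, and the default TTL are all assumptions made for illustration.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple

# (status, etag, body) returned by the injected fetcher. A real fetcher would
# call the GitHub API and send If-None-Match when an ETag is supplied.
FetchResult = Tuple[int, Optional[str], Optional[bytes]]
Fetcher = Callable[[str, Optional[str]], FetchResult]


@dataclass
class _Entry:
    body: bytes
    etag: Optional[str]
    fetched_at: float


class RevalidatingCache:
    def __init__(self, fetch: Fetcher, ttl_seconds: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, _Entry] = {}

    def get(self, url: str) -> bytes:
        entry = self._entries.get(url)
        now = self._clock()
        if entry is not None and now - entry.fetched_at < self._ttl:
            return entry.body  # fresh: no upstream request at all
        status, etag, body = self._fetch(url, entry.etag if entry else None)
        if status == 304 and entry is not None:
            entry.fetched_at = now  # unchanged upstream: keep the cached body
            return entry.body
        if status == 200 and body is not None:
            self._entries[url] = _Entry(body, etag, now)
            return body
        raise RuntimeError(f"unexpected status {status} for {url}")

    def invalidate(self, url: str) -> None:
        """Drop an entry, e.g. when something signals that the file changed."""
        self._entries.pop(url, None)
```

Injecting the fetcher and the clock keeps the cache testable without network access; explicit `invalidate` calls could later be wired to whatever change signal we end up using.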
Data indexing
We need to keep an up-to-date index of all user files (generated with our mddb package).
👍 yes, this is the content store above …
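Keeping the index up to date does not have to mean re-reading every file. A rough sketch of an incremental refresh is shown below: compare the content hashes already in the index with the latest repository listing and touch only what changed. This is not the mddb API; `diff_index` and `IndexDiff` are hypothetical names, and using Git blob SHAs as the hash is an assumption.

```python
from typing import Dict, List, NamedTuple


class IndexDiff(NamedTuple):
    to_index: List[str]   # new or changed files whose content must be re-read
    to_remove: List[str]  # files that no longer exist in the repository


def diff_index(indexed: Dict[str, str], current: Dict[str, str]) -> IndexDiff:
    """Compare two {path: content_hash} maps.

    `indexed` is what the index currently holds; `current` is the latest
    listing of the repository (for Git, the blob SHA can serve as the hash).
    """
    to_index = sorted(p for p, sha in current.items() if indexed.get(p) != sha)
    to_remove = sorted(p for p in indexed if p not in current)
    return IndexDiff(to_index, to_remove)
```

Only the paths in `to_index` need their content fetched, which also helps keep the number of GitHub API calls down.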
Support for private repositories
This can be done, and in terms of hitting GH rate limits it is no different from public repos, since we are going to switch to authenticated requests only (using user tokens) anyway.
Questions:
- is it OK to use users' GH tokens to fetch their content when rendering their pages for other people? (Ola) I guess so, if we clearly state it in the terms of service.
Are there other approaches here?
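To make the token-based approach concrete, here is a minimal sketch of building an authenticated request for a file in a (possibly private) repository via GitHub's repository contents endpoint. The function name `build_contents_request` is hypothetical, and where the user token is stored and how it is looked up are left open; the request is only constructed here, not sent.

```python
from urllib.parse import quote
from urllib.request import Request

GITHUB_API = "https://api.github.com"


def build_contents_request(owner: str, repo: str, path: str, token: str) -> Request:
    """Build (but do not send) a request for a file's raw content."""
    # quote() keeps "/" in the path and escapes spaces and other characters.
    url = f"{GITHUB_API}/repos/{quote(owner)}/{quote(repo)}/contents/{quote(path)}"
    headers = {
        # The user's token stays server-side; visitors never see it.
        "Authorization": f"Bearer {token}",
        # Ask for the raw file body instead of the JSON wrapper.
        "Accept": "application/vnd.github.raw+json",
        "X-GitHub-Api-Version": "2022-11-28",
    }
    return Request(url, headers=headers)
```

The same request shape works for public and private repositories, which matches the point above that the two cases do not differ once all requests are authenticated.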
Built-in editor
We need some …
User site layout improvements
Main focus: dataset-type pages for now (not data stories), making them look similar to the datasets available at https://datahub.io/search
Appendix: Template for Shaping
## Summary
*1-3 sentences for each item*
- Problem
- Appetite
- Solution
- Rabbit holes
## Problem
## Solution
## Rabbit holes
## No gos
## Appendix e.g. alternative solutions