Appendix: Cloudflare materials

https://developers.cloudflare.com/reference-architecture/diagrams/serverless/serverless-etl/ (archived copy)

This diagram describes our architecture and needs almost exactly.

Figure 2: Serverless: Object storage ingest

Question tree

  • What are examples of the kind of processing we'd want to do in Flowershow/DataHub?
  • What advantages / disadvantages does using Cloudflare have over our existing approach using e.g. Inngest?
  • What would be the architecture we would use?
  • What actual code examples / demos are there we can draw on?

The Concept

Our needs

  • Copy files from GitHub into R2 (currently handled by Inngest)
  • Build a metastore, i.e. an index of those files, likely with additional metadata
  • Build other things, e.g. full text search

Consumption needs

  • Get me a list of files that are blog posts
  • Get me for those blog posts their title, description and image
  • Get me pages that match these text search criteria provided by a user
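
To make these three consumption needs concrete, here is a minimal sketch of them as queries over an in-memory index. The record shape (`FileRecord`) and every field name in it are assumptions for illustration, not a decided metastore schema, and the search is a naive substring match rather than real full text search.

```typescript
// Hypothetical shape of one metastore record; all field names are assumptions.
interface FileRecord {
  path: string;          // key of the file in R2
  type?: string;         // e.g. "blog", assumed to come from frontmatter
  title?: string;
  description?: string;
  image?: string;
  text: string;          // plain-text body used for searching
}

// "Get me a list of files that are blog posts"
function listBlogPosts(index: FileRecord[]): FileRecord[] {
  return index.filter((r) => r.type === "blog");
}

// "Get me for those blog posts their title, description and image"
function blogSummaries(index: FileRecord[]) {
  return listBlogPosts(index).map(({ path, title, description, image }) => ({
    path,
    title,
    description,
    image,
  }));
}

// "Get me pages that match these text search criteria provided by a user"
// Naive version: every whitespace-separated term must appear in title or body.
function search(index: FileRecord[], query: string): FileRecord[] {
  const terms = query.toLowerCase().split(/\s+/).filter(Boolean);
  if (terms.length === 0) return [];
  return index.filter((r) => {
    const haystack = `${r.title ?? ""} ${r.text}`.toLowerCase();
    return terms.every((t) => haystack.includes(t));
  });
}
```

Whatever the metastore ends up being, it needs to answer at least these three queries cheaply.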

How to actually do this

Roughly we need …

  • A workflow to get stuff into R2
  • Workflow(s) to process the files once they are in R2

How do we trigger workflows from events in R2?
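
One option to investigate is R2 event notifications, which publish a message to a Cloudflare Queue when objects in a bucket are created or deleted; a consumer Worker then picks up the messages. The sketch below shows how such a consumer could route events. The message fields and action names reflect our reading of the Cloudflare docs and should be checked against them; the local `QueueMessage` / `QueueBatch` interfaces are simplified stand-ins for the Workers runtime types, and `planStep` / `handleBatch` are hypothetical names.

```typescript
// Assumed shape of an R2 event notification message body (verify against the docs).
interface R2EventMessage {
  account: string;
  bucket: string;
  action: string; // e.g. "PutObject", "DeleteObject"
  object: { key: string; size?: number; eTag?: string };
  eventTime: string;
}

// Simplified stand-ins for the Workers Queues runtime types.
interface QueueMessage<T> {
  body: T;
  ack(): void;
  retry(): void;
}
interface QueueBatch<T> {
  messages: readonly QueueMessage<T>[];
}

interface Step {
  kind: "index" | "remove" | "ignore";
  key: string;
}

// Decide what processing an event should trigger.
function planStep(e: R2EventMessage): Step {
  switch (e.action) {
    case "PutObject":
    case "CopyObject":
    case "CompleteMultipartUpload":
      return { kind: "index", key: e.object.key };
    case "DeleteObject":
    case "LifecycleDeletion":
      return { kind: "remove", key: e.object.key };
    default:
      return { kind: "ignore", key: e.object.key };
  }
}

// Process each message; acknowledge on success, ask for redelivery on failure.
async function handleBatch(
  batch: QueueBatch<R2EventMessage>,
  process: (step: Step) => Promise<void>,
): Promise<void> {
  for (const msg of batch.messages) {
    try {
      await process(planStep(msg.body));
      msg.ack();
    } catch {
      msg.retry();
    }
  }
}
```

In a real Worker, `handleBatch` would be called from the `queue` handler of the default export, with `process` updating the metastore or search index.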

FAQs

Prompts for code design

Design the layout on R2
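
As a starting point for this prompt, a minimal sketch of one possible key layout: raw synced files under one prefix and derived artifacts (the index) under another, both scoped by project and branch. The prefixes `raw/` and `_meta/`, the `Layout` type, and all function names are assumptions for illustration, not a decided design.

```typescript
// Hypothetical layout: <projectId>/<branch>/raw/<path in repo>
//                      <projectId>/<branch>/_meta/index.json
interface Layout {
  projectId: string;
  branch: string;
}

function prefix(l: Layout): string {
  return `${l.projectId}/${l.branch}`;
}

// Key under which a file synced from the repo would be stored.
function rawKey(l: Layout, repoPath: string): string {
  const clean = repoPath.replace(/^\/+/, ""); // drop leading slashes
  return `${prefix(l)}/raw/${clean}`;
}

// Key of the metastore index for this project and branch.
function indexKey(l: Layout): string {
  return `${prefix(l)}/_meta/index.json`;
}

// Inverse of rawKey: recover the repo path, or null if the key is not a raw file.
function repoPathFromRawKey(l: Layout, key: string): string | null {
  const p = `${prefix(l)}/raw/`;
  return key.startsWith(p) ? key.slice(p.length) : null;
}
```

Keeping derived data under a separate prefix means a processing workflow can tell raw uploads apart from its own output and avoid re-triggering itself.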

Create a Cloudflare Worker that syncs from GitHub to R2
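
A minimal sketch of the core of such a sync, assuming a public repository: list the files with the GitHub Git trees API, fetch each one from raw.githubusercontent.com, and write it to the bucket. `syncRepoToBucket`, `BucketLike` and `FetchLike` are hypothetical names; `BucketLike` only mirrors the `put(key, value)` call of an R2 bucket binding, and `fetchFn` is injected so the logic can be exercised outside a Worker. It does a full copy on every run, with no diffing, deletion handling, authentication or rate-limit handling.

```typescript
// Minimal stand-ins so the sketch does not depend on Workers runtime types.
interface BucketLike {
  put(key: string, value: ArrayBuffer | string): Promise<unknown>;
}
type FetchLike = (
  url: string,
  init?: { headers?: Record<string, string> },
) => Promise<{
  ok: boolean;
  status: number;
  json(): Promise<unknown>;
  arrayBuffer(): Promise<ArrayBuffer>;
}>;

interface TreeEntry {
  path: string;
  type: string; // "blob" for files, "tree" for directories
  sha: string;
}

async function syncRepoToBucket(opts: {
  owner: string;
  repo: string;
  ref: string;    // branch name or commit SHA
  prefix: string; // key prefix in the bucket (an assumption about layout)
  bucket: BucketLike;
  fetchFn: FetchLike;
}): Promise<string[]> {
  const { owner, repo, ref, prefix, bucket, fetchFn } = opts;
  // The GitHub API rejects requests without a User-Agent header.
  const headers = { "User-Agent": "r2-sync-sketch", Accept: "application/vnd.github+json" };

  // Recursive tree listing; very large repos come back truncated (not handled here).
  const treeUrl = `https://api.github.com/repos/${owner}/${repo}/git/trees/${ref}?recursive=1`;
  const res = await fetchFn(treeUrl, { headers });
  if (!res.ok) throw new Error(`tree listing failed: ${res.status}`);
  const { tree } = (await res.json()) as { tree: TreeEntry[] };

  const written: string[] = [];
  for (const entry of tree) {
    if (entry.type !== "blob") continue; // skip directories
    const encoded = entry.path.split("/").map(encodeURIComponent).join("/");
    const raw = await fetchFn(
      `https://raw.githubusercontent.com/${owner}/${repo}/${ref}/${encoded}`,
      { headers },
    );
    if (!raw.ok) throw new Error(`fetch failed for ${entry.path}: ${raw.status}`);
    const key = `${prefix}/${entry.path}`;
    await bucket.put(key, await raw.arrayBuffer());
    written.push(key);
  }
  return written;
}
```

Inside a Worker this would be called with the global `fetch` and the R2 bucket binding from `env`, triggered for example by a GitHub webhook or a cron schedule.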

Appendix: current materials

What does the MetaStore look like?

This is the main question …

And how is it accessed …

Algorithm for the home page when the content is a dataset

See ../pitchs/2410-metadata-store#Appendix Current Architecture

