Architecture¶
High-level view of how the pieces fit together. For a detailed inventory of every library, framework, and service in the stack, see Stack.
Main components¶
LocaScore is split into seven logical components. Each has a single responsibility and talks to the others through well-defined interfaces.
1. Data pipeline (code/)¶
What: Python code that runs locally on the developer's machine.
Purpose: Ingest raw data from public sources, compute KPIs per H3 cell,
produce the KV-ready JSON batches.
Inputs: SITG (GDB files), OSM (via osmnx), OFEV (rasters), OCSTAT
(hardcoded), GTFS (via r5py), curated JSON (international schools).
Output: output/geneva_kpis_by_h3.parquet + output/kv_export/kv_batch_*.json.
Runs: On-demand (manual), not in production. Takes ~1.5 minutes.
See Data pipeline for the full step-by-step.
2. Data store (Cloudflare KV)¶
What: Cloudflare's eventually-consistent key-value store,
accessed from the Worker at the edge.
Purpose: Serve 17,097 pre-computed cells to the Worker with
sub-10ms latency globally.
Schema: Key = H3 cell ID (string), value = JSON-serialized KPI object.
Size: ~60 MB total.
Updated: Via wrangler kv bulk put whenever the pipeline re-runs.
Critical property: The H3 cell ID is never exposed in any API
response — it's the key, not a field. See Backend.
3. Worker API (worker/)¶
What: Cloudflare Worker — TypeScript code running on V8 isolates
at Cloudflare's edge.
Purpose: Thin API layer. Verifies JWT, rate-limits, reads KV, calls
Supabase/Stripe, returns JSON.
Stateless: Every request is independent. No in-memory session state.
Routes: /api/teaser, /api/report/unlock, /api/reports/*,
/api/checkout, /api/feedback, /api/email-capture, etc.
See Backend for the full route table.
4. Frontend (frontend/)¶
What: Vite + React 19 single-page application.
Purpose: User interface. Address search, teaser view, report view,
auth, purchase flow, saved reports, comparison.
Built: npm run build → static files in dist/.
Served: Cloudflare Pages (CDN + HTTPS + custom domain).
Talks to: Worker API (Supabase-auth'd), Supabase client directly
(for auth state), Mapbox (maps), Swiss Federal API (geocoding).
See Frontend.
5. User database (Supabase Postgres)¶
What: Managed Postgres with auth, RLS, and REST (PostgREST).
Purpose: Users, sessions, purchases, saved reports, feedback, email
captures.
Access pattern: The Worker uses the service role key (bypasses RLS).
The frontend uses the anon key for auth flows only — user data is read
back through the Worker, not directly via PostgREST.
Tables: profiles, purchases, reports, feedback,
email_captures. Plus 9 migrations adding columns, RPCs, views.
See Backend.
6. Payment processor (Stripe)¶
What: Hosted Stripe Checkout sessions.
Purpose: Process CHF payments for report tokens (single, pack3, pack10).
TWINT enabled (Swiss mobile payments).
Flow: Worker creates a session → frontend redirects user → Stripe
hosts the payment UI → Stripe webhooks back to the worker on success.
Idempotency: process_purchase() RPC uses stripe_session_id as a
natural key to prevent double-crediting on webhook retries.
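The idempotency rule can be illustrated with a small in-memory sketch. The real logic lives in the process_purchase() Postgres RPC, whose body is not shown in this document; the `Ledger` type, the `processPurchase` function, and the tokens-per-tier mapping below are assumptions made for illustration only.

```typescript
// In-memory stand-in for the purchases and profiles tables (hypothetical).
type Tier = "single" | "pack3" | "pack10";

// Assumed token counts, inferred from the tier names and not confirmed.
const TOKENS_PER_TIER: Record<Tier, number> = { single: 1, pack3: 3, pack10: 10 };

interface Ledger {
  processedSessions: Set<string>; // stripe_session_id values already credited
  balances: Map<string, number>;  // user ID -> token balance
}

// Returns true if tokens were credited, false if the session was seen before.
function processPurchase(
  ledger: Ledger,
  userId: string,
  stripeSessionId: string,
  tier: Tier
): boolean {
  if (ledger.processedSessions.has(stripeSessionId)) {
    return false; // webhook retry: do nothing
  }
  ledger.processedSessions.add(stripeSessionId);
  const current = ledger.balances.get(userId) ?? 0;
  ledger.balances.set(userId, current + TOKENS_PER_TIER[tier]);
  return true;
}
```

In Postgres the same effect is typically obtained with a unique constraint on the session ID inside one transaction, so a retried webhook cannot credit twice even under concurrency.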
7. Map rendering (Mapbox GL JS + deck.gl)¶
What: Client-side map rendering with vector tiles + custom layers.
Purpose: Interactive maps on the report page (POI markers, H3 hex overlay).
Also static images for printable reports (future).
Why not server-side: Client-side is cheaper (no tile server), lazy-loaded only
when a user opens a map (~1.6 MB gzipped, so we don't ship it on every page).
System diagram¶
┌──────────────────────────────────────────────────────────────────────┐
│                              CLOUDFLARE                              │
│                                                                      │
│  ┌────────────┐      ┌──────────────────┐      ┌────────────────┐    │
│  │ Pages      │      │ Worker (API)     │      │ KV             │    │
│  │ (React SPA)│ ───► │ /api/teaser      │ ───► │ (17K H3 cells, │    │
│  │ Vite       │      │ /api/report      │      │ ~170 KPIs ea)  │    │
│  │ build      │      │ /api/geocode     │      │                │    │
│  │            │      │ /api/feedback    │      │                │    │
│  │            │      │ /api/email       │      │                │    │
│  │            │      │ /api/checkout    │      └────────────────┘    │
│  │            │      │ /api/unlock      │                            │
│  │            │      │ + auth middleware│                            │
│  │            │      │ + rate limiting  │                            │
│  └────────────┘      └────┬─────────────┘                            │
│                           │                                          │
└───────────────────────────┼──────────────────────────────────────────┘
                            │
                ┌───────────┼────────────┐
                │           │            │
         ┌──────▼───┐  ┌────▼─────┐   ┌──▼──────────────┐
         │ Supabase │  │ Stripe   │   │ Mapbox          │
         │ Postgres │  │ + TWINT  │   │ (maps)          │
         │ + Auth   │  │          │   │                 │
         └──────────┘  └──────────┘   └─────────────────┘

┌──────────────────────────────────────────────────────────────────────┐
│                     DATA PIPELINE (runs locally)                     │
│                                                                      │
│  SITG ─┐                                                             │
│  OSM ──┤─► Python pipeline ─► Parquet ─► JSON batches ─► KV upload   │
│  OFEV ─┤   (pandana, h3,                                             │
│  OCSTAT┤    geopandas)                                               │
│  GTFS ─┘                                                             │
└──────────────────────────────────────────────────────────────────────┘
Request flow: anonymous user views teaser¶
- User lands on locascore.ch (Cloudflare Pages)
- Types an address → frontend geocodes via Swiss Federal API (api3.geo.admin.ch, no API key needed, commercial use OK)
- Frontend resolves lat/lng → navigates to /teaser/:slug?lat=...&lng=...
- TeaserPage.tsx calls GET /api/teaser?lat=...&lng=... on the Worker
- Worker converts lat/lng → H3 cell ID (server-side, never exposed)
- Worker reads cell from KV → returns a subset of fields (grades, headlines, fun facts, air quality)
- Frontend renders the teaser, shows paywall CTA
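The "subset of fields" step could look roughly like the sketch below. The field names (`grades`, `headlines`, `funFacts`, `airQuality`) mirror the list above, but the exact shape of the KV value is an assumption, and `CellRecord`, `TeaserResponse`, and `toTeaser` are hypothetical names.

```typescript
// Hypothetical shape of one cell's KV value; real field names may differ.
interface CellRecord {
  grades: Record<string, string>;
  headlines: string[];
  funFacts: string[];
  airQuality: Record<string, number>;
  [otherKpi: string]: unknown; // remaining KPIs, reserved for the full report
}

interface TeaserResponse {
  grades: Record<string, string>;
  headlines: string[];
  funFacts: string[];
  airQuality: Record<string, number>;
}

// Build the teaser from an explicit allowlist, so KPIs added to the
// pipeline later are not exposed to anonymous users by default.
function toTeaser(cell: CellRecord): TeaserResponse {
  return {
    grades: cell.grades,
    headlines: cell.headlines,
    funFacts: cell.funFacts,
    airQuality: cell.airQuality,
  };
}
```

Because the H3 cell ID is the KV key rather than a field of the value, a response built this way cannot contain it.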
Request flow: authenticated user unlocks report¶
- User clicks "Unlock" on teaser → AuthCheckoutModal opens
- User signs in / signs up → Supabase Auth returns JWT
- If balance > 0: user clicks "Use 1 token" → calls POST /api/report/unlock
- Worker verifies JWT, reads cell from KV, calls Supabase RPC unlock_report() which atomically: deducts 1 token, inserts report snapshot, returns report ID
- Frontend navigates to /report/:slug and renders full data
If balance is 0: user picks a pricing tier → POST /api/checkout → Stripe
checkout URL → redirect → after payment, Stripe returns to
/teaser/:slug?checkout=success → auto-navigate to report.
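A minimal sketch of the unlock decision, under stated assumptions: the real unlock_report() is a Supabase RPC whose body is not shown here, so the in-memory `Db`, the `unlockReport` function, and the `insufficient_balance` reason are illustrative stand-ins.

```typescript
interface Account {
  balance: number; // remaining report tokens
}

interface ReportSnapshot {
  id: number;
  userId: string;
  slug: string;
  data: unknown; // copy of the cell data at unlock time
}

interface Db {
  accounts: Map<string, Account>;
  reports: ReportSnapshot[];
}

type UnlockResult =
  | { ok: true; reportId: number }
  | { ok: false; reason: "insufficient_balance" };

// Deduct one token and store the snapshot as a single step.
// In Postgres this would run inside one transaction.
function unlockReport(db: Db, userId: string, slug: string, cellData: unknown): UnlockResult {
  const account = db.accounts.get(userId);
  if (!account || account.balance < 1) {
    return { ok: false, reason: "insufficient_balance" }; // caller falls back to checkout
  }
  account.balance -= 1;
  const reportId = db.reports.length + 1;
  db.reports.push({ id: reportId, userId, slug, data: cellData });
  return { ok: true, reportId };
}
```

The balance check and the deduction sit in the same function on purpose: if they were separate calls, two concurrent unlocks could both pass the check before either deducts.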
Tech stack¶
Data pipeline (code/)¶
- Python 3.11 in a conda env called hood-analyzer
- pandas, geopandas: tabular + spatial data
- h3: hexagonal grid (resolution 10)
- pandana: network-based shortest-path KPIs (walk/bike/car distances)
- osmnx: OSM network download (walk + drive + bike graphs)
- r5py: GTFS-based multimodal transit routing (optional; requires Java 21+)
- duckdb: parquet I/O
- rasterio: noise + air quality raster sampling
Frontend (frontend/)¶
- Vite + React 19 + TypeScript
- Tailwind CSS v4 (brand design system)
- framer-motion for animations
- @nivo/radar + @nivo/bar for charts
- react-map-gl + deck.gl for Mapbox maps with H3HexagonLayer
- react-i18next for bilingual FR/EN
- react-router v7
- @tanstack/react-query for server state
- Playwright for e2e tests
Backend (worker/)¶
- Cloudflare Workers (TypeScript)
- Thin API layer: routes in worker/src/routes/
- Auth middleware verifies Supabase JWTs
- KV bindings: NEIGHBORHOOD_KV (the cell data), RATE_LIMIT_KV (counters)
Storage¶
- Cloudflare KV — pre-computed KPI data (17,097 entries, ~60 MB)
- Supabase Postgres — users, purchases, saved reports, feedback, email captures
- Supabase Auth — email + password
Data flow¶
Cell data (KV)¶
SITG + OSM + OFEV + OCSTAT + GTFS
│
▼
code/pipeline.py
│ 1. Load sources (cached in data/cache/)
│ 2. Build H3 grid (resolution 10, ~17K cells)
│ 3. Compute KPIs (pandana networks, raster sampling, k-ring counts)
│ 4. Assign commune + tax rate + price/m²
│ 5. Compute composite scores (scores.py)
│ 6. Generate insights (insights.py) → fun facts, headlines
▼
output/geneva_kpis_by_h3.parquet
│
▼
code/export.py → output/kv_export/kv_batch_{001,002}.json
│
▼
wrangler kv bulk put → Cloudflare KV
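The export step splits the cells into upload batches. The real export is the Python script code/export.py; the sketch below is written in TypeScript only to match the Worker examples, and the `toBatches` helper and its types are hypothetical. It assumes a cap of 10,000 pairs per bulk upload (worth checking against current Cloudflare limits), which would explain why 17,097 cells end up in two files. Each entry uses the `key`/`value` string pair format that bulk uploads expect.

```typescript
// One entry in a bulk-upload file: both key and value are strings.
interface KvBulkEntry {
  key: string;   // H3 cell ID
  value: string; // JSON-serialized KPI object
}

interface KvBatch {
  filename: string;
  entries: KvBulkEntry[];
}

// Assumed cap on pairs per bulk upload; verify against current limits.
const MAX_PAIRS_PER_BATCH = 10000;

function toBatches(
  cells: Map<string, object>,
  batchSize: number = MAX_PAIRS_PER_BATCH
): KvBatch[] {
  const entries: KvBulkEntry[] = [];
  cells.forEach((kpis, cellId) => {
    entries.push({ key: cellId, value: JSON.stringify(kpis) });
  });

  const batches: KvBatch[] = [];
  for (let start = 0; start < entries.length; start += batchSize) {
    // Zero-pad the 1-based batch number to three digits: 001, 002, ...
    const index = ("00" + (batches.length + 1)).slice(-3);
    batches.push({
      filename: "kv_batch_" + index + ".json",
      entries: entries.slice(start, start + batchSize),
    });
  }
  return batches;
}
```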
User data (Supabase)¶
- profiles: user metadata + token balance
- purchases: Stripe transactions
- reports: snapshots at unlock time (historical record) + location_key (H3 cell, server-side only)
- feedback: "Report a problem" submissions
- email_captures: lead magnet emails from teaser
See backend.md for the full schema.
Hosting cost¶
| Service | Plan | Cost |
|---|---|---|
| Cloudflare Pages | Free | CHF 0 |
| Cloudflare Workers | Paid (for KV write headroom) | CHF ~5/mo |
| Cloudflare KV | Included in Workers Paid | CHF 0 |
| Supabase | Free tier (50K MAU) | CHF 0 |
| Stripe | Per-transaction (~2.9% + CHF 0.30) | Variable |
| Mapbox | Free tier (50K loads/mo) | CHF 0 until scale |
| Domain | locascore.ch | CHF 4.64/year |
| Baseline | | ~CHF 5/month |
Anti-scraping¶
The dataset IS the product, so several layers protect it:
- Cloudflare Bot Fight Mode: edge-level bot filtering
- Rate limits (in RATE_LIMIT_KV via worker/src/middleware/rateLimit.ts): 10 req/min per IP on all public endpoints
- H3 cell IDs never exposed: the worker converts lat/lng to H3 internally; responses never include the cell ID (see worker/src/utils/h3.ts)
- Full report data requires auth: JWT verified on every request
- Daily unique lookup cap: 50 unique cells per IP per day for teasers
- Turnstile on geocoding: prevents automated address enumeration (currently dormant since we switched to the Swiss Federal geocoder client-side)
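The two per-IP limits above can be sketched as follows. This is not the contents of rateLimit.ts: the fixed-window algorithm, the key formats, and the function names are assumptions, and plain `Map` objects stand in for RATE_LIMIT_KV.

```typescript
const LIMIT_PER_MINUTE = 10;
const WINDOW_MS = 60000;
const DAILY_UNIQUE_CELLS = 50;

// Fixed-window counter: one counter per IP per one-minute window.
function allowRequest(counters: Map<string, number>, ip: string, nowMs: number): boolean {
  const windowId = Math.floor(nowMs / WINDOW_MS);
  const key = "rl:" + ip + ":" + windowId; // hypothetical key format
  const count = counters.get(key) ?? 0;
  if (count >= LIMIT_PER_MINUTE) {
    return false;
  }
  counters.set(key, count + 1);
  return true;
}

// Daily cap on distinct cells: repeat lookups of a cell already seen
// today are free, only new cells count against the cap.
function allowCellLookup(
  seen: Map<string, Set<string>>,
  ip: string,
  cellId: string,
  day: string
): boolean {
  const key = ip + ":" + day;
  let cellsToday = seen.get(key);
  if (!cellsToday) {
    cellsToday = new Set<string>();
    seen.set(key, cellsToday);
  }
  if (cellsToday.has(cellId)) {
    return true;
  }
  if (cellsToday.size >= DAILY_UNIQUE_CELLS) {
    return false;
  }
  cellsToday.add(cellId);
  return true;
}
```

With a real KV namespace the read-then-write is not atomic and reads are eventually consistent, so counts kept this way are approximate; that is usually acceptable for scraping deterrence.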
Design principles¶
- Solo developer project — minimize operational surface
- Near-zero hosting cost until revenue justifies it
- Bilingual FR/EN from day one — not an afterthought
- Data quality > feature count — the scores have to be credible
- Fail-closed on security — auth gates, rate limits, RLS policies
- Fail-open on non-critical infra — e.g., rate limit KV errors don't brick public endpoints
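The difference between the last two principles comes down to what happens when a check itself throws. A minimal sketch, with hypothetical helper names and synchronous checks for brevity (real Worker checks would be async):

```typescript
// Fail-open: if the check cannot run (e.g. the rate limit store errors),
// let the request through rather than take the endpoint down.
function failOpen(check: () => boolean): boolean {
  try {
    return check();
  } catch {
    return true;
  }
}

// Fail-closed: if the check cannot run (e.g. JWT verification errors),
// treat the request as denied.
function failClosed(check: () => boolean): boolean {
  try {
    return check();
  } catch {
    return false;
  }
}
```

A rate limit check would be wrapped in `failOpen`, an auth check in `failClosed`; a check that runs normally returns its own result either way.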
Next: Data pipeline