Conversion Studio
B2B document conversion worker on Cloud Run
Why a separate service
Lasanta India and Narven Group both need document conversion: trade documents, OCR-assisted invoices, format normalization for incoming files. The conversion path is CPU-heavy and benefits from scale-to-zero — keeping it on Vercel would inflate cold starts for everything else.
Splitting it to Cloud Run lets the consuming apps stay snappy while the worker scales horizontally under load.
What it does
- PDF / document conversion — DOCX, ODT, RTF → PDF via headless LibreOffice (soffice).
- Markdown / HTML via Pandoc with extension-aware presets.
- Image conversion + OCR — image to PDF with optional OCR-extracted text layer.
- ZIP packaging for downloadable bundles.
- Format detection + normalization — sniff source, route to the right converter.
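The detect-then-route step can be sketched roughly as follows. This is a minimal illustration and not the worker's actual code: the magic-byte table, the `sniff` and `build_command` names, and the choice of Markdown → HTML for the Pandoc route are all assumptions. Only the `soffice` and `pandoc` command-line flags are the tools' real options.

```python
from pathlib import PurePosixPath

# Hypothetical magic-byte table; a real worker would likely cover more formats.
MAGIC = [
    (b"%PDF-", "pdf"),
    (b"\x89PNG\r\n\x1a\n", "png"),
    (b"\xff\xd8\xff", "jpeg"),
    (b"{\\rtf", "rtf"),
    (b"PK\x03\x04", "zip-container"),  # DOCX and ODT are both ZIP containers
]

# Extension fallbacks for formats without a reliable signature (assumed set).
TEXT_EXTENSIONS = {".md": "markdown", ".html": "html", ".htm": "html"}
ZIP_EXTENSIONS = {".docx": "docx", ".odt": "odt"}


def sniff(head: bytes, filename: str) -> str:
    """Guess the source format from leading bytes, then from the extension."""
    ext = PurePosixPath(filename).suffix.lower()
    for magic, kind in MAGIC:
        if head.startswith(magic):
            if kind == "zip-container":
                # The signature alone cannot tell DOCX from ODT.
                return ZIP_EXTENSIONS.get(ext, "zip")
            return kind
    return TEXT_EXTENSIONS.get(ext, "unknown")


def build_command(kind: str, src: str, outdir: str) -> list[str]:
    """Return the argv for the converter that handles this format."""
    if kind in ("docx", "odt", "rtf"):
        # Headless LibreOffice writes <stem>.pdf into outdir.
        return ["soffice", "--headless", "--convert-to", "pdf",
                "--outdir", outdir, src]
    if kind == "markdown":
        # Assumed route: Markdown to HTML via Pandoc.
        target = PurePosixPath(outdir) / (PurePosixPath(src).stem + ".html")
        return ["pandoc", src, "--from", "markdown", "--to", "html",
                "--output", str(target)]
    raise ValueError(f"no converter registered for {kind!r}")
```

The returned list is meant to be passed to something like `subprocess.run(cmd, check=True, timeout=...)`, which keeps shell quoting out of the picture.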
How it integrates
The consuming app (Lasanta or Narven) signs each request with a shared HMAC secret and its CONVERTER_INTERNAL_CALLER identity. The worker verifies the signature, runs the conversion, and returns the artifact (or a queued job id for async paths).
Internal-only. Users never touch the worker directly; they stay inside the product they’re using.
What I learned
- Shared B2B services pay for themselves when two products both need the same heavy compute.
- HMAC + caller identity beats per-consumer API keys for shared internal services — adding a third consumer is a one-line registration.
- Headless LibreOffice in a container is a research project the first time, a copy-paste the second.
- Scale-to-zero is the right default for irregular workloads. Cold start cost is acceptable when the alternative is paying for idle compute 24/7.
