AI Application Assistant — Telegram Bot for GovTech

Overview

Built a conversational AI assistant delivered as a Telegram bot for a Ukrainian application-writing consultancy serving small businesses applying for government and EU programs.

Role: Solo architect, engineer, DevOps
Timeline: 8 weeks MVP + 2-week calibration
Domain: Business Automation / GovTech (Ukraine)

Preparing one complete proposal by hand took 2–4 hours per client. With 50+ active clients per month, that added up to roughly 100–200 staff hours a month, and it was the clear bottleneck. The client wanted a bot their team could use: describe a client, get a ready .xlsx back.

Hard constraint: Excel formula preservation. The template’s calculation section depends on formula chains — any solution that replaced formulas with static values would produce a document that breaks the moment a user edits a cell.

Solution

The user describes their client’s business across 10–20 messages. The system:

Routes the conversation to the correct application program via RAG + LLM classifier (pgvector HNSW + OpenRouter)
Extracts structured facts through LLM function-calling with schema validation
Validates consistency against donor-specific rules (funding limits, minimum employees, eligible expense categories)
Fills the Excel template — input cells only; all formula cells left untouched
Returns the completed .xlsx via Telegram
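The consistency step above can be sketched roughly as follows. This is a minimal illustration, not the project's code: the `Facts` and `DonorRules` types, their field names, and the sample numbers are all hypothetical, chosen only to mirror the three rule kinds named in the list (funding limits, minimum employees, eligible expense categories).

```go
package main

import "fmt"

// Facts is a hypothetical shape for the extractor's structured output.
type Facts struct {
	Program           string
	RequestedAmount   float64
	Employees         int
	ExpenseCategories []string
}

// DonorRules is a hypothetical per-program rule set.
type DonorRules struct {
	MaxFunding       float64
	MinEmployees     int
	EligibleExpenses map[string]bool
}

// Validate returns one human-readable issue per violated rule.
// An empty result means the facts are consistent with the rules.
func Validate(f Facts, r DonorRules) []string {
	var issues []string
	if f.RequestedAmount > r.MaxFunding {
		issues = append(issues, fmt.Sprintf(
			"requested amount %.0f exceeds funding limit %.0f",
			f.RequestedAmount, r.MaxFunding))
	}
	if f.Employees < r.MinEmployees {
		issues = append(issues, fmt.Sprintf(
			"%d employees is below the minimum of %d",
			f.Employees, r.MinEmployees))
	}
	for _, c := range f.ExpenseCategories {
		if !r.EligibleExpenses[c] {
			issues = append(issues, fmt.Sprintf(
				"expense category %q is not eligible", c))
		}
	}
	return issues
}

func main() {
	// Illustrative values only; real limits come from the donor program.
	rules := DonorRules{
		MaxFunding:       250000,
		MinEmployees:     2,
		EligibleExpenses: map[string]bool{"equipment": true, "training": true},
	}
	facts := Facts{
		Program:           "example-grant",
		RequestedAmount:   300000,
		Employees:         1,
		ExpenseCategories: []string{"equipment", "rent"},
	}
	for _, issue := range Validate(facts, rules) {
		fmt.Println(issue)
	}
}
```

Returning a list of issues rather than failing on the first one lets the bot ask the user about every inconsistency in a single follow-up message.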

When AI confidence drops below a configured threshold, the conversation escalates to a human operator, who receives the full context (history, extracted facts, current state) via a separate Telegram interface.
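A minimal sketch of that escalation decision, under stated assumptions: the threshold value of 0.6, the `Handoff` type, and the `Decide` function are hypothetical. Only the three context items (history, extracted facts, current state) come from the description above.

```go
package main

import "fmt"

// Message is one turn of the conversation.
type Message struct {
	From string
	Text string
}

// Handoff bundles the context the operator receives on escalation.
type Handoff struct {
	History []Message
	Facts   map[string]string
	State   string
}

// escalationThreshold is an assumed value for illustration only.
const escalationThreshold = 0.6

// Decide returns a Handoff when confidence is below the threshold,
// and nil when the bot should keep handling the conversation itself.
func Decide(confidence float64, history []Message, facts map[string]string, state string) *Handoff {
	if confidence >= escalationThreshold {
		return nil
	}
	return &Handoff{History: history, Facts: facts, State: state}
}

func main() {
	history := []Message{{From: "user", Text: "We sell honey, maybe also export?"}}
	facts := map[string]string{"sector": "agriculture"}

	if h := Decide(0.35, history, facts, "collecting_facts"); h != nil {
		fmt.Printf("escalate in state %q with %d message(s) and %d fact(s)\n",
			h.State, len(h.History), len(h.Facts))
	}
	if h := Decide(0.9, history, facts, "collecting_facts"); h == nil {
		fmt.Println("confidence is high enough, no escalation")
	}
}
```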

Architecture

Nine subsystems compiled to a single Go binary: Channel Adapter, AI Core, Knowledge Base, Session Store, Billing, People & Access, Workplaces, Observability, Integration Layer. Deployed via docker compose up on a Hetzner CCX23 (4 vCPU AMD EPYC, 16 GB RAM, ~€28/mo).

[Figure: System architecture overview]

Session State Machine

[Figure: 12-state conversation FSM]

Conversation sessions are persisted as a 12-state FSM in PostgreSQL — not in memory. The bot survives server restarts mid-conversation. LLM calls are idempotent via a two-phase transaction with an inflight_request table: no duplicate billing or duplicate file generation on retry.
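The idempotency idea can be illustrated with a small sketch. It is not the project's implementation: an in-memory map stands in for the `inflight_request` table, and the `Store` type, the `Do` method, and the key format are hypothetical. In PostgreSQL, the claim in phase 1 would typically be an insert guarded by a unique key, committed before the LLM call is made.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// ErrInFlight is returned when another attempt holds the claim.
var ErrInFlight = errors.New("request already in flight")

type entry struct {
	done   bool
	result string
}

// Store is an in-memory stand-in for the inflight_request table.
type Store struct {
	mu   sync.Mutex
	rows map[string]*entry
}

func NewStore() *Store { return &Store{rows: make(map[string]*entry)} }

// Do runs call at most once per key. A retry with the same key gets
// the stored result instead of triggering a second (billed) call.
func (s *Store) Do(key string, call func() (string, error)) (string, error) {
	// Phase 1: claim the key before doing any expensive work.
	s.mu.Lock()
	if e, ok := s.rows[key]; ok {
		done, result := e.done, e.result
		s.mu.Unlock()
		if done {
			return result, nil
		}
		return "", ErrInFlight
	}
	s.rows[key] = &entry{}
	s.mu.Unlock()

	// The slow call runs outside the lock.
	result, err := call()

	// Phase 2: record the outcome, or release the claim on failure
	// so that a later retry is allowed to run.
	s.mu.Lock()
	defer s.mu.Unlock()
	if err != nil {
		delete(s.rows, key)
		return "", err
	}
	s.rows[key].done = true
	s.rows[key].result = result
	return result, nil
}

func main() {
	store := NewStore()
	calls := 0
	llm := func() (string, error) {
		calls++
		return "draft", nil
	}
	first, _ := store.Do("session-42:step-7", llm)
	second, _ := store.Do("session-42:step-7", llm) // simulated retry
	fmt.Println(first, second, "calls:", calls)
}
```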

Escalation Flow

[Figure: Escalation to human operator]

Billing Flow

[Figure: Subscription billing flow]

Key Technical Decisions

Local embeddings (Ollama bge-m3)

Vector operations run entirely on the application server — no cloud dependency for knowledge base search. bge-m3 (567M params, 1024-dim, multilingual) served via Ollama on the same Hetzner VPS. If the LLM cloud provider goes down, RAG retrieval continues uninterrupted.

Two motivations: (1) resilience, since cloud embedding APIs have been the single point of failure in similar systems; (2) privacy, since applicant text is embedded on the project’s own VPS and is never sent to a third-party embedding API.
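As a rough sketch of what a local embedding call can look like, the code below posts to Ollama's `/api/embed` endpoint on its default port 11434 and checks that the vector has the 1024 dimensions stated above. The function names (`Embed`, `decodeEmbedding`), the timeout, and the sample text are assumptions for illustration; the project's actual client code is not shown in the source.

```go
package main

import (
	"bytes"
	"context"
	"encoding/json"
	"errors"
	"fmt"
	"io"
	"net/http"
	"time"
)

const embeddingDim = 1024 // bge-m3 output size stated in the text

type embedRequest struct {
	Model string `json:"model"`
	Input string `json:"input"`
}

type embedResponse struct {
	Embeddings [][]float32 `json:"embeddings"`
}

// decodeEmbedding parses an /api/embed response body and verifies
// that it holds exactly one vector of the expected dimension.
func decodeEmbedding(body []byte, dim int) ([]float32, error) {
	var r embedResponse
	if err := json.Unmarshal(body, &r); err != nil {
		return nil, err
	}
	if len(r.Embeddings) != 1 {
		return nil, errors.New("expected exactly one embedding")
	}
	if len(r.Embeddings[0]) != dim {
		return nil, fmt.Errorf("got %d dimensions, want %d", len(r.Embeddings[0]), dim)
	}
	return r.Embeddings[0], nil
}

// Embed asks a local Ollama instance to embed text with bge-m3.
func Embed(ctx context.Context, baseURL, text string) ([]float32, error) {
	payload, err := json.Marshal(embedRequest{Model: "bge-m3", Input: text})
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequestWithContext(ctx, http.MethodPost,
		baseURL+"/api/embed", bytes.NewReader(payload))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("ollama returned %s", resp.Status)
	}
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return nil, err
	}
	return decodeEmbedding(body, embeddingDim)
}

func main() {
	ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
	defer cancel()
	vec, err := Embed(ctx, "http://localhost:11434", "bakery with 4 employees")
	if err != nil {
		fmt.Println("embed failed:", err)
		return
	}
	fmt.Println("dimensions:", len(vec))
}
```

Checking the dimension at the boundary catches a misconfigured model before a mismatched vector reaches the pgvector column.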

Excel formula preservation

Implemented with the excelize library for Go. A Template Validator runs at ingest time and blocks publication of any template that would lose formula integrity on fill-and-save. The artefact generator fills only cells marked as input placeholders; all formula cells are identified during ingest and excluded from the write path entirely.
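The write-path guard might look something like the sketch below. To keep it runnable without external modules, it works against a small hypothetical `Sheet` interface modelled on excelize's `GetCellFormula` and `SetCellValue` methods; a thin adapter around the real workbook type may be needed, and the assumption that a non-formula cell reports an empty formula string should be checked against the library. The `Fill` function, the allowlist shape, and the `fakeSheet` type are illustrative names, not the project's.

```go
package main

import "fmt"

// Sheet is a hypothetical minimal interface modelled on excelize's
// GetCellFormula and SetCellValue methods.
type Sheet interface {
	GetCellFormula(sheet, cell string) (string, error)
	SetCellValue(sheet, cell string, value interface{}) error
}

// Fill writes values only into cells on the input allowlist built at
// ingest. Every target is checked before anything is written, so a
// rejected request never leaves a partially filled sheet.
func Fill(s Sheet, sheet string, inputs map[string]bool, values map[string]interface{}) error {
	for cell := range values {
		if !inputs[cell] {
			return fmt.Errorf("cell %s is not an input placeholder", cell)
		}
		formula, err := s.GetCellFormula(sheet, cell)
		if err != nil {
			return err
		}
		if formula != "" {
			return fmt.Errorf("cell %s holds formula %q, refusing to overwrite", cell, formula)
		}
	}
	for cell, v := range values {
		if err := s.SetCellValue(sheet, cell, v); err != nil {
			return err
		}
	}
	return nil
}

// fakeSheet is an in-memory stand-in used only for this demo.
type fakeSheet struct {
	formulas map[string]string
	values   map[string]interface{}
}

func (f *fakeSheet) GetCellFormula(_, cell string) (string, error) {
	return f.formulas[cell], nil
}

func (f *fakeSheet) SetCellValue(_, cell string, v interface{}) error {
	f.values[cell] = v
	return nil
}

func main() {
	sheet := &fakeSheet{
		formulas: map[string]string{"C1": "=A1+B1"},
		values:   map[string]interface{}{},
	}
	inputs := map[string]bool{"A1": true, "B1": true}

	err := Fill(sheet, "Budget", inputs, map[string]interface{}{"A1": 1200, "B1": 300})
	fmt.Println("fill inputs:", err)

	err = Fill(sheet, "Budget", inputs, map[string]interface{}{"C1": 1500})
	fmt.Println("fill formula cell:", err)
}
```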

Go monolith, single binary

At 50 concurrent users, microservice network overhead is pure waste. The AI Core components have tight data dependencies: Extractor output flows into the Consistency Controller, which flows into the Artefact Generator. In a monolith that is three function calls. In microservices it is three HTTP round-trips plus circuit breakers. Operations stay simple too: one docker compose up to deploy, and one log stream to read at 2am.
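To make the "three function calls" point concrete, here is a hedged sketch of the in-process hand-off. The stage bodies are placeholder stubs and every name (`Extract`, `CheckConsistency`, `Generate`, `Handle`, `Facts`, `Artefact`) is hypothetical. Only the order of the stages is taken from the text.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Facts and Artefact are simplified, hypothetical data shapes.
type Facts map[string]string

type Artefact struct {
	Filename string
	Fields   int
}

// Extract stands in for the LLM-backed Extractor.
func Extract(messages []string) (Facts, error) {
	if len(messages) == 0 {
		return nil, errors.New("no messages to extract from")
	}
	return Facts{"description": strings.Join(messages, " ")}, nil
}

// CheckConsistency stands in for the Consistency Controller.
func CheckConsistency(f Facts) error {
	if strings.TrimSpace(f["description"]) == "" {
		return errors.New("empty business description")
	}
	return nil
}

// Generate stands in for the Artefact Generator.
func Generate(f Facts) (Artefact, error) {
	return Artefact{Filename: "application.xlsx", Fields: len(f)}, nil
}

// Handle chains the three stages as plain function calls: no network
// hop, no serialization, and errors propagate as ordinary values.
func Handle(messages []string) (Artefact, error) {
	facts, err := Extract(messages)
	if err != nil {
		return Artefact{}, fmt.Errorf("extract: %w", err)
	}
	if err := CheckConsistency(facts); err != nil {
		return Artefact{}, fmt.Errorf("consistency: %w", err)
	}
	return Generate(facts)
}

func main() {
	a, err := Handle([]string{"Family bakery,", "4 employees,", "needs a new oven."})
	if err != nil {
		fmt.Println("failed:", err)
		return
	}
	fmt.Println("generated", a.Filename, "from", a.Fields, "fact(s)")
}
```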

Spec-first with multi-model AI review

Before any code was written, a comprehensive platform specification went through several iterations of multi-model AI review (Claude, Gemini, GPT-4o, DeepSeek, and Grok in parallel), producing dozens of Architecture Decision Records.

Examples caught before implementation: the Ollama RAM estimate was 2 GB (actual: 3–4 GB); the embedding provider primary/fallback order was inverted; the Prompt Registry was placed in the Integration Layer instead of AI Core.

Results (post-calibration, week 10)

Metric	Target	Actual
Pilot users onboarded	10–20	15
Total sessions completed	≥50	65
.xlsx artefacts generated	—	~50
AI response p50	≤6s	5.1s
AI response p95	≤12s	10.6s
Excel artefact validity	99.9%	100%
Escalation rate	<30%	22%
Cost per user/month	$5–10	~$7
Delivery	8 weeks	on schedule

Stack

Layer	Technology
Language	Go 1.26
Bot framework	go-telegram/bot
Database	PostgreSQL 16 + pgvector
Embeddings	Ollama bge-m3 (local, 1024-dim, multilingual)
LLM provider	OpenRouter (Claude / Gemini)
Excel generation	excelize
DB queries	sqlc + pgx
Background jobs	river (Go job queue, pg-backed)
Secrets	SOPS + age
Observability	OTel + Loki + Prometheus + Grafana + Tempo
Infra	Hetzner CCX23, Docker Compose, Caddy + Let’s Encrypt
CI/CD	GitHub Actions: golangci-lint, trivy, go test