Akforges
01— AI · Production
Telos AI
Sales intelligence · YC W24

From a Cursor demo to 18,000 paying users.

We re-architected the inference pipeline, replaced a fragile LangChain demo with a typed agent graph, added evals and observability, and cut p95 latency by 71%.

−71%
p95 latency
−58%
AI cost / req
5 wks
Time to ship
18k
Paying users

The problem

Telos AI was a YC W24 company with a working sales intelligence prototype — built fast in Cursor, demoed to investors — and suddenly it had 18,000 beta signups. The prototype worked. Production didn't.

The inference pipeline was a raw LangChain chain — no evals, no guardrails, no observability. p95 latency was 8.4 seconds. The model frequently returned malformed JSON that crashed downstream code. Costs were running $0.0038 per request, with no visibility into which calls were expensive.

They had 5 weeks before their public launch. The founding team couldn't afford to spend that time debugging LangChain internals — they needed to ship features and close enterprise pilots.

What we did

Week 1 was an audit. We read every prompt, traced every model call, and benchmarked each stage. The root causes were clear: the chain made 6 sequential LLM calls where 3 would do, structured output was not enforced (so the model could and did return anything), and there was zero caching.

We replaced the LangChain chain with a typed LangGraph agent graph. Each node had a strict Zod schema for its output. Parallel calls replaced sequential ones where the model calls were independent. A Redis semantic cache cut repeat lookups by 34%.
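The shape of that graph, reduced to two nodes, looks roughly like this sketch (node names, state fields, and prompts here are illustrative, not Telos's real pipeline):

```ts
import { StateGraph, Annotation, START, END } from "@langchain/langgraph";
import { ChatOpenAI } from "@langchain/openai";
import { z } from "zod";

// Hypothetical output schema for one node; every node in the real
// graph had its own strict schema like this.
const CompanyProfile = z.object({
  name: z.string(),
  industry: z.string(),
  summary: z.string(),
});

const GraphState = Annotation.Root({
  query: Annotation<string>,
  profile: Annotation<z.infer<typeof CompanyProfile> | null>,
  news: Annotation<string | null>,
});

const model = new ChatOpenAI({ model: "gpt-4o", temperature: 0 });

// withStructuredOutput enforces the Zod schema at the provider level,
// so malformed JSON is rejected here instead of crashing downstream code.
async function researchCompany(state: typeof GraphState.State) {
  const profile = await model
    .withStructuredOutput(CompanyProfile)
    .invoke(`Profile the company in this query: ${state.query}`);
  return { profile };
}

async function fetchNews(state: typeof GraphState.State) {
  const res = await model.invoke(`Summarize recent news for: ${state.query}`);
  return { news: res.content as string };
}

// Two edges from START run both nodes in the same step, i.e. in
// parallel, replacing what used to be sequential chain stages.
export const graph = new StateGraph(GraphState)
  .addNode("researchCompany", researchCompany)
  .addNode("fetchNews", fetchNews)
  .addEdge(START, "researchCompany")
  .addEdge(START, "fetchNews")
  .addEdge("researchCompany", END)
  .addEdge("fetchNews", END)
  .compile();
```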

We wired LangSmith tracing on every call — latency, token count, cost, and model version all logged with the user's org ID. This gave the Telos team the observability to debug issues themselves after handoff.
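The tracing itself is mostly configuration. A sketch of the pattern, assuming a hypothetical runResearch entry point:

```ts
import { graph } from "./graph"; // the compiled graph from the sketch above

// Tracing is enabled by environment, not code:
//   LANGCHAIN_TRACING_V2=true
//   LANGCHAIN_API_KEY=...
// Metadata and tags on the invoke config appear on the LangSmith trace,
// alongside the latency, token counts, and model version LangSmith
// records for each OpenAI call.
export async function runResearch(query: string, orgId: string) {
  return graph.invoke(
    { query, profile: null, news: null },
    { metadata: { orgId }, tags: ["prod", "sales-research"] }
  );
}
```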

We shipped an eval harness with 240 hand-labelled test cases covering their most common sales research queries. It ran on every PR and became the team's definition of "the model is working."
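In spirit the harness is a loop over labelled cases with a hard pass-rate gate. A minimal sketch, assuming a hypothetical case shape, file path, and threshold:

```ts
import { readFileSync } from "node:fs";
import { runResearch } from "./run"; // the traced entry point above

// Hypothetical labelled case: a real query plus a field a correct
// answer must contain. Telos's 240 cases were richer than this.
type EvalCase = { query: string; expectIndustry: string };

const cases: EvalCase[] = JSON.parse(readFileSync("evals/cases.json", "utf8"));

let passed = 0;
for (const c of cases) {
  const out = await runResearch(c.query, "eval");
  // Exact match on a stable field; fuzzier fields would use normalized
  // string comparison or an LLM judge instead.
  if (out.profile?.industry === c.expectIndustry) passed++;
}

const rate = passed / cases.length;
console.log(`pass rate: ${(100 * rate).toFixed(1)}% (${passed}/${cases.length})`);
// A non-zero exit fails the GitHub Actions job, which blocks the PR.
if (rate < 0.97) process.exit(1);
```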

Results

Delivered in 5 weeks. p95 latency dropped from 8.4s to 2.4s. AI cost per request fell 58%, from $0.0038 to $0.0016, through parallel calls, model routing (GPT-4o mini for cheaper tasks), and Redis caching.
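The routing rule is deliberately simple. A sketch, with illustrative task categories:

```ts
import { ChatOpenAI } from "@langchain/openai";

// Hypothetical split: bounded tasks (classification, field extraction)
// run on gpt-4o-mini; open-ended synthesis stays on gpt-4o.
type TaskKind = "classify" | "extract" | "synthesize";

const cheapTasks: TaskKind[] = ["classify", "extract"];

export function modelFor(kind: TaskKind): ChatOpenAI {
  return new ChatOpenAI({
    model: cheapTasks.includes(kind) ? "gpt-4o-mini" : "gpt-4o",
    temperature: 0,
  });
}
```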

Telos launched publicly, hit 18,000 paying users in the first month, and the inference pipeline has not been the bottleneck since. The eval harness caught two silent regressions from OpenAI model updates that would have reached users.

Tech stack

TypeScript · LangGraph · OpenAI GPT-4o · LangSmith · PostgreSQL · Redis · AWS ECS · GitHub Actions

"They shipped in five weeks what our last vendor couldn't in nine months. The eval harness alone changed how we think about model updates."

Mei Park
CTO, Telos AI