Tech Stack

The toolchain behind shipped AI agents.

Senior software engineer with 15+ years in production banking, fintech, and telecom. Since 2024, focused on AI agents and evals: multi-agent systems, MCP, model routing, RAG, fine-tuning, LLM-as-judge, and guardrails.

Stockholm, Sweden · 15+ years shipping software

01 / AI Agents & Orchestration

Multi-agent systems with sharp tool boundaries, deliberate routing, and tested handoffs.

LangChain
DSPy
MCP servers & skills
Multi-agent systems
Agent orchestration
Agent routing
Tool & function calling
LSP
Model routing per task
Prompt engineering
AI agent security checks

02 / LLMs, Fine-Tuning & Training

Picking the right model for the job, then squeezing cost and latency with fine-tuning and distillation.

Frontier & open-weight LLMs
Fine-tuning
Model training
Knowledge distillation
Unsloth
Google Colab
Self-hosted GPU inference

03 / Evals, RAG & Safety

Golden datasets, LLM-as-judge gates, and guardrails so agents stay reliable in production.

LLM-as-judge
Golden datasets
Trajectory & outcome scoring
ML experimentation pipelines
NLP
RAG
Hybrid search
Embedding models
Vector databases
Reranking
AI red-teaming
Prompt-injection defense
Output guardrails
PII filtering

04 / Infrastructure & Serverless

Cheap edge compute when it fits, real clusters when it does not.

Cloudflare Workers
Durable Objects
Kubernetes
Google Cloud
Docker
REST
gRPC
CI/CD
PostgreSQL

05 / Languages & Frameworks

Python and TypeScript drive most agent work today. Kotlin and PHP cover the production mobile and backend tail.

Python
TypeScript
Kotlin
Java
PHP
Flutter
Laravel
Go (fundamentals)
Rust (fundamentals)

Proof, not promises

Production systems shipped

A snapshot of AI agents and platforms running today, drawn from Royan AB, SBAB Bank, and ParkUp Inc.

LLM-as-judge eval pipelines

Royan AB

Golden datasets, regression back-testing, and pass/fail gates that catch agent-quality regressions before they ship.

Embodied 3D AI assistant

Royan AB

Real-time, low-latency LLM agent loop with STT, MCP tools, TTS, and gesture sync. Output guardrails and prompt-injection checks on the deployed voice agent.

Autonomous pricing agent

ParkUp Inc. · 600K+ users

Multi-agent system with embedding-based RAG over a vector database, serving the parking inventory at scale.

Order-interpretation agent

Royan AB

LangChain pipeline that turns natural language into structured JSON, with schema validation and LLM-as-judge gating.

Bank-side AI tooling

SBAB Bank · AI forum (5 members)

MCP servers, agent skills, and model platforms (including AWS Bedrock) integrated into the bank's engineering workflows.

Distilled open-weight models

Royan AB

Fine-tuned and distilled with Unsloth on Google Colab to cut inference cost and latency on the deployed voice agent.
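The pass/fail gating mentioned for the eval pipelines above can be illustrated with a minimal sketch. It is not the production pipeline: `GoldenCase`, `run_gate`, the 0.9 threshold, and the deterministic `exact_match_judge` are all hypothetical names and values chosen for illustration. A real LLM-as-judge would call a model where the stand-in judge compares strings.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class GoldenCase:
    """One golden-dataset entry: an input prompt and its reference answer."""
    prompt: str
    expected: str


# A judge scores (expected, actual) in [0.0, 1.0].
# Assumption: in practice this would wrap an LLM call; here it is a plain function.
Judge = Callable[[str, str], float]


def exact_match_judge(expected: str, actual: str) -> float:
    """Deterministic stand-in for an LLM judge: case-insensitive exact match."""
    return 1.0 if expected.strip().lower() == actual.strip().lower() else 0.0


def run_gate(
    agent: Callable[[str], str],
    cases: Sequence[GoldenCase],
    judge: Judge,
    threshold: float = 0.9,  # hypothetical pass bar
) -> bool:
    """Return True when the mean judge score over the golden set meets the threshold."""
    if not cases:
        raise ValueError("golden dataset is empty")
    scores = [judge(case.expected, agent(case.prompt)) for case in cases]
    return sum(scores) / len(scores) >= threshold
```

In a CI job, a wrapper script could exit with a non-zero status whenever `run_gate` returns `False`, which is one way to turn the eval into a merge blocker.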

Mindset

How I think about agents

The non-negotiables I bring to every project, from a one-off internal tool to a multi-agent system in front of paying users.

01

Reliability before novelty.

A slow, boring agent that always works beats a clever one that fails 10% of the time.

02

Evals are the release gate.

No eval, no merge. Golden sets and LLM-as-judge catch regressions before users do.

03

Smaller, task-specific models when they win.

On narrow tasks, a fine-tuned 7B-parameter model can beat a frontier model on cost, latency, and accuracy.

04

Cost-aware model routing.

Cheap models for easy turns, strong models for hard ones, judges only where they pay back.

05

Tool design over prompt cleverness.

Most agent failures are tool design failures. Tight schemas and small surfaces beat long prompts.

06

Guardrails and PII filtering by default.

Input checks, output guardrails, and prompt-injection defense belong in the first commit, not the second incident.
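
The cost-aware routing principle (item 04) can be sketched roughly as below. Everything here is an assumption made for illustration: the model names `cheap-model` and `strong-model`, the keyword list, the 50-word scale, and the 0.5 cutoff are placeholders, and a real router would more likely use a trained classifier or past eval results than a keyword heuristic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    """Routing decision for one user turn."""
    model: str
    use_judge: bool


# Hypothetical keywords that mark a turn as hard regardless of its length.
HARD_HINTS = ("refund", "legal", "multi-step")


def estimate_difficulty(turn: str) -> float:
    """Toy heuristic in [0.0, 1.0]: long turns or hinted topics count as harder."""
    length_score = min(len(turn.split()) / 50, 1.0)
    hint_score = 1.0 if any(hint in turn.lower() for hint in HARD_HINTS) else 0.0
    return max(length_score, hint_score)


def route(turn: str, high_stakes: bool = False) -> Route:
    """Send easy turns to the cheap model and hard turns to the strong one.

    The judge is enabled only for high-stakes turns, where its extra cost pays back.
    """
    model = "strong-model" if estimate_difficulty(turn) >= 0.5 else "cheap-model"
    return Route(model=model, use_judge=high_stakes)
```

For example, `route("hi")` selects the cheap model with no judge, while `route("I need a refund", high_stakes=True)` selects the strong model and turns the judge on.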
