Context Engineering: Mastering the 200K Token Era

With Claude 3.5 Sonnet supporting 200K tokens and Gemini 2.5 reaching 2M tokens, context engineering has become as important as prompt engineering....

PROMPT AND CONTEXT ENGINEERING
Fine-Tuning LLMs with LoRA: 2025 Guide

Low-Rank Adaptation (LoRA) has revolutionized how we fine-tune large language models in 2025. This technique allows developers to adapt models like Llama...

LLM MODELS, PROVIDERS AND TRAINING
LLM Inference: Cut AI Costs by 80%

AI costs are crushing startups. One company I talked to was spending $47,000/month on LLM API calls—more than their entire engineering payroll....

INFERENCE, SERVING AND COST CONTROL

© 2025 Amir Teymoori