FastAPI Patterns I Actually Use in Real Projects
Not the patterns from the docs — the ones that have actually held up in production, across multiple projects with real teams maintaining them.
October 3, 2024 · 3 min read
Building a demo that impresses in a 5-minute screen share is easy. Building an LLM-powered feature that works reliably for real users, across edge cases, at 2 AM when no one is watching — that's the actual job.
Here are the lessons I've learned the hard way.
The biggest misconception I see is treating prompts like config: write them once, check them in, forget about them.
In practice, prompts drift. Your data changes. Your users find edge cases. Your model provider quietly updates the underlying model. Every one of these can silently degrade your output quality.
What actually works:
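A minimal sketch of the kind of setup I mean: keep prompts in version control and run a small regression suite against known inputs on every change. The `run_eval` helper and the stub model here are illustrative, not a specific library.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class EvalCase:
    prompt_input: str
    must_contain: str  # a substring the output is required to include

def run_eval(model: Callable[[str], str], cases: list[EvalCase]) -> list[str]:
    """Run every case and collect failures instead of stopping at the first."""
    failures = []
    for case in cases:
        output = model(case.prompt_input)
        if case.must_contain not in output:
            failures.append(f"{case.prompt_input!r}: missing {case.must_contain!r}")
    return failures

# In CI you would call the real model here; a stub shows the shape.
def stub_model(text: str) -> str:
    return f"ACME Corp reported revenue in {text}"

failures = run_eval(stub_model, [EvalCase("2023", "ACME"), EvalCase("2024", "2024")])
assert failures == []
```

Run it on every prompt edit and every model-version bump; a non-empty failure list tells you quality drifted before your users do.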
If your LLM call returns freeform text that you then parse with string operations, you have a time bomb.
Use structured output from the start. With OpenAI's function calling or instructor, you get typed, validated output that your application code can depend on. Pydantic schemas mean extraction failures surface as explicit errors, not silent garbage.
from pydantic import BaseModel
from instructor import patch
from openai import OpenAI

# patch() wraps the client so chat.completions.create accepts response_model
client = patch(OpenAI())

class ExtractionResult(BaseModel):
    company_name: str
    revenue: float | None  # None when the document doesn't state it
    year: int

# document_text is the raw text you want to extract from
result = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": document_text}],
    response_model=ExtractionResult,
)
# result is now a typed ExtractionResult instance, not a string
In a RAG system, the bottleneck is almost always retrieval, not generation.
I've seen teams spend weeks optimizing prompts while the real problem is that their chunking strategy destroys context, or their embeddings don't capture domain-specific semantics.
Before you tune the prompt, audit your retrieval:
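To make the audit concrete, here is the kind of script I mean: dump the top-k chunks for a handful of real user queries and read them yourself. The toy lexical-overlap scorer below is a stand-in for whatever embedding similarity your vector store actually computes; the auditing loop is the point.

```python
from collections import Counter

def score(query: str, chunk: str) -> float:
    """Toy word-overlap score -- stand-in for your real embedding similarity."""
    q, c = Counter(query.lower().split()), Counter(chunk.lower().split())
    overlap = sum((q & c).values())
    return overlap / max(len(query.split()), 1)

def audit_retrieval(queries: list[str], chunks: list[str], k: int = 3) -> dict[str, list[str]]:
    """For each query, return the top-k chunks so a human can judge relevance."""
    report = {}
    for query in queries:
        ranked = sorted(chunks, key=lambda ch: score(query, ch), reverse=True)
        report[query] = ranked[:k]
    return report

report = audit_retrieval(
    ["2023 revenue for ACME"],
    ["ACME revenue in 2023 was $10M", "Unrelated shipping policy", "ACME founded 1999"],
    k=2,
)
# Read the report: if the right chunk isn't in the top k, fix retrieval first.
```

If the correct passage doesn't show up in the top k for your real queries, no amount of prompt tuning will save the answer.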
A response that takes 8 seconds feels broken to users, even if it's technically correct.
Options I've used in production:
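One option in that category (my example, not an exhaustive list) is streaming: time-to-first-token matters more than total latency, because users see progress immediately. A sketch of the consumer side, with a stand-in generator where the real streamed API response would go:

```python
from typing import Iterator

def fake_stream() -> Iterator[str]:
    """Stand-in for a streamed LLM response (e.g. iterating an OpenAI stream)."""
    yield from ["The ", "answer ", "is ", "42."]

def render_stream(tokens: Iterator[str]) -> str:
    """Flush each chunk to the UI as it arrives; return the full text at the end."""
    parts = []
    for token in tokens:
        parts.append(token)
        # In a real app: push `token` to the client here (SSE, websocket, etc.)
    return "".join(parts)

assert render_stream(fake_stream()) == "The answer is 42."
```

The total generation time is unchanged, but the perceived latency drops to however long the first token takes.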
LLMs fail silently. The model doesn't throw an exception when it confabulates — it just returns a confident-sounding wrong answer.
Build explicit uncertainty signals into your system:
Users tolerate "I couldn't find that information" far better than they tolerate plausible-but-wrong answers.
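One way to build that signal, sketched with a hypothetical threshold and stubbed dependencies: refuse to answer when the best retrieved evidence is too weak, instead of letting the model improvise.

```python
NO_ANSWER = "I couldn't find that information."

def answer_with_abstention(question: str, retrieve, generate, min_score: float = 0.35) -> str:
    """Return an explicit 'not found' instead of a low-evidence guess."""
    chunks = retrieve(question)  # list of (text, similarity_score) pairs
    best = max((s for _, s in chunks), default=0.0)
    if best < min_score:
        return NO_ANSWER
    context = "\n".join(text for text, _ in chunks)
    return generate(question, context)

# Stub dependencies to show the behavior:
weak_retrieve = lambda q: [("irrelevant text", 0.1)]
strong_retrieve = lambda q: [("ACME revenue was $10M", 0.9)]
gen = lambda q, ctx: f"Based on: {ctx}"

assert answer_with_abstention("revenue?", weak_retrieve, gen) == NO_ANSWER
assert answer_with_abstention("revenue?", strong_retrieve, gen).startswith("Based on")
```

The right threshold is empirical; pick it by looking at the score distribution for queries you know have no good answer in your corpus.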
These lessons cost time to learn. I'd have shipped better software faster if someone had told me these things earlier. Hopefully this saves you some iteration cycles.
If you're building something in this space and want to talk through your architecture, reach out.