thought umwelt

Document OCR - Decisions

Oct 07, 2025

Decisions

Framework: Pydantic AI
1. Why:
  1. We need typed data models to guide the LLM's output, and Pydantic is the best option for that.
  2. The Pydantic team has launched a lightweight wrapper, Pydantic AI, which does a good job of abstraction.
  3. Logging (traces and evals): Logfire has been launched, is in active development, and is open source.
2. Evaluated LangChain, bees, and a raw implementation.
  1. LangChain is unnecessarily heavy, documentation is fragmented and
3. Chose Pydantic AI because it is straightforward and integrates well with Logfire, which provides traces, evals, and MCP for debugging. It can also be self-hosted for maximum security.
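To make the "models guide the LLM" point concrete, here is a minimal sketch using plain Pydantic (v2 API). The `ExtractedDocument` schema, its field names, and the sample replies are hypothetical and not taken from the project; in Pydantic AI a model like this would be handed to the agent as its structured output type, which is not shown here.

```python
from datetime import date

from pydantic import BaseModel, Field, ValidationError


class ExtractedDocument(BaseModel):
    """Hypothetical target schema for one OCR'd document."""

    title: str = Field(description="Document title as printed on the page")
    issued_on: date = Field(description="Issue date in ISO format (YYYY-MM-DD)")
    total_amount: float = Field(ge=0, description="Total amount, non-negative")


# The JSON schema describes the expected shape; this is what an LLM can be shown.
schema = ExtractedDocument.model_json_schema()

# Stand-in for a raw LLM reply; a real reply would come from the provider.
raw_reply = '{"title": "Invoice 17", "issued_on": "2025-10-07", "total_amount": 120.5}'
doc = ExtractedDocument.model_validate_json(raw_reply)

# A malformed reply is rejected instead of silently passing through.
bad_reply = '{"title": "Invoice 17", "issued_on": "not a date", "total_amount": -1}'
try:
    ExtractedDocument.model_validate_json(bad_reply)
    rejected = False
except ValidationError as exc:
    rejected = True
    # Collect the names of the fields that failed validation.
    error_fields = sorted(err["loc"][0] for err in exc.errors())
```

The same class serves both as the instruction to the model (via the schema) and as the gate on what comes back, which is the main reason to put Pydantic at the centre.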
Providers
1. Decision pattern:
  1. Highest priority is data security and isolation:
    1. Gemini via Vertex AI
    2. Mistral via Vertex AI - Mistral OCR via Vertex + Pydantic
    3. OpenAI via Azure
  2. Future compatibility is better if we are using some framework(?)
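One way to read the decision pattern is as a preference order that is tried in sequence. The note does not state this explicitly, so the sketch below is an assumption. It uses only the standard library; `Provider`, `PRIORITY`, `extract_with_fallback`, and `fake_call` are hypothetical names, and the provider entries are labels rather than real model identifiers.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class Provider:
    name: str      # label only, not a real model identifier
    platform: str  # hosting platform that provides the isolation guarantees


# Order mirrors the list above (assumed to be a preference order).
PRIORITY = (
    Provider("Gemini", "Vertex AI"),
    Provider("Mistral OCR", "Vertex AI"),
    Provider("OpenAI", "Azure"),
)


def extract_with_fallback(
    document: bytes,
    call: Callable[[Provider, bytes], str],
    providers: Sequence[Provider] = PRIORITY,
) -> tuple[Provider, str]:
    """Try each provider in order and return the first successful result."""
    errors = []
    for provider in providers:
        try:
            return provider, call(provider, document)
        except Exception as exc:  # keep going, but remember why it failed
            errors.append(f"{provider.name}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))


def fake_call(provider: Provider, document: bytes) -> str:
    """Stand-in for a real API call; simulates the first provider being down."""
    if provider.name == "Gemini":
        raise TimeoutError("simulated outage")
    return f"text extracted by {provider.name}"


used, text = extract_with_fallback(b"%PDF-", fake_call)
```

Keeping the provider call behind a single callable is also what makes the "future compatibility" point plausible: swapping a provider changes one list entry rather than the extraction logic.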

Implementation notes

Mistral OCR via Vertex + Pydantic
