This project exposes the Vertex AI-hosted Mistral OCR (25.05) model behind a `/mistral-ocr` FastAPI endpoint.
## Requirements

- `VERTEX_PROJECT_ID`, `VERTEX_LOCATION`, and `MISTRAL_VERTEX_MODEL` (`publishers/mistralai/models/mistral-ocr-2505`) must be present in the environment (see `.env`).
- Service-account credentials are loaded via `VERTEX_CREDENTIALS_JSON`, `VERTEX_CREDENTIALS_PATH`, or `GOOGLE_APPLICATION_CREDENTIALS`.
- Python dependencies already include `google-cloud-aiplatform` and `pydantic-ai`.
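The precedence among the three credential variables can be sketched as follows (a minimal sketch; the real `load_vertex_credentials` helper may differ in detail):

```python
import json
import os

def resolve_credentials_source():
    """Return (kind, value) for the first configured credential source.

    Checks the variables in the precedence order listed above: inline JSON
    first, then an explicit path, then the standard ADC variable.
    """
    inline = os.getenv("VERTEX_CREDENTIALS_JSON")
    if inline:
        json.loads(inline)  # fail fast on malformed JSON
        return ("json", inline)
    for var in ("VERTEX_CREDENTIALS_PATH", "GOOGLE_APPLICATION_CREDENTIALS"):
        path = os.getenv(var)
        if path:
            return ("path", path)
    raise RuntimeError("No Vertex service-account credentials configured")
```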
## Implementation Overview

- Raw Predict + Data URLs (`src/ocr_pydantic/helpers/mistral_ocr.py`)
  - Documents are base64-encoded into a `data:<mime>;base64,<payload>` URL and sent to `projects/<project>/locations/<region>/publishers/mistralai/models/mistral-ocr-2505:rawPredict`.
  - We keep a cached `PredictionServiceClient` and lazily initialise Vertex credentials through `load_vertex_credentials`.
  - Responses are parsed for markdown fragments; every page is concatenated into a single string. Embedded images (if present) are also captured for internal logging.
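The data-URL construction and page concatenation can be sketched as below. The request-body schema and the `pages[].markdown` response shape follow Mistral's public OCR API and are assumptions about what `rawPredict` proxies:

```python
import base64
import json

def build_raw_predict_body(document: bytes, mime_type: str) -> str:
    """Encode a document as a data URL inside the rawPredict JSON body."""
    payload = base64.b64encode(document).decode("ascii")
    data_url = f"data:{mime_type};base64,{payload}"
    return json.dumps({
        "model": "mistral-ocr-2505",
        # Assumed body shape, mirroring Mistral's OCR API document_url input.
        "document": {"type": "document_url", "document_url": data_url},
    })

def extract_markdown(response: dict) -> str:
    """Concatenate per-page markdown fragments into a single string."""
    pages = response.get("pages", [])
    return "\n\n".join(page.get("markdown", "") for page in pages)
```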
- FastAPI Endpoint (`src/ocr_pydantic/main.py`)
  - `POST /mistral-ocr` accepts multipart uploads (`application/pdf`, `image/png`, `image/jpeg`).
  - The endpoint returns `{ "markdown": "<concatenated markdown>" }` with HTTP 200 on success. Validation errors surface as HTTP 400, unexpected failures as HTTP 500.
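The validation and response contract can be summarised in a small sketch (helper names are hypothetical; the real handler lives in `src/ocr_pydantic/main.py`):

```python
# Content types accepted by the multipart upload, as documented above.
ALLOWED_TYPES = {"application/pdf", "image/png", "image/jpeg"}

def response_for(content_type, markdown=None, error=None):
    """Map an upload outcome to the (status, body) contract described above."""
    if content_type not in ALLOWED_TYPES:
        # Validation failures surface as HTTP 400.
        return 400, {"detail": f"Unsupported content type: {content_type}"}
    if error is not None:
        # Unexpected failures surface as HTTP 500.
        return 500, {"detail": "OCR failed"}
    # Success: a single "markdown" field with all pages concatenated.
    return 200, {"markdown": markdown}
```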
- Observability
  - `logfire.instrument_pydantic_ai()` is called at module load.
  - Each `rawPredict` invocation is wrapped in a `logfire.span` named `"Mistral OCR rawPredict"` with `gen_ai.*` attributes (model, usage, has_image) to populate Logfire's LLM dashboards.
  - Additional info logs (upload receipt, completion summaries) preserve structured metadata without storing document contents.
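The span attributes can be sketched as a small dict builder. The key names follow the OpenTelemetry `gen_ai.*` convention; the exact keys and usage fields used here are assumptions:

```python
def span_attributes(model, usage, has_image):
    """Build gen_ai.* attributes for the "Mistral OCR rawPredict" span."""
    attrs = {
        "gen_ai.system": "mistralai",       # assumed attribute names, following
        "gen_ai.request.model": model,      # the OpenTelemetry gen_ai.* convention
        "has_image": has_image,
    }
    if usage:  # usage may be absent from the rawPredict response (see Challenge)
        attrs["gen_ai.usage.input_tokens"] = usage.get("prompt_tokens")
        attrs["gen_ai.usage.output_tokens"] = usage.get("completion_tokens")
    return attrs
```

In the real code these attributes would be passed to `logfire.span("Mistral OCR rawPredict", **attrs)`.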
## Development Notes

- The initial attempt used the Google GenAI `generate_content` API, but Vertex rejects `mistral-ocr-2505` there; switching to `rawPredict` resolves the limitation.
- Tests (`tests/test_mistral_ocr_endpoint.py`) monkeypatch the helper to avoid live calls while verifying HTTP behaviour and error propagation.
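The monkeypatching strategy can be illustrated without FastAPI: the tests swap the OCR helper for a stub so no live Vertex request is made (all names here are illustrative, not the project's actual ones):

```python
import types

def _live_call(data, mime):
    raise RuntimeError("would hit Vertex AI")

# Stand-in for the OCR helper module (illustrative shape only).
helper = types.SimpleNamespace(run_mistral_ocr=_live_call)

def endpoint(data: bytes, mime: str) -> dict:
    """Simplified handler delegating to the helper, as the real endpoint does."""
    return {"markdown": helper.run_mistral_ocr(data, mime)}

# A test monkeypatches the helper before exercising the endpoint:
helper.run_mistral_ocr = lambda data, mime: "# stubbed markdown"
```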
Use `poetry run uvicorn ocr_pydantic.main:app --reload` to exercise the endpoint locally, then issue:

```shell
curl -X POST \
  -F "file=@contract.pdf" \
  http://127.0.0.1:8000/mistral-ocr
```

The JSON response contains a single field, `markdown`, with all pages concatenated.
## Challenge

Automatic metric tracking is not available for Mistral because the `rawPredict` response does not include usage metrics, and we had to implement the call via `rawPredict` ourselves because the pydantic-ai wrapper does not handle Mistral models served through a different provider (Vertex AI).

