This project exposes the Vertex AI hosted Mistral OCR (25.05) model behind a /mistral-ocr FastAPI endpoint.

Requirements

  • VERTEX_PROJECT_ID, VERTEX_LOCATION, and MISTRAL_VERTEX_MODEL (publishers/mistralai/models/mistral-ocr-2505) must be present in the environment (see .env).

  • Service-account credentials are loaded via VERTEX_CREDENTIALS_JSON, VERTEX_CREDENTIALS_PATH, or GOOGLE_APPLICATION_CREDENTIALS.

  • Python dependencies already include google-cloud-aiplatform and pydantic-ai.
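
A minimal .env for local development might look like the following; the project ID, region, and key path are placeholders, not values from this repository:

    VERTEX_PROJECT_ID=my-gcp-project
    VERTEX_LOCATION=us-central1
    MISTRAL_VERTEX_MODEL=publishers/mistralai/models/mistral-ocr-2505
    VERTEX_CREDENTIALS_PATH=/path/to/service-account.json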

Implementation Overview

  1. Raw Predict + Data URLs (src/ocr_pydantic/helpers/mistral_ocr.py)
  • Documents are base64 encoded into a data:<mime>;base64,<payload> URL and sent to projects/<project>/locations/<region>/publishers/mistralai/models/mistral-ocr-2505:rawPredict.

  • We keep a cached PredictionServiceClient and lazily initialise Vertex credentials through load_vertex_credentials.

  • Responses are parsed for markdown fragments; every page is concatenated into a single string. Embedded images (if present) are also captured for internal logging.
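
For illustration, a condensed sketch of such a helper is shown below. The function name, the JSON payload keys (document, document_url/image_url), and the pages/markdown fields of the response are assumptions based on Mistral's public OCR API, not verbatim code from mistral_ocr.py:

    import base64
    import json
    import os

    from google.api import httpbody_pb2
    from google.cloud import aiplatform_v1


    def _client(location: str) -> aiplatform_v1.PredictionServiceClient:
        # One client per region; the real helper caches this and builds it from
        # the credentials returned by load_vertex_credentials.
        return aiplatform_v1.PredictionServiceClient(
            client_options={"api_endpoint": f"{location}-aiplatform.googleapis.com"}
        )


    def ocr_document(data: bytes, mime_type: str) -> str:
        project = os.environ["VERTEX_PROJECT_ID"]
        location = os.environ["VERTEX_LOCATION"]
        model = os.environ["MISTRAL_VERTEX_MODEL"]  # publishers/mistralai/models/mistral-ocr-2505

        # Base64-encode the upload into a data URL, as described above.
        data_url = f"data:{mime_type};base64,{base64.b64encode(data).decode()}"

        # PDFs go through "document_url", images through "image_url" (assumed payload shape).
        if mime_type.startswith("image/"):
            document = {"type": "image_url", "image_url": data_url}
        else:
            document = {"type": "document_url", "document_url": data_url}
        body = {"model": model.rsplit("/", 1)[-1], "document": document}

        response = _client(location).raw_predict(
            endpoint=f"projects/{project}/locations/{location}/{model}",
            http_body=httpbody_pb2.HttpBody(
                content_type="application/json",
                data=json.dumps(body).encode("utf-8"),
            ),
        )

        # Concatenate the markdown of every returned page into a single string.
        pages = json.loads(response.data).get("pages", [])
        return "\n\n".join(page.get("markdown", "") for page in pages)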

  2. FastAPI Endpoint (src/ocr_pydantic/main.py)
  • POST /mistral-ocr accepts multipart uploads (application/pdf, image/png, image/jpeg).

  • The endpoint returns { "markdown": "<concatenated markdown>" } with HTTP 200 on success. Validation errors surface as HTTP 400, unexpected failures as HTTP 500.
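
A minimal sketch of an endpoint with that contract, assuming the helper from the previous section is importable as ocr_document (a hypothetical name):

    from fastapi import FastAPI, HTTPException, UploadFile

    # Hypothetical import: the real endpoint calls whatever function
    # src/ocr_pydantic/helpers/mistral_ocr.py actually exposes.
    from ocr_pydantic.helpers.mistral_ocr import ocr_document

    app = FastAPI()

    ALLOWED_TYPES = {"application/pdf", "image/png", "image/jpeg"}


    @app.post("/mistral-ocr")
    async def mistral_ocr(file: UploadFile) -> dict[str, str]:
        # Unsupported content types surface as HTTP 400.
        if file.content_type not in ALLOWED_TYPES:
            raise HTTPException(
                status_code=400, detail=f"Unsupported content type: {file.content_type}"
            )

        try:
            markdown = ocr_document(await file.read(), file.content_type)
        except HTTPException:
            raise
        except Exception as exc:
            # Unexpected failures surface as HTTP 500.
            raise HTTPException(status_code=500, detail="OCR processing failed") from exc

        return {"markdown": markdown}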

  3. Observability
  • logfire.instrument_pydantic_ai() is called at module load.

  • Each rawPredict invocation is wrapped in a logfire.span named "Mistral OCR rawPredict" with gen_ai.* attributes (model, usage, has_image) to populate Logfire’s LLM dashboards.

  • Additional info logs (upload receipt, completion summaries) preserve structured metadata without storing document contents.
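
The instrumentation boils down to something like the sketch below; attribute keys other than has_image are illustrative (gen_ai.request.model follows the OpenTelemetry gen_ai convention), and ocr_document is the same hypothetical helper as above:

    import logfire

    from ocr_pydantic.helpers.mistral_ocr import ocr_document  # hypothetical helper name

    # Typically called once at startup; the Logfire token is read from the environment.
    logfire.configure()
    # Called at module load so pydantic-ai calls are traced automatically.
    logfire.instrument_pydantic_ai()


    def ocr_with_span(data: bytes, mime_type: str) -> str:
        # Wrap the rawPredict call in a span so it appears in Logfire's LLM dashboards.
        with logfire.span(
            "Mistral OCR rawPredict", has_image=mime_type.startswith("image/")
        ) as span:
            span.set_attribute("gen_ai.request.model", "mistral-ocr-2505")
            markdown = ocr_document(data, mime_type)
            # Completion summary: structured metadata only, never the document contents.
            span.set_attribute("markdown_chars", len(markdown))
            return markdown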

Development Notes

  • The initial attempt used the Google GenAI generate_content API, but Vertex rejects mistral-ocr-2505 there; switching to rawPredict resolves the limitation.

  • Tests (tests/test_mistral_ocr_endpoint.py) monkeypatch the helper to avoid live calls while verifying HTTP behaviour and error propagation.
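
In that spirit, a test against the endpoint might monkeypatch the helper roughly as follows; the module layout and the ocr_document name are the same assumptions used in the sketches above:

    from fastapi.testclient import TestClient

    from ocr_pydantic import main  # module that defines app and calls the OCR helper


    def test_mistral_ocr_returns_markdown(monkeypatch):
        # Replace the Vertex helper so the test never issues a live rawPredict call.
        monkeypatch.setattr(main, "ocr_document", lambda data, mime_type: "# page 1")

        client = TestClient(main.app)
        response = client.post(
            "/mistral-ocr",
            files={"file": ("contract.pdf", b"%PDF-1.4 fake", "application/pdf")},
        )

        assert response.status_code == 200
        assert response.json() == {"markdown": "# page 1"}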

Use poetry run uvicorn ocr_pydantic.main:app --reload to exercise the endpoint locally, then issue:

    curl -X POST \
      -F "file=@contract.pdf" \
      http://127.0.0.1:8000/mistral-ocr

The JSON response contains a single field, markdown, with all pages concatenated.

Challenge

Automatic metric tracking is not available for this endpoint: the Mistral OCR response does not include usage metrics (see Doc Reference), and we had to implement the call via rawPredict because the pydantic-ai wrapper does not handle Mistral models served through a different provider.