exo/tmp/old_tests/test_vision_cache.py
ciaranbor fa57131374 Integration tests infra (#1995)
## Motivation

No automated integration tests exist for exo. Manual testing against
real hardware clusters is slow and error-prone. We need a pytest
framework that deploys clusters via `eco`, runs inference scenarios, and
tears down cleanly.

## Changes

- **`tools/src/exo_tools/`**: New workspace member shared by bench,
  eval, and tests:
  - `client.py`: `ExoClient` HTTP client (extracted from
    `bench/harness.py`)
  - `harness.py`: instance lifecycle helpers (placement, wait-for-ready,
    etc.)
  - `cluster.py`: `EcoSession` for eco cluster lifecycle
    (deploy/stop/start/release/logs/exec) with unique `USER=<prefix>-<uuid>`
    per session and atexit/signal cleanup
- **`tests/integration/`**: 17 pytest tests across 5 files:
  - `test_1node.py`: place, chat, multi-turn, delete, state/models
    endpoints, cluster snapshot, download-from-scratch
  - `test_2node.py`: parametrized tensor/jaccl + pipeline/ring inference
    and multi-turn
  - `test_4node.py`: parametrized 4-node pipeline/ring inference, cluster
    state
  - `test_resilience.py`: full disconnect/reconnect cycle (2-node →
    disconnect → 1-node → reconnect → 2-node)
  - `test_dashboard.py`: Playwright: dashboard loads, shows node info,
    chat flow
  - `helpers.py`: placement/inference helpers, re-exports from
    `exo_tools`
  - `conftest.py`: session-scoped cluster fixtures with constraint-based
    eco reservations; `--hosts` override; `EXO_REF` env var for CI
    deployments from a GitHub branch
- **`bench/`**: Updated imports from `exo_tools.client` /
  `exo_tools.harness`
- **`pyproject.toml`**: Added `tools` workspace member, `playwright`
  dev dep, `--ignore=tests/integration`
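
The unique `USER=<prefix>-<uuid>` naming and the atexit/signal cleanup described for `cluster.py` can be sketched roughly as follows. This is a minimal illustration, not the actual `EcoSession`: the names `unique_user`, `SessionSketch`, and `install_signal_cleanup` are hypothetical, and the `release` callback stands in for whatever eco call frees the reservation.

```python
import atexit
import signal
import uuid
from typing import Callable


def unique_user(prefix: str) -> str:
    """Build a per-session user name of the form <prefix>-<uuid>."""
    return f"{prefix}-{uuid.uuid4()}"


class SessionSketch:
    """Hypothetical stand-in for a session that must release resources once."""

    def __init__(self, prefix: str, release: Callable[[str], None]) -> None:
        self.user = unique_user(prefix)
        self._release = release  # assumed callback that frees the reservation
        self._released = False
        # Run cleanup on normal interpreter exit.
        atexit.register(self.close)

    def close(self) -> None:
        """Release the session; safe to call more than once."""
        if self._released:
            return
        self._released = True
        self._release(self.user)


def install_signal_cleanup(session: SessionSketch) -> None:
    """Release the session on SIGTERM, then exit (call from the main thread)."""

    def handler(signum, frame):
        session.close()
        raise SystemExit(128 + signum)

    signal.signal(signal.SIGTERM, handler)
```

Making `close` idempotent matters here because the same cleanup can be reached from an explicit teardown, the atexit hook, and a signal handler.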

## Why It Works

Tests use `eco` for cluster lifecycle and `ExoClient` for API
interactions, the same tools humans use. Session-scoped fixtures deploy
once per file. Unique eco users prevent test runs from interfering with
each other or with manual usage.

## Test Plan

### Automated Testing

- `uv run pytest tests/integration/ -v -s`: full suite (~4-5 min, 17/17
  passing)
- `uv run pytest tests/integration/ -v -s --hosts s4,s9,s10,s22`: pin
  specific hosts
- `EXO_REF=main uv run pytest tests/integration/ -v`: deploy from a
  GitHub branch (CI)
- `uv run pytest`: confirms integration tests are excluded from default
  runs
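
The `--hosts` and `EXO_REF` inputs used in the commands above could be interpreted along these lines. This is only a sketch under assumptions: `parse_hosts` and `deploy_ref` are hypothetical helper names, and the real `conftest.py` may handle these values differently (for example through pytest's option machinery).

```python
from typing import List, Mapping, Optional


def parse_hosts(raw: Optional[str]) -> List[str]:
    """Split a comma-separated --hosts value; None or empty means no pinning."""
    if not raw:
        return []
    return [host.strip() for host in raw.split(",") if host.strip()]


def deploy_ref(env: Mapping[str, str]) -> Optional[str]:
    """Return the branch named by EXO_REF, or None when it is unset or blank."""
    ref = env.get("EXO_REF", "").strip()
    return ref or None
```

Passing the environment in as a mapping (for instance `os.environ`) keeps the helper easy to exercise without mutating the real process environment.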
2026-05-08 17:15:08 +01:00


from exo.worker.engines.mlx.cache import KVPrefixCache
from exo.worker.engines.mlx.vision import MediaRegion

# Short alias for the method under test. The calls below pass the matched
# prefix length, the cached media regions, and the query media regions.
validate = KVPrefixCache._validate_media_match


class TestValidateMediaMatch:
    def test_text_only_no_truncation(self):
        # No media on either side: the match length is returned unchanged.
        assert validate(8000, [], []) == 8000

    def test_text_prefix_before_image(self):
        # The match ends exactly where the cached image starts.
        cached = [MediaRegion("hashA", 5000, 8600)]
        assert validate(5000, cached, []) == 5000

    def test_same_image_same_position(self):
        # Identical image at the same position: the full match is kept.
        cached = [MediaRegion("hashA", 5000, 8600)]
        query = [MediaRegion("hashA", 5000, 8600)]
        assert validate(9000, cached, query) == 9000

    def test_different_image_truncates(self):
        # A different image at the same position truncates to the region start.
        cached = [MediaRegion("hashA", 5000, 8600)]
        query = [MediaRegion("hashB", 5000, 8600)]
        assert validate(9000, cached, query) == 5000

    def test_match_below_region_start(self):
        # The mismatching region lies past the match, so nothing is truncated.
        cached = [MediaRegion("hashA", 5000, 8600)]
        query = [MediaRegion("hashB", 5000, 8600)]
        assert validate(4000, cached, query) == 4000

    def test_text_followup_no_images_in_query(self):
        # The query carries no media regions: the full match is kept.
        cached = [MediaRegion("hashA", 5000, 8600)]
        assert validate(9000, cached, []) == 9000

    def test_multiple_images_first_mismatch_truncates(self):
        # The first region matches; the second differs and truncates at its start.
        cached = [
            MediaRegion("hashA", 2000, 4000),
            MediaRegion("hashB", 6000, 8000),
        ]
        query = [
            MediaRegion("hashA", 2000, 4000),
            MediaRegion("hashC", 6000, 8000),
        ]
        assert validate(9000, cached, query) == 6000

    def test_multiple_images_all_match(self):
        # Every region matches: the full match is kept.
        cached = [
            MediaRegion("hashA", 2000, 4000),
            MediaRegion("hashB", 6000, 8000),
        ]
        query = [
            MediaRegion("hashA", 2000, 4000),
            MediaRegion("hashB", 6000, 8000),
        ]
        assert validate(9000, cached, query) == 9000

    def test_no_cached_regions(self):
        # Nothing cached to compare against: the match length is unchanged.
        query = [MediaRegion("hashA", 100, 200)]
        assert validate(500, [], query) == 500

    def test_cached_region_beyond_match(self):
        # Both regions start after the match ends, so they are ignored.
        cached = [MediaRegion("hashA", 10000, 12000)]
        query = [MediaRegion("hashB", 10000, 12000)]
        assert validate(5000, cached, query) == 5000
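
For readers without the `exo` sources at hand, the behavior these assertions pin down can be reproduced with a small stand-in. This is a sketch that satisfies the cases above, not the actual `KVPrefixCache._validate_media_match`. The `Region` dataclass and its field names (`content_hash`, `start`, `end`) are assumptions inferred from the positional arguments passed to `MediaRegion`, and pairing regions by start offset is one plausible rule among several that fit the tests.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Region:
    """Hypothetical stand-in for MediaRegion; field names are assumed."""

    content_hash: str
    start: int
    end: int


def validate_media_match(
    match_length: int,
    cached: Sequence[Region],
    query: Sequence[Region],
) -> int:
    """Return the usable prefix length after comparing media regions.

    A cached region that starts inside the matched prefix truncates the
    match to its start offset when the query has a region at the same
    offset with a different hash. Otherwise the match is kept as-is.
    """
    query_by_start = {region.start: region for region in query}
    for region in sorted(cached, key=lambda r: r.start):
        if region.start >= match_length:
            break  # regions past the matched prefix cannot affect it
        other = query_by_start.get(region.start)
        if other is not None and other.content_hash != region.content_hash:
            return region.start
    return match_length
```

Truncating at the region start rather than discarding the whole match keeps the text prefix before a changed image reusable, which is what `test_different_image_truncates` and `test_multiple_images_first_mismatch_truncates` expect.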