exo/nix at distributed-settings - exo

mirror/exo

mirror of https://github.com/exo-explore/exo.git synced 2026-04-17 12:30:29 -04:00

Files

rltakashige 43b3df45fb Fix BatchGenerator in line with upstream refactor (and prevent Qwen3.5 memory leak) (#1835)

## Motivation

MLX LM recently underwent a major refactor of its BatchGenerator.
Since we want newer MLX LM features such as Gemma 4, we need to
update our code to match the refactored interface.

Additionally, this fixes a significant memory leak in GatedDeltaNet. The
leak was substantial, up to 1 GB per 1000 tokens, which explains
several memory issues users were facing with Qwen3.5 models.

## Testing
Before
<img width="3146" height="884" alt="image"
src="https://github.com/user-attachments/assets/5af0f55a-393c-4a32-9eed-ae43f1611af4"
/>


After (no memory leak; the fix is one of the upstream changes)
<img width="3190" height="892" alt="image"
src="https://github.com/user-attachments/assets/f0bd128d-fd48-40d4-9bbd-50a564beab14"
/>

2026-04-07 11:50:12 +00:00

apple-sdk/metadata

nix: override apple-sdk to 26.2 and enable MLX_BUILD_CPU (#1443)

2026-02-10 19:53:53 +00:00

apple-sdk-overlay.nix

nix: override apple-sdk to 26.2 and enable MLX_BUILD_CPU (#1443)

2026-02-10 19:53:53 +00:00

darwin-build-fixes.patch

Fix BatchGenerator in line with upstream refactor (and prevent Qwen3.5 memory leak) (#1835)

2026-04-07 11:50:12 +00:00

metal-toolchain.nix

mlx: build with Nix (#1285)

2026-01-29 14:07:00 +00:00

mlx.nix

Fix BatchGenerator in line with upstream refactor (and prevent Qwen3.5 memory leak) (#1835)

2026-04-07 11:50:12 +00:00