## Motivation
The previous fix still failed in CI. Suspecting a permissions issue with
GITHUB_TOKEN not being able to see draft releases via the API.
## Changes
1. Add explicit `permissions: contents: write` to the job
2. Use `gh release list` first to check whether the draft exists (this uses a
different code path that might work better; see the sketch below)
3. Add debug echo statements
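A rough sketch of the check in step 2 (the exact step wiring and gh CLI version requirements are assumptions, not the final workflow code):
```bash
# Sketch only: use `gh release list` to see whether a draft release exists for this tag.
# --json/--jq on `gh release list` requires a recent gh CLI.
if gh release list --json name,isDraft \
     --jq ".[] | select(.isDraft and .name == \"$GITHUB_REF_NAME\")" | grep -q .; then
  echo "Found draft release for $GITHUB_REF_NAME"
else
  echo "No draft release found for $GITHUB_REF_NAME"
fi
```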
## Test Plan
Delete v1.0.63 tag and re-push after merging.
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
## Motivation
Fixes the draft release detection that failed on the v1.0.63 release
attempt.
## Changes
The jq query was piped to `head -1`, which truncated the multi-line JSON
output to just `{`, causing the empty check to fail.
Changed to use `first // empty` in jq instead.
## Test Plan
Tested locally:
```bash
GITHUB_REF_NAME="v1.0.63"
gh api repos/exo-explore/exo/releases --jq "[.[] | select(.draft == true) | select(.name == \"$GITHUB_REF_NAME\")] | first // empty"
# Returns the full draft release JSON (2711 chars)
```
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
## Motivation
Closes #1140
Currently releases are uploaded to S3 for Sparkle updates but there's no
GitHub Release created, and Sparkle update dialogs don't show release
notes. Users have no visibility into what changed.
## Changes
- Added release workflow documentation comment at top of `build-app.yml`
- Added "Fetch release notes for Sparkle" step that converts markdown
from draft GitHub release to HTML
- Added "Inject release notes into appcast" step that embeds HTML in
appcast.xml with CDATA
- Added "Publish GitHub Release" step that attaches DMG and publishes
the draft
## Why It Works
- Sparkle's `<description>` tag supports HTML wrapped in CDATA for
rendering in update dialogs
- GitHub's markdown API (`/markdown`) converts the release notes to HTML
with proper formatting (see the sketch below)
- Draft releases allow writing polished notes before the build, then the
workflow publishes them automatically
- The workflow fails if no draft release exists, ensuring release notes
are always provided
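A minimal sketch of the fetch-and-convert flow described above (variable names are illustrative, not the actual workflow steps):
```bash
# Pull the draft release body for this tag; $GITHUB_REPOSITORY and $GITHUB_REF_NAME
# come from the Actions environment.
NOTES_MD=$(gh api "repos/$GITHUB_REPOSITORY/releases" \
  --jq "[.[] | select(.draft == true) | select(.name == \"$GITHUB_REF_NAME\")] | first | .body // empty")
[ -n "$NOTES_MD" ] || { echo "No draft release found for $GITHUB_REF_NAME" >&2; exit 1; }

# Convert markdown to HTML; gh api switches to POST when -f fields are given.
NOTES_HTML=$(gh api /markdown -f text="$NOTES_MD" -f mode=gfm)
```
The inject step then embeds `$NOTES_HTML` inside `<description><![CDATA[...]]></description>` in appcast.xml.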
## Test Plan
### Manual Testing
1. Create a draft GitHub release for a new tag with markdown release
notes
2. Push the tag to trigger the workflow
3. Verify the GitHub release is published with DMG attached
4. Download appcast.xml from S3 and verify
`<description><![CDATA[...]]></description>` contains HTML
5. Test Sparkle update dialog on macOS to confirm release notes appear
### Automated Testing
No automated tests added; this is CI workflow configuration.
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Continue working towards a fully Nix-based build by building the
dashboard with Nix. Continuing the theme of using the existing lock
files, use dream2nix to parse the lock file and build the tree of
dependency derivations.
dream2nix doesn't like the bundleDependencies, so we apply a small patch
to the lock file that drops all dependencies that are bundled. This
should ideally be contributed upstream but that can be done later.
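As a hedged sketch, assuming an npm lockfile v2/v3 layout where bundled entries carry `inBundle: true` (the actual patch in this change may differ):
```bash
# Illustrative only: drop lockfile entries that npm marks as bundled, since
# dream2nix rejects bundleDependencies.
jq '.packages |= with_entries(select(.value.inBundle != true))' \
  package-lock.json > package-lock.patched.json
```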
Use this new dashboard build in the build-app CI workflow, meaning
future macOS apps will include this reproducible dashboard.
Test plan:
- Built a DMG, shipped to a cluster, loaded in a browser with no cache
and the dashboard looks good.
- Directory layout is as expected:
```
$ nix build .#dashboard
$ find result/
...
result/_app/immutable/entry
result/_app/immutable/entry/app.CTPAnMjf.js
result/_app/immutable/entry/start.fUSEa-2O.js
result/_app/immutable/nodes
result/_app/immutable/nodes/3.DqQr1Obm.js
result/_app/immutable/nodes/0.DgEY44RO.js
result/_app/immutable/nodes/2.BjZg_lJh.js
result/_app/immutable/nodes/1.D6vGUYYT.js
result/_app/env.js
result/_app/version.json
result/exo-logo.png
result/favicon.ico
result/index.html
```
The CI was only running `nix flake check` on ubuntu-latest, missing
builds for other platforms and not caching packages or devShells.
Added a matrix-based `nix-build` job that runs on macos-26 (aarch64-darwin),
ubuntu-latest (x86_64-linux), and ubuntu-24.04-arm (aarch64-linux). Each
job enumerates all packages and devShells via `nix flake show --json`,
builds them in a single `nix build` call for parallelization, then runs
`nix flake check`. The cachix-action pushes all built outputs automatically.
This ensures all Nix outputs are built and cached for every supported
platform, speeding up local development and CI runs.
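As a sketch, the enumeration and build pipeline looks roughly like this (the actual CI step may differ in details):
```bash
# Enumerate packages and devShells for the current system and build them in a
# single nix build call so cachix-action can push everything that was built.
system=$(nix eval --raw --impure --expr builtins.currentSystem)
nix flake show --json 2>/dev/null \
  | jq -r --arg sys "$system" '
      ((.packages[$sys] // {}) | keys[] | ".#packages.\($sys).\(.)"),
      ((.devShells[$sys] // {}) | keys[] | ".#devShells.\($sys).\(.)")' \
  | xargs nix build --no-link
```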
Test plan:
- Tested the jq enumeration command locally; it correctly outputs devShell paths
- Verified the xargs pipeline works with the enumerated outputs
Enable cachix and push to it in the pipeline.yml workflow. This won't
cache a huge amount yet but will automatically extend our caching as we
build more of the repo with Nix in CI. Local users can also opt in to
our cache to speed up their local builds (see the sketch below).
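For local users, opting in looks roughly like this (the cache name below is a placeholder, not the real one):
```bash
# Placeholder name for illustration; replace with the actual cache used in CI.
# `cachix use` adds the cache as a substituter in the user's nix.conf.
CACHE_NAME="example-cache"
cachix use "$CACHE_NAME"
```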
Test plan:
- CI
The build-app workflow is the most convenient way to get a DMG for
testing, but currently it's a bit limited. You have to push to test-app
every time, which is far from ideal and requires a bit too much force
pushing for my liking.
Add the workflow_dispatch trigger. This adds a button in the actions UI
to trigger a workflow for a named branch, which means you can use your
normal dev branch instead of having to push to test-app. We'll leave
that behaviour there for now too, though it may change in future.
Filter on `"${{ github.event_name }}" == "workflow_dispatch"` and set those
builds to alpha versions as well. Will verify by pushing the first version
from `main` just in case. Unfortunately we do have to merge this before we
can test it.
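As a rough sketch of how that filter might sit inside a run step (the surrounding step and variable name are assumptions):
```bash
# Inside a workflow `run:` step; ${{ ... }} is expanded by Actions before bash runs.
if [[ "${{ github.event_name }}" == "workflow_dispatch" ]]; then
  # Treat manually dispatched builds like test builds: stub an alpha version.
  VERSION="0.0.0-alpha.0"
fi
```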
Test plan:
- Looking really hard.
## Motivation
Previously we hardcoded AWS credentials into the app.
This is not good practice.
## Changes
Use presigned URLs instead.
## Why It Works
Presigned URLs are an S3 feature for exactly this kind of thing: an
expiring URL that grants specific permissions. In this case we generate a
presigned URL with `s3:PutObject` permission that expires after 5
minutes. The client uses this presigned URL to upload a bug report
instead of signing the request with its own credentials. This also
simplifies a lot of the Swift code.
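A minimal sketch of the client-side flow, assuming the backend hands back a presigned PUT URL (the endpoint path and field names here are hypothetical):
```bash
# Hypothetical endpoint and response shape, for illustration only.
PRESIGNED_URL=$(curl -fsS "https://example.invalid/bug-report/upload-url" | jq -r '.url')

# Upload directly to S3 with the short-lived URL; no AWS credentials on the client.
curl -fsS -X PUT --data-binary @bug-report.zip \
  -H "Content-Type: application/zip" \
  "$PRESIGNED_URL"
```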
## Test Plan
### Manual Testing
On a single MacBook, I downloaded the app and sent a bug report. It
worked and appeared in the bucket.
## Motivation
This PR implements benchmarking in the style of llama-bench. The main
difficulty is that exo is not a library; it exposes an endpoint, which
means benchmark numbers would be inaccurate if measured through the API.
The solution assumes nodes are set up with `uv run exo` (or via the app),
and then hits the new `/bench/chat/completions` endpoint to retrieve
generation statistics directly from mlx_lm.
This will allow us to release benchmarks for models and perform
regression tests.
TODO: Performance benchmarking.
## Changes
- Adds /bench/chat/completions endpoint
- Adds BenchChatCompletion/Response
- Adds a logits processor to prevent response from ending early
- Adds a "Prompt Sizer" which downloads the tokenizer and dynamically
adjusts the prompt of "a" to fit the desired prompt size.
- Reduce prefill step size to 2048 for now (in future, dynamically
adjust this value)
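As a hypothetical example of the interaction (the request fields and default port are assumptions based on the description above, not the actual schema):
```bash
# Illustrative only: field names are guesses, not the real request schema.
curl -s http://localhost:52415/bench/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "llama-3.2-3b", "prompt_tokens": 512, "max_tokens": 128}' \
  | jq .
```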
## Test Plan
### Manual Testing
Benchmarked Llama, Qwen, DeepSeek and Kimi models. Will require several
fixes to run consistently on all configurations (to be done in the
future).
Manually tested the normal API to verify chat requests complete as
expected.
### Automated Testing
Not really possible. Type checker passes.
As I've been working on the .dmg, it's become clear we need a way to
test changes to the app. Reproducing the full DMG locally is too hard to
be reasonable, and it's much more convenient to test when the build is
signed.
Add a feature to the build-app workflow where if you push specifically
to the `test-app` branch it'll perform a build. The version is stubbed
to `0.0.0-alpha.0`, which is about as low as it gets in semver so you'll
always update away from it automatically with Sparkle. The resulting DMG
won't be pushed to S3 but will be uploaded as a GitHub Actions artifact.
I've been using similar commits to this for a while for testing. It's
worked well and not interfered with auto updating at all.
Test plan:
- Pushed this change to `test-app`.
- Generated action at
https://github.com/exo-explore/exo/actions/runs/20447213358/job/58752909332
- Installed the DMG on a Mac. It worked as intended.