mirror of
https://github.com/exo-explore/exo.git
synced 2026-02-05 19:52:16 -05:00
closing the http request to the api now

- sends a cancellation from the api
- writes that cancellation in the master
- worker plans off the cancellation
- runner observes that cancellation after every generation step (+1 communication per token)
- cancellation happens synchronously to prevent gpu locks
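
The runner step described above (checking for a cancellation after every generation step) can be sketched roughly as follows. This is a minimal illustration, not exo's actual code: `run_generation`, the `cancellations` queue, and the task ids are hypothetical names, and an in-process `queue.Queue` stands in for whatever channel the worker and runner really share.

```python
import queue
from typing import Iterable, List


def run_generation(
    steps: Iterable[str],
    cancellations: "queue.Queue[str]",
    task_id: str,
) -> List[str]:
    """Run generation steps, stopping once task_id has been cancelled.

    Hypothetical sketch: `steps` yields one token per generation step and
    `cancellations` carries the ids of cancelled tasks.
    """
    produced: List[str] = []
    cancelled = set()
    for token in steps:
        produced.append(token)
        # One extra non-blocking check per token: drain pending cancellations.
        while True:
            try:
                cancelled.add(cancellations.get_nowait())
            except queue.Empty:
                break
        # Stop in the same loop that drives generation, between steps,
        # rather than interrupting a step from another thread.
        if task_id in cancelled:
            break
    return produced


def demo() -> List[str]:
    """Simulate a client disconnecting while the third token is generated."""
    q: "queue.Queue[str]" = queue.Queue()

    def fake_steps():
        for i in range(5):
            if i == 2:
                q.put("task-1")  # cancellation arrives mid-generation
            yield f"tok{i}"

    return run_generation(fake_steps(), q, "task-1")
```

In this sketch the check runs in the generation loop itself, so the runner only ever stops between steps. The cost is one queue poll per token, which mirrors the "+1 communication per token" noted above.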