exo/resources at 55152fa99d7f08250d3580d0d9b05532b07a9e4d - exo

mirror/exo

mirror of https://github.com/exo-explore/exo.git synced 2026-02-06 12:11:22 -05:00

Files

ciaranbor 6177550c34 Ciaran/parallel cfg (#1361 )

## Motivation

Enable parallel classifier-free guidance (CFG) for Qwen image models.
CFG requires two forward passes (positive/negative prompts) - this
allows them to run on separate nodes simultaneously, reducing latency.

## Changes

  - Added uses_cfg flag to ModelCard to identify CFG-based models
- Extended PipelineShardMetadata with CFG topology fields (cfg_rank,
cfg_world_size, peer device info)
- Updated placement to create two CFG groups with reversed ordering
(places CFG peers as ring neighbors)
- Refactored DiffusionRunner to process CFG branches separately with
exchange at last pipeline stage
- Added get_cfg_branch_data() to PromptData for single-branch embeddings
  - Fixed seed handling in API for distributed consistency
  - Fixed image yield to only emit from CFG rank 0 at last stage
  - Increased num_sync_steps_factor from 0.125 to 0.25 for Qwen

## Why It Works

- 2 nodes + CFG: Both run all layers, process different CFG branches in
parallel
  - 4+ even nodes + CFG: Hybrid - 2 CFG groups × N/2 pipeline stages
  - Odd nodes or non-CFG: Falls back to pure pipeline parallelism
 
 Ring topology places CFG peers as neighbors to enable direct exchange.

## Test Plan

### Manual Testing

Verified performance gain for Qwen-Image for 2 node and 4 node cluster.
Non-CFG models still work

### Automated Testing
 
Added tests in test_placement_utils.py covering 2-node CFG parallel,
4-node hybrid, odd-node fallback, and non-CFG pipeline modes.

2026-02-04 21:16:35 +00:00

image_model_cards

Ciaran/parallel cfg (#1361 )

2026-02-04 21:16:35 +00:00

inference_model_cards

feat: add model picker modal with grouped models and HF Hub search (#1369 )

2026-02-04 05:56:23 -08:00