38 KiB
Parallel Region Processing Architecture
Executive Summary
This document outlines a comprehensive plan to parallelize the Arnis world generation pipeline by splitting large user-selected areas into smaller processing units (1 Minecraft region = 512×512 blocks per unit). The goal is to:
- Reduce memory usage by ~90% by processing and flushing regions incrementally
- Utilize multiple CPU cores for parallel generation
- Maintain visual consistency across region boundaries (colors, elevation, etc.)
Current Architecture Analysis
Processing Pipeline Overview
┌─────────────────────────────────────────────────────────────────────────────┐
│ CURRENT PROCESSING FLOW │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ [1/7] Fetch Data (retrieve_data.rs) │
│ └── Downloads OSM data for entire bbox from Overpass API │
│ └── Single HTTP request with full bounding box │
│ │
│ [2/7] Parse Data (osm_parser.rs) │
│ └── Transforms lat/lon → Minecraft X/Z coordinates │
│ └── Clips ways/relations to bounding box (clipping.rs) │
│ └── Sorts elements by priority │
│ │
│ [3/7] Fetch Elevation (ground.rs / elevation_data.rs) │
│ └── Downloads Terrarium tiles for entire bbox │
│ └── Builds height grid matching world dimensions │
│ │
│ [4/7] Process Data (data_processing.rs) │
│ └── Pre-computes flood fills in parallel (floodfill_cache.rs) │
│ └── Builds highway connectivity map │
│ └── Collects building footprints │
│ │
│ [5/7] Process Terrain + Elements (data_processing.rs) │
│ └── Iterates ALL elements sequentially │
│ └── Calls element_processing/* for each element type │
│ └── Places blocks via WorldEditor │
│ │
│ [6/7] Generate Ground (data_processing.rs) │
│ └── Iterates ALL blocks in bbox │
│ └── Sets grass, dirt, stone, bedrock layers │
│ │
│ [7/7] Save World (world_editor/mod.rs → java.rs) │
│ └── Iterates ALL regions in memory │
│ └── Writes .mca files in parallel │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
Key Data Structures and Memory Usage
WorldToModify (world_editor/common.rs)
pub struct WorldToModify {
pub regions: FnvHashMap<(i32, i32), RegionToModify>, // Key: (region_x, region_z)
}
pub struct RegionToModify {
pub chunks: FnvHashMap<(i32, i32), ChunkToModify>, // 32×32 chunks per region
}
pub struct ChunkToModify {
pub sections: FnvHashMap<i8, SectionToModify>, // 24 sections per chunk (-4 to 19)
}
pub struct SectionToModify {
pub blocks: [Block; 4096], // 16×16×16 = 4096 blocks
pub properties: FnvHashMap<usize, Value>, // Block properties (stairs, slabs, etc.)
}
Memory estimate per region:
- Section: ~4KB (blocks) + ~variable (properties)
- Chunk: ~24 sections × 4KB = ~96KB minimum, typically ~200-500KB with properties
- Region: ~1024 chunks × 300KB = ~300MB per region
- For a 10×10 region area: ~30GB of memory required!
Why Elements Are "Scattered"
The current design processes elements in OSM priority order (entrance → building → highway → waterway → water → barrier → other), NOT by spatial location. This means:
- A building in region (0,0) might be followed by a highway in region (5,5)
- Each
set_block()call potentially accesses different regions - ALL regions must remain in memory until the end because any element might touch any region
Proposed Architecture: Region-Based Parallel Processing
Core Concept
Split the user-selected area into processing units of 1 Minecraft region each (512×512 blocks = 32×32 chunks). Process each unit independently in parallel, then flush to disk immediately.
┌─────────────────────────────────────────────────────────────────────────────┐
│ PROPOSED PARALLEL PROCESSING FLOW │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ GLOBAL PHASE (Run Once for Entire Area) │
│ ═══════════════════════════════════════ │
│ │
│ [1] Fetch Elevation Data for ENTIRE bbox │
│ └── Must be consistent across all units │
│ └── Store as shared read-only Arc<Ground> │
│ │
│ [2] Compute Processing Unit Grid │
│ └── Divide bbox into N×N region units │
│ └── Create sub-bboxes with small overlap for boundary elements │
│ │
│ PARALLEL PHASE (Per Processing Unit) │
│ ═════════════════════════════════════ │
│ │
│ For each processing unit (in parallel, using N-1 CPU cores): │
│ │
│ [3] Fetch OSM Data for Unit's Sub-BBox │
│ └── Separate Overpass API query per unit │
│ └── Include small buffer zone for boundary elements │
│ │
│ [4] Parse & Clip Elements to Unit Bounds │
│ └── Same as current, but for smaller area │
│ │
│ [5] Pre-compute Flood Fills │
│ └── Only for elements in this unit │
│ │
│ [6] Process Elements │
│ └── Generate buildings, roads, etc. │
│ └── Use deterministic RNG keyed by element ID │
│ │
│ [7] Generate Ground Layer │
│ └── Only for this unit's blocks │
│ │
│ [8] Save Regions to Disk │
│ └── Write .mca files immediately │
│ └── FREE MEMORY for this unit │
│ │
│ FINALIZATION PHASE │
│ ══════════════════ │
│ │
│ [9] Wait for all units to complete │
│ [10] Generate map preview (optional) │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
Critical Considerations
1. Deterministic Randomness ✅ ALREADY IMPLEMENTED
The codebase already has deterministic_rng.rs which provides:
// Creates RNG seeded by element ID - same element always produces same random values
pub fn element_rng(element_id: u64) -> ChaCha8Rng {
ChaCha8Rng::seed_from_u64(element_id)
}
// For coordinate-based randomness
pub fn coord_rng(x: i32, z: i32, element_id: u64) -> ChaCha8Rng
Impact on buildings crossing boundaries:
- Building colors are chosen using
element_rng(element.id)in buildings.rs - Even if a building is processed in two different units, SAME element ID → SAME color
- The existing implementation already supports this use case!
Files using deterministic RNG:
element_processing/buildings.rs- wall colors, window styles, accent blockselement_processing/natural.rs- grass/flower distributionelement_processing/tree.rs- tree variations
2. Elevation Data Consistency ⚠️ REQUIRES CHANGES
Current behavior:
- Elevation is fetched once in
ground.rs→Ground::new_enabled() - Height grid dimensions match the world's XZ dimensions
- Lookup uses relative coordinates:
ground.level(XZPoint::new(x - min_x, z - min_z))
Problem:
- If each unit downloads its own elevation tiles, slight differences in tile boundaries or interpolation could cause height discontinuities at unit boundaries
Solution:
- Download elevation ONCE for the entire area before parallel processing starts
- Pass
Arc<Ground>(read-only) to all processing units - The
Ground::level()function already uses world-relative coordinates, so no changes needed
// Proposed: Global elevation fetch before parallel processing
let global_ground = Arc::new(Ground::new_enabled(&args.bbox, args.scale, args.ground_level));
// Each processing unit receives a clone of the Arc
for unit in processing_units {
let ground_ref = Arc::clone(&global_ground);
// spawn task with ground_ref
}
3. Element Clipping ⚠️ REQUIRES NEW LOGIC
Current clipping (clipping.rs):
- Uses Sutherland-Hodgman algorithm to clip polygons to user's bbox
- Works on the OUTER boundary of the entire selected area
New requirement:
- Need to clip elements to each processing unit's internal boundary
- But with OVERLAP to handle elements that straddle unit boundaries
Proposed approach:
┌─────────────────────────────────────────────────────────────────┐
│ UNIT BOUNDARY HANDLING │
├─────────────────────────────────────────────────────────────────┤
│ │
│ Example: 4 processing units arranged in a 2×2 grid │
│ │
│ Unit A │ Unit B │
│ (regions 0,0-1,1) │ (regions 2,0-3,1) │
│ │ │
│ ┌──────────────────┼──────────────────┐ │
│ │ │ │ │
│ │ ████████ │ │ ← Building straddles │
│ │ █ BLD █─────┼──────────────────│ Unit A and B │
│ │ ████████ │ │ │
│ │ │ │ │
│ ├──────────────────┼──────────────────┤ │
│ │ │ │ │
│ │ │ │ │
│ │ │ │ │
│ │ │ │ │
│ └──────────────────┴──────────────────┘ │
│ Unit C │ Unit D │
│ (regions 0,2-1,3) │ (regions 2,2-3,3) │
│ │
└─────────────────────────────────────────────────────────────────┘
Strategy for boundary elements:
- Expanded Fetch BBox: Each unit fetches OSM data with a buffer zone (e.g., +256 blocks)
- Clip to Processing BBox: Clip elements to the unit's actual processing bounds
- Process Normally: Elements partially in the unit are still processed, just clipped
- Deterministic Results: Same element in adjacent units produces identical blocks due to RNG seeding
Example: Building straddling Unit A and B
| Step | Unit A | Unit B |
|---|---|---|
| Fetch | Gets building (with buffer) | Gets building (with buffer) |
| Clip | Clips to Unit A bounds → left half | Clips to Unit B bounds → right half |
| Color | element_rng(building_id) → BLUE |
element_rng(building_id) → BLUE |
| Place | Places left half in blue | Places right half in blue |
| Result | Seamless blue building across boundary |
4. OSM Data Downloading Strategy ⚠️ REQUIRES CAREFUL DESIGN
Options:
Option A: Download Once, Distribute Elements (RECOMMENDED)
┌─────────────────────────────────────────────────────────────────┐
│ │
│ [1] Download ALL OSM data for entire bbox (single API call) │
│ [2] Parse into ProcessedElements │
│ [3] For each processing unit: │
│ └── Filter elements that intersect unit's bbox │
│ └── Clip filtered elements to unit bounds │
│ └── Send to parallel processor │
│ │
└─────────────────────────────────────────────────────────────────┘
Pros:
- Single Overpass API call (respects rate limits)
- No duplicate data transfer
- Elements are already parsed, just need filtering
Cons:
- Must keep all elements in memory during distribution phase
- For very large areas, this might still be memory-intensive
Option B: Download Per Unit (Simpler, Higher Bandwidth)
┌─────────────────────────────────────────────────────────────────┐
│ │
│ For each processing unit (sequentially or with rate limiting): │
│ [1] Download OSM data for unit's expanded bbox │
│ [2] Parse into ProcessedElements │
│ [3] Send to parallel processor │
│ │
└─────────────────────────────────────────────────────────────────┘
Pros:
- Lower peak memory usage
- Simpler code structure
Cons:
- Multiple API calls (may hit rate limits)
- Duplicate data transfer for overlapping areas
- Slower due to network latency
Recommendation: Option A with Streaming
Download once, but use a streaming approach to distribute elements to units:
// Pseudo-code for element distribution
fn distribute_elements_to_units(
elements: Vec<ProcessedElement>,
units: &[ProcessingUnit],
) -> Vec<Vec<ProcessedElement>> {
let mut unit_elements = vec![Vec::new(); units.len()];
for element in elements {
let element_bbox = compute_element_bbox(&element);
for (i, unit) in units.iter().enumerate() {
if unit.expanded_bbox.intersects(&element_bbox) {
// Clone element for each unit that needs it
// (or use Arc for large elements)
unit_elements[i].push(element.clone());
}
}
}
unit_elements
}
5. Flood Fill Cache ⚠️ REQUIRES CHANGES
Current behavior:
FloodFillCache::precompute()runs in parallel for ALL elements- Results are stored in a
FnvHashMap<u64, Vec<(i32, i32)>> - Cache is consumed during sequential element processing
Problem:
- If we process units in parallel, each unit needs its own flood fill cache
- But we don't want to re-compute the same flood fills multiple times
Solution A: Per-Unit Flood Fill (Simpler)
// Each unit computes flood fills only for its elements
fn process_unit(unit_elements: Vec<ProcessedElement>) {
let flood_fill_cache = FloodFillCache::precompute(&unit_elements, timeout);
// Process elements using this cache
}
Pros: Simple, no coordination needed Cons: Elements at boundaries may be flood-filled twice
Solution B: Global Flood Fill + Distribution (More Complex)
// Compute flood fills globally, then distribute to units
let global_cache = FloodFillCache::precompute(&all_elements, timeout);
// For each unit, create a view into the global cache
let unit_caches: Vec<_> = units.iter()
.map(|unit| global_cache.filter_for_bbox(&unit.bbox))
.collect();
Recommendation: Start with Solution A. The overhead of re-computing some flood fills at boundaries is acceptable given the simplicity.
6. Building Footprints Bitmap ⚠️ REQUIRES CHANGES
Current behavior:
BuildingFootprintBitmapis a memory-efficient bitmap covering the entire world- Used to prevent trees from spawning inside buildings
- Created AFTER flood fill precomputation
Problem:
- With parallel processing, each unit only knows about buildings in its own area
- A tree in Unit B might spawn inside a building that exists in Unit A (near boundary)
Solution:
- Compute building footprints GLOBALLY before parallel processing
- Use
Arc<BuildingFootprintBitmap>shared across all units (read-only)
// Global building footprint computation
let all_building_coords = compute_all_building_footprints(&all_elements, &global_xzbbox);
let global_footprints = Arc::new(BuildingFootprintBitmap::from(all_building_coords));
// Each unit receives Arc clone
for unit in units {
let footprints = Arc::clone(&global_footprints);
// spawn task
}
7. Highway Connectivity ⚠️ REQUIRES CHANGES
Current behavior:
highways::build_highway_connectivity_map()creates a map of connected highway segments- Used for intersection detection and road marking placement
Problem:
- Highway segments crossing unit boundaries won't see their full connectivity
Solution:
- Build highway connectivity map GLOBALLY before parallel processing
- Pass as
Arc<HighwayConnectivityMap>to all units
8. Water Areas and Ring Merging ✅ ALREADY SUPPORTED
Current behavior:
- Water relations contain multiple ways that must be merged into closed rings
merge_way_segments()in water_areas.rs handles this- Clipping happens AFTER ring merging via
clip_water_ring_to_bbox() - Water uses
inverse_floodfill()which iterates over bounding box (not flood fill)
Why water CAN be clipped per-unit:
- Ring merging happens on the UNCLIPPED ways (preserved in osm_parser.rs)
- After merging,
clip_water_ring_to_bbox()clips the assembled polygon - The
inverse_floodfillalgorithm iterates block-by-block within bounds - Each unit can independently clip and fill its portion of a water body
No special handling needed - water relations work the same as other elements:
- Distribute relation to units that intersect its bbox
- Each unit clips to its own bounds
- Each unit fills its portion independently
9. Element Priority Order ⚠️ MUST BE PRESERVED
Current behavior:
- Elements are sorted by priority before processing (osm_parser.rs):
const PRIORITY_ORDER: [&str; 6] = [ "entrance", "building", "highway", "waterway", "water", "barrier", ]; - This ensures entrances are placed before buildings (so doors work)
- Buildings before highways (so sidewalks don't overwrite buildings)
Requirement:
- Each unit must process its elements in the SAME priority order
- This is natural: just sort the unit's elements the same way
10. SPONGE Block as Placeholder ⚠️ MINOR CONSIDERATION
Current behavior:
SPONGEblock is used as a blacklist marker in some places- Example:
editor.set_block(actual_block, x, 0, z, None, Some(&[SPONGE])); - Prevents certain blocks from overwriting sponge blocks
Impact on parallel processing:
- None - this is a per-block check, not cross-region coordination
- Each unit handles its own sponge blocks independently
11. Tree Placement and Building Footprints ⚠️ REQUIRES GLOBAL FOOTPRINTS
Current behavior:
- Trees check
building_footprints.contains(x, z)before spawning - Prevents trees from appearing inside buildings
- Uses
coord_rng(x, z, element_id)for deterministic placement
Problem with per-unit footprints:
- A tree near a unit boundary might not see a building from the adjacent unit
- Could spawn a tree inside a building that exists in neighbor unit
Solution (already planned):
- Compute building footprints GLOBALLY before parallel processing
- Pass as
Arc<BuildingFootprintBitmap>to all units - Tree placement will correctly avoid all buildings
12. Relations with Multiple Members Across Units ⚠️ REQUIRES CAREFUL HANDLING
Current behavior:
- Relations (buildings, landuse, leisure, natural) process each member way
- Member ways can be scattered across the entire bbox
Example: Building relation with courtyard
Building Relation:
- Outer way 1 (in Unit A)
- Outer way 2 (in Unit A and B) ← straddles boundary
- Inner way (courtyard, in Unit A)
Strategy:
- Distribute entire relation to all units that any member touches
- Each unit clips all members to its bounds
- Each unit processes the clipped relation independently
- Deterministic RNG ensures consistent colors/styles
Important: The relation-level tags (e.g., building:levels) must be preserved for all units processing that relation.
Proposed Processing Unit Structure
ProcessingUnit Definition
struct ProcessingUnit {
/// Which region this unit covers (1 region per unit)
region_x: i32,
region_z: i32,
/// Minecraft coordinate bounds for this unit (512×512 blocks)
min_x: i32, // region_x * 512
max_x: i32, // region_x * 512 + 511
min_z: i32, // region_z * 512
max_z: i32, // region_z * 512 + 511
/// Expanded bounds for element fetching (includes buffer for boundary elements)
fetch_min_x: i32,
fetch_max_x: i32,
fetch_min_z: i32,
fetch_max_z: i32,
}
Unit Grid Calculation
fn compute_processing_units(
global_xzbbox: &XZBBox,
buffer_blocks: i32, // e.g., 64-128 blocks overlap
) -> Vec<ProcessingUnit> {
let blocks_per_region = 512; // 32 chunks × 16 blocks
// Calculate which regions are covered by the bbox
let min_region_x = global_xzbbox.min_x() >> 9; // divide by 512
let max_region_x = global_xzbbox.max_x() >> 9;
let min_region_z = global_xzbbox.min_z() >> 9;
let max_region_z = global_xzbbox.max_z() >> 9;
let mut units = Vec::new();
// Create one unit per region
for rx in min_region_x..=max_region_x {
for rz in min_region_z..=max_region_z {
// Compute Minecraft coordinate bounds for this region
let min_x = rx * blocks_per_region;
let max_x = min_x + blocks_per_region - 1;
let min_z = rz * blocks_per_region;
let max_z = min_z + blocks_per_region - 1;
// Add buffer for fetch bounds (clamped to global bbox)
let fetch_min_x = (min_x - buffer_blocks).max(global_xzbbox.min_x());
let fetch_max_x = (max_x + buffer_blocks).min(global_xzbbox.max_x());
let fetch_min_z = (min_z - buffer_blocks).max(global_xzbbox.min_z());
let fetch_max_z = (max_z + buffer_blocks).min(global_xzbbox.max_z());
units.push(ProcessingUnit {
region_x: rx,
region_z: rz,
min_x, max_x, min_z, max_z,
fetch_min_x, fetch_max_x, fetch_min_z, fetch_max_z,
});
}
}
units
}
Parallel Execution Strategy
fn process_units_parallel(
units: Vec<ProcessingUnit>,
elements: &[ProcessedElement],
global_ground: Arc<Ground>,
global_building_footprints: Arc<BuildingFootprintBitmap>,
global_highway_connectivity: Arc<HighwayConnectivityMap>,
args: &Args,
) {
// Use CPU-1 cores for parallel processing
let num_threads = std::thread::available_parallelism()
.map(|n| n.get().saturating_sub(1).max(1))
.unwrap_or(1);
units.into_par_iter()
.with_min_len(1) // Process 1 unit per task
.for_each(|unit| {
// 1. Filter elements that intersect this unit's fetch bounds
let unit_elements = filter_elements_for_unit(elements, &unit);
// 2. Clip elements to unit's actual bounds
let clipped_elements = clip_elements_to_unit(unit_elements, &unit);
// 3. Create per-unit structures
let unit_xzbbox = XZBBox::new(unit.min_x, unit.max_x, unit.min_z, unit.max_z);
let mut editor = WorldEditor::new(args.path.clone(), &unit_xzbbox, ...);
editor.set_ground(Arc::clone(&global_ground));
// 4. Pre-compute flood fills for this unit's elements
let flood_fill_cache = FloodFillCache::precompute(&clipped_elements, args.timeout.as_ref());
// 5. Process elements (same as current, just for this unit)
for element in clipped_elements {
process_element(&mut editor, &element, ...);
}
// 6. Generate ground layer for this unit
generate_ground_for_unit(&mut editor, &unit, &global_ground);
// 7. Save region immediately and FREE MEMORY
editor.save_single_region(unit.region_x, unit.region_z);
drop(editor); // Release memory
});
}
Memory Usage Comparison
Understanding Minecraft Region Sizes
1 Region = 32×32 chunks = 512×512 blocks (horizontally)
1 Chunk = 16×16×384 blocks (with sections from Y=-64 to Y=319)
Current Architecture (All Regions in Memory)
| Stage | Memory Usage |
|---|---|
| OSM Data (parsed) | ~50-200 MB |
| Flood Fill Cache | ~100-500 MB |
| Building Footprints | ~10-50 MB |
| WorldToModify (all regions) | ~300 MB × N regions |
| Total for 100 regions | ~30+ GB |
Unit Size Analysis
The optimal unit size depends on balancing:
- Memory per unit - Larger units = more memory
- Parallelism overhead - Smaller units = more coordination
- Boundary overhead - More units = more elements processed multiple times
| Unit Size | Blocks | Memory per Unit | Parallel Units (8 cores) | Peak Memory |
|---|---|---|---|---|
| 1 region (32×32 chunks) | 512×512 | ~300 MB | 7 units | ~2.5 GB |
| 2×2 regions | 1024×1024 | ~1.2 GB | 7 units | ~9 GB |
| 4×4 regions | 2048×2048 | ~4.8 GB | 7 units | ~35 GB |
Recommendation: 1 Region Per Unit
1 region per unit is optimal because:
- Lowest memory footprint - Only ~300 MB per unit
- Natural alignment - Regions are the atomic save unit in Minecraft (.mca files)
- Maximum parallelism - More units = better CPU utilization
- Simple boundary logic - No partial region handling
Memory calculation for 7 parallel units (8-core CPU, using 7):
- Per-unit WorldToModify: ~300 MB
- Per-unit flood fill cache: ~50 MB
- Per-unit OSM elements: ~20 MB
- Peak memory: ~370 MB × 7 = ~2.6 GB
Plus global shared data:
- Elevation data: ~50-100 MB
- Building footprints: ~10-50 MB
- Highway connectivity: ~20-50 MB
Total peak: ~3 GB (vs ~30 GB for 100 regions currently!)
Why Not Smaller Than 1 Region?
- Regions are the minimum save unit for Minecraft
- Going smaller would require buffering partial regions
- No memory benefit (still need full region in memory to save)
Implementation Phases
Phase 1: Refactor Global Data Preparation
Goal: Extract global computations that must run before parallel processing
Changes:
- Move elevation fetching to a separate global phase
- Move building footprint collection to global phase
- Move highway connectivity map building to global phase
- Create shared data structures with
Arc<T>
Files affected:
data_processing.rs- restructuregenerate_world_with_options()ground.rs- no changes, already returnsGroundfloodfill_cache.rs- add method to collect building footprints globallyelement_processing/highways.rs- extract connectivity map building
Phase 2: Implement Processing Unit Grid
Goal: Add logic to divide the world into processing units
Changes:
- Create
processing_unit.rsmodule - Implement grid computation
- Implement element-to-unit distribution
- Add unit-level bounding box clipping
New files:
src/processing_unit.rs
Phase 3: Parallelize Unit Processing
Goal: Process units in parallel using rayon
Changes:
- Create per-unit WorldEditor instances
- Implement unit processing function
- Add parallel execution with CPU cap
- Implement region saving after unit completion
Files affected:
data_processing.rs- main parallel loopworld_editor/mod.rs- support per-unit saving
Phase 4: Handle Boundary Cases
Goal: Ensure seamless results across unit boundaries
Changes:
- Verify deterministic RNG produces identical results
- Implement special handling for large water bodies
- Add boundary verification tests
- Optimize overlap buffer size
Files affected:
element_processing/water_areas.rs- global water handlingclipping.rs- potential optimizations
Phase 5: Optimize Memory Management
Goal: Fine-tune memory usage and parallelism
Changes:
- Implement memory pressure monitoring
- Add dynamic unit size adjustment
- Optimize flood fill cache memory
- Profile and optimize hot paths
Testing Strategy
Unit Tests
-
Deterministic RNG Test
- Process same building in two units
- Verify identical colors/styles
-
Elevation Consistency Test
- Check ground level at unit boundaries
- Verify no height discontinuities
-
Clipping Accuracy Test
- Verify elements clipped correctly at unit boundaries
- Check polygon integrity after clipping
Integration Tests
-
Small Area Test
- Process 2×2 region area
- Verify world loads correctly in Minecraft
-
Boundary Building Test
- Create world with buildings at unit boundaries
- Verify buildings are complete and correctly colored
-
Large Water Body Test
- Process area with lake spanning multiple units
- Verify water body is continuous
Performance Tests
-
Memory Usage Test
- Monitor peak memory during processing
- Compare with current architecture
-
CPU Utilization Test
- Verify parallel units use expected cores
- Measure speedup vs sequential processing
Configuration Options
Proposed CLI Arguments
/// Number of CPU cores to use for parallel processing (default: available - 1)
/// Set to 1 to disable parallel processing
#[arg(long, default_value_t = 0)]
pub parallel_cores: usize,
/// Buffer size for boundary overlap in blocks (default: 64)
/// Larger values ensure buildings at boundaries are complete but increase processing time
#[arg(long, default_value_t = 64)]
pub boundary_buffer: i32,
Risk Assessment
High Risk
| Risk | Mitigation |
|---|---|
| Elevation discontinuities at boundaries | Use global elevation data (already planned) |
| Race conditions in file writing | Each unit writes different regions (no overlap) |
| Trees spawning inside buildings at boundaries | Use global building footprints bitmap |
Medium Risk
| Risk | Mitigation |
|---|---|
| Overpass API rate limiting | Download once globally, distribute elements |
| Complex relations broken at boundaries | Distribute full relation to all touching units |
| Highway connectivity missing at boundaries | Build connectivity map globally |
Low Risk
| Risk | Mitigation |
|---|---|
| Different random values at boundaries | Deterministic RNG already implemented |
| Performance regression | Benchmark before/after, make parallel optional |
| Water bodies split incorrectly | Water already supports clipping via clip_water_ring_to_bbox |
Questions to Resolve
-
Should we support Bedrock format with this change?
- Bedrock writes to a single .mcworld file (LevelDB database)
- May need different handling (write to temp, merge at end)
- Could be deferred to a follow-up implementation
-
What buffer size for boundary overlap?
- Current thinking: 64-128 blocks should be sufficient
- Most buildings are smaller than this
- Larger buffers = more duplicate processing
-
Should flood fills be computed globally or per-unit?
- Per-unit is simpler and avoids coordination
- Some redundant computation at boundaries (acceptable)
- Recommendation: Start per-unit
-
How to report progress across parallel units?
- Current progress is linear (element by element)
- With parallel, need aggregated progress reporting
- Option: Track completed regions, report as percentage
-
Should we limit parallelism based on available RAM?
- Could detect system RAM and adjust parallel units
- Or just document memory requirements per parallel unit
- Recommendation: Start with CPU-1 cores, let users override
Summary
The proposed parallel region processing architecture will:
- ✅ Reduce memory usage by ~90% by processing 1 region at a time per unit (~300 MB vs ~30 GB for 100 regions)
- ✅ Utilize multiple CPU cores through rayon-based parallel processing (CPU-1 cores)
- ✅ Maintain visual consistency using deterministic RNG and global shared data
- ✅ Be backward compatible with a
--no-parallelflag for the current behavior
The main implementation work is:
- Refactoring to extract global computations (elevation, building footprints, highway connectivity)
- Adding element-to-unit distribution logic with proper clipping
- Per-unit WorldEditor instances with immediate region saving
The design is simpler than originally thought because:
- Water relations already support clipping (no special handling)
- Deterministic RNG already exists (no changes needed)
- Priority order is preserved naturally (just sort per-unit)
Estimated implementation effort: 3-4 weeks for a fully tested solution.