Commit Graph

76 Commits

Author SHA1 Message Date
Matt Beton
1fe4ed3442 Worker Exception & Timeout Refactor
Co-authored-by: Gelu Vrabie <gelu@exolabs.net>
Co-authored-by: Alex Cheema <alexcheema123@gmail.com>
Co-authored-by: Seth Howes <sethshowes@gmail.com>
2025-08-02 08:28:37 -07:00
Alex Cheema
92c9688bf0 Remove rust 2025-08-02 08:16:39 -07:00
Gelu Vrabie
0e32599e71 fix libp2p + other prs that were wrongly overwritten before (111,112,117,118,1119 + misc commits from Alex)
Co-authored-by: Gelu Vrabie <gelu@exolabs.net>
Co-authored-by: Alex Cheema <41707476+AlexCheema@users.noreply.github.com>
Co-authored-by: Seth Howes <71157822+sethhowes@users.noreply.github.com>
Co-authored-by: Matt Beton <matthew.beton@gmail.com>
Co-authored-by: Alex Cheema <alexcheema123@gmail.com>
2025-07-31 20:36:47 +01:00
Matt Beton
b350ededb2 Test Supervisor Errors. 2025-07-30 13:30:54 +01:00
Alex Cheema
a2b4093d25 add metrics: gpu_usage, temp, sys_power, pcpu_usage, ecpu_usage, ane_… 2025-07-28 23:02:33 +01:00
Alex Cheema
12566865d5 better profiling 2025-07-28 22:15:04 +01:00
Gelu Vrabie
b88abf1cc2 fix topology disconnects and add heartbeat
Co-authored-by: Gelu Vrabie <gelu@exolabs.net>
2025-07-28 22:00:05 +01:00
Alex Cheema
20241e3290 some finishing touches to get this working e2e 2025-07-28 13:07:29 +01:00
Seth Howes
176d077c87 Fix IPv4 serialisation for topology 2025-07-28 13:07:10 +01:00
Seth Howes
e9b803604b Add Multiaddr type and refactor Hosts type for creating shard placement 2025-07-28 11:39:46 +01:00
Alex Cheema
57ca487fde Fixes for running this end to end
Co-authored-by: Gelu Vrabie <gelu.vrabie.univ@gmail.com>
Co-authored-by: Gelu Vrabie <gelu@exolabs.net>
2025-07-28 10:51:03 +01:00
Andrei Cravtov
b687dec6b2 Discovery integration master
Co-authored-by: Alex Cheema <alexcheema123@gmail.com>
2025-07-27 13:43:59 +01:00
Matt Beton
93330f0283 Inference Integration Test
Co-authored-by: Alex Cheema <alexcheema123@gmail.com>
2025-07-26 20:08:25 +01:00
Gelu Vrabie
2e4635a8f5 add node started event
Co-authored-by: Gelu Vrabie <gelu@exolabs.net>
2025-07-26 19:12:26 +01:00
Gelu Vrabie
261e575262 Serialize topology
Co-authored-by: Gelu Vrabie <gelu@exolabs.net>
2025-07-25 15:09:03 +01:00
Alex Cheema
a241c92dd1 Glue 2025-07-25 13:10:29 +01:00
Seth Howes
6f8e3419d5 Placement strategy
Co-authored-by: Alex Cheema <alexcheema123@gmail.com>
2025-07-24 20:22:40 +01:00
Matt Beton
f41531d945 Worker Loop
Co-authored-by: Alex Cheema <alexcheema123@gmail.com>
2025-07-24 18:44:31 +01:00
Alex Cheema
67c70b22e4 Best master 2025-07-24 17:12:52 +01:00
Gelu Vrabie
df1fe3af26 Topology apply
Co-authored-by: Gelu Vrabie <gelu@exolabs.net>
2025-07-24 14:27:09 +01:00
Matt Beton
5097493a42 Fix tests 2025-07-24 13:22:58 +01:00
Alex Cheema
a6b3ab6332 Worker plan
Co-authored-by: Matt Beton <matthew.beton@gmail.com>
Co-authored-by: Seth Howes <71157822+sethhowes@users.noreply.github.com>
Co-authored-by: Gelu Vrabie <gelu.vrabie.univ@gmail.com>
Co-authored-by: Gelu Vrabie <gelu@exolabs.net>
Co-authored-by: Andrei Cravtov <the.andrei.cravtov@gmail.com>
Co-authored-by: Seth Howes <sethshowes@gmail.com>
2025-07-24 12:45:27 +01:00
Gelu Vrabie
56d3565781 Add apply functions
Co-authored-by: Gelu Vrabie <gelu@exolabs.net>
2025-07-24 11:02:20 +01:00
Matt Beton
7a452c3351 Fix tests 2025-07-23 18:25:50 +01:00
Seth Howes
7ac23ce96b Refactor tasks / commands / api 2025-07-23 15:52:29 +01:00
Andrei Cravtov
8d2536d926 Implemented basic discovery library in Rust + python bindings
Co-authored-by: Gelu Vrabie <gelu@exolabs.net>
Co-authored-by: Seth Howes <sethshowes@gmail.com>
Co-authored-by: Matt Beton <matthew.beton@gmail.com>
2025-07-23 13:11:29 +01:00
Seth Howes
cd9a1a9192 Topology update 2025-07-22 22:29:17 +01:00
Matt Beton
14b3c4a6be New API! 2025-07-22 21:21:12 +01:00
Matt Beton
53c652c307 Fix tests! 2025-07-22 15:20:32 +01:00
Matt Beton
5adad08e09 New events 2025-07-22 15:16:06 +01:00
Gelu Vrabie
108128b620 fix sqlite connector
Co-authored-by: Gelu Vrabie <gelu@exolabs.net>
2025-07-21 22:43:09 +01:00
Alex Cheema
449fdac27a Downloads 2025-07-21 22:42:37 +01:00
Seth Howes
cb101e3d24 Refactor model types 2025-07-21 20:35:27 +01:00
Seth Howes
bae58dd368 Refactor worker + master state into single state 2025-07-21 19:36:54 +01:00
Seth Howes
d19aa4f95a Simplify Task type + merge control & data plane types into single type 2025-07-21 17:10:09 +01:00
Gelu Vrabie
2f64e30dd1 Add sqlite connector
Co-authored-by: Gelu Vrabie <gelu@exolabs.net>
2025-07-21 14:10:29 +01:00
Alex Cheema
bb7f1ae994 New worker
Co-authored-by: Matt Beton <matthew.beton@gmail.com>
2025-07-18 10:08:56 +01:00
Matt Beton
cc45c7e9b9 Fixed events issue. 2025-07-17 12:21:01 +01:00
Arbion Halili
038cc4cdfa fix: Normalize Naming 2025-07-16 16:11:51 +01:00
Arbion Halili
e2a7935019 fix: Fix incorrect logic 2025-07-16 14:39:20 +01:00
Arbion Halili
6a671908a3 fix: FrozenSet Related Bits 2025-07-16 13:45:57 +01:00
Arbion Halili
520b1122a3 fix: Many Fixes 2025-07-16 13:35:31 +01:00
Arbion Halili
9f96b6791f fix: Some, still broken 2025-07-15 13:11:21 +01:00
Arbion Halili
8060120136 tweak 2025-07-14 22:37:53 +01:00
Arbion Halili
df6626fa31 fix: Event definitions, state definitions 2025-07-14 21:41:14 +01:00
Arbion Halili
70f0f09c05 Tweaked, Still Broken tho 2025-07-14 21:19:39 +01:00
Arbion Halili
8799c288b0 BROKEN: work thus far 2025-07-14 21:09:08 +01:00
Arbion Halili
74d56e52ff fix: Improve naming 2025-07-07 20:22:27 +01:00
Arbion Halili
fe17aaf9f8 fix: Make master hold a queue of task data 2025-07-07 20:22:00 +01:00
Arbion Halili
e1894bc106 refactor: A Lot 2025-07-07 20:19:08 +01:00