Commit Graph

2055 Commits

Author | SHA1 | Message | Date
Ryuichi Leo Takashige | 8aeeb46d2f | failures | 2026-02-02 21:33:16 +00:00
Ryuichi Leo Takashige | edb2015607 | failures | 2026-02-02 21:13:42 +00:00
Ryuichi Leo Takashige | f613ebdc6c | failures | 2026-02-02 21:12:34 +00:00
Ryuichi Leo Takashige | e72a1778dd | maybe fix | 2026-02-02 20:24:52 +00:00
Ryuichi Leo Takashige | eb4c76e758 | log text | 2026-02-02 19:27:34 +00:00
Ryuichi Leo Takashige | b890c671b8 | use new auto parallel | 2026-02-02 19:23:29 +00:00
Ryuichi Leo Takashige | e7f3f47754 | jeez that was dumb | 2026-02-02 19:14:19 +00:00
Ryuichi Leo Takashige | d935c7a372 | maybe fix? | 2026-02-02 19:08:32 +00:00
Ryuichi Leo Takashige | bd089b30d7 | raise timeouts | 2026-02-02 18:50:26 +00:00
Ryuichi Leo Takashige | 13b397a3c9 | raise max concurrency | 2026-02-02 18:45:29 +00:00
Ryuichi Leo Takashige | cf5fddf3f8 | oops | 2026-02-02 18:40:41 +00:00
Ryuichi Leo Takashige | c9df4ff004 | save properly | 2026-02-02 18:30:53 +00:00
Ryuichi Leo Takashige | 4f7869b91b | cleanup after control c | 2026-02-02 18:23:42 +00:00
Ryuichi Leo Takashige | b08ec25ef6 | better limit? | 2026-02-02 18:22:39 +00:00
Ryuichi Leo Takashige | f235019c28 | make control c exit cleanly and add --limit | 2026-02-02 18:04:58 +00:00
Ryuichi Leo Takashige | 68a77f0910 | little confusing pyproject change | 2026-02-02 17:47:08 +00:00
Ryuichi Leo Takashige | 8456e3f74b | actually fix exo eval | 2026-02-02 17:37:37 +00:00
Ryuichi Leo Takashige | 83e4725415 | add 4bit attention | 2026-02-02 17:30:52 +00:00
Ryuichi Leo Takashige | 49dc7a8798 | livecodebench fix | 2026-02-02 17:30:34 +00:00
Ryuichi Leo Takashige | dea52342ca | livecodebench fix | 2026-02-02 17:27:59 +00:00
Ryuichi Leo Takashige | aae28d8e8b | livecodebench eval | 2026-02-02 17:14:56 +00:00
Ryuichi Leo Takashige | a28def8e45 | revert use ssh | 2026-02-02 16:06:32 +00:00
Ryuichi Leo Takashige | 56a9864e19 | use ssh | 2026-02-02 15:59:42 +00:00
Ryuichi Leo Takashige | 10afd08427 | optimizations | 2026-02-02 15:46:18 +00:00
Ryuichi Leo Takashige | 04a0690746 | faster prompt sizer | 2026-02-02 14:50:04 +00:00
Ryuichi Leo Takashige | 970717f1bb | dont time out pleaseee | 2026-02-02 13:49:31 +00:00
Ryuichi Leo Takashige | 774eb1756a | fix | 2026-02-02 13:31:32 +00:00
Ryuichi Leo Takashige | 061e58ce39 | add livebench | 2026-02-02 13:26:36 +00:00
Ryuichi Leo Takashige | e8b6ec131b | fix exo bench | 2026-02-02 13:12:50 +00:00
Ryuichi Leo Takashige | 7b4c5d0c6d | relative import | 2026-02-02 11:44:14 +00:00
Ryuichi Leo Takashige | fb3d1e887f | relative import | 2026-02-02 11:43:56 +00:00
Ryuichi Leo Takashige | 2d15e49f4e | tagged model | 2026-02-02 11:41:22 +00:00
Ryuichi Leo Takashige | c0f192897c | dumb upstream changes | 2026-02-02 11:37:11 +00:00
Ryuichi Leo Takashige | 7587cb872c | several fixes from main | 2026-02-02 11:35:10 +00:00
Ryuichi Leo Takashige | bcb07782c1 | no batch | 2026-02-02 11:30:19 +00:00
Ryuichi Leo Takashige | 24a6adf022 | Add metadata to results.json | 2026-01-29 13:02:35 +00:00
Ryuichi Leo Takashige | 5d3b407602 | parallelise Qwen3Next | 2026-01-28 18:01:23 +00:00
Ryuichi Leo Takashige | e7a5826aed | try reshaping | 2026-01-28 15:23:28 +00:00
Ryuichi Leo Takashige | ebe279018f | import | 2026-01-28 15:11:57 +00:00
Ryuichi Leo Takashige | bf67e7d334 | maybe? | 2026-01-28 15:09:50 +00:00
Ryuichi Leo Takashige | 0cd2f6aab4 | oops | 2026-01-28 14:57:05 +00:00
Ryuichi Leo Takashige | ba8a44e6a2 | use wrapped minimax attention | 2026-01-28 14:53:23 +00:00
Ryuichi Leo Takashige | 07c4be157b | Oops | 2026-01-28 14:07:19 +00:00
Ryuichi Leo Takashige | 1e1eb8f8a1 | Format | 2026-01-28 14:01:45 +00:00
Ryuichi Leo Takashige | 1bc2d9728d | Use different minimax sharding | 2026-01-28 14:00:56 +00:00
Ryuichi Leo Takashige | 7823fd7b1a | fix exo eval | 2026-01-27 22:20:04 +00:00
Ryuichi Leo Takashige | 05caab0047 | Extract minimax think tokens | 2026-01-27 21:59:51 +00:00
Ryuichi Leo Takashige | bd8f9f2d10 | Extract thinking models generally. | 2026-01-27 21:30:53 +00:00
Ryuichi Leo Takashige | 34fcafa68a | Ignore timeout | 2026-01-27 21:10:59 +00:00
Ryuichi Leo Takashige | 5152789e00 | lengthen timeout | 2026-01-27 18:25:54 +00:00