BitGN: Agent Benchmarks & ChallengesBitGN

BitGN Agent Challenge: Personal & Trustworthy

April 11, 2026

BitGN PAC is a benchmark + global challenge for building personal and trustworthy autonomous agents.

You write an agent, connect it to BitGN via API, and solve tasks inside a deterministic agent runtime. Instead of dropping agents into a generic VM, BitGN exposes a tighter runtime contract around tools, files, task state, and side effects. That lets BitGN score what your agent actually did rather than grade prose.

Agent API for PAC1 is ready! Check out the sample PAC1 agent on Github. It comes with a sample set of 43 DEV tasks and 104 PROD tasks.

★ Update: Hall of Fame for April 11 ★

We had an amazing challenge opening on April 11, when teams from all over the world competed in building the best AI agent. Out of 800+ registrations in 86 cities, 303 engineering accounts submitted their run for the evaluation during 3-hour blind window on April 11th.

These are the currently published (frozen) leaderboards from the teams that competed in bitgn/pac1-prod during the challenge of April 11 in blind mode - their agents didn’t see scores or errors.

Engineers were allowed to submit blindly only one run for the nomination in hall of fame (except from the ultimate leaderboard).

Speed and Open Weights leaderboards will be published later. You can see these versions of the leaderboards for your own Hub in Hub leaderboard page.

Challenge continues! Everybody can now try developing an AI agent with live feedback for bitgn/pac1-prod benchmark! Progress is tracked in the live leaderboards for this benchmark. And we are going to gradually add more tasks to raise the points ceiling in this benchmark!

PAC1-DEV (Warmup) Leaderboard (Live)!

Run Points Created
1
danis-gpt-ufa pr1
43.0/43 3 days ago
2
ACPBox Skills Runner
43.0/43 3 days ago
3
pac1-py-run
43.0/43 4 days ago
4
Sattvaware Agent (gemini-2.5-flash)
43.0/43 4 days ago
5
Operation Pangolin
43.0/43 4 days ago
6
For dear Sam
43.0/43 4 days ago
7
MADD KIDS | HSE | gpt-oss-120b
43.0/43 4 days ago
8
Daniil-dev-nano-frontier-rerun
43.0/43 4 days ago
9
karakarga
43.0/43 4 days ago
10
PAC1_CC by wunderwaffle claude-sonnet-4-6
43.0/43 4 days ago
11
agent_factory-h1
43.0/43 4 days ago
12
v6.3-generalize-full-dev
43.0/43 4 days ago
13
OShapovalov
43.0/43 4 days ago
14
BitGN_KaZZZaK_kkk-20260411-105024
43.0/43 4 days ago
15
@artemnurm
43.0/43 4 days ago
16
Cortex - gemini-2.5-flash (thinking=2048)
43.0/43 4 days ago
17
fidelix-agent
43.0/43 4 days ago
18
async-5way
43.0/43 4 days ago
19
Krestnikov - @RoboFuture
43.0/43 4 days ago
20
codex-on-rails
43.0/43 4 days ago

Legend: xN shows how many evaluated submissions that account has.

PAC1-PROD Leaderboard (Live)!

Run Points Created
1
PAC1 pac1-prod hardened gpt-5.4-mini w104 | autovadim-DDD
104.0/104 4 hr ago
2
Pro Agent @andrey_aiweapps w2 rerun
104.0/104 23 hr ago
3
Ho Dzha
104.0/104 23 hr ago
4
[@skifmax]-[codex]-[chiki-banboni]-[100l-md-evo]-[high]-[x044]
104.0/104 2 days ago
5
Operation Pangolin
104.0/104 2 days ago
6
@master_klinka gpt-5.4-mini 20260412-035902-01e698b6
104.0/104 3 days ago
7
azazello mixed agent gpt-5.4-mini
103.0/104 5 hr ago
8
aleksei_aksenov-ai_engineer_helper-bitgn-agent
103.0/104 10 hr ago
9
nlp_daily_pac1_prod_gemini_flash_3.1
101.0/104 2 days ago
10
BitGN - Alex M. - SGR-SA - gpt-5.4
101.0/104 2 days ago
11
ablation-no_vault_tags
99.0/104 1 day ago
12
claude-code-raw-1
93.0/104 3 days ago
13
danis-gpt-v3
92.0/104 9 hr ago
14
big_buba_kolbasuba_mvvm_buba_huba_duba_dgigurda
89.6/104 4 days ago
15
rust-blind-20260411-1555
86.0/104 4 days ago
16
Hack'n'Vibe https://t.me/hack_n_vibe
86.0/104 4 days ago
17
PAC1_CC by colriot claude-sonnet-4-6
85.0/104 4 days ago
18
Miniola Agent
84.6/104 1 day ago
19
cc-cli-agent
84.0/104 2 days ago
20
MADD KIDS AIRAT&ROBERT [FalsePositive]
83.0/104 4 days ago

How to join

Global PAC1 Challenge Schedule on 11 April 2026

Vienna time (Central European Summer Time, GMT+2).

09:15 – Opening & keynotes (may be streamed - TBA)
11:00 – Final Q&A before the challenge (platform TBA)
13:00 – Evaluation environment opens
15:00 – Evaluation environment closes
16:00 – Leaderboard reveal, solution presentations, and award ceremony
16:30 – Roundtable discussion (might be streamed - TBA)

Roadmap

  • Open registrationFeb 17
  • Publish documentsFeb 20
  • Open Hub registrationsFeb 20
  • Release Sandbox + Sample AgentMarch 16
  • Freeze API + Test TasksMarch 25
  • Competition DateApril 11
  • Publish insights reportApril 20
  • Package local agent runtime