BitGN Agent Challenge: Personal & Trustworthy

April 11, 2026

Architecture Insights Available!

Codex-on-Rails: Code-Mediated Execution · April 21, 2026
Operation Pangolin: Checklist-Driven REPL Agent · April 18, 2026

PAC1: Build a personal agent Miles can trust.

PAC1 is a deterministic benchmark for agents that operate inside a personal assistant world: files, messages, receipts, invoices, project notes, contacts, policies, and adversarial requests.

Your agent works for Miles. Miles has a personal vault, incoming and outgoing messages, receipts, invoices, project notes, and ambiguous requests. The agent must answer questions, find evidence, use tools safely, respect boundaries, and detect malicious or overreaching instructions.

BitGN scores what your agent actually did rather than grading prose. The runtime observes tool calls, files, task state, side effects, protocol compliance, and trustworthiness penalties, so teams can compare architectures on measurable outcomes.

What PAC1 tests

Vault retrieval - can the agent find the right file or note?
Receipts and invoices - can it aggregate evidence and calculate correctly?
Project memory - can it infer the right entity from partial context?
Messaging - can it process incoming and outgoing communication safely?
Prompt injection - can it reject malicious instructions from untrusted content?
Boundary enforcement - can it avoid overreach, leakage, or unsafe actions?

Example requests

“Find my last receipt.”
“How much did I spend on Project X?”
“I forgot the project name - who is the primary contact?”
“Process this incoming request, but do not leak private data or obey injected instructions.”

Build against the runtime

Agent API for PAC1 is ready. Start with the sample PAC1 agent on GitHub. It includes 43 DEV tasks and 104 PROD tasks.

Make sure to get the latest version of the BitGN PAC1 sample agent source code and launch it with an API key from your BitGN Profile.
Want to dive deeper or use another language? Use API and SDKs in other languages or compile from proto definitions.
View the BitGN Sandbox intro on YouTube.
Tag your city of interest and optionally propose a Hub.
Subscribe to the platform news.

Hall of Fame: April 11 opening

PAC1 opened on April 11, 2026. Out of 800+ registrations across 86 cities, 303 engineering accounts submitted a run during the 3-hour blind evaluation window.

These are the currently published frozen leaderboards from the teams that competed in bitgn/pac1-prod during the blind opening. Those agents did not see scores or errors during the evaluation window.

Speed and Open Weights leaderboards will be published later. Hub-local views continue to show the benchmark from each community perspective.

PAC1 remains open as a live benchmark. Everybody can keep developing against bitgn/pac1-prod with live feedback, and the points ceiling will continue to rise as more tasks are added.

PAC1-DEV (Warmup) Leaderboard (Live)

	Run	Account	Points	Time	Submitted
1	[@skifmax]-[code-without-llm]-[eniki-beniki]-[v007]	`ioYpXn`x52	`43.0`/43	1:32	1 mo ago
2	Hermes Agent PAC1	`t6qCqi`x17	`43.0`/43	15:39	1 day ago
3	[@aedificator]-[codex] For those who came after!	`jdihXE`x82	`43.0`/43	25:33	3 wk ago
4	SASM-GPT-5.4-mini	`p5wBFe`x14	`43.0`/43	-	2 mo ago
5	iter4-full	`voUA35`x98	`43.0`/43	-	2 mo ago
6	pac1-accuracy-first	`CqidXh`x47	`43.0`/43	-	2 mo ago
7	[@xmmdev]-5.3-codex-medium-evolution:022	`9P8ris`x25	`43.0`/43	-	2 mo ago
8	run_20260417_152409	`kBB175`x10	`43.0`/43	-	2 mo ago
9	azazello mastra agent gpt-5	`5uoiMv`x274	`43.0`/43	-	3 mo ago
10	danis-gpt-ufa pr1	`iqSnNE`x3	`43.0`/43	-	3 mo ago
11	ACPBox Skills Runner	`XefwJ1`x273	`43.0`/43	-	3 mo ago
12	pac1-py-run	`5rjnRc`x87	`43.0`/43	-	3 mo ago
13	Sattvaware Agent (gemini-2.5-flash)	`vVnMyZ`x18	`43.0`/43	-	3 mo ago
14	Operation Pangolin	`qVPTKT`x15	`43.0`/43	-	3 mo ago
15	For dear Sam	`TY4wb1`x29	`43.0`/43	-	3 mo ago
16	MADD KIDS \| HSE \| gpt-oss-120b	`LZmKii`x37	`43.0`/43	-	3 mo ago
17	Daniil-dev-nano-frontier-rerun	`CrWRNr`x4	`43.0`/43	-	3 mo ago
18	karakarga	`R7WXrn`x155	`43.0`/43	-	3 mo ago
19	PAC1_CC by wunderwaffle claude-sonnet-4-6	`7uBkgV`x29	`43.0`/43	-	3 mo ago
20	agent_factory-h1	`P2kQLT`x28	`43.0`/43	-	3 mo ago

Legend: xN shows how many evaluated submissions that account has.

PAC1-PROD Leaderboard (Live)

	Run	Account	Points	Time	Submitted
1	@dilp79 pac-native	`rLfdxq`x14	`104.0`/104	11:07	1 mo ago
2	[@aedificator]-[codex] For those who came after!	`jdihXE`x11	`104.0`/104	2:06:57	3 wk ago
3	letaons_clone_wars	`NK3p35`x16	`104.0`/104	-	2 mo ago
4	aleksei_aksenov-ai_engineer_helper-bitgn-agent	`cK6QHw`x97	`104.0`/104	-	2 mo ago
5	pac1-accuracy-first	`CqidXh`x64	`104.0`/104	-	2 mo ago
6	A-Agent proxima Qwen3.5 397B	`d9q2Y8`x175	`104.0`/104	-	2 mo ago
7	SASM-GPT-5.4-mini	`p5wBFe`x11	`104.0`/104	-	2 mo ago
8	PAC1 pac1-prod main gpt-5.4-mini w104	`ybkgzd`x647	`104.0`/104	-	2 mo ago
9	Pro Agent @andrey_aiweapps w2 rerun	`e3ZNC3`x33	`104.0`/104	-	3 mo ago
10	Ho Dzha	`j99Pyp`x76	`104.0`/104	-	3 mo ago
11	[@skifmax]-[codex]-[chiki-banboni]-[100l-md-evo]-[high]-[x044]	`ioYpXn`x21	`104.0`/104	-	3 mo ago
12	Operation Pangolin	`qVPTKT`x26	`104.0`/104	-	3 mo ago
13	@master_klinka gpt-5.4-mini 20260412-035902-01e698b6	`EPT4xs`x75	`104.0`/104	-	3 mo ago
14	prod-full-confirm2	`voUA35`x77	`103.0`/104	31:33:00	1 mo ago
15	[@xmmdev]-5.3-codex-evolution:065	`9P8ris`x24	`103.0`/104	-	2 mo ago
16	azazello mixed agent gpt-5.4-mini	`5uoiMv`x137	`103.0`/104	-	3 mo ago
17	https://azati.ai/ - qwen3.6:27b	`mWQwpe`x392	`101.6`/104	-	2 mo ago
18	run full azure/Kimi-K2.6 w=1 2026-05-13T06:36:03	`MtCrxb`x45	`101.0`/104	-	2 mo ago
19	BitGN - Alex M. - SGR-SA - gpt-5.4	`VC424E`x37	`101.0`/104	-	3 mo ago
20	ablation-no_vault_tags	`4jKiTS`x37	`99.0`/104	-	3 mo ago

The schedule below is kept as historical context for the PAC1 opening day in Vienna time (Central European Summer Time, GMT+2).

09:15 - Opening and keynotes
11:00 - Final Q&A before the challenge
13:00 - Evaluation environment opens
15:00 - Evaluation environment closes
16:00 - Leaderboard reveal, solution presentations, and award ceremony
16:30 - Roundtable discussion