Skip to content

fix: handle typeberry assertion on high-address faults#6

Merged
tomusdrw merged 2 commits intomainfrom
td-debug-doom-pvm-diff
Apr 14, 2026
Merged

fix: handle typeberry assertion on high-address faults#6
tomusdrw merged 2 commits intomainfrom
td-debug-doom-pvm-diff

Conversation

@tomusdrw
Copy link
Copy Markdown
Member

Summary

  • Doom example (and any program storing to unmapped high addresses) caused Typeberry to crash with a JS assertion error while Ananas correctly returned fault status
  • Root cause: upstream bug in @typeberry/libgetStartPageIndex() uses signed << without >>> 0, producing negative values for addresses >= 0x80000000 in the fault-handling code path
  • Added defensive workaround: TypeberrySyncInterpreter.step() catches assertion errors and reports synthetic fault status (code 2), cleared on load/reset
  • Both PVMs now agree on fault outcome for doom

Test plan

  • Both PVMs report matching initial state after loading doom.bin
  • Both PVMs report matching state after 1 step of doom
  • Both PVMs agree on fault status at step 2 (unmapped memory store)
  • typeberry.step(100) returns { finished: true } without throwing
  • reset() clears synthetic fault status back to ok
  • All 683 pre-existing tests pass
  • Build and lint pass

🤖 Generated with Claude Code

tomusdrw and others added 2 commits April 14, 2026 20:49
Typeberry's getStartPageIndex() uses signed << without >>> 0, causing
assertion errors when programs fault at addresses >= 0x80000000. The
adapter now catches these and reports fault status, matching Ananas.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@netlify
Copy link
Copy Markdown

netlify bot commented Apr 14, 2026

Deploy Preview for pvm-debugger ready!

Name Link
🔨 Latest commit a246eee
🔍 Latest deploy log https://app.netlify.com/projects/pvm-debugger/deploys/69de8c6f81ea2d0008f9b54a
😎 Deploy Preview https://deploy-preview-6--pvm-debugger.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify project configuration.

@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Apr 14, 2026

📝 Walkthrough

Walkthrough

TypeberrySyncInterpreter is enhanced to catch Typeberry assertion errors during step execution, converting them into a synthetic fault status instead of propagating exceptions. The fault status is managed through load and reset lifecycle methods. Supporting test suite and documentation added.

Changes

Cohort / File(s) Summary
Typeberry adapter error handling
packages/runtime-worker/src/adapters/typeberry.ts
Added assertionFaultStatus private flag to track assertion errors. step() now wraps execution in try/catch, catching errors with "Assertion failure" in the message and returning { finished: true } with status set. getStatus() returns the synthetic fault status when set. load() and reset() clear the flag to restore normal behavior.
Test suite
packages/runtime-worker/src/index.test.ts
New test suite comparing Typeberry and Ananas interpreters on doom.bin. Validates initial state equivalence, single-step equivalence, lockstep fault handling, graceful multi-step execution without uncaught exceptions, and reset() behavior. Includes console logging of PC, gas, status, and registers.
Documentation
spec/005-runtime-worker-and-adapters.md, spec/ui/sprint-48-typeberry-assertion-fault-workaround.md
Added sections describing the Typeberry assertion fault workaround, its implementation details, lifecycle management, and acceptance criteria for fault status equivalence between interpreters.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately and concisely summarizes the main change: adding a workaround to handle Typeberry assertion errors on high-address faults.
Description check ✅ Passed The description clearly explains the problem, root cause, the workaround implemented, and provides a comprehensive test plan with all checks marked complete.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch td-debug-doom-pvm-diff

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@packages/runtime-worker/src/adapters/typeberry.ts`:
- Around line 127-145: The step(n: number) method must short-circuit when a
synthetic fault is already latched: check this.assertionFaultStatus ===
TYPEBERRY_STATUS_FAULT at the start of step and immediately return { finished:
true } (without calling this.adapter.nextStep() or this.adapter.nSteps(n)); keep
the existing assertion-fault detection logic that sets this.assertionFaultStatus
in the catch block but ensure subsequent step() calls do not delegate into the
adapter.

In `@packages/runtime-worker/src/index.test.ts`:
- Around line 855-871: The loop currently allows 10 iterations to complete
without asserting that a terminal fault was reached; update the test to track
termination by introducing a boolean (e.g., observedTermination) set to true
when either tbResult.finished or anResult.finished is true (inside the existing
if block where you already assert tbStatus and anStatus are "fault"), and after
the for loop add an assertion expect(observedTermination).toBe(true) to fail the
test if neither interpreter ever reaches the terminal fault; reference
typeberry.step, ananas.step, typeberry.getStatus, ananas.getStatus and mapStatus
to locate where to set the flag and add the final assertion.

In `@spec/ui/sprint-48-typeberry-assertion-fault-workaround.md`:
- Around line 43-45: The page-alignment example is incorrect: the expression
`(0xFFFFF >>> 12) << 12` and its claimed result `0xFFFFF000` (signed -4096) is
wrong and misleading; update the prose and example around getStartPageIndex and
tryAsMemoryIndex to show the correct bit-shift behavior and resulting aligned
address (use the correct original address that leads to `0xFFFFF000` or adjust
the shifted result to match the shown input such as using `0xFFFFFFF8` =>
alignment to `0xFFFFFFF000`), and explain that getStartPageIndex aligns
addresses by zeroing lower 12 bits so tryAsMemoryIndex will receive the aligned
(unsigned) page address rather than -4096. Ensure references to
getStartPageIndex and tryAsMemoryIndex remain consistent.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 8a05ae99-c26f-410e-a8c3-6fa95b47e74e

📥 Commits

Reviewing files that changed from the base of the PR and between dc84e3f and a246eee.

📒 Files selected for processing (4)
  • packages/runtime-worker/src/adapters/typeberry.ts
  • packages/runtime-worker/src/index.test.ts
  • spec/005-runtime-worker-and-adapters.md
  • spec/ui/sprint-48-typeberry-assertion-fault-workaround.md

Comment on lines 127 to +145
step(n: number): { finished: boolean } {
if (n === 1) {
const running = this.adapter.nextStep();
try {
if (n === 1) {
const running = this.adapter.nextStep();
return { finished: !running };
}
const running = this.adapter.nSteps(n);
return { finished: !running };
} catch (err) {
// Typeberry has a known bug where its fault-handling code path throws
// assertion errors (e.g. getStartPageIndex uses signed << on high
// addresses, producing negative values that fail tryAsMemoryIndex).
// The instruction itself was correctly identified as a fault, but the
// error-reporting code crashed. Treat this as a fault.
if (err instanceof Error && err.message.includes("Assertion failure")) {
this.assertionFaultStatus = TYPEBERRY_STATUS_FAULT;
return { finished: true };
}
throw err;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Make the synthetic fault terminal across subsequent step() calls.

Once assertionFaultStatus is latched, step() still delegates into Typeberry on the next call. That can re-enter the same broken path or mutate state behind a permanently forced "fault" status. Short-circuit immediately when the synthetic fault is already set.

Suggested fix
  step(n: number): { finished: boolean } {
+    if (this.assertionFaultStatus !== null) {
+      return { finished: true };
+    }
+
     try {
       if (n === 1) {
         const running = this.adapter.nextStep();
         return { finished: !running };
       }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
step(n: number): { finished: boolean } {
if (n === 1) {
const running = this.adapter.nextStep();
try {
if (n === 1) {
const running = this.adapter.nextStep();
return { finished: !running };
}
const running = this.adapter.nSteps(n);
return { finished: !running };
} catch (err) {
// Typeberry has a known bug where its fault-handling code path throws
// assertion errors (e.g. getStartPageIndex uses signed << on high
// addresses, producing negative values that fail tryAsMemoryIndex).
// The instruction itself was correctly identified as a fault, but the
// error-reporting code crashed. Treat this as a fault.
if (err instanceof Error && err.message.includes("Assertion failure")) {
this.assertionFaultStatus = TYPEBERRY_STATUS_FAULT;
return { finished: true };
}
throw err;
step(n: number): { finished: boolean } {
if (this.assertionFaultStatus !== null) {
return { finished: true };
}
try {
if (n === 1) {
const running = this.adapter.nextStep();
return { finished: !running };
}
const running = this.adapter.nSteps(n);
return { finished: !running };
} catch (err) {
// Typeberry has a known bug where its fault-handling code path throws
// assertion errors (e.g. getStartPageIndex uses signed << on high
// addresses, producing negative values that fail tryAsMemoryIndex).
// The instruction itself was correctly identified as a fault, but the
// error-reporting code crashed. Treat this as a fault.
if (err instanceof Error && err.message.includes("Assertion failure")) {
this.assertionFaultStatus = TYPEBERRY_STATUS_FAULT;
return { finished: true };
}
throw err;
}
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/runtime-worker/src/adapters/typeberry.ts` around lines 127 - 145,
The step(n: number) method must short-circuit when a synthetic fault is already
latched: check this.assertionFaultStatus === TYPEBERRY_STATUS_FAULT at the start
of step and immediately return { finished: true } (without calling
this.adapter.nextStep() or this.adapter.nSteps(n)); keep the existing
assertion-fault detection logic that sets this.assertionFaultStatus in the catch
block but ensure subsequent step() calls do not delegate into the adapter.

Comment on lines +855 to +871
for (let i = 0; i < 10; i++) {
const tbResult = typeberry.step(1);
const anResult = ananas.step(1);

const tbStatus = mapStatus(typeberry.getStatus());
const anStatus = mapStatus(ananas.getStatus());

expect(tbResult.finished).toBe(anResult.finished);
expect(tbStatus).toBe(anStatus);

if (tbResult.finished || anResult.finished) {
// Both should agree it's a fault
expect(tbStatus).toBe("fault");
expect(anStatus).toBe("fault");
break;
}
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Fail the test if neither interpreter ever reaches the terminal fault.

This loop can exit after 10 iterations with no finished state and still pass, which masks the regression this test is supposed to catch. Track whether termination was observed and assert it after the loop.

Suggested fix
   it("both PVMs agree on fault after store to unmapped memory", () => {
     typeberry.load(doomBytes, doomState);
     ananas.load(doomBytes, doomState);
 
     // Step until one terminates (doom faults on step 2 due to unmapped memory store)
+    let sawTerminalState = false;
     for (let i = 0; i < 10; i++) {
       const tbResult = typeberry.step(1);
       const anResult = ananas.step(1);
@@
       if (tbResult.finished || anResult.finished) {
+        sawTerminalState = true;
         // Both should agree it's a fault
         expect(tbStatus).toBe("fault");
         expect(anStatus).toBe("fault");
         break;
       }
     }
+
+    expect(sawTerminalState).toBe(true);
   });
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/runtime-worker/src/index.test.ts` around lines 855 - 871, The loop
currently allows 10 iterations to complete without asserting that a terminal
fault was reached; update the test to track termination by introducing a boolean
(e.g., observedTermination) set to true when either tbResult.finished or
anResult.finished is true (inside the existing if block where you already assert
tbStatus and anStatus are "fault"), and after the for loop add an assertion
expect(observedTermination).toBe(true) to fail the test if neither interpreter
ever reaches the terminal fault; reference typeberry.step, ananas.step,
typeberry.getStatus, ananas.getStatus and mapStatus to locate where to set the
flag and add the final assertion.

Comment on lines +43 to +45
5. In the **fault-handling code path** (not the address calculation), `getStartPageIndex(tryAsMemoryIndex(storeResult.error.address))` is called to compute the page index for the error report.
6. `getStartPageIndex` aligns the error address to a page boundary: `(0xFFFFF >>> 12) << 12` = `0xFFFFF000` as unsigned, but `-4096` as signed 32-bit.
7. `tryAsMemoryIndex(-4096)` throws `Assertion failure: Incorrect memory index: -4096!`.
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Correct the page-alignment example.

The arithmetic here is off: 0xFFFFF >>> 12 cannot produce 0xFFFFF000 after shifting back. The example result matches an address like 0xFFFFFFF8, so the current text is misleading.

Suggested fix
-5. In the **fault-handling code path** (not the address calculation), `getStartPageIndex(tryAsMemoryIndex(storeResult.error.address))` is called to compute the page index for the error report.
-6. `getStartPageIndex` aligns the error address to a page boundary: `(0xFFFFF >>> 12) << 12` = `0xFFFFF000` as unsigned, but `-4096` as signed 32-bit.
+5. In the **fault-handling code path** (not the address calculation), `getStartPageIndex(tryAsMemoryIndex(storeResult.error.address))` is called to compute the page index for the error report.
+6. `getStartPageIndex` aligns the error address to a page boundary: `(0xFFFFFFF8 >>> 12) << 12` = `0xFFFFF000` as unsigned, but `-4096` as signed 32-bit.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@spec/ui/sprint-48-typeberry-assertion-fault-workaround.md` around lines 43 -
45, The page-alignment example is incorrect: the expression `(0xFFFFF >>> 12) <<
12` and its claimed result `0xFFFFF000` (signed -4096) is wrong and misleading;
update the prose and example around getStartPageIndex and tryAsMemoryIndex to
show the correct bit-shift behavior and resulting aligned address (use the
correct original address that leads to `0xFFFFF000` or adjust the shifted result
to match the shown input such as using `0xFFFFFFF8` => alignment to
`0xFFFFFFF000`), and explain that getStartPageIndex aligns addresses by zeroing
lower 12 bits so tryAsMemoryIndex will receive the aligned (unsigned) page
address rather than -4096. Ensure references to getStartPageIndex and
tryAsMemoryIndex remain consistent.

@tomusdrw tomusdrw merged commit a62f89c into main Apr 14, 2026
7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant