Skip to content

Fix callgrind parser: positions: instr / positions: instr line and cyclic call graphs#552

Open
marc-casavant wants to merge 1 commit into
jlfwong:mainfrom
InkbridgeNetworks:callgrind-import-fix-instr-line
Open

Fix callgrind parser: positions: instr / positions: instr line and cyclic call graphs#552
marc-casavant wants to merge 1 commit into
jlfwong:mainfrom
InkbridgeNetworks:callgrind-import-fix-instr-line

Conversation

@marc-casavant

Copy link
Copy Markdown

Fixes Issue #550

This is a larger-than-usual PR, but the changes are tightly coupled. Each is required to successfully import callgrind.out.* files produced when profiling FreeRADIUS with Valgrind. Splitting them apart would leave the parser in a broken intermediate state.

Background

The callgrind format allows a positions: header that declares how many position columns precede cost values on each cost line. The most common value is positions: line (one column), but profilers targeting native code routinely emit positions: instr line (two columns) or positions: instr (one column). One example of this would be Valgrind profiles of FreeRADIUS.

The parser previously hard-coded numPositionFields = 1; any file with a non-default positions: header would silently produce garbage output. Separately, call graphs that are entirely cyclic (every function has at least one caller) had no natural root nodes and produced an empty flame graph.

The following changes have been made to allow Speedscope to display flame graphs of Valgrind profiles generated by running tests on FreeRADIUS.

Changes

positions: header support

  • Parse the positions: header and store numPositionFields (count of whitespace-separated tokens).
  • Pass numPositionFields into parseCostLine so the correct number of leading columns are skipped when extracting cost values.
  • Add hex address parsing (0x… tokens) so instruction-address position fields are consumed correctly and subposition compression still works on subsequent lines.

Cyclic call graph fallback

  • After the natural-roots pass, if rootNodes is empty, fall back to a residual-weight strategy: compute each frame's total incoming edge weight, subtract it from its total weight, and use frames with positive residual as synthetic roots, visited heaviest-first.
  • Add a visitedGlobally set to avoid exponential re-expansion of shared nodes reachable from multiple roots.
  • Emit a console.warn when cycles are detected.

New tests

  • importFromCallgrind positions instr line
    • two-column positions: instr line; instruction address skipped, costs parsed correctly
  • importFromCallgrind positions instr
    • one-column positions: instr; hex address consumed, no cost corruption
  • importFromCallgrind instr line subposition compression
    • subposition compression (*, +N, -N) works correctly across two position columns
  • importFromCallgrind cyclic callgraph
    • fully-cyclic graph produces a non-empty flame graph via residual-weight roots

Reference file

A callgrind.out.* file (1.3 MB, generated by Valgrind while profiling FreeRADIUS) is attached as an example of the type of files we are viewing with Speedscope.

callgrind.out.20-11.txt

@coveralls

Copy link
Copy Markdown

Coverage Status

coverage: 44.275% (+0.5%) from 43.805% — InkbridgeNetworks:callgrind-import-fix-instr-line into jlfwong:main

@devin-ai-integration devin-ai-integration Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ Devin Review: No Issues Found

Devin Review analyzed this PR and found no potential bugs to report.

View in Devin Review to see 5 additional findings.

Open in Devin Review

Comment thread src/import/callgrind.ts
Comment on lines +227 to +233
if (visitedGlobally.has(frame)) {
profile.enterFrame(frame, Math.round(totalCumulative * unitMultiplier))
totalCumulative += subtreeTotalWeight
profile.leaveFrame(frame, Math.round(totalCumulative * unitMultiplier))
return
}
visitedGlobally.add(frame)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚩 visitedGlobally collapses repeated frames to leaves, losing child detail on secondary call paths

The new visitedGlobally set at src/import/callgrind.ts:160 prevents a frame from being fully expanded more than once across the entire call tree. On subsequent encounters, the frame is shown as a leaf with all its subtree weight attributed as self-time (src/import/callgrind.ts:227-232). This is documented as preventing "exponential blowup" but represents a significant accuracy trade-off for common (non-pathological) call graphs.

Consider a diamond-shaped call graph: A→B→D→E and A→C→D→E. With the old code, D's children (E) would be shown under both the B and C paths (with proportionally distributed weight). With the new code, only the first path expands D's children; the second path shows D as a leaf. The total weight is preserved, but the child breakdown under secondary paths is completely hidden. None of the existing tests exercise this scenario since func2 (the shared callee) has no children in any test profile.

Open in Devin Review

Was this helpful? React with 👍 or 👎 to provide feedback.

@jlfwong

jlfwong commented May 27, 2026

Copy link
Copy Markdown
Owner

Hi @marc-casavant, thanks for the patch and the explanation! I basically never use callgrind profiles, so I'm taking it mostly on faith that this patch is good, and that if bugs come up, I'll tag you in them to look at.

AI review surfaced one potential issue. Would you mind taking a look and seeing if it's a real problem, or just noise? Happy to merge this after

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants