Fix callgrind parser: positions: instr / positions: instr line and cyclic call graphs#552
Conversation
| if (visitedGlobally.has(frame)) { | ||
| profile.enterFrame(frame, Math.round(totalCumulative * unitMultiplier)) | ||
| totalCumulative += subtreeTotalWeight | ||
| profile.leaveFrame(frame, Math.round(totalCumulative * unitMultiplier)) | ||
| return | ||
| } | ||
| visitedGlobally.add(frame) |
There was a problem hiding this comment.
🚩 visitedGlobally collapses repeated frames to leaves, losing child detail on secondary call paths
The new visitedGlobally set at src/import/callgrind.ts:160 prevents a frame from being fully expanded more than once across the entire call tree. On subsequent encounters, the frame is shown as a leaf with all its subtree weight attributed as self-time (src/import/callgrind.ts:227-232). This is documented as preventing "exponential blowup" but represents a significant accuracy trade-off for common (non-pathological) call graphs.
Consider a diamond-shaped call graph: A→B→D→E and A→C→D→E. With the old code, D's children (E) would be shown under both the B and C paths (with proportionally distributed weight). With the new code, only the first path expands D's children; the second path shows D as a leaf. The total weight is preserved, but the child breakdown under secondary paths is completely hidden. None of the existing tests exercise this scenario since func2 (the shared callee) has no children in any test profile.
Was this helpful? React with 👍 or 👎 to provide feedback.
|
Hi @marc-casavant, thanks for the patch and the explanation! I basically never use callgrind profiles, so I'm taking it mostly on faith that this patch is good, and that if bugs come up, I'll tag you in them to look at. AI review surfaced one potential issue. Would you mind taking a look and seeing if it's a real problem, or just noise? Happy to merge this after |
Fixes Issue #550
This is a larger-than-usual PR, but the changes are tightly coupled. Each is required to successfully import callgrind.out.* files produced when profiling FreeRADIUS with Valgrind. Splitting them apart would leave the parser in a broken intermediate state.
Background
The callgrind format allows a
positions:header that declares how many position columns precede cost values on each cost line. The most common value ispositions: line(one column), but profilers targeting native code routinely emitpositions: instr line(two columns) orpositions: instr(one column). One example of this would be Valgrind profiles of FreeRADIUS.The parser previously hard-coded
numPositionFields = 1; any file with a non-defaultpositions:header would silently produce garbage output. Separately, call graphs that are entirely cyclic (every function has at least one caller) had no natural root nodes and produced an empty flame graph.The following changes have been made to allow Speedscope to display flame graphs of Valgrind profiles generated by running tests on FreeRADIUS.
Changes
positions:header supportpositions:header and storenumPositionFields(count of whitespace-separated tokens).numPositionFieldsintoparseCostLineso the correct number of leading columns are skipped when extracting cost values.0x…tokens) so instruction-address position fields are consumed correctly and subposition compression still works on subsequent lines.Cyclic call graph fallback
rootNodesis empty, fall back to a residual-weight strategy: compute each frame's total incoming edge weight, subtract it from its total weight, and use frames with positive residual as synthetic roots, visited heaviest-first.visitedGloballyset to avoid exponential re-expansion of shared nodes reachable from multiple roots.console.warnwhen cycles are detected.New tests
importFromCallgrind positions instr linepositions: instr line; instruction address skipped, costs parsed correctlyimportFromCallgrind positions instrpositions: instr; hex address consumed, no cost corruptionimportFromCallgrind instr line subposition compression*,+N,-N) works correctly across two position columnsimportFromCallgrind cyclic callgraphReference file
A callgrind.out.* file (1.3 MB, generated by Valgrind while profiling FreeRADIUS) is attached as an example of the type of files we are viewing with Speedscope.
callgrind.out.20-11.txt