docs: draft agent-oriented linting paper #67
| \item an agent hook, a small integration point that runs after file edits and feeds findings back to the coding agent. | ||
| \end{itemize} | ||
| The hook interface is important because it shifts linting from a terminal command a human remembers to run into an automatic part of the agent's edit loop. A finding is not merely a report; it becomes a prompt for the next repair action. |
"human remembers" is probably not a great example because we have agents who can run linters now. could probably hinge this on more immediate feedback/faster iteration
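To make the "more immediate feedback" framing concrete, here is a minimal sketch of a post-edit hook. Every name in it (`Finding`, `Rule`, `onFileEdited`, the `hypothetical/fetch-response-ok` rule ID) is an assumption made for illustration and is not taken from laint's actual interface; the text-matching rule is a deliberately crude stand-in for a real AST-based check.

```typescript
// Hypothetical types for illustration; these are not laint's real API.
type Severity = "error" | "warning";

interface Finding {
  ruleId: string;
  severity: Severity;
  line: number; // 1-based line number in the edited file
  message: string;
}

// In this sketch a rule is simply a function from source text to findings.
type Rule = (source: string) => Finding[];

// Illustrative rule: flag fetch() calls in a file that never mentions `.ok`.
// This is a crude text heuristic; a real rule would inspect the syntax tree.
const missingResponseOkCheck: Rule = (source) => {
  const findings: Finding[] = [];
  if (source.indexOf(".ok") !== -1) return findings;
  source.split("\n").forEach((text, index) => {
    if (text.indexOf("fetch(") !== -1) {
      findings.push({
        ruleId: "hypothetical/fetch-response-ok",
        severity: "warning",
        line: index + 1,
        message: "fetch() result is used without checking response.ok",
      });
    }
  });
  return findings;
};

// Assumed to be called by the agent runtime right after it writes a file.
// Returns text to hand back to the agent, or null when nothing needs repair.
function onFileEdited(path: string, source: string, rules: Rule[]): string | null {
  let findings: Finding[] = [];
  for (const rule of rules) {
    findings = findings.concat(rule(source));
  }
  if (findings.length === 0) return null;
  const report = findings.map(
    (f) => `${path}:${f.line} [${f.severity}] ${f.ruleId}: ${f.message}`,
  );
  return "Fix these findings before continuing:\n" + report.join("\n");
}
```

The point of the sketch is the return value: the finding reaches the agent as the next message in the same edit turn, so the argument can rest on iteration latency and not on whether anyone remembers to run a command.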
| \section{Motivation} | ||
| Generated applications fail in ways that reflect both the target framework and the generator's learned habits. In internal use, many defects were not exotic compiler problems. They were small but consequential choices: using a browser API in a server-rendered module, importing React Native primitives into a web project, omitting a \texttt{response.ok} check, using unsupported animation patterns, or forgetting an Expo-specific layout guard. These problems are easy to fix once identified, but expensive when discovered only after preview, deployment, or user interaction. |
feels like this fails to make a distinction of why the laint model is better for our target problem versus a normal linter
like in theory you could define custom lint rules with eslint to catch these things too. but there's a reason we want to hook on file edit
| The current laint implementation contains 55 rules and 59 test files. Table~\ref{tab:categories} summarizes the rule corpus by category. These categories are taken from the \texttt{category} field in each rule's metadata rather than assigned after the fact for the paper. The corpus contains 15 error-level rules and 40 warning-level rules. Seventeen rules are universal, while the remaining rules target Expo, web, backend, or a combination of platforms. | ||
| \paragraph{Version pinning.} | ||
| All rule counts and reported benchmark artifacts in this paper are tied to a fixed repository state: \texttt{main} commit \texttt{6a60a0295955ee6cc1d639c88955ea50722e3516}, dated 2026-05-14. The counts are reproducible from checked-in repository artifacts using the \texttt{paper:stats} script documented with the paper source, and the archived run artifacts include metadata for the runner, prompt IDs, model aliases, model IDs, and token or repair-turn limits. Future benchmark reports should cite either an immutable commit hash or a purpose-named git tag so that later rule additions, rule rewrites, or prompt-suite changes do not change the meaning of previously reported results. |
what's this future benchmark reports line about? for citations or something? followup research?
| \begin{table}[ht] |
might be the [ht]; maybe just try [h] or [h!]
| Mobile and platform-compatibility rules prevent generated code from mixing incompatible APIs or violating layout constraints. Examples include checks for web/native import boundaries, Expo image imports, safe-area handling around notches and home indicators, keyboard avoidance around text inputs, and bottom padding for native tab screens. These are common in agent-written code because examples for web and native React are semantically similar but operationally distinct. | ||
| \paragraph{Framework conventions.} | ||
| Expo~\cite{expo}, Next.js~\cite{nextjs}, Tailwind, and screen-transition rules encode conventions that are not always enforced by the compiler. Examples include absolute route paths, tab header configuration, animation worklet directives, transition progress ranges, shared-transition tag matching, and animation class restrictions. These are not arbitrary style preferences; they are small framework contracts that generated code often violates while still remaining valid TypeScript. |
this prompts me to think that these patterns must be documented - perhaps make a case how laint provides token efficiency by encoding this stuff instead of relying on non-deterministic documentation grepping

Summary
Drafts an arXiv-style paper for laint around agent-oriented linting for generated JSX/TSX applications. The current draft frames laint as both an expert-curated benchmark and a feedback-loop tool for surfacing framework-specific generated-app failures before slower build, preview, device, or runtime checks.
The PR now includes checked-in raw prompt-grid artifacts, generated result tables, and a repair-loop pilot. The repair results are framed as diagnostic-feedback compliance signals: 476 -> 101 reported findings, 375 net reduction, 445 rule-level findings resolved, and 70 introduced findings across the repair loop. The paper still treats these as raw benchmark signals until human precision/recall labeling and downstream build/runtime/user-acceptance checks are added.
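The repair-loop counts can be cross-checked against one another. This is a sketch under the assumption that each resolved rule-level finding removes exactly one reported finding and each introduced finding adds exactly one:

```latex
\begin{align*}
476 - 445 + 70 &= 101 \\
476 - 101 &= 375 = 445 - 70
\end{align*}
```

Under that assumption the four reported numbers are mutually consistent.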
Verification
npm run lint
npm run build
npm run knip
npm test
npm run paper:tables
make -C paper
paper/main.log for undefined refs/citations and overfull/warning/error lines
Remaining Before Submission