Use provider C# reference map with pre-write internalization#10976
Use provider C# reference map with pre-write internalization#10976live1206 wants to merge 89 commits into
Conversation
commit: |
|
All changed packages have been documented.
Show changes
|
This reverts commit ff55509.
|
Follow-up on the remaining samples question from the latest review: the provider remove path only computes removal candidates from generated provider graph nodes, not from arbitrary project symbols. Sample files are generated documents but they are not provider nodes, so sample classes themselves are not selected as remove candidates by the provider graph. For generated types referenced only by samples, the full DPG regen/no-diff validation is the practical coverage; I do not see an additional code change needed here. The remaining static-state concern is still a known low-priority follow-up rather than a blocker for this PR. |
Design suggestion: derive body references from expression trees rather than declared deps + source re-parsingFollowing up on the body-reference discussion — the current approach builds graph edges from provider metadata (signatures, properties, fields, attributes, etc.) and then compensates for body-level references three ways: manually-declared The concern is soundness: a pure-metadata graph is an under-approximation of reachability, because a C# type reference can appear anywhere a method body can ( Suggestion: walk the body expression trees instead. The bodies already exist as structured
The fact that the PR re-parses generated source for the client/serialization candidates (rather than walking expression trees) suggests the tree walk wasn't feasible for all bodies — presumably where bodies are raw Not a merge blocker, but I think it's the change that would turn this from "correct-by-testing" into "correct-by-construction." Would also be worth attributing how much of the ~21% comes from skipping body analysis vs. from pre-write internalization (skipping the Roslyn --generated by Copilot |
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
… mtg-hybrid-reference-map # Conflicts: # packages/http-client-csharp/generator/Microsoft.TypeSpec.Generator.ClientModel/test/PostProcessing/ClientBodyDependencyPostProcessingTests.cs # packages/http-client-csharp/generator/Microsoft.TypeSpec.Generator/src/PostProcessing/ProviderReferenceMapAnalyzer.cs # packages/http-client-csharp/generator/Microsoft.TypeSpec.Generator/src/Providers/TypeProvider.cs
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
This reverts commit ec68ee3.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
| @@ -0,0 +1,7 @@ | |||
| --- | |||
There was a problem hiding this comment.
we should remove this change
| changelog: ["@chronus/github/changelog", { repo: "microsoft/typespec" }] | ||
|
|
||
| additionalPackages: | ||
| - packages/http-client-csharp |
There was a problem hiding this comment.
was this intentional?
| get | ||
| { | ||
| var ns = _namedTypeSymbol.ContainingNamespace.GetFullyQualifiedNameFromDisplayString(); | ||
| return string.IsNullOrEmpty(ns) ? _namedTypeSymbol.Name : $"{ns}.{_namedTypeSymbol.Name}"; |
There was a problem hiding this comment.
can we also cache this value please so it's not calculated each time?
|
|
||
| foreach (var outputType in output.TypeProviders) | ||
| { | ||
| if (outputType is ModelFactoryProvider && outputType.Methods.Count == 0) |
There was a problem hiding this comment.
why is this only needed for model factories ?
| CodeModelGenerator.Instance.AdditionalRootTypes | ||
| .Concat(CodeModelGenerator.Instance.NonRootTypes) | ||
| .Select(GetFileNameForType)) | ||
| .Distinct(StringComparer.Ordinal) |
There was a problem hiding this comment.
hmm, is the distinct necessary? The root types are a hashset, is it possible for there to be duplicates once we get the filename?
| .ToArray(); | ||
| } | ||
|
|
||
| private static string GetFileNameForType(string typeName) |
There was a problem hiding this comment.
Looking at this further, why do we need to explicitly do this now ? The types to keep were already being checked in the postprocess step. If we must do this here, should we try and find the corresponding type in the outputlibrary by name, and use it's RelativePath to get the name of the file instead?
| var generatedPublicReachable = GetReachableTypes(internalizeRoots, internalizeReferences); | ||
| AddDerivedModelReferences(providers, publicGraph.Nodes, internalizeReferences, generatedPublicReachable, generatedDiscriminatorBaseNames); | ||
| internalizeRoots.UnionWith(customPublicRoots); | ||
| var internalizeReachableWithoutHelpers = GetReachableTypes(internalizeRoots, internalizeReferences); | ||
| AddDerivedModelReferences(providers, publicGraph.Nodes, internalizeReferences, internalizeReachableWithoutHelpers, generatedDiscriminatorBaseNames); | ||
| internalizeReachableWithoutHelpers = GetReachableTypes(internalizeRoots, internalizeReferences); | ||
| var publicizeRoots = internalizeRoots.ToHashSet(StringComparer.Ordinal); | ||
| var internalizeHelperRoots = GetHelperRootNames(generatedProviders, graph.Nodes, internalizeReachableWithoutHelpers); | ||
| internalizeRoots.UnionWith(internalizeHelperRoots); | ||
| var internalizeReachable = GetReachableTypes(internalizeRoots, internalizeReferences); | ||
| var internalizeDeclaredNodes = GetPostProcessorDeclaredNodes(generatedProviders, graph.Nodes, publicOnly: true); | ||
| var customInternalBoundaryNodes = graph.Nodes | ||
| .Where(name => publicGraph.References.TryGetValue(name, out var references) && references.Overlaps(customInternalDeclarations)) | ||
| .ToHashSet(StringComparer.Ordinal); | ||
| var publicizeDeclaredNodes = GetPostProcessorDeclaredNodes(generatedProviders, graph.Nodes, publicOnly: false) | ||
| .Except(internalizeDeclaredNodes, StringComparer.Ordinal); | ||
| var generatedImplementationInternalDeclarations = GetGeneratedImplementationInternalTypeDeclarations(generatedInternalDeclarations).ToHashSet(StringComparer.Ordinal); | ||
| var publicApiTraversalNodes = internalizeDeclaredNodes | ||
| .Except(generatedInternalDeclarations, StringComparer.Ordinal) | ||
| .Concat(publicizeDeclaredNodes) | ||
| .Except(generatedImplementationInternalDeclarations, StringComparer.Ordinal) | ||
| .ToHashSet(StringComparer.Ordinal); |
There was a problem hiding this comment.
I will be honest and say this is quite difficult to follow
Summary
Replaces the C# generator's Roslyn reference-map/internalization path with a provider/expression-based reference map and pre-write provider accessibility updates while preserving generated-output correctness.
Latest changes
TypeProvidermetadata and provider expression/body dependency traversal instead of constructing the generated reference map through Roslyn.InternalizeAsyncpass when pre-write accessibility has already been applied; removal and reduce still run through the existing post-processing pipeline.NamedTypeSymbolProviders, using lightweight metadata identity and syntax dependency extraction so arbitrary custom Roslyn symbols do not need fullCSharpType/member materialization.How provider reference-map construction works
This PR no longer uses Roslyn to construct the generated-code reference map. Instead, generated type reachability is derived from the in-memory provider model before files are written.
TypeProviders first, then run visitors, customized-member filtering, and back-compat processing so the analyzer sees the final provider shape.InternalizeAsyncis skipped because the generated files already have the final accessibility.Roslyn is still used by the existing post-processing pipeline for removal/reduce work and for reading custom-code symbols/syntax, but not for generated reference-map construction or applying internalization.
Latest benchmark data
Latest local full-generation benchmark compares Roslyn as the baseline against the current provider implementation with pre-write internalization and the latest custom-code/state-cleanup fixes. The values below are aggregated means across 3 BenchmarkDotNet runs with profiling disabled.
Per-run results:
Benchmark artifacts:
/tmp/typespec-provider-map-latest-20260701-0828-run1/tmp/typespec-provider-map-latest-20260701-0828-run2/tmp/typespec-provider-map-latest-20260701-0828-run3/tmp/typespec-provider-map-no-profile-20260630-1547-run1/tmp/typespec-provider-map-no-profile-20260630-1547-run2/tmp/typespec-provider-map-no-profile-20260630-1547-run3/tmp/typespec-hybrid-benchmark-reruns-20260629-065344/tmp/typespec-prewrite-internalize-benchmark-20260629-073203Azure SDK local project-reference benchmark
Measured Azure SDK for .NET management-plane regeneration using the same local project-reference setup in both runs:
/workspaces/azure-sdk-for-netwas wired to the sibling/workspaces/typespeccheckout via .NET project references, and only the TypeSpec checkout changed betweenmainand this PR branch.Validation
RegenPreview.ps1 -Azurescript regenerated 39 Azure-branded data-plane libraries with no SDK code diffs and no public API diffs.Azure.Data.AppConfiguration(*.nuget.g.propsalready existed`).Azure.Data.AppConfigurationalone with the same local generator passed, leaving no SDK/API diffs.Azure.AI.Vision.ImageAnalysispassed after the custom-code stack-overflow fix.System/Azureusings are removed from the model factory, and internal error models are generated/marked buildable in the MRW context.npm run formatpasses.pnpm formatpasses; unrelated formatter changes outsidehttp-client-csharpwere discarded.Microsoft.TypeSpec.Generator.ClientModel.Testspass (1451tests).npm run build -- --no-restorepasses.Microsoft.TypeSpec.Generator.Testscustomization/post-processing tests pass (ClientCustomizationTests,PostProcessorTests.RemovesInvalidUsings).npm run coppasses.