fix: replace deprecated grouped_entities with aggregation_strategy#1194

Open
tysoncung wants to merge 4 commits intohuggingface:mainfrom
tysoncung:fix-grouped-entities

@tysoncung

Summary

Replace all occurrences of the deprecated grouped_entities=True parameter with the modern aggregation_strategy="simple" across all language translations in chapter 1.

Problem

The grouped_entities parameter was removed from TokenClassificationPipeline._sanitize_parameters(), causing a TypeError when running the NER pipeline examples in the course notebooks.

Reported in huggingface/transformers#44016.
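
For readers who still have older snippets around, the migration described above can be sketched as a small keyword-argument translator. This is a minimal sketch, not part of this PR or of transformers: the helper name `translate_legacy_ner_kwargs` is hypothetical, the `True` to `"simple"` mapping is the one this PR applies, and mapping `False` to `"none"` is an assumption about the old behavior.

```python
def translate_legacy_ner_kwargs(kwargs):
    """Return a copy of kwargs with grouped_entities replaced by aggregation_strategy.

    Hypothetical helper for illustration only; it does not call transformers.
    """
    new_kwargs = dict(kwargs)  # never mutate the caller's dict
    if "grouped_entities" not in new_kwargs:
        return new_kwargs

    grouped = new_kwargs.pop("grouped_entities")
    # True -> "simple" is the replacement used in this PR.
    # False -> "none" is an assumption about the legacy behavior.
    strategy = "simple" if grouped else "none"
    # An explicitly supplied aggregation_strategy takes precedence.
    new_kwargs.setdefault("aggregation_strategy", strategy)
    return new_kwargs


legacy = {"grouped_entities": True}
print(translate_legacy_ner_kwargs(legacy))
```

The translated dict could then be unpacked into the call, for example `pipeline("ner", **translate_legacy_ner_kwargs(legacy))`.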

Changes

  • Updated code examples in all 17 language translations (chapter 1, sections 3, 7, and 10)
  • Updated explanatory text referencing the parameter
  • Updated subtitle file
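
A bulk change like the one listed above could be scripted roughly as follows. This is only a sketch of one possible approach, not the method actually used for this PR: the function names are hypothetical, and the `*/chapter1/*.mdx` glob is an assumed directory layout that would need adjusting to the real repository.

```python
import re
from pathlib import Path

# Matches grouped_entities=True, allowing optional spaces around "=".
LEGACY_PARAM = re.compile(r"grouped_entities\s*=\s*True")


def replace_legacy_param(text, replacement='aggregation_strategy="simple"'):
    """Return (new_text, number_of_replacements) for one document."""
    # A callable replacement avoids any backslash processing of the string.
    return LEGACY_PARAM.subn(lambda _match: replacement, text)


def rewrite_tree(root, pattern="*/chapter1/*.mdx"):
    """Rewrite matching files under root in place; return the paths that changed.

    The glob pattern is an assumed layout (one directory per language).
    """
    changed = []
    for path in sorted(Path(root).glob(pattern)):
        original = path.read_text(encoding="utf-8")
        updated, count = replace_legacy_param(original)
        if count:
            path.write_text(updated, encoding="utf-8")
            changed.append(path)
    return changed


sample = 'ner = pipeline("ner", grouped_entities=True)'
print(replace_legacy_param(sample)[0])
```

Occurrences inside quiz `<code>` tags would need the `&quot;`-escaped replacement instead, which can be passed through the `replacement` argument.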

Testing

from transformers import pipeline
ner = pipeline("ner", aggregation_strategy="simple")
ner("My name is Sylvain and I work at Hugging Face in Brooklyn.")

cc @Rocketknight1 (per your comment on the issue)

The grouped_entities parameter was removed from TokenClassificationPipeline.
Replace all occurrences across all language translations with the modern
aggregation_strategy="simple" parameter which provides the same behavior.

Fixes huggingface/transformers#44016
@Rocketknight1
Member

Hi @tysoncung, looks good but there are some build failures - can you check them in the CI? It's possible there's a typo somewhere that creates a syntax error, which blocks us from successfully building the docs

The double quotes in aggregation_strategy="simple" inside <code> tags
cause Svelte parse errors during the documentation build. Escape them
as &quot; to fix the build.
Replace backtick code formatting with <code> tags in Burmese chapter1/7.mdx
quiz component to fix Svelte parser error during documentation build.
Matches the pattern used in English and Telugu translations.
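
The two fixes described in these commit messages (backticks to <code> tags, and double quotes escaped as &quot;) can be illustrated with a short sketch. It assumes inline code is delimited by single backticks and uses Python's standard `html.escape`; the function name `backticks_to_code_tags` is hypothetical and is not the tooling used in this PR.

```python
import html
import re

# One inline code span: a backtick, some non-backtick text, a closing backtick.
INLINE_CODE = re.compile(r"`([^`]+)`")


def backticks_to_code_tags(text):
    """Replace `inline code` with <code>...</code>, escaping HTML-special characters."""
    def to_tag(match):
        # quote=True also turns double quotes into &quot;
        return "<code>" + html.escape(match.group(1), quote=True) + "</code>"

    return INLINE_CODE.sub(to_tag, text)


print(backticks_to_code_tags('Use `aggregation_strategy="simple"` to group words.'))
```

Note that `html.escape` also escapes `&`, `<`, and `>`, so it should only be applied to text that has not already been escaped.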
  {
    text: "Er gibt Begriffe zurück, die für Personen, Organisationen oder Orte stehen.",
-   explain: "Außerdem werden mit <code>grouped_entities=True</code> die Wörter, die zur selben Entität gehören, gruppiert, wie z. B. \"Hugging Face\".",
+   explain: "Außerdem werden mit <code>aggregation_strategy=&quot;simple&quot;</code> die Wörter, die zur selben Entität gehören, gruppiert, wie z. B. \"Hugging Face\".",

The " double quotation marks around the word simple need to be handled the same way as the ones around Hugging Face.

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@tysoncung
Author

Closing - CI infra issue on repo side. Happy to reopen if needed.

@tysoncung tysoncung closed this Feb 25, 2026
@tysoncung tysoncung reopened this Mar 3, 2026
@tysoncung
Author

Thanks @khushali9 for the review! The explain fields in the quiz MDX files already use proper HTML entity escaping (&quot;simple&quot;) for the double quotes inside the <code> tags. The code blocks in Python examples use regular quotes as expected. Let me know if you spot any other issues!

@tysoncung
Author

Friendly bump — this PR is in a clean merge state with CI passing. Would love to get this merged when you have a moment! 🙏
