make CountFrequencyEncoder not raise an error with unseen categories during transform

**Is your feature request related to a problem? Please describe.**
Currently, `CountFrequencyEncoder` raises a `ValueError` during the `transform()` step if it encounters categories that were not seen during the `fit()` phase. This behavior can interrupt pipelines and make the transformer less flexible when working with real-world datasets where unseen categories frequently occur during inference or deployment.

**Describe the solution you'd like**
Introduce a parameter to control how unseen categories should be handled during `transform()`. For example:

```python
unseen_categories: str = "raise"  # options: 'raise', 'warn', 'ignore'
```

* `raise` → Keep the current behavior and raise a `ValueError`.
* `warn` → Encode unseen categories as `NaN` (or optionally `0`) and emit a `UserWarning` indicating which categories were unseen.
* `ignore` → Encode unseen categories as `NaN` silently, with no warning or error.

With `warn` or `ignore`, the transformer would keep running when unexpected categories appear, and `warn` would still tell the user which categories were unseen.
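To make the three modes concrete, here is a minimal sketch in plain Python. It is not the library's implementation: `fit_counts` and `transform_counts` are hypothetical helpers written for illustration, and the `unseen_categories` argument only mirrors the parameter proposed above. The sketch assumes `NaN` as the fallback value for unseen categories.

```python
import math
import warnings
from collections import Counter


def fit_counts(values):
    """Learn a category -> count mapping from the training values."""
    return dict(Counter(values))


def transform_counts(values, mapping, unseen_categories="raise"):
    """Replace each category with its learned count.

    `unseen_categories` mirrors the proposed (hypothetical) parameter:
    'raise', 'warn', or 'ignore'.
    """
    if unseen_categories not in ("raise", "warn", "ignore"):
        raise ValueError(
            "unseen_categories must be 'raise', 'warn' or 'ignore', "
            f"got {unseen_categories!r}"
        )

    # Categories present now but absent from the fitted mapping.
    unseen = sorted({v for v in values if v not in mapping})

    if unseen and unseen_categories == "raise":
        raise ValueError(f"Unseen categories during transform: {unseen}")
    if unseen and unseen_categories == "warn":
        warnings.warn(f"Unseen categories encoded as NaN: {unseen}", UserWarning)

    # Unseen categories fall back to NaN under 'warn' and 'ignore'.
    return [mapping.get(v, math.nan) for v in values]


mapping = fit_counts(["a", "a", "b"])
encoded = transform_counts(["a", "b", "c"], mapping, unseen_categories="ignore")
```

Here `"c"` was never seen during fitting, so with `ignore` it is encoded as `NaN` while `"a"` and `"b"` get their counts. The same call with the default `raise` would stop with a `ValueError`, and `warn` would return the same values plus a `UserWarning`.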

**Describe alternatives you've considered**
An alternative approach could be to always encode unseen categories as `NaN` without providing configuration options. However, this removes user control over strict validation and may hide data issues. Providing a configurable parameter maintains flexibility while preserving the option to enforce strict behavior.

**Additional context**
This change would align `CountFrequencyEncoder` with the design pattern being introduced across other transformers in the library that avoid raising errors during transformation and instead provide configurable handling of unexpected values. It would also improve usability in production pipelines where unseen categories are common.


make CountFrequencyEncoder not raise an error with unseen categories during transform #909
