Add `drop` parameter to `OneHotEncoder` to control which dummy category is dropped #913

**Is your feature request related to a problem? Please describe.**
In the current implementation, when `drop_last=True`, `OneHotEncoder` always drops the last category (alphabetically), so users cannot choose which category serves as the reference group. In many modeling scenarios (for example, logistic regression or other linear models), the choice of reference category matters, and users may want to drop a different one.

**Describe the solution you'd like**
Add a `drop` parameter that allows users to control which dummy category is dropped.

```python
drop: str = "last"  # options: "last", "first", "most_frequent"
```

* `"last"` (default): preserves current behaviour and drops the last category alphabetically.
* `"first"`: drops the first category alphabetically.
* `"most_frequent"`: drops the most frequent category found during `fit()`, which can be a more statistically meaningful reference group.

If `drop="most_frequent"` and multiple categories share the highest frequency, the transformer should issue a `UserWarning` and fall back to dropping the first category found.
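To make the three options and the tie rule concrete, here is a minimal sketch in plain pandas. It is not the encoder's actual code: `select_category_to_drop` is a hypothetical helper, and it assumes that "first category found" means the first tied category in order of appearance in the training data.

```python
import warnings

import pandas as pd


def select_category_to_drop(series: pd.Series, drop: str = "last"):
    """Hypothetical helper: return the category whose dummy would be dropped."""
    observed = series.dropna()
    categories = sorted(observed.unique())  # alphabetical order for strings

    if drop == "last":
        return categories[-1]
    if drop == "first":
        return categories[0]
    if drop == "most_frequent":
        counts = observed.value_counts()
        top = counts.max()
        # Assumption: ties are resolved by order of appearance in the data.
        tied = [c for c in pd.unique(observed) if counts.loc[c] == top]
        if len(tied) > 1:
            warnings.warn(
                f"Categories {tied} share the highest frequency; "
                f"dropping {tied[0]!r}.",
                UserWarning,
            )
        return tied[0]
    raise ValueError(
        "drop must be 'last', 'first' or 'most_frequent'; " f"got {drop!r}"
    )


colours = pd.Series(["red", "blue", "blue", "green"])
print(select_category_to_drop(colours, "last"))           # red
print(select_category_to_drop(colours, "first"))          # blue
print(select_category_to_drop(colours, "most_frequent"))  # blue
```

In a real implementation the chosen category would be stored during `fit()` so that `transform()` drops the same dummy on new data.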

The existing `drop_last` parameter should remain for backward compatibility, but a deprecation warning should be issued if it is used together with the new `drop` parameter.
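One possible way to reconcile the two parameters is sketched below. `resolve_drop` is a hypothetical helper, and two details are assumptions that the proposal does not settle: `drop` defaults to `None` here (rather than `"last"`) so that an explicit value can be detected, and `drop` takes precedence when both are supplied.

```python
import warnings
from typing import Optional

VALID_DROP = ("last", "first", "most_frequent")


def resolve_drop(drop_last: bool = False, drop: Optional[str] = None) -> Optional[str]:
    """Hypothetical helper: return the drop strategy, or None to keep all dummies."""
    if drop is None:
        # Legacy path: only `drop_last` was used.
        return "last" if drop_last else None

    if drop not in VALID_DROP:
        raise ValueError(f"drop must be one of {VALID_DROP}; got {drop!r}")

    if drop_last:
        # Both parameters supplied: warn and (by assumption) let `drop` win.
        warnings.warn(
            "`drop_last` is deprecated when `drop` is set; `drop` takes precedence.",
            DeprecationWarning,
            stacklevel=2,
        )
    return drop
```

Python hides `DeprecationWarning` by default outside of `__main__` and test runners, so a `FutureWarning` may be the better choice if end users are meant to see the message.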

**Describe alternatives you've considered**
Users can currently control the reference category only by manually reordering or preprocessing the categorical values before applying the encoder. However, this is inconvenient and error-prone, especially in larger pipelines.
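For illustration, this is the kind of manual reordering the workaround involves. The sketch uses pandas' own `get_dummies` rather than `OneHotEncoder`, and the column name and reference category are made up for the example.

```python
import pandas as pd

colour = pd.Series(["red", "blue", "green", "blue"], name="colour")
reference = "green"  # category we want as the reference group

# Put the reference category first so that drop_first=True removes it.
ordered = [reference] + sorted(c for c in colour.unique() if c != reference)
colour_cat = colour.astype(pd.CategoricalDtype(categories=ordered))

dummies = pd.get_dummies(colour_cat, prefix="colour", drop_first=True)
print(list(dummies.columns))  # ['colour_blue', 'colour_red']
```

Every categorical variable needs its own ordering step, and the ordering has to be reproduced at prediction time, which is where the approach becomes fragile in larger pipelines.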

**Additional context**
Adding this parameter would make `OneHotEncoder` more flexible and align it better with common machine learning workflows, particularly when building statistical or linear models where the reference category is important.

