Specialize benchmark creation helpers#1397
lrzpellegrini wants to merge 20 commits into ContinualAI:master
Conversation
IMO, reusing the same organization as the classification generators is a bit clunky for the generic ones. If we assume that the input is an AvalancheDataset, we can remove the transform arguments. IMO the generic generators should:

The usage should go like this:
    test_generator: LazyStreamDefinition,
    *,
Is there any reason to keep the LazyStreamDefinition? Maybe we should drop LazyStreamDefinition and just use a generator of experiences now. I don't see a particular advantage to delaying the stream creation this way. It's OK to keep this internal and remove it later.
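To illustrate the suggestion above, here is a minimal sketch of what "just use a generator of experiences" could look like: a plain Python generator that builds each experience only when it is requested. The `Experience` class and `train_stream` function are illustrative stand-ins, not the actual Avalanche API.

```python
from dataclasses import dataclass
from typing import Iterator


@dataclass
class Experience:
    """Minimal stand-in for a benchmark experience."""
    dataset: list  # placeholder for an AvalancheDataset


def train_stream(n_experiences: int) -> Iterator[Experience]:
    """Yield experiences one at a time; each is built only when requested."""
    for i in range(n_experiences):
        # Expensive dataset construction would happen here, lazily.
        yield Experience(dataset=[i])


stream = train_stream(3)
first = next(stream)  # only the first experience is materialized
```

With this shape there is no separate stream-definition object to manage: the generator itself is the lazy stream.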
    def create_lazy_detection_benchmark(
        train_generator: LazyStreamDefinition,
    train_datasets: Sequence[GenericSupportedDataset],
    test_datasets: Sequence[GenericSupportedDataset],
Can we simplify this by requiring an AvalancheDataset? That way, we remove the need to pass transforms/task labels...
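The simplification proposed here can be sketched as a signature change: if the generator only accepts already-wrapped datasets, the transform and task-label parameters disappear. `WrappedDataset` below is a hypothetical stand-in for AvalancheDataset, and neither function name is the real Avalanche API.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass
class WrappedDataset:
    """Stand-in for AvalancheDataset: transforms/task labels already attached."""
    samples: list
    task_label: int = 0


def benchmark_from_raw(
    train_datasets: Sequence,
    test_datasets: Sequence,
    *,
    train_transform: Optional[Callable] = None,
    task_labels: Optional[Sequence[int]] = None,
):
    """Current style: raw datasets plus transform/task-label arguments."""
    ...


def benchmark_from_wrapped(
    train_datasets: Sequence[WrappedDataset],
    test_datasets: Sequence[WrappedDataset],
):
    """Proposed style: inputs already carry their transforms and task labels."""
    return {"train": list(train_datasets), "test": list(test_datasets)}


bench = benchmark_from_wrapped([WrappedDataset([1, 2])], [WrappedDataset([3])])
```

The design trade-off is that users must wrap their data first, but the generator signature no longer duplicates per-dataset configuration.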
    def make_detection_dataset(
        dataset: SupportedDetectionDataset,
Do we need this? DetectionDataset should be enough. We have the others for classification mostly to keep compatibility with the old API.
It's also a matter of allowing:

- On the Avalanche side, to automatically fetch the `targets` from the dataset
- On the user side, to set the targets and task labels

The alternative option is setting the targets and task labels through `data_attributes`, but that seems an advanced topic.
    def detection_subset(
        dataset: SupportedDetectionDataset,
Do we need this? DetectionDataset should be enough. We have the others for classification mostly to keep compatibility with the old API.
    return found_targets
    def make_generic_dataset(
Do we need it? Also, it's not generic, because it still asks for targets.
I think it's still useful to load targets and task labels if they are available. Targets and task labels are optional (it's a best-effort search).
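A "best-effort search" of this kind might look like the sketch below: return the `targets` attribute if the dataset exposes one, recurse into torchvision-style wrappers that store the original dataset in a `.dataset` field, and otherwise give up silently. The helper name and lookup order are assumptions, not the actual Avalanche implementation.

```python
from typing import Any, Optional, Sequence


def find_targets(dataset: Any) -> Optional[Sequence]:
    """Best-effort targets lookup: check the dataset itself, then any
    wrapped inner dataset; return None when nothing is found."""
    found_targets = getattr(dataset, "targets", None)
    if found_targets is not None:
        return found_targets
    inner = getattr(dataset, "dataset", None)  # e.g. torch.utils.data.Subset
    if inner is not None:
        return find_targets(inner)
    return None


class ToyDataset:
    targets = [0, 1, 1]


class ToySubset:
    dataset = ToyDataset()  # wrapper without its own targets
```

With this approach, `find_targets(ToySubset())` still finds `[0, 1, 1]`, while an object with no recognizable fields simply yields `None` instead of raising.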
    return data
    def make_generic_tensor_dataset(
The same goes for the dataset methods. We don't really need those. Instead of doing …, I think it's easier to do …. We don't need to know how the dataset is made internally.
Oh no! It seems there are some PEP8 errors! 😕
I agree with most of the things you pointed out. I was hoping to keep the scope of this PR narrower 😅.
I don't understand. We don't have generic benchmarks now. We only have the classification benchmarks, which we use for everything.
I agree. Maybe we should fix this and have a simpler API to manage transformations? Combining it with the benchmark creation doesn't seem like a great choice for the generic benchmarks.
We can create lazy streams of experiences, like we do for OCL, where the experiences themselves are created on-the-fly.
We have …; similarly, for detection: ….

In theory, one could create a regression/"other problem" benchmark using ….
This is a class; I'm referring to benchmark generators. We don't have generic ones because they all expect targets and task labels.
This is the kind of API that I want to avoid. For every type of dataset/problem and every type of CL scenario we will need a different generator, or a super general and super complex one.
Keep in mind that we don't really have a use case for timeline fields. I think it's best to keep the benchmark itself as simple as possible (i.e. a dumb collection of streams). Same for streams (a dumb collection of experiences). The idea of splitting dataset and stream creation tries to reduce this complexity by separating concerns into different methods. I'm open to other proposals, but only in the direction of reducing complexity.
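The "dumb collection" structure argued for here can be sketched in a few lines: the benchmark is just a named mapping of streams, and a stream is just a sequence of experiences, with no timeline logic attached. All class names below are illustrative, not the Avalanche API.

```python
from dataclasses import dataclass
from typing import Dict, Sequence


@dataclass(frozen=True)
class Experience:
    dataset: tuple  # placeholder for the experience's data


@dataclass(frozen=True)
class Stream:
    """A dumb collection of experiences: iterable, sized, nothing more."""
    experiences: Sequence[Experience]

    def __iter__(self):
        return iter(self.experiences)

    def __len__(self):
        return len(self.experiences)


@dataclass(frozen=True)
class Benchmark:
    """A dumb collection of streams, keyed by name."""
    streams: Dict[str, Stream]


bench = Benchmark(
    streams={
        "train": Stream([Experience((0,)), Experience((1,))]),
        "test": Stream([Experience((2,))]),
    }
)
```

Anything problem-specific (targets, task labels, timelines) would then live on the experiences' datasets rather than on the benchmark or stream objects.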
I think it may be a good idea to re-design the benchmark generation part a bit. I'm putting this PR on hold until we devise a more general solution.
This PR introduces the ability to customize the class of the benchmark objects returned by the generic benchmark creation functions (such as `dataset_benchmark`, `paths_benchmark`, ...).

In addition, new problem-specific functions (`dataset_classification_benchmark`, `dataset_detection_benchmark`, ...) are introduced to explicitly cover classification and detection setups. The generic functions will still return `ClassificationScenario` instances if datasets with classification targets (ints) are detected, but they will now display a warning suggesting the use of their classification counterparts.

The `*_scenario` functions, which were deprecated quite a long time ago, have been removed.
This PR also fixes a problem with CORe50-NC, which was missing some fields found in NCScenario.
Minor addition: this also resolves some warnings raised by Sphinx when generating the doc files. The structure of some macro-sections has been slightly reworked.
Fixes #774