Specialize benchmark creation helpers#1397
lrzpellegrini wants to merge 20 commits into ContinualAI:master
Conversation
IMO, reusing the same organization as the classification generators is a bit clunky for the generic ones. If we assume that the input is an AvalancheDataset, we can remove the transform arguments. IMO the generic generators should:

The usage should go like this:
    test_generator: LazyStreamDefinition,
    *,
Is there any reason to keep the LazyStreamDefinition? Maybe we should drop LazyStreamDefinition and just use a generator of experiences now. I don't see a particular advantage to delaying the stream creation this way. It's OK to keep this internal and remove it later.
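To illustrate the suggestion above, here is a minimal sketch of what "just use a generator of experiences" could look like: a plain Python generator that builds each experience only when it is requested. The `Experience` class and `train_stream` function are illustrative stand-ins, not the actual Avalanche API.

```python
from dataclasses import dataclass
from typing import Iterator


@dataclass
class Experience:
    """Minimal stand-in for a benchmark experience."""
    dataset: list  # placeholder for an AvalancheDataset


def train_stream(n_experiences: int) -> Iterator[Experience]:
    """Yield experiences one at a time; each is built only when requested."""
    for i in range(n_experiences):
        # Expensive dataset construction would happen here, lazily.
        yield Experience(dataset=[i])


stream = train_stream(3)
first = next(stream)  # only the first experience is materialized
```

With this shape there is no separate stream-definition object to manage: the generator itself is the lazy stream.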
    def create_lazy_detection_benchmark(
        train_generator: LazyStreamDefinition,
    train_datasets: Sequence[GenericSupportedDataset],
    test_datasets: Sequence[GenericSupportedDataset],
Can we simplify this by requiring an AvalancheDataset? That way, we remove the need to pass transforms/task labels...
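The simplification proposed here can be sketched as a signature change: if the generator only accepts already-wrapped datasets, the transform and task-label parameters disappear. `WrappedDataset` below is a hypothetical stand-in for AvalancheDataset, and neither function name is the real Avalanche API.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass
class WrappedDataset:
    """Stand-in for AvalancheDataset: transforms/task labels already attached."""
    samples: list
    task_label: int = 0


def benchmark_from_raw(
    train_datasets: Sequence,
    test_datasets: Sequence,
    *,
    train_transform: Optional[Callable] = None,
    task_labels: Optional[Sequence[int]] = None,
):
    """Current style: raw datasets plus transform/task-label arguments."""
    ...


def benchmark_from_wrapped(
    train_datasets: Sequence[WrappedDataset],
    test_datasets: Sequence[WrappedDataset],
):
    """Proposed style: inputs already carry their transforms and task labels."""
    return {"train": list(train_datasets), "test": list(test_datasets)}


bench = benchmark_from_wrapped([WrappedDataset([1, 2])], [WrappedDataset([3])])
```

The design trade-off is that users must wrap their data first, but the generator signature no longer duplicates per-dataset configuration.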
    def make_detection_dataset(
        dataset: SupportedDetectionDataset,
Do we need this? DetectionDataset should be enough. We have the others for classification mostly to keep compatibility with the old API.
It's also a matter of allowing:

- On the Avalanche side, to automatically fetch the `targets` from the dataset
- On the user side, to set the targets and task labels

The alternative option is setting the targets and task labels through `data_attributes`, but that seems an advanced topic.
    def detection_subset(
        dataset: SupportedDetectionDataset,
Do we need this? DetectionDataset should be enough. We have the others for classification mostly to keep compatibility with the old API.
    return found_targets
    def make_generic_dataset(
Do we need it? Also, it's not generic, because it still asks for targets.
I think it's still useful to load targets and task labels if they are available. Targets and task labels are optional (it's a best-effort search).
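A "best-effort search" of this kind might look like the sketch below: return the `targets` attribute if the dataset exposes one, recurse into torchvision-style wrappers that store the original dataset in a `.dataset` field, and otherwise give up silently. The helper name and lookup order are assumptions, not the actual Avalanche implementation.

```python
from typing import Any, Optional, Sequence


def find_targets(dataset: Any) -> Optional[Sequence]:
    """Best-effort targets lookup: check the dataset itself, then any
    wrapped inner dataset; return None when nothing is found."""
    found_targets = getattr(dataset, "targets", None)
    if found_targets is not None:
        return found_targets
    inner = getattr(dataset, "dataset", None)  # e.g. torch.utils.data.Subset
    if inner is not None:
        return find_targets(inner)
    return None


class ToyDataset:
    targets = [0, 1, 1]


class ToySubset:
    dataset = ToyDataset()  # wrapper without its own targets
```

With this approach, `find_targets(ToySubset())` still finds `[0, 1, 1]`, while an object with no recognizable fields simply yields `None` instead of raising.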
    return data
    def make_generic_tensor_dataset(
The same goes for the dataset methods. We don't really need those. Instead of doing …, I think it's easier to do …. We don't need to know how the dataset is made internally.
Oh no! It seems there are some PEP8 errors! 😕
I agree with most of the things you pointed out. I was hoping to keep the scope of this PR narrower 😅.
I don't understand. We don't have generic benchmarks now. We only have the classification benchmarks, which we use for everything.
I agree. Maybe we should fix this and have a simpler API to manage transformations? Combining it with the benchmark creation doesn't seem like a great choice for the generic benchmarks.
We can create lazy streams of experiences, like we do for OCL, where the experiences themselves are created on-the-fly.
We have …; similarly, for detection: ….

In theory, one could create a regression/"other problem" benchmark using ….
This is a class; I'm referring to benchmark generators. We don't have generic ones because they all expect targets and task labels.
This is the kind of API that I want to avoid. For every type of dataset/problem and every type of CL scenario we will need a different generator, or a super general and super complex one.
Keep in mind that we don't really have a use case for timeline fields. I think it's best to keep the benchmark itself as simple as possible (i.e. a dumb collection of streams). Same for streams (a dumb collection of experiences). The idea of splitting dataset and stream creation tries to reduce this complexity by separating concerns into different methods. I'm open to other proposals, but only in the direction of reducing complexity.
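The "dumb collection" structure argued for here can be sketched in a few lines: the benchmark is just a named mapping of streams, and a stream is just a sequence of experiences, with no timeline logic attached. All class names below are illustrative, not the Avalanche API.

```python
from dataclasses import dataclass
from typing import Dict, Sequence


@dataclass(frozen=True)
class Experience:
    dataset: tuple  # placeholder for the experience's data


@dataclass(frozen=True)
class Stream:
    """A dumb collection of experiences: iterable, sized, nothing more."""
    experiences: Sequence[Experience]

    def __iter__(self):
        return iter(self.experiences)

    def __len__(self):
        return len(self.experiences)


@dataclass(frozen=True)
class Benchmark:
    """A dumb collection of streams, keyed by name."""
    streams: Dict[str, Stream]


bench = Benchmark(
    streams={
        "train": Stream([Experience((0,)), Experience((1,))]),
        "test": Stream([Experience((2,))]),
    }
)
```

Anything problem-specific (targets, task labels, timelines) would then live on the experiences' datasets rather than on the benchmark or stream objects.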
I think it may be a good idea to re-design the benchmark generation part a bit. I'm putting this PR on hold until we devise a more general solution.
This PR introduces the ability to customize the class of the benchmark objects returned by the generic benchmark creation functions (such as `dataset_benchmark`, `paths_benchmark`, ...).

In addition, new problem-specific functions (`dataset_classification_benchmark`, `dataset_detection_benchmark`, ...) are introduced to explicitly cover classification and detection setups. The generic functions will still return `ClassificationScenario` instances if datasets with classification targets (ints) are detected, but they will now display a warning suggesting the use of their classification counterparts.

The `*_scenario` functions, which were deprecated quite a long time ago, have been removed.
This PR also fixes a problem with CORe50-NC, which was missing some fields found in NCScenario.
Minor addition: this also resolves some warnings raised by Sphinx when generating the doc files. The structure of some macro-sections has been slightly reworked.
Fixes #774