This is an implementation of our paper: "An Expanded Benchmark that Rediscovers and Affirms the Edge of Uncertainty Sampling for Active Learning in Tabular Datasets", which is updated from "Re-benchmarking Pool-Based Active Learning for Binary Classification".
We reproduce and re-benchmark the previous work: A Comparative Survey: Benchmarking for Pool-based Active Learning (Zhan et al., IJCAI 2021).
Update on 2024/04/17: we noticed that Zhan et al. released their source code: https://github.com/SineZHAN/ComparativeSurveyIJCAI2021PoolBasedAL
Update on 2024/07/23: we merged our paper "A More Robust Baseline for Active Learning by Injecting Randomness to Uncertainty Sampling" into the benchmark. Please switch to the robust-baseline branch:

```
git checkout robust-baseline
```
Update on 2025/06/24: our paper was accepted by TMLR (2025/06) and received the Reproducibility Certification.
Call for Contribution and Future Work
We call for the community to contribute more experimental results to this benchmark. Below is our suggested future work:
- Models for tabular datasets
  - Random Forest
  - Gradient Boosting Decision Trees
  - TabPFN
- Tasks and domains
  - multi-class classification
  - regression problems
  - image classification with pre-trained embeddings as the tabular data
  - natural language processing with pre-trained embeddings as the tabular data
- Evaluation metrics
  - Deficiency score
  - Data Utilization Rate
  - Start Quality
  - Average End Quality
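For contributors who want to add curve-based metrics, they can all be computed from a budget-accuracy learning curve. The sketch below is illustrative only: the function names and the exact conventions (e.g. how Data Utilization Rate handles an unreached target) are our assumptions, not the benchmark's implementation.

```python
import numpy as np

def aubc(acc):
    """Area under the budget curve: mean test accuracy over all rounds (illustrative)."""
    return float(np.mean(acc))

def start_quality(acc):
    """Accuracy right after the initial labeled pool, before any queries (illustrative)."""
    return float(acc[0])

def avg_end_quality(acc, k=3):
    """Mean accuracy over the last k rounds of the budget (illustrative)."""
    return float(np.mean(acc[-k:]))

def data_utilization_rate(acc, target):
    """Fraction of the budget needed to first reach a target accuracy (illustrative).
    Returns 1.0 when the target is never reached."""
    hits = np.nonzero(np.asarray(acc) >= target)[0]
    return float(hits[0] + 1) / len(acc) if len(hits) else 1.0

curve = [0.70, 0.74, 0.78, 0.80, 0.81]  # toy accuracy curve over 5 query rounds
```

A strategy with a higher AUBC and a lower Data Utilization Rate learns more from fewer labels.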
- Provide new datasets: we support the LIBSVM dataset format. Update `data/get_data.sh` to download more tabular data.
- Provide new query strategies: we support the libact, Google, and ALiPy modules. Update `src/config.py` to import new query strategies.
- Provide new experimental settings: we provide common settings as arguments. Update `src/main.py` to adjust the settings such as the size of the test set, the size of the initial labeled pool, the query-oriented model, the task-oriented model, etc.
```
cd data; bash get_data.sh  # download datasets
cd ..
cd src; python main.py     # run experiments; produces two CSV files: *-aubc.csv and *-detail.csv
python main.py -h          # show the help message
```

Requirements:
- Ubuntu >= 20.04.3 LTS (focal)
- Python >= 3.8, for ntucllab/libact
Note: we only verified the installation steps on Ubuntu. Please open an issue if you have any problems.

If you use Python in [3.8, 3.9]:
0. (optional) `apt install vim git python3 python3-venv build-essential gfortran libatlas-base-dev liblapacke-dev python3-dev -y`
1. `git clone https://github.com/ariapoy/active-learning-benchmark.git act-benchmark; cd act-benchmark`
2. `python3 -m venv act-env; source act-env/bin/activate`
3. `pip install -r requirements.txt`
4. `git clone https://github.com/ariapoy/active-learning.git`
5. `git clone https://github.com/ariapoy/ALiPy.git alipy-dev; cp -r alipy-dev/alipy alipy-dev/alipy_dev`
6. `git clone https://github.com/ariapoy/libact.git libact-dev; cd libact-dev; python setup.py build; python setup.py install; cd ..; cp -r libact-dev/libact libact-dev/libact_dev`
7. `git clone https://github.com/ariapoy/scikit-activeml.git scikit-activeml-dev; cp -r scikit-activeml-dev/skactiveml scikit-activeml-dev/skactiveml_dev`
8. `cd data; bash get_data_zhan21.sh; cd ..`
9. `cd src; python main.py -h`
Warning! If you use Python == 3.13, replace step 3 with:

3. `pip install -r requirements-py313.txt`
Warning! If you use Python == 3.12, replace step 3 with:

3. `pip install -r requirements-py312.txt`
Warning! If you use Python >= 3.11, replace step 6 with:

6. `git clone https://github.com/ariapoy/libact.git libact-dev; cp -r libact-dev/libact libact-dev/libact_dev`

You CANNOT obtain the results of Hinted Support Vector Machine (HintSVM) and Variance Reduction (VR) for the benchmark.
Warning! If you use Python == 3.10, replace step 5 with:

5. `git clone https://github.com/ariapoy/ALiPy.git alipy-dev; cd alipy-dev; git checkout py3.10; cd ..; cp -r alipy-dev/alipy alipy-dev/alipy_dev`
Warning! If your environment cannot support liblapack, replace step 6 with:

6. `git clone https://github.com/ariapoy/libact.git libact-dev; cd libact-dev; LIBACT_BUILD_VARIANCE_REDUCTION=0 python setup.py build; LIBACT_BUILD_VARIANCE_REDUCTION=0 python setup.py install; cd ..; cp -r libact-dev/libact libact-dev/libact_dev`

You CANNOT obtain the results of Variance Reduction (VR) for the benchmark.
Warning! If your OS is macOS, run step 0 and replace step 3 with:

0. `brew install cmake`
3. `pip install -r requirements-macos.txt`
Below are examples that demonstrate how to use the benchmark: quick use, evaluating existing AL query strategies on your own datasets, and adding new AL query strategies for evaluation.

This example runs compatible uncertainty sampling (US-C) on the Haberman dataset with an RBF-kernel SVM:

```
python main.py --tool google --qs_name margin-zhan --hs_name google-zhan --gs_name zhan --seed 0 --n_trials 1 --data_set haberman
```
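For intuition, the margin criterion behind US-C can be sketched with scikit-learn alone. This standalone toy example illustrates the idea; it is not the benchmark's `main.py` pipeline, and the variable names are ours:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import SVC

# Toy pool-based setup: 10 labeled examples per class, the rest unlabeled.
X, y = make_classification(n_samples=200, n_features=10, random_state=0)
labeled = np.concatenate([np.where(y == 0)[0][:10], np.where(y == 1)[0][:10]])
unlabeled = np.setdiff1d(np.arange(200), labeled)

# Query-oriented model: RBF-kernel SVM with probability estimates.
model = SVC(kernel="rbf", probability=True, random_state=0)
model.fit(X[labeled], y[labeled])

# Margin criterion: query the instance whose top-two class
# probabilities are closest (smallest margin = most uncertain).
proba = model.predict_proba(X[unlabeled])
sorted_p = np.sort(proba, axis=1)
margins = sorted_p[:, -1] - sorted_p[:, -2]
query_idx = int(unlabeled[np.argmin(margins)])
```

In the real loop, the queried instance is labeled by the oracle, moved into the labeled pool, and the model is refit before the next query.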
- Split the evaluation part from main.py
Reproduce all experiments for this work
- Settings of initial pools
  - Size of test set (`--tst_size=0.4`): 40%.
  - Size of initial labeled pool (`--init_lbl_size=20`): 20.
  - Construction of initial labeled pool (`--exp_name="RS"`): randomly split the training set (not the test set) into labeled and unlabeled pools.
  - Data preprocessing (`--exp_names="scale"`): apply `scaler = StandardScaler()` to the dataset.
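The initial-pool construction above amounts to the following sketch. The dataset and variable names are stand-ins, and fitting the scaler on the training split only is one common convention, not necessarily the benchmark's exact behavior:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=100, random_state=0)  # stand-in dataset

# --tst_size=0.4: hold out 40% of the data as the test set.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.4, random_state=0)

# "scale": standardize features (here, fit on training data only).
scaler = StandardScaler().fit(X_train)
X_train, X_test = scaler.transform(X_train), scaler.transform(X_test)

# "RS" with --init_lbl_size=20: randomly split the training set
# into a 20-instance labeled pool and an unlabeled pool.
rng = np.random.default_rng(0)
lbl_idx = rng.choice(len(X_train), size=20, replace=False)
ulbl_idx = np.setdiff1d(np.arange(len(X_train)), lbl_idx)
```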
- List of query strategies, their corresponding query-oriented models, and the task-oriented model.
  - Task-oriented model $\mathcal{G}$: SVM(RBF)
  - Query-oriented models:
| QS | query-oriented model |
|---|---|
| US-C | SVM(RBF) |
| US-NC | LR(C=0.1) |
| QBC | LR(C=1); SVM(Linear, probability=True); SVM(RBF, probability=True); LDA |
| VR | LR(C=1) |
| EER | SVM(RBF, probability=True) |
| Core-Set | N/A |
| Graph | N/A |
| Hier | N/A |
| HintSVM | SVM(RBF) |
| QUIRE | SVM(RBF) |
| DWUS | SVM(RBF) |
| InfoDiv | SVM(RBF) |
| MCM | SVM(RBF) |
| BMDR | SVM(RBF) |
| SPAL | SVM(RBF) |
| ALBL | Combination of QSs with the same query-oriented model: US-C; US-NC; HintSVM |
| LAL | SVM(RBF) |
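For illustration, the QBC committee from the table (LR, linear SVM, RBF SVM, LDA) can be assembled with scikit-learn and scored by vote entropy. This toy snippet is a sketch of the idea, not the benchmark's QBC implementation; the data and helper names are ours:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC

X, y = make_classification(n_samples=200, n_features=10, random_state=1)
labeled = np.concatenate([np.where(y == 0)[0][:10], np.where(y == 1)[0][:10]])
pool = np.setdiff1d(np.arange(200), labeled)

# Committee from the table: LR(C=1), SVM(Linear), SVM(RBF), LDA.
committee = [
    LogisticRegression(C=1, max_iter=1000),
    SVC(kernel="linear", probability=True, random_state=1),
    SVC(kernel="rbf", probability=True, random_state=1),
    LinearDiscriminantAnalysis(),
]
votes = np.array([m.fit(X[labeled], y[labeled]).predict(X[pool]) for m in committee])

def vote_entropy(votes):
    """Entropy of the committee's label votes per pool instance;
    higher disagreement means a more informative query."""
    ent = np.zeros(votes.shape[1])
    for c in np.unique(votes):
        frac = (votes == c).mean(axis=0)
        nz = frac > 0
        ent[nz] -= frac[nz] * np.log(frac[nz])
    return ent

query_idx = int(pool[np.argmax(vote_entropy(votes))])
```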
- Reproduce all results in Zhan et al. (Warning! This will take a very long time!)
```
cd src
bash run-reproduce-google.sh      # run all google datasets
bash run-reproduce-libact.sh      # run all libact datasets
bash run-reproduce-bso.sh         # run all bso datasets
bash run-reproduce-infeasible.sh  # run all infeasible-time datasets, only for the time test
```

Note: `N_JOBS` sets the number of workers. Users can accelerate the runs according to their number of CPUs. WARNING! Some methods could become slower because of insufficient resources.
- Reproduce all figures and tables in this work.

```
cd results
gdown 1qzezDD_fe43ctNBHC4H5W0w6skJcBlxB -O aubc.zip; unzip aubc.zip
gdown 1xKUT3CHHOwYY0yFxak1XKf3vWiAXQFSQ -O detail.zip; unzip detail.zip
python analysis.py            # choice 1
# open and run analysis.ipynb # choice 2
```

If you use our code in your research or applications, please consider citing our paper and the previous papers.
@article{lu2025an,
title={An Expanded Benchmark that Rediscovers and Affirms the Edge of Uncertainty Sampling for Active Learning in Tabular Datasets},
author={Po-Yi Lu and Yi-Jie Cheng and Chun-Liang Li and Hsuan-Tien Lin},
journal={Transactions on Machine Learning Research},
issn={2835-8856},
year={2025},
url={https://openreview.net/forum?id=855yo1Ubt2},
note={Reproducibility Certification}
}
@inproceedings{zhan2021comparative,
title={A Comparative Survey: Benchmarking for Pool-based Active Learning.},
author={Zhan, Xueying and Liu, Huan and Li, Qing and Chan, Antoni B},
booktitle={IJCAI},
pages={4679--4686},
year={2021}
}
We are glad to see more of the research community investigating AL methods for tabular data.
If you have any further questions or want to discuss active learning, please open an issue or contact Po-Yi (Poy) Lu at ariapoy@gmail.com / d09944015@csie.ntu.edu.tw.