Feat(tests): build test infrastructure #144

Open
chen2021673 wants to merge 9 commits into master from CTest-clean
Conversation

Contributor

@chen2021673 chen2021673 commented Apr 14, 2026

Summary

This PR refactors InfiniTrain’s test infrastructure around CTest and GoogleTest.

It consolidates the old test/ and tests/ layout into a single tests/ directory, introduces shared CMake utilities for test registration, and migrates applicable tests to device-parameterized TEST_P so CPU/CUDA cases can share the same test logic where appropriate.

Closes #120.

Changes

  • merge the old test/ directory into tests/
  • add shared CMake/GTest utilities under tests/common/
  • reduce repeated test registration boilerplate in per-suite CMakeLists.txt
  • migrate applicable tests from fixed-device TEST_F to device-parameterized TEST_P
  • replace hardcoded device selection with shared helpers such as GetDevice()
  • improve label-based selection for CPU/CUDA-related tests
  • refactor registration for all tests
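
The move from fixed-device TEST_F to device-parameterized TEST_P amounts to writing a test body once and running it for each device value. The sketch below illustrates that idea in plain C++ without GoogleTest so that it compiles on its own. `DeviceKind`, `DeviceName`, and `RunOnEachDevice` are hypothetical names chosen for illustration; they are not InfiniTrain helpers and not part of this PR. In GoogleTest the same role is played by `TEST_P` together with `INSTANTIATE_TEST_SUITE_P`.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for the device a test case is parameterized on.
enum class DeviceKind { kCPU, kCUDA };

// Human-readable suffix, similar to what a parameterized test name carries.
inline std::string DeviceName(DeviceKind kind) {
    return kind == DeviceKind::kCPU ? "CPU" : "CUDA";
}

// Runs one shared test body once per device and returns the names of the
// cases whose body reported failure (returned false).
inline std::vector<std::string> RunOnEachDevice(
    const std::string &test_name, const std::vector<DeviceKind> &devices,
    const std::function<bool(DeviceKind)> &body) {
    assert(static_cast<bool>(body));  // an empty std::function cannot be run
    std::vector<std::string> failed;
    for (DeviceKind kind : devices) {
        if (!body(kind)) {
            failed.push_back(test_name + "/" + DeviceName(kind));
        }
    }
    return failed;
}
```

The point of the design is that the check itself is written once and the device list decides how many cases exist.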

How to run

ctest --output-on-failure
ctest -L cpu --output-on-failure
ctest -L cuda --output-on-failure

Impact

This is mainly a test infrastructure refactor. It is not intended to change training/runtime behavior, but it does change how tests are organized and registered.

Result

ctest --output-on-failure

image

ctest -L cpu --output-on-failure

image

ctest -L cuda --output-on-failure

image

Comment thread CMakeLists.txt
Comment thread CMakeLists.txt
Comment thread CMakeLists.txt Outdated
Comment thread tests/common/test_utils.h Outdated
Comment thread tests/dtype/CMakeLists.txt Outdated
Comment thread docs/test_infrastructure_design.md Outdated
Comment thread tests/common/test_utils.h Outdated
Comment thread cmake/test_macros.cmake
Comment thread tests/common/test_utils.h Outdated
Comment thread tests/tensor/test_tensor_copy.cc Outdated
}

TEST_P(TensorCopyTest, CopiesCPUToCUDA) {
ONLY_CUDA();
Collaborator

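Semantically, this test should not have a CPU version at all, yet a CPU version is still registered. It does get skipped, but that still feels a bit odd:

  1. when CUDA is not in use, this function should not be compiled at all;
  2. even when CUDA is in use, the CPU version should not be registered (in which case TEST_P is no longer necessary either); the registration function may need to change to express this exception.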

Contributor Author

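Right. The test is skipped this way today because information inside a test case is not visible at compile time. Controlling this at compile time would require #ifdef USE_CUDA plus TEST_F/TEST, and the test could no longer use infini_train_add_test_suite; it would need the CUDA-only test registration path instead. If ONLY_CUDA/ONLY_CPU cases are confirmed to be very rare, could we avoid that and keep the redundant, late (runtime) skip logic in exchange for clearer registration?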

Collaborator

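> Controlling this at compile time would require #ifdef USE_CUDA plus TEST_F/TEST, and the test could no longer use infini_train_add_test_suite; it would need the CUDA-only test registration path instead.

Yes, that is the implementation I lean toward as well. Since we will need to support other platforms later, it is better to block suites that should not be compiled at compile time. Otherwise every platform will end up compiling a batch of tests that are unrelated to it, which introduces unnecessary platform coupling (test cases may later call runtime APIs directly, in which case the build could fail outright).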

Contributor Author

OK, I will make the change. Should the ONLY_CPU() tests get the same change?
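
The compile-time approach discussed in this thread can be sketched as follows. This is a minimal illustration, assuming the build passes a `USE_CUDA` preprocessor definition when CUDA is enabled (the thread mentions `#ifdef USE_CUDA`). `DeviceKind`, `TestDevices`, and `CudaOnlyTests` are hypothetical names and are not the PR's actual helpers.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical device tag used as the test parameter.
enum class DeviceKind { kCPU, kCUDA };

// Devices a shared, device-parameterized suite would be instantiated for.
// Without USE_CUDA the CUDA entry is never added, so there is nothing to
// skip at runtime.
inline std::vector<DeviceKind> TestDevices() {
    std::vector<DeviceKind> devices;
    devices.push_back(DeviceKind::kCPU);
#ifdef USE_CUDA
    devices.push_back(DeviceKind::kCUDA);
#endif
    assert(!devices.empty());  // the CPU entry is always present
    return devices;
}

// Names of CUDA-only cases. In a non-CUDA build the list stays empty because
// the guarded line is removed by the preprocessor before compilation.
inline std::vector<std::string> CudaOnlyTests() {
    std::vector<std::string> tests;
#ifdef USE_CUDA
    tests.push_back("TensorCopyTest.CopiesCPUToCUDA");
#endif
    return tests;
}
```

With GoogleTest, a list like `TestDevices()` could feed `INSTANTIATE_TEST_SUITE_P` through `::testing::ValuesIn`, while a CUDA-only case would sit inside `#ifdef USE_CUDA` as a plain `TEST` or `TEST_F`.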

Comment thread tests/common/test_utils.h Outdated
Comment thread tests/common/test_utils.h
Comment thread tests/common/test_utils.h Outdated
Comment thread tests/transformer/test_transformer_architecture.cc Outdated
Comment thread infini_train/include/nn/parallel/global.h Outdated
@kilinchange kilinchange changed the title from "[WIP]Feat(tests): build test infrastructure" to "Feat(tests): build test infrastructure" May 8, 2026
luoyueyuguang and others added 8 commits May 11, 2026 07:09
- Add infini_train_add_test CMake macro for simplified test registration
- Integrate gtest_discover_tests for automatic test case discovery
- Refactor all test directories to use unified macro (autograd, optimizer, hook, slow, lora)
- Reduce test CMakeLists.txt code by 68%
- Add LoRA tests (12 test cases)
- Delete TEST_REPORT.md
- Test labels: cpu/cuda/distributed/slow for flexible test execution
- Add shared test_macros.cmake in tests/common/

BREAKING CHANGE: Test registration now uses macro instead of manual add_test()

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
Replace TEST_F with TEST_P across all test suites so each suite runs on
both CPU and CUDA without duplicating test logic. Adds InfiniTrainTestP,
TensorTestBaseP, AutogradTestBaseP, and DistributedInfiniTrainTestP base
classes with automatic CUDA/NCCL skip guards. Introduces
INFINI_TRAIN_REGISTER_TEST* C++ macros and infini_train_add_test_suite
CMake macro to eliminate repetitive INSTANTIATE_TEST_SUITE_P /
infini_train_add_test boilerplate. Removes deprecated test/, slow/, and
split optimizer test files; consolidates optimizer tests into a single
binary with creation + step suites.
- Simplify CMakeLists: single CTest target per suite, remove label splitting
- Migrate old test/ directory into tests/ and delete test/
- Add docs/test_usage_guide.md with build/run/write instructions
- Rename hook_mechanism.md → hook_mechanism_design.md
- Rename lora_usage.md → lora_usage_guide.md
- Add googletest as submodule in .gitmodules
- Add infini_run tool target in CMakeLists.txt, remove stale comments
Add IsInitialized() to GlobalEnv and guard SetUpTestSuite so a second
test class in the same process skips re-initialization instead of
hitting CHECK(!initialized_). Also print try_compile output on
compile-fail test to surface header-not-found vs real type errors.
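
The re-initialization guard described in this commit message can be illustrated with a small standalone sketch. It assumes a singleton environment whose `Init()` must run at most once per process. `FakeGlobalEnv` and `EnsureInitialized` are hypothetical stand-ins; the real `GlobalEnv` interface in InfiniTrain may differ.

```cpp
#include <cassert>

// Hypothetical stand-in for a process-wide environment singleton.
class FakeGlobalEnv {
public:
    static FakeGlobalEnv &Instance() {
        static FakeGlobalEnv env;
        return env;
    }

    // Mirrors a CHECK(!initialized_) style contract: a second call is a bug.
    void Init() {
        assert(!initialized_ && "Init() must not be called twice");
        initialized_ = true;
        ++init_count_;
    }

    bool IsInitialized() const { return initialized_; }
    int init_count() const { return init_count_; }

private:
    FakeGlobalEnv() = default;
    bool initialized_ = false;
    int init_count_ = 0;
};

// What a per-suite setup guard would do: initialize only when needed, so a
// second test class in the same process does not trip the assertion.
inline void EnsureInitialized() {
    FakeGlobalEnv &env = FakeGlobalEnv::Instance();
    if (!env.IsInitialized()) {
        env.Init();
    }
}
```

Calling `EnsureInitialized()` from every suite is safe because only the first call reaches `Init()`.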
- Add requires_grad default parameter to Tensor ctor so tests can
  construct autograd-enabled tensors without a fixture helper.
- Remove InfiniTrainTest::createTensor, AutogradTestBase, and
  FillConstantTensor; call sites use std::make_shared<Tensor>(...) and
  Tensor::Fill(value) directly.
- Replace gtest_main with a custom tests/common/test_main.cc that
  initializes GlobalEnv once before RUN_ALL_TESTS, eliminating the
  need for GlobalEnv::IsInitialized and per-suite SetUpTestSuite
  init guards.
- Gate CUDA test registration on USE_CUDA: when disabled, the CUDA
  parameterization is simply not instantiated instead of skipped at
  runtime.
- Move test_macros.cmake to cmake/ and include test headers via full
  project-root paths.
- Drop tests' dependency on example/*/config.h; TransformerModule
  tests now construct TransformerConfig directly.
- Add SanitizeGPT2Config / SanitizeLLaMA3Config in example/.
Comment thread tests/common/test_utils.h
#endif

#include "infini_train/include/device.h"
#include "gtest/gtest.h"
Collaborator

As a third_party header, gtest should go in its own group above this project's headers.

Contributor Author

done

Comment thread tests/autograd/test_autograd.cc Outdated
#include "infini_train/include/autograd/transform.h"
#include "infini_train/include/tensor.h"
#include "tests/common/test_utils.h"
#include "gtest/gtest.h"
Collaborator

https://gxtctab8no8.feishu.cn/docx/ARFVdldxPo87zHxIXe4c5LMwnNl#share-NPX0dbmvDoyvkExhH1QclVMknch

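The positions of the third_party header include (gtest.h) and this project's internal header include (test_utils.h) need to be fixed; the same applies to the other files.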

Contributor Author

done

- Move gtest header into its own include group between stdlib and
  project headers across all test sources, per project convention
  that third-party headers sit separately above project includes.
- Split device-specific tests in tests/tensor/ into cpu_only/ and
  cuda_only/ subdirectories, each built as an independent test
  target; CUDA-only tests are skipped when USE_CUDA=OFF.