Comprehensive head classification. Presence/absence of hats, sunglasses, and masks; eyes open/closed; mouth open/closed; background simplicity/complexity; and Face Image Quality Assessment (FIQA).
It performs all seven classification and inference tasks in a single inference pass.
Merged model inputs:
- head_image_48x48: [1, 3, 48, 48], head crop used for background, mask, sunglasses, and hat classification
- eye_images_24x40: [2, 3, 24, 40], two eye crops used for eye-open classification
- mouth_image_30x48: [1, 3, 30, 48], mouth crop used for mouth-open classification
- head_image_352x352: [1, 3, 352, 352], head crop used for FIQA in FIQA-enabled models
Input normalization:
- head_image_48x48, eye_images_24x40, and mouth_image_30x48: RGB float32, normalized to 0.0..1.0 by dividing pixel values by 255
- head_image_352x352: RGB float32, normalized to 0.0..1.0 and then ImageNet-normalized with mean [0.485, 0.456, 0.406] and std [0.229, 0.224, 0.225]
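The normalization above can be sketched in NumPy. This is a minimal illustration, not the project's own preprocessing code: the helper names `to_unit_range` and `to_fiqa_input` are hypothetical, and it assumes each crop has already been resized to its target size and arrives as a uint8 RGB array in height-width-channel order.

```python
import numpy as np

# ImageNet statistics quoted in the text above (RGB order).
IMAGENET_MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
IMAGENET_STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)


def to_unit_range(rgb_u8: np.ndarray) -> np.ndarray:
    """Convert uint8 RGB crops (HWC or NHWC) to NCHW float32 in 0.0..1.0."""
    x = rgb_u8.astype(np.float32) / 255.0
    if x.ndim == 3:
        x = x[np.newaxis]  # add the batch axis for a single crop
    return np.ascontiguousarray(x.transpose(0, 3, 1, 2))


def to_fiqa_input(rgb_u8: np.ndarray) -> np.ndarray:
    """Scale to 0.0..1.0, then apply per-channel ImageNet normalization."""
    x = to_unit_range(rgb_u8)
    mean = IMAGENET_MEAN.reshape(1, 3, 1, 1)
    std = IMAGENET_STD.reshape(1, 3, 1, 1)
    return (x - mean) / std


# Placeholder crops (all-black head, all-white eyes) just to show the shapes.
head_48 = to_unit_range(np.zeros((48, 48, 3), dtype=np.uint8))
eyes = to_unit_range(np.full((2, 24, 40, 3), 255, dtype=np.uint8))
head_352 = to_fiqa_input(np.zeros((352, 352, 3), dtype=np.uint8))
print(head_48.shape, eyes.shape, head_352.shape)
```

The two eye crops are stacked along the batch axis so that the result matches the [2, 3, 24, 40] input shape.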
Merged model outputs:
- prob_bg_plain: [1], probability that the background is plain (as opposed to complicated)
- prob_masked: [1], probability that the person is wearing a mask
- prob_sunglass: [1], probability that the person is wearing sunglasses
- prob_hat: [1], probability that the person is wearing a hat
- prob_eye_open: [2], probability that each eye is open
- prob_mouth_open: [1], probability that the mouth is open
- quality_score: [1, 1], face image quality score for FIQA-enabled models
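As a rough sketch of how these outputs might be consumed, the snippet below turns the probabilities into boolean flags. The 0.5 threshold, the helper names `summarize` and `run_chc`, and the example values are assumptions for illustration only; none of them come from this project. `run_chc` uses the standard ONNX Runtime Python API and imports it lazily, so `summarize` can be tried without a model file.

```python
import numpy as np

# Probability outputs listed above; quality_score is handled separately.
PROB_OUTPUTS = (
    "prob_bg_plain", "prob_masked", "prob_sunglass",
    "prob_hat", "prob_eye_open", "prob_mouth_open",
)


def summarize(outputs, threshold=0.5):
    """Map each probability output to a bool (or a list of bools per eye)."""
    summary = {}
    for name in PROB_OUTPUTS:
        probs = np.asarray(outputs[name], dtype=np.float32).reshape(-1)
        flags = [bool(p >= threshold) for p in probs]
        key = name[len("prob_"):]
        summary[key] = flags if len(flags) > 1 else flags[0]
    if "quality_score" in outputs:  # absent in models built without FIQA
        score = np.asarray(outputs["quality_score"]).reshape(-1)[0]
        summary["quality_score"] = float(score)
    return summary


def run_chc(model_path, feeds):
    """Run the merged model on CPU and return {output_name: array}."""
    import onnxruntime as ort  # lazy import: only needed for real inference
    session = ort.InferenceSession(model_path, providers=["CPUExecutionProvider"])
    names = [o.name for o in session.get_outputs()]
    return dict(zip(names, session.run(None, feeds)))


# Made-up output values, shaped like the list above.
example = {
    "prob_bg_plain": [0.91], "prob_masked": [0.02], "prob_sunglass": [0.10],
    "prob_hat": [0.75], "prob_eye_open": [0.98, 0.12],
    "prob_mouth_open": [0.40], "quality_score": [[0.63]],
}
result = summarize(example)
print(result)
```

With a real model, `summarize(run_chc("chc_s.onnx", feeds))` would apply the same thresholding, where `feeds` maps the four input names to float32 arrays.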
Install dependencies and run the builder with uv:
uv sync
source .venv/bin/activate
uv run build-chc-onnx --variant s
The uv environment uses Python 3.13 from .python-version.
Core dependencies are pinned in pyproject.toml.
Useful options:
uv run build-chc-onnx --variant s --verify
uv run build-chc-onnx --variant s --disable-fiqa
uv run build-chc-onnx --variant s --disable-onnxsim
uv run build-chc-onnx --variant s --output /path/to/output.onnx
Download chc_*.onnx from models and place them in the root folder.
Use sit4onnx through uv after building the merged ONNX files:
uv run sit4onnx -if chc_s.onnx -tlc 100 -oep cpu
uv run sit4onnx -if chc_s_wo_fiqa.onnx -tlc 100 -oep cpu
-tlc is the inference loop count. sit4onnx reports total elapsed time and
average elapsed time per inference.
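The relationship between loop count, total time, and average time can be illustrated with a small timing helper. This is only a sketch of the measurement idea and not how sit4onnx is implemented; `time_loop` is a hypothetical name, and the callable passed to it stands in for one inference call.

```python
import time


def time_loop(run_once, loops=100):
    """Call run_once() `loops` times and return (total_ms, average_ms)."""
    start = time.perf_counter()
    for _ in range(loops):
        run_once()
    total_ms = (time.perf_counter() - start) * 1000.0
    return total_ms, total_ms / loops


# Stand-in workload; with ONNX Runtime this would wrap session.run(...).
calls = []
total_ms, avg_ms = time_loop(lambda: calls.append(1), loops=100)
print(f"total: {total_ms:.3f} ms, average: {avg_ms:.5f} ms")
```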
To benchmark with GPU providers from onnxruntime-gpu, change -oep:
uv run sit4onnx -if chc_s.onnx -tlc 1000 -oep cuda
uv run sit4onnx -if chc_s.onnx -tlc 1000 -oep tensorrt
Profiling output can be enabled with -pro:
uv run sit4onnx -if chc_s.onnx -tlc 100 -oep cpu -pro
For reproducible benchmarks with fixed input tensors, save each input as a
.npy file and pass them in graph input order:
uv run sit4onnx \
-if chc_s.onnx \
-tlc 100 \
-oep cpu \
-ifp head_image_48x48.npy \
-ifp eye_images_24x40.npy \
-ifp mouth_image_30x48.npy \
-ifp head_image_352x352.npy
For chc_s_wo_fiqa.onnx, omit head_image_352x352.npy.
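One way to produce those .npy files is sketched below. It writes seeded random float32 tensors with the input shapes listed earlier, which is enough for repeatable timing but not representative of real crops (in particular, the FIQA input is not ImageNet-normalized here). The helper name `write_fixed_inputs` and the choice of random data are assumptions; the demo writes to a temporary directory, whereas a real run would target the directory from which sit4onnx is invoked.

```python
import tempfile
from pathlib import Path

import numpy as np

# Input names and shapes from the merged model inputs, in the order used above.
INPUT_SHAPES = {
    "head_image_48x48": (1, 3, 48, 48),
    "eye_images_24x40": (2, 3, 24, 40),
    "mouth_image_30x48": (1, 3, 30, 48),
    "head_image_352x352": (1, 3, 352, 352),
}


def write_fixed_inputs(out_dir, include_fiqa=True, seed=0):
    """Save one seeded float32 tensor per model input as <name>.npy."""
    rng = np.random.default_rng(seed)
    paths = []
    for name, shape in INPUT_SHAPES.items():
        if name == "head_image_352x352" and not include_fiqa:
            continue  # models built without FIQA do not take this input
        data = rng.random(shape, dtype=np.float32)  # values in [0.0, 1.0)
        path = Path(out_dir) / f"{name}.npy"
        np.save(path, data)
        paths.append(path)
    return paths


with tempfile.TemporaryDirectory() as tmp:
    names = [p.name for p in write_fixed_inputs(tmp)]
    shapes = [np.load(Path(tmp) / n).shape for n in names]
    wo_fiqa = [p.name for p in write_fixed_inputs(tmp, include_fiqa=False)]
print(names)
```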
Download chc_*.onnx, chc_*.tflite, yolomit_*.onnx, and yolomit_*.tflite from models and place them in the root folder.
The Electron benchmark app lives in benchmark-app/ and runs ONNX Runtime Web
or LiteRT.js inside the Chromium renderer. JavaScript dependencies are pinned
exactly in package.json and locked by pnpm-lock.yaml.
cd benchmark-app
pnpm install --frozen-lockfile
pnpm dev
During dev, root-level chc_*.onnx, chc_*_float32.tflite, ONNX Runtime Web
assets, and LiteRT.js Wasm assets are served directly by the Vite asset plugin.
During vite build, the same plugin copies models into
benchmark-app/dist/models/, ONNX Runtime Web assets into
benchmark-app/dist/ort/, and LiteRT.js assets into
benchmark-app/dist/litert/wasm/. These copied assets are generated files and
are not tracked by git.
Build and smoke-test the app:
cd benchmark-app
pnpm build
pnpm benchmark:wasm
pnpm benchmark:litert:wasm
pnpm benchmark:webgpu
pnpm benchmark:litert:webgpu
The WASM smoke scripts should run anywhere Electron can start. WebGPU smoke scripts require a Chromium WebGPU-capable environment and may report that the backend is unsupported when no GPU adapter is available.
- VSDLM: Visual-only speech detection driven by lip movements - MIT License
- OCEC: Open closed eyes classification. Ultra-fast wink and blink estimation model - MIT License
- PGC: Ultrafast pointing gesture classification - MIT License
- SC: Ultrafast sitting classification - MIT License
- PUC: Phone Usage Classifier is a three-class image classification pipeline for understanding how people interact with smartphones - MIT License
- HSC: Happy smile classifier - MIT License
- WHC: Waving Hand Classification - MIT License
- UHD: Ultra-lightweight human detection - MIT License
- MWC: Mask wearing classifier. - MIT License
- SGC: Classification of wearing vs. not wearing sunglasses. 48x48. - MIT License
- HHC: Head Hat Classification. HHC is a binary classifier for cropped head images. 48x48. - MIT License