Description
When decord is imported before torch in FlashVstream.py, the process may hang while importing torch (around torch.torch_version) or abort with a core dump.
The root cause is a known OpenMP runtime conflict: many decord/FFmpeg builds link GNU OpenMP (libgomp), while PyTorch wheels with MKL pull in Intel OpenMP (libiomp5). Loading both in the same process, especially when libgomp is initialized first, can lead to symbol/initialization conflicts and aborts.
(above is generated by gpt5)
Minimal Repro
# minimal_repro.py
import decord # loads libgomp first
import torch # then torch (with MKL) tries to init libiomp5 -> conflict
print("ok")
Run
python -X importtime minimal_repro.py
On affected systems this either hangs around torch.torch_version or aborts with messages like:
OMP: Error #15: Initializing libiomp5.so, but found libgomp already initialized.
Aborted (core dumped)
(or sometimes no explicit OMP message, just a core dump).
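When there is no explicit OMP message, it can help to check which OpenMP runtimes the process has actually mapped. The following is a minimal diagnostic sketch, assuming Linux (it reads /proc/self/maps); the function names are hypothetical and not part of decord or PyTorch.

```python
import re

# Matches file names of the two runtimes discussed above, including
# versioned or hash-suffixed copies such as libgomp-<hash>.so.1.
_OPENMP_PATTERN = re.compile(r"(libgomp|libiomp5)[^/\s]*\.so[^/\s]*")


def openmp_runtimes_in_maps(maps_text):
    """Return the sorted, unique OpenMP library paths found in maps text."""
    paths = set()
    for line in maps_text.splitlines():
        # Format: address perms offset dev inode pathname
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # anonymous mapping, no pathname
        path = fields[5]
        if _OPENMP_PATTERN.search(path.rsplit("/", 1)[-1]):
            paths.add(path)
    return sorted(paths)


def loaded_openmp_runtimes():
    """Inspect the current process (Linux only)."""
    with open("/proc/self/maps") as f:
        return openmp_runtimes_in_maps(f.read())
```

Calling loaded_openmp_runtimes() right after import decord, and again after import torch in a run that survives, shows whether both libgomp and libiomp5 end up in the same process.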
Environment
• OS: Linux (manylinux wheels)
• GPU: RTX 4090
• Driver: 560.35.05 (CUDA 12.6)
• Python: 3.10
• PyTorch: 2.0.1+cu117 (official wheel with MKL)
• decord: (pip wheel)
• Others (not required to repro): transformers/accelerate/deepspeed/bitsandbytes present in env
Note: The issue is import-order dependent, and I am not sure whether it reproduces on other machines; at least it cost me an entire morning XD
Current Code (problematic order)
In src/model/FlashVstream.py:
import requests
from decord import VideoReader, cpu
import torch
...
This imports decord before torch, frequently triggering the conflict.
Expected Behavior
The code should not hang or abort due to OpenMP runtime conflicts. Importing torch should succeed reliably regardless of whether decord is present in the environment.
Actual Behavior
• Freeze during import torch (often around torch.torch_version)
• Or immediate abort with a core dump (sometimes with OMP #15 error)
Proposed Fixes (any of the following)
1. Change import order: import torch before decord.
2. Lazy import: move from decord import VideoReader, cpu inside load_video() so decord is only imported when actually needed.
3. Unify OpenMP runtime: set environment variables early in the entry module, before torch (or anything else that loads MKL) is imported, so that MKL uses GNU OpenMP:
import os
os.environ.setdefault("MKL_THREADING_LAYER", "GNU")
os.environ.setdefault("OMP_NUM_THREADS", "1")
I tried the first two methods and they work well.
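For reference, fixes 1 and 2 can be combined roughly as follows. This is only a sketch: the real signature and body of load_video() in FlashVstream.py are not shown in this issue, so the parameters here are assumptions, and decord_is_loaded() is a hypothetical helper added just to make the effect visible.

```python
import sys


def decord_is_loaded():
    """True once decord has been imported into this process."""
    return "decord" in sys.modules


def load_video(video_path, num_threads=1):
    """Open a video with decord, importing decord only at call time.

    The parameters are placeholders; adapt them to the real function.
    """
    # Import torch first so its OpenMP runtime is initialized before
    # decord pulls in libgomp.
    import torch  # noqa: F401
    from decord import VideoReader, cpu

    return VideoReader(video_path, ctx=cpu(0), num_threads=num_threads)
```

With this layout, importing the module no longer imports decord at all; it is loaded on the first load_video() call, by which point torch is already initialized.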