I work on one problem: how sparse neural systems learn to route computation — and when routing actually helps.
I'm currently a research assistant in Prof. Anna Choromanska's lab at NYU, working on self-supervised world models for autonomous driving with LiDAR.
What I'm building
- AD-LiST-JEPA — spatiotemporal JEPA world model for autonomous driving; predicts future BEV LiDAR embeddings without labels or contrastive pairs
- KAN-Multi — routing layer that selects among 6 function bases with zero supervision; +6.8% over MLP on CIFAR-100
- MoE-Bench — open diagnostic toolkit for expert collapse & routing entropy in sparse MoE LLMs (OLMoE, JetMoE, Qwen)
What I care about
Self-supervised learning · Sparse MoE architectures · Neural routing · World models · LiDAR perception
Stack
Python · PyTorch · C/C++ · Go · HuggingFace · Docker · FastAPI · AWS
Notable open-source contributions
- vllm-project/vllm #44795: Fix nightly Docker ImportError: AnthropicOutputConfig
- NVIDIA-NeMo/Automodel #2601: Re-tie lm_head to active embed_tokens on Gemma4 MoE path
- NVIDIA-NeMo/Automodel #2709: Cherry-pick #2601 into r0.5.0

