AI Research Intern
Origin
- Bangalore, Karnataka
- Training
- Full-time
Responsibilities
- Design and train diffusion-based generative models for realistic, high-resolution synthetic data.
- Build compact Vision–Language Models (VLMs) to caption, query, and retrieve job-site scenes for downstream perception tasks.
- Develop Vision–Language–Action (VLA) model objectives that link textual work orders to pixel-level segmentation masks.
- Architect large-scale auto-annotation pipelines that turn unlabeled images and point clouds into high-quality labels with minimal human input.
- Benchmark model accuracy, latency, and memory footprint for deployment on Jetson-class hardware; compress models via distillation or parameter-efficient fine-tuning (LoRA).
- Collaborate with perception and robotics teams to integrate research prototypes into live ROS 2 stacks.
Requirements
- Strong foundation in deep learning, probabilistic modeling, and computer vision (coursework or research projects).
- Hands-on experience with diffusion models (e.g., DDPM, Latent Diffusion) in PyTorch or JAX.
- Familiarity with multimodal transformers / VLMs (CLIP, BLIP, Flamingo, LLaVA, etc.) and contrastive pre-training objectives.
- Working knowledge of data-centric AI: active learning, self-training, pseudo-labeling and large-scale annotation pipelines.
- Solid coding skills in Python and PyTorch / Lightning, plus Git-driven workflows; bonus for C++ and CUDA kernels.
- Bonus: experience with on-device inference (TensorRT, ONNX Runtime) and synthetic-data tools (Isaac Sim).