AI Research Intern
Origin
- Bangalore, Karnataka
- Training
- Full-time
Responsibilities
- Design and train diffusion-based generative models for realistic, high-resolution synthetic data.
- Build compact Vision–Language Models (VLMs) to caption, query, and retrieve job-site scenes for downstream perception tasks.
- Develop Vision–Language–Action (VLA) model objectives that link textual work orders to pixel-level segmentation masks.
- Architect large-scale auto-annotation pipelines that turn unlabeled images and point clouds into high-quality labels with minimal human input.
- Benchmark model accuracy, latency, and memory footprint for deployment on Jetson-class hardware; compress models via distillation or parameter-efficient fine-tuning (LoRA).
- Collaborate with perception and robotics teams to integrate research prototypes into live ROS 2 stacks.
Requirements
- Strong foundation in deep learning, probabilistic modeling, and computer vision (coursework or research projects).
- Hands-on experience with diffusion models (e.g., DDPM, Latent Diffusion) in PyTorch or JAX.
- Familiarity with multimodal transformers / VLMs (CLIP, BLIP, Flamingo, LLaVA, etc.) and contrastive pre-training objectives.
- Working knowledge of data-centric AI: active learning, self-training, pseudo-labeling and large-scale annotation pipelines.
- Solid coding skills in Python and PyTorch / Lightning, plus Git-driven workflows; bonus for C++ and CUDA kernels.
- Bonus: experience with on-device inference (TensorRT, ONNX Runtime) and synthetic-data tools (Isaac Sim).