
Applied Research Scientist - AI Models
- Bangalore, Karnataka
- Permanent
- Full-time
- Pre-train and post-train models over large GPU clusters while optimizing for various trade-offs.
- Improve upon the state-of-the-art in Generative AI model architectures, data and training techniques.
- Accelerate training and inference on AMD accelerators.
- Build agentic frameworks to solve a variety of problems.
- Publish your research at top-tier conferences, workshops and/or through technical blogs.
- Engage with academia and open-source ML communities.
- Drive continuous improvement of infrastructure and development ecosystem.
- Strong development and debugging skills in Python.
- Experience with deep learning frameworks (like PyTorch or TensorFlow) and distributed training tools (like DeepSpeed or PyTorch Distributed).
- Experience with fine-tuning methods (like RLHF & DPO) as well as parameter-efficient techniques (like LoRA & DoRA).
- Solid understanding of various types of transformers and state space models.
- Strong publication record in top-tier conferences, workshops or journals.
- Solid communication and problem-solving skills.
- Passion for learning new things in this domain and for innovating on top of them.
- Advanced degree (Master’s or PhD) in machine learning, computer science, artificial intelligence, or a related field is expected. Exceptional Bachelor’s degree candidates may also be considered.