Applied Research Scientist - AI Models

Bangalore, Karnataka
Permanent
Full-time

2 months ago

Job Description:

WHAT YOU DO AT AMD CHANGES EVERYTHING

We care deeply about transforming lives with AMD technology to enrich our industry, our communities, and the world. Our mission is to build great products that accelerate next-generation computing experiences – the building blocks for the data center, artificial intelligence, PCs, gaming and embedded. Underpinning our mission is the AMD culture. We push the limits of innovation to solve the world’s most important challenges. We strive for execution excellence while being direct, humble, collaborative, and inclusive of diverse perspectives.

AMD together we advance_

THE ROLE:

The AI Models team is looking for exceptional machine learning scientists and engineers to explore and innovate on training and inference techniques for large language models (LLMs), large multimodal models (LMMs), image/video generation and other foundation models. You will be part of a world-class research and development team focussing on efficient and scalable pre-training, instruction tuning, alignment and optimization. As an early member of the team, you can help us shape the direction and strategy to fulfill this important charter.

THE PERSON:

This role is for you if you are passionate about reading through the latest literature, coming up with novel ideas, and implementing those through high quality code to push the boundaries on scale and performance. The ideal candidate will have both theoretical expertise and hands-on experience with developing LLMs, LMMs, and/or diffusion models. We are looking for someone who is familiar with hyper-parameter tuning methods, data preprocessing & encoding techniques and distributed training approaches for large models.

KEY RESPONSIBILITIES:

Pre-train and post-train models over large GPU clusters while optimizing for various trade-offs.
Improve upon the state-of-the-art in Generative AI model architectures, data and training techniques.
Accelerate the training and inference speed across AMD accelerators.
Build agentic frameworks to solve various kinds of problems.
Publish your research at top-tier conferences, workshops and/or through technical blogs.
Engage with academia and open-source ML communities.
Drive continuous improvement of infrastructure and development ecosystem.

PREFERRED EXPERIENCE:

Strong development and debugging skills in Python.
Experience in deep learning frameworks (like PyTorch or TensorFlow) and distributed training tools (like DeepSpeed or PyTorch Distributed).
Experience with fine-tuning methods (like RLHF & DPO) as well as parameter efficient techniques (like LoRA & DoRA).
Solid understanding of various types of transformers and state space models.
Strong publication record in top-tier conferences, workshops or journals.
Solid communication and problem-solving skills.
Passion for learning new things in this domain and innovating on top of them.

ACADEMIC CREDENTIALS:

Advanced degree (Master’s or PhD) in machine learning, computer science, artificial intelligence, or a related field is expected. Exceptional Bachelor’s degree candidates may also be considered.

Benefits offered are described: .

AMD does not accept unsolicited resumes from headhunters, recruitment agencies, or fee-based recruitment services. AMD and its subsidiaries are equal opportunity, inclusive employers and will consider all applicants without regard to age, ancestry, color, marital status, medical condition, mental or physical disability, national origin, race, religion, political and/or third-party affiliation, sex, pregnancy, sexual orientation, gender identity, military or veteran status, or any other characteristic protected by law. We encourage applications from all qualified candidates and will accommodate applicants’ needs under the respective laws throughout all stages of the recruitment and selection process.

Advanced Micro Devices
