
AI Performance Architect
- Bangalore, Karnataka
- Permanent
- Full-time
- Benchmark AI workloads (LLMs) in single-node and multi-node high-performance GPU configurations.
- Project and analyze system performance for LLMs using various parallelization techniques.
- Develop methodologies to measure key performance metrics and understand bottlenecks to improve efficiency.
- Understanding of transformer-based model architectures and basic GEMM operations.
- Strong programming skills in Python and C/C++.
- Proficiency in systems (CPU, GPU, Memory, or Network) architecture analysis and performance modelling.
- Experience with parallel computing architectures, interconnect fabrics, and AI workloads (fine-tuning/inference).
- Experience with DL frameworks (PyTorch, TensorFlow), profiling tools (Nsight Systems, Nsight Compute, rocprof), and containerized environments (Docker).