
Python/FlaskAPI/Kubernetes + AI Platform Development Lead - Vice President - Software Engineering
- Bangalore, Karnataka
- Permanent
- Full-time
- Develop tooling and self-service capabilities for deploying AI solutions for the firm. Collaborate with other developers to enhance the developer experience when building and deploying AI applications.
- Have a platform mindset and build common, reusable solutions to scale Generative AI use cases using pre-trained models as well as fine-tuned models.
- Collaborate with product managers, other tech leads, junior staff, and other stakeholders to analyze requirements and translate them into technical specifications and architecture documentation.
- Design scalable, robust, secure, and flexible architecture of components of the AI development platform.
- Leverage Kubernetes/OpenShift to develop modern containerized workloads.
- Leverage container registries like JFrog Artifactory, container packaging/configuration management technologies like Helm and Kustomize, and GitOps deployment methods to orchestrate, manage, and deploy these workloads.
- Integrate with capabilities such as large-scale vector stores for embeddings.
- Author best practices on the Generative AI ecosystem, including when to use which tools, available models such as GPT and Llama, model hubs such as Hugging Face, and libraries such as LangChain.
- Analyze, investigate, and implement GenAI solutions focusing on Agentic Orchestration and Agent Builder frameworks.
- Contribute to major design decisions and product selection for building Generative AI solutions, including app authentication, service communication, state externalization, container layering strategy, and immutability.
- Ensure AI platforms are reliable, scalable, and operational (e.g., blueprints for upgrade/release strategies such as Blue/Green; logging/monitoring/metrics; automation of system management tasks).
- Participate in all of the team's Agile/Scrum ceremonies.
- At least 6 years of relevant experience would generally be expected to have developed the skills required for this role.
- 5+ years of experience architecting distributed systems.
- 1+ years of experience building AI applications, preferably Generative AI and LLM-based apps (desirable).
- Strong hands-on application development background in at least one prominent programming language, preferably Python with Flask or FastAPI.
- Broad understanding of data engineering (SQL, NoSQL, Big Data, Kafka, Redis), data governance, data privacy and security.
- Experience in development, management, and deployment of Kubernetes workloads, preferably on OpenShift.
- Experience with designing, developing, and managing RESTful services for large-scale enterprise solutions.
- Hands-on experience with multiprocessing, multithreading, asynchronous I/O, and performance profiling in at least one prominent programming language, preferably Python.
- Practitioner of unit testing, performance testing and BDD/acceptance testing.
- Understanding of OAuth 2.0 protocol for secure authorization.
- Proficiency with OpenTelemetry and observability tools including Grafana, Loki, Prometheus, and Cortex.
- Demonstrated experience in DevOps, understanding of CI/CD (Jenkins) and GitOps.
- Ability to articulate technical concepts effectively to diverse audiences.
- Strong desire and ability to influence development teams and help them adopt AI.
- Demonstrated ability to work effectively and collaboratively in a global organization, across time zones, and across organizations.
- Understanding of deep learning and of machine learning frameworks such as TensorFlow or PyTorch.
- Understanding of information security and secure coding practices.
- Experience in building cloud and container native applications.
- Knowledge of DevOps and Agile practices.
- Excellent communication skills.
- Strong understanding of, and experience in, designing and building software following cloud-native architecture with microservices, Kubernetes, Docker, Kafka, etc.
- Good knowledge of microservice-based architecture and industry standards for both public and private cloud.
- Good understanding of modern Application configuration techniques.
- Hands-on experience with cloud application deployment patterns like Blue/Green.
- Good understanding of State sharing between scalable cloud components (Kafka, dynamic distributed caching).
- Good knowledge of various DB engines (SQL, Redis, Kafka, etc.) for cloud app storage.
- Experience building AI applications, preferably Generative AI and LLM based apps.
- Deep understanding of AI agents, Agentic Orchestration, and Multi-Agent Workflow Automation, along with hands-on experience in Agent Builder frameworks such as LangChain and LangGraph.
- Experience working with Generative AI development, embeddings, and fine-tuning of Generative AI models.
- Understanding of ModelOps/MLOps/LLMOps.
- Understanding of SRE techniques.
- Understanding of Responsible AI, AI Ethics.