
Lead Data Scientist
- Bangalore, Karnataka
- Permanent
- Full-time
- Take ownership and be responsible for what you build - no micromanagement
- Work with A players (some of the best talent in the country) and accelerate your learning and career growth
- Make in India and build for the world at a scale of 900 million active users, which no other internet company in the country has seen
- Learn from different teams how they scale to millions of users and billions of messages.
- Explore the latest in topics like data pipelines, MongoDB, Elasticsearch, Kafka, Spark, and Samza, share what you learn with the team, and, more importantly, have fun while you work on scaling MoEngage.
- Design and own the end-to-end technical vision for our next-generation marketing platform, synthesizing recommender systems and LLMs into a cohesive architecture.
- Lead the research, design, and implementation of novel models that combine predictive signals (e.g., "what to recommend") with generative capabilities (e.g., "why it's recommended").
- Establish and champion best practices across the full modeling stack, from classical ML fundamentals to MLOps for both recommender and generative models.
- Act as the primary technical mentor for data scientists, providing guidance on everything from feature engineering to fine-tuning LLMs.
- Architect, build, and deploy large-scale recommender systems using a variety of techniques (e.g., collaborative filtering, matrix factorization, content-based).
- Solve core recommendation challenges, including the cold-start problem, real-time personalization, and balancing exploration vs. exploitation.
- Develop and implement rigorous offline and online (A/B testing) evaluation frameworks to continuously measure and improve recommendation quality and business impact.
- Leverage classical machine learning models (e.g., XGBoost, Logistic Regression) to predict user behavior (e.g., propensity to click, purchase, or churn), and use those predictions as key features in the recommendation engine.
- Lead the development of LLM-powered features that enhance our platform, such as a campaign optimiser, a creative generator, AI-generated metadata that makes customer data AI-ready, or natural language interfaces for our entire product suite.
- Spearhead efforts in fine-tuning and adapting pre-trained LLMs on our proprietary data to improve relevance, style, and factuality.
- Design and implement Retrieval-Augmented Generation (RAG) pipelines that allow LLMs to reason over our vast product or content catalogs.
- Partner with Product, Engineering, and Design leaders to translate ambitious business goals into a concrete technical roadmap.
- Communicate complex technical ideas and results effectively to a broad audience, from junior engineers to executive leadership.
- Drive projects from ideation to production, ensuring models are not only accurate but also scalable, efficient, and maintainable.
- Bachelor's/Master's degree or PhD in a quantitative field such as Computer Science, Statistics, Mathematics, or equivalent practical experience.
- 7+ years of hands-on experience building and deploying machine learning models in a business environment.
- Expert-level proficiency in Python and its data science libraries (e.g., pandas, NumPy, scikit-learn, XGBoost, Spark).
- Advanced proficiency in SQL for querying large and complex datasets.
- 2+ years of demonstrated, hands-on experience developing and deploying solutions using Large Language Models (e.g., fine-tuning, RAG, prompt engineering).
- Proven track record of leading complex, end-to-end data science projects that have delivered significant business impact.
- Experience with cloud-based ML platforms and MLOps tooling (e.g., AWS SageMaker, MLflow) and their generative AI services.
- Hands-on experience with vector databases.
- Familiarity with frameworks such as LangChain, LlamaIndex, or Agent Development Kit for building LLM applications.
- Knowledge of LLM operational concerns, including cost management, latency optimization, and responsible AI principles (bias, fairness, safety).