Lead Data Scientist

India
Permanent
Full-time

21 days ago

JOB DESCRIPTION

Designation: Lead Data Scientist
Location: Hyderabad, India
Work Mode: Office
Reporting to: Director/Head of Data Science

About Us:
Foundation AI automatically ingests incoming documents, emails, and attachments from across your firm. It profiles, matches, classifies, and saves each to your DMS and then automates document-dependent workflows according to your rules. Read more about us at

Job Overview:
As a Lead Data Scientist, you will serve as a senior technical expert, driving the design and delivery of advanced AI/ML and LLM-based solutions from concept to production. This is an individual contributor role with a strong alignment and coordination component, ensuring technical excellence, stakeholder engagement, and seamless integration across product, engineering, and delivery teams.

The ideal candidate combines deep hands-on expertise in AI/ML/LLM/RAG systems with practical problem-solving acumen, delivering scalable solutions to complex business challenges.

Responsibilities:

1. AI/ML/LLM/RAG Solution Development

Architect and implement end-to-end AI/ML solutions, leveraging structured and unstructured data (text, images, metadata, multimodal sources).
Lead development and deployment of LLM-powered systems for classification, retrieval-augmented generation (RAG), information extraction, and summarization.
Apply advanced techniques such as LoRA, PEFT, quantization, and model distillation to optimize model performance, latency, and cost.
Design and implement robust retrieval pipelines with vector search, hybrid retrieval, and reranking strategies.

2. Technical Excellence & Best Practices

Ensure code quality, reproducibility, and model governance across the AI/ML lifecycle — from experimentation to production release.
Establish and promote best practices for data processing, feature engineering, model evaluation, and monitoring.
Conduct rigorous performance benchmarking and failure analysis, ensuring models meet accuracy, throughput, and reliability targets.

3. Product Lifecycle & Stakeholder Alignment

Partner closely with product managers, engineers, and customer-facing teams to translate business needs into AI/ML requirements.
Align model development milestones with product release schedules, ensuring timely and high-quality delivery.
Serve as a technical advisor during project scoping, prioritization, and release readiness reviews.

4. Research, Innovation & Thought Leadership

Stay current with cutting-edge research in AI, ML, NLP, LLMs, multimodal models, retrieval systems, and document intelligence.
Proactively identify opportunities to enhance existing algorithms and develop novel approaches.
Share learnings through internal tech talks, documentation, and mentorship to foster a culture of innovation.

Skills and Tools:

LLM & GenAI Expertise: Minimum 2 years of hands-on experience fine-tuning, prompting, and deploying LLMs (commercial and open-source) such as GPT, Claude, Gemini, Mistral, LLaMA, Falcon, Vicuna, MPT, T5.
NLP & Information Retrieval: At least 4 years working with NLP tasks — classification, NER, summarization, QA, RAG architectures — using Transformer-based models (BERT, RoBERTa, T5, etc.).
Deep Learning Frameworks: Strong experience with PyTorch and/or TensorFlow for model training and deployment.
Coding & Engineering: Expert-level Python; strong SQL; experience with FastAPI/Flask for serving models; Git proficiency.
Data & Infra: Proficiency with PostgreSQL and vector databases (Pinecone, Qdrant, Weaviate, etc.); familiarity with Docker/Kubernetes.
MLOps & Scaling: Experience with MLflow/Kubeflow/SageMaker or equivalent for training pipelines, deployment, and monitoring at scale.
Prompt Engineering: Skilled in CoT, self-consistency, ToT, and advanced prompting for LLM optimization.
Ability to simplify complex technical concepts for diverse stakeholders.
Strong problem-solving skills with a bias toward scalable, maintainable solutions.
Excellent communication and documentation skills.
Track record of aligning technical execution with business priorities and delivery timelines.
Experience with multimodal models (text + vision).
Understanding of knowledge graph construction and integration.
Familiarity with cloud services (AWS preferred).
Exposure to compliance and governance requirements in AI systems.

Education:
Bachelor’s or Master’s degree in Computer Science, Data Science, Electrical Engineering, Statistics, or a related discipline from a recognized Tier-1 or Tier-2 institution.

Our Commitment:
At Foundation AI, we're committed to creating an inclusive and diverse workplace. We value equal opportunity and affirmative action principles, giving everyone an equal chance to succeed. We're dedicated to offering equal employment opportunities regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, or veteran status. Upholding these values and adhering to applicable laws is paramount to us.

Competencies:

| Areas | Tools/Skills | Must / Good to Have | Experience |
|---|---|---|---|
| LLM / GenAI Development | GPT, Claude, Gemini, Mistral, LLaMA, T5, Falcon, Vicuna, MPT, OpenAI APIs | Must Have | Min 2 Years |
| NLP Expertise | NLTK, spaCy, HuggingFace Transformers, SentenceTransformers | Must Have | Min 4 Years |
| RAG & Information Retrieval | LangChain, LlamaIndex, Pinecone, Qdrant, Weaviate | Must Have | Min 1 Year |
| Production-Scale Model Deployment | MLflow, Kubeflow, Ray Serve, SageMaker, CI/CD for ML | Must Have | Min 2 Years |
| Transformer & Advanced Architectures | BERT, RoBERTa, T5, LLaMA-based fine-tuning | Must Have | Min 3 Years |
| Python & Software Engineering | NumPy, Pandas, Scikit-learn, FastAPI, Flask, REST APIs | Must Have | Min 4 Years |
| Prompt Engineering | CoT, Self-Consistency, ToT, Retrieval-Enhanced Prompting | Must Have | Min 1 Year |
| Vector & Relational Databases | PostgreSQL, Pinecone, Qdrant | Must Have | Min 1 Year |
| Deep Learning Frameworks | PyTorch, TensorFlow | Must Have | Min 3 Years |
| Containerization & Orchestration | Docker, Kubernetes | Must Have | - |
| Cloud & MLOps | AWS (S3, EC2, Lambda, SageMaker), GCP, Azure | Good to Have | - |
| Computer Vision / Multimodal AI | ResNet, YOLO, CLIP, BLIP | Good to Have | - |
| Model Optimization Techniques | LoRA, PEFT, Quantization, Pruning, Distillation | Must Have | Min 1 Year |
| Stakeholder Alignment & Product Integration | Agile, Scrum, cross-functional collaboration | Must Have | - |

For any feedback or inquiries, please contact us at
Learn more about us at

Foundation AI