
Senior AI/NLP Engineer
- Chennai, Tamil Nadu
- Permanent
- Full-time
- Design, build, and optimize document parsing pipelines using tools like Amazon Textract, Azure Form Recognizer, or Google Document AI. Perform data preprocessing, labeling, and annotation for training machine learning and NLP models. Fine-tune or train models for tasks such as Named Entity Recognition (NER), text classification, and layout understanding using PyTorch, TensorFlow, or HuggingFace Transformers. Integrate document intelligence capabilities into larger workflows and applications using REST APIs, microservices, and cloud components (e.g., AWS Lambda, S3, SageMaker). Evaluate model and OCR accuracy, applying post-processing techniques or heuristics to improve precision and recall. Collaborate with data engineers, DevOps, and product teams to ensure solutions are robust, scalable, and meet business KPIs. Monitor, debug, and continuously enhance deployed document AI solutions. Maintain up-to-date knowledge of industry trends in OCR, Document AI, NLP, and machine learning.
- 4-5 years of hands-on experience in machine learning, document AI, or NLP-focused roles. Strong expertise in OCR tools and frameworks, especially Amazon Textract, Azure Form Recognizer, Google Document AI, or open-source tools like Tesseract, LayoutLM, or PaddleOCR. Solid programming skills in Python and familiarity with ML/NLP libraries: scikit-learn, spaCy, transformers, PyTorch, TensorFlow, etc. Experience working with structured and unstructured data formats, including PDF, images, JSON, and XML. Hands-on experience with REST APIs, microservices, and integrating ML models into production pipelines. Working knowledge of cloud platforms, especially AWS (S3, Lambda, SageMaker) or their equivalents. Understanding of NLP techniques such as NER, text classification, and language modeling. Strong debugging, problem-solving, and analytical skills. Clear verbal and written communication skills for technical and cross-functional collaboration.