Smart Data Solutions Intern AI Engineer Internship 2026

The Smart Data Solutions Intern AI Engineer role offers students and early-career professionals hands-on experience in artificial intelligence, machine learning, and natural language processing. This full-time internship is based at Smart Data Solutions’ office in Perungudi, Chennai, India, and is designed for candidates who want to work with large datasets and current AI models.

As an Intern AI Engineer, you will contribute to developing machine learning and deep learning models for pharmaceutical document analysis, building OCR and OMR pipelines, and experimenting with large language models (LLMs) such as Qwen and NuExtract. The internship suits candidates interested in AI for healthcare, biomedical informatics, and document processing.

About Smart Data Solutions

For over 20 years, Smart Data Solutions (SDS) has partnered with leading payer organizations to provide automation and technology solutions, focusing on data standardization and workflow automation. SDS handles claims and claims-related information in any format, digitizing and normalizing it for seamless use by payer clients. With over 420 healthcare organizations as clients, SDS processes more than 500 million transactions annually and maintains a 98%+ customer retention rate.

SDS has invested heavily in AI and machine learning to improve operational efficiency and client outcomes. Partnered with Parthenon Capital, SDS continues to accelerate product innovation and expansion. Interns joining SDS will gain exposure to real-world applications of AI, machine learning, and NLP in healthcare and pharmaceutical domains.

Role Overview

As a Smart Data Solutions Intern AI Engineer, your primary responsibilities include designing and implementing machine learning models, building OCR/OMR pipelines, and extracting structured data from unstructured documents such as clinical forms, prescriptions, and regulatory filings. You will also assist in integrating and experimenting with LLMs using frameworks like LangChain, supporting retrieval-augmented generation (RAG), and enhancing search capabilities.

This internship provides exposure to real-world AI projects, collaboration with domain experts, and the chance to apply machine learning techniques to complex healthcare datasets.

Key Responsibilities

  • Develop and evaluate machine learning and deep learning models for document analysis in pharma and healthcare
  • Build OCR/OMR pipelines and extract structured information from unstructured text
  • Integrate LLMs such as Qwen, NuExtract, and other open-source models using LangChain
  • Write scalable, testable Python and Java code for backend and integration tasks
  • Assist in creating prompt templates and LLM-enhanced search capabilities
  • Support data cleaning, annotation, and labeling tasks for medical/NLP datasets
  • Collaborate with data scientists and domain experts to improve model performance and accuracy
  • Handle large PDF/TIFF document corpora and use annotation tools effectively

Who Can Apply

Criteria | Details
Education | Currently pursuing or completed Bachelor’s/Master’s in Computer Science, AI, Data Science, or related fields
Location | Chennai, India (Perungudi Office)
Duration | Full-Time Internship
Work Hours | 4:00 PM to 1:00 AM
Skills | Python, Java, OCR/OMR, NLP, Machine Learning, Deep Learning, LLM, Transformers, PyTorch, TensorFlow

Required Skills

  • Strong coding proficiency in Python and Java
  • Solid understanding of machine learning, deep learning, and NLP fundamentals
  • Hands-on experience or coursework in OCR/OMR, computer vision, and document data extraction
  • Familiarity with libraries such as Transformers (Hugging Face), OpenCV, Tesseract, spaCy, PyTorch, and TensorFlow
  • Knowledge of LLMs, LangChain, Qwen, NuExtract, or other instruction-following models
  • Ability to work independently and collaboratively in cross-functional teams

Preferred Skills

  • Background in Biomedical AI, Healthcare Informatics, or Pharmaceutical NLP projects
  • Experience working with large document datasets and annotation tools
  • Knowledge of information retrieval, prompt engineering, and LLM deployment
  • Strong analytical and problem-solving skills
  • Interest in applying AI to healthcare, pharma, and regulatory document processing

What You Will Gain

  • Hands-on experience with machine learning, NLP, and LLMs in healthcare and pharma
  • Exposure to real-world AI workflows, including OCR, OMR, and large-scale document processing
  • Practical experience with Python, Java, Transformers, PyTorch, TensorFlow, and LangChain
  • Opportunity to collaborate with experienced AI engineers, data scientists, and domain experts
  • Development of technical, analytical, and problem-solving skills in a professional environment
  • Experience with healthcare datasets, document annotation, and AI-driven workflow automation

How to Apply

To apply for the Smart Data Solutions Intern AI Engineer role, click the Apply Now button and submit your resume. Highlight your experience with Python, Java, NLP, OCR/OMR, and any AI/ML projects. Mention coursework, research, or prior internships relevant to LLMs, deep learning, or document data processing.

Conclusion

The Smart Data Solutions Intern AI Engineer role gives students and early-career professionals practical experience in AI, machine learning, NLP, and healthcare informatics. Through real-world projects, large datasets, and current LLMs, interns build skills that carry into careers in AI and data science. If you want to apply AI to healthcare and document automation, this internship is a strong starting point.
