
Likith Kumar Dundigalla
Data Engineer | AI/ML Engineer
Building intelligent data pipelines and AI systems that transform raw data into actionable insights.
About Me
Background
I remember watching The Social Dilemma on Netflix—it was all about how platforms keep us
hooked. But while most people focused on the addictive nature of social media, I was more
fascinated by the tech behind it. The way data was being used to drive recommendations and
engagement blew my mind. That's what got me interested in data.
I started as a Data Engineer, working with large-scale data processing, building ETL
pipelines, and optimizing data flow for better efficiency. Along the way, I also got into
data analysis and built dashboards to derive insights, helping teams make better decisions.
But over time, I realized my role was mostly about handling the initial stages of
data—moving it, transforming it, and storing it. I wanted to go beyond that. I wanted to
understand how data could be used to make predictions, drive decisions, and power AI-driven
systems.
That's when I decided to dive deeper and pursue my Master's in Data Science. During my
studies, I worked on projects involving machine learning, natural language processing,
transformers, and large language models (LLMs). Learning how AI interprets and generates
language, how models like GPT and BERT work, and how data science can be applied in
real-world scenarios was a game-changer for me.
Now, with experience in both data engineering and data science, I see the bigger
picture—how raw data transforms into meaningful insights and AI-driven solutions. And that's
exactly what keeps me passionate about this field.
Skills
Expertise Areas
Data Engineering, Data Analytics, Machine Learning, Artificial Intelligence
Frameworks & Tools
- Data Engineering: Python, PySpark, Pandas, Polars, Airflow, Databricks, Confluent
- Data Reporting: DOMO BI, Power BI, Dataedo, Datahub
- Machine Learning & Artificial Intelligence: Libraries (Scikit-learn, TensorFlow, PyTorch), Neural Networks (RNN, CNN), HuggingFace, LLMs, Azure AI Search, Azure AI Foundry, Azure OpenAI Services
Education
Master of Science in Data Science
University of Arizona | Aug 2023 - Dec 2024 | Tucson, AZ, USA
- Relevant Coursework: Machine Learning, Deep Learning, NLP, Data Mining, Big Data Analytics
- GPA: 3.88/4.0
Bachelor of Technology in Electronics and Communication Engineering
Jawaharlal Nehru Technological University | 2018 - 2022 | Hyderabad, India
- Relevant Coursework: Database Management Systems, Data Structures, Operating Systems, Computer Networks
- GPA: 3.5/4.0
Experience
Data Engineer (Consultant)
Edmentum | June 2025 - Present | Remote
- Problem -
- Solution -
Data Consultant
CFG Health | May 2025 - August 2025 | Remote
- Problem - Siloed data and manual reporting delayed critical insights into jail healthcare staffing, hiring, and costs.
- Solution - I spearheaded CFG Health's data modernization initiative. I architected and built automated data pipelines from Paycom into Microsoft Fabric, developed Power BI dashboards for expenditure reporting, and deployed a RAG system for natural language queries. This end-to-end solution eliminated manual work, provided leadership with real-time financial visibility, and empowered non-technical teams to get instant answers from data, accelerating decision-making by 100%.
Software Engineer
Tricon Infotech LLC | May 2025 - Present | Remote
- Worked for client CFG Health as a Data Consultant; currently working as a Data Engineer for Edmentum.
AI Programmer (Volunteer)
Acts4Unity Foundation | Feb 2025 - May 2025 | Palo Alto, CA, USA
- Solely developed a multi-modal AI video agent to serve as an accessible online mental health companion. By integrating D-ID for realistic animation, Gemini LLM for empathetic conversation, and Wav2Vec for speech processing, the agent provides a responsive and natural interface. This allows for fluid, voice-based interactions that mimic a real-time conversation. All interactions are securely tracked in a database to analyze conversational patterns and improve the AI's therapeutic response capabilities over time, creating a more effective support system.
Data Science Intern
Tricon Infotech LLC | May 2024 - Jul 2024 | Hoboken, NJ, USA
- Problem - Manually tagging thousands of educational files was a slow and costly process, hindering content discoverability for students.
- Solution - We built an AI-driven system that automates classification with 95% accuracy using a pipeline of GPT and BERT embeddings, storing the results in a Neo4j knowledge graph. This solution decreased tagging time by 100% and enhanced content retrieval by 100%. To make the new tags accessible, I created a Gemini-powered chatbot that translates natural-language student queries into Cypher, cutting query response time by 70% and providing an intuitive search experience via a browser extension.
Operations Analyst (Part-Time)
University of Arizona (Aramark, Student Union) | Nov 2023 - Dec 2024 | Tucson, AZ, USA
- Problem - Inefficient scheduling and inventory management at over 30 events were leading to long wait times, high food waste, and suboptimal staffing, eroding profit margins.
- Solution - Spearheaded data analysis for 30+ events using Excel and Power BI to optimize scheduling and inventory. My initiatives in forecasting and dashboard creation directly boosted profit margins by 15%, reduced food waste by 25%, and improved labor efficiency by 10% by providing managers with actionable, real-time insights.
Data Engineer (Consultant)
HealthEdge | Jan 2023 - Jun 2023 | Alexandria, VA, USA
- Problem - A legacy batch-processing system updated only every six hours and provided no real-time patient data.
- Solution - Designed and implemented HIPAA-compliant scalable real-time data streaming solutions using Azure and Kafka, automated SQL procedures to handle over 1 million weekly transactions, and integrated robust monitoring with DataDog. This new system increased data processing efficiency by 90%, significantly reduced data latency, and enhanced production stability by cutting downtime by 40%, finally delivering the real-time visibility that clinical teams required.
Data Engineer (Consultant)
Informa | Apr 2022 - Jan 2023 | New York, NY, USA
- Problem - A critical data leakage issue in event registrations threatened a $5M contract. The existing data infrastructure was also costly and inefficient, while low user engagement highlighted a need for more personalized event discovery.
- Solution - Delivered an immediate solution by building AWS Spark ETL pipelines. These pipelines integrated 50+ global event data sources into DOMO with 100% validation accuracy, while new governance dashboards eliminated the leakage. Additional outcomes included a migration of 10M+ records to PostgreSQL that tripled performance and cut costs by 25%, plus automated alerts that reduced resolution time by 99%. Furthermore, an AI-driven recommendation system was implemented to suggest nearby events, increasing user engagement by 40%.
Associate Software Engineer
Tricon Infotech Pvt Ltd | Apr 2022 - Jun 2023 | Remote
- Worked as a Data Engineer on client projects for Informa Markets and HealthEdge.
Projects
Restaurant Chatbot | MenuData.ai
A restaurant recommendation engine built on San Francisco-based restaurant data, enriched with live data from Wikipedia and newapi.
Human or ChatGPT Generated Content?
This project implements multiple NLP-based classification techniques to distinguish between human-written and ChatGPT-generated text.
Multi-label Emotion Detection
This project implements multiple NLP-based classification techniques to predict emotions from text data, identifying sentiments such as admiration, amusement, gratitude, love, pride, relief, and remorse.
Contact
Let's Connect!
likithkumard04@gmail.com
github.com/LikithKumarDundigalla
linkedin.com/in/likithkumardundigalla-data-ai