I'm Ishmeen, a Data Scientist building intelligent systems at the intersection of ML, NLP & Deep Learning.
About
I'm a Master's in Data Science student at Columbia University with a background in Computer Engineering (CGPA 9.66/10) and 5 peer-reviewed publications in IEEE and Springer. My work spans machine learning, NLP, and deep learning — applied to real-world challenges in healthcare, traffic safety, and social impact. I've interned at Deloitte and IIT Roorkee, and currently serve as a Graduate Student Assistant at the Northeast Big Data Innovation Hub at Columbia.
Experience
Northeast Big Data Innovation Hub
Graduate Student Assistant · Columbia University
Jan 2026 – Present
- Developing healthcare digital twin prototype integrating multi-source physiological datasets for patient-level treatment simulation.
- Constructing structured and time-series feature pipelines to support predictive modeling experiments.
- Delivered Advanced Excel workshop to 25+ MS Data Science students; currently developing an Advanced SQL workshop.
Indian Institute of Technology (IIT), Roorkee
Project Intern
May 2025 – Aug 2025
- Analyzed 5,000+ vehicular trajectories using spatiotemporal modeling, validating data quality and consistency with <2% error.
- Engineered behavioral and efficiency metrics (gap acceptance, yielding, speed compliance) across 10+ maneuver types, improving model classification accuracy by 18%.
- Designed and evaluated ML-based decision systems using fine-tuned BERT and retrieval-augmented pipelines, achieving a 22% accuracy lift over rule-based baselines.
Deloitte Touche Tohmatsu India LLP - Financial Advisory
Intern
Jun 2024 – Jul 2024
- Automated FX deal valuation pipeline using QuantLib with REST API integrations, reducing manual valuation processing time by 75%.
- Built Django-based visualization dashboard with CI/CD deployment, adopted by 40+ consultants across the practice, enhancing reporting efficiency.
- Analyzed RBI vs. NBFC climate risk policies, identifying 3 major gaps with actionable mitigation proposals.
- Presented automation framework and policy recommendations to 3 Partners and 15+ senior leaders.
FCRIT, University of Mumbai
Research Intern
Jul 2024 – Jan 2025
Published research at the intersection of NLP, Quantum Computing, and Blockchain in IEEE Xplore and Springer.
Education
Columbia University
Master of Science in Data Science, GPA: 3.6/4
Expected Dec 2026
Relevant Coursework: Machine Learning, Statistical Inference, Causal Inference, Forecasting, Applied Deep Learning, Data Visualization.
Collaborating with Columbia Business School MBA students through Analytics in Action to solve real industry challenges.
Fr. Conceicao Rodrigues Institute of Technology
Bachelor of Engineering in Computer Science
Graduated Jun 2025
CGPA: 9.66/10.0 — Graduated in the top 10% of the class. Core coursework in Data Structures, Operating Systems, Machine Learning, Big Data Analytics, AI, Blockchain, and Cryptography. Published 5 research papers during undergraduate studies.
Skills & Technologies
The tools and technologies I work with every day.
Projects
A selection of projects I've built — from data analysis to cloud-deployed apps.
- Heat's Toll on Health (Exploratory Data Analysis and Visualization)
- FoodMO: A Food Nutrient Analysis Application (Application)
- Physics IRL (Augmented Reality)
- MyReckList: Movie Recommendation System (Recommendation System)
- Umanità (Website)
- ECG Report Analysis Leveraging Generative AI (Healthcare AI)
- ToDo List (Cloud Computing)
- Bridgerton Watch Party (Web Application)
Heat's Toll on Health
Collaborated on an EDAV course project investigating extreme heat's impact on U.S. public health. Applied data wrangling, visual analytics (R, D3.js, ggplot2), and statistical methods to identify trends in heat-related hospitalizations, emergency visits, and mortality across regions and demographics. The analysis reveals which populations are most vulnerable and provides data-driven insights to inform public health interventions.
- Data Visualization
- R Programming
- D3.js
FoodMO: A Food Nutrient Analysis Application
Built a mobile application that uses Optical Character Recognition (OCR) to extract text from food labels and a Random Forest classifier to analyze nutritional content. The system provides personalized healthier-alternative recommendations using cosine similarity, helping users make informed dietary choices. Published as a research paper in Springer LNNS.
- OCR
- Random Forest
- Cosine Similarity
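The similarity-based recommendation step described above can be illustrated with a minimal sketch. It is not FoodMO's actual code: the nutrient features, the product names, and the helper `rank_alternatives` are all hypothetical, and the sketch only ranks candidates by similarity. Any rule for what counts as "healthier" would have to be applied separately.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat an all-zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def rank_alternatives(query, candidates):
    """Return (name, score) pairs sorted from most to least similar to query."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical per-serving features: [sugar_g, fat_g, protein_g, fiber_g]
query = [12.0, 8.0, 3.0, 1.0]
candidates = {
    "oat_bar": [10.0, 7.0, 4.0, 2.0],
    "protein_bar": [3.0, 6.0, 20.0, 4.0],
    "yogurt_cup": [9.0, 2.0, 6.0, 0.0],
}
ranked = rank_alternatives(query, candidates)
for name, score in ranked:
    print(f"{name}: {score:.3f}")
```

Cosine similarity compares the proportions of the features rather than their absolute amounts, so a product with twice the sugar and twice the fat scores the same as the original. That is one reason a separate health filter would be needed in practice.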
Physics IRL
An augmented reality (AR) application that makes physics tangible. Using Unity's AR Plane Manager, it detects horizontal surfaces where users can place 3D objects to study gravitational force. AR Tracked Image Manager tracks predefined images for frictional force simulations. Built with custom C# scripts and realistic textures, it provides an immersive, interactive learning experience.
- Unity
- C#
- Object Detection
ToDo List: Cloud-Based Multi-Device To-Do List
A scalable, cloud-based task management application built with Django and deployed on AWS EC2 with auto-scaling. Supports multi-user real-time collaboration with full CRUD operations, minimal latency, and high availability. Responsive design ensures accessibility across all devices.
- Django
- AWS EC2
- Security Groups
MyReckList
A recommendation engine that suggests movies and books tailored to user preferences using the Alternating Least Squares (ALS) algorithm. Implements collaborative filtering to analyze rating patterns and deliver the top 10 most relevant recommendations, ensuring a highly personalized discovery experience.
- Collaborative Filtering
- ALS Algorithm
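To show how ALS-based collaborative filtering works in principle, here is a small pure-Python sketch. It is an illustration under stated assumptions, not MyReckList's implementation: the rank is fixed at 2 so each least-squares step can be solved in closed form, and the ratings, the regularization value, and all function names are hypothetical.

```python
import random


def solve_2x2(a00, a01, a11, b0, b1):
    """Solve the symmetric system [[a00, a01], [a01, a11]] x = [b0, b1] by Cramer's rule."""
    det = a00 * a11 - a01 * a01
    return [(b0 * a11 - a01 * b1) / det, (a00 * b1 - a01 * b0) / det]


def update(fixed, observed, reg):
    """Ridge-regression update for one row, given the other side's fixed factors.

    observed is a list of (partner_index, rating) pairs for that row.
    """
    a00 = a01 = a11 = b0 = b1 = 0.0
    for j, r in observed:
        f0, f1 = fixed[j]
        a00 += f0 * f0
        a01 += f0 * f1
        a11 += f1 * f1
        b0 += r * f0
        b1 += r * f1
    # Adding reg to the diagonal keeps the system solvable even with no ratings.
    return solve_2x2(a00 + reg, a01, a11 + reg, b0, b1)


def als(ratings, n_users, n_items, reg=0.1, iterations=20, seed=0):
    """Rank-2 ALS on a dict mapping (user, item) -> rating (observed entries only)."""
    rng = random.Random(seed)
    users = [[rng.random(), rng.random()] for _ in range(n_users)]
    items = [[rng.random(), rng.random()] for _ in range(n_items)]
    by_user = [[] for _ in range(n_users)]
    by_item = [[] for _ in range(n_items)]
    for (u, i), r in ratings.items():
        by_user[u].append((i, r))
        by_item[i].append((u, r))
    for _ in range(iterations):
        # Alternate: fix item factors and solve for users, then the reverse.
        users = [update(items, by_user[u], reg) for u in range(n_users)]
        items = [update(users, by_item[i], reg) for i in range(n_items)]
    return users, items


def predict(users, items, u, i):
    """Predicted rating is the dot product of the user and item factors."""
    return users[u][0] * items[i][0] + users[u][1] * items[i][1]


def recommend(user, ratings, users, items, n=10):
    """Top-n unrated items for a user, ordered by predicted rating."""
    unseen = [i for i in range(len(items)) if (user, i) not in ratings]
    return sorted(unseen, key=lambda i: predict(users, items, user, i), reverse=True)[:n]


# Hypothetical ratings on a 1-5 scale: 4 users, 4 items, one unrated item per user.
ratings = {
    (0, 0): 5, (0, 1): 4, (0, 2): 1,
    (1, 0): 4, (1, 1): 5, (1, 3): 1,
    (2, 0): 1, (2, 2): 5, (2, 3): 4,
    (3, 1): 1, (3, 2): 4, (3, 3): 5,
}
user_factors, item_factors = als(ratings, n_users=4, n_items=4)
print(recommend(0, ratings, user_factors, item_factors))  # [3]: the only item user 0 has not rated
```

A real system would use a higher rank and a general linear solver, for example through a library implementation of ALS. The alternating structure, fixing one side and solving a regularized least-squares problem for the other, stays the same.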
Umanità
A platform designed for NGOs to create and manage customizable websites that showcase their missions. Built with HTML, CSS, and JavaScript, it features donation integrations, event listings, and volunteer sign-up forms — enabling organizations to enhance their online presence and community engagement.
- Web Development
- HTML/CSS/JS
ECG Report Analysis Leveraging Generative AI
A healthcare tool that uses Convolutional Neural Networks to classify ECG reports with 92% accuracy across multiple heart conditions. Integrates Generative AI to deliver simplified, profile-aware analyses — suggesting lifestyle changes and locating nearby cardiologists. Ongoing development focuses on predictive insights for future cardiac health.
- CNN
- Generative AI
- Healthcare
Bridgerton Watch Party
An interactive, highly stylized web application for a Bridgerton watch party, deployed on Netlify. Features include a Regency-era language translator powered by Gen AI, gossip polls, a tea spill board, character quizzes, and live chat.
- Web Development
- Gen AI
Publications
Peer-reviewed research published in IEEE and Springer venues.
Emerging Applications and Challenges in Quantum Computing: A Literature Survey
Literature Survey
An extensive survey of current quantum computing applications, highlighting emerging opportunities and key challenges for future growth.
Topic Modeling for Identifying Emerging Trends on Instagram Using LDA and NMF
Research Paper
Investigates trend identification on Instagram by leveraging Latent Dirichlet Allocation (LDA) and Non-Negative Matrix Factorization (NMF), offering insights into evolving online trends.
Blockchain-Based Smart Contracts for Decentralized Vehicle-to-Grid (V2G) Load Management
Smart Contract Implementation & Research
Presents a decentralized V2G system using Solidity smart contracts on Ethereum to dynamically manage grid demand while ensuring secure energy transactions between EVs and the power grid.
FoodMO: A Food Nutrient Analysis Application Using OCR and Machine Learning
Technical Paper
Details the development of FoodMO — a mobile application that uses OCR and machine learning to analyze food nutrients and provide healthier dietary recommendations.
Natural Language Processing: A Survey of Approaches, Applications, and Future Directions
Literature Review
A comprehensive survey of NLP techniques covering the latest approaches, real-world applications, and promising future directions for research.
Get In Touch
Whether you're exploring a collaboration, looking for a data-driven problem-solver, or just want to connect about ML, NLP, or data science — I'd love to hear from you.