Summary
Driven Data Engineer and Analytics Enthusiast with a passion for designing and optimizing data solutions. A proactive self-learner and motivator with a deep appreciation for time management, confidentiality, and integrity.
Committed to delivering impactful outcomes by leveraging data engineering expertise to meet and exceed organizational goals.
I’m a data-driven problem solver with a passion for turning raw numbers into powerful insights! With expertise in analytics, machine learning, and visualization, I decode complex data, fuel business growth, and craft intelligent solutions. Let’s transform data into impact and shape the future—one insight at a time!
Experience
System Analyst – DMV Music Alliance
 ● Gather and document business and technical system requirements.
 ● Design workflows, system architecture, and data flow diagrams.
 ● Collaborate with technical and non-technical teams/departments to implement integrated solutions.
 ● Provide integration support for third-party tools, APIs, and internal systems.
 ● Conduct system testing and support.
 ● Develop reports and dashboards.
 ● Perform research and go-to-market analysis to inform system and product strategies.
 ● Support change management, training, and user documentation.
 ● Act as a liaison between business stakeholders and IT/development teams.
Data Analyst Intern – Precise Software Solution
As a developer in an agile environment, I am creating an application that leverages NLP, image segmentation, and web crawling technologies to monitor and flag non-compliant sales activities. This innovative solution integrates advanced AI techniques to proactively identify irregularities in market transactions while adhering to strict regulatory standards.
Machine Learning Intern – Curaksha LLP
Focused on optimizing data extraction processes and developing advanced predictive models to analyze complex datasets. Utilized Python libraries to preprocess data, build models, and visualize results, improving trend identification accuracy and supporting data-driven decision-making.
Education
George Mason University
 Master’s degree Data Analytics Engineering 3.74
 2023 – 2025
Kishinchand Chellaram (KC) College, Mumbai
 Master of Science in Information Technology 9.10
 2020 – 2022
Kishinchand Chellaram College – India
 BSC IT Computer Programming 8.30
 2017 – 2020
Skills
- Programming Languages: Python, R, Statistics, VBA.
- Databases: PL/SQL, SQL, Relational Databases.
- Business Analytics Tools: Alteryx, Tableau, Power BI, Word, Excel, PowerPoint.
- Methods: Data Integration & Reporting, Data Interpretation, Data Validation, Data Manipulation, Pivot Tables, Data Visualization, VLOOKUP, Data Cleaning, Data Management, and Data Modeling.
- Leadership Skills: Verbal & Written Communication, Ad Hoc Analyses, Strategic Thinking, Problem-Solving, Attention to Detail, Self-Starter, Process Improvements, and Quality Assurance.
Contact
Email: Website
Web Links
Resume
Projects
Diabetes Readmission Analysis
Source: Github
This project predicts whether a patient is readmitted to the hospital within 30 days and identifies which ethnicity has the largest number of diabetic patients.
Problem Statement
Diabetes is a rapidly growing global health challenge that affects people of all age groups, with an alarming rise among younger populations, although it is most prevalent in people aged 30 to 50. It also disproportionately affects certain demographics, with many Caucasian and African American individuals facing significant challenges in managing the condition. With millions of patients requiring complex care, numerous medications and therapies are prescribed to control diabetes and prevent complications, yet patients are still readmitted to the hospital within 30 days or later. This study investigates the effectiveness of these treatments in managing diabetes and explores the factors associated with hospital readmissions within 30 days. Using a dataset of 100,000 patient records, the analysis considered 50 key attributes, including demographics, medical diagnoses, medications, and hospital-related details. The dataset represents ten years (1999-2008) of clinical care at 130 US hospitals and integrated delivery networks.
Solution
We used machine learning models (J48, Random Forest, and Multinomial Logistic Regression) to uncover patterns and relationships within the data. We also used Principal Component Analysis and one-hot encoding to preprocess the dataset. These methods were selected for their ability to handle complex datasets and generate actionable insights. The findings aim to enhance data-driven decision-making in healthcare, optimize patient management strategies, and reduce readmission rates, ultimately improving diabetes care and hospital efficiency.
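As a rough illustration of the one-hot encoding step and the 30-day readmission target, here is a minimal Python sketch using only the standard library. The column names (`race`, `age`, `readmitted`) and the label value `"<30"` are assumptions made for the example, not the project's confirmed schema, and the project's actual preprocessing code may look quite different.

```python
# Hypothetical records: column names and values are illustrative assumptions.
records = [
    {"race": "Caucasian", "age": "[50-60)", "readmitted": "<30"},
    {"race": "AfricanAmerican", "age": "[30-40)", "readmitted": "NO"},
    {"race": "Caucasian", "age": "[30-40)", "readmitted": ">30"},
]


def one_hot_encode(rows, columns):
    """Return (feature_names, encoded_rows) with one 0/1 feature per category."""
    # Collect the sorted set of categories seen in each column.
    categories = {col: sorted({row[col] for row in rows}) for col in columns}
    feature_names = [
        f"{col}={value}" for col in columns for value in categories[col]
    ]
    encoded = [
        [1 if row[col] == value else 0
         for col in columns for value in categories[col]]
        for row in rows
    ]
    return feature_names, encoded


def readmitted_within_30_days(row):
    """Binary target: 1 if readmitted within 30 days, else 0 (assumed labels)."""
    return 1 if row["readmitted"] == "<30" else 0


feature_names, X = one_hot_encode(records, ["race", "age"])
y = [readmitted_within_30_days(row) for row in records]
```

The resulting `X` and `y` are the kind of numeric inputs that a classifier such as a random forest or a logistic regression expects.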
Powering the Roads: A Comprehensive Analysis of Electric Vehicles
Source: Github
Using Databricks and PySpark to analyze and predict the Clean Alternative Fuel Vehicle status of electric cars and the best conditions to drive in.
Abstract
The goal of this project is to identify and analyze the cars in Washington State that are electric, qualify as Clean Alternative Fuel Vehicles, and are eligible to drive. The second goal is to identify the city best suited to supporting electric vehicles. The dataset was collected from the Washington Department of Transportation.
Solution
The first step was to take a subset of the original dataset and identify the area with the largest number of electric vehicles.
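The counting step can be sketched in plain Python as below. The project itself used PySpark on Databricks, where the same aggregation would typically be a `groupBy(...).count()` on a DataFrame; the rows and the field names `city` and `ev_type` here are hypothetical.

```python
from collections import Counter

# Hypothetical sample rows; field names and values are illustrative assumptions.
vehicles = [
    {"city": "Seattle", "ev_type": "Battery Electric Vehicle (BEV)"},
    {"city": "Bellevue", "ev_type": "Battery Electric Vehicle (BEV)"},
    {"city": "Seattle", "ev_type": "Plug-in Hybrid Electric Vehicle (PHEV)"},
    {"city": "Tacoma", "ev_type": "Battery Electric Vehicle (BEV)"},
    {"city": "Seattle", "ev_type": "Battery Electric Vehicle (BEV)"},
    {"city": "Bellevue", "ev_type": "Plug-in Hybrid Electric Vehicle (PHEV)"},
]


def count_by_city(rows):
    """Count how many vehicle records belong to each city."""
    return Counter(row["city"] for row in rows)


def top_city(rows):
    """Return (city, count) for the city with the most vehicle records."""
    return count_by_city(rows).most_common(1)[0]


city, count = top_city(vehicles)
print(f"{city} has the most electric vehicles in this sample: {count}")
```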
Used Car Value Analysis
Source: Github
Using R and machine learning to understand the resale value of vehicles.
The first step was to determine whether a car's price varies with its other features and how significant those features are. Significance testing and p-value analysis showed that fuel type does affect the price.
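The write-up does not say which significance test was used, and the project itself was done in R. As one possible illustration of how a p-value for "fuel type affects price" can be obtained, here is a Python sketch of a two-sided permutation test on the difference in mean price between two fuel types. The prices are made-up numbers, and the permutation test is a stand-in technique, not necessarily the one used in the project.

```python
import random
from statistics import mean


def permutation_p_value(group_a, group_b, n_permutations=2000, seed=0):
    """Two-sided permutation test on the absolute difference in group means."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_permutations):
        # Randomly reassign prices to the two groups and recompute the gap.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1
    # Add-one correction keeps the estimated p-value strictly above zero.
    return (extreme + 1) / (n_permutations + 1)


# Hypothetical resale prices (arbitrary units) for two fuel types.
petrol_prices = [5.2, 4.8, 6.1, 5.5, 4.9, 5.8]
diesel_prices = [8.9, 9.4, 8.1, 9.9, 8.6, 9.2]

p_value = permutation_p_value(petrol_prices, diesel_prices)
print(f"p-value: {p_value:.4f}")
```

A small p-value (conventionally below 0.05) suggests that a price gap this large is unlikely to arise if fuel type had no effect on price.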
Leaf Disease Detection
Source: Github
In this project, we identify the disease affecting a crop, keeping in mind people who are not aware of the type of disease.
Crop loss due to disease is widespread around the globe, and losing crops on such a large scale is a difficult problem to address. Food security is a responsibility we all share in order to maintain a healthy and prosperous environment.
For the backend, we used machine learning and an Artificial Neural Network (ANN) trained on the images. Image pre-processing, data reduction, segmentation, and recognition are the processes used in handling images with an ANN. An image can be represented as a matrix in which each element contains the color information for one pixel. This matrix is used as the input data to the neural network. Small image dimensions make learning faster and easier, and they determine the size of each input vector and the number of input vectors. The transfer function used is sigmoidal. The learning rate takes values in [0, 1], and the error is recommended to be below 0.1.
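To make the ideas above concrete (an image as a matrix, a sigmoidal transfer function, a learning rate in [0, 1], and training until the error drops below 0.1), here is a deliberately tiny Python sketch: a single sigmoid neuron trained on made-up 2x2 grayscale "images". It is the simplest possible case of a neural network, not the project's actual model, and every value and name in it is an assumption for illustration.

```python
import math


def sigmoid(z):
    """Sigmoidal transfer function."""
    return 1.0 / (1.0 + math.exp(-z))


def flatten(image):
    """Turn an image matrix (list of rows) into a flat input vector."""
    return [pixel for row in image for pixel in row]


def predict(weights, bias, x):
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def train(inputs, targets, learning_rate=0.5, target_error=0.1, max_epochs=1000):
    """Train one sigmoid neuron with the delta rule until MSE < target_error."""
    weights = [0.0] * len(inputs[0])
    bias = 0.0
    error = float("inf")
    for _ in range(max_epochs):
        for x, t in zip(inputs, targets):
            out = predict(weights, bias, x)
            # Gradient of the squared error through the sigmoid.
            delta = (t - out) * out * (1.0 - out)
            weights = [w + learning_rate * delta * xi for w, xi in zip(weights, x)]
            bias += learning_rate * delta
        error = sum(
            (t - predict(weights, bias, x)) ** 2 for x, t in zip(inputs, targets)
        ) / len(inputs)
        if error < target_error:
            break
    return weights, bias, error


# Hypothetical 2x2 grayscale images with pixel values in [0, 1].
healthy = [[[0.9, 0.8], [0.1, 0.2]], [[0.8, 0.9], [0.2, 0.1]]]
diseased = [[[0.1, 0.2], [0.9, 0.8]], [[0.2, 0.1], [0.8, 0.9]]]

inputs = [flatten(img) for img in healthy + diseased]
targets = [0, 0, 1, 1]  # 0 = healthy, 1 = diseased

weights, bias, error = train(inputs, targets)
print(f"final mean squared error: {error:.3f}")
```

A real leaf-disease classifier would use far larger images and more than one neuron, but the training loop follows the same pattern.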
Data Science Salary Categorization
In this project, we classify the types of salary in the Data Science domain and convert a bad graph into a good graph. All the charts presented are interactive.
We identified two bad graphs.
 
 
