Hello! I'm Kirubasagar
" Dream, Dream, Dream
Dreams transform into thoughts
And thoughts result in action "
by Dr. APJ Abdul Kalam
Practice Highlights
I designed an end-to-end Forecasting Platform that helps farmers and traders make data-driven buy/sell decisions, with price outlooks of up to 180 days for any crop-state combination in India. The system ingests 10 years of government mandi data (2016–2026), stored as Parquet files on Azure Blob Storage and queried in real time via DuckDB's httpfs extension, enabling sub-second analytics over ~80 million records without a database server. I developed a four-model forecasting stack: ARIMA with auto-order selection, XGBoost with lag and rolling features, a PyTorch LSTM for sequence learning, and a performance-weighted ensemble that blends the three. An automated crop advisory system generates buy/sell/hold recommendations, and everything is surfaced through a live Streamlit dashboard deployed on Streamlit Cloud.
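The idea behind a performance-weighted ensemble can be sketched in a few lines. This is a minimal illustration only, assuming weights proportional to the inverse of each model's validation error; the platform's actual weighting scheme is not described here, and the function names, RMSE values, and forecasts below are all hypothetical.

```python
def inverse_error_weights(errors):
    """Turn per-model validation errors into weights that sum to 1.

    Assumption: lower error earns a higher weight, proportional to 1/error.
    """
    inverse = {name: 1.0 / err for name, err in errors.items()}
    total = sum(inverse.values())
    return {name: value / total for name, value in inverse.items()}


def blend_forecasts(forecasts, weights):
    """Weighted average of the model forecasts, one value per horizon step."""
    horizon = len(next(iter(forecasts.values())))
    return [
        sum(weights[name] * forecasts[name][step] for name in forecasts)
        for step in range(horizon)
    ]


# Hypothetical validation RMSEs and 3-day price forecasts (illustrative only).
validation_rmse = {"arima": 4.0, "xgboost": 2.0, "lstm": 4.0}
forecasts = {
    "arima": [100.0, 102.0, 104.0],
    "xgboost": [110.0, 112.0, 114.0],
    "lstm": [100.0, 102.0, 104.0],
}

weights = inverse_error_weights(validation_rmse)
ensemble = blend_forecasts(forecasts, weights)
print(weights)
print(ensemble)
```

With these made-up numbers the model with half the error receives twice the weight of each of the others, so the blended forecast sits midway between the two distinct forecast curves.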
I led the creation of a Revenue Predictor for New York taxi drivers, focused on predicting tipping behavior. Framed as a binary classification problem, the model uses a range of factors to forecast customer tipping. I applied RandomForest and XGBoost classifiers, backed by cross-validation and F1 score evaluation. The XGBoost classifier performed best, with an accuracy of 62.56% and an F1 score of 35.78%. The most important predictors were 'predicted fare', 'mean distance', and 'mean duration', which could help inform revenue strategies for taxi drivers.
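Accuracy and F1 can diverge considerably on a binary task, because F1 only rewards correct predictions of the positive class. The sketch below computes both metrics from scratch on a tiny made-up label set; the data and the `classification_metrics` helper are hypothetical and are not taken from the taxi project.

```python
def classification_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall and F1 for binary labels (0/1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical labels: 1 = positive class, 0 = negative class.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]

metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

In this toy case most negatives are classified correctly, which lifts accuracy, while three of the four positives are missed, which pulls recall and therefore F1 down.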
This Python content-based movie recommendation system uses TF-IDF and cosine similarity to generate on-demand, personalized film suggestions. Interactive widgets provide a seamless user experience, and the NLP techniques behind the similarity scoring keep the recommendations data-driven and user-centric.
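The TF-IDF plus cosine similarity approach can be illustrated with a small standard-library sketch. It assumes whitespace tokenization and a smoothed IDF formula, which may differ from what the project uses (a library such as scikit-learn would be the more typical choice in practice); the titles and descriptions are invented for the example.

```python
import math
from collections import Counter


def tfidf_vectors(documents):
    """Build sparse TF-IDF vectors (dicts) from raw text documents."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    # Smoothed IDF (an assumption): terms found in every document keep a small weight.
    idf = {
        term: math.log((1 + n_docs) / (1 + df)) + 1.0
        for term, df in doc_freq.items()
    }
    return [
        {term: count * idf[term] for term, count in Counter(tokens).items()}
        for tokens in tokenized
    ]


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse vectors."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(title, titles, vectors, top_n=1):
    """Return the top_n titles most similar to the given one."""
    query = vectors[titles.index(title)]
    scored = [
        (cosine_similarity(query, vec), other)
        for other, vec in zip(titles, vectors)
        if other != title
    ]
    scored.sort(reverse=True)
    return [other for _, other in scored[:top_n]]


# Hypothetical catalogue (titles and descriptions are invented).
titles = ["Space Quest", "Galaxy Wars", "Love in Paris"]
descriptions = [
    "astronaut explores space and alien planets",
    "alien fleet invades space colonies",
    "romance blossoms in paris cafe",
]

vectors = tfidf_vectors(descriptions)
suggestions = recommend("Space Quest", titles, vectors)
print(suggestions)
```

Because the recommendation depends only on item descriptions, no user rating history is needed, which is the main appeal of a content-based design.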
This dashboard brings together global COVID-19 data in a visual exploration of confirmed cases and recoveries across nations, combining medical context with data-driven clarity. It is intended to support informed strategies and decisions in the fight against the pandemic.
An analysis was conducted to determine whether mean video view counts differ significantly between verified and unverified TikTok accounts. A two-sample t-test (A/B test) was used for this purpose. The resulting p-value was far below the standard 5% significance level, so the null hypothesis was rejected: there is a statistically significant difference in mean video view counts between verified and unverified TikTok accounts.
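A two-sample t-test of this kind can be sketched with the standard library. The example assumes Welch's unequal-variance form, which may or may not match the variant used in the analysis, and it approximates the p-value with a normal distribution, which is only reasonable for large samples (in practice `scipy.stats.ttest_ind` with `equal_var=False` gives the exact t-distribution p-value). The view counts are invented.

```python
import math
from statistics import NormalDist, mean, variance


def welch_t_test(sample_a, sample_b):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom, approx p-value)."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard errors of each sample mean (sample variance / n).
    se2_a = variance(sample_a) / n_a
    se2_b = variance(sample_b) / n_b

    t_stat = (mean(sample_a) - mean(sample_b)) / math.sqrt(se2_a + se2_b)
    dof = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1)
    )
    # Two-sided p-value under a normal approximation (an assumption; see lead-in).
    p_approx = 2.0 * (1.0 - NormalDist().cdf(abs(t_stat)))
    return t_stat, dof, p_approx


# Hypothetical view counts, in thousands (illustrative only).
verified_views = [10.0, 12.0, 14.0]
unverified_views = [1.0, 2.0, 3.0]

t_stat, dof, p_approx = welch_t_test(verified_views, unverified_views)
print(t_stat, dof, p_approx)

# Reject the null hypothesis of equal means at the 5% significance level?
reject_null = p_approx < 0.05
```

The null hypothesis is that both groups share the same mean view count; a p-value below the chosen significance level leads to rejecting it.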
I developed a churn prediction model for Waze to support retention and business growth. After analyzing the variables, I trained a logistic regression classifier and evaluated it with metrics such as accuracy and F1 score; the model achieved 82.37% accuracy. Notably, 'km_per_driving_day' showed a strong correlation with churn despite its lower importance in the model, giving Waze insight for optimizing retention strategies and user experience.
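For readers unfamiliar with logistic classifiers, the following is a minimal from-scratch sketch trained with batch gradient descent. It is not the project's implementation (a library such as scikit-learn would normally be used); the single scaled feature, the labels, and the hyperparameters are all hypothetical.

```python
import math


def sigmoid(z):
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(features, labels, learning_rate=0.1, epochs=500):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n_samples, n_features = len(features), len(features[0])
    weights, bias = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for x, y in zip(features, labels):
            score = sum(w * xi for w, xi in zip(weights, x)) + bias
            error = sigmoid(score) - y
            for j, xi in enumerate(x):
                grad_w[j] += error * xi
            grad_b += error
        weights = [w - learning_rate * g / n_samples for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n_samples
    return weights, bias


def predict(features, weights, bias, threshold=0.5):
    """Label a sample 1 (churned) when its predicted probability reaches the threshold."""
    return [
        1 if sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias) >= threshold else 0
        for x in features
    ]


# Hypothetical, standardized single feature; 1 = churned, 0 = retained.
features = [[-2.0], [-1.5], [-1.0], [1.0], [1.5], [2.0]]
labels = [0, 0, 0, 1, 1, 1]

weights, bias = train_logistic(features, labels)
predictions = predict(features, weights, bias)
accuracy = sum(p == y for p, y in zip(predictions, labels)) / len(labels)
print(weights, bias, accuracy)
```

The learned weight's sign and size indicate how strongly the feature pushes a prediction toward churn, which is one way model importance can differ from a simple correlation.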