Projects & Innovations
Detailed technical breakdown of production AI systems, enterprise platforms, and published data research.
A production-grade web application designed to "decode" any GitHub repository. It transforms raw code into a structured, visual "Code Wiki" with instant technical documentation and high-fidelity architecture diagrams.
🏗️ Architecture & Structure:
• Core UI Layer: Manages application lifecycle, Wiki-style deep links, and complex analysis state.
• Service Layer:
- githubService.ts: Data acquisition engine with recursive tree fetching and multi-tier fallback (API → JsDelivr → Raw).
- aiService.ts: Deep-context analysis using Gemini 3.1 Pro, focused on explaining why the code is structured as it is and how it works.
- graphEngine.ts: Custom static analysis parsing file relationships into Mermaid.js diagrams.
• Smart Context Injection: Prioritizes "DNA files" (READMEs, manifests, entry points) for high signal-to-noise ratio.
• Resilient Fetching: Fallback strategy that keeps repository data available when GitHub API rate limits are hit.
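The multi-tier fallback (API → JsDelivr → Raw) can be sketched as an ordered chain of fetchers. This is a minimal illustration in Python rather than the project's TypeScript, and it is not the code in githubService.ts: the function names and the three stand-in fetchers are hypothetical and make no network calls.

```python
from typing import Callable, Iterable, Tuple

# A fetcher takes a file path and returns its text, or raises on failure.
Fetcher = Callable[[str], str]


def fetch_with_fallback(path: str, tiers: Iterable[Tuple[str, Fetcher]]) -> Tuple[str, str]:
    """Try each tier in order; return (tier_name, content) from the first success."""
    errors = []
    for name, fetch in tiers:
        try:
            return name, fetch(path)
        except Exception as exc:  # any failure moves on to the next tier
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all tiers failed: " + "; ".join(errors))


# Hypothetical stand-ins for the three tiers (assumptions, no real HTTP).
def api_fetch(path: str) -> str:
    raise RuntimeError("rate limited")


def cdn_fetch(path: str) -> str:
    return f"contents of {path}"


def raw_fetch(path: str) -> str:
    return f"raw contents of {path}"


TIERS = [("api", api_fetch), ("jsdelivr", cdn_fetch), ("raw", raw_fetch)]
```

Calling fetch_with_fallback("README.md", TIERS) skips the rate-limited API tier and returns the JsDelivr result. Collecting the per-tier errors keeps the final failure message useful when every source is down.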
React 18
TypeScript
Gemini 3.1 Pro
Mermaid.js
Tailwind CSS
Vite
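Turning file relationships into a Mermaid.js diagram comes down to emitting one edge per dependency. The sketch below assumes the relationships are already available as a mapping from each file to the files it imports; the function name and the sample file App.tsx are hypothetical, and the real graphEngine.ts may work differently.

```python
from typing import Dict, List


def to_mermaid(dependencies: Dict[str, List[str]]) -> str:
    """Render a file -> imported-files mapping as a Mermaid flowchart."""
    ids: Dict[str, str] = {}

    def node_id(path: str) -> str:
        # Paths contain dots and slashes, so map each one to a simple id (n0, n1, ...).
        if path not in ids:
            ids[path] = f"n{len(ids)}"
        return ids[path]

    lines = ["graph TD"]
    for source in sorted(dependencies):
        for target in sorted(dependencies[source]):
            src, dst = node_id(source), node_id(target)
            lines.append(f'    {src}["{source}"] --> {dst}["{target}"]')
    return "\n".join(lines)


print(to_mermaid({"App.tsx": ["githubService.ts", "aiService.ts"]}))
```

Sorting sources and targets makes the output deterministic, which keeps diagrams stable between runs on the same repository.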
A B2B SaaS platform transforming healthcare Revenue Cycle Management (RCM) through autonomous data auditing and predictive analytics.
🚀 Project Overview:
• Autonomous Agent Architecture: Specialized AI "Pods" (CEO, Engineering, Data/ML, DevOps) collaborate to build and scale.
• Automated Data Auditing: Autonomous scanner identifying billing errors and missing data with custom heuristic logic.
• Predictive Payment Forecasting: Scikit-learn regression models (Random Forest) to predict cash flow timelines.
• CRM Integration: Direct API integration with Monday.com for seamless operational synchronization.
• Infrastructure: FastAPI backend, React/Vite frontend, Docker orchestration, and GCP deployment.
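As an illustration of what heuristic auditing for billing errors and missing data can look like, here is a minimal rule-based check over a single claim record. The field names and the two rules are assumptions chosen for the example, not the platform's actual schema or logic.

```python
from typing import Any, Dict, List

# Hypothetical required fields for a claim record.
REQUIRED_FIELDS = ("claim_id", "patient_id", "payer", "billed_amount")


def audit_claim(claim: Dict[str, Any]) -> List[str]:
    """Return a list of human-readable issues found in one claim record."""
    issues = []
    # Rule 1: every required field must be present and non-empty.
    for field in REQUIRED_FIELDS:
        if claim.get(field) in (None, ""):
            issues.append(f"missing {field}")
    billed = claim.get("billed_amount")
    paid = claim.get("paid_amount")
    # Rule 2: amounts must be plausible relative to each other.
    if isinstance(billed, (int, float)) and billed <= 0:
        issues.append("billed_amount must be positive")
    if isinstance(billed, (int, float)) and isinstance(paid, (int, float)) and paid > billed:
        issues.append("paid_amount exceeds billed_amount")
    return issues


claim = {"claim_id": "C1", "patient_id": "P1", "payer": "",
         "billed_amount": 100.0, "paid_amount": 120.0}
print(audit_claim(claim))
```

Returning a list of issues rather than a boolean lets a scanner aggregate error types across a whole batch of claims.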
FastAPI
Python
React
Scikit-learn
Docker
GCP
SQLAlchemy
An advanced career development platform automating the end-to-end job-seeking process using a decentralized multi-agent approach.
🏗️ Technical Architecture:
• Agent Alpha (Resume Architect): Multimodal analysis using a "Thinking" budget for deep reasoning about candidate impact.
• Agent Bravo (Market Scout): Real-time salary and trend fetching using Google Search Grounding.
• Interview Simulator: Low-latency, full-duplex voice interactions using Gemini 3.1 Flash Live API and Web Audio API.
• Coding Lab: Real-time code evaluation for time/space complexity and style.
• UI/UX: High-performance "Glassmorphism" interface with custom SVG-based data visualizations.
React 19
Gemini 3.1 Live
Web Audio API
SVG Visualization
JSON Schema
Collective Intranet System
Full Stack Architect & Main Developer
Enterprise internal platform for a healthcare RCM organization to centralize onboarding, training, task management, and admin reporting.
Technical Highlights:
• Hybrid Database: PostgreSQL (SQLAlchemy) for relational data and MongoDB (GridFS) for scalable document storage (PDFs, PPTs).
• Workflow Logic: Training-and-progression system with company email validation, 2FA, approval gates, and timed exams.
• Task Management: Admin-assigned client tasks with status milestones, automated notifications, and scheduled cleanup via APScheduler.
• Enterprise Security: CSRF protection, Redis-based rate limiting, account lock logic, and secure session handling.
• DevOps: GitHub Actions CI/CD with Azure Web App deployment and Flask-Migrate for schema management.
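The rate-limiting idea can be sketched as a fixed-window counter. In this minimal example an in-memory dictionary stands in for Redis, where the same pattern is usually an INCR on a per-window key followed by an EXPIRE. The class name, limits, and injectable clock are assumptions for illustration, not the platform's implementation.

```python
import time
from typing import Callable, Dict, Tuple


class FixedWindowRateLimiter:
    """Allow at most `limit` requests per `window_seconds` for each key."""

    def __init__(self, limit: int, window_seconds: int,
                 clock: Callable[[], float] = time.time):
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock
        # Stand-in for Redis: (key, window index) -> request count.
        # Old windows are never evicted here; Redis key expiry would handle that.
        self.counts: Dict[Tuple[str, int], int] = {}

    def allow(self, key: str) -> bool:
        window = int(self.clock() // self.window_seconds)
        bucket = (key, window)
        self.counts[bucket] = self.counts.get(bucket, 0) + 1
        return self.counts[bucket] <= self.limit


limiter = FixedWindowRateLimiter(limit=5, window_seconds=60)
print(limiter.allow("login:203.0.113.7"))
```

Keying the counter by action and client (for example a login attempt from one address) lets the same limiter back both request throttling and account-lock thresholds.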
Python
Flask
PostgreSQL
MongoDB
Redis
Azure
CI/CD
Predictive Analytics for Payment Prediction in RCM
Lead Data Scientist / Researcher
Comprehensive research on applying ML to forecast payment behaviors and optimize revenue flow in healthcare.
🎯 Research Objectives:
• Develop models for payment likelihood, provider variations, and location-based trends.
• Analyze key drivers influencing payment behavior (days since billing, payer mix, previous behavior).
📊 Methodology:
• Dataset: 3-year HIPAA-compliant dataset (28,000+ records).
• Feature Engineering: Payment history lags, rolling averages, and Branch & Bound inspired feature selection.
• Models: Compared Linear Regression, Decision Trees, Random Forest, and Neural Networks.
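Payment-history lags and rolling averages can be illustrated on a short series of monthly payments. This is a minimal pure-Python sketch with made-up numbers; the function names are hypothetical and the study's actual feature pipeline is not shown in the source.

```python
from typing import List, Optional


def lag(values: List[float], k: int) -> List[Optional[float]]:
    """Value from k periods earlier; None where no history exists yet."""
    return [None] * min(k, len(values)) + values[: max(len(values) - k, 0)]


def rolling_mean(values: List[float], window: int) -> List[Optional[float]]:
    """Mean of the previous `window` values, excluding the current one."""
    out: List[Optional[float]] = []
    for i in range(len(values)):
        if i < window:
            out.append(None)  # not enough history yet
        else:
            out.append(sum(values[i - window:i]) / window)
    return out


payments = [100.0, 120.0, 90.0, 110.0]  # illustrative values only
print(lag(payments, 1))           # [None, 100.0, 120.0, 90.0]
print(rolling_mean(payments, 2))  # [None, None, 110.0, 105.0]
```

Both features use only values strictly before the current period, so the target being predicted never leaks into its own inputs.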
🔍 Key Findings:
• Future Payment: Random Forest & Neural Networks achieved R² > 0.90.
• Revenue Forecasting: Simpler linear models outperformed complex ones, which tended to overfit the limited time-series data (bias-variance tradeoff).
• Location Trends: Used K-Means clustering to segment facilities and improve prediction accuracy.
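For reference, the R² score quoted in the findings measures the share of variance in the actual values that the predictions explain. The standard definition can be written out as a small function; the sample numbers below are illustrative and are not data from the study.

```python
from typing import Sequence


def r_squared(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all actual values are equal")
    return 1 - ss_res / ss_tot


print(r_squared([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]))  # 1.0, a perfect fit
print(r_squared([1.0, 2.0, 3.0], [2.0, 2.0, 2.0]))  # 0.0, no better than the mean
```

A score above 0.90 therefore means the model's squared error is less than a tenth of what always predicting the mean would produce.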
Python
Pandas
Scikit-learn
Neural Networks
Statistical Modeling
HIPAA
End-to-end AI platform analyzing 7M+ Yelp reviews to generate actionable sentiment insights and recommendations.
🏗️ System Architecture:
• ETL Pipeline: Multi-threaded batch ingestion from raw JSON to PostgreSQL with dynamic schema filtering and foreign key validation.
• Feature Engineering:
- Text: VADER sentiment scores and TF-IDF (Top 50 terms).
- Temporal/Geo: Weekend/holiday indicators and Geo clustering (K-Means).
- Behavioral: User retention modeling and business review density distribution.
• Machine Learning: Ensemble VotingClassifier (Logistic Regression, RF, XGBoost) with Soft Voting and Stratified K-Fold validation.
• LLM Layer: Design-ready architecture for review summarization and pros/cons generation.
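Soft voting, as used in the ensemble above, averages each model's predicted class probabilities and picks the class with the highest mean. The sketch below shows the idea in plain Python with made-up probabilities; it is not the scikit-learn VotingClassifier itself, and the function name is hypothetical.

```python
from typing import Dict, List


def soft_vote(probabilities: List[Dict[str, float]]) -> str:
    """Average class probabilities across models and return the top class."""
    classes = list(probabilities[0])
    averaged = {
        c: sum(p[c] for p in probabilities) / len(probabilities)
        for c in classes
    }
    return max(classes, key=lambda c: averaged[c])


# Illustrative outputs from three models for one review.
model_outputs = [
    {"positive": 0.90, "negative": 0.10},
    {"positive": 0.40, "negative": 0.60},
    {"positive": 0.45, "negative": 0.55},
]
print(soft_vote(model_outputs))  # positive
```

In this example two of the three models lean negative, so a majority (hard) vote would say "negative", while soft voting lets the one highly confident model tip the result to "positive".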
Python
XGBoost
PostgreSQL
spaCy
VADER
ETL
Scikit-learn
Disaster Management System
Lead Developer
Critical incident response system designed to mitigate the impact of floods in Sri Lanka through real-time mapping and mass communication.
Key Capabilities:
• Mass Alerting: Integrated SMS gateways for automated emergency alerts to at-risk populations based on geo-fencing.
• Incident Mapping: Real-time location tracking for flood reports and emergency resource allocation using mapping APIs.
• Crisis Coordination: Centralized dashboard for disaster management units to coordinate field responses and monitor weather trends.
• Impact: Provided a scalable solution for national-level emergency response coordination.
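Geo-fenced alerting can be illustrated with a circular fence: compute the great-circle distance from the incident to each subscriber and keep those within a radius. The sketch is in Python rather than the system's PHP, and the function names, radius, and subscriber entries are assumptions for the example.

```python
import math
from typing import List, Tuple

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two latitude/longitude points in km."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def recipients_in_fence(center: Tuple[float, float], radius_km: float,
                        subscribers: List[Tuple[str, float, float]]) -> List[str]:
    """Return the phone numbers of subscribers located inside the fence."""
    return [
        phone for phone, lat, lon in subscribers
        if haversine_km(center[0], center[1], lat, lon) <= radius_km
    ]


# Placeholder subscribers: (phone, latitude, longitude).
subscribers = [
    ("subscriber-colombo", 6.9271, 79.8612),
    ("subscriber-kandy", 7.2906, 80.6337),
]
print(recipients_in_fence((6.9271, 79.8612), 10.0, subscribers))
```

With a 10 km fence centered on Colombo, only the Colombo subscriber is selected; the resulting list would then be handed to the SMS gateway.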
Laravel
PHP
MySQL
SMS Gateway
Geo-mapping
Interactive web application for comprehensive statistical analysis and machine learning.
Features:
• Data Processing: Robust cleaning tools for missing value handling (Mean/Median/Custom), outlier detection (IQR/Z-Score), and scaling.
• Inferential Statistics: Interactive T-Tests, ANOVA, Chi-Square, and non-parametric alternatives with full visualization.
• ML Lifecycle: Full support for Regression, Clustering (K-Means, DBSCAN), and PCA dimensionality reduction.
• Dashboard: Dynamic visualizations using Plotly and Seaborn with interactive filtering and data export (CSV/Excel).
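The two outlier-detection options named above (IQR and Z-Score) can be sketched with the standard library alone. The function names, the 1.5 × IQR fence, and the threshold defaults are conventional choices assumed for the example, not settings taken from the application.

```python
import statistics
from typing import List


def iqr_outliers(values: List[float], k: float = 1.5) -> List[float]:
    """Values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


def zscore_outliers(values: List[float], threshold: float = 3.0) -> List[float]:
    """Values whose distance from the mean exceeds `threshold` standard deviations."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        return []  # all values identical, nothing can be an outlier
    return [v for v in values if abs(v - mean) / std > threshold]


data = [10, 12, 11, 13, 12, 11, 10, 95]
print(iqr_outliers(data))                    # [95]
print(zscore_outliers(data, threshold=2.0))  # [95]
```

On this eight-point sample the z-score of 95 is only about 2.6, because the outlier itself inflates the standard deviation, so the default threshold of 3 would miss it. That is one reason to offer the IQR method alongside Z-Score for small datasets.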
Python
Streamlit
Plotly
Scikit-learn
SciPy