Machine Learning for Future Payment Prediction in Healthcare Revenue Cycle Management

A focused industry research paper on transaction-level payment prediction in healthcare Revenue Cycle Management (RCM), comparing Linear Regression, Decision Trees, Random Forests, and Neural Networks on a chronological hold-out evaluation.

Thanuka Ellepola | April 29, 2026 | 8 min read

Executive Summary

This research paper focuses on Future Payment Prediction within Healthcare Revenue Cycle Management (RCM). The study addresses a core operational question: can historical, de-identified billing data be used to predict the fraction of an individual healthcare bill that will be paid within 90 days, and can that prediction improve revenue-cycle prioritization?

Broader experiments showed much weaker explanatory power for provider-level prediction (R² = 0.2934) and aggregate revenue forecasting (R² = 0.1246). In contrast, the Future Payment Prediction scenario reached R² = 0.9191 with a neural network and R² = 0.9102 with a random forest, which is why this paper treats it as the primary model for operational optimization.

Abstract

Healthcare revenue cycle teams routinely prioritize accounts receivable using retrospective reports and manual judgment, yet these approaches often fail to identify which patient balances are most likely to remain unpaid. This paper presents a focused industry case study on Future Payment Prediction in healthcare RCM using de-identified historical billing data from a multi-facility healthcare system. The task is formulated as a transaction-level regression problem: predicting the fraction of each bill expected to be paid within 90 days.

Four commonly used machine learning models were compared on a chronological hold-out evaluation: Linear Regression, Decision Tree, Random Forest, and Neural Network. On the test set, Linear Regression achieved MAE = 0.0060243 (R² = 0.7409), Decision Tree MAE = 0.0019225 (R² = 0.8929), Random Forest MAE = 0.0016033 (R² = 0.9102), and Neural Network MAE = 0.0037950 (R² = 0.9191). The findings indicate that nonlinear models substantially outperform the linear baseline for transaction-level payment prediction. Random Forest offers the most attractive balance of the four: it has the lowest MAE, and its R² is within 0.01 of the Neural Network's, whose MAE is more than twice as large.

Performance Summary: Random Forest provides the lowest absolute error (MAE = 0.00160), while the Neural Network achieves the highest overall fit (R² = 0.9191).

Introduction & Operational Significance

Revenue Cycle Management is the financial backbone of healthcare delivery because it connects patient encounters to charge capture, claims processing, reimbursement, and final payment collection. In practice, however, many RCM teams still operate reactively: accounts are followed up after delays emerge, work queues are broad rather than risk-based, and scarce staff time is spent on balances that may have very different probabilities of recovery.

This study argues that predictive analytics can shift RCM from retrospective monitoring to proactive intervention by using historical billing and payment patterns to anticipate future account behavior. Account-level payment prediction is immediately actionable in industry settings because it can support collections prioritization, patient outreach, payment-plan assignment, and near-term cash-flow planning without requiring radical workflow redesign.

Data and Methodological Framework

The study used retrospective financial records from a medium-sized healthcare system spanning 2022–2024. The transaction-level Future Payment Prediction scenario retained over 28,000 usable records after preprocessing. All Protected Health Information (PHI) was removed in line with the HIPAA Safe Harbor method, which requires the removal of 18 categories of identifiers to protect patient privacy.

The prediction target is the fraction of each bill expected to be paid within 90 days of billing, making the task a transaction-level regression problem on a bounded, normalized target. Features include: total charges, payments made so far, days since billing, payer type, prior payment history, service type, and whether the account was insured or self-pay. To evaluate performance reliably and avoid look-ahead leakage, the data were split chronologically into an 80% training set and a 20% hold-out evaluation set.
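The target definition and the chronological split described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the study's pipeline: the field names (`billed_on`, `total_charges`, `paid_within_90d`) and the sample records are hypothetical, and clipping overpayments to 1.0 is an assumption made here to keep the target bounded.

```python
from datetime import date

# Hypothetical records: field names and values are illustrative, not the study's schema.
records = [
    {"billed_on": date(2022, 3, 1), "total_charges": 200.0, "paid_within_90d": 200.0},
    {"billed_on": date(2023, 7, 15), "total_charges": 500.0, "paid_within_90d": 125.0},
    {"billed_on": date(2022, 11, 20), "total_charges": 80.0, "paid_within_90d": 0.0},
    {"billed_on": date(2024, 1, 5), "total_charges": 1000.0, "paid_within_90d": 900.0},
    {"billed_on": date(2023, 2, 10), "total_charges": 300.0, "paid_within_90d": 330.0},
]


def payment_fraction(record):
    """Fraction of the bill paid within 90 days, clipped to [0, 1] (assumed convention)."""
    if record["total_charges"] <= 0:
        return 0.0
    raw = record["paid_within_90d"] / record["total_charges"]
    return min(max(raw, 0.0), 1.0)


def chronological_split(rows, train_share=0.8):
    """Sort by billing date; the earliest share trains, the remainder is held out."""
    ordered = sorted(rows, key=lambda r: r["billed_on"])
    cut = int(len(ordered) * train_share)
    return ordered[:cut], ordered[cut:]


train, holdout = chronological_split(records)
train_targets = [payment_fraction(r) for r in train]
```

Because the split is made on sorted billing dates rather than at random, every held-out bill is later than every training bill, which is what prevents the model from being evaluated on a period it has already seen.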

# Hyperparameter configurations selected via Grid Search
decision_tree_params = {
    'max_depth': 15,
    'min_samples_split': 2
}

random_forest_params = {
    'n_estimators': 200,
    'max_depth': 20,
    'min_samples_split': 2
}

neural_network_params = {
    'hidden_layer_sizes': (50, 50),
    'activation': 'relu',
    'learning_rate_init': 0.01
}

Best configurations from the grid search and temporal cross-validation pipelines
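The parameter names in these configurations match scikit-learn's `DecisionTreeRegressor`, `RandomForestRegressor`, and `MLPRegressor`, so the sketch below assumes that library. It shows how the reported settings could be passed to the estimators and how a grid search with temporal cross-validation might be set up using `TimeSeriesSplit`. The synthetic data, the candidate grid, the fold count, and the `random_state` values are all hypothetical; the source does not state them.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV, TimeSeriesSplit
from sklearn.neural_network import MLPRegressor
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in data, assumed to be ordered oldest to newest (not the study's data).
rng = np.random.default_rng(0)
X = rng.random((200, 4))
y = np.clip(0.6 * X[:, 0] + 0.4 * (X[:, 1] > 0.5), 0.0, 1.0)

# The reported configurations, passed as keyword arguments.
models = {
    "decision_tree": DecisionTreeRegressor(
        max_depth=15, min_samples_split=2, random_state=0
    ),
    "random_forest": RandomForestRegressor(
        n_estimators=200, max_depth=20, min_samples_split=2, random_state=0
    ),
    "neural_network": MLPRegressor(
        hidden_layer_sizes=(50, 50), activation="relu",
        learning_rate_init=0.01, random_state=0,
    ),
}

# Temporal cross-validation: each fold validates on rows that come after its training rows.
# The candidate grid below is hypothetical.
search = GridSearchCV(
    DecisionTreeRegressor(random_state=0),
    param_grid={"max_depth": [5, 10, 15], "min_samples_split": [2, 10]},
    cv=TimeSeriesSplit(n_splits=3),
    scoring="neg_mean_absolute_error",
)
search.fit(X, y)
best_tree_params = search.best_params_
```

Using `TimeSeriesSplit` instead of shuffled k-fold keeps the tuning step consistent with the chronological hold-out, so hyperparameters are never chosen using information from later bills.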

Results & Performance Patterns

The empirical comparison shows a clear pattern: the linear model leaves substantial predictive performance unused, while nonlinear models capture the threshold-driven relationships between billing attributes and eventual payment fractions. Random Forest achieved the lowest absolute error (MAE = 0.0016033), meaning its average per-account miss was the smallest of the four models. The Neural Network achieved the highest overall fit (R² = 0.9191), although its MAE (0.0037950) was more than twice that of the Random Forest.
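MAE and R² can rank models differently because MAE averages absolute errors while R² is built on squared errors, which penalizes a few large misses more heavily. The toy example below illustrates that mechanism with made-up numbers. It is one possible explanation for a split ranking, not an analysis of the study's models.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def r_squared(actual, predicted):
    """1 minus the ratio of squared prediction error to variance around the mean."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Made-up payment fractions for six accounts.
actual = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]

# Model A is exact on five accounts and misses one badly.
model_a = [0.0, 0.2, 0.4, 0.6, 0.8, 0.7]

# Model B is slightly off on every account.
model_b = [0.08, 0.28, 0.48, 0.68, 0.72, 0.92]

mae_a, mae_b = mean_absolute_error(actual, model_a), mean_absolute_error(actual, model_b)
r2_a, r2_b = r_squared(actual, model_a), r_squared(actual, model_b)
```

Here model A has the lower MAE while model B has the higher R², so the choice of headline metric changes which model looks best.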

For RCM operations, a scatter plot of predicted versus actual payments indicates that large-balance accounts remain a difficult edge case where raw-dollar deviations can be large even when normalized metrics are strong. A production system should therefore combine predicted payment propensity with claim size to flag high-value outlier claims for senior specialist review.
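One way to combine payment propensity with claim size is to rank accounts by expected unpaid dollars rather than by predicted fraction alone. The sketch below assumes a simple definition, balance multiplied by one minus the predicted fraction; the function names, the field names, and the 5,000 threshold are hypothetical choices for illustration.

```python
def expected_shortfall(balance, predicted_fraction):
    """Dollars expected to remain unpaid after 90 days (assumed definition)."""
    return balance * (1.0 - predicted_fraction)


def flag_for_senior_review(accounts, shortfall_threshold=5000.0):
    """Return accounts at or above the threshold, largest expected shortfall first."""
    flagged = [
        a for a in accounts
        if expected_shortfall(a["balance"], a["predicted_fraction"]) >= shortfall_threshold
    ]
    return sorted(
        flagged,
        key=lambda a: expected_shortfall(a["balance"], a["predicted_fraction"]),
        reverse=True,
    )


# Hypothetical open accounts with model scores attached.
accounts = [
    {"account_id": "A-001", "balance": 120000.0, "predicted_fraction": 0.95},
    {"account_id": "A-002", "balance": 800.0, "predicted_fraction": 0.10},
    {"account_id": "A-003", "balance": 40000.0, "predicted_fraction": 0.50},
    {"account_id": "A-004", "balance": 3000.0, "predicted_fraction": 0.90},
]

review_queue = flag_for_senior_review(accounts)
```

In this example account A-001 is flagged despite a high predicted payment fraction, because even a small unpaid share of a large balance is a large dollar amount, while A-002 is not flagged despite a very low predicted fraction.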

Ethical Deployment & Continuous Auditing

In production, the model should be used to trigger supportive financial actions rather than punitive ones. A low predicted payment fraction should activate tailored outreach, payment-plan offers, or financial counseling, not discriminatory billing treatment.

The cleanest deployment path is a nightly batch pipeline that runs inference on open accounts, writes risk scores back into collections queues, and monitors error drift across payers and facilities over time.
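The drift-monitoring step of such a pipeline could look like the sketch below, which recomputes MAE per segment once actual payments are known and compares it with a stored baseline. The field names, the baseline values, and the rule of flagging a segment when its MAE exceeds twice the baseline are assumptions for illustration; a real deployment would set these from its own validation history.

```python
from collections import defaultdict


def mae_by_segment(rows, segment_key):
    """Mean absolute error of predicted vs. actual payment fraction per segment."""
    errors = defaultdict(list)
    for row in rows:
        errors[row[segment_key]].append(abs(row["actual"] - row["predicted"]))
    return {segment: sum(vals) / len(vals) for segment, vals in errors.items()}


def drifted_segments(baseline_mae, current_mae, tolerance=2.0):
    """Segments whose current MAE exceeds tolerance times their baseline MAE."""
    return sorted(
        segment for segment, mae in current_mae.items()
        if segment in baseline_mae and mae > tolerance * baseline_mae[segment]
    )


# Hypothetical baseline from the original hold-out evaluation.
baseline = {"commercial": 0.002, "self_pay": 0.004}

# Hypothetical accounts whose 90-day outcome is now known.
resolved = [
    {"payer": "commercial", "actual": 0.90, "predicted": 0.901},
    {"payer": "commercial", "actual": 0.75, "predicted": 0.747},
    {"payer": "self_pay", "actual": 0.30, "predicted": 0.32},
    {"payer": "self_pay", "actual": 0.50, "predicted": 0.49},
]

current = mae_by_segment(resolved, "payer")
needs_review = drifted_segments(baseline, current)
```

The same two functions work for facilities by passing a different `segment_key`, so one nightly job can watch both dimensions.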

Additional project repositories can be found on my GitHub profile (https://github.com/Thanuka9) and professional publications are documented on LinkedIn (https://www.linkedin.com/in/thanuka-ellepola-a559b01aa/).

