Probability-Calibrated Ensemble Methods for Automotive CRM Lead Scoring
Abstract
Accurately predicting sales conversion in automotive CRM systems is critical for optimizing marketing spend and sales team efficiency. This study presents a calibrated ensemble framework that combines XGBoost, Gradient Boosting, and Random Forest classifiers to predict lead conversion probability in automotive dealership operations. Using 62,859 real-world leads collected between July 2024 and July 2025, we developed a systematic pipeline comprising behavioral feature engineering, statistical feature selection, ensemble modeling, and probability calibration via Platt scaling. The calibrated ensemble achieved an AUC of 0.841, a Brier score of 0.146, and a 19% improvement in top-decile precision over a baseline logistic regression model. The framework segments leads into four priority tiers, directly supporting sales resource allocation and marketing campaign optimization. The results confirm that probability calibration is essential for automotive CRM applications in which predicted scores inform operational decisions.
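The calibration and evaluation steps named in the abstract can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation: the data are synthetic, the three "base models" are simulated as overconfident noisy log-odds rather than real XGBoost, Gradient Boosting, and Random Forest outputs, averaging log-odds is an assumed way of combining them, and the tier labels and cutoffs in `assign_tier` are hypothetical. The sketch fits Platt scaling, p = sigmoid(a * margin + b), on a calibration split and then reports the Brier score and top-decile precision on a held-out split.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_platt(margins, labels, iters=25):
    """Fit p = sigmoid(a * margin + b) by Newton's method on the log loss."""
    a, b = 0.0, 0.0
    for _ in range(iters):
        g_a = g_b = h_aa = h_ab = h_bb = 0.0
        for m, y in zip(margins, labels):
            p = sigmoid(a * m + b)
            w = p * (1.0 - p)
            g_a += (p - y) * m
            g_b += p - y
            h_aa += w * m * m
            h_ab += w * m
            h_bb += w
        h_aa += 1e-9  # tiny ridge keeps the 2x2 Hessian invertible
        h_bb += 1e-9
        det = h_aa * h_bb - h_ab * h_ab
        a -= (h_bb * g_a - h_ab * g_b) / det
        b -= (h_aa * g_b - h_ab * g_a) / det
    return a, b


def brier_score(probs, labels):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for p, y in zip(probs, labels)) / len(labels)


def top_decile_precision(probs, labels):
    """Conversion rate among the 10% of leads with the highest scores."""
    k = max(1, len(probs) // 10)
    ranked = sorted(zip(probs, labels), key=lambda t: t[0], reverse=True)
    return sum(y for _, y in ranked[:k]) / k


def assign_tier(p, cutoffs=(0.6, 0.4, 0.2)):
    """Map a calibrated probability to one of four tiers (hypothetical cutoffs)."""
    for tier, cut in zip("ABC", cutoffs):
        if p >= cut:
            return tier
    return "D"


def simulate(n, rng):
    """Synthetic leads: true log-odds z, plus three overconfident model margins."""
    margins, labels = [], []
    for _ in range(n):
        z = 1.5 * rng.gauss(0.0, 1.0) - 1.0
        labels.append(1 if rng.random() < sigmoid(z) else 0)
        base = [3.0 * z + rng.gauss(0.0, 0.5) for _ in range(3)]
        margins.append(sum(base) / 3.0)  # assumed ensemble rule: mean log-odds
    return margins, labels


def run_demo(seed=0):
    rng = random.Random(seed)
    cal_m, cal_y = simulate(1000, rng)    # calibration split
    test_m, test_y = simulate(1000, rng)  # held-out evaluation split
    a, b = fit_platt(cal_m, cal_y)
    raw = [sigmoid(m) for m in test_m]
    cal = [sigmoid(a * m + b) for m in test_m]
    return {
        "a": a,
        "b": b,
        "base_rate": sum(test_y) / len(test_y),
        "brier_raw": brier_score(raw, test_y),
        "brier_calibrated": brier_score(cal, test_y),
        "top_decile_precision": top_decile_precision(cal, test_y),
        "tiers": [assign_tier(p) for p in cal],
    }


if __name__ == "__main__":
    result = run_demo()
    for key in ("a", "b", "brier_raw", "brier_calibrated", "top_decile_precision"):
        print(key, round(result[key], 4))
```

Because Platt scaling with a positive slope is a monotone transform of the score, it leaves the ranking of leads (and hence AUC and top-decile precision) essentially unchanged; what it improves is the Brier score and the meaning of the probability thresholds used for tiering.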
Bibliographic Info
Sedef, B., Bayracı, S., & Bilgin, T. T. (2025). Probability-Calibrated Ensemble Methods for Automotive CRM Lead Scoring. *The European Journal of Research and Development*, 5(1), 502–525. https://doi.org/10.56038/ejrnd.v5i1.717