Natural Language Processing Based Poet Recognition with Supervised Learning on Turkish Poetry Dataset
Abstract
Studies based on natural language processing have become popular in recent years, and the number of studies on Turkish is increasing. Author classification is the problem of determining whether an anonymous text belongs to one of a set of well-known authors. The problem is motivated by the idea that each author's work reflects basic features of the author's intellectual vocabulary, so it should be possible to distinguish between authors. In this study, 50 poems by 5 different poets from Turkish literature were collected to form a dataset. Experiments were performed on the dataset using 9 different classifiers. This is a preliminary study that will serve as a basis for future work.
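The abstract does not name the nine classifiers or the features used, so the following is only a minimal sketch of the general idea that vocabulary can separate authors. It assumes a bag-of-words representation and a multinomial Naive Bayes classifier with Laplace smoothing, one common choice for text classification. The labels `poet_a` and `poet_b`, the toy English token strings, and the function names are hypothetical placeholders, not data or code from the study.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Lowercase whitespace tokenization. A real Turkish pipeline would
    # likely need morphological analysis or stemming, omitted here.
    return text.lower().split()


def train_naive_bayes(samples):
    """Fit word counts per label from a list of (text, label) pairs."""
    word_counts = defaultdict(Counter)  # label -> word frequencies
    doc_counts = Counter()              # label -> number of documents
    vocab = set()
    for text, label in samples:
        tokens = tokenize(text)
        word_counts[label].update(tokens)
        doc_counts[label] += 1
        vocab.update(tokens)
    return {"word_counts": word_counts, "doc_counts": doc_counts, "vocab": vocab}


def predict(model, text):
    """Return the label with the highest log-posterior for the text."""
    total_docs = sum(model["doc_counts"].values())
    vocab_size = len(model["vocab"])
    tokens = tokenize(text)
    best_label, best_score = None, float("-inf")
    for label, n_docs in model["doc_counts"].items():
        counts = model["word_counts"][label]
        total_words = sum(counts.values())
        score = math.log(n_docs / total_docs)  # log prior
        for token in tokens:
            # Laplace (add-one) smoothing so unseen words do not zero out a class.
            score += math.log((counts[token] + 1) / (total_words + vocab_size))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy data standing in for poems; not taken from the dataset.
toy_samples = [
    ("sea wind sea gull", "poet_a"),
    ("sea shore wind", "poet_a"),
    ("mountain snow road", "poet_b"),
    ("road mountain village", "poet_b"),
]
model = train_naive_bayes(toy_samples)
predicted = predict(model, "wind over the sea")
```

On this toy data the unseen text shares vocabulary only with the first label, so it is assigned to `poet_a`. With only 50 poems, any real evaluation would presumably need cross-validation to give stable accuracy estimates.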
Korkmaz, S., Köylü, F. (2024). Natural Language Processing Based Poet Recognition with Supervised Learning on Turkish Poetry Dataset. *Orclever Proceedings of Research and Development*, 4(1), 115-122. https://doi.org/10.56038/oprd.v4i1.470