Showing 1–5 of 5 results for author: Leliwa, G

Searching in archive cs.
  1. arXiv:2206.01949

    cs.CL cs.AI cs.LG

    Exploring the Potential of Feature Density in Estimating Machine Learning Classifier Performance with Application to Cyberbullying Detection

    Authors: Juuso Eronen, Michal Ptaszynski, Fumito Masui, Gniewosz Leliwa, Michal Wroczynski

    Abstract: In this research, we analyze the potential of Feature Density (FD) as a way to comparatively estimate machine learning (ML) classifier performance prior to training. The goal of the study is to aid in solving the problem of resource-intensive training of ML models, which is becoming a serious issue due to continuously increasing dataset sizes and the ever-rising popularity of Deep Neural Networks (…

    Submitted 4 June, 2022; originally announced June 2022.

    Comments: arXiv admin note: substantial text overlap with arXiv:2111.01689

    Journal ref: The 7th Workshop on Linguistic and Cognitive Approaches to Dialog Agents (LaCATODA 2021) collocated with IJCAI 2021, August 21-26, 2021, Montreal, Canada. CEUR Workshop Proceedings 2935, 5-14

  2. arXiv:2206.01889

    cs.CL cs.AI cs.LG

    Initial Study into Application of Feature Density and Linguistically-backed Embedding to Improve Machine Learning-based Cyberbullying Detection

    Authors: Juuso Eronen, Michal Ptaszynski, Fumito Masui, Gniewosz Leliwa, Michal Wroczynski, Mateusz Piech, Aleksander Smywinski-Pohl

    Abstract: In this research, we study the change in the performance of machine learning (ML) classifiers when various linguistic preprocessing methods of a dataset were used, with the specific focus on linguistically-backed embeddings in Convolutional Neural Networks (CNN). Moreover, we study the concept of Feature Density and confirm its potential to comparatively predict the performance of ML classifiers,…

    Submitted 3 June, 2022; originally announced June 2022.

    Journal ref: Proceedings of The 6th Linguistic and Cognitive Approaches to Dialog Agents (LaCATODA 2020) IJCAI 2020 Workshop, Yokohama, Japan, January 2020

  3. arXiv:2206.00962

    cs.CL cs.AI cs.LG

    Transfer Language Selection for Zero-Shot Cross-Lingual Abusive Language Detection

    Authors: Juuso Eronen, Michal Ptaszynski, Fumito Masui, Masaki Arata, Gniewosz Leliwa, Michal Wroczynski

    Abstract: We study the selection of transfer languages for automatic abusive language detection. Instead of preparing a dataset for every language, we demonstrate the effectiveness of cross-lingual transfer learning for zero-shot abusive language detection. This way we can use existing data from higher-resource languages to build better detection systems for low-resource languages. Our datasets are from sev…

    Submitted 2 June, 2022; originally announced June 2022.

    Journal ref: Information Processing & Management, Volume 59, Issue 4, July 2022, paper ID: 102981

  4. arXiv:2111.01689

    cs.CL cs.AI cs.CY

    Improving Classifier Training Efficiency for Automatic Cyberbullying Detection with Feature Density

    Authors: Juuso Eronen, Michal Ptaszynski, Fumito Masui, Aleksander Smywiński-Pohl, Gniewosz Leliwa, Michal Wroczynski

    Abstract: We study the effectiveness of Feature Density (FD) using different linguistically-backed feature preprocessing methods in order to estimate dataset complexity, which in turn is used to comparatively estimate the potential performance of machine learning (ML) classifiers prior to any training. We hypothesise that estimating dataset complexity allows for the reduction of the number of required exper…

    Submitted 2 November, 2021; v1 submitted 2 November, 2021; originally announced November 2021.

    Comments: 73 pages, 4 figures, 19 tables

    Journal ref: Information Processing and Management, Vol. 58, Issue 5, September 2021, paper ID: 102616

  5. arXiv:1808.00926

    cs.CL

    Cyberbullying Detection -- Technical Report 2/2018, Department of Computer Science AGH, University of Science and Technology

    Authors: Michał Ptaszyński, Gniewosz Leliwa, Mateusz Piech, Aleksander Smywiński-Pohl

    Abstract: The research described in this paper concerns automatic cyberbullying detection in social media. There are two goals to achieve: building a gold standard cyberbullying detection dataset and measuring the performance of the Samurai cyberbullying detection system. The Formspring dataset provided in a Kaggle competition was re-annotated as a part of the research. The annotation procedure is described…

    Submitted 2 August, 2018; originally announced August 2018.

    Report number: 2/2018 CS AGH