Showing 1–6 of 6 results for author: Farra, N

Searching in archive cs.
  1. arXiv:2404.14397  [pdf, other]

    cs.CL cs.CY cs.LG

    RTP-LX: Can LLMs Evaluate Toxicity in Multilingual Scenarios?

    Authors: Adrian de Wynter, Ishaan Watts, Nektar Ege Altıntoprak, Tua Wongsangaroonsri, Minghui Zhang, Noura Farra, Lena Baur, Samantha Claudet, Pavel Gajdusek, Can Gören, Qilong Gu, Anna Kaminska, Tomasz Kaminski, Ruby Kuo, Akiko Kyuba, Jongho Lee, Kartik Mathur, Petter Merok, Ivana Milovanović, Nani Paananen, Vesa-Matti Paananen, Anna Pavlenko, Bruno Pereira Vidal, Luciano Strika, Yueh Tsao, et al. (8 additional authors not shown)

    Abstract: Large language models (LLMs) and small language models (SLMs) are being adopted at remarkable speed, although their safety still remains a serious concern. With the advent of multilingual S/LLMs, the question now becomes a matter of scale: can we expand multilingual safety evaluations of these models with the same velocity at which they are deployed? To this end we introduce RTP-LX, a human-transc…

    Submitted 22 April, 2024; originally announced April 2024.

    Comments: Work in progress

  2. arXiv:1912.00741  [pdf, ps, other]

    cs.CL cs.IR cs.LG

    SemEval-2017 Task 4: Sentiment Analysis in Twitter

    Authors: Sara Rosenthal, Noura Farra, Preslav Nakov

    Abstract: This paper describes the fifth year of the Sentiment Analysis in Twitter task. SemEval-2017 Task 4 continues with a rerun of the subtasks of SemEval-2016 Task 4, which include identifying the overall sentiment of the tweet, sentiment towards a topic with classification on a two-point and on a five-point ordinal scale, and quantification of the distribution of sentiment towards a topic across a num…

    Submitted 2 December, 2019; originally announced December 2019.

    Comments: sentiment analysis, Twitter, classification, quantification, ranking, English, Arabic

    Report number: SemEval-2017; MSC Class: 68T50; ACM Class: I.2.7

  3. arXiv:1903.08983  [pdf, other]

    cs.CL

    SemEval-2019 Task 6: Identifying and Categorizing Offensive Language in Social Media (OffensEval)

    Authors: Marcos Zampieri, Shervin Malmasi, Preslav Nakov, Sara Rosenthal, Noura Farra, Ritesh Kumar

    Abstract: We present the results and the main findings of SemEval-2019 Task 6 on Identifying and Categorizing Offensive Language in Social Media (OffensEval). The task was based on a new dataset, the Offensive Language Identification Dataset (OLID), which contains over 14,000 English tweets. It featured three sub-tasks. In sub-task A, the goal was to discriminate between offensive and non-offensive posts. I…

    Submitted 26 April, 2019; v1 submitted 19 March, 2019; originally announced March 2019.

    Comments: Proceedings of the International Workshop on Semantic Evaluation (SemEval)

  4. arXiv:1902.09666  [pdf, ps, other]

    cs.CL

    Predicting the Type and Target of Offensive Posts in Social Media

    Authors: Marcos Zampieri, Shervin Malmasi, Preslav Nakov, Sara Rosenthal, Noura Farra, Ritesh Kumar

    Abstract: As offensive content has become pervasive in social media, there has been much research in identifying potentially offensive messages. However, previous work on this topic did not consider the problem as a whole, but rather focused on detecting very specific types of offensive content, e.g., hate speech, cyberbullying, or cyber-aggression. In contrast, here we target several different kinds of offe…

    Submitted 16 April, 2019; v1 submitted 25 February, 2019; originally announced February 2019.

    Comments: Proceedings of the 2019 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL)

  5. arXiv:1902.08899  [pdf, other]

    cs.CL

    The ARIEL-CMU Systems for LoReHLT18

    Authors: Aditi Chaudhary, Siddharth Dalmia, Junjie Hu, Xinjian Li, Austin Matthews, Aldrian Obaja Muis, Naoki Otani, Shruti Rijhwani, Zaid Sheikh, Nidhi Vyas, Xinyi Wang, Jiateng Xie, Ruochen Xu, Chunting Zhou, Peter J. Jansen, Yiming Yang, Lori Levin, Florian Metze, Teruko Mitamura, David R. Mortensen, Graham Neubig, Eduard Hovy, Alan W Black, Jaime Carbonell, Graham V. Horwood, et al. (5 additional authors not shown)

    Abstract: This paper describes the ARIEL-CMU submissions to the Low Resource Human Language Technologies (LoReHLT) 2018 evaluations for the tasks Machine Translation (MT), Entity Discovery and Linking (EDL), and detection of Situation Frames in Text and Speech (SF Text and Speech).

    Submitted 24 February, 2019; originally announced February 2019.

  6. arXiv:1701.03434  [pdf, other]

    cs.CL

    SMARTies: Sentiment Models for Arabic Target Entities

    Authors: Noura Farra, Kathleen McKeown

    Abstract: We consider entity-level sentiment analysis in Arabic, a morphologically rich language with increasing resources. We present a system that is applied to complex posts written in response to Arabic newspaper articles. Our goal is to identify important entity "targets" within the post along with the polarity expressed about each target. We achieve significant improvements over multiple baselines, de…

    Submitted 12 January, 2017; originally announced January 2017.

    Comments: To be published in Proceedings of the European Chapter of the Association for Computational Linguistics (EACL 2017)