
Showing 1–50 of 110 results for author: Søgaard, A

  1. arXiv:2407.13419  [pdf, other]

    cs.CL cs.AI cs.LG cs.SC

    From Words to Worlds: Compositionality for Cognitive Architectures

    Authors: Ruchira Dhar, Anders Søgaard

    Abstract: Large language models (LLMs) are very performant connectionist systems, but do they exhibit more compositionality? More importantly, is that part of why they perform so well? We present empirical analyses across four LLM families (12 models) and three task categories, including a novel task introduced below. Our findings reveal a nuanced relationship in learning of compositional strategies by LLMs… ▽ More

    Submitted 18 July, 2024; originally announced July 2024.

    Comments: Accepted to ICML 2024 Workshop on LLMs & Cognition

  2. arXiv:2407.06177  [pdf, other]

    cs.CV cs.AI cs.CL cs.CY

    Vision-Language Models under Cultural and Inclusive Considerations

    Authors: Antonia Karamolegkou, Phillip Rust, Yong Cao, Ruixiang Cui, Anders Søgaard, Daniel Hershcovich

    Abstract: Large vision-language models (VLMs) can assist visually impaired people by describing images from their daily lives. Current evaluation datasets may not reflect diverse cultural user backgrounds or the situational context of this use case. To address this problem, we create a survey to determine caption preferences and propose a culture-centric evaluation benchmark by filtering VizWiz, an existing… ▽ More

    Submitted 8 July, 2024; originally announced July 2024.

    Comments: HuCLLM @ ACL 2024

  3. arXiv:2406.11030  [pdf, other]

    cs.CL

    FoodieQA: A Multimodal Dataset for Fine-Grained Understanding of Chinese Food Culture

    Authors: Wenyan Li, Xinyu Zhang, Jiaang Li, Qiwei Peng, Raphael Tang, Li Zhou, Weijia Zhang, Guimin Hu, Yifei Yuan, Anders Søgaard, Daniel Hershcovich, Desmond Elliott

    Abstract: Food is a rich and varied dimension of cultural heritage, crucial to both individuals and social groups. To bridge the gap in the literature on the often-overlooked regional diversity in this domain, we introduce FoodieQA, a manually curated, fine-grained image-text dataset capturing the intricate features of food cultures across various regions in China. We evaluate vision-language models (VLMs)… ▽ More

    Submitted 16 June, 2024; originally announced June 2024.

  4. arXiv:2404.15206  [pdf, other]

    cs.CL

    Does Instruction Tuning Make LLMs More Consistent?

    Authors: Constanza Fierro, Jiaang Li, Anders Søgaard

    Abstract: The purpose of instruction tuning is enabling zero-shot performance, but instruction tuning has also been shown to improve chain-of-thought reasoning and value alignment (Si et al., 2023). Here we consider the impact on consistency, i.e., the sensitivity of language models to small perturbations in the input. We compare 10 instruction-tuned LLaMA models to the original LLaMA-7b model an… ▽ More

    Submitted 30 April, 2024; v1 submitted 23 April, 2024; originally announced April 2024.

  5. arXiv:2404.03036  [pdf, other]

    cs.CL

    MuLan: A Study of Fact Mutability in Language Models

    Authors: Constanza Fierro, Nicolas Garneau, Emanuele Bugliarello, Yova Kementchedjhieva, Anders Søgaard

    Abstract: Facts are subject to contingencies and can be true or false in different circumstances. One such contingency is time, wherein some facts mutate over a given period, e.g., the president of a country or the winner of a championship. Trustworthy language models ideally identify mutable facts as such and process them accordingly. We create MuLan, a benchmark for evaluating the ability of English langu… ▽ More

    Submitted 3 April, 2024; originally announced April 2024.

  6. arXiv:2403.15250  [pdf, other]

    cs.CL cs.AI cs.LG

    Comprehensive Reassessment of Large-Scale Evaluation Outcomes in LLMs: A Multifaceted Statistical Approach

    Authors: Kun Sun, Rong Wang, Anders Søgaard

    Abstract: Amidst the rapid evolution of LLMs, the significance of evaluation in comprehending and propelling these models forward is increasingly paramount. Evaluations have revealed that factors such as scaling, training types, and architectures profoundly impact the performance of LLMs. However, the extent and nature of these impacts continue to be subjects of debate because most assessments… ▽ More

    Submitted 24 June, 2024; v1 submitted 22 March, 2024; originally announced March 2024.

  7. arXiv:2403.00876  [pdf, other]

    cs.CL cs.AI

    Word Order and World Knowledge

    Authors: Qinghua Zhao, Vinit Ravishankar, Nicolas Garneau, Anders Søgaard

    Abstract: Word order is an important concept in natural language, and in this work, we study how word order affects the induction of world knowledge from raw text using language models. We use word analogies to probe for such knowledge. Specifically, in addition to the natural word order, we first respectively extract texts of six fixed word orders from five languages and then pretrain the language models o… ▽ More

    Submitted 1 March, 2024; originally announced March 2024.

  8. arXiv:2402.19133  [pdf, other]

    cs.CL

    Evaluating Webcam-based Gaze Data as an Alternative for Human Rationale Annotations

    Authors: Stephanie Brandl, Oliver Eberle, Tiago Ribeiro, Anders Søgaard, Nora Hollenstein

    Abstract: Rationales in the form of manually annotated input spans usually serve as ground truth when evaluating explainability methods in NLP. They are, however, time-consuming and often biased by the annotation process. In this paper, we debate whether human gaze, in the form of webcam-based eye-tracking recordings, poses a valid alternative when evaluating importance scores. We evaluate the additional in… ▽ More

    Submitted 29 February, 2024; originally announced February 2024.

    Comments: Accepted to LREC-COLING 2024

  9. arXiv:2310.19567  [pdf, other]

    cs.CL cs.AI

    CreoleVal: Multilingual Multitask Benchmarks for Creoles

    Authors: Heather Lent, Kushal Tatariya, Raj Dabre, Yiyi Chen, Marcell Fekete, Esther Ploeger, Li Zhou, Ruth-Ann Armstrong, Abee Eijansantos, Catriona Malau, Hans Erik Heje, Ernests Lavrinovics, Diptesh Kanojia, Paul Belony, Marcel Bollmann, Loïc Grobol, Miryam de Lhoneux, Daniel Hershcovich, Michel DeGraff, Anders Søgaard, Johannes Bjerva

    Abstract: Creoles represent an under-explored and marginalized group of languages, with few available resources for NLP research. While the genealogical ties between Creoles and a number of highly-resourced languages imply a significant potential for transfer learning, this potential is hampered due to this lack of annotated data. In this work we present CreoleVal, a collection of benchmark datasets spanning… ▽ More

    Submitted 6 May, 2024; v1 submitted 30 October, 2023; originally announced October 2023.

    Comments: Accepted to TACL

  10. arXiv:2310.13771  [pdf, other]

    cs.CL cs.AI

    Copyright Violations and Large Language Models

    Authors: Antonia Karamolegkou, Jiaang Li, Li Zhou, Anders Søgaard

    Abstract: Language models may memorize more than just facts, including entire chunks of texts seen during training. Fair use exemptions to copyright laws typically allow for limited use of copyrighted material without permission from the copyright holder, but typically for extraction of information from copyrighted materials, rather than verbatim reproduction. This work explores the issue of copyright… ▽ More

    Submitted 20 October, 2023; originally announced October 2023.

    Comments: EMNLP 2023

  11. arXiv:2308.15047  [pdf, other]

    cs.LG cs.CL

    Large language models converge toward human-like concept organization

    Authors: Mathias Lykke Gammelgaard, Jonathan Gabel Christiansen, Anders Søgaard

    Abstract: Large language models show human-like performance in knowledge extraction, reasoning and dialogue, but it remains controversial whether this performance is best explained by memorization and pattern matching, or whether it reflects human-like inferential semantics and world knowledge. Knowledge bases such as WikiData provide large-scale, high-quality representations of inferential semantics and wo… ▽ More

    Submitted 29 August, 2023; originally announced August 2023.

  12. arXiv:2308.08774  [pdf, other]

    cs.CL cs.AI cs.CR cs.LG

    Differential Privacy, Linguistic Fairness, and Training Data Influence: Impossibility and Possibility Theorems for Multilingual Language Models

    Authors: Phillip Rust, Anders Søgaard

    Abstract: Language models such as mBERT, XLM-R, and BLOOM aim to achieve multilingual generalization or compression to facilitate transfer to a large number of (potentially unseen) languages. However, these models should ideally also be private, linguistically fair, and transparent, by relating their predictions to training data. Can these requirements be simultaneously satisfied? We show that multilingual… ▽ More

    Submitted 17 August, 2023; originally announced August 2023.

    Comments: ICML 2023

  13. arXiv:2307.04427  [pdf, other]

    astro-ph.HE astro-ph.GA cs.LG

    Observation of high-energy neutrinos from the Galactic plane

    Authors: R. Abbasi, M. Ackermann, J. Adams, J. A. Aguilar, M. Ahlers, M. Ahrens, J. M. Alameddine, A. A. Alves Jr., N. M. Amin, K. Andeen, T. Anderson, G. Anton, C. Argüelles, Y. Ashida, S. Athanasiadou, S. Axani, X. Bai, A. Balagopal V., S. W. Barwick, V. Basu, S. Baur, R. Bay, J. J. Beatty, K.-H. Becker, J. Becker Tjus, et al. (364 additional authors not shown)

    Abstract: The origin of high-energy cosmic rays, atomic nuclei that continuously impact Earth's atmosphere, has been a mystery for over a century. Due to deflection in interstellar magnetic fields, cosmic rays from the Milky Way arrive at Earth from random directions. However, near their sources and during propagation, cosmic rays interact with matter and produce high-energy neutrinos. We search for neutrin… ▽ More

    Submitted 10 July, 2023; originally announced July 2023.

    Comments: Submitted on May 12th, 2022; Accepted on May 4th, 2023

    Journal ref: Science 380, 6652, 1338-1343 (2023)

  14. arXiv:2306.05126  [pdf, other]

    cs.CL

    Mapping Brains with Language Models: A Survey

    Authors: Antonia Karamolegkou, Mostafa Abdou, Anders Søgaard

    Abstract: Over the years, many researchers have seemingly made the same observation: Brain and language model activations exhibit some structural similarities, enabling linear partial mappings between features extracted from neural recordings and computational language models. In an attempt to evaluate how much evidence has been accumulated for this observation, we survey over 30 studies spanning 10 dataset… ▽ More

    Submitted 8 June, 2023; originally announced June 2023.

  15. arXiv:2306.01930  [pdf, other]

    cs.CL cs.AI

    Structural Similarities Between Language Models and Neural Response Measurements

    Authors: Jiaang Li, Antonia Karamolegkou, Yova Kementchedjhieva, Mostafa Abdou, Sune Lehmann, Anders Søgaard

    Abstract: Large language models (LLMs) have complicated internal dynamics, but induce representations of words and phrases whose geometry we can study. Human language processing is also opaque, but neural response measurements can provide (noisy) recordings of activation during listening or reading, from which we can extract similar representations of words and phrases. Here we study the extent to which the… ▽ More

    Submitted 31 October, 2023; v1 submitted 2 June, 2023; originally announced June 2023.

    Comments: NeurReps@NeurIPS 2023

  16. arXiv:2306.00639  [pdf, other]

    cs.CL cs.HC

    Being Right for Whose Right Reasons?

    Authors: Terne Sasha Thorn Jakobsen, Laura Cabello, Anders Søgaard

    Abstract: Explainability methods are used to benchmark the extent to which model predictions align with human rationales, i.e., are 'right for the right reasons'. Previous work has failed to acknowledge, however, that what counts as a rationale is sometimes subjective. This paper presents what we think is a first of its kind, a collection of human rationale annotations augmented with the annotators demograph… ▽ More

    Submitted 13 October, 2023; v1 submitted 1 June, 2023; originally announced June 2023.

    Comments: In Proceedings of ACL 2023

  17. arXiv:2305.19597  [pdf, other]

    cs.CL cs.AI

    What does the Failure to Reason with "Respectively" in Zero/Few-Shot Settings Tell Us about Language Models?

    Authors: Ruixiang Cui, Seolhwa Lee, Daniel Hershcovich, Anders Søgaard

    Abstract: Humans can effortlessly understand the coordinate structure of sentences such as "Niels Bohr and Kurt Cobain were born in Copenhagen and Seattle, respectively". In the context of natural language inference (NLI), we examine how language models (LMs) reason with respective readings (Gawron and Kehler, 2004) from two perspectives: syntactic-semantic and commonsense-world knowledge. We propose a cont… ▽ More

    Submitted 31 May, 2023; originally announced May 2023.

    Comments: To appear at ACL 2023

  18. Private Meeting Summarization Without Performance Loss

    Authors: Seolhwa Lee, Anders Søgaard

    Abstract: Meeting summarization has an enormous business potential, but in addition to being a hard problem, roll-out is challenged by privacy concerns. We explore the problem of meeting summarization under differential privacy constraints and find, to our surprise, that while differential privacy leads to slightly lower performance on in-sample data, differential privacy improves performance when evaluated… ▽ More

    Submitted 25 May, 2023; originally announced May 2023.

    Comments: SIGIR23 Main conference

  19. arXiv:2305.07507  [pdf, other]

    cs.CL

    LeXFiles and LegalLAMA: Facilitating English Multinational Legal Language Model Development

    Authors: Ilias Chalkidis, Nicolas Garneau, Catalina Goanta, Daniel Martin Katz, Anders Søgaard

    Abstract: In this work, we conduct a detailed analysis on the performance of legal-oriented pre-trained language models (PLMs). We examine the interplay between their original objective, acquired knowledge, and legal language understanding capacities which we define as the upstream, probing, and downstream performance, respectively. We consider not only the models' size but also the pre-training corpora use… ▽ More

    Submitted 22 May, 2023; v1 submitted 12 May, 2023; originally announced May 2023.

    Comments: 9 pages, long paper at ACL 2023 proceedings

  20. arXiv:2304.10153  [pdf, other]

    cs.CL cs.HC

    On the Independence of Association Bias and Empirical Fairness in Language Models

    Authors: Laura Cabello, Anna Katrine Jørgensen, Anders Søgaard

    Abstract: The societal impact of pre-trained language models has prompted researchers to probe them for strong associations between protected attributes and value-loaded terms, from slur to prestigious job titles. Such work is said to probe models for bias or fairness -- or such probes 'into representational biases' are said to be 'motivated by fairness' -- suggesting an intimate connection between bias and fairn… ▽ More

    Submitted 20 April, 2023; originally announced April 2023.

    Comments: To be published in ACM FAccT 23

  21. arXiv:2303.17876  [pdf, other]

    cs.CL

    WebQAmGaze: A Multilingual Webcam Eye-Tracking-While-Reading Dataset

    Authors: Tiago Ribeiro, Stephanie Brandl, Anders Søgaard, Nora Hollenstein

    Abstract: We present WebQAmGaze, a multilingual low-cost eye-tracking-while-reading dataset, designed as the first webcam-based eye-tracking corpus of reading to support the development of explainable computational language processing models. WebQAmGaze includes webcam eye-tracking data from 600 participants of a wide age range naturally reading English, German, Spanish, and Turkish texts. Each participant… ▽ More

    Submitted 15 March, 2024; v1 submitted 31 March, 2023; originally announced March 2023.

  22. arXiv:2302.10086  [pdf, other]

    cs.CL

    A Two-Sided Discussion of Preregistration of NLP Research

    Authors: Anders Søgaard, Daniel Hershcovich, Miryam de Lhoneux

    Abstract: Van Miltenburg et al. (2021) suggest NLP research should adopt preregistration to prevent fishing expeditions and to promote publication of negative results. At face value, this is a very reasonable suggestion, seemingly solving many methodological problems with NLP research. We discuss pros and cons -- some old, some new: a) Preregistration is challenged by the practice of retrieving hypotheses a… ▽ More

    Submitted 20 February, 2023; originally announced February 2023.

    Comments: EACL 2023

  23. arXiv:2302.06555  [pdf, other]

    cs.CL cs.AI cs.CV cs.LG

    Do Vision and Language Models Share Concepts? A Vector Space Alignment Study

    Authors: Jiaang Li, Yova Kementchedjhieva, Constanza Fierro, Anders Søgaard

    Abstract: Large-scale pretrained language models (LMs) are said to "lack the ability to connect utterances to the world" (Bender and Koller, 2020), because they do not have "mental models of the world" (Mitchell and Krakauer, 2023). If so, one would expect LM representations to be unrelated to representations induced by vision models. We present an empirical evaluation across four families of LMs (BERT,… ▽ More

    Submitted 6 July, 2024; v1 submitted 13 February, 2023; originally announced February 2023.

    Comments: 12 pages, long paper accepted by TACL

  24. arXiv:2212.09255  [pdf, other]

    cs.CL

    Multi hash embeddings in spaCy

    Authors: Lester James Miranda, Ákos Kádár, Adriane Boyd, Sofie Van Landeghem, Anders Søgaard, Matthew Honnibal

    Abstract: The distributed representation of symbols is one of the key technologies in machine learning systems today, playing a pivotal role in modern natural language processing. Traditional word embeddings associate a separate vector with each word. While this approach is simple and leads to good performance, it requires a lot of memory for representing a large vocabulary. To reduce the memory footprint,… ▽ More

    Submitted 19 December, 2022; originally announced December 2022.

    ACM Class: I.2.7

  25. arXiv:2210.12194  [pdf, other]

    astro-ph.IM cs.LG hep-ex physics.data-an

    GraphNeT: Graph neural networks for neutrino telescope event reconstruction

    Authors: Andreas Søgaard, Rasmus F. Ørsøe, Leon Bozianu, Morten Holm, Kaare Endrup Iversen, Tim Guggenmos, Martin Ha Minh, Philipp Eller, Troels C. Petersen

    Abstract: GraphNeT is an open-source python framework aimed at providing high quality, user friendly, end-to-end functionality to perform reconstruction tasks at neutrino telescopes using graph neural networks (GNNs). GraphNeT makes it fast and easy to train complex models that can provide event reconstruction with state-of-the-art performance, for arbitrary detector configurations, with inference times tha… ▽ More

    Submitted 21 October, 2022; originally announced October 2022.

    Comments: 6 pages, 1 figure. Code can be found at https://github.com/graphnet-team/graphnet . Submitted to the Journal of Open Source Software (JOSS)

  26. arXiv:2210.05457  [pdf, other]

    cs.CL

    Are Pretrained Multilingual Models Equally Fair Across Languages?

    Authors: Laura Cabello Piqueras, Anders Søgaard

    Abstract: Pretrained multilingual language models can help bridge the digital language divide, enabling high-quality NLP models for lower resourced languages. Studies of multilingual models have so far focused on performance, consistency, and cross-lingual generalisation. However, with their wide-spread application in the wild and downstream societal impact, it is important to put multilingual models under… ▽ More

    Submitted 11 October, 2022; originally announced October 2022.

  27. arXiv:2209.03042  [pdf, other]

    hep-ex astro-ph.IM cs.LG physics.data-an physics.ins-det

    Graph Neural Networks for Low-Energy Event Classification & Reconstruction in IceCube

    Authors: R. Abbasi, M. Ackermann, J. Adams, N. Aggarwal, J. A. Aguilar, M. Ahlers, M. Ahrens, J. M. Alameddine, A. A. Alves Jr., N. M. Amin, K. Andeen, T. Anderson, G. Anton, C. Argüelles, Y. Ashida, S. Athanasiadou, S. Axani, X. Bai, A. Balagopal V., M. Baricevic, S. W. Barwick, V. Basu, R. Bay, J. J. Beatty, K.-H. Becker, et al. (359 additional authors not shown)

    Abstract: IceCube, a cubic-kilometer array of optical sensors built to detect atmospheric and astrophysical neutrinos between 1 GeV and 1 PeV, is deployed 1.45 km to 2.45 km below the surface of the ice sheet at the South Pole. The classification and reconstruction of events from the in-ice detectors play a central role in the analysis of data from IceCube. Reconstructing and classifying events is a challen… ▽ More

    Submitted 11 October, 2022; v1 submitted 7 September, 2022; originally announced September 2022.

    Comments: Prepared for submission to JINST

  28. arXiv:2206.09755  [pdf, other]

    cs.CL cs.LG

    Square One Bias in NLP: Towards a Multi-Dimensional Exploration of the Research Manifold

    Authors: Sebastian Ruder, Ivan Vulić, Anders Søgaard

    Abstract: The prototypical NLP experiment trains a standard architecture on labeled English data and optimizes for accuracy, without accounting for other dimensions such as fairness, interpretability, or computational efficiency. We show through a manual classification of recent NLP research papers that this is indeed the case and refer to it as the square one experimental setup. We observe that NLP researc… ▽ More

    Submitted 20 June, 2022; originally announced June 2022.

    Comments: Findings of ACL 2022

  29. arXiv:2206.04371  [pdf, other]

    cs.CL

    Ancestor-to-Creole Transfer is Not a Walk in the Park

    Authors: Heather Lent, Emanuele Bugliarello, Anders Søgaard

    Abstract: We aim to learn language models for Creole languages for which large volumes of data are not readily available, and therefore explore the potential transfer from ancestor languages (the 'Ancestry Transfer Hypothesis'). We find that standard transfer methods do not facilitate ancestry transfer. Surprisingly, different from other non-Creole languages, a very distinct two-phase pattern emerges for Cr… ▽ More

    Submitted 9 June, 2022; originally announced June 2022.

    Comments: Workshop on Insights from Negative Results in NLP 2022

  30. arXiv:2206.02661  [pdf, other]

    cs.CL

    Evaluating Deep Taylor Decomposition for Reliability Assessment in the Wild

    Authors: Stephanie Brandl, Daniel Hershcovich, Anders Søgaard

    Abstract: We argue that we need to evaluate model interpretability methods 'in the wild', i.e., in situations where professionals make critical decisions, and models can potentially assist them. We present an in-the-wild evaluation of token attribution based on Deep Taylor Decomposition, with professional journalists performing reliability assessments. We find that using this method in conjunction with RoBE… ▽ More

    Submitted 3 May, 2022; originally announced June 2022.

    Comments: ICWSM 2022

  31. arXiv:2206.00437  [pdf, other]

    cs.CL cs.CY

    What a Creole Wants, What a Creole Needs

    Authors: Heather Lent, Kelechi Ogueji, Miryam de Lhoneux, Orevaoghene Ahia, Anders Søgaard

    Abstract: In recent years, the natural language processing (NLP) community has given increased attention to the disparity of efforts directed towards high-resource languages over low-resource ones. Efforts to remedy this delta often begin with translations of existing English datasets into other languages. However, this approach ignores that different language communities have different needs. We consider a… ▽ More

    Submitted 1 June, 2022; originally announced June 2022.

    Comments: LREC 2022

  32. arXiv:2205.10226  [pdf, other]

    cs.CL cs.LG

    Do Transformer Models Show Similar Attention Patterns to Task-Specific Human Gaze?

    Authors: Stephanie Brandl, Oliver Eberle, Jonas Pilot, Anders Søgaard

    Abstract: Learned self-attention functions in state-of-the-art NLP models often correlate with human attention. We investigate whether self-attention in large-scale pre-trained language models is as predictive of human eye fixation patterns during task-reading as classical cognitive models of human attention. We compare attention functions across two task-specific reading datasets for sentiment analysis and… ▽ More

    Submitted 25 April, 2022; originally announced May 2022.

    Comments: Accepted to ACL 2022

  33. arXiv:2205.03075  [pdf, other]

    cs.CV cs.CL

    QLEVR: A Diagnostic Dataset for Quantificational Language and Elementary Visual Reasoning

    Authors: Zechen Li, Anders Søgaard

    Abstract: Synthetic datasets have successfully been used to probe visual question-answering datasets for their reasoning abilities. CLEVR (Johnson et al., 2017), for example, tests a range of visual reasoning abilities. The questions in CLEVR focus on comparisons of shapes, colors, and sizes, numerical reasoning, and existence claims. This paper introduces a minimally biased, diagnostic visual question-answerin… ▽ More

    Submitted 6 May, 2022; originally announced May 2022.

    Comments: To appear at Findings of NAACL 2022

  34. arXiv:2204.10615  [pdf, other]

    cs.CL cs.LO

    Generalized Quantifiers as a Source of Error in Multilingual NLU Benchmarks

    Authors: Ruixiang Cui, Daniel Hershcovich, Anders Søgaard

    Abstract: Logical approaches to representing language have developed and evaluated computational models of quantifier words since the 19th century, but today's NLU models still struggle to capture their semantics. We rely on Generalized Quantifier Theory for language-independent representations of the semantics of quantifier words, to quantify their contribution to the errors of NLU models. We find that qua… ▽ More

    Submitted 20 May, 2022; v1 submitted 22 April, 2022; originally announced April 2022.

    Comments: To appear at NAACL 2022

  35. arXiv:2204.10281  [pdf, other]

    cs.CL

    How Conservative are Language Models? Adapting to the Introduction of Gender-Neutral Pronouns

    Authors: Stephanie Brandl, Ruixiang Cui, Anders Søgaard

    Abstract: Gender-neutral pronouns have recently been introduced in many languages to a) include non-binary people and b) as a generic singular. Recent results from psycholinguistics suggest that gender-neutral pronouns (in Swedish) are not associated with human processing difficulties. This, we show, is in sharp contrast with automated processing. We show that gender-neutral pronouns in Danish, English, and… ▽ More

    Submitted 3 May, 2022; v1 submitted 11 April, 2022; originally announced April 2022.

    Comments: To appear at NAACL 2022

  36. Factual Consistency of Multilingual Pretrained Language Models

    Authors: Constanza Fierro, Anders Søgaard

    Abstract: Pretrained language models can be queried for factual knowledge, with potential applications in knowledge base acquisition and tasks that require inference. However, for that, we need to know how reliable this knowledge is, and recent work has shown that monolingual English language models lack consistency when predicting factual knowledge, that is, they fill-in-the-blank differently for paraphras… ▽ More

    Submitted 22 March, 2022; originally announced March 2022.

    Journal ref: Findings of the Association for Computational Linguistics: ACL 2022, pages 3046-3052, Dublin, Ireland. Association for Computational Linguistics

  37. arXiv:2203.10995  [pdf, other]

    cs.CL

    Word Order Does Matter (And Shuffled Language Models Know It)

    Authors: Vinit Ravishankar, Mostafa Abdou, Artur Kulmizev, Anders Søgaard

    Abstract: Recent studies have shown that language models pretrained and/or fine-tuned on randomly permuted sentences exhibit competitive performance on GLUE, putting into question the importance of word order information. Somewhat counter-intuitively, some of these studies also report that position embeddings appear to be crucial for models' good performance with shuffled text. We probe these language model… ▽ More

    Submitted 21 March, 2022; originally announced March 2022.

    Comments: To appear at ACL 2022; 9 pages

  38. arXiv:2203.10020  [pdf, other]

    cs.CL

    Challenges and Strategies in Cross-Cultural NLP

    Authors: Daniel Hershcovich, Stella Frank, Heather Lent, Miryam de Lhoneux, Mostafa Abdou, Stephanie Brandl, Emanuele Bugliarello, Laura Cabello Piqueras, Ilias Chalkidis, Ruixiang Cui, Constanza Fierro, Katerina Margatina, Phillip Rust, Anders Søgaard

    Abstract: Various efforts in the Natural Language Processing (NLP) community have been made to accommodate linguistic diversity and serve speakers of many different languages. However, it is important to acknowledge that speakers and the content they produce and require, vary not just by language, but also by culture. Although language and culture are tightly linked, there are important differences. Analogo… ▽ More

    Submitted 18 March, 2022; originally announced March 2022.

    Comments: ACL 2022 - Theme track

  39. arXiv:2203.08555  [pdf, other]

    cs.CL

    Zero-Shot Dependency Parsing with Worst-Case Aware Automated Curriculum Learning

    Authors: Miryam de Lhoneux, Sheng Zhang, Anders Søgaard

    Abstract: Large multilingual pretrained language models such as mBERT and XLM-RoBERTa have been found to be surprisingly effective for cross-lingual transfer of syntactic parsing models (Wu and Dredze 2019), but only between related languages. However, source and training languages are rarely related, when parsing truly low-resource languages. To close this gap, we adopt a method from multi-task learning, w… ▽ More

    Submitted 16 March, 2022; originally announced March 2022.

    Comments: ACL 2022

  40. arXiv:2203.07856  [pdf, other]

    cs.CL cs.LG

    Improved Multi-label Classification under Temporal Concept Drift: Rethinking Group-Robust Algorithms in a Label-Wise Setting

    Authors: Ilias Chalkidis, Anders Søgaard

    Abstract: In document classification for, e.g., legal and biomedical text, we often deal with hundreds of classes, including very infrequent ones, as well as temporal concept drift caused by the influence of real world events, e.g., policy changes, conflicts, or pandemics. Class imbalance and drift can sometimes be mitigated by resampling the training data to simulate (or compensate for) a known target dist… ▽ More

    Submitted 15 March, 2022; originally announced March 2022.

    Comments: 9 pages, long paper at ACL 2022 Findings

  41. arXiv:2203.07228  [pdf, other]

    cs.CL

    FairLex: A Multilingual Benchmark for Evaluating Fairness in Legal Text Processing

    Authors: Ilias Chalkidis, Tommaso Pasini, Sheng Zhang, Letizia Tomada, Sebastian Felix Schwemer, Anders Søgaard

    Abstract: We present a benchmark suite of four datasets for evaluating the fairness of pre-trained language models and the techniques used to fine-tune them for downstream tasks. Our benchmarks cover four jurisdictions (European Council, USA, Switzerland, and China), five languages (English, German, French, Italian and Chinese) and fairness across five attributes (gender, age, region, language, and legal ar… ▽ More

    Submitted 14 March, 2022; originally announced March 2022.

    Comments: 9 pages, long paper at ACL 2022 proceedings

  42. arXiv:2203.02745  [pdf, other]

    cs.CR cs.CL cs.LG

    The Impact of Differential Privacy on Group Disparity Mitigation

    Authors: Victor Petrén Bach Hansen, Atula Tejaswi Neerkaje, Ramit Sawhney, Lucie Flek, Anders Søgaard

    Abstract: The performance cost of differential privacy has, for some applications, been shown to be higher for minority groups; fairness, conversely, has been shown to disproportionally compromise the privacy of members of such groups. Most work in this area has been restricted to computer vision and risk assessment. In this paper, we evaluate the impact of differential privacy on fairness across four tasks… ▽ More

    Submitted 5 March, 2022; originally announced March 2022.

  43. arXiv:2202.12058  [pdf, other]

    cs.LG cs.CL

    Exploring the Unfairness of DP-SGD Across Settings

    Authors: Frederik Noe, Rasmus Herskind, Anders Søgaard

    Abstract: End users and regulators require private and fair artificial intelligence models, but previous work suggests these objectives may be at odds. We use the CivilComments to evaluate the impact of applying the de facto standard approach to privacy, DP-SGD, across several fairness metrics. We evaluate three implementations of DP-SGD: for dimensionality reduction (PCA), linear classification (logi…

    Submitted 24 February, 2022; originally announced February 2022.

    Comments: 6 pages, 3 figures, https://aaai-ppai22.github.io/

    Journal ref: The Third AAAI Workshop on Privacy-Preserving Artificial Intelligence (PPAI-22) (2022)

  44. arXiv:2111.14842  [pdf, other]

    eess.AS cs.CL cs.LG

    Do We Still Need Automatic Speech Recognition for Spoken Language Understanding?

    Authors: Lasse Borgholt, Jakob Drachmann Havtorn, Mostafa Abdou, Joakim Edin, Lars Maaløe, Anders Søgaard, Christian Igel

    Abstract: Spoken language understanding (SLU) tasks are usually solved by first transcribing an utterance with automatic speech recognition (ASR) and then feeding the output to a text-based model. Recent advances in self-supervised representation learning for speech data have focused on improving the ASR component. We investigate whether representation learning for speech has matured enough to replace ASR i…

    Submitted 29 November, 2021; originally announced November 2021.

    Comments: Under review as a conference paper at ICASSP 2022

  45. arXiv:2111.04683  [pdf, other]

    cs.LG cs.AI

    Revisiting Methods for Finding Influential Examples

    Authors: Karthikeyan K, Anders Søgaard

    Abstract: Several instance-based explainability methods for finding influential training examples for test-time decisions have been proposed recently, including Influence Functions, TraceIn, Representer Point Selection, Grad-Dot, and Grad-Cos. Typically these methods are evaluated using LOO influence (Cook's distance) as a gold standard, or using various heuristics. In this paper, we show that all of the ab…

    Submitted 8 November, 2021; originally announced November 2021.

  46. arXiv:2110.05111  [pdf, other]

    cs.CL

    Dynamic Forecasting of Conversation Derailment

    Authors: Yova Kementchedjhieva, Anders Søgaard

    Abstract: Online conversations can sometimes take a turn for the worse, either due to systematic cultural differences, accidental misunderstandings, or mere malice. Automatically forecasting derailment in public online conversations provides an opportunity to take early action to moderate it. Previous work in this space is limited, and we extend it in several ways. We apply a pretrained language encoder to…

    Submitted 11 October, 2021; originally announced October 2021.

    Comments: To appear at EMNLP 2021

  47. arXiv:2110.04384  [pdf, other]

    cs.CL

    Evaluation of Summarization Systems across Gender, Age, and Race

    Authors: Anna Jørgensen, Anders Søgaard

    Abstract: Summarization systems are ultimately evaluated by human annotators and raters. Usually, annotators and raters do not reflect the demographics of end users, but are recruited through student populations or crowdsourcing platforms with skewed demographics. For two different evaluation scenarios -- evaluation against gold summaries and system output ratings -- we show that summary evaluation is sensi…

    Submitted 8 October, 2021; originally announced October 2021.

  48. arXiv:2109.07971  [pdf, other]

    cs.CL

    Do Language Models Know the Way to Rome?

    Authors: Bastien Liétard, Mostafa Abdou, Anders Søgaard

    Abstract: The global geometry of language models is important for a range of applications, but language model probes tend to evaluate rather local relations, for which ground truths are easily obtained. In this paper we exploit the fact that in geography, ground truths are available beyond local relations. In a series of experiments, we evaluate the extent to which language model representations of city and…

    Submitted 16 September, 2021; originally announced September 2021.

    Journal ref: BlackboxNLP Workshop 2021

  49. arXiv:2109.06129  [pdf, other]

    cs.CV cs.CL

    Can Language Models Encode Perceptual Structure Without Grounding? A Case Study in Color

    Authors: Mostafa Abdou, Artur Kulmizev, Daniel Hershcovich, Stella Frank, Ellie Pavlick, Anders Søgaard

    Abstract: Pretrained language models have been shown to encode relational information, such as the relations between entities or concepts in knowledge-bases -- (Paris, Capital, France). However, simple relations of this type can often be recovered heuristically and the extent to which models implicitly reflect topological structure that is grounded in the world, such as perceptual structure, is unknown. To expl…

    Submitted 14 September, 2021; v1 submitted 13 September, 2021; originally announced September 2021.

    Comments: CoNLL 2021

  50. arXiv:2109.06074  [pdf, other]

    cs.CL

    On Language Models for Creoles

    Authors: Heather Lent, Emanuele Bugliarello, Miryam de Lhoneux, Chen Qiu, Anders Søgaard

    Abstract: Creole languages such as Nigerian Pidgin English and Haitian Creole are under-resourced and largely ignored in the NLP literature. Creoles typically result from the fusion of a foreign language with multiple local languages, and what grammatical and lexical features are transferred to the creole is a complex process. While creoles are generally stable, the prominence of some features may be much s…

    Submitted 13 September, 2021; originally announced September 2021.

    Comments: CoNLL 2021