Search | arXiv e-print repository

Memorization In In-Context Learning

Authors: Shahriar Golchin, Mihai Surdeanu, Steven Bethard, Eduardo Blanco, Ellen Riloff

Abstract: In-context learning (ICL) has proven to be an effective strategy for improving the performance of large language models (LLMs) with no additional training. However, the exact mechanism behind these performance improvements remains unclear. This study is the first to show how ICL surfaces memorized training data and to explore the correlation between this memorization and performance across various ICL regimes: zero-shot, few-shot, and many-shot. Our most notable findings include: (1) ICL significantly surfaces memorization compared to zero-shot learning in most cases; (2) demonstrations, without their labels, are the most effective element in surfacing memorization; (3) ICL improves performance when the surfaced memorization in few-shot regimes reaches a high level (about 40%); and (4) there is a very strong correlation between performance and memorization in ICL when it outperforms zero-shot learning. Overall, our study uncovers a hidden phenomenon -- memorization -- at the core of ICL, raising an important question: to what extent do LLMs truly generalize from demonstrations in ICL, and how much of their success is due to memorization?

Submitted 21 August, 2024; originally announced August 2024.

Comments: v1

arXiv:2407.03525 [pdf, other]

UnSeenTimeQA: Time-Sensitive Question-Answering Beyond LLMs' Memorization

Authors: Md Nayem Uddin, Amir Saeidi, Divij Handa, Agastya Seth, Tran Cao Son, Eduardo Blanco, Steven R. Corman, Chitta Baral

Abstract: This paper introduces UnSeenTimeQA, a novel time-sensitive question-answering (TSQA) benchmark that diverges from traditional TSQA benchmarks by avoiding factual and web-searchable queries. We present a series of time-sensitive event scenarios decoupled from real-world factual information. It requires large language models (LLMs) to engage in genuine temporal reasoning, disassociating from the knowledge acquired during the pre-training phase. Our evaluation of six open-source LLMs (ranging from 2B to 70B in size) and three closed-source LLMs reveals that the questions from UnSeenTimeQA present substantial challenges. This indicates the models' difficulties in handling complex temporal reasoning scenarios. Additionally, we present several analyses shedding light on the models' performance in answering time-sensitive questions.

Submitted 3 July, 2024; originally announced July 2024.

arXiv:2406.16253 [pdf, other]

LLMs Assist NLP Researchers: Critique Paper (Meta-)Reviewing

Authors: Jiangshu Du, Yibo Wang, Wenting Zhao, Zhongfen Deng, Shuaiqi Liu, Renze Lou, Henry Peng Zou, Pranav Narayanan Venkit, Nan Zhang, Mukund Srinath, Haoran Ranran Zhang, Vipul Gupta, Yinghui Li, Tao Li, Fei Wang, Qin Liu, Tianlin Liu, Pengzhi Gao, Congying Xia, Chen Xing, Jiayang Cheng, Zhaowei Wang, Ying Su, Raj Sanjay Shah, Ruohao Guo, et al. (15 additional authors not shown)

Abstract: This work is motivated by two key trends. On one hand, large language models (LLMs) have shown remarkable versatility in various generative tasks such as writing, drawing, and question answering, significantly reducing the time required for many routine tasks. On the other hand, researchers, whose work is not only time-consuming but also highly expertise-demanding, face increasing challenges as they have to spend more time reading, writing, and reviewing papers. This raises the question: how can LLMs potentially assist researchers in alleviating their heavy workload? This study focuses on the topic of LLMs assist NLP Researchers, particularly examining the effectiveness of LLM in assisting paper (meta-)reviewing and its recognizability. To address this, we constructed the ReviewCritique dataset, which includes two types of information: (i) NLP papers (initial submissions rather than camera-ready) with both human-written and LLM-generated reviews, and (ii) each review comes with "deficiency" labels and corresponding explanations for individual segments, annotated by experts. Using ReviewCritique, this study explores two threads of research questions: (i) "LLMs as Reviewers", how do reviews generated by LLMs compare with those written by humans in terms of quality and distinguishability? (ii) "LLMs as Metareviewers", how effectively can LLMs identify potential issues, such as Deficient or unprofessional review segments, within individual paper reviews? To our knowledge, this is the first work to provide such a comprehensive analysis.

Submitted 25 June, 2024; v1 submitted 23 June, 2024; originally announced June 2024.

arXiv:2406.07492 [pdf, other]

Paraphrasing in Affirmative Terms Improves Negation Understanding

Authors: MohammadHossein Rezaei, Eduardo Blanco

Abstract: Negation is a common linguistic phenomenon. Yet language models face challenges with negation in many natural language understanding tasks such as question answering and natural language inference. In this paper, we experiment with seamless strategies that incorporate affirmative interpretations (i.e., paraphrases without negation) to make models more robust against negation. Crucially, our affirmative interpretations are obtained automatically. We show improvements with CondaQA, a large corpus requiring reasoning with negation, and five natural language understanding tasks.

Submitted 11 June, 2024; originally announced June 2024.

Comments: Accepted to ACL 2024

arXiv:2404.16413 [pdf, other]

Asking and Answering Questions to Extract Event-Argument Structures

Authors: Md Nayem Uddin, Enfa Rose George, Eduardo Blanco, Steven Corman

Abstract: This paper presents a question-answering approach to extract document-level event-argument structures. We automatically ask and answer questions for each argument type an event may have. Questions are generated using manually defined templates and generative transformers. Template-based questions are generated using predefined role-specific wh-words and event triggers from the context document. Transformer-based questions are generated using large language models trained to formulate questions based on a passage and the expected answer. Additionally, we develop novel data augmentation strategies specialized in inter-sentential event-argument relations. We use a simple span-swapping technique, coreference resolution, and large language models to augment the training instances. Our approach enables transfer learning without any corpora-specific modifications and yields competitive results with the RAMS dataset. It outperforms previous work, and it is especially beneficial to extract arguments that appear in different sentences than the event trigger. We also present detailed quantitative and qualitative analyses shedding light on the most common errors made by our best model.

Submitted 25 April, 2024; originally announced April 2024.

Comments: Accepted at LREC-COLING 2024

arXiv:2404.16262 [pdf, other]

Interpreting Answers to Yes-No Questions in Dialogues from Multiple Domains

Authors: Zijie Wang, Farzana Rashid, Eduardo Blanco

Abstract: People often answer yes-no questions without explicitly saying yes, no, or similar polar keywords. Figuring out the meaning of indirect answers is challenging, even for large language models. In this paper, we investigate this problem working with dialogues from multiple domains. We present new benchmarks in three diverse domains: movie scripts, tennis interviews, and airline customer service. We present an approach grounded on distant supervision and blended training to quickly adapt to a new dialogue domain. Experimental results show that our approach is never detrimental and yields F1 improvements as high as 11-34%.

Submitted 24 April, 2024; originally announced April 2024.

Comments: To appear at NAACL 2024 Findings

arXiv:2404.04770 [pdf, other]

Generating Uncontextualized and Contextualized Questions for Document-Level Event Argument Extraction

Authors: Md Nayem Uddin, Enfa Rose George, Eduardo Blanco, Steven Corman

Abstract: This paper presents multiple question generation strategies for document-level event argument extraction. These strategies do not require human involvement and result in uncontextualized questions as well as contextualized questions grounded on the event and document of interest. Experimental results show that combining uncontextualized and contextualized questions is beneficial, especially when event triggers and arguments appear in different sentences. Our approach does not have corpus-specific components; in particular, the question generation strategies transfer across corpora. We also present a qualitative analysis of the most common errors made by our best model.

Submitted 6 April, 2024; originally announced April 2024.

Comments: Accepted at NAACL 2024

arXiv:2403.17146 [pdf, other]

Outcome-Constrained Large Language Models for Countering Hate Speech

Authors: Lingzi Hong, Pengcheng Luo, Eduardo Blanco, Xiaoying Song

Abstract: Counterspeech that challenges or responds to hate speech has been seen as an alternative to mitigate the negative impact of hate speech and foster productive online communications. Research endeavors have been directed to using language models for the automatic generation of counterspeech to assist efforts in combating online hate. Existing research focuses on the generation of counterspeech with certain linguistic attributes, such as being polite, informative, and intent-driven. However, it remains unclear what impact the counterspeech might have in an online environment. We first explore methods that utilize large language models (LLMs) to generate counterspeech constrained by potential conversation outcomes. We build two conversation outcome classifiers that predict the incivility level and the hater reentry behavior following replies to hate with Reddit data, then propose four methods to incorporate the desired outcomes, i.e., low conversation incivility and non-hateful hater reentry, into the text generation process, including Prompt with Instructions, Prompt and Select, LLM finetune, and LLM transformer reinforcement learning (TRL). Evaluation results show effective strategies to generate outcome-constrained counterspeech and the linguistic characteristics of texts generated by different methods.

Submitted 25 March, 2024; originally announced March 2024.

arXiv:2403.11082 [pdf, other]

RobustSentEmbed: Robust Sentence Embeddings Using Adversarial Self-Supervised Contrastive Learning

Authors: Javad Rafiei Asl, Prajwal Panzade, Eduardo Blanco, Daniel Takabi, Zhipeng Cai

Abstract: Pre-trained language models (PLMs) have consistently demonstrated outstanding performance across a diverse spectrum of natural language processing tasks. Nevertheless, despite their success with unseen data, current PLM-based representations often exhibit poor robustness in adversarial settings. In this paper, we introduce RobustSentEmbed, a self-supervised sentence embedding framework designed to improve both generalization and robustness in diverse text representation tasks and against a diverse set of adversarial attacks. Through the generation of high-risk adversarial perturbations and their utilization in a novel objective function, RobustSentEmbed adeptly learns high-quality and robust sentence embeddings. Our experiments confirm the superiority of RobustSentEmbed over state-of-the-art representations. Specifically, our framework achieves a significant reduction in the success rate of various adversarial attacks, notably reducing the BERTAttack success rate by almost half (from 75.51% to 38.81%). The framework also yields improvements of 1.59% and 0.23% in semantic textual similarity tasks and various transfer tasks, respectively.

Submitted 17 March, 2024; originally announced March 2024.

Comments: Accepted at the Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL Findings) 2024. [https://openreview.net/forum?id=9dEAg4lJEA]

arXiv:2312.04804 [pdf, other]

Hate Cannot Drive out Hate: Forecasting Conversation Incivility following Replies to Hate Speech

Authors: Xinchen Yu, Eduardo Blanco, Lingzi Hong

Abstract: User-generated replies to hate speech are promising means to combat hatred, but questions about whether they can stop incivility in follow-up conversations linger. We argue that effective replies stop incivility from emerging in follow-up conversations - replies that elicit more incivility are counterproductive. This study introduces the task of predicting the incivility of conversations following replies to hate speech. We first propose a metric to measure conversation incivility based on the number of civil and uncivil comments as well as the unique authors involved in the discourse. Our metric approximates human judgments more accurately than previous metrics. We then use the metric to evaluate the outcomes of replies to hate speech. A linguistic analysis uncovers the differences in the language of replies that elicit follow-up conversations with high and low incivility. Experimental results show that forecasting incivility is challenging. We close with a qualitative analysis shedding light into the most common errors made by the best model.

Submitted 7 December, 2023; originally announced December 2023.

Comments: Accepted at the 18th International AAAI Conference on Web and Social Media (ICWSM 2024)

arXiv:2310.15464 [pdf, other]

Interpreting Answers to Yes-No Questions in User-Generated Content

Authors: Shivam Mathur, Keun Hee Park, Dhivya Chinnappa, Saketh Kotamraju, Eduardo Blanco

Abstract: Interpreting answers to yes-no questions in social media is difficult. Yes and no keywords are uncommon, and the few answers that include them are rarely to be interpreted as the keywords suggest. In this paper, we present a new corpus of 4,442 yes-no question-answer pairs from Twitter. We discuss linguistic characteristics of answers whose interpretation is yes or no, as well as answers whose interpretation is unknown. We show that large language models are far from solving this problem, even after fine-tuning and blending other corpora for the same problem but outside social media.

Submitted 23 October, 2023; originally announced October 2023.

Comments: Accepted at the Findings of EMNLP 2023

arXiv:2310.13290 [pdf, other]

Interpreting Indirect Answers to Yes-No Questions in Multiple Languages

Authors: Zijie Wang, Md Mosharaf Hossain, Shivam Mathur, Terry Cruz Melo, Kadir Bulut Ozler, Keun Hee Park, Jacob Quintero, MohammadHossein Rezaei, Shreya Nupur Shakya, Md Nayem Uddin, Eduardo Blanco

Abstract: Yes-no questions expect a yes or no for an answer, but people often skip polar keywords. Instead, they answer with long explanations that must be interpreted. In this paper, we focus on this challenging problem and release new benchmarks in eight languages. We present a distant supervision approach to collect training data. We also demonstrate that direct answers (i.e., with polar keywords) are useful to train models to interpret indirect answers (i.e., without polar keywords). Experimental results demonstrate that monolingual fine-tuning is beneficial if training data can be obtained via distant supervision for the language of interest (5 languages). Additionally, we show that cross-lingual fine-tuning is always beneficial (8 languages).

Submitted 20 October, 2023; originally announced October 2023.

Comments: Accepted to EMNLP 2023 Findings

arXiv:2307.05034 [pdf, other]

Synthetic Dataset for Evaluating Complex Compositional Knowledge for Natural Language Inference

Authors: Sushma Anand Akoju, Robert Vacareanu, Haris Riaz, Eduardo Blanco, Mihai Surdeanu

Abstract: We introduce a synthetic dataset called Sentences Involving Complex Compositional Knowledge (SICCK) and a novel analysis that investigates the performance of Natural Language Inference (NLI) models to understand compositionality in logic. We produce 1,304 sentence pairs by modifying 15 examples from the SICK dataset (Marelli et al., 2014). To this end, we modify the original texts using a set of phrases - modifiers that correspond to universal quantifiers, existential quantifiers, negation, and other concept modifiers in Natural Logic (NL) (MacCartney, 2009). We use these phrases to modify the subject, verb, and object parts of the premise and hypothesis. Lastly, we annotate these modified texts with the corresponding entailment labels following NL rules. We conduct a preliminary verification of how well the change in the structural and semantic composition is captured by neural NLI models, in both zero-shot and fine-tuned scenarios. We found that the performance of NLI models under the zero-shot setting is poor, especially for modified sentences with negation and existential quantifiers. After fine-tuning on this dataset, we observe that models continue to perform poorly over negation, existential and universal modifiers.

Submitted 7 September, 2024; v1 submitted 11 July, 2023; originally announced July 2023.

Comments: Accepted to Natural Language Reasoning and Structured Explanations (NLRSE) Workshop, ACL 2023. For the dataset, please refer to https://github.com/sushmaakoju/clulab-releases/blob/master/acl2023-nlrse-sicck/README.md and https://github.com/sushmaakoju/acl2023-nlrse-clulab-SICCK-dataset

arXiv:2210.14486 [pdf, other]

Leveraging Affirmative Interpretations from Negation Improves Natural Language Understanding

Authors: Md Mosharaf Hossain, Eduardo Blanco

Abstract: Negation poses a challenge in many natural language understanding tasks. Inspired by the fact that understanding a negated statement often requires humans to infer affirmative interpretations, in this paper we show that doing so benefits models for three natural language understanding tasks. We present an automated procedure to collect pairs of sentences with negation and their affirmative interpretations, resulting in over 150,000 pairs. Experimental results show that leveraging these pairs helps (a) T5 generate affirmative interpretations from negations in a previous benchmark, and (b) a RoBERTa-based classifier solve the task of natural language inference. We also leverage our pairs to build a plug-and-play neural generator that given a negated statement generates an affirmative interpretation. Then, we incorporate the pretrained generator into a RoBERTa-based classifier for sentiment analysis and show that doing so improves the results. Crucially, our proposal does not require any manual effort.

Submitted 26 October, 2022; originally announced October 2022.

Comments: To appear at the main conference of EMNLP 2022

arXiv:2206.06423 [pdf, other]

Hate Speech and Counter Speech Detection: Conversational Context Does Matter

Authors: Xinchen Yu, Eduardo Blanco, Lingzi Hong

Abstract: Hate speech is plaguing the cyberspace along with user-generated content. This paper investigates the role of conversational context in the annotation and detection of online hate and counter speech, where context is defined as the preceding comment in a conversation thread. We created a context-aware dataset for a 3-way classification task on Reddit comments: hate speech, counter speech, or neutral. Our analyses indicate that context is critical to identify hate and counter speech: human judgments change for most comments depending on whether we show annotators the context. A linguistic analysis draws insights into the language people use to express hate and counter speech. Experimental results show that neural networks obtain significantly better results if context is taken into account. We also present qualitative error analyses shedding light into (a) when and why context is beneficial and (b) the remaining errors made by our best model when context is taken into account.

Submitted 13 June, 2022; originally announced June 2022.

Comments: Accepted by NAACL 2022

arXiv:2205.11467 [pdf, other]

A Question-Answer Driven Approach to Reveal Affirmative Interpretations from Verbal Negations

Authors: Md Mosharaf Hossain, Luke Holman, Anusha Kakileti, Tiffany Iris Kao, Nathan Raul Brito, Aaron Abraham Mathews, Eduardo Blanco

Abstract: This paper explores a question-answer driven approach to reveal affirmative interpretations from verbal negations (i.e., when a negation cue grammatically modifies a verb). We create a new corpus consisting of 4,472 verbal negations and discover that 67.1% of them convey that an event actually occurred. Annotators generate and answer 7,277 questions for the 3,001 negations that convey an affirmative interpretation. We first cast the problem of revealing affirmative interpretations from negations as a natural language inference (NLI) classification task. Experimental results show that state-of-the-art transformers trained with existing NLI corpora are insufficient to reveal affirmative interpretations. We also observe, however, that fine-tuning brings small improvements. In addition to NLI classification, we also explore the more realistic task of generating affirmative interpretations directly from negations with the T5 transformer. We conclude that the generation task remains a challenge as T5 substantially underperforms humans.

Submitted 23 May, 2022; originally announced May 2022.

Comments: Accepted at the Findings of NAACL 2022

arXiv:2203.16148 [pdf, other]

doi 10.18429/JACoW-ICALEPCS2021-WEPV042

Applying Model Checking to Highly-Configurable Safety Critical Software: The SPS-PPS PLC Program

Authors: Borja Fernandez Adiego, Ignacio D. Lopez-Miguel, Jean-Charles Tournier, Enrique Blanco, Tomasz Ladzinski, Frederic Havart

Abstract: An important aspect of many particle accelerators is the constant evolution and frequent configuration changes that are needed to perform the experiments they are designed for. This often leads to the design of configurable software that can absorb these changes and perform the required control and protection actions. This design strategy minimizes the engineering and maintenance costs, but it makes the software verification activities more challenging since safety properties must be guaranteed for any of the possible configurations. Software model checking is a popular automated verification technique in many industries. This verification method explores all possible combinations of the system model to guarantee its compliance with certain properties or specification. This is a very appropriate technique for highly configurable software, since there is usually an enormous amount of combinations to be checked. This paper presents how PLCverif, a CERN model checking platform, has been applied to a highly configurable Programmable Logic Controller (PLC) program, the SPS Personnel Protection System (PPS). The benefits and challenges of this verification approach are also discussed.

Submitted 30 March, 2022; originally announced March 2022.

Comments: 18th International Conference on Accelerator and Large Experimental Physics Control Systems (ICALEPCS2021)

arXiv:2203.08929 [pdf, other]

An Analysis of Negation in Natural Language Understanding Corpora

Authors: Md Mosharaf Hossain, Dhivya Chinnappa, Eduardo Blanco

Abstract: This paper analyzes negation in eight popular corpora spanning six natural language understanding tasks. We show that these corpora have few negations compared to general-purpose English, and that the few negations in them are often unimportant. Indeed, one can often ignore negations and still make the right predictions. Additionally, experimental results show that state-of-the-art transformers trained with these corpora obtain substantially worse results with instances that contain negation, especially if the negations are important. We conclude that new corpora accounting for negation are needed to solve natural language understanding tasks when negation is present.

Submitted 16 March, 2022; originally announced March 2022.

Comments: To appear in the proceedings of ACL 2022 (main conference)

arXiv:2109.07017 [pdf, other]

Written Justifications are Key to Aggregate Crowdsourced Forecasts

Authors: Saketh Kotamraju, Eduardo Blanco

Abstract: This paper demonstrates that aggregating crowdsourced forecasts benefits from modeling the written justifications provided by forecasters. Our experiments show that the majority and weighted vote baselines are competitive, and that the written justifications are beneficial to call a question throughout its life except in the last quarter. We also conduct an error analysis shedding light into the characteristics that make a justification unreliable.

Submitted 14 September, 2021; originally announced September 2021.

Comments: Findings of EMNLP 2021

arXiv:2010.05432 [pdf, other]

It's not a Non-Issue: Negation as a Source of Error in Machine Translation

Authors: Md Mosharaf Hossain, Antonios Anastasopoulos, Eduardo Blanco, Alexis Palmer

Abstract: As machine translation (MT) systems progress at a rapid pace, questions of their adequacy linger. In this study we focus on negation, a universal, core property of human language that significantly affects the semantics of an utterance. We investigate whether translating negation is an issue for modern MT systems using 17 translation directions as test bed. Through thorough analysis, we find that indeed the presence of negation can significantly impact downstream quality, in some cases resulting in quality reductions of more than 60%. We also provide a linguistically motivated analysis that directly explains the majority of our findings. We release our annotations and code to replicate our analysis here: https://github.com/mosharafhossain/negation-mt.

Submitted 11 October, 2020; originally announced October 2020.

Comments: Accepted at the Findings of EMNLP2020

arXiv:2008.00956 [pdf, other]

doi 10.1017/S1471068420000137

Interactive Text Graph Mining with a Prolog-based Dialog Engine

Authors: Paul Tarau, Eduardo Blanco

Abstract: On top of a neural network-based dependency parser and a graph-based natural language processing module we design a Prolog-based dialog engine that explores interactively a ranked fact database extracted from a text document. We reorganize dependency graphs to focus on the most relevant content elements of a sentence and integrate sentence identifiers as graph nodes. Additionally, after ranking the graph we take advantage of the implicit semantic information that dependency links and WordNet bring in the form of subject-verb-object, is-a and part-of relations. Working on the Prolog facts and their inferred consequences, the dialog engine specializes the text graph with respect to a query and reveals interactively the document's most relevant content elements. The open-source code of the integrated system is available at https://github.com/ptarau/DeepRank . Under consideration in Theory and Practice of Logic Programming (TPLP).

Submitted 30 July, 2020; originally announced August 2020.

Comments: Under consideration in Theory and Practice of Logic Programming (TPLP). arXiv admin note: substantial text overlap with arXiv:1909.09742

Journal ref: Theory and Practice of Logic Programming 21 (2021) 244-263

arXiv:1909.09742 [pdf, other]

Dependency-based Text Graphs for Keyphrase and Summary Extraction with Applications to Interactive Content Retrieval

Authors: Paul Tarau, Eduardo Blanco

Abstract: We build a bridge between neural network-based machine learning and graph-based natural language processing and introduce a unified approach to keyphrase, summary and relation extraction by aggregating dependency graphs from links provided by a deep-learning based dependency parser. We reorganize dependency graphs to focus on the most relevant content elements of a sentence, integrate sentence identifiers as graph nodes and after ranking the graph, we extract our keyphrases and summaries from its largest strongly-connected component. We take advantage of the implicit structural information that dependency links bring to extract subject-verb-object, is-a and part-of relations. We put it all together into a proof-of-concept dialog engine that specializes the text graph with respect to a query and reveals interactively the document's most relevant content elements. The open-source code of the integrated system is available at https://github.com/ptarau/DeepRank . Keywords: graph-based natural language processing, dependency graphs, keyphrase, summary and relation extraction, query-driven salient sentence extraction, logic-based dialog engine, synergies between neural and symbolic processing.

Submitted 20 September, 2019; originally announced September 2019.

arXiv:1904.03987 [pdf]

doi 10.1016/j.compag.2015.12.009

Early warning in egg production curves from commercial hens: A SVM approach

Authors: Iván Ramírez Morales, Daniel Rivero Cebrián, Enrique Fernández Blanco, Alejandro Pazos Sierra

Abstract: Artificial Intelligence allows the improvement of our daily life, for instance, speech and handwritten text recognition, real time translation and weather forecasting are common used applications. In the livestock sector, machine learning algorithms have the potential for early detection and warning of problems, which represents a significant milestone in the poultry industry. Production problems generate economic loss that could be avoided by acting in a timely manner. In the current study, training and testing of support vector machines are addressed, for an early detection of problems in the production curve of commercial eggs, using farm's egg production data of 478,919 laying hens grouped in 24 flocks. Experiments using support vector machines with a 5 k-fold cross-validation were performed at different previous time intervals, to alert with up to 5 days of forecasting interval, whether a flock will experience a problem in production curve. Performance metrics such as accuracy, specificity, sensitivity, and positive predictive value were evaluated, reaching 0-day values of 0.9874, 0.9876, 0.9783 and 0.6518 respectively on unseen data (test-set). The optimal forecasting interval was from zero to three days, performance metrics decreases as the forecasting interval is increased. It should be emphasized that this technique was able to issue an alert a day in advance, achieving an accuracy of 0.9854, a specificity of 0.9865, a sensitivity of 0.9333 and a positive predictive value of 0.6135. This novel application embedded in a computer system of poultry management is able to provide significant improvements in early detection and warning of problems related to the production curve.

Submitted 8 April, 2019; originally announced April 2019.

Journal ref: Early warning in egg production curves from commercial hens: A SVM approach, Computers and Electronics in Agriculture, Volume 121, 2016, Pages 169-179, ISSN 0168-1699, https://doi.org/10.1016/j.compag.2015.12.009

arXiv:1702.04415 [pdf, other]

Small Boxes Big Data: A Deep Learning Approach to Optimize Variable Sized Bin Packing

Authors: Feng Mao, Edgar Blanco, Mingang Fu, Rohit Jain, Anurag Gupta, Sebastien Mancel, Rong Yuan, Stephen Guo, Sai Kumar, Yayang Tian

Abstract: Bin Packing problems have been widely studied because of their broad applications in different domains. Known as a set of NP-hard problems, they have different variations and many heuristics have been proposed for obtaining approximate solutions. Specifically, for the 1D variable sized bin packing problem, the two key sets of optimization heuristics are the bin assignment and the bin allocation. Usually the performance of a single static optimization heuristic can not beat that of a dynamic one which is tailored for each bin packing instance. Building such an adaptive system requires modeling the relationship between bin features and packing perform profiles. The primary drawbacks of traditional AI machine learnings for this task are the natural limitations of feature engineering, such as the curse of dimensionality and feature selection quality. We introduce a deep learning approach to overcome the drawbacks by applying a large training data set, auto feature selection and fast, accurate labeling. We show in this paper how to build such a system by both theoretical formulation and engineering practices. Our prediction system achieves up to 89% training accuracy and 72% validation accuracy to select the best heuristic that can generate a better quality bin packing solution.

Submitted 14 February, 2017; originally announced February 2017.

Comments: The Third IEEE International Conference on Big Data Computing Service and Applications, 2017

ACM Class: I.1.2; I.2.8

Showing 1–24 of 24 results for author: Blanco, E