Showing 1–50 of 51 results for author: Dhingra, B

  1. arXiv:2406.15968  [pdf, other]

    cs.CL cs.LG

    ReCaLL: Membership Inference via Relative Conditional Log-Likelihoods

    Authors: Roy Xie, Junlin Wang, Ruomin Huang, Minxing Zhang, Rong Ge, Jian Pei, Neil Zhenqiang Gong, Bhuwan Dhingra

    Abstract: The rapid scaling of large language models (LLMs) has raised concerns about the transparency and fair use of the pretraining data used for training them. Detecting such content is challenging due to the scale of the data and limited exposure of each instance during training. We propose ReCaLL (Relative Conditional Log-Likelihood), a novel membership inference attack (MIA) to detect LLMs' pretraini…

    Submitted 22 June, 2024; originally announced June 2024.

  2. arXiv:2406.06737  [pdf, other]

    cs.CR cs.CL

    Raccoon: Prompt Extraction Benchmark of LLM-Integrated Applications

    Authors: Junlin Wang, Tianyi Yang, Roy Xie, Bhuwan Dhingra

    Abstract: With the proliferation of LLM-integrated applications such as GPT-s, millions are deployed, offering valuable services through proprietary instruction prompts. These systems, however, are prone to prompt extraction attacks through meticulously designed queries. To help mitigate this problem, we introduce the Raccoon benchmark which comprehensively evaluates a model's susceptibility to prompt extra…

    Submitted 10 June, 2024; originally announced June 2024.

  3. arXiv:2406.04291  [pdf, other]

    cs.LG stat.ML

    Stratified Prediction-Powered Inference for Hybrid Language Model Evaluation

    Authors: Adam Fisch, Joshua Maynez, R. Alex Hofer, Bhuwan Dhingra, Amir Globerson, William W. Cohen

    Abstract: Prediction-powered inference (PPI) is a method that improves statistical estimates based on limited human-labeled data. PPI achieves this by combining small amounts of human-labeled data with larger amounts of data labeled by a reasonably accurate -- but potentially biased -- automatic system, in a way that results in tighter confidence intervals for certain parameters of interest (e.g., the mean…

    Submitted 6 June, 2024; originally announced June 2024.

  4. arXiv:2405.13131  [pdf, other]

    cs.CL

    Atomic Self-Consistency for Better Long Form Generations

    Authors: Raghuveer Thirukovalluru, Yukun Huang, Bhuwan Dhingra

    Abstract: Recent work has aimed to improve LLM generations by filtering out hallucinations, thereby improving the precision of the information in responses. Correctness of a long-form response, however, also depends on the recall of multiple pieces of information relevant to the question. In this paper, we introduce Atomic Self-Consistency (ASC), a technique for improving the recall of relevant information…

    Submitted 21 May, 2024; originally announced May 2024.

    Comments: 12 pages

  5. arXiv:2405.10861  [pdf, other]

    cs.CL cs.AI cs.CY

    Tailoring Vaccine Messaging with Common-Ground Opinions

    Authors: Rickard Stureborg, Sanxing Chen, Ruoyu Xie, Aayushi Patel, Christopher Li, Chloe Qinyu Zhu, Tingnan Hu, Jun Yang, Bhuwan Dhingra

    Abstract: One way to personalize chatbot interactions is by establishing common ground with the intended reader. A domain where establishing mutual understanding could be particularly impactful is vaccine concerns and misinformation. Vaccine interventions are forms of messaging which aim to answer concerns expressed about vaccination. Tailoring responses in this domain is difficult, since opinions often hav…

    Submitted 23 July, 2024; v1 submitted 17 May, 2024; originally announced May 2024.

    Comments: NAACL Findings 2024

    MSC Class: 68T50 (Primary) 68T01; 68T37; 91F20 (Secondary) ACM Class: I.2; I.2.7; I.7

  6. arXiv:2405.06034  [pdf, other]

    cs.LG

    Bayesian Prediction-Powered Inference

    Authors: R. Alex Hofer, Joshua Maynez, Bhuwan Dhingra, Adam Fisch, Amir Globerson, William W. Cohen

    Abstract: Prediction-powered inference (PPI) is a method that improves statistical estimates based on limited human-labeled data. Specifically, PPI methods provide tighter confidence intervals by combining small amounts of human-labeled data with larger amounts of data labeled by a reasonably accurate, but potentially biased, automatic system. We propose a framework for PPI based on Bayesian inference that…

    Submitted 9 May, 2024; originally announced May 2024.

  7. arXiv:2404.09911  [pdf, other]

    cs.CL

    ChatShop: Interactive Information Seeking with Language Agents

    Authors: Sanxing Chen, Sam Wiseman, Bhuwan Dhingra

    Abstract: The desire and ability to seek new information strategically are fundamental to human learning but often overlooked in current language agent evaluation. We analyze a popular web shopping task designed to test language agents' ability to perform strategic exploration and discover that it can be reformulated and solved as a single-turn retrieval task without the need for interactive information see…

    Submitted 16 June, 2024; v1 submitted 15 April, 2024; originally announced April 2024.

  8. arXiv:2404.01266  [pdf, other]

    cs.AI cs.CL

    IsoBench: Benchmarking Multimodal Foundation Models on Isomorphic Representations

    Authors: Deqing Fu, Ruohao Guo, Ghazal Khalighinejad, Ollie Liu, Bhuwan Dhingra, Dani Yogatama, Robin Jia, Willie Neiswanger

    Abstract: Current foundation models exhibit impressive capabilities when prompted either with text only or with both image and text inputs. But do their capabilities change depending on the input modality? In this work, we propose IsoBench, a benchmark dataset containing problems from four major areas: math, science, algorithms, and games. Each example is presented with multiple…

    Submitted 18 August, 2024; v1 submitted 1 April, 2024; originally announced April 2024.

    Comments: 1st Conference on Language Modeling (COLM), 2024

  9. arXiv:2403.00260  [pdf, other]

    cs.CL

    Extracting Polymer Nanocomposite Samples from Full-Length Documents

    Authors: Ghazal Khalighinejad, Defne Circi, L. C. Brinson, Bhuwan Dhingra

    Abstract: This paper investigates the use of large language models (LLMs) for extracting sample lists of polymer nanocomposites (PNCs) from full-length materials science research papers. The challenge lies in the complex nature of PNC samples, which have numerous attributes scattered throughout the text. The complexity of annotating detailed information on PNCs limits the availability of data, making conven…

    Submitted 29 February, 2024; originally announced March 2024.

  10. arXiv:2402.17916  [pdf, other]

    cs.CL cs.AI

    Adversarial Math Word Problem Generation

    Authors: Roy Xie, Chengxuan Huang, Junlin Wang, Bhuwan Dhingra

    Abstract: Large language models (LLMs) have significantly transformed the educational landscape. As current plagiarism detection tools struggle to keep pace with LLMs' rapid advancements, the educational community faces the challenge of assessing students' true problem-solving abilities in the presence of LLMs. In this work, we explore a new paradigm for ensuring fair evaluation -- generating adversarial ex…

    Submitted 15 June, 2024; v1 submitted 27 February, 2024; originally announced February 2024.

    Comments: Code/data: https://github.com/ruoyuxie/adversarial_mwps_generation

  11. arXiv:2402.06544  [pdf, other]

    cs.CL cs.AI cs.LG

    Calibrating Long-form Generations from Large Language Models

    Authors: Yukun Huang, Yixin Liu, Raghuveer Thirukovalluru, Arman Cohan, Bhuwan Dhingra

    Abstract: To enhance Large Language Models' (LLMs) reliability, calibration is essential -- the model's assessed confidence scores should align with the actual likelihood of its responses being correct. However, current confidence elicitation methods and calibration metrics typically rely on a binary true/false assessment of response correctness. This approach does not apply to long-form generation, where a…

    Submitted 9 February, 2024; originally announced February 2024.

  12. arXiv:2402.01783  [pdf, other]

    cs.CL cs.AI cs.LG

    Hierarchical Multi-Label Classification of Online Vaccine Concerns

    Authors: Chloe Qinyu Zhu, Rickard Stureborg, Bhuwan Dhingra

    Abstract: Vaccine concerns are an ever-evolving target, and can shift quickly as seen during the COVID-19 pandemic. Identifying longitudinal trends in vaccine concerns and misinformation might inform the healthcare space by helping public health efforts strategically allocate resources or information campaigns. We explore the task of detecting vaccine concerns in online discourse using large language models…

    Submitted 1 February, 2024; originally announced February 2024.

    Comments: Published in AAAI 2024 Health Intelligence workshop

  13. arXiv:2305.19419  [pdf, other]

    cs.CL

    Hierarchical Multi-Instance Multi-Label Learning for Detecting Propaganda Techniques

    Authors: Anni Chen, Bhuwan Dhingra

    Abstract: Since the introduction of the SemEval 2020 Task 11 (Martino et al., 2020a), several approaches have been proposed in the literature for classifying propaganda based on the rhetorical techniques used to influence readers. These methods, however, classify one span at a time, ignoring dependencies from the labels of other spans within the same context. In this paper, we approach propaganda technique…

    Submitted 30 May, 2023; originally announced May 2023.

  14. arXiv:2305.14613  [pdf, other]

    cs.CL cs.AI

    Selectively Answering Ambiguous Questions

    Authors: Jeremy R. Cole, Michael J. Q. Zhang, Daniel Gillick, Julian Martin Eisenschlos, Bhuwan Dhingra, Jacob Eisenstein

    Abstract: Trustworthy language models should abstain from answering questions when they do not know the answer. However, the answer to a question can be unknown for a variety of reasons. Prior research has focused on the case in which the question is clear and the answer is unambiguous but possibly unknown, but the answer to a question can also be unclear due to uncertainty of the questioner's intent or con…

    Submitted 14 November, 2023; v1 submitted 23 May, 2023; originally announced May 2023.

    Comments: To appear in EMNLP 2023. 9 pages, 5 figures, 2 pages of appendix

  15. arXiv:2303.12860  [pdf, other]

    cs.CL cs.AI

    Salient Span Masking for Temporal Understanding

    Authors: Jeremy R. Cole, Aditi Chaudhary, Bhuwan Dhingra, Partha Talukdar

    Abstract: Salient Span Masking (SSM) has shown itself to be an effective strategy to improve closed-book question answering performance. SSM extends general masked language model pretraining by creating additional unsupervised training sentences that mask a single entity or date span, thus oversampling factual information. Despite the success of this paradigm, the span types and sampling strategies are rela…

    Submitted 22 March, 2023; originally announced March 2023.

    Comments: 5 pages 1 figure, to appear in EACL 2023

  16. arXiv:2303.05077  [pdf, other]

    cs.CL cs.CV

    Learning the Legibility of Visual Text Perturbations

    Authors: Dev Seth, Rickard Stureborg, Danish Pruthi, Bhuwan Dhingra

    Abstract: Many adversarial attacks in NLP perturb inputs to produce visually similar strings ('ergo' → 'εrgo') which are legible to humans but degrade model performance. Although preserving legibility is a necessary condition for text perturbation, little work has been done to systematically characterize it; instead, legibility is typically loosely enforced via intuitions around the nature and…

    Submitted 10 March, 2023; v1 submitted 9 March, 2023; originally announced March 2023.

    Comments: 14 pages, 7 figures. Accepted at EACL 2023 (main, long)

  17. arXiv:2303.00242  [pdf, other]

    cs.CL

    DIFFQG: Generating Questions to Summarize Factual Changes

    Authors: Jeremy R. Cole, Palak Jain, Julian Martin Eisenschlos, Michael J. Q. Zhang, Eunsol Choi, Bhuwan Dhingra

    Abstract: Identifying the difference between two versions of the same article is useful to update knowledge bases and to understand how articles evolve. Paired texts occur naturally in diverse situations: reporters write similar news stories and maintainers of authoritative websites must keep their information up to date. We propose representing factual changes between paired documents as question-answer pa…

    Submitted 1 March, 2023; originally announced March 2023.

    Comments: 14 pages. Accepted at EACL 2023 (main, long)

  18. Interface Design for Crowdsourcing Hierarchical Multi-Label Text Annotations

    Authors: Rickard Stureborg, Bhuwan Dhingra, Jun Yang

    Abstract: Human data labeling is an important and expensive task at the heart of supervised learning systems. Hierarchies help humans understand and organize concepts. We ask whether and how concept hierarchies can inform the design of annotation interfaces to improve labeling quality and efficiency. We study this question through annotation of vaccine misinformation, where the labeling task is difficult an…

    Submitted 22 February, 2023; v1 submitted 6 February, 2023; originally announced February 2023.

    Comments: To appear in CHI-2023

    ACM Class: H.5

  19. arXiv:2209.06869  [pdf, other]

    cs.CL cs.AI cs.LG

    On the State of the Art in Authorship Attribution and Authorship Verification

    Authors: Jacob Tyo, Bhuwan Dhingra, Zachary C. Lipton

    Abstract: Despite decades of research on authorship attribution (AA) and authorship verification (AV), inconsistent dataset splits/filtering and mismatched evaluation methods make it difficult to assess the state of the art. In this paper, we present a survey of the fields, resolve points of confusion, introduce Valla that standardizes and benchmarks AA/AV datasets and metrics, provide a large-scale empiric…

    Submitted 5 October, 2022; v1 submitted 14 September, 2022; originally announced September 2022.

  20. arXiv:2204.07288  [pdf, other]

    cs.CL cs.LG

    Characterizing the Efficiency vs. Accuracy Trade-off for Long-Context NLP Models

    Authors: Phyllis Ang, Bhuwan Dhingra, Lisa Wu Wills

    Abstract: With many real-world applications of Natural Language Processing (NLP) comprising of long texts, there has been a rise in NLP benchmarks that measure the accuracy of models that can handle longer input sequences. However, these benchmarks do not consider the trade-offs between accuracy, speed, and power consumption as input sizes or model sizes are varied. In this work, we perform a systematic stu…

    Submitted 14 April, 2022; originally announced April 2022.

    Comments: Accepted at NLP Power! Workshop on Efficient Benchmarking in NLP at ACL2022

  21. arXiv:2204.06092  [pdf, other]

    cs.CL

    ASQA: Factoid Questions Meet Long-Form Answers

    Authors: Ivan Stelmakh, Yi Luan, Bhuwan Dhingra, Ming-Wei Chang

    Abstract: An abundance of datasets and availability of reliable evaluation metrics have resulted in strong progress in factoid question answering (QA). This progress, however, does not easily transfer to the task of long-form QA, where the goal is to answer questions that require in-depth explanations. The hurdles include (i) a lack of high-quality data, and (ii) the absence of a well-defined notion of the…

    Submitted 22 January, 2023; v1 submitted 12 April, 2022; originally announced April 2022.

    Comments: A minor bug in computing the ROUGE score was fixed. The fix **did not** result in any changes in observations and conclusions

  22. Time-Aware Language Models as Temporal Knowledge Bases

    Authors: Bhuwan Dhingra, Jeremy R. Cole, Julian Martin Eisenschlos, Daniel Gillick, Jacob Eisenstein, William W. Cohen

    Abstract: Many facts come with an expiration date, from the name of the President to the basketball team Lebron James plays for. But language models (LMs) are trained on snapshots of data collected at a specific moment in time, and this can limit their utility, especially in the closed-book setting where the pretraining corpus must contain the facts the model should memorize. We introduce a diagnostic datas…

    Submitted 23 April, 2022; v1 submitted 29 June, 2021; originally announced June 2021.

    Comments: Version accepted to TACL

    Journal ref: Transactions of the Association for Computational Linguistics 2022; 10 257-273

  23. arXiv:2104.04725  [pdf, other]

    cs.CL

    Fool Me Twice: Entailment from Wikipedia Gamification

    Authors: Julian Martin Eisenschlos, Bhuwan Dhingra, Jannis Bulian, Benjamin Börschinger, Jordan Boyd-Graber

    Abstract: We release FoolMeTwice (FM2 for short), a large dataset of challenging entailment pairs collected through a fun multi-player game. Gamification encourages adversarial examples, drastically lowering the number of examples that can be solved using "shortcuts" compared to other popular entailment datasets. Players are presented with two tasks. The first task asks the player to write a plausible claim…

    Submitted 10 April, 2021; originally announced April 2021.

    Comments: Published in NAACL 2021

  24. arXiv:2102.07043  [pdf, other]

    cs.AI cs.CL cs.LG

    Reasoning Over Virtual Knowledge Bases With Open Predicate Relations

    Authors: Haitian Sun, Pat Verga, Bhuwan Dhingra, Ruslan Salakhutdinov, William W. Cohen

    Abstract: We present the Open Predicate Query Language (OPQL); a method for constructing a virtual KB (VKB) trained entirely from text. Large Knowledge Bases (KBs) are indispensable for a wide-range of industry applications such as question answering and recommendation. Typically, KBs encode world knowledge in a structured, readily accessible form derived from laborious human annotation efforts. Unfortunate…

    Submitted 14 June, 2021; v1 submitted 13 February, 2021; originally announced February 2021.

    Comments: Accepted at the 38th International Conference on Machine Learning, PMLR 139, 2021

  25. arXiv:2012.00893  [pdf, other]

    cs.CL cs.LG

    Evaluating Explanations: How much do explanations from the teacher aid students?

    Authors: Danish Pruthi, Rachit Bansal, Bhuwan Dhingra, Livio Baldini Soares, Michael Collins, Zachary C. Lipton, Graham Neubig, William W. Cohen

    Abstract: While many methods purport to explain predictions by highlighting salient features, what aims these explanations serve and how they ought to be evaluated often go unstated. In this work, we introduce a framework to quantify the value of explanations via the accuracy gains that they confer on a student model trained to simulate a teacher model. Crucially, the explanations are available to the stude…

    Submitted 16 December, 2021; v1 submitted 1 December, 2020; originally announced December 2020.

    Comments: TACL 2021 (pre-MIT Press publication version)

  26. arXiv:2011.01459  [pdf, other]

    cs.CL cs.LG

    Weakly- and Semi-supervised Evidence Extraction

    Authors: Danish Pruthi, Bhuwan Dhingra, Graham Neubig, Zachary C. Lipton

    Abstract: For many prediction tasks, stakeholders desire not only predictions but also supporting evidence that a human can use to verify its correctness. However, in practice, additional annotations marking supporting evidence may only be available for a minority of training examples (if available at all). In this paper, we propose new methods to combine few evidence annotations (strong semi-supervision) w…

    Submitted 2 November, 2020; originally announced November 2020.

    Comments: Accepted to the Findings of EMNLP 2020, to be presented at BlackBoxNLP

  27. arXiv:2010.14439  [pdf, other]

    cs.CL cs.AI cs.LG

    Differentiable Open-Ended Commonsense Reasoning

    Authors: Bill Yuchen Lin, Haitian Sun, Bhuwan Dhingra, Manzil Zaheer, Xiang Ren, William W. Cohen

    Abstract: Current commonsense reasoning research focuses on developing models that use commonsense knowledge to answer multiple-choice questions. However, systems designed to answer multiple-choice questions may not be useful in applications that do not provide a small list of candidate answers to choose from. As a step towards making commonsense reasoning research more realistic, we propose to study open-e…

    Submitted 6 June, 2021; v1 submitted 24 October, 2020; originally announced October 2020.

    Comments: Accepted to NAACL 2021. Project website: https://open-csr.github.io

  28. arXiv:2004.14373  [pdf, other]

    cs.CL cs.LG

    ToTTo: A Controlled Table-To-Text Generation Dataset

    Authors: Ankur P. Parikh, Xuezhi Wang, Sebastian Gehrmann, Manaal Faruqui, Bhuwan Dhingra, Diyi Yang, Dipanjan Das

    Abstract: We present ToTTo, an open-domain English table-to-text dataset with over 120,000 training examples that proposes a controlled generation task: given a Wikipedia table and a set of highlighted table cells, produce a one-sentence description. To obtain generated targets that are natural but also faithful to the source table, we introduce a dataset construction process where annotators directly revis…

    Submitted 6 October, 2020; v1 submitted 29 April, 2020; originally announced April 2020.

    Comments: Accepted to EMNLP 2020

  29. arXiv:2002.10640  [pdf, other]

    cs.CL cs.LG

    Differentiable Reasoning over a Virtual Knowledge Base

    Authors: Bhuwan Dhingra, Manzil Zaheer, Vidhisha Balachandran, Graham Neubig, Ruslan Salakhutdinov, William W. Cohen

    Abstract: We consider the task of answering complex multi-hop questions using a corpus as a virtual knowledge base (KB). In particular, we describe a neural module, DrKIT, that traverses textual data like a KB, softly following paths of relations between mentions of entities in the corpus. At each step the module uses a combination of sparse-matrix TFIDF indices and a maximum inner product search (MIPS) on…

    Submitted 24 February, 2020; originally announced February 2020.

    Comments: ICLR 2020

  30. arXiv:1909.07913  [pdf, other]

    cs.CL cs.LG

    Learning to Deceive with Attention-Based Explanations

    Authors: Danish Pruthi, Mansi Gupta, Bhuwan Dhingra, Graham Neubig, Zachary C. Lipton

    Abstract: Attention mechanisms are ubiquitous components in neural architectures applied to natural language processing. In addition to yielding gains in predictive accuracy, attention weights are often claimed to confer interpretability, purportedly useful both for providing insights to practitioners and for explaining why a model makes its decisions to stakeholders. We call the latter use of attention mec…

    Submitted 6 April, 2020; v1 submitted 17 September, 2019; originally announced September 2019.

    Comments: Accepted to ACL 2020 as a long paper. Updated version

  31. arXiv:1909.06146  [pdf, other]

    cs.CL cs.LG q-bio.QM

    PubMedQA: A Dataset for Biomedical Research Question Answering

    Authors: Qiao Jin, Bhuwan Dhingra, Zhengping Liu, William W. Cohen, Xinghua Lu

    Abstract: We introduce PubMedQA, a novel biomedical question answering (QA) dataset collected from PubMed abstracts. The task of PubMedQA is to answer research questions with yes/no/maybe (e.g.: Do preoperative statins reduce atrial fibrillation after coronary artery bypass grafting?) using the corresponding abstracts. PubMedQA has 1k expert-annotated, 61.2k unlabeled and 211.3k artificially generated QA in…

    Submitted 13 September, 2019; originally announced September 2019.

    Comments: EMNLP 2019

  32. arXiv:1906.01081  [pdf, other]

    cs.CL

    Handling Divergent Reference Texts when Evaluating Table-to-Text Generation

    Authors: Bhuwan Dhingra, Manaal Faruqui, Ankur Parikh, Ming-Wei Chang, Dipanjan Das, William W. Cohen

    Abstract: Automatically constructed datasets for generating text from semi-structured data (tables), such as WikiBio, often contain reference texts that diverge from the information in the corresponding semi-structured data. We show that metrics which rely solely on the reference texts, such as BLEU and ROUGE, show poor correlation with human judgments when those references diverge. We propose a new metric,…

    Submitted 3 June, 2019; originally announced June 2019.

    Comments: To appear at ACL 2019

  33. arXiv:1905.11268  [pdf, other]

    cs.CL cs.CR cs.LG

    Combating Adversarial Misspellings with Robust Word Recognition

    Authors: Danish Pruthi, Bhuwan Dhingra, Zachary C. Lipton

    Abstract: To combat adversarial spelling mistakes, we propose placing a word recognition model in front of the downstream classifier. Our word recognition models build upon the RNN semi-character architecture, introducing several new backoff strategies for handling rare and unseen words. Trained to recognize words corrupted by random adds, drops, swaps, and keyboard mistakes, our method achieves 32% relativ…

    Submitted 29 August, 2019; v1 submitted 27 May, 2019; originally announced May 2019.

    Comments: ACL 2019, long paper

  34. arXiv:1904.04428  [pdf, other]

    cs.CL

    Text Generation with Exemplar-based Adaptive Decoding

    Authors: Hao Peng, Ankur P. Parikh, Manaal Faruqui, Bhuwan Dhingra, Dipanjan Das

    Abstract: We propose a novel conditioned text generation model. It draws inspiration from traditional template-based text generation techniques, where the source provides the content (i.e., what to say), and the template influences how to say it. Building on the successful encoder-decoder paradigm, it first encodes the content representation from the given input text; to produce the output, it retrieves exe…

    Submitted 10 April, 2019; v1 submitted 8 April, 2019; originally announced April 2019.

    Comments: NAACL 2019

  35. arXiv:1904.02181  [pdf, other]

    cs.CL

    Probing Biomedical Embeddings from Language Models

    Authors: Qiao Jin, Bhuwan Dhingra, William W. Cohen, Xinghua Lu

    Abstract: Contextualized word embeddings derived from pre-trained language models (LMs) show significant improvements on downstream NLP tasks. Pre-training on domain-specific corpora, such as biomedical articles, further improves their performance. In this paper, we conduct probing experiments to determine what additional information is carried intrinsically by the in-domain trained contextualized embedding…

    Submitted 3 April, 2019; originally announced April 2019.

    Comments: NAACL-HLT 2019 Workshop on Evaluating Vector Space Representations for NLP (RepEval)

  36. arXiv:1809.00782  [pdf, other]

    cs.CL cs.LG

    Open Domain Question Answering Using Early Fusion of Knowledge Bases and Text

    Authors: Haitian Sun, Bhuwan Dhingra, Manzil Zaheer, Kathryn Mazaitis, Ruslan Salakhutdinov, William W. Cohen

    Abstract: Open Domain Question Answering (QA) is evolving from complex pipelined systems to end-to-end deep neural networks. Specialized neural models have been developed for extracting answers from either text alone or Knowledge Bases (KBs) alone. In this paper we look at a more practical setting, namely QA over the combination of a KB and entity-linked text, which is appropriate when an incomplete KB is a…

    Submitted 3 September, 2018; originally announced September 2018.

    Comments: EMNLP 2018

  37. arXiv:1806.05662  [pdf, other]

    cs.LG cs.CL cs.CV stat.ML

    GLoMo: Unsupervisedly Learned Relational Graphs as Transferable Representations

    Authors: Zhilin Yang, Jake Zhao, Bhuwan Dhingra, Kaiming He, William W. Cohen, Ruslan Salakhutdinov, Yann LeCun

    Abstract: Modern deep transfer learning approaches have mainly focused on learning generic feature vectors from one task that are transferable to other tasks, such as word embeddings in language and pretrained convolutional features in vision. However, these approaches usually transfer unary features and largely ignore more structured graphical representations. This work explores the possibility of learning…

    Submitted 2 July, 2018; v1 submitted 14 June, 2018; originally announced June 2018.

  38. arXiv:1806.04313  [pdf, other]

    cs.CL cs.LG

    Embedding Text in Hyperbolic Spaces

    Authors: Bhuwan Dhingra, Christopher J. Shallue, Mohammad Norouzi, Andrew M. Dai, George E. Dahl

    Abstract: Natural language text exhibits hierarchical structure in a variety of respects. Ideally, we could incorporate our prior knowledge of this hierarchical structure into unsupervised learning algorithms that work on text data. Recent work by Nickel & Kiela (2017) proposed using hyperbolic instead of Euclidean embedding spaces to represent hierarchical data and demonstrated encouraging results when emb…

    Submitted 11 June, 2018; originally announced June 2018.

    Comments: TextGraphs 2018

  39. arXiv:1804.05922  [pdf, other]

    cs.CL cs.LG

    Neural Models for Reasoning over Multiple Mentions using Coreference

    Authors: Bhuwan Dhingra, Qiao Jin, Zhilin Yang, William W. Cohen, Ruslan Salakhutdinov

    Abstract: Many problems in NLP require aggregating information from multiple mentions of the same entity which may be far apart in the text. Existing Recurrent Neural Network (RNN) layers are biased towards short-term dependencies and hence not suited to such tasks. We present a recurrent layer which is instead biased towards coreferent dependencies. The layer uses coreference annotations extracted from an…

    Submitted 16 April, 2018; originally announced April 2018.

    Comments: NAACL 2018 (Short Paper)

  40. arXiv:1804.00720  [pdf, other]

    cs.CL cs.LG

    Simple and Effective Semi-Supervised Question Answering

    Authors: Bhuwan Dhingra, Danish Pruthi, Dheeraj Rajagopal

    Abstract: Recent success of deep learning models for the task of extractive Question Answering (QA) is hinged on the availability of large annotated corpora. However, large domain specific annotated corpora are limited and expensive to construct. In this work, we envision a system where the end user specifies a set of base documents and only a few labelled examples. Our system exploits the document structur…

    Submitted 2 April, 2018; originally announced April 2018.

    Comments: Short paper, NAACL 2018

  41. arXiv:1707.03904  [pdf, other]

    cs.CL cs.IR cs.LG

    Quasar: Datasets for Question Answering by Search and Reading

    Authors: Bhuwan Dhingra, Kathryn Mazaitis, William W. Cohen

    Abstract: We present two new large-scale datasets aimed at evaluating systems designed to comprehend a natural language query and extract its answer from a large corpus of text. The Quasar-S dataset consists of 37000 cloze-style (fill-in-the-gap) queries constructed from definitions of software entity tags on the popular website Stack Overflow. The posts and comments on the website serve as the background c…

    Submitted 8 August, 2017; v1 submitted 12 July, 2017; originally announced July 2017.

  42. arXiv:1703.08885  [pdf, other]

    cs.CL

    Question Answering from Unstructured Text by Retrieval and Comprehension

    Authors: Yusuke Watanabe, Bhuwan Dhingra, Ruslan Salakhutdinov

    Abstract: Open domain Question Answering (QA) systems must interact with external knowledge sources, such as web pages, to find relevant information. Information sources like Wikipedia, however, are not well structured and difficult to utilize in comparison with Knowledge Bases (KBs). In this work we present a two-step approach to question answering from unstructured text, consisting of a retrieval step and…

    Submitted 26 March, 2017; originally announced March 2017.

  43. arXiv:1703.02620  [pdf, other]

    cs.CL

    Linguistic Knowledge as Memory for Recurrent Neural Networks

    Authors: Bhuwan Dhingra, Zhilin Yang, William W. Cohen, Ruslan Salakhutdinov

    Abstract: Training recurrent neural networks to model long term dependencies is difficult. Hence, we propose to use external linguistic knowledge as an explicit signal to inform the model which memories it should utilize. Specifically, external knowledge is used to augment a sequence with typed edges between arbitrarily distant elements, and the resulting graph is decomposed into directed acyclic subgraphs.…

    Submitted 7 March, 2017; originally announced March 2017.

  44. arXiv:1703.01557  [pdf, other]

    cs.LG cs.CL stat.ML

    Using Graphs of Classifiers to Impose Declarative Constraints on Semi-supervised Learning

    Authors: Lidong Bing, William W. Cohen, Bhuwan Dhingra

    Abstract: We propose a general approach to modeling semi-supervised learning (SSL) algorithms. Specifically, we present a declarative language for modeling both traditional supervised classification tasks and many SSL heuristics, including both well-known heuristics such as co-training and novel domain-specific heuristics. In addition to representing individual SSL heuristics, we show that multiple heuristi…

    Submitted 23 March, 2017; v1 submitted 4 March, 2017; originally announced March 2017.

    Comments: 8 pages, 3 figures

  45. arXiv:1703.00993  [pdf, other]

    cs.CL

    A Comparative Study of Word Embeddings for Reading Comprehension

    Authors: Bhuwan Dhingra, Hanxiao Liu, Ruslan Salakhutdinov, William W. Cohen

    Abstract: The focus of past machine learning research for Reading Comprehension tasks has been primarily on the design of novel deep learning architectures. Here we show that seemingly minor choices made on (1) the use of pre-trained word embeddings, and (2) the representation of out-of-vocabulary tokens at test time, can turn out to have a larger impact than architectural choices on the final performance.…

    Submitted 2 March, 2017; originally announced March 2017.

  46. arXiv:1612.05688  [pdf, other]

    cs.LG cs.AI cs.CL

    A User Simulator for Task-Completion Dialogues

    Authors: Xiujun Li, Zachary C. Lipton, Bhuwan Dhingra, Lihong Li, Jianfeng Gao, Yun-Nung Chen

    Abstract: Despite widespread interests in reinforcement-learning for task-oriented dialogue systems, several obstacles can frustrate research and development progress. First, reinforcement learners typically require interaction with the environment, so conventional dialogue corpora cannot be used directly. Second, each task presents specific challenges, requiring separate corpus of task-specific annotated d…

    Submitted 13 November, 2017; v1 submitted 16 December, 2016; originally announced December 2016.

    Comments: 14 pages, 2 Figures

  47. arXiv:1611.01724  [pdf, other]

    cs.CL cs.LG

    Words or Characters? Fine-grained Gating for Reading Comprehension

    Authors: Zhilin Yang, Bhuwan Dhingra, Ye Yuan, Junjie Hu, William W. Cohen, Ruslan Salakhutdinov

    Abstract: Previous work combines word-level and character-level representations using concatenation or scalar weighting, which is suboptimal for high-level tasks like reading comprehension. We present a fine-grained gating mechanism to dynamically combine word-level and character-level representations based on properties of the words. We also extend the idea of fine-grained gating to modeling the interactio…

    Submitted 11 September, 2017; v1 submitted 5 November, 2016; originally announced November 2016.

    Comments: Accepted as a conference paper at ICLR 2017

  48. arXiv:1609.00777  [pdf, other]

    cs.CL cs.LG

    Towards End-to-End Reinforcement Learning of Dialogue Agents for Information Access

    Authors: Bhuwan Dhingra, Lihong Li, Xiujun Li, Jianfeng Gao, Yun-Nung Chen, Faisal Ahmed, Li Deng

    Abstract: This paper proposes KB-InfoBot -- a multi-turn dialogue agent which helps users search Knowledge Bases (KBs) without composing complicated queries. Such goal-oriented dialogue agents typically need to interact with an external database to access real-world knowledge. Previous systems achieved this by issuing a symbolic query to the KB to retrieve entries based on their attributes. However, such sy…

    Submitted 20 April, 2017; v1 submitted 2 September, 2016; originally announced September 2016.

    Comments: Accepted at ACL 2017

  49. arXiv:1606.03398  [pdf, other]

    cs.CL

    Bootstrapping Distantly Supervised IE using Joint Learning and Small Well-structured Corpora

    Authors: Lidong Bing, Bhuwan Dhingra, Kathryn Mazaitis, Jong Hyuk Park, William W. Cohen

    Abstract: We propose a framework to improve performance of distantly-supervised relation extraction, by jointly learning to solve two related tasks: concept-instance extraction and relation extraction. We combine this with a novel use of document structure: in some small, well-structured corpora, sections can be identified that correspond to relation arguments, and distantly-labeled examples from such secti…

    Submitted 10 August, 2016; v1 submitted 10 June, 2016; originally announced June 2016.

    Comments: 10 pages, 5 figures

  50. arXiv:1606.01549  [pdf, other]

    cs.CL cs.LG

    Gated-Attention Readers for Text Comprehension

    Authors: Bhuwan Dhingra, Hanxiao Liu, Zhilin Yang, William W. Cohen, Ruslan Salakhutdinov

    Abstract: In this paper we study the problem of answering cloze-style questions over documents. Our model, the Gated-Attention (GA) Reader, integrates a multi-hop architecture with a novel attention mechanism, which is based on multiplicative interactions between the query embedding and the intermediate states of a recurrent neural network document reader. This enables the reader to build query-specific rep…

    Submitted 21 April, 2017; v1 submitted 5 June, 2016; originally announced June 2016.

    Comments: Accepted at ACL 2017