Showing 1–20 of 20 results for author: Verga, P

  1. arXiv:2404.18796

    cs.CL cs.AI

    Replacing Judges with Juries: Evaluating LLM Generations with a Panel of Diverse Models

    Authors: Pat Verga, Sebastian Hofstatter, Sophia Althammer, Yixuan Su, Aleksandra Piktus, Arkady Arkhangorodsky, Minjie Xu, Naomi White, Patrick Lewis

    Abstract: As Large Language Models (LLMs) have become more advanced, they have outpaced our abilities to accurately evaluate their quality. Not only is finding data to adequately probe particular model properties difficult, but evaluating the correctness of a model's freeform generation alone is a challenge. To address this, many evaluations now rely on using LLMs themselves as judges to score the quality o…

    Submitted 1 May, 2024; v1 submitted 29 April, 2024; originally announced April 2024.

  2. arXiv:2212.10381

    cs.CL

    To Adapt or to Annotate: Challenges and Interventions for Domain Adaptation in Open-Domain Question Answering

    Authors: Dheeru Dua, Emma Strubell, Sameer Singh, Pat Verga

    Abstract: Recent advances in open-domain question answering (ODQA) have demonstrated impressive accuracy on standard Wikipedia style benchmarks. However, it is less clear how robust these models are and how well they perform when applied to real-world applications in drastically different domains. While there has been some work investigating how well ODQA models perform when tested for out-of-domain (OOD) g…

    Submitted 20 December, 2022; originally announced December 2022.

  3. arXiv:2212.08037

    cs.CL

    Attributed Question Answering: Evaluation and Modeling for Attributed Large Language Models

    Authors: Bernd Bohnet, Vinh Q. Tran, Pat Verga, Roee Aharoni, Daniel Andor, Livio Baldini Soares, Massimiliano Ciaramita, Jacob Eisenstein, Kuzman Ganchev, Jonathan Herzig, Kai Hui, Tom Kwiatkowski, Ji Ma, Jianmo Ni, Lierni Sestorain Saralegui, Tal Schuster, William W. Cohen, Michael Collins, Dipanjan Das, Donald Metzler, Slav Petrov, Kellie Webster

    Abstract: Large language models (LLMs) have shown impressive results while requiring little or no direct supervision. Further, there is mounting evidence that LLMs may have potential in information-seeking scenarios. We believe the ability of an LLM to attribute the text that it generates is likely to be crucial in this setting. We formulate and study Attributed QA as a key first step in the development of…

    Submitted 10 February, 2023; v1 submitted 15 December, 2022; originally announced December 2022.

  4. arXiv:2210.02928

    cs.CL cs.AI cs.CV

    MuRAG: Multimodal Retrieval-Augmented Generator for Open Question Answering over Images and Text

    Authors: Wenhu Chen, Hexiang Hu, Xi Chen, Pat Verga, William W. Cohen

    Abstract: While language models store a massive amount of world knowledge implicitly in their parameters, even very large models often fail to encode information about rare entities and events, while incurring huge computational costs. Recently, retrieval-augmented models, such as REALM, RAG, and RETRO, have incorporated world knowledge into language generation by leveraging an external non-parametric index…

    Submitted 20 October, 2022; v1 submitted 6 October, 2022; originally announced October 2022.

    Comments: Accepted to EMNLP 2022 main conference

  5. arXiv:2207.00630

    cs.AI

    QA Is the New KR: Question-Answer Pairs as Knowledge Bases

    Authors: Wenhu Chen, William W. Cohen, Michiel De Jong, Nitish Gupta, Alessandro Presta, Pat Verga, John Wieting

    Abstract: In this position paper, we propose a new approach to generating a type of knowledge base (KB) from text, based on question generation and entity linking. We argue that the proposed type of KB has many of the key advantages of a traditional symbolic KB: in particular, it consists of small modular components, which can be combined compositionally to answer complex queries, including relational queri…

    Submitted 1 July, 2022; originally announced July 2022.

  6. arXiv:2204.13761

    cs.CL

    Faithful to the Document or to the World? Mitigating Hallucinations via Entity-linked Knowledge in Abstractive Summarization

    Authors: Yue Dong, John Wieting, Pat Verga

    Abstract: Despite recent advances in abstractive summarization, current summarization systems still suffer from content hallucinations where models generate text that is either irrelevant or contradictory to the source document. However, prior work has been predicated on the assumption that any generated facts not appearing explicitly in the source are undesired hallucinations. Methods have been proposed to…

    Submitted 28 April, 2022; originally announced April 2022.

    Comments: 12 pages, 5 figures

  7. arXiv:2204.04581

    cs.CL cs.AI cs.LG

    Augmenting Pre-trained Language Models with QA-Memory for Open-Domain Question Answering

    Authors: Wenhu Chen, Pat Verga, Michiel de Jong, John Wieting, William Cohen

    Abstract: Retrieval augmented language models have recently become the standard for knowledge intensive tasks. Rather than relying purely on latent semantics within the parameters of large neural models, these methods enlist a semi-parametric memory to encode an index of knowledge for the model to retrieve over. Most prior work has employed text passages as the unit of knowledge, which has high coverage at…

    Submitted 23 January, 2023; v1 submitted 9 April, 2022; originally announced April 2022.

    Comments: Accepted by EACL 2023

  8. arXiv:2109.14364

    cs.CL

    Multilingual Fact Linking

    Authors: Keshav Kolluru, Martin Rezk, Pat Verga, William W. Cohen, Partha Talukdar

    Abstract: Knowledge-intensive NLP tasks can benefit from linking natural language text with facts from a Knowledge Graph (KG). Although facts themselves are language-agnostic, the fact labels (i.e., language-specific representation of the fact) in the KG are often present only in a few languages. This makes it challenging to link KG facts to sentences in languages other than the limited set of languages. To…

    Submitted 30 September, 2021; v1 submitted 29 September, 2021; originally announced September 2021.

    Comments: AKBC 2021

  9. arXiv:2102.07043

    cs.AI cs.CL cs.LG

    Reasoning Over Virtual Knowledge Bases With Open Predicate Relations

    Authors: Haitian Sun, Pat Verga, Bhuwan Dhingra, Ruslan Salakhutdinov, William W. Cohen

    Abstract: We present the Open Predicate Query Language (OPQL), a method for constructing a virtual KB (VKB) trained entirely from text. Large Knowledge Bases (KBs) are indispensable for a wide range of industry applications such as question answering and recommendation. Typically, KBs encode world knowledge in a structured, readily accessible form derived from laborious human annotation efforts. Unfortunate…

    Submitted 14 June, 2021; v1 submitted 13 February, 2021; originally announced February 2021.

    Comments: Accepted at the 38th International Conference on Machine Learning, PMLR 139, 2021

  10. arXiv:2007.00849

    cs.CL cs.AI cs.LG

    Facts as Experts: Adaptable and Interpretable Neural Memory over Symbolic Knowledge

    Authors: Pat Verga, Haitian Sun, Livio Baldini Soares, William W. Cohen

    Abstract: Massive language models are the core of modern NLP modeling and have been shown to encode impressive amounts of commonsense and factual information. However, that knowledge exists only within the latent parameters of the model, inaccessible to inspection and interpretation, and even worse, factual information memorized from the training corpora is likely to become stale as the world changes. Knowl…

    Submitted 1 July, 2020; originally announced July 2020.

  11. arXiv:1912.01070

    cs.CL cs.IR cs.LG

    Simultaneously Linking Entities and Extracting Relations from Biomedical Text Without Mention-level Supervision

    Authors: Trapit Bansal, Pat Verga, Neha Choudhary, Andrew McCallum

    Abstract: Understanding the meaning of text often involves reasoning about entities and their relationships. This requires identifying textual mentions of entities, linking them to a canonical concept, and discerning their relationships. These tasks are nearly always viewed as separate components within a pipeline, each requiring a distinct model and training data. While relation extraction can often be tra…

    Submitted 2 December, 2019; originally announced December 2019.

    Comments: Accepted in AAAI 2020

  12. arXiv:1904.02142

    cs.CL

    Unsupervised Latent Tree Induction with Deep Inside-Outside Recursive Autoencoders

    Authors: Andrew Drozdov, Pat Verga, Mohit Yadav, Mohit Iyyer, Andrew McCallum

    Abstract: We introduce deep inside-outside recursive autoencoders (DIORA), a fully-unsupervised method for discovering syntax that simultaneously learns representations for constituents within the induced tree. Our approach predicts each word in an input sentence conditioned on the rest of the sentence and uses inside-outside dynamic programming to consider all possible binary trees over the sentence. At te…

    Submitted 4 April, 2019; v1 submitted 3 April, 2019; originally announced April 2019.

    Comments: 14 pages, 8 figures, 8 tables. NAACL 2019

  13. arXiv:1804.08199

    cs.CL

    Linguistically-Informed Self-Attention for Semantic Role Labeling

    Authors: Emma Strubell, Patrick Verga, Daniel Andor, David Weiss, Andrew McCallum

    Abstract: Current state-of-the-art semantic role labeling (SRL) uses a deep neural network with no explicit linguistic features. However, prior work has shown that gold syntax trees can dramatically improve SRL decoding, suggesting the possibility of increased accuracy from explicit modeling of syntax. In this work, we present linguistically-informed self-attention (LISA): a neural network model that combin…

    Submitted 12 November, 2018; v1 submitted 22 April, 2018; originally announced April 2018.

    Comments: In Conference on Empirical Methods in Natural Language Processing (EMNLP). Brussels, Belgium. October 2018

  14. arXiv:1802.10569

    cs.CL

    Simultaneously Self-Attending to All Mentions for Full-Abstract Biological Relation Extraction

    Authors: Patrick Verga, Emma Strubell, Andrew McCallum

    Abstract: Most work in relation extraction forms a prediction by looking at a short span of text within a single sentence containing a single entity pair mention. This approach often does not consider interactions across mentions, requires redundant computation for each mention pair, and ignores relationships expressed across sentence boundaries. These problems are exacerbated by the document- (rather than…

    Submitted 28 February, 2018; originally announced February 2018.

    Comments: NAACL 2018

  15. arXiv:1711.05795

    cs.CL cs.NE

    Finer Grained Entity Typing with TypeNet

    Authors: Shikhar Murty, Patrick Verga, Luke Vilnis, Andrew McCallum

    Abstract: We consider the challenging problem of entity typing over an extremely fine grained set of types, wherein a single mention or entity can have many simultaneous and often hierarchically-structured types. Despite the importance of the problem, there is a relative lack of resources in the form of fine-grained, deep type hierarchies aligned to existing knowledge bases. In response, we introduce TypeNe…

    Submitted 15 November, 2017; originally announced November 2017.

    Comments: Accepted at 6th Workshop on Automated Knowledge Base Construction (AKBC) at NIPS 2017

  16. arXiv:1710.08312

    cs.CL

    Attending to All Mention Pairs for Full Abstract Biological Relation Extraction

    Authors: Patrick Verga, Emma Strubell, Ofer Shai, Andrew McCallum

    Abstract: Most work in relation extraction forms a prediction by looking at a short span of text within a single sentence containing a single entity pair mention. However, many relation types, particularly in biomedical text, are expressed across sentences or require a large context to disambiguate. We propose a model to consider all mention and entity pairs simultaneously in order to make a prediction. We…

    Submitted 15 November, 2017; v1 submitted 23 October, 2017; originally announced October 2017.

    Comments: 6th Workshop on Automated Knowledge Base Construction (AKBC)

  17. arXiv:1702.02098

    cs.CL

    Fast and Accurate Entity Recognition with Iterated Dilated Convolutions

    Authors: Emma Strubell, Patrick Verga, David Belanger, Andrew McCallum

    Abstract: Today when many practitioners run basic NLP on the entire web and large-volume traffic, faster methods are paramount to saving time and energy costs. Recent advances in GPU hardware have led to the emergence of bi-directional LSTMs as a standard method for obtaining per-token vector representations serving as input to labeling tasks such as NER (often followed by prediction in a linear-chain CRF).…

    Submitted 22 July, 2017; v1 submitted 7 February, 2017; originally announced February 2017.

    Comments: In Conference on Empirical Methods in Natural Language Processing (EMNLP). Copenhagen, Denmark. September 2017

  18. arXiv:1606.05804

    cs.CL

    Generalizing to Unseen Entities and Entity Pairs with Row-less Universal Schema

    Authors: Patrick Verga, Arvind Neelakantan, Andrew McCallum

    Abstract: Universal schema predicts the types of entities and relations in a knowledge base (KB) by jointly embedding the union of all available schema types---not only types from multiple structured databases (such as Freebase or Wikipedia infoboxes), but also types expressed as textual patterns from raw text. This prediction is typically modeled as a matrix completion problem, with one type per column, an…

    Submitted 9 January, 2017; v1 submitted 18 June, 2016; originally announced June 2016.

    Comments: EACL 2017. arXiv admin note: text overlap with arXiv:1604.06361

  19. arXiv:1604.06361

    cs.CL

    Row-less Universal Schema

    Authors: Patrick Verga, Andrew McCallum

    Abstract: Universal schema jointly embeds knowledge bases and textual patterns to reason about entities and relations for automatic knowledge base construction and information extraction. In the past, entity pairs and relations were represented as learned vectors with compatibility determined by a scoring function, limiting generalization to unseen text patterns and entities. Recently, 'column-less' version…

    Submitted 21 April, 2016; originally announced April 2016.

    Comments: AKBC 2016 Workshop

  20. arXiv:1511.06396

    cs.CL cs.LG

    Multilingual Relation Extraction using Compositional Universal Schema

    Authors: Patrick Verga, David Belanger, Emma Strubell, Benjamin Roth, Andrew McCallum

    Abstract: Universal schema builds a knowledge base (KB) of entities and relations by jointly embedding all relation types from input KBs as well as textual patterns expressing relations from raw text. In most previous applications of universal schema, each textual pattern is represented as a single embedding, preventing generalization to unseen patterns. Recent work employs a neural network to capture patte…

    Submitted 3 March, 2016; v1 submitted 19 November, 2015; originally announced November 2015.

    Comments: Accepted to NAACL 2016