Showing 1–32 of 32 results for author: Vakulenko, S

  1. arXiv:2408.10357

    cs.CL cs.IR

    Beyond Relevant Documents: A Knowledge-Intensive Approach for Query-Focused Summarization using Large Language Models

    Authors: Weijia Zhang, Jia-Hong Huang, Svitlana Vakulenko, Yumo Xu, Thilina Rajapakse, Evangelos Kanoulas

    Abstract: Query-focused summarization (QFS) is a fundamental task in natural language processing with broad applications, including search engines and report generation. However, traditional approaches assume the availability of relevant documents, which may not always hold in practical scenarios, especially in highly specialized topics. To address this limitation, we propose a novel knowledge-intensive app…

    Submitted 19 August, 2024; originally announced August 2024.

    Comments: Accepted by the 27th International Conference on Pattern Recognition (ICPR 2024)

  2. arXiv:2301.05174

    cs.IR cs.CV cs.LG cs.MM

    Scene-centric vs. Object-centric Image-Text Cross-modal Retrieval: A Reproducibility Study

    Authors: Mariya Hendriksen, Svitlana Vakulenko, Ernst Kuiper, Maarten de Rijke

    Abstract: Most approaches to cross-modal retrieval (CMR) focus either on object-centric datasets, meaning that each document depicts or describes a single object, or on scene-centric datasets, meaning that each image depicts or describes a complex scene that involves multiple objects and relations between them. We posit that a robust CMR model should generalize well across both dataset types. Despite recent…

    Submitted 10 October, 2023; v1 submitted 12 January, 2023; originally announced January 2023.

    Comments: 18 pages, accepted as a reproducibility paper at ECIR 2023

  3. arXiv:2210.06164

    cs.CL cs.IR

    Focusing on Context is NICE: Improving Overshadowed Entity Disambiguation

    Authors: Vera Provatorova, Simone Tedeschi, Svitlana Vakulenko, Roberto Navigli, Evangelos Kanoulas

    Abstract: Entity disambiguation (ED) is the task of mapping an ambiguous entity mention to the corresponding entry in a structured knowledge base. Previous research showed that entity overshadowing is a significant challenge for existing ED models: when presented with an ambiguous entity mention, the models are much more likely to rank a more frequent yet less contextually relevant entity at the top. Here,…

    Submitted 12 October, 2022; originally announced October 2022.

  4. On the Impact of Speech Recognition Errors in Passage Retrieval for Spoken Question Answering

    Authors: Georgios Sidiropoulos, Svitlana Vakulenko, Evangelos Kanoulas

    Abstract: Interacting with a speech interface to query a Question Answering (QA) system is becoming increasingly popular. Typically, QA systems rely on passage retrieval to select candidate contexts and reading comprehension to extract the final answer. While there has been some attention to improving the reading comprehension part of QA systems against errors that automatic speech recognition (ASR) models…

    Submitted 26 September, 2022; originally announced September 2022.

    Comments: Accepted at 31st ACM International Conference on Information and Knowledge Management (CIKM 2022)

  5. arXiv:2208.03197

    cs.CL

    Low-Resource Dense Retrieval for Open-Domain Question Answering: A Comprehensive Survey

    Authors: Xiaoyu Shen, Svitlana Vakulenko, Marco del Tredici, Gianni Barlacchi, Bill Byrne, Adrià de Gispert

    Abstract: Dense retrieval (DR) approaches based on powerful pre-trained language models (PLMs) achieved significant advances and have become a key component for modern open-domain question-answering systems. However, they require large amounts of manual annotations to perform competitively, which is infeasible to scale. To address this, a growing body of research works have recently focused on improving DR…

    Submitted 5 August, 2022; originally announced August 2022.

  6. arXiv:2205.02517

    cs.CL

    A Simple Contrastive Learning Objective for Alleviating Neural Text Degeneration

    Authors: Shaojie Jiang, Ruqing Zhang, Svitlana Vakulenko, Maarten de Rijke

    Abstract: The cross-entropy objective has proved to be an all-purpose training objective for autoregressive language models (LMs). However, without considering the penalization of problematic tokens, LMs trained using cross-entropy exhibit text degeneration. To address this, unlikelihood training has been proposed to reduce the probability of unlikely tokens predicted by LMs. But unlikelihood does not consi…

    Submitted 19 May, 2022; v1 submitted 5 May, 2022; originally announced May 2022.

    Comments: 22 pages, 11 figures, 8 tables

  7. arXiv:2201.11094

    cs.IR cs.CL

    SCAI-QReCC Shared Task on Conversational Question Answering

    Authors: Svitlana Vakulenko, Johannes Kiesel, Maik Fröbe

    Abstract: Search-Oriented Conversational AI (SCAI) is an established venue that regularly puts a spotlight upon the recent work advancing the field of conversational search. SCAI'21 was organised as an independent on-line event and featured a shared task on conversational question answering. Since all of the participant teams experimented with answer generation models for this task, we identified evaluation…

    Submitted 26 January, 2022; originally announced January 2022.

    Comments: 10 pages

  8. arXiv:2112.11294

    cs.IR cs.LG cs.MM

    Extending CLIP for Category-to-image Retrieval in E-commerce

    Authors: Mariya Hendriksen, Maurits Bleeker, Svitlana Vakulenko, Nanne van Noord, Ernst Kuiper, Maarten de Rijke

    Abstract: E-commerce provides rich multimodal data that is barely leveraged in practice. One aspect of this data is a category tree that is being used in search and recommendation. However, in practice, during a user's session there is often a mismatch between a textual and a visual representation of a given category. Motivated by the problem, we introduce the task of category-to-image retrieval in e-commer…

    Submitted 4 January, 2022; v1 submitted 21 December, 2021; originally announced December 2021.

    Comments: 15 pages, accepted as a full paper at ECIR 2022

  9. arXiv:2112.07536

    cs.CL cs.IR

    Tackling Query-Focused Summarization as A Knowledge-Intensive Task: A Pilot Study

    Authors: Weijia Zhang, Svitlana Vakulenko, Thilina Rajapakse, Yumo Xu, Evangelos Kanoulas

    Abstract: Query-focused summarization (QFS) requires generating a summary given a query using a set of relevant documents. However, such relevant documents should be annotated manually and thus are not readily available in realistic scenarios. To address this limitation, we tackle the QFS task as a knowledge-intensive (KI) task without access to any relevant documents. Instead, we assume that these document…

    Submitted 31 July, 2023; v1 submitted 14 December, 2021; originally announced December 2021.

    Comments: Accepted by Gen-IR@SIGIR 2023 workshop

  10. arXiv:2108.10949

    cs.CL

    Robustness Evaluation of Entity Disambiguation Using Prior Probes: the Case of Entity Overshadowing

    Authors: Vera Provatorova, Svitlana Vakulenko, Samarth Bhargav, Evangelos Kanoulas

    Abstract: Entity disambiguation (ED) is the last step of entity linking (EL), when candidate entities are reranked according to the context they appear in. All datasets for training and evaluating models for EL consist of convenience samples, such as news articles and tweets, that propagate the prior probability bias of the entity distribution towards more frequently occurring entities. It was previously sh…

    Submitted 14 December, 2021; v1 submitted 24 August, 2021; originally announced August 2021.

    Journal ref: Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing (2021) 10501-10510

  11. VerbCL: A Dataset of Verbatim Quotes for Highlight Extraction in Case Law

    Authors: Julien Rossi, Svitlana Vakulenko, Evangelos Kanoulas

    Abstract: Citing legal opinions is a key part of legal argumentation, an expert task that requires retrieval, extraction and summarization of information from court decisions. The identification of legally salient parts in an opinion for the purpose of citation may be seen as a domain-specific formulation of a highlight extraction or passage retrieval task. As similar tasks in other domains such as web sear…

    Submitted 23 August, 2021; originally announced August 2021.

    Comments: CIKM 2021, Resource Track

  12. arXiv:2106.08433

    cs.IR

    Combining Lexical and Dense Retrieval for Computationally Efficient Multi-hop Question Answering

    Authors: Georgios Sidiropoulos, Nikos Voskarides, Svitlana Vakulenko, Evangelos Kanoulas

    Abstract: In simple open-domain question answering (QA), dense retrieval has become one of the standard approaches for retrieving the relevant passages to infer an answer. Recently, dense retrieval also achieved state-of-the-art results in multi-hop QA, where aggregating information from multiple pieces of information and reasoning over them is required. Despite their success, dense retrieval methods are co…

    Submitted 22 September, 2021; v1 submitted 15 June, 2021; originally announced June 2021.

    Comments: Accepted at the 2nd Workshop on Simple and Efficient Natural Language Processing (SustaiNLP 2021)

  13. arXiv:2104.07096

    cs.IR

    A Large-Scale Analysis of Mixed Initiative in Information-Seeking Dialogues for Conversational Search

    Authors: Svitlana Vakulenko, Evangelos Kanoulas, Maarten de Rijke

    Abstract: Conversational search is a relatively young area of research that aims at automating an information-seeking dialogue. In this paper we help to position it with respect to other research areas within conversational Artificial Intelligence (AI) by analysing the structural properties of an information-seeking dialogue. To this end, we perform a large-scale dialogue analysis of more than 150K transcri…

    Submitted 8 June, 2021; v1 submitted 14 April, 2021; originally announced April 2021.

    Comments: 32 pages; To appear in ACM Transactions on Information Systems (TOIS), Special Issue on Conversational Search and Recommendation. 2021

  14. arXiv:2102.08795

    cs.IR cs.CL

    Leveraging Query Resolution and Reading Comprehension for Conversational Passage Retrieval

    Authors: Svitlana Vakulenko, Nikos Voskarides, Zhucheng Tu, Shayne Longpre

    Abstract: This paper describes the participation of UvA.ILPS group at the TREC CAsT 2020 track. Our passage retrieval pipeline consists of (i) an initial retrieval module that uses BM25, and (ii) a re-ranking module that combines the score of a BERT ranking model with the score of a machine comprehension model adjusted for passage retrieval. An important challenge in conversational passage retrieval is that…

    Submitted 17 February, 2021; originally announced February 2021.

    Comments: TREC 2020

  15. arXiv:2101.07382

    cs.IR cs.CL

    A Comparison of Question Rewriting Methods for Conversational Passage Retrieval

    Authors: Svitlana Vakulenko, Nikos Voskarides, Zhucheng Tu, Shayne Longpre

    Abstract: Conversational passage retrieval relies on question rewriting to modify the original question so that it no longer depends on the conversation history. Several methods for question rewriting have recently been proposed, but they were compared under different retrieval pipelines. We bridge this gap by thoroughly evaluating those question rewriting methods on the TREC CAsT 2019 and 2020 datasets und…

    Submitted 18 January, 2021; originally announced January 2021.

    Comments: ECIR 2021 short paper

  16. arXiv:2012.03704

    cs.IR

    Conversational Browsing

    Authors: Svitlana Vakulenko, Vadim Savenkov, Maarten de Rijke

    Abstract: How can we better understand the mechanisms behind multi-turn information seeking dialogues? How can we use these insights to design a dialogue system that does not require explicit query formulation upfront as in question answering? To answer these questions, we collected observations of human participants performing a similar task to obtain inspiration for the system design. Then, we studied the…

    Submitted 7 December, 2020; originally announced December 2020.

  17. arXiv:2010.06835

    cs.CL

    A Wrong Answer or a Wrong Question? An Intricate Relationship between Question Reformulation and Answer Selection in Conversational Question Answering

    Authors: Svitlana Vakulenko, Shayne Longpre, Zhucheng Tu, Raviteja Anantha

    Abstract: The dependency between an adequate question formulation and correct answer selection is a very intriguing but still underexplored area. In this paper, we show that question rewriting (QR) of the conversational context allows to shed more light on this phenomenon and also use it to evaluate robustness of different answer selection approaches. We introduce a simple framework that enables an automate…

    Submitted 3 February, 2022; v1 submitted 13 October, 2020; originally announced October 2020.

    Comments: Accepted at the Workshop on Search-Oriented Conversational AI (SCAI) 2020. Code for error analysis: https://github.com/svakulenk0/QRQA. arXiv admin note: text overlap with arXiv:2004.14652

  18. arXiv:2010.04898

    cs.IR cs.CL

    Open-Domain Question Answering Goes Conversational via Question Rewriting

    Authors: Raviteja Anantha, Svitlana Vakulenko, Zhucheng Tu, Shayne Longpre, Stephen Pulman, Srinivas Chappidi

    Abstract: We introduce a new dataset for Question Rewriting in Conversational Context (QReCC), which contains 14K conversations with 80K question-answer pairs. The task in QReCC is to find answers to conversational questions within a collection of 10M web pages (split into 54M passages). Answers to questions in the same conversation may be distributed across several web pages. QReCC provides annotations tha…

    Submitted 14 April, 2021; v1 submitted 10 October, 2020; originally announced October 2020.

    Comments: 15 pages, 10 tables, 3 figures, accepted at NAACL 2021

  19. An Analysis of Mixed Initiative and Collaboration in Information-Seeking Dialogues

    Authors: Svitlana Vakulenko, Evangelos Kanoulas, Maarten de Rijke

    Abstract: The ability to engage in mixed-initiative interaction is one of the core requirements for a conversational search system. How to achieve this is poorly understood. We propose a set of unsupervised metrics, termed ConversationShape, that highlights the role each of the conversation participants plays by comparing the distribution of vocabulary and utterance types. Using ConversationShape as a lens,…

    Submitted 25 May, 2020; originally announced May 2020.

    Comments: SIGIR 2020 short conference paper

  20. arXiv:2004.14652

    cs.IR cs.LG

    Question Rewriting for Conversational Question Answering

    Authors: Svitlana Vakulenko, Shayne Longpre, Zhucheng Tu, Raviteja Anantha

    Abstract: Conversational question answering (QA) requires the ability to correctly interpret a question in the context of previous conversation turns. We address the conversational QA task by decomposing it into question rewriting and question answering subtasks. The question rewriting (QR) subtask is specifically designed to reformulate ambiguous questions, which depend on the conversational context, into…

    Submitted 23 October, 2020; v1 submitted 30 April, 2020; originally announced April 2020.

    Comments: Version accepted to WSDM 2021

  21. arXiv:2001.06910

    cs.IR

    Common Conversational Community Prototype: Scholarly Conversational Assistant

    Authors: Krisztian Balog, Lucie Flekova, Matthias Hagen, Rosie Jones, Martin Potthast, Filip Radlinski, Mark Sanderson, Svitlana Vakulenko, Hamed Zamani

    Abstract: This paper discusses the potential for creating academic resources (tools, data, and evaluation approaches) to support research in conversational search, by focusing on realistic information needs and conversational interactions. Specifically, we propose to develop and operate a prototype conversational search system for scholarly activities. This Scholarly Conversational Assistant would serve as…

    Submitted 19 January, 2020; originally announced January 2020.

  22. arXiv:1912.06859

    cs.IR cs.CL

    Knowledge-based Conversational Search

    Authors: Svitlana Vakulenko

    Abstract: Conversational interfaces that allow for intuitive and comprehensive access to digitally stored information remain an ambitious goal. In this thesis, we lay foundations for designing conversational search systems by analyzing the requirements and proposing concrete solutions for automating some of the basic components and tasks that such systems should support. We describe several interdependent s…

    Submitted 14 December, 2019; originally announced December 2019.

    Comments: PhD thesis

  23. arXiv:1909.03653

    cs.IR

    Open Data Chatbot

    Authors: Sophia Keyner, Vadim Savenkov, Svitlana Vakulenko

    Abstract: Recently, chatbots received an increased attention from industry and diverse research communities as a dialogue-based interface providing advanced human-computer interactions. On the other hand, Open Data continues to be an important trend and a potential enabler for government transparency and citizen participation. This paper shows how these two paradigms can be combined to help non-expert users…

    Submitted 9 September, 2019; originally announced September 2019.

    Journal ref: The Semantic Web - 16th International Conference, ESWC 2019, Portoroz, Slovenia, June 2-6, 2019

  24. arXiv:1908.06917

    cs.CL cs.AI cs.IR

    Message Passing for Complex Question Answering over Knowledge Graphs

    Authors: Svitlana Vakulenko, Javier David Fernandez Garcia, Axel Polleres, Maarten de Rijke, Michael Cochez

    Abstract: Question answering over knowledge graphs (KGQA) has evolved from simple single-fact questions to complex questions that require graph traversal and aggregation. We propose a novel approach for complex KGQA that uses unsupervised message passing, which propagates confidence scores obtained by parsing an input question and matching terms in the knowledge graph to a set of possible answers. First, we…

    Submitted 19 August, 2019; originally announced August 2019.

    Comments: Accepted in CIKM 2019

  25. arXiv:1812.10720

    cs.IR cs.CL

    QRFA: A Data-Driven Model of Information-Seeking Dialogues

    Authors: Svitlana Vakulenko, Kate Revoredo, Claudio Di Ciccio, Maarten de Rijke

    Abstract: Understanding the structure of interaction processes helps us to improve information-seeking dialogue systems. Analyzing an interaction process boils down to discovering patterns in sequences of alternating utterances exchanged between a user and an agent. Process mining techniques have been successfully applied to analyze structured event logs, discovering the underlying process models or evaluat…

    Submitted 27 December, 2018; originally announced December 2018.

    Comments: Advances in Information Retrieval. Proceedings of the 41st European Conference on Information Retrieval (ECIR '19), 2019

  26. arXiv:1806.06411

    cs.CL cs.AI

    Measuring Semantic Coherence of a Conversation

    Authors: Svitlana Vakulenko, Maarten de Rijke, Michael Cochez, Vadim Savenkov, Axel Polleres

    Abstract: Conversational systems have become increasingly popular as a way for humans to interact with computers. To be able to provide intelligent responses, conversational systems must correctly model the structure and semantics of a conversation. We introduce the task of measuring semantic (in)coherence in a conversation with respect to background knowledge, which relies on the identification of semantic…

    Submitted 17 June, 2018; originally announced June 2018.

  27. arXiv:1709.05298

    cs.HC cs.IR

    Conversational Exploratory Search via Interactive Storytelling

    Authors: Svitlana Vakulenko, Ilya Markov, Maarten de Rijke

    Abstract: Conversational interfaces are likely to become more efficient, intuitive and engaging way for human-computer interaction than today's text or touch-based interfaces. Current research efforts concerning conversational interfaces focus primarily on question answering functionality, thereby neglecting support for search activities beyond targeted information lookup. Users engage in exploratory search…

    Submitted 15 September, 2017; originally announced September 2017.

    Comments: Accepted at ICTIR'17 Workshop on Search-Oriented Conversational AI (SCAI 2017)

  28. arXiv:1705.06504

    cs.IR

    TableQA: Question Answering on Tabular Data

    Authors: Svitlana Vakulenko, Vadim Savenkov

    Abstract: Tabular data is difficult to analyze and to search through, yielding for new tools and interfaces that would allow even non tech-savvy users to gain insights from open datasets without resorting to specialized data analysis tools or even without having to fully understand the dataset structure. The goal of our demonstration is to showcase answering natural language questions from tabular data, and…

    Submitted 30 August, 2017; v1 submitted 18 May, 2017; originally announced May 2017.

    Comments: Full version of the demo paper accepted at SEMANTiCS 2017

  29. arXiv:1705.00894

    cs.IR

    Talking Open Data

    Authors: Sebastian Neumaier, Vadim Savenkov, Svitlana Vakulenko

    Abstract: Enticing users into exploring Open Data remains an important challenge for the whole Open Data paradigm. Standard stock interfaces often used by Open Data portals are anything but inspiring even for tech-savvy users, let alone those without an articulated interest in data science. To address a broader range of citizens, we designed an open data search interface supporting natural language interact…

    Submitted 2 May, 2017; originally announced May 2017.

    Comments: Accepted at ESWC2017 demo track

  30. arXiv:1703.05123

    cs.IR cs.CL

    Character-based Neural Embeddings for Tweet Clustering

    Authors: Svitlana Vakulenko, Lyndon Nixon, Mihai Lupu

    Abstract: In this paper we show how the performance of tweet clustering can be improved by leveraging character-based neural networks. The proposed approach overcomes the limitations related to the vocabulary explosion in the word-based models and allows for the seamless processing of the multilingual content. Our evaluation results and code are available on-line at https://github.com/vendi12/tweet2vec_clus…

    Submitted 16 March, 2017; v1 submitted 15 March, 2017; originally announced March 2017.

    Comments: Accepted at the SocialNLP 2017 workshop held in conjunction with EACL 2017, April 3, 2017, Valencia, Spain

  31. arXiv:1309.0870

    cs.CE q-bio.MN

    A hybrid mammalian cell cycle model

    Authors: Vincent Noël, Sergey Vakulenko, Ovidiu Radulescu

    Abstract: Hybrid modeling provides an effective solution to cope with multiple time scales dynamics in systems biology. Among the applications of this method, one of the most important is the cell cycle regulation. The machinery of the cell cycle, leading to cell division and proliferation, combines slow growth, spatio-temporal re-organisation of the cell, and rapid changes of regulatory proteins concentrat…

    Submitted 3 September, 2013; originally announced September 2013.

    Comments: In Proceedings HSB 2013, arXiv:1308.5724

    Journal ref: EPTCS 125, 2013, pp. 68-83

  32. arXiv:1208.3854

    cs.CE eess.SY q-bio.QM

    Hybrid models of the cell cycle molecular machinery

    Authors: Vincent Noel, Dima Grigoriev, Sergei Vakulenko, Ovidiu Radulescu

    Abstract: Piecewise smooth hybrid systems, involving continuous and discrete variables, are suitable models for describing the multiscale regulatory machinery of the biological cells. In hybrid models, the discrete variables can switch on and off some molecular interactions, simulating cell progression through a series of functioning modes. The advancement through the cell cycle is the archetype of such an…

    Submitted 19 August, 2012; originally announced August 2012.

    Comments: In Proceedings HSB 2012, arXiv:1208.3151

    Journal ref: EPTCS 92, 2012, pp. 88-105