
Showing 1–3 of 3 results for author: Antverg, O

Searching in archive cs.
  1. arXiv:2211.05100  [pdf, other]

    cs.CL

    BLOOM: A 176B-Parameter Open-Access Multilingual Language Model

    Authors: BigScience Workshop: Teven Le Scao, Angela Fan, Christopher Akiki, Ellie Pavlick, Suzana Ilić, Daniel Hesslow, Roman Castagné, Alexandra Sasha Luccioni, François Yvon, Matthias Gallé, Jonathan Tow, Alexander M. Rush, Stella Biderman, Albert Webson, Pawan Sasanka Ammanamanchi, Thomas Wang, Benoît Sagot, Niklas Muennighoff, Albert Villanova del Moral, Olatunji Ruwase, Rachel Bawden, Stas Bekman, Angelina McMillan-Major, et al. (369 additional authors not shown)

    Abstract: Large language models (LLMs) have been shown to be able to perform new tasks based on a few demonstrations or natural language instructions. While these capabilities have led to widespread adoption, most LLMs are developed by resource-rich organizations and are frequently kept from the public. As a step towards democratizing this powerful technology, we present BLOOM, a 176B-parameter open-access…

    Submitted 27 June, 2023; v1 submitted 9 November, 2022; originally announced November 2022.

  2. arXiv:2206.00259  [pdf, other]

    cs.CL cs.AI cs.LG

    IDANI: Inference-time Domain Adaptation via Neuron-level Interventions

    Authors: Omer Antverg, Eyal Ben-David, Yonatan Belinkov

    Abstract: Large pre-trained models are usually fine-tuned on downstream task data, and tested on unseen data. When the train and test data come from different domains, the model is likely to struggle, as it is not adapted to the test domain. We propose a new approach for domain adaptation (DA), using neuron-level interventions: We modify the representation of each test example in specific neurons, resulting…

    Submitted 1 June, 2022; originally announced June 2022.

    Comments: Our code is available at https://github.com/technion-cs-nlp/idani

  3. arXiv:2110.07483  [pdf, other]

    cs.CL

    On the Pitfalls of Analyzing Individual Neurons in Language Models

    Authors: Omer Antverg, Yonatan Belinkov

    Abstract: While many studies have shown that linguistic information is encoded in hidden word representations, few have studied individual neurons, to show how and in which neurons it is encoded. Among these, the common approach is to use an external probe to rank neurons according to their relevance to some linguistic attribute, and to evaluate the obtained ranking using the same probe that produced it. We…

    Submitted 1 August, 2022; v1 submitted 14 October, 2021; originally announced October 2021.

    Comments: ICLR 2022 Main Conference

    ACM Class: I.2.7