Showing 1–16 of 16 results for author: Andreev, A

Searching in archive cs.
  1. arXiv:2408.00118  [pdf, other]

    cs.CL cs.AI

    Gemma 2: Improving Open Language Models at a Practical Size

    Authors: Gemma Team, Morgane Riviere, Shreya Pathak, Pier Giuseppe Sessa, Cassidy Hardin, Surya Bhupatiraju, Léonard Hussenot, Thomas Mesnard, Bobak Shahriari, Alexandre Ramé, Johan Ferret, Peter Liu, Pouya Tafti, Abe Friesen, Michelle Casbon, Sabela Ramos, Ravin Kumar, Charline Le Lan, Sammy Jerome, Anton Tsitsulin, Nino Vieillard, Piotr Stanczyk, Sertan Girgin, Nikola Momchev, Matt Hoffman, et al. (172 additional authors not shown)

    Abstract: In this work, we introduce Gemma 2, a new addition to the Gemma family of lightweight, state-of-the-art open models, ranging in scale from 2 billion to 27 billion parameters. In this new version, we apply several known technical modifications to the Transformer architecture, such as interleaving local-global attentions (Beltagy et al., 2020a) and group-query attention (Ainslie et al., 2023). We al…

    Submitted 2 August, 2024; v1 submitted 31 July, 2024; originally announced August 2024.

  2. arXiv:2404.07839  [pdf, other]

    cs.LG cs.AI cs.CL

    RecurrentGemma: Moving Past Transformers for Efficient Open Language Models

    Authors: Aleksandar Botev, Soham De, Samuel L Smith, Anushan Fernando, George-Cristian Muraru, Ruba Haroun, Leonard Berrada, Razvan Pascanu, Pier Giuseppe Sessa, Robert Dadashi, Léonard Hussenot, Johan Ferret, Sertan Girgin, Olivier Bachem, Alek Andreev, Kathleen Kenealy, Thomas Mesnard, Cassidy Hardin, Surya Bhupatiraju, Shreya Pathak, Laurent Sifre, Morgane Rivière, Mihir Sanjay Kale, Juliette Love, Pouya Tafti, et al. (37 additional authors not shown)

    Abstract: We introduce RecurrentGemma, an open language model which uses Google's novel Griffin architecture. Griffin combines linear recurrences with local attention to achieve excellent performance on language. It has a fixed-sized state, which reduces memory use and enables efficient inference on long sequences. We provide a pre-trained model with 2B non-embedding parameters, and an instruction tuned var…

    Submitted 11 April, 2024; originally announced April 2024.

  3. arXiv:2403.08295  [pdf, other]

    cs.CL cs.AI

    Gemma: Open Models Based on Gemini Research and Technology

    Authors: Gemma Team, Thomas Mesnard, Cassidy Hardin, Robert Dadashi, Surya Bhupatiraju, Shreya Pathak, Laurent Sifre, Morgane Rivière, Mihir Sanjay Kale, Juliette Love, Pouya Tafti, Léonard Hussenot, Pier Giuseppe Sessa, Aakanksha Chowdhery, Adam Roberts, Aditya Barua, Alex Botev, Alex Castro-Ros, Ambrose Slone, Amélie Héliou, Andrea Tacchetti, Anna Bulanova, Antonia Paterson, Beth Tsai, Bobak Shahriari, et al. (83 additional authors not shown)

    Abstract: This work introduces Gemma, a family of lightweight, state-of-the-art open models built from the research and technology used to create Gemini models. Gemma models demonstrate strong performance across academic benchmarks for language understanding, reasoning, and safety. We release two sizes of models (2 billion and 7 billion parameters), and provide both pretrained and fine-tuned checkpoints. Ge…

    Submitted 16 April, 2024; v1 submitted 13 March, 2024; originally announced March 2024.

  4. arXiv:2403.05530  [pdf, other]

    cs.CL cs.AI

    Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context

    Authors: Gemini Team, Petko Georgiev, Ving Ian Lei, Ryan Burnell, Libin Bai, Anmol Gulati, Garrett Tanzer, Damien Vincent, Zhufeng Pan, Shibo Wang, Soroosh Mariooryad, Yifan Ding, Xinyang Geng, Fred Alcober, Roy Frostig, Mark Omernick, Lexi Walker, Cosmin Paduraru, Christina Sorokin, Andrea Tacchetti, Colin Gaffney, Samira Daruki, Olcan Sercinoglu, Zach Gleicher, Juliette Love, et al. (1110 additional authors not shown)

    Abstract: In this report, we introduce the Gemini 1.5 family of models, representing the next generation of highly compute-efficient multimodal models capable of recalling and reasoning over fine-grained information from millions of tokens of context, including multiple long documents and hours of video and audio. The family includes two new models: (1) an updated Gemini 1.5 Pro, which exceeds the February…

    Submitted 8 August, 2024; v1 submitted 8 March, 2024; originally announced March 2024.

  5. arXiv:2312.11805  [pdf, other]

    cs.CL cs.AI cs.CV

    Gemini: A Family of Highly Capable Multimodal Models

    Authors: Gemini Team, Rohan Anil, Sebastian Borgeaud, Jean-Baptiste Alayrac, Jiahui Yu, Radu Soricut, Johan Schalkwyk, Andrew M. Dai, Anja Hauth, Katie Millican, David Silver, Melvin Johnson, Ioannis Antonoglou, Julian Schrittwieser, Amelia Glaese, Jilin Chen, Emily Pitler, Timothy Lillicrap, Angeliki Lazaridou, Orhan Firat, James Molloy, Michael Isard, Paul R. Barham, Tom Hennigan, Benjamin Lee, et al. (1325 additional authors not shown)

    Abstract: This report introduces a new family of multimodal models, Gemini, that exhibit remarkable capabilities across image, audio, video, and text understanding. The Gemini family consists of Ultra, Pro, and Nano sizes, suitable for applications ranging from complex reasoning tasks to on-device memory-constrained use-cases. Evaluation on a broad range of benchmarks shows that our most-capable Gemini Ultr…

    Submitted 17 June, 2024; v1 submitted 18 December, 2023; originally announced December 2023.

  6. arXiv:2307.10693  [pdf]

    cs.AI cs.HC cs.PL math.PR

    Towards an architectural framework for intelligent virtual agents using probabilistic programming

    Authors: Anton Andreev, Grégoire Cattan

    Abstract: We present a new framework called KorraAI for conceiving and building embodied conversational agents (ECAs). Our framework models ECAs' behavior considering contextual information, for example, about environment and interaction time, and uncertain information provided by the human interaction partner. Moreover, agents built with KorraAI can show proactive behavior, as they can initiate interaction…

    Submitted 20 July, 2023; originally announced July 2023.

  7. arXiv:2302.02648  [pdf]

    cs.HC stat.ML

    First steps towards quantum machine learning applied to the classification of event-related potentials

    Authors: Grégoire Cattan, Alexandre Quemy, Anton Andreev

    Abstract: Low information transfer rate is a major bottleneck for brain-computer interfaces based on non-invasive electroencephalography (EEG) for clinical applications. This led to the development of more robust and accurate classifiers. In this study, we investigate the performance of quantum-enhanced support vector classifier (QSVC). Training (predicting) balanced accuracy of QSVC was 83.17 (50.25) %. Th…

    Submitted 6 February, 2023; originally announced February 2023.

    Comments: in French language

  8. A comparison of mobile VR display running on an ordinary smartphone with standard PC display for P300-BCI stimulus presentation

    Authors: Grégoire Cattan, Anton Andreev, Cesar Mendoza, Marco Congedo

    Abstract: A brain-computer interface (BCI) based on electroencephalography (EEG) is a promising technology for enhancing virtual reality (VR) applications, in particular for gaming. We focus on the so-called P300-BCI, a stable and accurate BCI paradigm relying on the recognition of a positive event-related potential (ERP) occurring in the EEG about 300 ms post-stimulation. We implemented a basic version of…

    Submitted 6 February, 2020; originally announced February 2020.

    Comments: IEEE Transactions on Games, Institute of Electrical and Electronics Engineers, In press

  9. arXiv:1906.12251  [pdf]

    cs.HC

    Engineering study on the use of Head-Mounted display for Brain-Computer Interface

    Authors: Anton Andreev, Grégoire Cattan, M Congedo

    Abstract: In this article, we explore the availability of head-mounted display (HMD) devices which can be coupled in a seamless way with P300-based brain-computer interfaces (BCI) using electroencephalography (EEG). The P300 is an event-related potential appearing about 300 ms after the onset of a stimulation. The recognition of this potential on the ongoing EEG requires the knowledge of the exact onset of t…

    Submitted 28 June, 2019; originally announced June 2019.

  10. arXiv:1905.05182  [pdf]

    cs.HC q-bio.NC

    Building Brain Invaders: EEG data of an experimental validation

    Authors: Gijsbrecht Van Veen, Alexandre Barachant, Anton Andreev, Grégoire Cattan, Pedro Coelho Rodrigues, Marco Congedo

    Abstract: We describe the experimental procedures for a dataset that we have made publicly available at https://doi.org/10.5281/zenodo.2649006 in mat and csv formats. This dataset contains electroencephalographic (EEG) recordings of 25 subjects testing the Brain Invaders (Congedo, 2011), a visual P300 Brain-Computer Interface inspired by the famous vintage video game Space Invaders (Taito, Tokyo, Japan). Th…

    Submitted 13 May, 2019; originally announced May 2019.

    Comments: arXiv admin note: substantial text overlap with arXiv:1904.09111

  11. arXiv:1904.09111  [pdf]

    cs.HC

    Brain Invaders Adaptive versus Non-Adaptive P300 Brain-Computer Interface dataset

    Authors: Erwan Vaineau, Alexandre Barachant, Anton Andreev, Pedro C. Rodrigues, Grégoire Cattan, Marco Congedo

    Abstract: We describe the experimental procedures for a dataset that we have made publicly available at https://doi.org/10.5281/zenodo.1494163 in mat and csv formats. This dataset contains electroencephalographic (EEG) recordings of 24 subjects doing a visual P300 Brain-Computer Interface experiment on PC. The visual P300 is an event-related potential elicited by visual stimulation, peaking 240-600 ms afte…

    Submitted 19 April, 2019; originally announced April 2019.

  12. arXiv:1903.11297  [pdf]

    cs.HC

    Dataset of an EEG-based BCI experiment in Virtual Reality and on a Personal Computer

    Authors: Grégoire Cattan, A. Andreev, P. Rodrigues, M. Congedo

    Abstract: We describe the experimental procedures for a dataset that we have made publicly available at https://doi.org/10.5281/zenodo.2605204 in mat (Mathworks, Natick, USA) and csv formats. This dataset contains electroencephalographic recordings on 21 subjects doing a visual P300 experiment on PC (personal computer) and VR (virtual reality). The visual P300 is an event-related potential elicited by a vi…

    Submitted 27 March, 2019; originally announced March 2019.

  13. arXiv:1812.03066  [pdf]

    cs.HC cs.GR eess.SY

    Analysis of tagging latency when comparing event-related potentials

    Authors: Grégoire Cattan, Anton Andreev, Bastien Maureille, Marco Congedo

    Abstract: Event-related potentials (ERPs) are very small voltages produced by the brain in response to external stimulation. In order to detect and evaluate an ERP in an ongoing electroencephalogram (EEG), it is necessary to tag the EEG with the exact onset time of the stimulus. We define the latency as the delay between the time the tagging command is sent and the detection of the stimulus on the screen. Fa…

    Submitted 7 December, 2018; originally announced December 2018.

  14. arXiv:1804.10167  [pdf, other]

    cs.CV stat.AP

    fMRI: preprocessing, classification and pattern recognition

    Authors: Maxim Sharaev, Alexander Andreev, Alexey Artemov, Alexander Bernstein, Evgeny Burnaev, Ekaterina Kondratyeva, Svetlana Sushchinskaya, Renat Akzhigitov

    Abstract: As machine learning continues to gain momentum in the neuroscience community, we witness the emergence of novel applications such as diagnostics, characterization, and treatment outcome prediction for psychiatric and neurological disorders, for instance, epilepsy and depression. Systematic research into these mental disorders increasingly involves drawing clinical conclusions on the basis of data-…

    Submitted 26 April, 2018; originally announced April 2018.

    Comments: 20 pages, 1 figure

  15. arXiv:1804.10163  [pdf, other]

    cs.CV stat.AP

    Machine Learning pipeline for discovering neuroimaging-based biomarkers in neurology and psychiatry

    Authors: Alexander Bernstein, Evgeny Burnaev, Ekaterina Kondratyeva, Svetlana Sushchinskaya, Maxim Sharaev, Alexander Andreev, Alexey Artemov, Renat Akzhigitov

    Abstract: We consider a problem of diagnostic pattern recognition/classification from neuroimaging data. We propose a common data analysis pipeline for neuroimaging-based diagnostic classification problems using various ML algorithms and processing toolboxes for brain imaging. We illustrate the pipeline application by discovering new biomarkers for diagnostics of epilepsy and depression based on clinical an…

    Submitted 26 April, 2018; originally announced April 2018.

    Comments: 20 pages, 2 figures

  16. arXiv:1310.8115  [pdf]

    cs.HC math.DG

    A New Generation of Brain-Computer Interface Based on Riemannian Geometry

    Authors: Marco Congedo, Alexandre Barachant, Anton Andreev

    Abstract: Based on the cumulated experience over the past 25 years in the field of Brain-Computer Interface (BCI) we can now envision a new generation of BCI. Such BCIs will not require training; instead they will be smartly initialized using remote massive databases and will adapt to the user fast and effectively in the first minute of use. They will be reliable, robust and will maintain good performances…

    Submitted 30 October, 2013; originally announced October 2013.

    Comments: 33 pages, 9 Figures, 17 equations/algorithms