Showing 1–9 of 9 results for author: Pickett, M

  1. arXiv:2408.04242  [pdf, other]

    cs.LG cs.AI cs.NE

    The Ungrounded Alignment Problem

    Authors: Marc Pickett, Aakash Kumar Nain, Joseph Modayil, Llion Jones

    Abstract: Modern machine learning systems have demonstrated substantial abilities with methods that either embrace or ignore human-provided knowledge, but combining benefits of both styles remains a challenge. One particular challenge involves designing learning systems that exhibit built-in responses to specific abstract stimulus patterns, yet are still plastic enough to be agnostic about the modality and…

    Submitted 8 August, 2024; originally announced August 2024.

    Comments: 7 pages, plus references and appendix

  2. arXiv:2407.12101  [pdf, other]

    cs.CL cs.AI

    Better RAG using Relevant Information Gain

    Authors: Marc Pickett, Jeremy Hartman, Ayan Kumar Bhowmick, Raquib-ul Alam, Aditya Vempaty

    Abstract: A common way to extend the memory of large language models (LLMs) is by retrieval augmented generation (RAG), which inserts text retrieved from a larger memory into an LLM's context window. However, the context window is typically limited to several thousand tokens, which limits the number of retrieved passages that can inform a model's response. For this reason, it's important to avoid occupying…

    Submitted 16 July, 2024; originally announced July 2024.

    Comments: 4 page paper submitted to EMNLP

  3. arXiv:2407.09298  [pdf, other]

    cs.CL

    Transformer Layers as Painters

    Authors: Qi Sun, Marc Pickett, Aakash Kumar Nain, Llion Jones

    Abstract: Despite their nearly universal adoption for large language models, the internal workings of transformers are not well understood. We aim to better understand the impact of removing or reorganizing information throughout the layers of a pretrained transformer. Such an understanding could both yield better usage of existing models as well as to make architectural improvements to produce new variants…

    Submitted 5 August, 2024; v1 submitted 12 July, 2024; originally announced July 2024.

    Comments: 12 pages total, including references and appendices

  4. arXiv:2201.08239  [pdf, other]

    cs.CL cs.AI

    LaMDA: Language Models for Dialog Applications

    Authors: Romal Thoppilan, Daniel De Freitas, Jamie Hall, Noam Shazeer, Apoorv Kulshreshtha, Heng-Tze Cheng, Alicia Jin, Taylor Bos, Leslie Baker, Yu Du, YaGuang Li, Hongrae Lee, Huaixiu Steven Zheng, Amin Ghafouri, Marcelo Menegali, Yanping Huang, Maxim Krikun, Dmitry Lepikhin, James Qin, Dehao Chen, Yuanzhong Xu, Zhifeng Chen, Adam Roberts, Maarten Bosma, Vincent Zhao, et al. (35 additional authors not shown)

    Abstract: We present LaMDA: Language Models for Dialog Applications. LaMDA is a family of Transformer-based neural language models specialized for dialog, which have up to 137B parameters and are pre-trained on 1.56T words of public dialog data and web text. While model scaling alone can improve quality, it shows less improvements on safety and factual grounding. We demonstrate that fine-tuning with annotat…

    Submitted 10 February, 2022; v1 submitted 20 January, 2022; originally announced January 2022.

  5. arXiv:2009.09929  [pdf, other]

    cs.CV cs.AI cs.LG stat.ML

    CVPR 2020 Continual Learning in Computer Vision Competition: Approaches, Results, Current Challenges and Future Directions

    Authors: Vincenzo Lomonaco, Lorenzo Pellegrini, Pau Rodriguez, Massimo Caccia, Qi She, Yu Chen, Quentin Jodelet, Ruiping Wang, Zheda Mai, David Vazquez, German I. Parisi, Nikhil Churamani, Marc Pickett, Issam Laradji, Davide Maltoni

    Abstract: In the last few years, we have witnessed a renewed and fast-growing interest in continual learning with deep neural networks with the shared objective of making current AI systems more adaptive, efficient and autonomous. However, despite the significant and undoubted progress of the field in addressing the issue of catastrophic forgetting, benchmarking different continual learning approaches is a…

    Submitted 14 September, 2020; originally announced September 2020.

    Comments: Pre-print v1: 12 pages, 3 figures, 8 tables

  6. arXiv:1707.03979  [pdf, other]

    cs.AI cs.LG

    A Brief Study of In-Domain Transfer and Learning from Fewer Samples using A Few Simple Priors

    Authors: Marc Pickett, Ayush Sekhari, James Davidson

    Abstract: Domain knowledge can often be encoded in the structure of a network, such as convolutional layers for vision, which has been shown to increase generalization and decrease sample complexity, or the number of samples required for successful learning. In this study, we ask whether sample complexity can be reduced for systems where the structure of the domain is unknown beforehand, and the structure a…

    Submitted 13 July, 2017; originally announced July 2017.

    Comments: Accepted for ICML 2017 Workshop on Picky Learners

  7. arXiv:1610.06402  [pdf, other]

    cs.AI cs.LG cs.NE

    A Growing Long-term Episodic & Semantic Memory

    Authors: Marc Pickett, Rami Al-Rfou, Louis Shao, Chris Tar

    Abstract: The long-term memory of most connectionist systems lies entirely in the weights of the system. Since the number of weights is typically fixed, this bounds the total amount of knowledge that can be learned and stored. Though this is not normally a problem for a neural network designed for a specific task, such a bound is undesirable for a system that continually learns over an open range of domains…

    Submitted 20 October, 2016; originally announced October 2016.

    Comments: Submission to NIPS workshop on Continual Learning. 4-page extended abstract plus 5 more pages of references, figures, and supplementary material

  8. arXiv:1606.00372  [pdf, other]

    cs.CL cs.LG

    Conversational Contextual Cues: The Case of Personalization and History for Response Ranking

    Authors: Rami Al-Rfou, Marc Pickett, Javier Snaider, Yun-hsuan Sung, Brian Strope, Ray Kurzweil

    Abstract: We investigate the task of modeling open-domain, multi-turn, unstructured, multi-participant, conversational dialogue. We specifically study the effect of incorporating different elements of the conversation. Unlike previous efforts, which focused on modeling messages and responses, we extend the modeling to long context and participant's history. Our system does not rely on handwritten rules or e…

    Submitted 1 June, 2016; originally announced June 2016.

    Comments: 10 pages, 6 figures

  9. arXiv:1310.2955  [pdf, ps, other]

    cs.AI cs.LG

    Spontaneous Analogy by Piggybacking on a Perceptual System

    Authors: Marc Pickett, David W. Aha

    Abstract: Most computational models of analogy assume they are given a delineated source domain and often a specified target domain. These systems do not address how analogs can be isolated from large domains and spontaneously retrieved from long-term memory, a process we call spontaneous analogy. We present a system that represents relational structures as feature bags. Using this representation, our syste…

    Submitted 10 October, 2013; originally announced October 2013.

    Comments: Proceedings of the 35th Meeting of the Cognitive Science Society, 2013