Defending Large Language Models Against Attacks With Residual Stream Activation Analysis

Authors: Amelia Kawasaki, Andrew Davis, Houssam Abbas

Abstract: The widespread adoption of Large Language Models (LLMs), exemplified by OpenAI's ChatGPT, brings to the forefront the imperative to defend against adversarial threats on these models. These attacks, which manipulate an LLM's output by introducing malicious inputs, undermine the model's integrity and the trust users place in its outputs. In response to this challenge, our paper presents an innovative defensive strategy, given white box access to an LLM, that harnesses residual activation analysis between transformer layers of the LLM. We apply a novel methodology for analyzing distinctive activation patterns in the residual streams for attack prompt classification. We curate multiple datasets to demonstrate how this method of classification has high accuracy across multiple types of attack scenarios, including our newly-created attack dataset. Furthermore, we enhance the model's resilience by integrating safety fine-tuning techniques for LLMs in order to measure its effect on our capability to detect attacks. The results underscore the effectiveness of our approach in enhancing the detection and mitigation of adversarial inputs, advancing the security framework within which LLMs operate.

Submitted 9 July, 2024; v1 submitted 5 June, 2024; originally announced June 2024.

arXiv:2405.19595 [pdf]

The RSNA Abdominal Traumatic Injury CT (RATIC) Dataset

Authors: Jeffrey D. Rudie, Hui-Ming Lin, Robyn L. Ball, Sabeena Jalal, Luciano M. Prevedello, Savvas Nicolaou, Brett S. Marinelli, Adam E. Flanders, Kirti Magudia, George Shih, Melissa A. Davis, John Mongan, Peter D. Chang, Ferco H. Berger, Sebastiaan Hermans, Meng Law, Tyler Richards, Jan-Peter Grunz, Andreas Steven Kunz, Shobhit Mathur, Sandro Galea-Soler, Andrew D. Chung, Saif Afat, Chin-Chi Kuo, Layal Aweidah, et al. (15 additional authors not shown)

Abstract: The RSNA Abdominal Traumatic Injury CT (RATIC) dataset is the largest publicly available collection of adult abdominal CT studies annotated for traumatic injuries. This dataset includes 4,274 studies from 23 institutions across 14 countries. The dataset is freely available for non-commercial use via Kaggle at https://www.kaggle.com/competitions/rsna-2023-abdominal-trauma-detection. Created for the RSNA 2023 Abdominal Trauma Detection competition, the dataset encourages the development of advanced machine learning models for detecting abdominal injuries on CT scans. The dataset encompasses detection and classification of traumatic injuries across multiple organs, including the liver, spleen, kidneys, bowel, and mesentery. Annotations were created by expert radiologists from the American Society of Emergency Radiology (ASER) and Society of Abdominal Radiology (SAR). The dataset is annotated at multiple levels, including the presence of injuries in three solid organs with injury grading, image-level annotations for active extravasations and bowel injury, and voxelwise segmentations of each of the potentially injured organs. With the release of this dataset, we hope to facilitate research and development in machine learning and abdominal trauma that can lead to improved patient care and outcomes.

Submitted 29 May, 2024; originally announced May 2024.

Comments: 40 pages, 2 figures, 3 tables

arXiv:2404.19631 [pdf, other]

On Training a Neural Network to Explain Binaries

Authors: Alexander Interrante-Grant, Andy Davis, Heather Preslier, Tim Leek

Abstract: In this work, we begin to investigate the possibility of training a deep neural network on the task of binary code understanding. Specifically, the network would take, as input, features derived directly from binaries and output English descriptions of functionality to aid a reverse engineer in investigating the capabilities of a piece of closed-source software, be it malicious or benign. Given recent success in applying large language models (generative AI) to the task of source code summarization, this seems a promising direction. However, in our initial survey of the available datasets, we found nothing of sufficiently high quality and volume to train these complex models. Instead, we build our own dataset derived from a capture of Stack Overflow containing 1.1M entries. A major result of our work is a novel dataset evaluation method using the correlation between two distances on sample pairs: one distance in the embedding space of inputs and the other in the embedding space of outputs. Intuitively, if two samples have inputs close in the input embedding space, their outputs should also be close in the output embedding space. We found this Embedding Distance Correlation (EDC) test to be highly diagnostic, indicating that our collected dataset and several existing open-source datasets are of low quality as the distances are not well correlated. We proceed to explore the general applicability of EDC, applying it to a number of qualitatively known good datasets and a number of synthetically known bad ones, and find it to be a reliable indicator of dataset value.
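
A minimal sketch of the kind of test the abstract describes: correlate pairwise distances between input embeddings with pairwise distances between the matching output embeddings. The abstract names neither the distance nor the correlation statistic, so Euclidean distance and Pearson correlation are assumptions here, and the function names are hypothetical rather than taken from the paper.

```python
import math
from itertools import combinations


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant distances")
    return cov / math.sqrt(var_x * var_y)


def embedding_distance_correlation(input_embs, output_embs):
    """Correlate input-space and output-space distances over all sample pairs.

    input_embs[i] and output_embs[i] are the embeddings of the input and
    output of sample i. A value near 1 means samples with nearby inputs
    also have nearby outputs.
    """
    if len(input_embs) != len(output_embs):
        raise ValueError("need one output embedding per input embedding")
    pairs = list(combinations(range(len(input_embs)), 2))
    d_in = [euclidean(input_embs[i], input_embs[j]) for i, j in pairs]
    d_out = [euclidean(output_embs[i], output_embs[j]) for i, j in pairs]
    return pearson(d_in, d_out)


# Toy data: each output embedding is the input embedding scaled by 2,
# so every output distance is exactly twice the input distance.
toy_inputs = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [3.0, 3.0]]
toy_outputs = [[2.0 * c for c in vec] for vec in toy_inputs]
edc = embedding_distance_correlation(toy_inputs, toy_outputs)
```

On the toy data the two distance lists are proportional, so the score is 1 up to floating-point error. Under the abstract's reading, a dataset whose score is low would be flagged as low quality.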

Submitted 30 April, 2024; originally announced April 2024.

arXiv:2404.16632 [pdf]

Introducing Systems Thinking as a Framework for Teaching and Assessing Threat Modeling Competency

Authors: Siddhant S. Joshi, Preeti Mukherjee, Kirsten A. Davis, James C. Davis

Abstract: Computing systems face diverse and substantial cybersecurity threats. To mitigate these cybersecurity threats, software engineers need to be competent in the skill of threat modeling. In industry and academia, there are many frameworks for teaching threat modeling, but our analysis of these frameworks suggests that (1) these approaches tend to be focused on component-level analysis rather than educating students to reason holistically about a system's cybersecurity, and (2) there is no rubric for assessing a student's threat modeling competency. To address these concerns, we propose using systems thinking in conjunction with popular and industry-standard threat modeling frameworks like STRIDE for teaching and assessing threat modeling competency. Prior studies suggest a holistic approach, like systems thinking, can help understand and mitigate cybersecurity threats. Thus, we developed and piloted two novel rubrics - one for assessing STRIDE threat modeling performance and the other for assessing systems thinking performance while conducting STRIDE. To conduct this study, we piloted the two rubrics mentioned above to assess threat model artifacts of students enrolled in an upper-level software engineering course at Purdue University in Fall 2021, Spring 2023, and Fall 2023. Students who had both systems thinking and STRIDE instruction identified and attempted to mitigate component-level as well as systems-level threats. Students with only STRIDE instruction tended to focus on identifying and mitigating component-level threats and discounted system-level threats. We contribute to engineering education by: (1) describing a new rubric for assessing threat modeling based on systems thinking; (2) identifying trends and blindspots in students' threat modeling approach; and (3) envisioning the benefits of integrating systems thinking in threat modeling teaching and assessment.

Submitted 25 April, 2024; originally announced April 2024.

Comments: Presented at the Annual Conference of the American Society for Engineering Education (ASEE'24) 2024

arXiv:2403.18679 [pdf]

An Exploratory Study on Upper-Level Computing Students' Use of Large Language Models as Tools in a Semester-Long Project

Authors: Ben Arie Tanay, Lexy Arinze, Siddhant S. Joshi, Kirsten A. Davis, James C. Davis

Abstract: Background: Large Language Models (LLMs) such as ChatGPT and CoPilot are influencing software engineering practice. Software engineering educators must teach future software engineers how to use such tools well. As of yet, there have been few studies that report on the use of LLMs in the classroom. It is, therefore, important to evaluate students' perception of LLMs and possible ways of adapting the computing curriculum to these shifting paradigms. Purpose: The purpose of this study is to explore computing students' experiences and approaches to using LLMs during a semester-long software engineering project. Design/Method: We collected data from a senior-level software engineering course at Purdue University. This course uses a project-based learning (PBL) design. The students used LLMs such as ChatGPT and Copilot in their projects. A sample of these student teams was interviewed to understand (1) how they used LLMs in their projects; and (2) whether and how their perspectives on LLMs changed over the course of the semester. We analyzed the data to identify themes related to students' usage patterns and learning outcomes. Results/Discussion: When computing students utilize LLMs within a project, their use cases cover both technical and professional applications. In addition, these students perceive LLMs to be efficient tools in obtaining information and completion of tasks. However, there were concerns about the responsible use of LLMs without being detrimental to their own learning outcomes. Based on our findings, we recommend future research to investigate the usage of LLMs in lower-level computer engineering courses to understand whether and how LLMs can be integrated as a learning aid without hurting the learning outcomes.

Submitted 16 April, 2024; v1 submitted 27 March, 2024; originally announced March 2024.

Comments: Accepted to the 2024 General Conference of the American Society for Engineering Education (ASEE)

arXiv:2310.07888 [pdf, other]

Viability of Mobile Forms for Population Health Surveys in Low Resource Areas

Authors: Alexander Davis, Aidan Chen, Milton Chen, James Davis

Abstract: Population health surveys are an important tool to effectively allocate limited resources in low resource communities. In such an environment, surveys are often done by local population with pen and paper. Data thus collected is difficult to tabulate and analyze. We conducted a series of interviews and experiments in the Philippines to assess if mobile forms can be a viable and more efficient survey method. We first conducted pilot interviews and found 60% of the local surveyors actually preferred mobile forms over paper. We then built a software that can generate mobile forms that are easy to use, capable of working offline, and able to track key metrics such as time to complete questions. Our mobile form was field tested in three locations in the Philippines with 33 surveyors collecting health survey responses from 266 subjects. The percentage of surveyors preferring mobile forms increased to 76% after just using the form a few times. The results demonstrate our mobile form is a viable method to conduct large scale population health surveys in a low resource environment.

Submitted 11 October, 2023; originally announced October 2023.

Comments: 2023 IEEE Global Humanitarian Technology Conference (GHTC)

arXiv:2309.13213 [pdf, other]

The LHCb ultra-fast simulation option, Lamarr: design and validation

Authors: Lucio Anderlini, Matteo Barbetti, Simone Capelli, Gloria Corti, Adam Davis, Denis Derkach, Nikita Kazeev, Artem Maevskiy, Maurizio Martinelli, Sergei Mokonenko, Benedetto Gianluca Siddi, Zehua Xu

Abstract: Detailed detector simulation is the major consumer of CPU resources at LHCb, having used more than 90% of the total computing budget during Run 2 of the Large Hadron Collider at CERN. As data is collected by the upgraded LHCb detector during Run 3 of the LHC, larger requests for simulated data samples are necessary, and will far exceed the pledged resources of the experiment, even with existing fast simulation options. An evolution of technologies and techniques to produce simulated samples is mandatory to meet the upcoming needs of analysis to interpret signal versus background and measure efficiencies. In this context, we propose Lamarr, a Gaudi-based framework designed to offer the fastest solution for the simulation of the LHCb detector. Lamarr consists of a pipeline of modules parameterizing both the detector response and the reconstruction algorithms of the LHCb experiment. Most of the parameterizations are made of Deep Generative Models and Gradient Boosted Decision Trees trained on simulated samples or alternatively, where possible, on real data. Embedding Lamarr in the general LHCb Gauss Simulation framework allows combining its execution with any of the available generators in a seamless way. Lamarr has been validated by comparing key reconstructed quantities with Detailed Simulation. Good agreement of the simulated distributions is obtained with two-order-of-magnitude speed-up of the simulation phase.

Submitted 22 September, 2023; originally announced September 2023.

Comments: Under review in EPJ Web of Conferences (CHEP 2023)

arXiv:2307.06294 [pdf, other]

doi 10.1109/ISCA.2008.35

Corona: System Implications of Emerging Nanophotonic Technology

Authors: Dana Vantrease, Robert Schreiber, Matteo Monchiero, Moray McLaren, Norman P. Jouppi, Marco Fiorentino, Al Davis, Nathan Binkert, Raymond G. Beausoleil, Jung Ho Ahn

Abstract: We expect that many-core microprocessors will push performance per chip from the 10 gigaflop to the 10 teraflop range in the coming decade. To support this increased performance, memory and inter-core bandwidths will also have to scale by orders of magnitude. Pin limitations, the energy cost of electrical signaling, and the non-scalability of chip-length global wires are significant bandwidth impediments. Recent developments in silicon nanophotonic technology have the potential to meet these off- and on- stack bandwidth requirements at acceptable power levels. Corona is a 3D many-core architecture that uses nanophotonic communication for both inter-core communication and off-stack communication to memory or I/O devices. Its peak floating-point performance is 10 teraflops. Dense wavelength division multiplexed optically connected memory modules provide 10 terabyte per second memory bandwidth. A photonic crossbar fully interconnects its 256 low-power multithreaded cores at 20 terabyte per second bandwidth. We have simulated a 1024 thread Corona system running synthetic benchmarks and scaled versions of the SPLASH-2 benchmark suite. We believe that in comparison with an electrically-connected many-core alternative that uses the same on-stack interconnect power, Corona can provide 2 to 6 times more performance on many memory-intensive workloads, while simultaneously reducing power.

Submitted 12 July, 2023; originally announced July 2023.

Comments: This edition is recompiled from proceedings of ISCA-35 (the 35th International Symposium on Computer Architecture, June 21 - 25, 2008, Beijing, China) and has minor formatting differences. 13 pages; 11 figures

arXiv:2306.17141 [pdf, other]

Filtered-Guided Diffusion: Fast Filter Guidance for Black-Box Diffusion Models

Authors: Zeqi Gu, Abe Davis

Abstract: Recent advances in diffusion-based generative models have shown incredible promise for Image-to-Image translation and editing. Most recent work in this space relies on additional training or architecture-specific adjustments to the diffusion process. In this work, we show that much of this low-level control can be achieved without additional training or any access to features of the diffusion model. Our method simply applies a filter to the input of each diffusion step based on the output of the previous step in an adaptive manner. Notably, this approach does not depend on any specific architecture or sampler and can be done without access to internal features of the network, making it easy to combine with other techniques, samplers, and diffusion architectures. Furthermore, it has negligible cost to performance, and allows for more continuous adjustment of guidance strength than other approaches. We show FGD offers a fast and strong baseline that is competitive with recent architecture-dependent approaches. Furthermore, FGD can also be used as a simple add-on to enhance the structural guidance of other state-of-the-art I2I methods. Finally, our derivation of this method helps to understand the impact of self attention, a key component of other recent architecture-specific I2I approaches, in a more architecture-independent way. Project page: https://github.com/jaclyngu/FilteredGuidedDiffusion

Submitted 29 June, 2023; originally announced June 2023.

Comments: Project page: https://github.com/jaclyngu/FilteredGuidedDiffusion

arXiv:2306.15688 [pdf, ps, other]

RETROSPECTIVE: Corona: System Implications of Emerging Nanophotonic Technology

Authors: Dana Vantrease, Robert Schreiber, Matteo Monchiero, Moray McLaren, Norman P. Jouppi, Marco Fiorentino, Al Davis, Nathan Binkert, Raymond G. Beausoleil, Jung Ho Ahn

Abstract: The 2008 Corona effort was inspired by a pressing need for more of everything, as demanded by the salient problems of the day. Dennard scaling was no longer in effect. A lot of computer architecture research was in the doldrums. Papers often showed incremental subsystem performance improvements, but at incommensurate cost and complexity. The many-core era was moving rapidly, and the approach with many simpler cores was at odds with the better and more complex subsystem publications of the day. Core counts were doubling every 18 months, while per-pin bandwidth was expected to double, at best, over the next decade. Memory bandwidth and capacity had to increase to keep pace with ever more powerful multi-core processors. With increasing core counts per die, inter-core communication bandwidth and latency became more important. At the same time, the area and power of electrical networks-on-chip were increasingly problematic: To be reliably received, any signal that traverses a wire spanning a full reticle-sized die would need significant equalization, re-timing, and multiple clock cycles. This additional time, area, and power was the crux of the concern, and things looked to get worse in the future. Silicon nanophotonics was of particular interest and seemed to be improving rapidly. This led us to consider taking advantage of 3D packaging, where one die in the 3D stack would be a photonic network layer. Our focus was on a system that could be built about a decade out. Thus, we tried to predict how the technologies and the system performance requirements would converge in about 2018. Corona was the result of this exercise; now, 15 years later, it's interesting to look back at the effort.

Submitted 23 June, 2023; originally announced June 2023.

Comments: 2 pages. Proceedings of ISCA-50: 50 years of the International Symposia on Computer Architecture (selected papers) June 17-21 Orlando, Florida

arXiv:2304.13681 [pdf, other]

Ray Conditioning: Trading Photo-consistency for Photo-realism in Multi-view Image Generation

Authors: Eric Ming Chen, Sidhanth Holalkere, Ruyu Yan, Kai Zhang, Abe Davis

Abstract: Multi-view image generation attracts particular attention these days due to its promising 3D-related applications, e.g., image viewpoint editing. Most existing methods follow a paradigm where a 3D representation is first synthesized, and then rendered into 2D images to ensure photo-consistency across viewpoints. However, such explicit bias for photo-consistency sacrifices photo-realism, causing geometry artifacts and loss of fine-scale details when these methods are applied to edit real images. To address this issue, we propose ray conditioning, a geometry-free alternative that relaxes the photo-consistency constraint. Our method generates multi-view images by conditioning a 2D GAN on a light field prior. With explicit viewpoint control, state-of-the-art photo-realism and identity consistency, our method is particularly suited for the viewpoint editing task.

Submitted 4 September, 2023; v1 submitted 26 April, 2023; originally announced April 2023.

Comments: ICCV 2023 paper. Project page at https://ray-cond.github.io/

arXiv:2212.08032 [pdf, other]

doi 10.1088/1367-2630/ace6c8

Demonstration of machine-learning-enhanced Bayesian quantum state estimation

Authors: Sanjaya Lohani, Joseph M. Lukens, Atiyya A. Davis, Amirali Khannejad, Sangita Regmi, Daniel E. Jones, Ryan T. Glasser, Thomas A. Searles, Brian T. Kirby

Abstract: Machine learning (ML) has found broad applicability in quantum information science in topics as diverse as experimental design, state classification, and even studies on quantum foundations. Here, we experimentally realize an approach for defining custom prior distributions that are automatically tuned using ML for use with Bayesian quantum state estimation methods. Previously, researchers have looked to Bayesian quantum state tomography due to its unique advantages like natural uncertainty quantification, the return of reliable estimates under any measurement condition, and minimal mean-squared error. However, practical challenges related to long computation times and conceptual issues concerning how to incorporate prior knowledge most suitably can overshadow these benefits. Using both simulated and experimental measurement results, we demonstrate that ML-defined prior distributions reduce net convergence times and provide a natural way to incorporate both implicit and explicit information directly into the prior distribution. These results constitute a promising path toward practical implementations of Bayesian quantum state tomography.

Submitted 15 December, 2022; originally announced December 2022.

Comments: 9 pages, 4 figures

arXiv:2211.13172 [pdf, other]

Kernel PCA for multivariate extremes

Authors: Marco Avella-Medina, Richard A. Davis, Gennady Samorodnitsky

Abstract: We propose kernel PCA as a method for analyzing the dependence structure of multivariate extremes and demonstrate that it can be a powerful tool for clustering and dimension reduction. Our work provides some theoretical insight into the preimages obtained by kernel PCA, demonstrating that under certain conditions they can effectively identify clusters in the data. We build on these new insights to characterize rigorously the performance of kernel PCA based on an extremal sample, i.e., the angular part of random vectors for which the radius exceeds a large threshold. More specifically, we focus on the asymptotic dependence of multivariate extremes characterized by the angular or spectral measure in extreme value theory and provide a careful analysis in the case where the extremes are generated from a linear factor model. We give theoretical guarantees on the performance of kernel PCA preimages of such extremes by leveraging their asymptotic distribution together with Davis-Kahan perturbation bounds. Our theoretical findings are complemented with numerical experiments illustrating the finite sample performance of our methods.
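
The "extremal sample" mentioned in the abstract can be illustrated with a short sketch: keep only the vectors whose radius exceeds a threshold and retain their angular part. The abstract does not say which norm defines the radius, so the Euclidean norm is an assumption here, and `extremal_angles` is a hypothetical helper name, not the authors' code. The kernel PCA step itself is not shown.

```python
import math


def extremal_angles(points, threshold):
    """Return the angular parts x / ||x|| of the points with ||x|| > threshold.

    The radius is taken to be the Euclidean norm (an assumption; the
    abstract does not specify the norm).
    """
    angles = []
    for point in points:
        radius = math.sqrt(sum(c * c for c in point))
        if radius > threshold:
            # Project the extreme observation onto the unit sphere.
            angles.append([c / radius for c in point])
    return angles


# Toy data: two points have radius above 4, one does not.
sample = [[3.0, 4.0], [0.1, 0.2], [0.0, 10.0]]
angular_sample = extremal_angles(sample, threshold=4.0)
```

Each retained vector has unit length, so a downstream method sees only the direction of the extreme observations, not their magnitude.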

Submitted 23 November, 2022; v1 submitted 23 November, 2022; originally announced November 2022.

arXiv:2211.02145 [pdf, other]

FactorMatte: Redefining Video Matting for Re-Composition Tasks

Authors: Zeqi Gu, Wenqi Xian, Noah Snavely, Abe Davis

Abstract: We propose "factor matting", an alternative formulation of the video matting problem in terms of counterfactual video synthesis that is better suited for re-composition tasks. The goal of factor matting is to separate the contents of video into independent components, each visualizing a counterfactual version of the scene where contents of other components have been removed. We show that factor matting maps well to a more general Bayesian framing of the matting problem that accounts for complex conditional interactions between layers. Based on this observation, we present a method for solving the factor matting problem that produces useful decompositions even for video with complex cross-layer interactions like splashes, shadows, and reflections. Our method is trained per-video and requires neither pre-training on external large datasets, nor knowledge about the 3D structure of the scene. We conduct extensive experiments, and show that our method not only can disentangle scenes with complex interactions, but also outperforms top methods on existing tasks such as classical video matting and background subtraction. In addition, we demonstrate the benefits of our approach on a range of downstream tasks. Please refer to our project webpage for more details: https://factormatte.github.io

Submitted 3 November, 2022; originally announced November 2022.

Comments: Project webpage: https://factormatte.github.io

arXiv:2206.14286 [pdf, ps, other]

TPU-KNN: K Nearest Neighbor Search at Peak FLOP/s

Authors: Felix Chern, Blake Hechtman, Andy Davis, Ruiqi Guo, David Majnemer, Sanjiv Kumar

Abstract: This paper presents a novel nearest neighbor search algorithm achieving TPU (Google Tensor Processing Unit) peak performance, outperforming state-of-the-art GPU algorithms with similar level of recall. The design of the proposed algorithm is motivated by an accurate accelerator performance model that takes into account both the memory and instruction bottlenecks. Our algorithm comes with an analytical guarantee of recall in expectation and does not require maintaining sophisticated index data structure or tuning, making it suitable for applications with frequent updates. Our work is available in the open-source package of Jax and Tensorflow on TPU.

Submitted 30 June, 2022; v1 submitted 28 June, 2022; originally announced June 2022.

arXiv:2204.10855 [pdf, other]

doi 10.1007/s40571-021-00392-3

ParticLS: Object-oriented software for discrete element methods and peridynamics

Authors: Andrew D. Davis, Brendan A. West, Nathanael J. Frisch, Devin T. O'Connor, Matthew D. Parno

Abstract: ParticLS (\emph{Partic}le \emph{L}evel \emph{S}ets) is a software library that implements the discrete element method (DEM) and meshfree methods. ParticLS tracks the interaction between individual particles whose geometries are defined by level sets capable of capturing complex shapes. These particles either represent rigid bodies or material points within a continuum. Particle-particle interactions using various contact laws numerically approximate solutions to energy and mass conservation equations, simulating rigid body dynamics or deformation/fracture. By leveraging multiple contact laws, ParticLS can simulate interacting bodies that deform, fracture, and are composed of many particles. In the continuum setting, we numerically solve the peridynamic equations -- integro-differential equations capable of modeling objects with discontinuous displacement fields and complex fracture dynamics. We show that the discretized peridynamic equations can be solved using the same software infrastructure that implements the DEM. Therefore, we design a unique software library where users can easily add particles with arbitrary geometries and new contact laws that model either rigid-body interaction or peridynamic constitutive relationships. We demonstrate ParticLS' versatility on test problems meant to showcase features applicable to a broad selection of fields such as tectonics, granular media, multiscale simulations, glacier calving, and sea ice.

Submitted 19 April, 2022; originally announced April 2022.

Journal ref: Computational Particle Mechanics (2021)

arXiv:2204.09753 [pdf]

Path Planning Algorithms for Robotic Aquaculture Monitoring

Authors: Anthony Davis, Srijita Mukherjee, Paul S. Wills, Bing Ouyang

Abstract: Aerial drones have great potential to monitor large areas quickly and efficiently. Aquaculture is an industry that requires continuous water quality data to successfully grow and harvest fish. The Hybrid Aerial Underwater Robotic System (HAUCS) is designed to collect water quality data of aquaculture ponds to reduce labor costs for farmers. The routing of drones to cover each fish pond on an aquaculture farm can be reduced to the Vehicle Routing Problem. A dataset is created to simulate the distribution of ponds on a farm and is used to assess the HAUCS Path Planning Algorithm (HPP). Its performance is compared with the Google Linear Optimization Package (GLOP) and a Graph Attention Model (AM) for routing problems. GLOP is the most efficient solver for 50 to 200 ponds at the expense of long run times, while HPP outperforms the other methods in solution quality and run time for instances larger than 200 ponds.

Submitted 20 April, 2022; originally announced April 2022.

arXiv:2203.10637 [pdf, other]

Vocal effort modeling in neural TTS for improving the intelligibility of synthetic speech in noise

Authors: Tuomo Raitio, Petko Petkov, Jiangchuan Li, Muhammed Shifas, Andrea Davis, Yannis Stylianou

Abstract: We present a neural text-to-speech (TTS) method that models natural vocal effort variation to improve the intelligibility of synthetic speech in the presence of noise. The method consists of first measuring the spectral tilt of unlabeled conventional speech data, and then conditioning a neural TTS model with normalized spectral tilt among other prosodic factors. Changing the spectral tilt parameter and keeping other prosodic factors unchanged enables effective vocal effort control at synthesis time independent of other prosodic factors. By extrapolation of the spectral tilt values beyond what has been seen in the original data, we can generate speech with high vocal effort levels, thus improving the intelligibility of speech in the presence of masking noise. We evaluate the intelligibility and quality of normal speech and speech with increased vocal effort in the presence of various masking noise conditions, and compare these to well-known speech intelligibility-enhancing algorithms. The evaluations show that the proposed method can improve the intelligibility of synthetic speech with little loss in speech quality.

Submitted 28 March, 2022; v1 submitted 20 March, 2022; originally announced March 2022.

Comments: 5 pages, 5 figures. Submitted to Interspeech 2022, revision includes more data in results and improved text

arXiv:2111.10459 [pdf, other]

Identifying Population Movements with Non-Negative Matrix Factorization from Wi-Fi User Counts in Smart and Connected Cities

Authors: Michael Huffman, Armen Davis, Joshua Park, James Curry

Abstract: Non-Negative Matrix Factorization (NMF) is a valuable matrix factorization technique which produces a "parts-based" decomposition of data sets. Wi-Fi user counts are a privacy-preserving indicator of population movements in smart and connected urban environments. In this paper, we apply NMF with a novel matrix embedding to Wi-Fi user count data from the University of Colorado at Boulder Campus for the purpose of automatically identifying patterns of human movement in a Smart and Connected infrastructure environment.

Submitted 19 November, 2021; originally announced November 2021.

arXiv:2111.07799 [pdf, other]

Spectral learning of multivariate extremes

Authors: Marco Avella Medina, Richard A. Davis, Gennady Samorodnitsky

Abstract: We propose a spectral clustering algorithm for analyzing the dependence structure of multivariate extremes. More specifically, we focus on the asymptotic dependence of multivariate extremes characterized by the angular or spectral measure in extreme value theory. Our work studies the theoretical performance of spectral clustering based on a random $k$-nearest neighbor graph constructed from an extremal sample, i.e., the angular part of random vectors for which the radius exceeds a large threshold. In particular, we derive the asymptotic distribution of extremes arising from a linear factor model and prove that, under certain conditions, spectral clustering can consistently identify the clusters of extremes arising in this model. Leveraging this result we propose a simple consistent estimation strategy for learning the angular measure. Our theoretical findings are complemented with numerical experiments illustrating the finite sample performance of our methods.

Submitted 1 August, 2023; v1 submitted 15 November, 2021; originally announced November 2021.

arXiv:2109.12143 [pdf, other]

Weather of the Dorm WIFI Ecosystem at the University of Colorado Boulder for Fall Semester 2019 to Spring Semester 2020 a Case Study of WIFI and a Campus Response to the COVID-19 Perturbation

Authors: Jake Mcgrath, Armen Davis, James Curry, Orrie Gartner, Glenn Rodrigues, Seth Spielman, Daniel Massey

Abstract: Growing use of network technology in Higher Education means that there has been increasing demand to adapt technology platforms and tools that transform student learning strategies, faculty teaching, research modalities, as well as general operations. Many of the new modalities are necessary for IHE business. In August 2019, we began collecting and analyzing data from the campus WIFI network. A goal of the research was to answer questions like what passive sensing of the IHE WIFI might tell us about the dynamics of the WIFI weather in the IHE ecosystem and what anonymized data tell us about the IHE ecosystem. The analogy with weather prediction seemed appropriate and a viable approach. Starting Fall 2019, data were collected in the observational phase. In the analysis phase, we applied Singular Spectrum Analysis (SSA) decomposition to deconstruct WIFI data from dorms, the central campus dining cafeteria, the recreation center, and other buildings on campus. That analysis led to the identification of clusters of buildings that behaved similarly. Just as in the case of models of the weather, a final component of this research was forecasting. We found that weekly forecasts of WIFI behavior in Fall 2019 were straightforward using SSA and seemed to present behavior of a low dimensional dynamical system. However, in Spring 2020, with the COVID perturbation, the campus ecosystem received a shock and data show that the campus changed very quickly. We found that as the campus moved to remote learning and teaching, the closure of research labs, and the edict to work remotely, SSA forecasting techniques not trained on the Spring 2020 data after the shock performed poorly, while SSA forecasting trained on a portion of that data did better.

Submitted 24 September, 2021; originally announced September 2021.

Comments: Contact E-mail: [email protected], Applied Mathematics, University of Colorado, Boulder 80309-0526

arXiv:2104.01661 [pdf, ps, other]

LAGraph: Linear Algebra, Network Analysis Libraries, and the Study of Graph Algorithms

Authors: Gábor Szárnyas, David A. Bader, Timothy A. Davis, James Kitchen, Timothy G. Mattson, Scott McMillan, Erik Welch

Abstract: Graph algorithms can be expressed in terms of linear algebra. GraphBLAS is a library of low-level building blocks for such algorithms that targets algorithm developers. LAGraph builds on top of the GraphBLAS to target users of graph algorithms with high-level algorithms common in network analysis. In this paper, we describe the first release of the LAGraph library, the design decisions behind the library, and performance using the GAP benchmark suite. LAGraph, however, is much more than a library. It is also a project to document and analyze the full range of algorithms enabled by the GraphBLAS. To that end, we have developed a compact and intuitive notation for describing these algorithms. In this paper, we present that notation with examples from the GAP benchmark suite.

Submitted 4 April, 2021; originally announced April 2021.

Comments: Accepted to GrAPL 2021

arXiv:2011.07565 [pdf, ps, other]

User-Centered Programming Language Design: A Course-Based Case Study

Authors: Michael Coblenz, Ariel Davis, Megan Hofmann, Vivian Huang, Siyue Jin, Max Krieger, Kyle Liang, Brian Wei, Mengchen Sam Yong, Jonathan Aldrich

Abstract: Recently, user-centered methods have been proposed to improve the design of programming languages. In order to explore what benefits these methods might have for novice programming language designers, we taught a collection of user-centered programming language design methods to a group of eight students. We observed that natural programming and usability studies helped the students refine their language designs and identify opportunities for improvement, even in the short duration of a course project.

Submitted 15 November, 2020; originally announced November 2020.

Comments: 7 pages. Presented at HATRA 2020 (https://2020.splashcon.org/home/hatra-2020)

ACM Class: D.2; D.3

arXiv:2010.07935 [pdf]

Multi-Agent Motion Planning using Deep Learning for Space Applications

Authors: Kyongsik Yun, Changrak Choi, Ryan Alimo, Anthony Davis, Linda Forster, Amir Rahmani, Muhammad Adil, Ramtin Madani

Abstract: State-of-the-art motion planners cannot scale to a large number of systems. Motion planning for multiple agents is an NP (non-deterministic polynomial-time) hard problem, so the computation time increases exponentially with each addition of agents. This computational demand is a major stumbling block to the motion planner's application to future NASA missions involving the swarm of space vehicles. We applied a deep neural network to transform computationally demanding mathematical motion planning problems into deep learning-based numerical problems. We showed optimal motion trajectories can be accurately replicated using deep learning-based numerical models in several 2D and 3D systems with multiple agents. The deep learning-based numerical model demonstrates superior computational efficiency with plans generated 1000 times faster than the mathematical model counterpart.

Submitted 15 October, 2020; originally announced October 2020.

Comments: 2020 AIAA ASCEND

arXiv:2008.02479 [pdf, ps, other]

Modeling of time series using random forests: theoretical developments

Authors: Richard A. Davis, Mikkel S. Nielsen

Abstract: In this paper we study asymptotic properties of random forests within the framework of nonlinear time series modeling. While random forests have been successfully applied in various fields, the theoretical justification has not been considered for their use in a time series setting. Under mild conditions, we prove a uniform concentration inequality for regression trees built on nonlinear autoregressive processes and, subsequently, we use this result to prove consistency for a large class of random forests. The results are supported by various simulations.

Submitted 6 August, 2020; originally announced August 2020.

MSC Class: 62G05; 62G08; 60G10; 60J05; 62M05; 62M10

arXiv:2007.15194 [pdf, other]

Crowdsampling the Plenoptic Function

Authors: Zhengqi Li, Wenqi Xian, Abe Davis, Noah Snavely

Abstract: Many popular tourist landmarks are captured in a multitude of online, public photos. These photos represent a sparse and unstructured sampling of the plenoptic function for a particular scene. In this paper, we present a new approach to novel view synthesis under time-varying illumination from such data. Our approach builds on the recent multi-plane image (MPI) format for representing local light fields under fixed viewing conditions. We introduce a new DeepMPI representation, motivated by observations on the sparsity structure of the plenoptic function, that allows for real-time synthesis of photorealistic views that are continuous in both space and across changes in lighting. Our method can synthesize the same compelling parallax and view-dependent effects as previous MPI methods, while simultaneously interpolating along changes in reflectance and illumination with time. We show how to learn a model of these effects in an unsupervised way from an unstructured collection of photos without temporal registration, demonstrating significant improvements over recent work in neural rendering. More information can be found at crowdsampling.io.

Submitted 29 July, 2020; originally announced July 2020.

Comments: ECCV, 2020 (Oral)

arXiv:2006.09512 [pdf, other]

doi 10.1109/CVPR42600.2020.01231

Visual Chirality

Authors: Zhiqiu Lin, Jin Sun, Abe Davis, Noah Snavely

Abstract: How can we tell whether an image has been mirrored? While we understand the geometry of mirror reflections very well, less has been said about how it affects distributions of imagery at scale, despite widespread use for data augmentation in computer vision. In this paper, we investigate how the statistics of visual data are changed by reflection. We refer to these changes as "visual chirality", after the concept of geometric chirality - the notion of objects that are distinct from their mirror image. Our analysis of visual chirality reveals surprising results, including low-level chiral signals pervading imagery stemming from image processing in cameras, to the ability to discover visual chirality in images of people and faces. Our work has implications for data augmentation, self-supervised learning, and image forensics.

Submitted 16 June, 2020; originally announced June 2020.

Comments: Published at CVPR 2020, Best Paper Nomination, Oral Presentation. Project Page: https://linzhiqiu.github.io/papers/chirality/

ACM Class: I.4

Journal ref: CVPR (2020), 12292-12300

arXiv:2006.03193 [pdf, other]

LSTM-based Anomaly Detection for Non-linear Dynamical System

Authors: Yue Tan, Chunjing Hu, Kuan Zhang, Kan Zheng, Ethan A. Davis, Jae Sung Park

Abstract: Anomaly detection for non-linear dynamical systems plays an important role in ensuring system stability. However, it is usually complex and has to be solved by large-scale simulation, which requires extensive computing resources. In this paper, we propose a novel anomaly detection scheme for non-linear dynamical systems based on Long Short-Term Memory (LSTM) to capture complex temporal changes of the time sequence and make multi-step predictions. Specifically, we first present the framework of LSTM-based anomaly detection in non-linear dynamical systems, including data preprocessing, multi-step prediction and anomaly detection. According to the prediction requirement, two types of training modes are explored in multi-step prediction, where samples in a wall shear stress dataset are collected by an adaptive sliding window. On the basis of the multi-step prediction result, a Local Average with Adaptive Parameters (LAAP) algorithm is proposed to extract local numerical features of the time sequence and estimate the upcoming anomaly. The experimental results show that our proposed multi-step prediction method can achieve a higher prediction accuracy than the traditional method on the wall shear stress dataset, and the LAAP algorithm performs better than the absolute value-based method in the anomaly detection task.

Submitted 4 June, 2020; originally announced June 2020.

Comments: 8 pages, 6 figures

arXiv:2006.00915 [pdf, other]

doi 10.14778/3397230.3397233

eXtreme Modelling in Practice

Authors: A. Jesse Jiryu Davis, Max Hirschhorn, Judah Schvimer

Abstract: Formal modelling is a powerful tool for developing complex systems. At MongoDB, we use TLA+ to model and verify multiple aspects of several systems. Ensuring conformance between a specification and its implementation can add value to any specification; it can avoid transcription errors, prevent bugs as a large organization rapidly develops the specified code, and even keep multiple implementations of the same specification in sync. In this paper, we explore model-based testing as a tool for ensuring specification-implementation conformance. We attempted two case studies: model-based trace-checking (MBTC) in the MongoDB Server's replication protocol and model-based test-case generation (MBTCG) in MongoDB Realm Sync's operational transformation algorithm. We found MBTC to be impractical for testing that the Server conformed to a highly abstract specification. MBTCG was highly successful for Realm Sync, however. We analyze why one technique succeeded and the other failed, and advise future implementers making similar attempts at model-based testing.

Submitted 28 May, 2020; originally announced June 2020.

Journal ref: PVLDB (Proceedings of the VLDB Endowment), Vol. 13, No. 9, pp. 1346-1358 (2020)

arXiv:2005.11423 [pdf, other]

Multi-view polarimetric scattering cloud tomography and retrieval of droplet size

Authors: Aviad Levis, Yoav Y. Schechner, Anthony B. Davis, Jesse Loveridge

Abstract: Tomography aims to recover a three-dimensional (3D) density map of a medium or an object. In medical imaging, it is extensively used for diagnostics via X-ray computed tomography (CT). Optical diffusion tomography is an alternative to X-ray CT that uses multiply scattered light to deliver coarse density maps for soft tissues. We define and derive tomography of cloud droplet distributions via passive remote sensing. We use multi-view polarimetric images to fit a 3D polarized radiative transfer (RT) forward model. Our motivation is 3D volumetric probing of vertically-developed convectively-driven clouds that are ill-served by current methods in operational passive remote sensing. These techniques are based on strictly 1D RT modeling and applied to a single cloudy pixel, where cloud geometry is assumed to be that of a plane-parallel slab. Incident unpolarized sunlight, once scattered by cloud-droplets, changes its polarization state according to droplet size. Therefore, polarimetric measurements in the rainbow and glory angular regions can be used to infer the droplet size distribution. This work defines and derives a framework for a full 3D tomography of cloud droplets for both their mass concentration in space and their distribution across a range of sizes. This 3D retrieval of key microphysical properties is made tractable by our novel approach that involves a restructuring and differentiation of an open-source polarized 3D RT code to accommodate a special two-step optimization technique. Physically-realistic synthetic clouds are used to demonstrate the methodology with rigorous uncertainty quantification.

Submitted 22 May, 2020; originally announced May 2020.

arXiv:2004.14554 [pdf, other]

Indirect Identification of Psychosocial Risks from Natural Language

Authors: Kristen C. Allen, Alex Davis, Tamar Krishnamurti

Abstract: During the perinatal period, psychosocial health risks, including depression and intimate partner violence, are associated with serious adverse health outcomes for parents and children. To appropriately intervene, healthcare professionals must first identify those at risk, yet stigma often prevents people from directly disclosing the information needed to prompt an assessment. We examine indirect methods of eliciting and analyzing information that could indicate psychosocial risks. Short diary entries by peripartum women exhibit thematic patterns, extracted by topic modeling, and emotional perspective, drawn from dictionary-informed sentiment features. Using these features, we use regularized regression to predict screening measures of depression and psychological aggression by an intimate partner. Journal text entries quantified through topic models and sentiment features show promise for depression prediction, with performance almost as good as closed-form questions. Text-based features were less useful for prediction of intimate partner violence, but moderately indirect multiple-choice questioning allowed for detection without explicit disclosure. Both methods may serve as an initial or complementary screening approach to detecting stigmatized risks.

Submitted 29 April, 2020; originally announced April 2020.

Comments: 12 pages, 4 figures

ACM Class: J.3; J.4; H.5.2

arXiv:2002.11054 [pdf, other]

MLIR: A Compiler Infrastructure for the End of Moore's Law

Authors: Chris Lattner, Mehdi Amini, Uday Bondhugula, Albert Cohen, Andy Davis, Jacques Pienaar, River Riddle, Tatiana Shpeisman, Nicolas Vasilache, Oleksandr Zinenko

Abstract: This work presents MLIR, a novel approach to building reusable and extensible compiler infrastructure. MLIR aims to address software fragmentation, improve compilation for heterogeneous hardware, significantly reduce the cost of building domain specific compilers, and aid in connecting existing compilers together. MLIR facilitates the design and implementation of code generators, translators and optimizers at different levels of abstraction and also across application domains, hardware targets and execution environments. The contribution of this work includes (1) discussion of MLIR as a research artifact, built for extension and evolution, and identifying the challenges and opportunities posed by this novel design point in design, semantics, optimization specification, system, and engineering; and (2) evaluation of MLIR as a generalized infrastructure that reduces the cost of building compilers, describing diverse use-cases to show research and educational opportunities for future programming languages, compilers, execution environments, and computer architecture. The paper also presents the rationale for MLIR, its original design principles, structures and semantics.

Submitted 29 February, 2020; v1 submitted 25 February, 2020; originally announced February 2020.

arXiv:1910.12444 [pdf]

Information Seeking and Information Processing Behaviors Among Type 2 Diabetics

Authors: Sarah Masud Preum, Kate Clark, Ashley Davis, Konstantine Khutsishvilli, Rupa S Valdez

Abstract: Effective patient education is critical for managing Type 2 Diabetes Mellitus (T2DM), one of the most common chronic diseases in the United States. While some studies focus on the information-seeking behavior of T2DM patients, other self-education behaviors including information processing and utilization are rarely explored in the context of T2DM. This study sought to assess two self-education behaviors of type 2 diabetics, namely, information seeking and information processing, to understand more about how these behaviors affect the self-management of this common chronic disease. Semi-structured interviews were conducted with 8 English speaking T2DM patients and qualitative content analysis techniques were performed to analyze their responses. The information seeking and processing behaviors vary across individuals based on their prognosis of T2DM, information needs, and personal preferences. Patients are often dissatisfied with information from official sources, have difficulty evaluating the trustworthiness of information sources, and desire information that is more personally relevant to them. Several participants identified a lack of personalized information as a key factor in the inability to adhere to T2DM management guidelines, which led them to experience increased glucose levels, difficulty managing A1C levels, frustration, and anxiety. They mentioned that they followed trial and error based approaches to tailor information according to their needs and physiological conditions. Many participants identified conflicting or inconsistent information from different sources as a major barrier to information processing. The results of this study indicate a need for authentic, consistent, and individualized information for type 2 diabetics.

Submitted 28 October, 2019; originally announced October 2019.

arXiv:1901.00912 [pdf, other]

doi 10.1002/hbe2.115

Arming the public with artificial intelligence to counter social bots

Authors: Kai-Cheng Yang, Onur Varol, Clayton A. Davis, Emilio Ferrara, Alessandro Flammini, Filippo Menczer

Abstract: The increased relevance of social media in our daily life has been accompanied by efforts to manipulate online conversations and opinions. Deceptive social bots -- automated or semi-automated accounts designed to impersonate humans -- have been successfully exploited for these kinds of abuse. Researchers have responded by developing AI tools to arm the public in the fight against social bots. Here we review the literature on different types of bots, their impact, and detection methods. We use the case study of Botometer, a popular bot detection tool developed at Indiana University, to illustrate how people interact with AI countermeasures. A user experience survey suggests that bot detection has become an integral part of the social media experience for many users. However, barriers in interpreting the output of AI tools can lead to fundamental misunderstandings. The arms race between machine learning methods to develop sophisticated bots and effective countermeasures makes it necessary to update the training data and features of detection tools. We again use the Botometer case to illustrate both algorithmic and interpretability improvements of bot scores, designed to meet user expectations. We conclude by discussing how future AI developments may affect the fight between malicious bots and the public.

Submitted 6 February, 2019; v1 submitted 3 January, 2019; originally announced January 2019.

Comments: Published in Human Behavior and Emerging Technologies

Journal ref: Hum Behav & Emerg Tech. 2019;e115

arXiv:1805.01772 [pdf, other]

doi 10.1145/3190508.3190551

Dynamic Control Flow in Large-Scale Machine Learning

Authors: Yuan Yu, Martín Abadi, Paul Barham, Eugene Brevdo, Mike Burrows, Andy Davis, Jeff Dean, Sanjay Ghemawat, Tim Harley, Peter Hawkins, Michael Isard, Manjunath Kudlur, Rajat Monga, Derek Murray, Xiaoqiang Zheng

Abstract: Many recent machine learning models rely on fine-grained dynamic control flow for training and inference. In particular, models based on recurrent neural networks and on reinforcement learning depend on recurrence relations, data-dependent conditional execution, and other features that call for dynamic control flow. These applications benefit from the ability to make rapid control-flow decisions across a set of computing devices in a distributed system. For performance, scalability, and expressiveness, a machine learning system must support dynamic control flow in distributed and heterogeneous environments. This paper presents a programming model for distributed machine learning that supports dynamic control flow. We describe the design of the programming model, and its implementation in TensorFlow, a distributed machine learning system. Our approach extends the use of dataflow graphs to represent machine learning models, offering several distinctive features. First, the branches of conditionals and bodies of loops can be partitioned across many machines to run on a set of heterogeneous devices, including CPUs, GPUs, and custom ASICs. Second, programs written in our model support automatic differentiation and distributed gradient computations, which are necessary for training machine learning models that use control flow. Third, our choice of non-strict semantics enables multiple loop iterations to execute in parallel across machines, and to overlap compute and I/O operations. We have done our work in the context of TensorFlow, and it has been used extensively in research and production. We evaluate it using several real-world applications, and demonstrate its performance and scalability.

Submitted 4 May, 2018; originally announced May 2018.

Comments: Appeared in EuroSys 2018. 14 pages, 16 figures

Journal ref: EuroSys 2018: Thirteenth EuroSys Conference, April 23-26, 2018, Porto, Portugal. ACM, New York, NY, USA

arXiv:1707.06719 [pdf, other]

Generalized Convolutional Neural Networks for Point Cloud Data

Authors: Aleksandr Savchenkov, Andrew Davis, Xuan Zhao

Abstract: The introduction of cheap RGB-D cameras, stereo cameras, and LIDAR devices has given the computer vision community 3D information that conventional RGB cameras cannot provide. This data is often stored as a point cloud. In this paper, we present a novel method to apply the concept of convolutional neural networks to this type of data. By creating a mapping of nearest neighbors in a dataset, and individually applying weights to spatial relationships between points, we achieve an architecture that works directly with point clouds, but closely resembles a convolutional neural net in both design and behavior. Such a method bypasses the need for extensive feature engineering, while proving to be computationally efficient and requiring few parameters.

Submitted 18 October, 2018; v1 submitted 20 July, 2017; originally announced July 2017.

arXiv:1703.03107 [pdf, other]

Online Human-Bot Interactions: Detection, Estimation, and Characterization

Authors: Onur Varol, Emilio Ferrara, Clayton A. Davis, Filippo Menczer, Alessandro Flammini

Abstract: Increasing evidence suggests that a growing amount of social media content is generated by autonomous entities known as social bots. In this work we present a framework to detect such entities on Twitter. We leverage more than a thousand features extracted from public data and meta-data about users: friends, tweet content and sentiment, network patterns, and activity time series. We benchmark the classification framework by using a publicly available dataset of Twitter bots. This training data is enriched by a manually annotated collection of active Twitter users that include both humans and bots of varying sophistication. Our models yield high accuracy and agreement with each other and can detect bots of different nature. Our estimates suggest that between 9% and 15% of active Twitter accounts are bots. Characterizing ties among accounts, we observe that simple bots tend to interact with bots that exhibit more human-like behaviors. Analysis of content flows reveals retweet and mention strategies adopted by bots to interact with different target groups. Using clustering analysis, we characterize several subclasses of accounts, including spammers, self promoters, and accounts that post content from connected applications.

Submitted 27 March, 2017; v1 submitted 8 March, 2017; originally announced March 2017.

Comments: Accepted paper for ICWSM'17, 10 pages, 8 figures, 1 table

arXiv:1701.06538 [pdf, other]

Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer

Authors: Noam Shazeer, Azalia Mirhoseini, Krzysztof Maziarz, Andy Davis, Quoc Le, Geoffrey Hinton, Jeff Dean

Abstract: The capacity of a neural network to absorb information is limited by its number of parameters. Conditional computation, where parts of the network are active on a per-example basis, has been proposed in theory as a way of dramatically increasing model capacity without a proportional increase in computation. In practice, however, there are significant algorithmic and performance challenges. In this work, we address these challenges and finally realize the promise of conditional computation, achieving greater than 1000x improvements in model capacity with only minor losses in computational efficiency on modern GPU clusters. We introduce a Sparsely-Gated Mixture-of-Experts layer (MoE), consisting of up to thousands of feed-forward sub-networks. A trainable gating network determines a sparse combination of these experts to use for each example. We apply the MoE to the tasks of language modeling and machine translation, where model capacity is critical for absorbing the vast quantities of knowledge available in the training corpora. We present model architectures in which a MoE with up to 137 billion parameters is applied convolutionally between stacked LSTM layers. On large language modeling and machine translation benchmarks, these models achieve significantly better results than state-of-the-art at lower computational cost.

Submitted 23 January, 2017; originally announced January 2017.

arXiv:1609.08239 [pdf, other]

doi 10.1007/978-3-319-47874-6_19

On the influence of social bots in online protests. Preliminary findings of a Mexican case study

Authors: Pablo Suárez-Serrato, Margaret E. Roberts, Clayton A. Davis, Filippo Menczer

Abstract: Social bots can affect online communication among humans. We study this phenomenon by focusing on #YaMeCanse, the most active protest hashtag in the history of Twitter in Mexico. Accounts using the hashtag are classified using the BotOrNot bot detection tool. Our preliminary analysis suggests that bots played a critical role in disrupting online communication about the protest movement.

Submitted 26 September, 2016; originally announced September 2016.

Comments: 10 pages

Journal ref: SocInfo 2016, Part II, LNCS 10047

arXiv:1605.08695 [pdf, other]

TensorFlow: A system for large-scale machine learning

Authors: Martín Abadi, Paul Barham, Jianmin Chen, Zhifeng Chen, Andy Davis, Jeffrey Dean, Matthieu Devin, Sanjay Ghemawat, Geoffrey Irving, Michael Isard, Manjunath Kudlur, Josh Levenberg, Rajat Monga, Sherry Moore, Derek G. Murray, Benoit Steiner, Paul Tucker, Vijay Vasudevan, Pete Warden, Martin Wicke, Yuan Yu, Xiaoqiang Zheng

Abstract: TensorFlow is a machine learning system that operates at large scale and in heterogeneous environments. TensorFlow uses dataflow graphs to represent computation, shared state, and the operations that mutate that state. It maps the nodes of a dataflow graph across many machines in a cluster, and within a machine across multiple computational devices, including multicore CPUs, general-purpose GPUs, and custom designed ASICs known as Tensor Processing Units (TPUs). This architecture gives flexibility to the application developer: whereas in previous "parameter server" designs the management of shared state is built into the system, TensorFlow enables developers to experiment with novel optimizations and training algorithms. TensorFlow supports a variety of applications, with particularly strong support for training and inference on deep neural networks. Several Google services use TensorFlow in production, we have released it as an open-source project, and it has become widely used for machine learning research. In this paper, we describe the TensorFlow dataflow model in contrast to existing systems, and demonstrate the compelling performance that TensorFlow achieves for several real-world applications.

Submitted 31 May, 2016; v1 submitted 27 May, 2016; originally announced May 2016.

Comments: 18 pages, 9 figures; v2 has a spelling correction in the metadata

arXiv:1603.04467 [pdf, other]

TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems

Authors: Martín Abadi, Ashish Agarwal, Paul Barham, Eugene Brevdo, Zhifeng Chen, Craig Citro, Greg S. Corrado, Andy Davis, Jeffrey Dean, Matthieu Devin, Sanjay Ghemawat, Ian Goodfellow, Andrew Harp, Geoffrey Irving, Michael Isard, Yangqing Jia, Rafal Jozefowicz, Lukasz Kaiser, Manjunath Kudlur, Josh Levenberg, Dan Mane, Rajat Monga, Sherry Moore, Derek Murray, Chris Olah , et al. (15 additional authors not shown)

Abstract: TensorFlow is an interface for expressing machine learning algorithms, and an implementation for executing such algorithms. A computation expressed using TensorFlow can be executed with little or no change on a wide variety of heterogeneous systems, ranging from mobile devices such as phones and tablets up to large-scale distributed systems of hundreds of machines and thousands of computational devices such as GPU cards. The system is flexible and can be used to express a wide variety of algorithms, including training and inference algorithms for deep neural network models, and it has been used for conducting research and for deploying machine learning systems into production across more than a dozen areas of computer science and other fields, including speech recognition, computer vision, robotics, information retrieval, natural language processing, geographic information extraction, and computational drug discovery. This paper describes the TensorFlow interface and an implementation of that interface that we have built at Google. The TensorFlow API and a reference implementation were released as an open-source package under the Apache 2.0 license in November, 2015 and are available at www.tensorflow.org.

Submitted 16 March, 2016; v1 submitted 14 March, 2016; originally announced March 2016.

Comments: Version 2 updates only the metadata, to correct the formatting of Martín Abadi's name

arXiv:1602.04878 [pdf, other]

Kinsey Reporter: Citizen Science for Sex Research

Authors: Clayton A Davis, Julia Heiman, Erick Janssen, Stephanie Sanders, Justin Garcia, Filippo Menczer

Abstract: Kinsey Reporter is a global mobile app to share, explore, and visualize anonymous data about sex. Reports are submitted via smartphone, then visualized on a website or downloaded for offline analysis. In this paper we present the major features of the Kinsey Reporter citizen science platform designed to preserve the anonymity of its contributors, and preliminary data analyses that suggest questions for future research.

Submitted 15 February, 2016; originally announced February 2016.

Comments: Let's Talk About Sex (Apps) Workshop at CSCW 2015

arXiv:1602.00975 [pdf, other]

doi 10.1145/2872518.2889302

BotOrNot: A System to Evaluate Social Bots

Authors: Clayton A. Davis, Onur Varol, Emilio Ferrara, Alessandro Flammini, Filippo Menczer

Abstract: While most online social media accounts are controlled by humans, these platforms also host automated agents called social bots or sybil accounts. Recent literature reported on cases of social bots imitating humans to manipulate discussions, alter the popularity of users, pollute content and spread misinformation, and even perform terrorist propaganda and recruitment actions. Here we present BotOrNot, a publicly-available service that leverages more than one thousand features to evaluate the extent to which a Twitter account exhibits similarity to the known characteristics of social bots. Since its release in May 2014, BotOrNot has served over one million requests via our website and APIs.

Submitted 2 February, 2016; originally announced February 2016.

Comments: 2 pages, 2 figures, WWW Developers Day

Journal ref: Proceedings of the 25th International Conference Companion on World Wide Web (pp. 273-274). 2016

arXiv:1510.04734 [pdf]

A Method for Modeling Co-Occurrence Propensity of Clinical Codes with Application to ICD-10-PCS Auto-Coding

Authors: Michael Subotin, Anthony R. Davis

Abstract: Objective. Natural language processing methods for medical auto-coding, or automatic generation of medical billing codes from electronic health records, generally assign each code independently of the others. They may thus assign codes for closely related procedures or diagnoses to the same document, even when they do not tend to occur together in practice, simply because the right choice can be difficult to infer from the clinical narrative. Materials and Methods. We propose a method that injects awareness of the propensities for code co-occurrence into this process. First, a model is trained to estimate the conditional probability that one code is assigned by a human coder, given that another code is known to have been assigned to the same document. Then, at runtime, an iterative algorithm is used to apply this model to the output of an existing statistical auto-coder to modify the confidence scores of the codes. Results. We tested this method in combination with a primary auto-coder for ICD-10 procedure codes, achieving a 12% relative improvement in F-score over the primary auto-coder baseline. Discussion. The proposed method can be used, with appropriate features, in combination with any auto-coder that generates codes with different levels of confidence. Conclusion. The promising results obtained for ICD-10 procedure codes suggest that the proposed method may have wider applications in auto-coding.

Submitted 15 October, 2015; originally announced October 2015.

Comments: Submitted to Journal of the American Medical Informatics Association, 2015

arXiv:1406.1566 [pdf, other]

doi 10.4204/EPTCS.152.13

Development of a Translator from LLVM to ACL2

Authors: David S. Hardin, Jennifer A. Davis, David A. Greve, Jedidiah R. McClurg

Abstract: In our current work a library of formally verified software components is to be created, and assembled, using the Low-Level Virtual Machine (LLVM) intermediate form, into subsystems whose top-level assurance relies on the assurance of the individual components. We have thus undertaken a project to build a translator from LLVM to the applicative subset of Common Lisp accepted by the ACL2 theorem prover. Our translator produces executable ACL2 formal models, allowing us to both prove theorems about the translated models as well as validate those models by testing. The resulting models can be translated and certified without user intervention, even for code with loops, thanks to the use of the def::ung macro which allows us to defer the question of termination. Initial measurements of concrete execution for translated LLVM functions indicate that performance is nearly 2.4 million LLVM instructions per second on a typical laptop computer. In this paper we overview the translation process and illustrate the translator's capabilities by way of a concrete example, including both a functional correctness theorem as well as a validation test for that example.

Submitted 5 June, 2014; originally announced June 2014.

Comments: In Proceedings ACL2 2014, arXiv:1406.1238

ACM Class: F.3.1; F.4.1

Journal ref: EPTCS 152, 2014, pp. 163-177

arXiv:1312.4461 [pdf, ps, other]

Low-Rank Approximations for Conditional Feedforward Computation in Deep Neural Networks

Authors: Andrew Davis, Itamar Arel

Abstract: Scalability properties of deep neural networks raise key research questions, particularly as the problems considered become larger and more challenging. This paper expands on the idea of conditional computation introduced by Bengio et al., where the nodes of a deep network are augmented by a set of gating units that determine when a node should be calculated. By factorizing the weight matrix into a low-rank approximation, an estimation of the sign of the pre-nonlinearity activation can be efficiently obtained. For networks using rectified-linear hidden units, this implies that the computation of a hidden unit with an estimated negative pre-nonlinearity can be omitted altogether, as its value will become zero when the nonlinearity is applied. For sparse neural networks, this can result in considerable speed gains. Experimental results using the MNIST and SVHN data sets with a fully-connected deep neural network demonstrate the performance robustness of the proposed scheme with respect to the error introduced by the conditional computation process.

Submitted 28 January, 2014; v1 submitted 16 December, 2013; originally announced December 2013.

Comments: 10 pages, 5 figures. Submitted to ICLR 2014

arXiv:1301.2264 [pdf]

Using Bayesian Networks to Identify the Causal Effect of Speeding in Individual Vehicle/Pedestrian Collisions

Authors: Gary A. Davis

Abstract: On roads showing significant violations of posted speed limits, one measure of the safety effect of speeding is the difference between the road's actual accident count and the count that would have occurred if the posted speed limit had been strictly obeyed. An estimate of this accident reduction can be had by computing the probability that speeding was a necessary condition for each of a set of accidents. This is an instance of assessing individual probabilities of causation, which is generally not possible absent prior knowledge of causal structure. For traffic accidents such prior knowledge is often available and this paper illustrates how, for a commonly occurring class of vehicle/pedestrian accidents, approaches to uncertainty and causal analyses appearing in the accident reconstruction literature can be unified using Bayesian networks. Measured skidmarks, pedestrian throw distances, and pedestrian injury severity are treated as evidence, and using the Gibbs Sampling routine BUGS, the posterior probability distribution over exogenous variables, such as the vehicle's initial speed, location, and driver reaction time, is computed. This posterior distribution is then used to compute the "probability of necessity" for speeding.

Submitted 10 January, 2013; originally announced January 2013.

Comments: Appears in Proceedings of the Seventeenth Conference on Uncertainty in Artificial Intelligence (UAI2001)

Report number: UAI-P-2001-PG-105-111

arXiv:0807.1253 [pdf, ps, other]

doi 10.1098/rspa.2008.0465

Informed Traders

Authors: Dorje C. Brody, Mark H. A. Davis, Robyn L. Friedman, Lane P. Hughston

Abstract: An asymmetric information model is introduced for the situation in which there is a small agent who is more susceptible to the flow of information in the market than the general market participant, and who tries to implement strategies based on the additional information. In this model market participants have access to a stream of noisy information concerning the future return of an asset, whereas the informed trader has access to a further information source which is obscured by an additional noise that may be correlated with the market noise. The informed trader uses the extraneous information source to seek statistical arbitrage opportunities, while at the same time accommodating the additional risk. The amount of information available to the general market participant concerning the asset return is measured by the mutual information of the asset price and the associated cash flow. The worth of the additional information source is then measured in terms of the difference of mutual information between the general market participant and the informed trader. This difference is shown to be nonnegative when the signal-to-noise ratio of the information flow is known in advance. Explicit trading strategies leading to statistical arbitrage opportunities, taking advantage of the additional information, are constructed, illustrating how excess information can be translated into profit.

Submitted 17 November, 2008; v1 submitted 8 July, 2008; originally announced July 2008.

Comments: 20 pages, 5 figures. Version to appear in the Proceedings of the Royal Society A

Journal ref: Proceedings of the Royal Society London A465, 1103-1122 (2009)

arXiv:cs/0409003 [pdf]

ScheduleNanny: Using GPS to Learn the User's Significant Locations, Travel Times and Schedule

Authors: Parth Bhawalkar, Victor Bigio, Adam Davis, Karthik Narayanaswami, Femi Olumoko

Abstract: As computing technology becomes more pervasive, personal devices such as the PDA, cell-phone, and notebook should use context to determine how to act. Location is one form of context that can be used in many ways. We present a multiple-device system that collects and clusters GPS data into significant locations. These locations are then used to determine travel times and a probabilistic model of the user's schedule, which is used to intelligently alert the user. We evaluate our system and suggest how it should be integrated with a variety of applications.

Submitted 2 September, 2004; originally announced September 2004.

Comments: 7 pages, 10 figures. Adaptive & Ubiquitous Computing

ACM Class: F.2.2; I.5.3; H.5.3; H.5.m

Showing 1–49 of 49 results for author: Davis, A