Showing 1–20 of 20 results for author: Addanki, R

Searching in archive cs.
  1. arXiv:2407.07291  [pdf, other]

    cs.LG cs.AI stat.ML

    Causal Discovery in Semi-Stationary Time Series

    Authors: Shanyun Gao, Raghavendra Addanki, Tong Yu, Ryan A. Rossi, Murat Kocaoglu

    Abstract: Discovering causal relations from observational time series without making the stationary assumption is a significant challenge. In practice, this challenge is common in many areas, such as retail sales, transportation systems, and medical science. Here, we consider this problem for a class of non-stationary time series. The structural causal model (SCM) of this type of time series, called the sem…

    Submitted 9 July, 2024; originally announced July 2024.

    ACM Class: I.2.6, G.3

  2. arXiv:2407.07290  [pdf, other]

    cs.LG cs.AI stat.ML

    Causal Discovery-Driven Change Point Detection in Time Series

    Authors: Shanyun Gao, Raghavendra Addanki, Tong Yu, Ryan A. Rossi, Murat Kocaoglu

    Abstract: Change point detection in time series seeks to identify times when the probability distribution of time series changes. It is widely applied in many areas, such as human-activity sensing and medical science. In the context of multivariate time series, this typically involves examining the joint distribution of high-dimensional data: If any one variable changes, the whole time series is assumed to…

    Submitted 9 July, 2024; originally announced July 2024.

    ACM Class: I.2.6, G.3

  3. arXiv:2403.10618  [pdf, ps, other]

    cs.LG cs.AI cs.DS econ.EM stat.ME

    Limits of Approximating the Median Treatment Effect

    Authors: Raghavendra Addanki, Siddharth Bhandari

    Abstract: Average Treatment Effect (ATE) estimation is a well-studied problem in causal inference. However, it does not necessarily capture the heterogeneity in the data, and several approaches have been proposed to tackle the issue, including estimating the Quantile Treatment Effects. In the finite population setting containing $n$ individuals, with treatment and control values denoted by the potential out…

    Submitted 15 March, 2024; originally announced March 2024.

  4. arXiv:2403.05530  [pdf, other]

    cs.CL cs.AI

    Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context

    Authors: Gemini Team, Petko Georgiev, Ving Ian Lei, Ryan Burnell, Libin Bai, Anmol Gulati, Garrett Tanzer, Damien Vincent, Zhufeng Pan, Shibo Wang, Soroosh Mariooryad, Yifan Ding, Xinyang Geng, Fred Alcober, Roy Frostig, Mark Omernick, Lexi Walker, Cosmin Paduraru, Christina Sorokin, Andrea Tacchetti, Colin Gaffney, Samira Daruki, Olcan Sercinoglu, Zach Gleicher, Juliette Love, et al. (1110 additional authors not shown)

    Abstract: In this report, we introduce the Gemini 1.5 family of models, representing the next generation of highly compute-efficient multimodal models capable of recalling and reasoning over fine-grained information from millions of tokens of context, including multiple long documents and hours of video and audio. The family includes two new models: (1) an updated Gemini 1.5 Pro, which exceeds the February…

    Submitted 8 August, 2024; v1 submitted 8 March, 2024; originally announced March 2024.

  5. arXiv:2402.00168  [pdf, other]

    stat.ML cs.LG stat.ME

    Continuous Treatment Effects with Surrogate Outcomes

    Authors: Zhenghao Zeng, David Arbour, Avi Feller, Raghavendra Addanki, Ryan Rossi, Ritwik Sinha, Edward H. Kennedy

    Abstract: In many real-world causal inference applications, the primary outcomes (labels) are often partially missing, especially if they are expensive or difficult to collect. If the missingness depends on covariates (i.e., missingness is not completely at random), analyses based on fully observed samples alone may be biased. Incorporating surrogates, which are fully observed post-treatment variables relat…

    Submitted 21 May, 2024; v1 submitted 31 January, 2024; originally announced February 2024.

    Comments: 30 pages, 7 figures

  6. arXiv:2312.11805  [pdf, other]

    cs.CL cs.AI cs.CV

    Gemini: A Family of Highly Capable Multimodal Models

    Authors: Gemini Team, Rohan Anil, Sebastian Borgeaud, Jean-Baptiste Alayrac, Jiahui Yu, Radu Soricut, Johan Schalkwyk, Andrew M. Dai, Anja Hauth, Katie Millican, David Silver, Melvin Johnson, Ioannis Antonoglou, Julian Schrittwieser, Amelia Glaese, Jilin Chen, Emily Pitler, Timothy Lillicrap, Angeliki Lazaridou, Orhan Firat, James Molloy, Michael Isard, Paul R. Barham, Tom Hennigan, Benjamin Lee, et al. (1325 additional authors not shown)

    Abstract: This report introduces a new family of multimodal models, Gemini, that exhibit remarkable capabilities across image, audio, video, and text understanding. The Gemini family consists of Ultra, Pro, and Nano sizes, suitable for applications ranging from complex reasoning tasks to on-device memory-constrained use-cases. Evaluation on a broad range of benchmarks shows that our most-capable Gemini Ultr…

    Submitted 17 June, 2024; v1 submitted 18 December, 2023; originally announced December 2023.

  7. arXiv:2311.14652  [pdf, other]

    cs.LG cs.CL stat.ML

    One Pass Streaming Algorithm for Super Long Token Attention Approximation in Sublinear Space

    Authors: Raghav Addanki, Chenyang Li, Zhao Song, Chiwun Yang

    Abstract: Attention computation takes both the time complexity of $O(n^2)$ and the space complexity of $O(n^2)$ simultaneously, which makes deploying Large Language Models (LLMs) in streaming applications that involve long contexts requiring substantial computational resources. In recent OpenAI DevDay (Nov 6, 2023), OpenAI released a new model that is able to support a 128K-long document, in our paper, we f…

    Submitted 5 February, 2024; v1 submitted 24 November, 2023; originally announced November 2023.

  8. arXiv:2210.06594  [pdf, other]

    cs.LG cs.AI cs.DS econ.EM stat.ME

    Sample Constrained Treatment Effect Estimation

    Authors: Raghavendra Addanki, David Arbour, Tung Mai, Cameron Musco, Anup Rao

    Abstract: Treatment effect estimation is a fundamental problem in causal inference. We focus on designing efficient randomized controlled trials, to accurately estimate the effect of some treatment on a population of $n$ individuals. In particular, we study sample-constrained treatment effect estimation, where we must select a subset of $s \ll n$ individuals from the population to experiment on. This subset…

    Submitted 12 October, 2022; originally announced October 2022.

    Comments: Conference on Neural Information Processing Systems (NeurIPS) 2022

  9. arXiv:2207.02817  [pdf, ps, other]

    cs.DS

    Non-Adaptive Edge Counting and Sampling via Bipartite Independent Set Queries

    Authors: Raghavendra Addanki, Andrew McGregor, Cameron Musco

    Abstract: We study the problem of estimating the number of edges in an $n$-vertex graph, accessed via the Bipartite Independent Set query model introduced by Beame et al. (ITCS '18). In this model, each query returns a Boolean, indicating the existence of at least one edge between two specified sets of nodes. We present a non-adaptive algorithm that returns a $(1\pm ε)$ relative error approximation to the n…

    Submitted 6 July, 2022; originally announced July 2022.

    Comments: European Symposium on Algorithms (ESA) 2022

  10. arXiv:2201.06678  [pdf, other]

    cs.DS

    Improved Approximation and Scalability for Fair Max-Min Diversification

    Authors: Raghavendra Addanki, Andrew McGregor, Alexandra Meliou, Zafeiria Moumoulidou

    Abstract: Given an $n$-point metric space $(\mathcal{X},d)$ where each point belongs to one of $m=O(1)$ different categories or groups and a set of integers $k_1, \ldots, k_m$, the fair Max-Min diversification problem is to select $k_i$ points belonging to category $i\in [m]$, such that the minimum pairwise distance between selected points is maximized. The problem was introduced by Moumoulidou et al. [ICDT…

    Submitted 17 January, 2022; originally announced January 2022.

    Comments: To appear in ICDT 2022

  11. arXiv:2107.09422  [pdf, other]

    cs.LG cs.AI cs.SI stat.ML

    Large-scale graph representation learning with very deep GNNs and self-supervision

    Authors: Ravichandra Addanki, Peter W. Battaglia, David Budden, Andreea Deac, Jonathan Godwin, Thomas Keck, Wai Lok Sibon Li, Alvaro Sanchez-Gonzalez, Jacklynn Stott, Shantanu Thakoor, Petar Veličković

    Abstract: Effectively and efficiently deploying graph neural networks (GNNs) at scale remains one of the most challenging aspects of graph representation learning. Many powerful solutions have only ever been validated on comparatively small datasets, often with counter-intuitive outcomes -- a barrier which has been broken by the Open Graph Benchmark Large-Scale Challenge (OGB-LSC). We entered the OGB-LSC wi…

    Submitted 20 July, 2021; originally announced July 2021.

    Comments: To appear at KDD Cup 2021. 13 pages, 3 figures. All authors contributed equally

  12. arXiv:2106.03028  [pdf, other]

    cs.LG cs.AI

    Collaborative Causal Discovery with Atomic Interventions

    Authors: Raghavendra Addanki, Shiva Prasad Kasiviswanathan

    Abstract: We introduce a new Collaborative Causal Discovery problem, through which we model a common scenario in which we have multiple independent entities each with their own causal graph, and the goal is to simultaneously learn all these causal graphs. We study this problem without the causal sufficiency assumption, using Maximal Ancestral Graphs (MAG) to model the causal graphs, and assuming that we hav…

    Submitted 6 June, 2021; originally announced June 2021.

  13. arXiv:2105.05782  [pdf, other]

    cs.DS cs.DB stat.ML

    How to Design Robust Algorithms using Noisy Comparison Oracle

    Authors: Raghavendra Addanki, Sainyam Galhotra, Barna Saha

    Abstract: Metric based comparison operations such as finding maximum, nearest and farthest neighbor are fundamental to studying various clustering techniques such as $k$-center clustering and agglomerative hierarchical clustering. These techniques crucially rely on accurate estimation of pairwise distance between records. However, computing exact features of the records, and their pairwise distances is ofte…

    Submitted 12 May, 2021; originally announced May 2021.

    Comments: PVLDB 2021

  14. arXiv:2012.13976  [pdf, ps, other]

    cs.DS cs.LG stat.ML

    Intervention Efficient Algorithms for Approximate Learning of Causal Graphs

    Authors: Raghavendra Addanki, Andrew McGregor, Cameron Musco

    Abstract: We study the problem of learning the causal relationships between a set of observed variables in the presence of latents, while minimizing the cost of interventions on the observed variables. We assume access to an undirected graph $G$ on the observed variables whose edges represent either all direct causal relationships or, less restrictively, a superset of causal relationships (identified, e.g.,…

    Submitted 27 December, 2020; originally announced December 2020.

    Comments: To appear, International Conference on Algorithmic Learning Theory (ALT) 2021

  15. arXiv:2012.13349  [pdf, other]

    math.OC cs.AI cs.DM cs.LG cs.NE

    Solving Mixed Integer Programs Using Neural Networks

    Authors: Vinod Nair, Sergey Bartunov, Felix Gimeno, Ingrid von Glehn, Pawel Lichocki, Ivan Lobov, Brendan O'Donoghue, Nicolas Sonnerat, Christian Tjandraatmadja, Pengming Wang, Ravichandra Addanki, Tharindi Hapuarachchi, Thomas Keck, James Keeling, Pushmeet Kohli, Ira Ktena, Yujia Li, Oriol Vinyals, Yori Zwols

    Abstract: Mixed Integer Programming (MIP) solvers rely on an array of sophisticated heuristics developed with decades of research to solve large-scale MIP instances encountered in practice. Machine learning offers to automatically construct better heuristics from data by exploiting shared structure among instances in the data. This paper applies learning to the two key sub-tasks of a MIP solver, generating…

    Submitted 29 July, 2021; v1 submitted 23 December, 2020; originally announced December 2020.

  16. arXiv:2008.11191  [pdf, other]

    cs.SI

    Multi-team Formation using Community Based Approach in Real-World Networks

    Authors: Ramesh Bobby Addanki, Durga Bhavani S

    Abstract: In an organization, tasks called projects that require several skills, are generally assigned to teams rather than individuals. The problem of choosing a right team for a given task with minimal communication cost is known as team formation problem and many algorithms have been proposed in the literature. We propose an algorithm that exploits the community structure of the social network and forms…

    Submitted 25 August, 2020; originally announced August 2020.

    Comments: 25 pages, 12 figures

  17. arXiv:2005.11736  [pdf, other]

    cs.LG cs.DS stat.ML

    Efficient Intervention Design for Causal Discovery with Latents

    Authors: Raghavendra Addanki, Shiva Prasad Kasiviswanathan, Andrew McGregor, Cameron Musco

    Abstract: We consider recovering a causal graph in presence of latent variables, where we seek to minimize the cost of interventions used in the recovery process. We consider two intervention cost models: (1) a linear cost model where the cost of an intervention on a subset of variables has a linear form, and (2) an identity cost model where the cost of an intervention is the same, regardless of what variab…

    Submitted 12 July, 2020; v1 submitted 24 May, 2020; originally announced May 2020.

    Comments: International Conference on Machine Learning 2020

  18. arXiv:1907.08650  [pdf, other]

    cs.LG cs.AI stat.ML

    Snomed2Vec: Random Walk and Poincaré Embeddings of a Clinical Knowledge Base for Healthcare Analytics

    Authors: Khushbu Agarwal, Tome Eftimov, Raghavendra Addanki, Sutanay Choudhury, Suzanne Tamang, Robert Rallo

    Abstract: Representation learning methods that transform encoded data (e.g., diagnosis and drug codes) into continuous vector spaces (i.e., vector embeddings) are critical for the application of deep learning in healthcare. Initial work in this area explored the use of variants of the word2vec algorithm to learn embeddings for medical concepts from electronic health records or medical claims datasets. We pr…

    Submitted 19 July, 2019; originally announced July 2019.

    Comments: 2019 KDD Workshop on Applied Data Science for Healthcare (DSHealth '19). https://gitlab.com/agarwal.khushbu/Snomed2Vec

  19. arXiv:1906.08879  [pdf, other]

    cs.LG cs.DC stat.ML

    Placeto: Learning Generalizable Device Placement Algorithms for Distributed Machine Learning

    Authors: Ravichandra Addanki, Shaileshh Bojja Venkatakrishnan, Shreyan Gupta, Hongzi Mao, Mohammad Alizadeh

    Abstract: We present Placeto, a reinforcement learning (RL) approach to efficiently find device placements for distributed neural network training. Unlike prior approaches that only find a device placement for a specific computation graph, Placeto can learn generalizable device placement policies that can be applied to any graph. We propose two key ideas in our approach: (1) we represent the policy as perfo…

    Submitted 20 June, 2019; originally announced June 2019.

  20. arXiv:1804.03197  [pdf, ps, other]

    cs.DS

    Dynamic Set Cover: Improved Algorithms & Lower Bounds

    Authors: Amir Abboud, Raghavendra Addanki, Fabrizio Grandoni, Debmalya Panigrahi, Barna Saha

    Abstract: We give new upper and lower bounds for the dynamic set cover problem. First, we give a $(1+ε) f$-approximation for fully dynamic set cover in $O(f^2\log n /ε^5)$ (amortized) update time, for any $ε> 0$, where $f$ is the maximum number of sets that an element belongs to. In the decremental setting, the update time can be improved to $O(f^2/ε^5)$, while still obtaining an $(1+ε) f$-approximati…

    Submitted 14 May, 2019; v1 submitted 9 April, 2018; originally announced April 2018.

    Comments: The STOC final version