Showing 1–12 of 12 results for author: Niu, Z

  1. arXiv:2407.16550 [pdf, other]

    stat.ME math.ST stat.ML

    A Kernel-Based Conditional Two-Sample Test Using Nearest Neighbors (with Applications to Calibration, Regression Curves, and Simulation-Based Inference)

    Authors: Anirban Chatterjee, Ziang Niu, Bhaswar B. Bhattacharya

    Abstract: In this paper we introduce a kernel-based measure for detecting differences between two conditional distributions. Using the 'kernel trick' and nearest-neighbor graphs, we propose a consistent estimate of this measure which can be computed in nearly linear time (for a fixed number of nearest neighbors). Moreover, when the two conditional distributions are the same, the estimate has a Gaussian limi…

    Submitted 28 August, 2024; v1 submitted 23 July, 2024; originally announced July 2024.

  2. arXiv:2407.08911 [pdf, other]

    stat.ME stat.AP

    Computationally efficient and statistically accurate conditional independence testing with spaCRT

    Authors: Ziang Niu, Jyotishka Ray Choudhury, Eugene Katsevich

    Abstract: We introduce the saddlepoint approximation-based conditional randomization test (spaCRT), a novel conditional independence test that effectively balances statistical accuracy and computational efficiency, inspired by applications to single-cell CRISPR screens. Resampling-based methods like the distilled conditional randomization test (dCRT) offer statistical precision but at a high computational c…

    Submitted 14 September, 2024; v1 submitted 11 July, 2024; originally announced July 2024.

  3. arXiv:2403.07318 [pdf, ps, other]

    stat.AP

    Test for high-dimensional linear hypothesis of mean vectors via random integration

    Authors: Jianghao Li, Shizhe Hong, Zhenzhen Niu, Zhidong Bai

    Abstract: In this paper, we investigate hypothesis testing for the linear combination of mean vectors across multiple populations through the method of random integration. We have established the asymptotic distributions of the test statistics under both null and alternative hypotheses. Additionally, we provide a theoretical explanation for the special use of our test statistics in situations when the nonze…

    Submitted 12 March, 2024; originally announced March 2024.

  4. arXiv:2403.05760 [pdf, other]

    stat.AP

    Simultaneous test of the mean vectors and covariance matrices for high-dimensional data using RMT

    Authors: Zhenzhen Niu, Jianghao Li, Wenya Luo, Zhidong Bai

    Abstract: In this paper, we propose a new modified likelihood ratio test (LRT) for simultaneously testing mean vectors and covariance matrices of two-sample populations in high-dimensional settings. By employing tools from Random Matrix Theory (RMT), we derive the limiting null distribution of the modified LRT for generally distributed populations. Furthermore, we compare the proposed test with existing tes…

    Submitted 8 March, 2024; originally announced March 2024.

  5. arXiv:2401.17143 [pdf, other]

    math.ST stat.ME

    Test for high-dimensional mean vectors via the weighted $L_2$-norm

    Authors: Jianghao Li, Zhenzhen Niu, Shizhe Hong, Zhidong Bai

    Abstract: In this paper, we propose a novel approach to test the equality of high-dimensional mean vectors of several populations via the weighted $L_2$-norm. We establish the asymptotic normality of the test statistics under the null hypothesis. We also explain theoretically why our test statistics can be highly useful in weakly dense cases when the nonzero signal in mean vectors is present. Furthermore, w…

    Submitted 31 January, 2024; v1 submitted 30 January, 2024; originally announced January 2024.

  6. arXiv:2302.05828 [pdf, other]

    cs.LG stat.ML

    Graph Neural Network-Inspired Kernels for Gaussian Processes in Semi-Supervised Learning

    Authors: Zehao Niu, Mihai Anitescu, Jie Chen

    Abstract: Gaussian processes (GPs) are an attractive class of machine learning models because of their simplicity and flexibility as building blocks of more complex Bayesian models. Meanwhile, graph neural networks (GNNs) emerged recently as a promising class of models for graph-structured data in semi-supervised learning and beyond. Their competitive performance is often attributed to a proper capturing of…

    Submitted 11 February, 2023; originally announced February 2023.

    Comments: ICLR 2023. Code is available at https://github.com/niuzehao/gnn-gp

  7. arXiv:2211.15639 [pdf, other]

    math.ST stat.ME

    Distribution-free joint independence testing and robust independent component analysis using optimal transport

    Authors: Ziang Niu, Bhaswar B. Bhattacharya

    Abstract: In this paper we study the problem of measuring and testing joint independence for a collection of multivariate random variables. Using the emerging theory of optimal transport (OT) based multivariate ranks, we propose a distribution-free test for multivariate joint independence. Towards this we introduce the notion of rank joint distance covariance (RJdCov), the higher-order rank analogue of the…

    Submitted 30 November, 2022; v1 submitted 28 November, 2022; originally announced November 2022.

  8. arXiv:2211.14698 [pdf, other]

    stat.ME math.ST

    Reconciling model-X and doubly robust approaches to conditional independence testing

    Authors: Ziang Niu, Abhinav Chakraborty, Oliver Dukes, Eugene Katsevich

    Abstract: Model-X approaches to testing conditional independence between a predictor and an outcome variable given a vector of covariates usually assume exact knowledge of the conditional distribution of the predictor given the covariates. Nevertheless, model-X methodologies are often deployed with this conditional distribution learned in sample. We investigate the consequences of this choice through the le…

    Submitted 8 February, 2023; v1 submitted 26 November, 2022; originally announced November 2022.

  9. arXiv:2206.12680 [pdf, other]

    cs.LG stat.ML

    Topology-aware Generalization of Decentralized SGD

    Authors: Tongtian Zhu, Fengxiang He, Lan Zhang, Zhengyang Niu, Mingli Song, Dacheng Tao

    Abstract: This paper studies the algorithmic stability and generalizability of decentralized stochastic gradient descent (D-SGD). We prove that the consensus model learned by D-SGD is $\mathcal{O}(N^{-1}+m^{-1}+\lambda^2)$-stable in expectation in the non-convex non-smooth setting, where $N$ is the total sample size, $m$ is the worker number, and $1-\lambda$ is the spectral gap that measures the connectivity of the…

    Submitted 4 February, 2023; v1 submitted 25 June, 2022; originally announced June 2022.

    Comments: Accepted for publication in the 39th International Conference on Machine Learning (ICML 2022)

  10. arXiv:2204.00111 [pdf, ps, other]

    stat.ME math.ST

    Estimation and inference for high-dimensional nonparametric additive instrumental-variables regression

    Authors: Ziang Niu, Yuwen Gu, Wei Li

    Abstract: The method of instrumental variables provides a fundamental and practical tool for causal inference in many empirical studies where unmeasured confounding between the treatments and the outcome is present. Modern data such as the genetical genomics data from these studies are often high-dimensional. The high-dimensional linear instrumental-variables regression has been considered in the literature…

    Submitted 27 October, 2022; v1 submitted 31 March, 2022; originally announced April 2022.

    Comments: Submitted version

  11. arXiv:2110.03200 [pdf, other]

    math.ST stat.ME

    High Dimensional Logistic Regression Under Network Dependence

    Authors: Somabha Mukherjee, Ziang Niu, Sagnik Halder, Bhaswar B. Bhattacharya, George Michailidis

    Abstract: Logistic regression is one of the most fundamental methods for modeling the probability of a binary outcome based on a collection of covariates. However, the classical formulation of logistic regression relies on the independent sampling assumption, which is often violated when the outcomes interact through an underlying network structure. This necessitates the development of models that can simul…

    Submitted 9 September, 2022; v1 submitted 7 October, 2021; originally announced October 2021.

    Comments: Major updates. 46 pages, 3 figures

  12. arXiv:2106.11561 [pdf, other]

    stat.CO stat.ME

    Discrepancy-based Inference for Intractable Generative Models using Quasi-Monte Carlo

    Authors: Ziang Niu, Johanna Meier, François-Xavier Briol

    Abstract: Intractable generative models are models for which the likelihood is unavailable but sampling is possible. Most approaches to parameter inference in this setting require the computation of some discrepancy between the data and the generative model. This is for example the case for minimum distance estimation and approximate Bayesian computation. These approaches require sampling a high number of r…

    Submitted 4 July, 2022; v1 submitted 22 June, 2021; originally announced June 2021.

    Comments: Assumptions were weakened and a further numerical experiment was added