Showing 1–50 of 1,190 results for author: Shi, Z

  1. arXiv:2409.06577  [pdf, other]

    eess.SP

    Compressed Sensing based Detection Schemes for Differential Spatial Modulation in Visible Light Communication Systems

    Authors: Zichun Shi, Pu Miao, Peng Chen, Lei Xue, Li-Yang Zheng, Laiyuan Wang, Gaojie Chen

    Abstract: Differential spatial modulation (DSM) exploits the time dimension to facilitate differential modulation, which can perfectly avoid the challenge of acquiring heavily entangled channel state information in visible light communication (VLC) systems. However, it has a huge search space and high complexity for a large number of transmitters. In this paper, a novel vector correction (VC)-based orthog…

    Submitted 10 September, 2024; originally announced September 2024.

    Comments: This paper has been accepted by 2024 IEEE 24th International Conference on Communication Technology (ICCT 2024)

  2. arXiv:2409.06197  [pdf, other]

    cs.CV

    UdeerLID+: Integrating LiDAR, Image, and Relative Depth with Semi-Supervised

    Authors: Tao Ni, Xin Zhan, Tao Luo, Wenbin Liu, Zhan Shi, JunBo Chen

    Abstract: Road segmentation is a critical task for autonomous driving systems, requiring accurate and robust methods to classify road surfaces from various environmental data. Our work introduces an innovative approach that integrates LiDAR point cloud data, visual images, and relative depth maps derived from images. The integration of multiple data sources in road segmentation presents both opportunities an…

    Submitted 9 September, 2024; originally announced September 2024.

  3. arXiv:2409.03319  [pdf, other]

    cs.ET

    Semantic Communication for Efficient Point Cloud Transmission

    Authors: Shangzhuo Xie, Qianqian Yang, Yuyi Sun, Tianxiao Han, Zhaohui Yang, Zhiguo Shi

    Abstract: As three-dimensional acquisition technologies like LiDAR cameras advance, the need for efficient transmission of 3D point clouds is becoming increasingly important. In this paper, we present a novel semantic communication (SemCom) approach for efficient 3D point cloud transmission. Different from existing methods that rely on downsampling and feature extraction for compression, our approach utiliz…

    Submitted 5 September, 2024; originally announced September 2024.

  4. arXiv:2409.02008  [pdf, other]

    cs.NI cs.AI cs.DC

    When Digital Twin Meets 6G: Concepts, Obstacles, and Research Prospects

    Authors: Wenshuai Liu, Yaru Fu, Zheng Shi, Hong Wang

    Abstract: The convergence of digital twin technology and the emerging 6G network presents both challenges and numerous research opportunities. This article explores the potential synergies between digital twin and 6G, highlighting the key challenges and proposing fundamental principles for their integration. We discuss the unique requirements and capabilities of digital twin in the context of 6G networks, s…

    Submitted 3 September, 2024; originally announced September 2024.

    Comments: 7 pages, 6 figures

  5. arXiv:2409.01048  [pdf, ps, other]

    math.PR

    Boundedness of discounted tree sums

    Authors: Elie Aïdékon, Yueyun Hu, Zhan Shi

    Abstract: Let $(V(u),\, u\in \mathcal{T})$ be a (supercritical) branching random walk and $(η_u,\,u\in \mathcal{T})$ be marks on the vertices of the tree, distributed in an i.i.d. fashion. Following Aldous and Bandyopadhyay [AB05], for each infinite ray $ξ$ of the tree, we associate the discounted tree sum $D(ξ)$, which is the sum of the $e^{-V(u)}η_u$ taken along the ray. The paper deals with…

    Submitted 2 September, 2024; originally announced September 2024.

  6. arXiv:2409.01035  [pdf, other]

    cs.CL cs.CV cs.LG

    Unleashing the Power of Task-Specific Directions in Parameter Efficient Fine-tuning

    Authors: Chongjie Si, Zhiyi Shi, Shifan Zhang, Xiaokang Yang, Hanspeter Pfister, Wei Shen

    Abstract: Large language models demonstrate impressive performance on downstream tasks, yet require extensive resources when all parameters are fully fine-tuned. To mitigate this, Parameter Efficient Fine-Tuning (PEFT) strategies, such as LoRA, have been developed. In this paper, we delve into the concept of task-specific directions--critical for transitioning large models from pre-trained states…

    Submitted 2 September, 2024; originally announced September 2024.

    Comments: Revisions ongoing. Code at https://github.com/Chongjie-Si/Subspace-Tuning

  7. arXiv:2409.00130  [pdf]

    eess.SP cs.AI cs.LG

    Mirror contrastive loss based sliding window transformer for subject-independent motor imagery based EEG signal recognition

    Authors: Jing Luo, Qi Mao, Weiwei Shi, Zhenghao Shi, Xiaofan Wang, Xiaofeng Lu, Xinhong Hei

    Abstract: While deep learning models have been extensively utilized in motor imagery based EEG signal recognition, they often operate as black boxes. Motivated by neurological findings indicating that the mental imagery of left or right-hand movement induces event-related desynchronization (ERD) in the contralateral sensorimotor area of the brain, we propose a Mirror Contrastive Loss based Sliding Window Tr…

    Submitted 29 August, 2024; originally announced September 2024.

    Comments: This paper has been accepted by the Fourth International Workshop on Human Brain and Artificial Intelligence, joint workshop of the 33rd International Joint Conference on Artificial Intelligence, Jeju Island, South Korea, from August 3rd to August 9th, 2024

  8. arXiv:2408.16634  [pdf, other]

    cs.CY cs.AI cs.CR

    RLCP: A Reinforcement Learning-based Copyright Protection Method for Text-to-Image Diffusion Model

    Authors: Zhuan Shi, Jing Yan, Xiaoli Tang, Lingjuan Lyu, Boi Faltings

    Abstract: The increasing sophistication of text-to-image generative models has led to complex challenges in defining and enforcing copyright infringement criteria and protection. Existing methods, such as watermarking and dataset deduplication, fail to provide comprehensive solutions due to the lack of standardized metrics and the inherent complexity of addressing copyright infringement in diffusion models.…

    Submitted 2 September, 2024; v1 submitted 29 August, 2024; originally announced August 2024.

    Comments: arXiv admin note: text overlap with arXiv:2403.12052 by other authors

  9. arXiv:2408.16243  [pdf, ps, other]

    math.NA

    Error analysis of finite element method for nonlocal diffusion model

    Authors: Zuoqiang Shi

    Abstract: We analyze the error of the finite element method for the nonlocal diffusion model, including both conformal and nonconformal methods. We also consider meshes with and without shape regularity. For a shape-regular mesh, the finite element method for the nonlocal diffusion model is asymptotic preserving and the error is $O(h^k+δ)$. For a shape-irregular mesh, the error becomes $O(\frac{h^{k+1}}{δ}+δ)$.

    Submitted 2 September, 2024; v1 submitted 28 August, 2024; originally announced August 2024.

  10. arXiv:2408.15821  [pdf, other]

    gr-qc

    Higher-dimensional quantum Oppenheimer-Snyder model

    Authors: Zijian Shi, Xiangdong Zhang, Yongge Ma

    Abstract: The quantum Oppenheimer-Snyder model for higher-dimensional spacetimes is studied. The higher-dimensional quantum-corrected Schwarzschild black hole is obtained by the junction condition. It turns out that quantum bounces always occur in the collapse, so that the classical gravitational collapse singularities are avoided. The scalar perturbations upon the quantum-corrected black holes are also st…

    Submitted 28 August, 2024; originally announced August 2024.

    Comments: 15 pages, 15 figures

  11. arXiv:2408.15472  [pdf, other]

    math.NA

    On the implementation of linear finite element method for nonlocal diffusion model over 2D domain

    Authors: Zuoqiang Shi

    Abstract: We propose an implementation of the linear finite element method for the nonlocal diffusion problem in 2D space. In the implementation, we reduce the integral from 4D to 2D, which simplifies the computation significantly.

    Submitted 27 August, 2024; originally announced August 2024.

  12. Pulsar Population Synthesis with Magnetorotational Evolution: Constraining the Decay of Magnetic field

    Authors: Zhihong Shi, C. -Y. Ng

    Abstract: We present a population synthesis model for normal radio pulsars in the Galaxy incorporating the latest developments in the field and the magnetorotational evolution processes. Our model considers spin-down with a force-free magnetosphere and the decay of the magnetic field strength and its inclination angle. The simulated pulsar population is fit to a large observation sample that covers the majo…

    Submitted 3 September, 2024; v1 submitted 27 August, 2024; originally announced August 2024.

    Comments: 17 pages, 10 figures. Published by APJ

  13. arXiv:2408.14475  [pdf, other]

    cs.OH cs.RO

    Crowdsense Roadside Parking Spaces with Dynamic Gap Reduction Algorithm

    Authors: Wenjun Zheng, Zhan Shi, Qianyu Ou, Ruizhi Liao

    Abstract: In the context of smart city development, mobile sensing emerges as a cost-effective alternative to fixed sensing for on-street parking detection. However, its practicality is often challenged by the inherent accuracy limitations arising from detection intervals. This paper introduces a novel Dynamic Gap Reduction Algorithm (DGRA), which is a crowdsensing-based approach aimed at addressing this qu…

    Submitted 10 August, 2024; originally announced August 2024.

  14. arXiv:2408.13233  [pdf, ps, other]

    cs.LG cs.AI cs.CL

    Multi-Layer Transformers Gradient Can be Approximated in Almost Linear Time

    Authors: Yingyu Liang, Zhizhou Sha, Zhenmei Shi, Zhao Song, Yufa Zhou

    Abstract: The quadratic computational complexity in the self-attention mechanism of popular transformer architectures poses significant challenges for training and inference, particularly in terms of efficiency and memory requirements. Towards addressing these challenges, this paper introduces a novel fast computation method for gradient calculation in multi-layer transformer models. Our approach enables th…

    Submitted 23 August, 2024; originally announced August 2024.

  15. arXiv:2408.12317  [pdf, other]

    cs.CV

    Adapt CLIP as Aggregation Instructor for Image Dehazing

    Authors: Xiaozhe Zhang, Fengying Xie, Haidong Ding, Linpeng Pan, Zhenwei Shi

    Abstract: Most dehazing methods suffer from a limited receptive field and do not explore the rich semantic prior encapsulated in vision-language models, which have proven effective in downstream tasks. In this paper, we introduce CLIPHaze, a pioneering hybrid framework that synergizes the efficient global modeling of Mamba with the prior knowledge and zero-shot capabilities of CLIP to address both issues simu…

    Submitted 22 August, 2024; originally announced August 2024.

    Comments: 12 pages, 6 figures

  16. arXiv:2408.12151  [pdf, ps, other]

    cs.DS cs.AI cs.CL cs.LG

    A Tighter Complexity Analysis of SparseGPT

    Authors: Xiaoyu Li, Yingyu Liang, Zhenmei Shi, Zhao Song

    Abstract: In this work, we improved the analysis of the running time of SparseGPT [Frantar, Alistarh ICML 2023] from $O(d^{3})$ to $O(d^ω + d^{2+a+o(1)} + d^{1+ω(1,1,a)-a})$ for any $a \in [0, 1]$, where $ω$ is the exponent of matrix multiplication. In particular, for the current $ω\approx 2.371$ [Alman, Duan, Williams, Xu, Xu, Zhou 2024], our running times boil down to $O(d^{2.53})$. This running time is d…

    Submitted 22 August, 2024; originally announced August 2024.

  17. arXiv:2408.10854  [pdf, other]

    physics.ao-ph cs.AI cs.CV

    MambaDS: Near-Surface Meteorological Field Downscaling with Topography Constrained Selective State Space Modeling

    Authors: Zili Liu, Hao Chen, Lei Bai, Wenyuan Li, Wanli Ouyang, Zhengxia Zou, Zhenwei Shi

    Abstract: In an era of frequent extreme weather and global warming, obtaining precise, fine-grained near-surface weather forecasts is increasingly essential for human activities. Downscaling (DS), a crucial task in meteorological forecasting, enables the reconstruction of high-resolution meteorological states for target regions from global-scale forecast results. Previous downscaling methods, inspired by CN…

    Submitted 20 August, 2024; originally announced August 2024.

  18. arXiv:2408.09723  [pdf, other]

    cs.LG

    sTransformer: A Modular Approach for Extracting Inter-Sequential and Temporal Information for Time-Series Forecasting

    Authors: Jiaheng Yin, Zhengxin Shi, Jianshen Zhang, Xiaomin Lin, Yulin Huang, Yongzhi Qi, Wei Qi

    Abstract: In recent years, numerous Transformer-based models have been applied to long-term time-series forecasting (LTSF) tasks. However, recent studies with linear models have questioned their effectiveness, demonstrating that simple linear layers can outperform sophisticated Transformer-based models. In this work, we review and categorize existing Transformer-based models into two main types: (1) modific…

    Submitted 19 August, 2024; originally announced August 2024.

  19. arXiv:2408.09220  [pdf, other]

    cs.CV cs.AI

    Flatten: Video Action Recognition is an Image Classification task

    Authors: Junlin Chen, Chengcheng Xu, Yangfan Xu, Jian Yang, Jun Li, Zhiping Shi

    Abstract: In recent years, video action recognition, as a fundamental task in the field of video understanding, has been deeply explored by numerous researchers. Most traditional video action recognition methods typically involve converting videos into three-dimensional data that encapsulates both spatial and temporal information, subsequently leveraging prevalent image understanding models to model and anal…

    Submitted 17 August, 2024; originally announced August 2024.

    Comments: 13 pages, 6 figures

  20. arXiv:2408.09064  [pdf, other]

    cs.CV cs.LG

    MoRA: LoRA Guided Multi-Modal Disease Diagnosis with Missing Modality

    Authors: Zhiyi Shi, Junsik Kim, Wanhua Li, Yicong Li, Hanspeter Pfister

    Abstract: Multi-modal pre-trained models efficiently extract and fuse features from different modalities with low memory requirements for fine-tuning. Despite this efficiency, their application in disease diagnosis is under-explored. A significant challenge is the frequent occurrence of missing modalities, which impairs performance. Additionally, fine-tuning the entire pre-trained model demands substantial…

    Submitted 16 August, 2024; originally announced August 2024.

    Comments: Accepted by MICCAI 2024

  21. arXiv:2408.08500  [pdf, other]

    cs.CV

    CoSEC: A Coaxial Stereo Event Camera Dataset for Autonomous Driving

    Authors: Shihan Peng, Hanyu Zhou, Hao Dong, Zhiwei Shi, Haoyue Liu, Yuxing Duan, Yi Chang, Luxin Yan

    Abstract: The conventional frame camera is the mainstream sensor for autonomous driving scene perception, but it is limited in adverse conditions, such as low light. Event cameras with high dynamic range have been applied to assist frame cameras in multimodal fusion, which relies heavily on pixel-level spatial alignment between modalities. Typically, existing multimodal datasets mainly pla…

    Submitted 15 August, 2024; originally announced August 2024.

    Comments: This work has been submitted to the IEEE for possible publication

  22. arXiv:2408.07321  [pdf, other]

    cs.SE cs.CR

    LLM-Enhanced Static Analysis for Precise Identification of Vulnerable OSS Versions

    Authors: Yiran Cheng, Lwin Khin Shar, Ting Zhang, Shouguo Yang, Chaopeng Dong, David Lo, Shichao Lv, Zhiqiang Shi, Limin Sun

    Abstract: Open-source software (OSS) has experienced a surge in popularity, attributed to its collaborative development model and cost-effective nature. However, the adoption of specific software versions in development projects may introduce security risks when these versions bring along vulnerabilities. Current methods of identifying vulnerable versions typically analyze and trace the code involved in vul…

    Submitted 14 August, 2024; originally announced August 2024.

  23. arXiv:2408.06604  [pdf, other]

    cs.CV

    MV-DETR: Multi-modality indoor object detection by Multi-View DEtecton TRansformers

    Authors: Zichao Dong, Yilin Zhang, Xufeng Huang, Hang Ji, Zhan Shi, Xin Zhan, Junbo Chen

    Abstract: We introduce a novel MV-DETR pipeline, an effective and efficient transformer-based detection method. Given input RGBD data, we notice that there are very strong pretrained weights for RGB data but less effective work for depth-related data. First and foremost, we argue that geometry and texture cues are both of vital importance but can be encoded separately. Secondly, we find tha…

    Submitted 12 August, 2024; originally announced August 2024.

  24. arXiv:2408.06395  [pdf, ps, other]

    cs.DS cs.CR cs.LG

    Fast John Ellipsoid Computation with Differential Privacy Optimization

    Authors: Jiuxiang Gu, Xiaoyu Li, Yingyu Liang, Zhenmei Shi, Zhao Song, Junwei Yu

    Abstract: Determining the John ellipsoid - the largest volume ellipsoid contained within a convex polytope - is a fundamental problem with applications in machine learning, optimization, and data analytics. Recent work has developed fast algorithms for approximating the John ellipsoid using sketching and leverage score sampling techniques. However, these algorithms do not provide privacy guarantees for sens…

    Submitted 11 August, 2024; originally announced August 2024.

  25. arXiv:2408.05723  [pdf, other]

    cs.LG cs.CR cs.CV

    Deep Learning with Data Privacy via Residual Perturbation

    Authors: Wenqi Tao, Huaming Ling, Zuoqiang Shi, Bao Wang

    Abstract: Protecting data privacy in deep learning (DL) is of crucial importance. Several celebrated privacy notions have been established and used for privacy-preserving DL. However, many existing mechanisms achieve privacy at the cost of significant utility degradation and computational overhead. In this paper, we propose a stochastic differential equation-based residual perturbation for privacy-preservin…

    Submitted 11 August, 2024; originally announced August 2024.

  26. arXiv:2408.05707  [pdf, other]

    cs.LG

    Fast and Scalable Semi-Supervised Learning for Multi-View Subspace Clustering

    Authors: Huaming Ling, Chenglong Bao, Jiebo Song, Zuoqiang Shi

    Abstract: In this paper, we introduce a Fast and Scalable Semi-supervised Multi-view Subspace Clustering (FSSMSC) method, a novel solution to the high computational complexity commonly found in existing approaches. FSSMSC features linear computational and space complexity relative to the size of the data. The method generates a consensus anchor graph across all views, representing each data point as a spars…

    Submitted 11 August, 2024; originally announced August 2024.

    Comments: 40 pages, 7 figures

  27. arXiv:2408.05645  [pdf]

    eess.IV cs.CV cs.LG

    BeyondCT: A deep learning model for predicting pulmonary function from chest CT scans

    Authors: Kaiwen Geng, Zhiyi Shi, Xiaoyan Zhao, Alaa Ali, Jing Wang, Joseph Leader, Jiantao Pu

    Abstract: Background: Pulmonary function tests (PFTs) and computed tomography (CT) imaging are vital in diagnosing, managing, and monitoring lung diseases. A common issue in practice is the lack of access to recorded pulmonary functions despite available chest CT scans. Purpose: To develop and validate a deep learning algorithm for predicting pulmonary function directly from chest CT scans. M…

    Submitted 10 August, 2024; originally announced August 2024.

    Comments: 5 tables, 7 figures, 22 pages

  28. arXiv:2408.05419  [pdf, other]

    cs.LG

    Interface Laplace Learning: Learnable Interface Term Helps Semi-Supervised Learning

    Authors: Tangjun Wang, Chenglong Bao, Zuoqiang Shi

    Abstract: We introduce a novel framework, called Interface Laplace learning, for graph-based semi-supervised learning. Motivated by the observation that an interface should exist between different classes where the function value is non-smooth, we introduce a Laplace learning model that incorporates an interface term. This model challenges the long-standing assumption that functions are smooth at all unlabe…

    Submitted 9 August, 2024; originally announced August 2024.

  29. arXiv:2408.04499  [pdf, other]

    cs.LG

    Knowledge-Aided Semantic Communication Leveraging Probabilistic Graphical Modeling

    Authors: Haowen Wan, Qianqian Yang, Jiancheng Tang, Zhiguo Shi

    Abstract: In this paper, we propose a semantic communication approach based on a probabilistic graphical model (PGM). The proposed approach involves constructing a PGM from a training dataset, which is then shared as common knowledge between the transmitter and receiver. We evaluate the importance of various semantic features and present a PGM-based compression algorithm designed to eliminate predictable port…

    Submitted 8 August, 2024; originally announced August 2024.

  30. arXiv:2408.03815  [pdf, other]

    cond-mat.quant-gas quant-ph

    Dissipation Driven Coherent Dynamics Observed in Bose-Einstein Condensates

    Authors: Ye Tian, Yajuan Zhao, Yue Wu, Jilai Ye, Shuyao Mei, Zhihao Chi, Tian Tian, Ce Wang, Zhe-Yu Shi, Yu Chen, Jiazhong Hu, Hui Zhai, Wenlan Chen

    Abstract: We report the first experimental observation of dissipation-driven coherent quantum many-body oscillation, and this oscillation is manifested as the coherent exchange of atoms between the thermal and the condensate components in a three-dimensional partially condensed Bose gas. Firstly, we observe that the dissipation leads to two different atom loss rates between the thermal and the condensate co…

    Submitted 7 August, 2024; originally announced August 2024.

    Comments: 11 pages, 5 figures, 1 table

  31. arXiv:2408.03075  [pdf]

    astro-ph.EP physics.space-ph

    Characterizing the current systems in the Martian ionosphere

    Authors: Jiawei Gao, Shibang Li, Anna Mittelholz, Zhaojin Rong, Moa Persson, Zhen Shi, Haoyu Lu, Chi Zhang, Xiaodong Wang, Chuanfei Dong, Lucy Klinger, Jun Cui, Yong Wei, Yongxin Pan

    Abstract: When the solar wind interacts with the ionosphere of an unmagnetized planet, it induces currents that form an induced magnetosphere. These currents and their associated magnetic fields play a pivotal role in controlling the movement of charged particles, which is essential for understanding the escape of planetary ions. Unlike the well-documented magnetospheric current systems, the ionospheric cur…

    Submitted 6 August, 2024; originally announced August 2024.

    Comments: 20 pages, 6 figures

  32. arXiv:2408.02780  [pdf]

    cs.CV

    LR-Net: A Lightweight and Robust Network for Infrared Small Target Detection

    Authors: Chuang Yu, Yunpeng Liu, Jinmiao Zhao, Zelin Shi

    Abstract: Constrained by equipment limitations and the lack of target intrinsic features, existing infrared small target detection methods have difficulty meeting actual comprehensive performance requirements. Therefore, we propose an innovative lightweight and robust network (LR-Net), which abandons the complex structure and achieves an effective balance between detection accuracy and resource consumption. Spe…

    Submitted 5 August, 2024; originally announced August 2024.

  33. arXiv:2408.02773  [pdf]

    cs.CV

    Refined Infrared Small Target Detection Scheme with Single-Point Supervision

    Authors: Jinmiao Zhao, Zelin Shi, Chuang Yu, Yunpeng Liu

    Abstract: Recently, infrared small target detection with single-point supervision has attracted extensive attention. However, the detection accuracy of existing methods has difficulty meeting actual needs. Therefore, we propose an innovative refined infrared small target detection scheme with single-point supervision, which has excellent segmentation accuracy and detection rate. Specifically, we introduce l…

    Submitted 5 August, 2024; originally announced August 2024.

  34. arXiv:2408.02095  [pdf, other]

    cs.IT eess.SP

    Secure Semantic Communications: From Perspective of Physical Layer Security

    Authors: Yongkang Li, Zheng Shi, Han Hu, Yaru Fu, Hong Wang, Hongjiang Lei

    Abstract: Semantic communications have been envisioned as a potential technique that goes beyond the Shannon paradigm. Unlike modern communications that provide bit-level security, eavesdropping on semantic communications poses a significant risk of exposing the intention of the legitimate user. To address this challenge, a novel deep neural network (DNN) enabled secure semantic communication (DeepSSC)…

    Submitted 4 August, 2024; originally announced August 2024.

  35. arXiv:2408.01291  [pdf, other]

    cs.CV

    TexGen: Text-Guided 3D Texture Generation with Multi-view Sampling and Resampling

    Authors: Dong Huo, Zixin Guo, Xinxin Zuo, Zhihao Shi, Juwei Lu, Peng Dai, Songcen Xu, Li Cheng, Yee-Hong Yang

    Abstract: Given a 3D mesh, we aim to synthesize 3D textures that correspond to arbitrary textual descriptions. Current methods for generating and assembling textures from sampled views often result in prominent seams or excessive smoothing. To tackle these issues, we present TexGen, a novel multi-view sampling and resampling framework for texture generation leveraging a pre-trained text-to-image diffusion m…

    Submitted 2 August, 2024; originally announced August 2024.

    Comments: European Conference on Computer Vision (ECCV) 2024

  36. arXiv:2408.00234  [pdf, other]

    cond-mat.supr-con

    Superconductive Sodalite-like Clathrate Hydrides MXH$_{12}$ with Critical Temperatures of near 300 K under Pressures

    Authors: Yuxiang Fan, Bin Li, Cong Zhu, Jie Cheng, Shengli Liu, Zhixiang Shi

    Abstract: We designed and investigated a series of ternary hydride compounds MXH$_{12}$ crystallizing in the cubic $Pm\overline{3}m$ structure as potential rare-earth and alkaline-earth superconductors. First-principles calculations were performed on these prospective superconductors across the pressure range of 50-200 GPa, revealing their electronic band structures, phonon dispersions, electron-phonon inte…

    Submitted 31 July, 2024; originally announced August 2024.

    Journal ref: Phys. Status Solidi B 2400240 (2024)

  37. arXiv:2407.21346  [pdf, other]

    math-ph math.OC

    A network based approach for unbalanced optimal transport on surfaces

    Authors: Jiangong Pan, Wei Wan, Yuejin Zhang, Chenlong Bao, Zuoqiang Shi

    Abstract: In this paper, we present a neural network approach to address the dynamic unbalanced optimal transport problem on surfaces with point cloud representation. For surfaces with point cloud representation, traditional methods are difficult to apply because mesh generation is hard. Neural networks are easy to implement even for complicated geometry. Moreover, instead of solving the original dynami…

    Submitted 31 July, 2024; originally announced July 2024.

    Comments: 24 pages, 11 figures, 7 tables

    MSC Class: 65K10; 68T05; 68T07

  38. arXiv:2407.20518  [pdf, other]

    eess.IV cs.AI cs.CV

    High-Resolution Spatial Transcriptomics from Histology Images using HisToSGE

    Authors: Zhiceng Shi, Shuailin Xue, Fangfang Zhu, Wenwen Min

    Abstract: Spatial transcriptomics (ST) is a groundbreaking genomic technology that enables spatial localization analysis of gene expression within tissue sections. However, it is significantly limited by high costs and sparse spatial resolution. An alternative, more cost-effective strategy is to use deep learning methods to predict high-density gene expression profiles from histological images. However, exi…

    Submitted 29 July, 2024; originally announced July 2024.

  39. arXiv:2407.20090  [pdf]

    cs.CV

    Infrared Small Target Detection based on Adjustable Sensitivity Strategy and Multi-Scale Fusion

    Authors: Jinmiao Zhao, Zelin Shi, Chuang Yu, Yunpeng Liu

    Abstract: Recently, deep learning-based single-frame infrared small target (SIRST) detection technology has made significant progress. However, existing infrared small target detection methods are often optimized for a fixed image resolution, a single wavelength, or a specific imaging system, limiting their breadth and flexibility in practical applications. Therefore, we propose a refined infrared small tar…

    Submitted 29 July, 2024; originally announced July 2024.

  40. arXiv:2407.19813  [pdf, other]

    cs.CL cs.AI

    Improving Retrieval Augmented Language Model with Self-Reasoning

    Authors: Yuan Xia, Jingbo Zhou, Zhenhui Shi, Jun Chen, Haifeng Huang

    Abstract: The Retrieval-Augmented Language Model (RALM) has shown remarkable performance on knowledge-intensive tasks by incorporating external knowledge during inference, which mitigates the factual hallucinations inherited in large language models (LLMs). Despite these advancements, challenges persist in the implementation of RALMs, particularly concerning their reliability and traceability. To be specifi…

    Submitted 2 August, 2024; v1 submitted 29 July, 2024; originally announced July 2024.

  41. arXiv:2407.19555  [pdf]

    cond-mat.str-el cond-mat.supr-con

    Crystal-symmetry-paired spin-valley locking in a layered room-temperature antiferromagnet

    Authors: Fayuan Zhang, Xingkai Cheng, Zhouyi Yin, Changchao Liu, Liwei Deng, Yuxi Qiao, Zheng Shi, Shuxuan Zhang, Junhao Lin, Zhengtai Liu, Mao Ye, Yaobo Huang, Xiangyu Meng, Cheng Zhang, Taichi Okuda, Kenya Shimada, Shengtao Cui, Yue Zhao, Guang-Han Cao, Shan Qiao, Junwei Liu, Chaoyu Chen

    Abstract: Recent theoretical efforts predicted a type of unconventional antiferromagnet characterized by the crystal symmetry C (rotation or mirror), which connects antiferromagnetic sublattices in real space and simultaneously couples spin and momentum in reciprocal space. This results in a unique C-paired spin-valley locking (SVL) and corresponding novel properties such as piezomagnetism and noncollinear…

    Submitted 2 August, 2024; v1 submitted 28 July, 2024; originally announced July 2024.

    Comments: 22 pages, 5 figures

  42. arXiv:2407.18172  [pdf]

    physics.optics

    Chip-scale sensor for spectroscopic metrology

    Authors: Chunhui Yao, Wanlu Zhang, Peng Bao, Jie Ma, Wei Zhuo, Minjia Chen, Zhitian Shi, Jingwen Zhou, Yuxiao Ye, Liang Ming, Ting Yan, Richard Penty, Qixiang Cheng

    Abstract: Miniaturized spectrometers hold great promise for in situ, in vitro, and even in vivo sensing applications. However, their size reduction imposes vital performance constraints in meeting the rigorous demands of spectroscopy, including fine resolution, high accuracy, and ultra-wide observation window. The prevailing view in the community holds that miniaturized spectrometers are most suitable for t…

    Submitted 12 August, 2024; v1 submitted 25 July, 2024; originally announced July 2024.

  43. arXiv:2407.17902  [pdf, other]

    eess.AS

    Multi-Stage Face-Voice Association Learning with Keynote Speaker Diarization

    Authors: Ruijie Tao, Zhan Shi, Yidi Jiang, Duc-Tuan Truong, Eng-Siong Chng, Massimo Alioto, Haizhou Li

    Abstract: The human brain has the capability to associate the unknown person's voice and face by leveraging their general relationship, referred to as "cross-modal speaker verification". This task poses significant challenges due to the complex relationship between the modalities. In this paper, we propose a "Multi-stage Face-voice Association Learning with Keynote Speaker Diarization" (MFV-KSD) framewo…

    Submitted 25 July, 2024; originally announced July 2024.

  44. arXiv:2407.15720  [pdf, other]

    cs.CL cs.AI cs.LG

    Do Large Language Models Have Compositional Ability? An Investigation into Limitations and Scalability

    Authors: Zhuoyan Xu, Zhenmei Shi, Yingyu Liang

    Abstract: Large language models (LLMs) have emerged as powerful tools for many AI problems and exhibit remarkable in-context learning (ICL) capabilities. Compositional ability, solving unseen complex tasks that combine two or more simple tasks, is an essential reasoning ability for Artificial General Intelligence. Despite the tremendous success of LLMs, how they approach composite tasks, especially those no…

    Submitted 11 August, 2024; v1 submitted 22 July, 2024; originally announced July 2024.

  45. Learning at a Glance: Towards Interpretable Data-limited Continual Semantic Segmentation via Semantic-Invariance Modelling

    Authors: Bo Yuan, Danpei Zhao, Zhenwei Shi

    Abstract: Continual semantic segmentation (CSS) based on incremental learning (IL) is a great endeavour in developing human-like segmentation models. However, current CSS approaches encounter challenges in the trade-off between preserving old knowledge and learning new ones, where they still need large-scale annotated data for incremental training and lack interpretability. In this paper, we present Learnin…

    Submitted 22 July, 2024; originally announced July 2024.

  46. arXiv:2407.15317  [pdf, other]

    cs.CV

    Open-CD: A Comprehensive Toolbox for Change Detection

    Authors: Kaiyu Li, Jiawei Jiang, Andrea Codegoni, Chengxi Han, Yupeng Deng, Keyan Chen, Zhuo Zheng, Hao Chen, Zhengxia Zou, Zhenwei Shi, Sheng Fang, Deyu Meng, Zhi Wang, Xiangyong Cao

    Abstract: We present Open-CD, a change detection toolbox that contains a rich set of change detection methods as well as related components and modules. The toolbox started from a series of open source general vision task tools, including OpenMMLab Toolkits, PyTorch Image Models, etc. It gradually evolves into a unified platform that covers many popular change detection methods and contemporary modules. It…

    Submitted 21 July, 2024; originally announced July 2024.

    Comments: 9 pages

  47. arXiv:2407.15162  [pdf, other]

    math.PR

    Random walk on dynamical percolation in Euclidean lattices: separating critical and supercritical regimes

    Authors: Chenlin Gu, Jianping Jiang, Yuval Peres, Zhan Shi, Hao Wu, Fan Yang

    Abstract: We study the random walk on dynamical percolation of $\mathbb{Z}^d$ (resp., the two-dimensional triangular lattice $\mathcal{T}$), where each edge (resp., each site) can be either open or closed, refreshing its status at rate $\mu\in (0,1/e]$. The random walk moves along open edges in $\mathbb{Z}^d$ (resp., open sites in $\mathcal{T}$) at rate $1$. For the critical regime $p=p_c$, we prove the follo…

    Submitted 1 August, 2024; v1 submitted 21 July, 2024; originally announced July 2024.

    Comments: 23 pages, 1 figure; minor revision

    MSC Class: 60K35; 60K37

  48. arXiv:2407.15079  [pdf, other]

    math.PR

    Speed of random walk on dynamical percolation in nonamenable transitive graphs

    Authors: Chenlin Gu, Jianping Jiang, Yuval Peres, Zhan Shi, Hao Wu, Fan Yang

    Abstract: Let $G$ be a nonamenable transitive unimodular graph. In dynamical percolation, every edge in $G$ refreshes its status at rate $\mu>0$, and following the refresh, each edge is open independently with probability $p$. The random walk traverses $G$ only along open edges, moving at rate $1$. In the critical regime $p=p_c$, we prove that the speed of the random walk is at most $O(\sqrt{\mu\log(1/\mu)})$, pr…

    Submitted 21 July, 2024; originally announced July 2024.

    Comments: 29 pages, 1 figure

  49. arXiv:2407.14717  [pdf, other]

    cs.LG cs.AI cs.CR

    Differential Privacy of Cross-Attention with Provable Guarantee

    Authors: Jiuxiang Gu, Yingyu Liang, Zhenmei Shi, Zhao Song, Yufa Zhou

    Abstract: Cross-attention has become a fundamental module in many important artificial intelligence applications, e.g., retrieval-augmented generation (RAG), system prompt, guided stable diffusion, and so on. Ensuring cross-attention privacy is crucial and urgently needed because its key and value matrices may contain sensitive information about companies and their users, many of which profit…

    Submitted 19 July, 2024; originally announced July 2024.

  50. arXiv:2407.14032  [pdf, other]

    cs.CV

    Semantic-CC: Boosting Remote Sensing Image Change Captioning via Foundational Knowledge and Semantic Guidance

    Authors: Yongshuo Zhu, Lu Li, Keyan Chen, Chenyang Liu, Fugen Zhou, Zhenwei Shi

    Abstract: Remote sensing image change captioning (RSICC) aims to articulate the changes in objects of interest within bi-temporal remote sensing images using natural language. Given the limitations of current RSICC methods in expressing general features across multi-temporal and spatial scenarios, and their deficiency in providing granular, robust, and precise change descriptions, we introduce a novel chang…

    Submitted 19 July, 2024; originally announced July 2024.