Showing 1–50 of 1,310 results for author: Zhou, X

  1. arXiv:2408.10841  [pdf, other]

    cs.AI cs.CL

    DELIA: Diversity-Enhanced Learning for Instruction Adaptation in Large Language Models

    Authors: Yuanhao Zeng, Fei Ren, Xinpeng Zhou, Yihang Wang, Yingxia Shao

    Abstract: Although instruction tuning is widely used to adjust behavior in Large Language Models (LLMs), extensive empirical evidence and research indicates that it is primarily a process where the model fits to specific task formats, rather than acquiring new knowledge or capabilities. We propose that this limitation stems from biased features learned during instruction tuning, which differ from ideal task…

    Submitted 19 August, 2024; originally announced August 2024.

    Comments: 8 pages, 5 figures

  2. arXiv:2408.10501  [pdf, other]

    cs.IT eess.SP

    Generative Diffusion Models for High Dimensional Channel Estimation

    Authors: Xingyu Zhou, Le Liang, Jing Zhang, Peiwen Jiang, Yong Li, Shi Jin

    Abstract: Along with the prosperity of generative artificial intelligence (AI), its potential for solving conventional challenges in wireless communications has also surfaced. Inspired by this trend, we investigate the application of the advanced diffusion models (DMs), a representative class of generative AI models, to high dimensional wireless channel estimation. By capturing the structure of multiple-inp…

    Submitted 19 August, 2024; originally announced August 2024.

    Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

  3. arXiv:2408.10453  [pdf, other]

    cs.CV cs.GR cs.MM

    Kubrick: Multimodal Agent Collaborations for Synthetic Video Generation

    Authors: Liu He, Yizhi Song, Hejun Huang, Daniel Aliaga, Xin Zhou

    Abstract: Text-to-video generation has been dominated by end-to-end diffusion-based or autoregressive models. On one hand, those novel models provide plausible versatility, but they are criticized for physical correctness, shading and illumination, camera motion, and temporal consistency. On the other hand, film industry relies on manually-edited Computer-Generated Imagery (CGI) using 3D modeling software.…

    Submitted 19 August, 2024; originally announced August 2024.

  4. arXiv:2408.10053  [pdf, other]

    cs.CL cs.CR

    Privacy Checklist: Privacy Violation Detection Grounding on Contextual Integrity Theory

    Authors: Haoran Li, Wei Fan, Yulin Chen, Jiayang Cheng, Tianshu Chu, Xuebing Zhou, Peizhao Hu, Yangqiu Song

    Abstract: Privacy research has attracted wide attention as individuals worry that their private data can be easily leaked during interactions with smart devices, social platforms, and AI applications. Computer science researchers, on the other hand, commonly study privacy issues through privacy attacks and defenses on segmented fields. Privacy research is conducted on various sub-fields, including Computer…

    Submitted 19 August, 2024; originally announced August 2024.

  5. arXiv:2408.09395  [pdf, other]

    cs.CV

    OU-CoViT: Copula-Enhanced Bi-Channel Multi-Task Vision Transformers with Dual Adaptation for OU-UWF Images

    Authors: Yang Li, Jianing Deng, Chong Zhong, Danjuan Yang, Meiyan Li, A. H. Welsh, Aiyi Liu, Xingtao Zhou, Catherine C. Liu, Bo Fu

    Abstract: Myopia screening using cutting-edge ultra-widefield (UWF) fundus imaging and joint modeling of multiple discrete and continuous clinical scores presents a promising new paradigm for multi-task problems in Ophthalmology. The bi-channel framework that arises from the Ophthalmic phenomenon of ``interocular asymmetries'' of both eyes (OU) calls for new employment on the SOTA transformer-based models.…

    Submitted 18 August, 2024; originally announced August 2024.

  6. arXiv:2408.09357  [pdf, other]

    cs.GR cs.AI cs.SD eess.AS

    Meta-Learning Empowered Meta-Face: Personalized Speaking Style Adaptation for Audio-Driven 3D Talking Face Animation

    Authors: Xukun Zhou, Fengxin Li, Ziqiao Peng, Kejian Wu, Jun He, Biao Qin, Zhaoxin Fan, Hongyan Liu

    Abstract: Audio-driven 3D face animation is increasingly vital in live streaming and augmented reality applications. While remarkable progress has been observed, most existing approaches are designed for specific individuals with predefined speaking styles, thus neglecting the adaptability to varied speaking styles. To address this limitation, this paper introduces MetaFace, a novel methodology meticulously…

    Submitted 18 August, 2024; originally announced August 2024.

  7. arXiv:2408.08669  [pdf, other]

    cs.SD eess.AS

    HSDreport: Heart Sound Diagnosis with Echocardiography Reports

    Authors: Zihan Zhao, Pingjie Wang, Liudan Zhao, Yuchen Yang, Ya Zhang, Kun Sun, Xin Sun, Xin Zhou, Yu Wang, Yanfeng Wang

    Abstract: Heart sound auscultation holds significant importance in the diagnosis of congenital heart disease. However, existing methods for Heart Sound Diagnosis (HSD) tasks are predominantly limited to a few fixed categories, framing the HSD task as a rigid classification problem that does not fully align with medical practice and offers only limited information to physicians. Besides, such methods do not…

    Submitted 16 August, 2024; originally announced August 2024.

  8. arXiv:2408.08003  [pdf, other]

    cs.CL

    Leveraging Web-Crawled Data for High-Quality Fine-Tuning

    Authors: Jing Zhou, Chenglin Jiang, Wei Shen, Xiao Zhou, Xiaonan He

    Abstract: Most large language models are fine-tuned using either expensive human-annotated data or GPT-4 generated data which cannot guarantee performance in certain domains. We argue that although the web-crawled data often has formatting errors causing semantic inaccuracies, it can still serve as a valuable source for high-quality supervised fine-tuning in specific domains without relying on advanced mode…

    Submitted 15 August, 2024; originally announced August 2024.

  9. arXiv:2408.06717  [pdf, other]

    cs.LG cs.AI

    Computation-friendly Graph Neural Network Design by Accumulating Knowledge on Large Language Models

    Authors: Jialiang Wang, Shimin Di, Hanmo Liu, Zhili Wang, Jiachuan Wang, Lei Chen, Xiaofang Zhou

    Abstract: Graph Neural Networks (GNNs), like other neural networks, have shown remarkable success but are hampered by the complexity of their architecture designs, which heavily depend on specific data and tasks. Traditionally, designing proper architectures involves trial and error, which requires intensive manual effort to optimize various components. To reduce human workload, researchers try to develop a…

    Submitted 13 August, 2024; originally announced August 2024.

  10. arXiv:2408.06027  [pdf, other]

    eess.SP cs.LG

    A Comprehensive Survey on EEG-Based Emotion Recognition: A Graph-Based Perspective

    Authors: Chenyu Liu, Xinliang Zhou, Yihao Wu, Yi Ding, Liming Zhai, Kun Wang, Ziyu Jia, Yang Liu

    Abstract: Compared to other modalities, electroencephalogram (EEG) based emotion recognition can intuitively respond to emotional patterns in the human brain and, therefore, has become one of the most focused tasks in affective computing. The nature of emotions is a physiological and psychological state change in response to brain region connectivity, making emotion recognition focus more on the dependency…

    Submitted 13 August, 2024; v1 submitted 12 August, 2024; originally announced August 2024.

  11. arXiv:2408.05905  [pdf, other]

    cs.CV cs.AI

    Weakly Supervised Video Anomaly Detection and Localization with Spatio-Temporal Prompts

    Authors: Peng Wu, Xuerong Zhou, Guansong Pang, Zhiwei Yang, Qingsen Yan, Peng Wang, Yanning Zhang

    Abstract: Current weakly supervised video anomaly detection (WSVAD) task aims to achieve frame-level anomalous event detection with only coarse video-level annotations available. Existing works typically involve extracting global features from full-resolution video frames and training frame-level classifiers to detect anomalies in the temporal dimension. However, most anomalous events tend to occur in local…

    Submitted 13 August, 2024; v1 submitted 11 August, 2024; originally announced August 2024.

    Comments: Accepted by ACMMM2024

  12. arXiv:2408.05748  [pdf, ps, other]

    cs.AI cs.LG

    Low-Dimensional Federated Knowledge Graph Embedding via Knowledge Distillation

    Authors: Xiaoxiong Zhang, Zhiwei Zeng, Xin Zhou, Zhiqi Shen

    Abstract: Federated Knowledge Graph Embedding (FKGE) aims to facilitate collaborative learning of entity and relation embeddings from distributed Knowledge Graphs (KGs) across multiple clients, while preserving data privacy. Training FKGE models with higher dimensions is typically favored due to their potential for achieving superior performance. However, high-dimensional embeddings present significant chal…

    Submitted 11 August, 2024; originally announced August 2024.

  13. arXiv:2408.05507  [pdf]

    cs.RO

    A Multimodal Soft Gripper with Variable Stiffness and Variable Gripping Range Based on MASH Actuator

    Authors: Dannuo Li, Xuanyi Zhou, Quan Xiong, Chen-Hua Yeow

    Abstract: Soft pneumatic actuators with integrated strain limiting layers have emerged as predominant components in the field of soft gripper technology for several decades. However, owing to their intrinsic strain-limiting layer design, these soft grippers possess a singular gripping functionality, rendering them incapable of adapting to diverse gripping tasks with different strategies. Based on our previo…

    Submitted 10 August, 2024; originally announced August 2024.

    Comments: 6 pages, 9 figures

  14. arXiv:2408.04974  [pdf, other]

    cs.CR cs.CV

    XNN: Paradigm Shift in Mitigating Identity Leakage within Cloud-Enabled Deep Learning

    Authors: Kaixin Liu, Huixin Xiong, Bingyu Duan, Zexuan Cheng, Xinyu Zhou, Wanqian Zhang, Xiangyu Zhang

    Abstract: In the domain of cloud-based deep learning, the imperative for external computational resources coexists with acute privacy concerns, particularly identity leakage. To address this challenge, we introduce XNN and XNN-d, pioneering methodologies that infuse neural network features with randomized perturbations, striking a harmonious balance between utility and privacy. XNN, designed for the trainin…

    Submitted 9 August, 2024; originally announced August 2024.

  15. arXiv:2408.04268  [pdf, other]

    cs.CV

    Evaluating Modern Approaches in 3D Scene Reconstruction: NeRF vs Gaussian-Based Methods

    Authors: Yiming Zhou, Zixuan Zeng, Andi Chen, Xiaofan Zhou, Haowei Ni, Shiyao Zhang, Panfeng Li, Liangxi Liu, Mengyao Zheng, Xupeng Chen

    Abstract: Exploring the capabilities of Neural Radiance Fields (NeRF) and Gaussian-based methods in the context of 3D scene reconstruction, this study contrasts these modern approaches with traditional Simultaneous Localization and Mapping (SLAM) systems. Utilizing datasets such as Replica and ScanNet, we assess performance based on tracking accuracy, mapping fidelity, and view synthesis. Findings reveal th…

    Submitted 8 August, 2024; originally announced August 2024.

    Comments: Accepted by 2024 6th International Conference on Data-driven Optimization of Complex Systems

  16. arXiv:2408.04245  [pdf, other]

    cs.LG cs.AI cs.IR

    Scalable Transformer for High Dimensional Multivariate Time Series Forecasting

    Authors: Xin Zhou, Weiqing Wang, Wray Buntine, Shilin Qu, Abishek Sriramulu, Weicong Tan, Christoph Bergmeir

    Abstract: Deep models for Multivariate Time Series (MTS) forecasting have recently demonstrated significant success. Channel-dependent models capture complex dependencies that channel-independent models cannot capture. However, the number of channels in real-world applications outpaces the capabilities of existing channel-dependent models, and contrary to common expectations, some models underperform the ch…

    Submitted 8 August, 2024; originally announced August 2024.

    ACM Class: H.3

  17. arXiv:2408.03191  [pdf, other]

    cs.RO

    Integrated Intention Prediction and Decision-Making with Spectrum Attention Net and Proximal Policy Optimization

    Authors: Xiao Zhou, Chengzhen Meng, Wenru Liu, Zengqi Peng, Ming Liu, Jun Ma

    Abstract: For autonomous driving in highly dynamic environments, it is anticipated to predict the future behaviors of surrounding vehicles (SVs) and make safe and effective decisions. However, modeling the inherent coupling effect between the prediction and decision-making modules has been a long-standing challenge, especially when there is a need to maintain appropriate computational efficiency. To tackle…

    Submitted 6 August, 2024; originally announced August 2024.

  18. arXiv:2408.03025  [pdf, other]

    cs.IR

    The Crowd in MOOCs: A Study of Learning Patterns at Scale

    Authors: Xin Zhou, Aixin Sun, Jie Zhang, Donghui Lin

    Abstract: The increasing availability of learning activity data in Massive Open Online Courses (MOOCs) enables us to conduct a large-scale analysis of learners' learning behavior. In this paper, we analyze a dataset of 351 million learning activities from 0.8 million unique learners enrolled in over 1.6 thousand courses within two years. Specifically, we mine and identify the learning patterns of the crowd…

    Submitted 6 August, 2024; originally announced August 2024.

    Comments: 16 pages

  19. arXiv:2408.01906  [pdf, ps, other]

    cs.IT

    Binary $[n,(n\pm1)/2]$ cyclic codes with good minimum distances from sequences

    Authors: Xianhong Xie, Yaxin Zhao, Zhonghua Sun, Xiaobo Zhou

    Abstract: Recently, binary cyclic codes with parameters $[n,(n\pm1)/2,\geq \sqrt{n}]$ have been a hot topic since their minimum distances have a square-root bound. In this paper, we construct four classes of binary cyclic codes $\mathcal{C}_{\mathcal{S},0}$, $\mathcal{C}_{\mathcal{S},1}$ and $\mathcal{C}_{\mathcal{D},0}$, $\mathcal{C}_{\mathcal{D},1}$ by using two families of sequences, and obtain some code…

    Submitted 3 August, 2024; originally announced August 2024.

  20. arXiv:2408.01691  [pdf, other]

    cs.LG cs.AI

    TreeCSS: An Efficient Framework for Vertical Federated Learning

    Authors: Qinbo Zhang, Xiao Yan, Yukai Ding, Quanqing Xu, Chuang Hu, Xiaokai Zhou, Jiawei Jiang

    Abstract: Vertical federated learning (VFL) considers the case that the features of data samples are partitioned over different participants. VFL consists of two main steps, i.e., identify the common data samples for all participants (alignment) and train model using the aligned data samples (training). However, when there are many participants and data samples, both alignment and training become slow. As s…

    Submitted 3 August, 2024; originally announced August 2024.

    Comments: 16 pages, 7 figures

  21. arXiv:2408.00989  [pdf, other]

    cs.AI

    On the Resilience of Multi-Agent Systems with Malicious Agents

    Authors: Jen-tse Huang, Jiaxu Zhou, Tailin Jin, Xuhui Zhou, Zixi Chen, Wenxuan Wang, Youliang Yuan, Maarten Sap, Michael R. Lyu

    Abstract: Multi-agent systems, powered by large language models, have shown great abilities across various tasks due to the collaboration of expert agents, each focusing on a specific domain. However, when agents are deployed separately, there is a risk that malicious users may introduce malicious agents who generate incorrect or irrelevant results that are too stealthy to be identified by other non-special…

    Submitted 1 August, 2024; originally announced August 2024.

    Comments: 10 pages

  22. arXiv:2407.21075  [pdf, other]

    cs.AI cs.CL cs.LG

    Apple Intelligence Foundation Language Models

    Authors: Tom Gunter, Zirui Wang, Chong Wang, Ruoming Pang, Andy Narayanan, Aonan Zhang, Bowen Zhang, Chen Chen, Chung-Cheng Chiu, David Qiu, Deepak Gopinath, Dian Ang Yap, Dong Yin, Feng Nan, Floris Weers, Guoli Yin, Haoshuo Huang, Jianyu Wang, Jiarui Lu, John Peebles, Ke Ye, Mark Lee, Nan Du, Qibin Chen, Quentin Keunebroek, et al. (130 additional authors not shown)

    Abstract: We present foundation language models developed to power Apple Intelligence features, including a ~3 billion parameter model designed to run efficiently on devices and a large server-based language model designed for Private Cloud Compute. These models are designed to perform a wide range of tasks efficiently, accurately, and responsibly. This report describes the model architecture, the data used…

    Submitted 29 July, 2024; originally announced July 2024.

  23. arXiv:2407.20818  [pdf, other]

    cs.CV

    WARM-3D: A Weakly-Supervised Sim2Real Domain Adaptation Framework for Roadside Monocular 3D Object Detection

    Authors: Xingcheng Zhou, Deyu Fu, Walter Zimmer, Mingyu Liu, Venkatnarayanan Lakshminarasimhan, Leah Strand, Alois C. Knoll

    Abstract: Existing roadside perception systems are limited by the absence of publicly available, large-scale, high-quality 3D datasets. Exploring the use of cost-effective, extensive synthetic datasets offers a viable solution to tackle this challenge and enhance the performance of roadside monocular 3D detection. In this study, we introduce the TUMTraf Synthetic Dataset, offering a diverse and substantial…

    Submitted 30 July, 2024; originally announced July 2024.

  24. arXiv:2407.18489  [pdf, other]

    cs.IT eess.SP

    Mini-Batch Gradient-Based MCMC for Decentralized Massive MIMO Detection

    Authors: Xingyu Zhou, Le Liang, Jing Zhang, Chao-Kai Wen, Shi Jin

    Abstract: Massive multiple-input multiple-output (MIMO) technology has significantly enhanced spectral and power efficiency in cellular communications and is expected to further evolve towards extra-large-scale MIMO. However, centralized processing for massive MIMO faces practical obstacles, including excessive computational complexity and a substantial volume of baseband data to be exchanged. To address th…

    Submitted 25 July, 2024; originally announced July 2024.

    Comments: 15 pages, 10 figures, 1 table. This paper has been accepted for publication by the IEEE Transactions on Communications. Copyright may be transferred without notice, after which this version may no longer be accessible

  25. arXiv:2407.17378  [pdf, other]

    cs.CV

    PrevPredMap: Exploring Temporal Modeling with Previous Predictions for Online Vectorized HD Map Construction

    Authors: Nan Peng, Xun Zhou, Mingming Wang, Xiaojun Yang, Songming Chen, Guisong Chen

    Abstract: Temporal information is crucial for detecting occluded instances. Existing temporal representations have progressed from BEV or PV features to more compact query features. Compared to these aforementioned features, predictions offer the highest level of abstraction, providing explicit information. In the context of online vectorized HD map construction, this unique characteristic of predictions is…

    Submitted 24 July, 2024; originally announced July 2024.

  26. arXiv:2407.17226  [pdf, other]

    cs.LG cs.AI

    Sublinear Regret for An Actor-Critic Algorithm in Continuous-Time Linear-Quadratic Reinforcement Learning

    Authors: Yilie Huang, Yanwei Jia, Xun Yu Zhou

    Abstract: We study reinforcement learning (RL) for a class of continuous-time linear-quadratic (LQ) control problems for diffusions where volatility of the state processes depends on both state and control variables. We apply a model-free approach that relies neither on knowledge of model parameters nor on their estimations, and devise an actor-critic algorithm to learn the optimal policy parameter directly…

    Submitted 24 July, 2024; originally announced July 2024.

    Comments: 42 pages, 4 figures

  27. arXiv:2407.16716  [pdf, ps, other]

    cs.NE cs.CV cs.LG

    Exploring The Neural Burden In Pruned Models: An Insight Inspired By Neuroscience

    Authors: Zeyu Wang, Weichen Dai, Xiangyu Zhou, Ji Qi, Yi Zhou

    Abstract: Vision Transformer and its variants have been adopted in many visual tasks due to their powerful capabilities, which also bring significant challenges in computation and storage. Consequently, researchers have introduced various compression methods in recent years, among which the pruning techniques are widely used to remove a significant fraction of the network. Therefore, these methods can reduc…

    Submitted 27 July, 2024; v1 submitted 22 July, 2024; originally announced July 2024.

  28. arXiv:2407.16255  [pdf]

    cs.LG cond-mat.mes-hall cs.AI

    Self-Reasoning Assistant Learning for non-Abelian Gauge Fields Design

    Authors: Jinyang Sun, Xi Chen, Xiumei Wang, Dandan Zhu, Xingping Zhou

    Abstract: Non-Abelian braiding has attracted substantial attention because of its pivotal role in describing the exchange behaviour of anyons, in which the input and outcome of non-Abelian braiding are connected by a unitary matrix. Implementing braiding in a classical system can assist the experimental investigation of non-Abelian physics. However, the design of non-Abelian gauge fields faces numerous chal…

    Submitted 23 July, 2024; originally announced July 2024.

  29. arXiv:2407.16235  [pdf, other]

    cs.SE cs.AI

    Comparison of Static Application Security Testing Tools and Large Language Models for Repo-level Vulnerability Detection

    Authors: Xin Zhou, Duc-Manh Tran, Thanh Le-Cong, Ting Zhang, Ivana Clairine Irsan, Joshua Sumarlin, Bach Le, David Lo

    Abstract: Software vulnerabilities pose significant security challenges and potential risks to society, necessitating extensive efforts in automated vulnerability detection. There are two popular lines of work to address automated vulnerability detection. On one hand, Static Application Security Testing (SAST) is usually utilized to scan source code for security vulnerabilities, especially in industries. On…

    Submitted 23 July, 2024; originally announced July 2024.

  30. arXiv:2407.14933  [pdf, other]

    cs.CL cs.AI cs.LG

    Consent in Crisis: The Rapid Decline of the AI Data Commons

    Authors: Shayne Longpre, Robert Mahari, Ariel Lee, Campbell Lund, Hamidah Oderinwale, William Brannon, Nayan Saxena, Naana Obeng-Marnu, Tobin South, Cole Hunter, Kevin Klyman, Christopher Klamm, Hailey Schoelkopf, Nikhil Singh, Manuel Cherep, Ahmad Anis, An Dinh, Caroline Chitongo, Da Yin, Damien Sileo, Deividas Mataciunas, Diganta Misra, Emad Alghamdi, Enrico Shippole, Jianguo Zhang, et al. (24 additional authors not shown)

    Abstract: General-purpose artificial intelligence (AI) systems are built on massive swathes of public web data, assembled into corpora such as C4, RefinedWeb, and Dolma. To our knowledge, we conduct the first, large-scale, longitudinal audit of the consent protocols for the web domains underlying AI training corpora. Our audit of 14,000 web domains provides an expansive view of crawlable web data and how co…

    Submitted 24 July, 2024; v1 submitted 20 July, 2024; originally announced July 2024.

    Comments: 41 pages (13 main), 5 figures, 9 tables

  31. arXiv:2407.14086  [pdf, other]

    cs.CV

    Temporal Correlation Meets Embedding: Towards a 2nd Generation of JDE-based Real-Time Multi-Object Tracking

    Authors: Yunfei Zhang, Chao Liang, Jin Gao, Zhipeng Zhang, Weiming Hu, Stephen Maybank, Xue Zhou, Liang Li

    Abstract: Joint Detection and Embedding (JDE) trackers have demonstrated excellent performance in Multi-Object Tracking (MOT) tasks by incorporating the extraction of appearance features as auxiliary tasks through embedding Re-Identification task (ReID) into the detector, achieving a balance between inference speed and tracking performance. However, solving the competition between the detector and the featu…

    Submitted 6 August, 2024; v1 submitted 19 July, 2024; originally announced July 2024.

    Comments: A submission to IJCV

  32. arXiv:2407.13981  [pdf, other]

    q-bio.BM cs.LG

    Decomposed Direct Preference Optimization for Structure-Based Drug Design

    Authors: Xiwei Cheng, Xiangxin Zhou, Yuwei Yang, Yu Bao, Quanquan Gu

    Abstract: Diffusion models have achieved promising results for Structure-Based Drug Design (SBDD). Nevertheless, high-quality protein subpocket and ligand data are relatively scarce, which hinders the models' generation capabilities. Recently, Direct Preference Optimization (DPO) has emerged as a pivotal tool for the alignment of generative models such as large language models and diffusion models, providin…

    Submitted 18 July, 2024; originally announced July 2024.

  33. arXiv:2407.13151  [pdf, other]

    eess.IV cs.CV

    Wavelet-based Bi-dimensional Aggregation Network for SAR Image Change Detection

    Authors: Jiangwei Xie, Feng Gao, Xiaowei Zhou, Junyu Dong

    Abstract: Synthetic aperture radar (SAR) image change detection is critical in remote sensing image analysis. Recently, the attention mechanism has been widely used in change detection tasks. However, existing attention mechanisms often employ down-sampling operations such as average pooling on the Key and Value components to enhance computational efficiency. These irreversible operations result in the loss…

    Submitted 18 July, 2024; originally announced July 2024.

    Comments: IEEE GRSL 2024

  34. arXiv:2407.12851  [pdf]

    cs.CL

    ISPO: An Integrated Ontology of Symptom Phenotypes for Semantic Integration of Traditional Chinese Medical Data

    Authors: Zixin Shu, Rui Hua, Dengying Yan, Chenxia Lu, Ning Xu, Jun Li, Hui Zhu, Jia Zhang, Dan Zhao, Chenyang Hui, Junqiu Ye, Chu Liao, Qi Hao, Wen Ye, Cheng Luo, Xinyan Wang, Chuang Cheng, Xiaodong Li, Baoyan Liu, Xiaji Zhou, Runshun Zhang, Min Xu, Xuezhong Zhou

    Abstract: Symptom phenotypes are one of the key types of manifestations for diagnosis and treatment of various disease conditions. However, the diversity of symptom terminologies is one of the major obstacles hindering the analysis and knowledge sharing of various types of symptom-related medical data particularly in the fields of Traditional Chinese Medicine (TCM). Objective: This study aimed to construct…

    Submitted 8 July, 2024; originally announced July 2024.

    Comments: 39 pages, 6 figures, 6 tables

  35. arXiv:2407.10680  [pdf, ps, other]

    cs.SI cs.NI

    Friedkin-Johnsen Model for Opinion Dynamics on Signed Graphs

    Authors: Xiaotian Zhou, Haoxin Sun, Wanyue Xu, Wei Li, Zhongzhi Zhang

    Abstract: A signed graph offers richer information than an unsigned graph, since it describes both collaborative and competitive relationships in social networks. In this paper, we study opinion dynamics on a signed graph, based on the Friedkin-Johnsen model. We first interpret the equilibrium opinion in terms of a defined random walk on an augmented signed graph, by representing the equilibrium opinion of…

    Submitted 17 July, 2024; v1 submitted 15 July, 2024; originally announced July 2024.

  36. arXiv:2407.10671  [pdf, other]

    cs.CL cs.AI

    Qwen2 Technical Report

    Authors: An Yang, Baosong Yang, Binyuan Hui, Bo Zheng, Bowen Yu, Chang Zhou, Chengpeng Li, Chengyuan Li, Dayiheng Liu, Fei Huang, Guanting Dong, Haoran Wei, Huan Lin, Jialong Tang, Jialin Wang, Jian Yang, Jianhong Tu, Jianwei Zhang, Jianxin Ma, Jianxin Yang, Jin Xu, Jingren Zhou, Jinze Bai, Jinzheng He, Junyang Lin, et al. (37 additional authors not shown)

    Abstract: This report introduces the Qwen2 series, the latest addition to our large language models and large multimodal models. We release a comprehensive suite of foundational and instruction-tuned language models, encompassing a parameter range from 0.5 to 72 billion, featuring dense models and a Mixture-of-Experts model. Qwen2 surpasses most prior open-weight models, including its predecessor Qwen1.5, a…

    Submitted 17 July, 2024; v1 submitted 15 July, 2024; originally announced July 2024.

    Comments: 25 pages, 1 figure

  37. arXiv:2407.10510  [pdf, other]

    cs.CL cs.AI cs.CE

    TCM-FTP: Fine-Tuning Large Language Models for Herbal Prescription Prediction

    Authors: Xingzhi Zhou, Xin Dong, Chunhao Li, Yuning Bai, Yulong Xu, Ka Chun Cheung, Simon See, Xinpeng Song, Runshun Zhang, Xuezhong Zhou, Nevin L. Zhang

    Abstract: Traditional Chinese medicine (TCM) relies on specific combinations of herbs in prescriptions to treat symptoms and signs, a practice that spans thousands of years. Predicting TCM prescriptions presents a fascinating technical challenge with practical implications. However, this task faces limitations due to the scarcity of high-quality clinical datasets and the intricate relationship between sympt…

    Submitted 15 July, 2024; originally announced July 2024.

  38. arXiv:2407.10377  [pdf]

    eess.IV cs.AI cs.CV

    Enhanced Self-supervised Learning for Multi-modality MRI Segmentation and Classification: A Novel Approach Avoiding Model Collapse

    Authors: Linxuan Han, Sa Xiao, Zimeng Li, Haidong Li, Xiuchao Zhao, Fumin Guo, Yeqing Han, Xin Zhou

    Abstract: Multi-modality magnetic resonance imaging (MRI) can provide complementary information for computer-aided diagnosis. Traditional deep learning algorithms are suitable for identifying specific anatomical structures segmenting lesions and classifying diseases with magnetic resonance images. However, manual labels are limited due to high expense, which hinders further improvement of model accuracy. Se…

    Submitted 17 July, 2024; v1 submitted 14 July, 2024; originally announced July 2024.

  39. arXiv:2407.09690  [pdf, other]

    cs.LG cs.CR math.OC

    Private Heterogeneous Federated Learning Without a Trusted Server Revisited: Error-Optimal and Communication-Efficient Algorithms for Convex Losses

    Authors: Changyu Gao, Andrew Lowy, Xingyu Zhou, Stephen J. Wright

    Abstract: We revisit the problem of federated learning (FL) with private data from people who do not trust the server or other silos/clients. In this context, every silo (e.g. hospital) has data from several people (e.g. patients) and needs to protect the privacy of each person's data (e.g. health records), even if the server and/or other silos try to uncover this data. Inter-Silo Record-Level Differential…

    Submitted 17 July, 2024; v1 submitted 12 July, 2024; originally announced July 2024.

    Comments: The 41st International Conference on Machine Learning (ICML 2024)

  40. arXiv:2407.09523  [pdf, other]

    cs.CV cs.AI cs.CY cs.LG

    MuseCL: Predicting Urban Socioeconomic Indicators via Multi-Semantic Contrastive Learning

    Authors: Xixian Yong, Xiao Zhou

    Abstract: Predicting socioeconomic indicators within urban regions is crucial for fostering inclusivity, resilience, and sustainability in cities and human settlements. While pioneering studies have attempted to leverage multi-modal data for socioeconomic prediction, jointly exploring their underlying semantics remains a significant challenge. To address the gap, this paper introduces a Multi-Semantic Contr…

    Submitted 23 June, 2024; originally announced July 2024.

  41. arXiv:2407.08995  [pdf, other]

    cs.CL

    Self-Prompt Tuning: Enable Autonomous Role-Playing in LLMs

    Authors: Aobo Kong, Shiwan Zhao, Hao Chen, Qicheng Li, Yong Qin, Ruiqi Sun, Xin Zhou, Jiaming Zhou, Haoqin Sun

    Abstract: Recent advancements in LLMs have showcased their remarkable role-playing capabilities, able to accurately simulate the dialogue styles and cognitive processes of various roles based on different instructions and contexts. Studies indicate that assigning LLMs the roles of experts, a strategy known as role-play prompting, can enhance their performance in the corresponding domains. However, the promp…

    Submitted 12 July, 2024; originally announced July 2024.

  42. arXiv:2407.08532  [pdf, other]

    cs.CR cs.SE

    Tactics, Techniques, and Procedures (TTPs) in Interpreted Malware: A Zero-Shot Generation with Large Language Models

    Authors: Ying Zhang, Xiaoyan Zhou, Hui Wen, Wenjia Niu, Jiqiang Liu, Haining Wang, Qiang Li

    Abstract: Nowadays, the open-source software (OSS) ecosystem suffers from security threats of software supply chain (SSC) attacks. Interpreted OSS malware plays a vital role in SSC attacks, as criminals have an arsenal of attack vectors to deceive users into installing malware and executing malicious activities. In this paper, we introduce tactics, techniques, and procedures (TTPs) proposed by MITRE ATT&CK…

    Submitted 11 July, 2024; originally announced July 2024.

    Comments: 19 pages, 11 figures

  43. arXiv:2407.06042  [pdf, ps, other]

    eess.SP cs.IT

    Near-Optimal MIMO Detection Using Gradient-Based MCMC in Discrete Spaces

    Authors: Xingyu Zhou, Le Liang, Jing Zhang, Chao-Kai Wen, Shi Jin

    Abstract: The discrete nature of transmitted symbols poses challenges for achieving optimal detection in multiple-input multiple-output (MIMO) systems associated with a large number of antennas. Recently, the combination of two powerful machine learning methods, Markov chain Monte Carlo (MCMC) sampling and gradient descent, has emerged as a highly efficient solution to address this issue. However, existing…

    Submitted 8 July, 2024; originally announced July 2024.

    Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

  44. arXiv:2407.05619  [pdf, other]

    cs.RO eess.SY

    AIRA: A Low-cost IR-based Approach Towards Autonomous Precision Drone Landing and NLOS Indoor Navigation

    Authors: Yanchen Liu, Minghui Zhao, Kaiyuan Hou, Junxi Xia, Charlie Carver, Stephen Xia, Xia Zhou, Xiaofan Jiang

    Abstract: Automatic drone landing is an important step for achieving fully autonomous drones. Although there are many works that leverage GPS, video, wireless signals, and active acoustic sensing to perform precise landing, autonomous drone landing remains an unsolved challenge for palm-sized microdrones that may not be able to support the high computational requirements of vision, wireless, or active audio…

    Submitted 8 July, 2024; originally announced July 2024.

  45. arXiv:2407.05365  [pdf, other]

    cs.AI

    ElecBench: a Power Dispatch Evaluation Benchmark for Large Language Models

    Authors: Xiyuan Zhou, Huan Zhao, Yuheng Cheng, Yuji Cao, Gaoqi Liang, Guolong Liu, Wenxuan Liu, Yan Xu, Junhua Zhao

    Abstract: In response to the urgent demand for grid stability and the complex challenges posed by renewable energy integration and electricity market dynamics, the power sector increasingly seeks innovative technological solutions. In this context, large language models (LLMs) have become a key technology to improve efficiency and promote intelligent progress in the power sector with their excellent natural…

    Submitted 11 August, 2024; v1 submitted 7 July, 2024; originally announced July 2024.

  46. arXiv:2407.05238  [pdf, other]

    cs.CV

    P2P: Part-to-Part Motion Cues Guide a Strong Tracking Framework for LiDAR Point Clouds

    Authors: Jiahao Nie, Fei Xie, Sifan Zhou, Xueyi Zhou, Dong-Kyu Chae, Zhiwei He

    Abstract: 3D single object tracking (SOT) methods based on appearance matching have long suffered from insufficient appearance information incurred by incomplete, textureless and semantically deficient LiDAR point clouds. While the motion paradigm exploits motion cues instead of appearance matching for tracking, it incurs complex multi-stage processing and a segmentation module. In this paper, we first provide in-…

    Submitted 8 July, 2024; v1 submitted 6 July, 2024; originally announced July 2024.

    Comments: The source code and pre-trained models are available at https://github.com/haooozi/P2P

  47. arXiv:2407.04224  [pdf, other]

    cs.RO

    PA-LOCO: Learning Perturbation-Adaptive Locomotion for Quadruped Robots

    Authors: Zhiyuan Xiao, Xinyu Zhang, Xiang Zhou, Qingrui Zhang

    Abstract: Numerous locomotion controllers have been designed based on Reinforcement Learning (RL) to facilitate blind quadrupedal locomotion traversing challenging terrains. Nevertheless, locomotion control is still a challenging task for quadruped robots traversing diverse terrains amidst unforeseen disturbances. Recently, privileged learning has been employed to learn reliable and robust quadrupedal locom…

    Submitted 4 July, 2024; originally announced July 2024.

    Comments: 8 pages, Accepted by IROS 2024

  48. arXiv:2407.03263  [pdf, other]

    cs.CV

    A Unified Framework for 3D Scene Understanding

    Authors: Wei Xu, Chunsheng Shi, Sifan Tu, Xin Zhou, Dingkang Liang, Xiang Bai

    Abstract: We propose UniSeg3D, a unified 3D segmentation framework that achieves panoptic, semantic, instance, interactive, referring, and open-vocabulary semantic segmentation tasks within a single model. Most previous 3D segmentation approaches are specialized for a specific task, thereby limiting their understanding of 3D scenes to a task-specific perspective. In contrast, the proposed method unifies six…

    Submitted 3 July, 2024; originally announced July 2024.

    Comments: The code will be available at https://dk-liang.github.io/UniSeg3D/

  49. arXiv:2407.02960  [pdf, other]

    cs.CR cs.AI cs.CL cs.LG

    ObfuscaTune: Obfuscated Offsite Fine-tuning and Inference of Proprietary LLMs on Private Datasets

    Authors: Ahmed Frikha, Nassim Walha, Ricardo Mendes, Krishna Kanth Nakka, Xue Jiang, Xuebing Zhou

    Abstract: This work addresses the timely yet underexplored problem of performing inference and finetuning of a proprietary LLM owned by a model provider entity on the confidential/private data of another data owner entity, in a way that ensures the confidentiality of both the model and the data. Hereby, the finetuning is conducted offsite, i.e., on the computation infrastructure of a third-party cloud provi…

    Submitted 3 July, 2024; originally announced July 2024.

    Comments: Preprint

  50. arXiv:2407.02956  [pdf, other]

    cs.CR cs.AI cs.CL cs.LG

    IncogniText: Privacy-enhancing Conditional Text Anonymization via LLM-based Private Attribute Randomization

    Authors: Ahmed Frikha, Nassim Walha, Krishna Kanth Nakka, Ricardo Mendes, Xue Jiang, Xuebing Zhou

    Abstract: In this work, we address the problem of text anonymization where the goal is to prevent adversaries from correctly inferring private attributes of the author, while keeping the text utility, i.e., meaning and semantics. We propose IncogniText, a technique that anonymizes the text to mislead a potential adversary into predicting a wrong private attribute value. Our empirical evaluation shows a redu…

    Submitted 3 July, 2024; originally announced July 2024.

    Comments: Preprint