Showing 1–50 of 418 results for author: Xiong, Z

Searching in archive cs.
  1. arXiv:2409.03977  [pdf, other]

    eess.IV cs.CV cs.LG

    Bi-modality Images Transfer with a Discrete Process Matching Method

    Authors: Zhe Xiong, Qiaoqiao Ding, Xiaoqun Zhang

    Abstract: Recently, medical image synthesis gains more and more popularity, along with the rapid development of generative models. Medical image synthesis aims to generate an unacquired image modality, often from other observed data modalities. Synthesized images can be used for clinical diagnostic assistance, data augmentation for model training and validation or image quality improving. In the meanwhile,…

    Submitted 5 September, 2024; originally announced September 2024.

  2. arXiv:2408.09261  [pdf, other]

    cs.CV

    Adaptify: A Refined Adaptation Scheme for Frame Classification in Atrophic Gastritis Videos

    Authors: Zinan Xiong, Shuijiao Chen, Yizhe Zhang, Yu Cao, Benyuan Liu, Xiaowei Liu

    Abstract: Atrophic gastritis is a significant risk factor for developing gastric cancer. The incorporation of machine learning algorithms can efficiently elevate the possibility of accurately detecting atrophic gastritis. Nevertheless, when the trained model is applied in real-life circumstances, its output is often not consistently reliable. In this paper, we propose Adaptify, an adaptation scheme in which…

    Submitted 17 August, 2024; originally announced August 2024.

    Comments: ISBI 2024 Proceeding

  3. arXiv:2408.08765  [pdf, other]

    cs.NI

    Rethinking Generative Semantic Communication for Multi-User Systems with Multi-Modal LLM

    Authors: Wanting Yang, Zehui Xiong, Shiwen Mao, Tony Q. S. Quek, Ping Zhang, Merouane Debbah, Rahim Tafazolli

    Abstract: The surge in connected devices in 6G with typical massive access scenarios, such as smart agriculture, and smart cities, poses significant challenges to unsustainable traditional communication with limited radio resources and already high system complexity. Fortunately, the booming artificial intelligence technology and the growing computational power of devices offer a promising 6G enabler: seman…

    Submitted 16 August, 2024; originally announced August 2024.

  4. arXiv:2408.07050  [pdf, other]

    cs.SD cs.CV eess.AS

    PSM: Learning Probabilistic Embeddings for Multi-scale Zero-Shot Soundscape Mapping

    Authors: Subash Khanal, Eric Xing, Srikumar Sastry, Aayush Dhakal, Zhexiao Xiong, Adeel Ahmad, Nathan Jacobs

    Abstract: A soundscape is defined by the acoustic environment a person perceives at a location. In this work, we propose a framework for mapping soundscapes across the Earth. Since soundscapes involve sound distributions that span varying spatial scales, we represent locations with multi-scale satellite imagery and learn a joint representation among this imagery, audio, and text. To capture the inherent unc…

    Submitted 13 August, 2024; originally announced August 2024.

    Comments: Accepted at ACM MM 2024

  5. arXiv:2408.01653  [pdf, other]

    cs.CV

    MCPDepth: Omnidirectional Depth Estimation via Stereo Matching from Multi-Cylindrical Panoramas

    Authors: Feng Qiao, Zhexiao Xiong, Xinge Zhu, Yuexin Ma, Qiumeng He, Nathan Jacobs

    Abstract: We introduce Multi-Cylindrical Panoramic Depth Estimation (MCPDepth), a two-stage framework for omnidirectional depth estimation via stereo matching between multiple cylindrical panoramas. MCPDepth uses cylindrical panoramas for initial stereo matching and then fuses the resulting depth maps across views. A circular attention module is employed to overcome the distortion along the vertical axis. M…

    Submitted 2 August, 2024; originally announced August 2024.

  6. arXiv:2407.15483  [pdf, other]

    cs.NI

    Enhancing Wireless Networks with Attention Mechanisms: Insights from Mobile Crowdsensing

    Authors: Yaoqi Yang, Hongyang Du, Zehui Xiong, Dusit Niyato, Abbas Jamalipour, Zhu Han

    Abstract: The increasing demand for sensing, collecting, transmitting, and processing vast amounts of data poses significant challenges for resource-constrained mobile users, thereby impacting the performance of wireless networks. In this regard, from a case of mobile crowdsensing (MCS), we aim at leveraging attention mechanisms in machine learning approaches to provide solutions for building an effective,…

    Submitted 22 July, 2024; originally announced July 2024.

  7. arXiv:2407.10068  [pdf, other]

    cs.CL

    Multi-Granularity Semantic Revision for Large Language Model Distillation

    Authors: Xiaoyu Liu, Yun Zhang, Wei Li, Simiao Li, Xudong Huang, Hanting Chen, Yehui Tang, Jie Hu, Zhiwei Xiong, Yunhe Wang

    Abstract: Knowledge distillation plays a key role in compressing the Large Language Models (LLMs), which boosts a small-size student model under large teacher models' guidance. However, existing LLM distillation methods overly rely on student-generated outputs, which may introduce generation errors and misguide the distillation process. Moreover, the distillation loss functions introduced in previous art st…

    Submitted 13 July, 2024; originally announced July 2024.

  8. arXiv:2407.09935  [pdf, other]

    cs.CV cs.MM eess.IV

    LeRF: Learning Resampling Function for Adaptive and Efficient Image Interpolation

    Authors: Jiacheng Li, Chang Chen, Fenglong Song, Youliang Yan, Zhiwei Xiong

    Abstract: Image resampling is a basic technique that is widely employed in daily applications, such as camera photo editing. Recent deep neural networks (DNNs) have made impressive progress in performance by introducing learned data priors. Still, these methods are not the perfect substitute for interpolation, due to the drawbacks in efficiency and versatility. In this work, we propose a novel method of Lea…

    Submitted 13 July, 2024; originally announced July 2024.

    Comments: Code: https://github.com/ddlee-cn/LeRF-PyTorch

  9. arXiv:2407.09672  [pdf, other]

    cs.CV

    Mixed-View Panorama Synthesis using Geospatially Guided Diffusion

    Authors: Zhexiao Xiong, Xin Xing, Scott Workman, Subash Khanal, Nathan Jacobs

    Abstract: We introduce the task of mixed-view panorama synthesis, where the goal is to synthesize a novel panorama given a small set of input panoramas and a satellite image of the area. This contrasts with previous work which only uses input panoramas (same-view synthesis), or an input satellite image (cross-view synthesis). We argue that the mixed-view setting is the most natural to support panorama synth…

    Submitted 12 July, 2024; originally announced July 2024.

  10. arXiv:2407.08340  [pdf, other]

    cs.LG

    SLRL: Structured Latent Representation Learning for Multi-view Clustering

    Authors: Zhangci Xiong, Meng Cao

    Abstract: In recent years, Multi-View Clustering (MVC) has attracted increasing attention for its potential to reduce the annotation burden associated with large datasets. The aim of MVC is to exploit the inherent consistency and complementarity among different views, thereby integrating information from multiple perspectives to improve clustering outcomes. Despite extensive research in MVC, most existing…

    Submitted 11 July, 2024; originally announced July 2024.

  11. arXiv:2407.06611  [pdf, other]

    cs.CV cs.AI

    CEIA: CLIP-Based Event-Image Alignment for Open-World Event-Based Understanding

    Authors: Wenhao Xu, Wenming Weng, Yueyi Zhang, Zhiwei Xiong

    Abstract: We present CEIA, an effective framework for open-world event-based understanding. Currently training a large event-text model still poses a huge challenge due to the shortage of paired event-text data. In response to this challenge, CEIA learns to align event and image data as an alternative instead of directly aligning event and text data. Specifically, we leverage the rich event-image datasets t…

    Submitted 9 July, 2024; originally announced July 2024.

  12. arXiv:2407.00891  [pdf, other]

    cs.LG q-bio.BM

    ZeroDDI: A Zero-Shot Drug-Drug Interaction Event Prediction Method with Semantic Enhanced Learning and Dual-Modal Uniform Alignment

    Authors: Ziyan Wang, Zhankun Xiong, Feng Huang, Xuan Liu, Wen Zhang

    Abstract: Drug-drug interactions (DDIs) can result in various pharmacological changes, which can be categorized into different classes known as DDI events (DDIEs). In recent years, previously unobserved/unseen DDIEs have been emerging, posing a new classification task when unseen classes have no labelled instances in the training stage, which is formulated as a zero-shot DDIE prediction (ZS-DDIE) task. Howe…

    Submitted 31 July, 2024; v1 submitted 30 June, 2024; originally announced July 2024.

    Comments: Accepted by IJCAI2024

  13. arXiv:2406.19292  [pdf, other]

    cs.LG cs.AI cs.CL

    From Artificial Needles to Real Haystacks: Improving Retrieval Capabilities in LLMs by Finetuning on Synthetic Data

    Authors: Zheyang Xiong, Vasilis Papageorgiou, Kangwook Lee, Dimitris Papailiopoulos

    Abstract: Recent studies have shown that Large Language Models (LLMs) struggle to accurately retrieve information and maintain reasoning capabilities when processing long-context inputs. To address these limitations, we propose a finetuning approach utilizing a carefully designed synthetic dataset comprising numerical key-value retrieval tasks. Our experiments on models like GPT-3.5 Turbo and Mistral 7B dem…

    Submitted 27 June, 2024; originally announced June 2024.

  14. arXiv:2406.19156  [pdf, other]

    cs.LG

    Heterogeneous Causal Metapath Graph Neural Network for Gene-Microbe-Disease Association Prediction

    Authors: Kexin Zhang, Feng Huang, Luotao Liu, Zhankun Xiong, Hongyu Zhang, Yuan Quan, Wen Zhang

    Abstract: The recent focus on microbes in human medicine highlights their potential role in the genetic framework of diseases. To decode the complex interactions among genes, microbes, and diseases, computational predictions of gene-microbe-disease (GMD) associations are crucial. Existing methods primarily address gene-disease and microbe-disease associations, but the more intricate triple-wise GMD associat…

    Submitted 27 June, 2024; originally announced June 2024.

  15. arXiv:2406.16083  [pdf, other]

    eess.IV cs.CV

    Mamba-based Light Field Super-Resolution with Efficient Subspace Scanning

    Authors: Ruisheng Gao, Zeyu Xiao, Zhiwei Xiong

    Abstract: Transformer-based methods have demonstrated impressive performance in 4D light field (LF) super-resolution by effectively modeling long-range spatial-angular correlations, but their quadratic complexity hinders the efficient processing of high resolution 4D inputs, resulting in slow inference speed and high memory cost. As a compromise, most prior work adopts a patch-based strategy, which fails to…

    Submitted 23 June, 2024; originally announced June 2024.

    Comments: 17 pages, 7 figures

  16. arXiv:2406.13964  [pdf, other]

    cs.NI

    Hierarchical Micro-Segmentations for Zero-Trust Services via Large Language Model (LLM)-enhanced Graph Diffusion

    Authors: Yinqiu Liu, Guangyuan Liu, Hongyang Du, Dusit Niyato, Jiawen Kang, Zehui Xiong, Dong In Kim, Xuemin Shen

    Abstract: In the rapidly evolving Next-Generation Networking (NGN) era, the adoption of zero-trust architectures has become increasingly crucial to protect security. However, provisioning zero-trust services in NGNs poses significant challenges, primarily due to the environmental complexity and dynamics. Motivated by these challenges, this paper explores efficient zero-trust service provisioning using hiera…

    Submitted 19 June, 2024; originally announced June 2024.

    Comments: 13 pages

  17. arXiv:2406.12300  [pdf]

    eess.IV cs.CV q-bio.NC

    IR2QSM: Quantitative Susceptibility Mapping via Deep Neural Networks with Iterative Reverse Concatenations and Recurrent Modules

    Authors: Min Li, Chen Chen, Zhuang Xiong, Ying Liu, Pengfei Rong, Shanshan Shan, Feng Liu, Hongfu Sun, Yang Gao

    Abstract: Quantitative susceptibility mapping (QSM) is an MRI phase-based post-processing technique to extract the distribution of tissue susceptibilities, demonstrating significant potential in studying neurological diseases. However, the ill-conditioned nature of dipole inversion makes QSM reconstruction from the tissue field prone to noise and artifacts. In this work, we propose a novel deep learning-bas…

    Submitted 18 June, 2024; originally announced June 2024.

    Comments: 10 pages, 9 figures

  18. arXiv:2406.11208  [pdf]

    cs.NI

    Privacy-preserving Pseudonym Schemes for Personalized 3D Avatars in Mobile Social Metaverses

    Authors: Cheng Su, Xiaofeng Luo, Zhenmou Liu, Jiawen Kang, Min Hao, Zehui Xiong, Zhaohui Yang, Chongwen Huang

    Abstract: The emergence of mobile social metaverses, a novel paradigm bridging physical and virtual realms, has led to the widespread adoption of avatars as digital representations for Social Metaverse Users (SMUs) within virtual spaces. Equipped with immersive devices, SMUs leverage Edge Servers (ESs) to deploy their avatars and engage with other SMUs in virtual spaces. To enhance immersion, SMUs incline t…

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: 6 pages, 4 figures

  19. arXiv:2406.09187  [pdf, other]

    cs.LG

    GuardAgent: Safeguard LLM Agents by a Guard Agent via Knowledge-Enabled Reasoning

    Authors: Zhen Xiang, Linzhi Zheng, Yanjie Li, Junyuan Hong, Qinbin Li, Han Xie, Jiawei Zhang, Zidi Xiong, Chulin Xie, Carl Yang, Dawn Song, Bo Li

    Abstract: The rapid advancement of large language models (LLMs) has catalyzed the deployment of LLM-powered agents across numerous applications, raising new concerns regarding their safety and trustworthiness. Existing methods for enhancing the safety of LLMs are not directly transferable to LLM-powered agents due to their diverse objectives and output modalities. In this paper, we propose GuardAgent, the f…

    Submitted 13 June, 2024; originally announced June 2024.

  20. arXiv:2406.08204  [pdf, other]

    cs.CV

    Diffusion-Promoted HDR Video Reconstruction

    Authors: Yuanshen Guan, Ruikang Xu, Mingde Yao, Ruisheng Gao, Lizhi Wang, Zhiwei Xiong

    Abstract: High dynamic range (HDR) video reconstruction aims to generate HDR videos from low dynamic range (LDR) frames captured with alternating exposures. Most existing works solely rely on the regression-based paradigm, leading to adverse effects such as ghosting artifacts and missing details in saturated regions. In this paper, we propose a diffusion-promoted method for HDR video reconstruction, termed…

    Submitted 12 June, 2024; originally announced June 2024.

    Comments: Arxiv Preprint

  21. arXiv:2406.06357  [pdf, other]

    cs.CL cs.AI

    MASSW: A New Dataset and Benchmark Tasks for AI-Assisted Scientific Workflows

    Authors: Xingjian Zhang, Yutong Xie, Jin Huang, Jinge Ma, Zhaoying Pan, Qijia Liu, Ziyang Xiong, Tolga Ergen, Dongsub Shim, Honglak Lee, Qiaozhu Mei

    Abstract: Scientific innovation relies on detailed workflows, which include critical steps such as analyzing literature, generating ideas, validating these ideas, interpreting results, and inspiring follow-up research. However, scientific publications that document these workflows are extensive and unstructured. This makes it difficult for both human researchers and AI systems to effectively navigate and ex…

    Submitted 10 June, 2024; originally announced June 2024.

    Comments: arXiv admin note: text overlap with arXiv:1706.03762 by other authors

  22. arXiv:2406.05613  [pdf, other]

    cs.RO

    Distributed Motion Control of Multiple Mobile Manipulator System with Disturbance and Communication Delay

    Authors: Wenhang Liu, Meng Ren, Kun Song, Michael Yu Wang, Zhenhua Xiong

    Abstract: In real-world object manipulation scenarios, multiple mobile manipulator systems may suffer from disturbances and asynchrony, leading to excessive interaction forces and causing object damage or emergency stops. This paper presents a novel distributed motion control approach aimed at reducing these unnecessary interaction forces. The control strategy only utilizes force information without the nee…

    Submitted 8 June, 2024; originally announced June 2024.

  23. arXiv:2406.05418  [pdf, other]

    cs.AI cs.NI

    Multi-attribute Auction-based Resource Allocation for Twins Migration in Vehicular Metaverses: A GPT-based DRL Approach

    Authors: Yongju Tong, Junlong Chen, Minrui Xu, Jiawen Kang, Zehui Xiong, Dusit Niyato, Chau Yuen, Zhu Han

    Abstract: Vehicular Metaverses are developed to enhance the modern automotive industry with an immersive and safe experience among connected vehicles and roadside infrastructures, e.g., RoadSide Units (RSUs). For seamless synchronization with virtual spaces, Vehicle Twins (VTs) are constructed as digital representations of physical entities. However, resource-intensive VTs updating and high mobility of vehi…

    Submitted 8 June, 2024; originally announced June 2024.

    Comments: 16 pages, 6 figures, 3 tables

  24. arXiv:2406.04111  [pdf, other]

    cs.CV eess.IV

    UrbanSARFloods: Sentinel-1 SLC-Based Benchmark Dataset for Urban and Open-Area Flood Mapping

    Authors: Jie Zhao, Zhitong Xiong, Xiao Xiang Zhu

    Abstract: Due to its cloud-penetrating capability and independence from solar illumination, satellite Synthetic Aperture Radar (SAR) is the preferred data source for large-scale flood mapping, providing global coverage and including various land cover classes. However, most studies on large-scale SAR-derived flood mapping using deep learning algorithms have primarily focused on flooded open areas, utilizing…

    Submitted 6 June, 2024; originally announced June 2024.

    Comments: Accepted by CVPR 2024 EarthVision Workshop

  25. arXiv:2405.17932  [pdf, ps, other]

    cs.LG cs.DC

    Towards Communication-efficient Federated Learning via Sparse and Aligned Adaptive Optimization

    Authors: Xiumei Deng, Jun Li, Kang Wei, Long Shi, Zeihui Xiong, Ming Ding, Wen Chen, Shi Jin, H. Vincent Poor

    Abstract: Adaptive moment estimation (Adam), as a Stochastic Gradient Descent (SGD) variant, has gained widespread popularity in federated learning (FL) due to its fast convergence. However, federated Adam (FedAdam) algorithms suffer from a threefold increase in uplink communication overhead compared to federated SGD (FedSGD) algorithms, which arises from the necessity to transmit both local model updates a…

    Submitted 28 May, 2024; originally announced May 2024.

  26. arXiv:2405.16850  [pdf, other]

    eess.IV cs.CV cs.LG

    UniCompress: Enhancing Multi-Data Medical Image Compression with Knowledge Distillation

    Authors: Runzhao Yang, Yinda Chen, Zhihong Zhang, Xiaoyu Liu, Zongren Li, Kunlun He, Zhiwei Xiong, Jinli Suo, Qionghai Dai

    Abstract: In the field of medical image compression, Implicit Neural Representation (INR) networks have shown remarkable versatility due to their flexible compression ratios, yet they are constrained by a one-to-one fitting approach that results in lengthy encoding times. Our novel method, "UniCompress", innovatively extends the compression capabilities of INR by being the first to compress multi…

    Submitted 27 May, 2024; originally announced May 2024.

  27. arXiv:2405.16847  [pdf, other]

    cs.CV cs.AI

    TokenUnify: Scalable Autoregressive Visual Pre-training with Mixture Token Prediction

    Authors: Yinda Chen, Haoyuan Shi, Xiaoyu Liu, Te Shi, Ruobing Zhang, Dong Liu, Zhiwei Xiong, Feng Wu

    Abstract: Autoregressive next-token prediction is a standard pretraining method for large-scale language models, but its application to vision tasks is hindered by the non-sequential nature of image data, leading to cumulative errors. Most vision models employ masked autoencoder (MAE) based pretraining, which faces scalability issues. To address these challenges, we introduce TokenUnify, a novel pr…

    Submitted 27 May, 2024; originally announced May 2024.

  28. arXiv:2405.14391  [pdf, other]

    cs.AI cs.CL cs.CY

    Explainable Few-shot Knowledge Tracing

    Authors: Haoxuan Li, Jifan Yu, Yuanxin Ouyang, Zhuang Liu, Wenge Rong, Juanzi Li, Zhang Xiong

    Abstract: Knowledge tracing (KT), aiming to mine students' mastery of knowledge by their exercise records and predict their performance on future test questions, is a critical task in educational assessment. While researchers achieved tremendous success with the rapid development of deep learning techniques, current knowledge tracing tasks fall into the cracks from real-world teaching scenarios. Relying hea…

    Submitted 25 May, 2024; v1 submitted 23 May, 2024; originally announced May 2024.

  29. arXiv:2405.12472  [pdf, ps, other]

    cs.NI

    Optimizing Generative AI Networking: A Dual Perspective with Multi-Agent Systems and Mixture of Experts

    Authors: Ruichen Zhang, Hongyang Du, Dusit Niyato, Jiawen Kang, Zehui Xiong, Ping Zhang, Dong In Kim

    Abstract: In the continued development of next-generation networking and artificial intelligence content generation (AIGC) services, the integration of multi-agent systems (MAS) and the mixture of experts (MoE) frameworks is becoming increasingly important. Motivated by this, this article studies the contrasting and converging of MAS and MoE in AIGC-enabled networking. First, we discuss the architectural de…

    Submitted 20 May, 2024; originally announced May 2024.

    Comments: 9 pages, 4 figures

  30. arXiv:2405.11726  [pdf, other]

    cs.RO

    RHAML: Rendezvous-based Hierarchical Architecture for Mutual Localization

    Authors: Gaoming Chen, Kun Song, Xiang Xu, Wenhang Liu, Zhenhua Xiong

    Abstract: Mutual localization serves as the foundation for collaborative perception and task assignment in multi-robot systems. Effectively utilizing limited onboard sensors for mutual localization between marker-less robots is a worthwhile goal. However, due to inadequate consideration of large scale variations of the observed robot and localization refinement, previous work has shown limited accuracy when…

    Submitted 19 May, 2024; originally announced May 2024.

    Comments: 8 pages, 8 figures, submitted to RA-L

  31. arXiv:2405.10521  [pdf, other]

    cs.CR

    Generative AI for Secure and Privacy-Preserving Mobile Crowdsensing

    Authors: Yaoqi Yang, Bangning Zhang, Daoxing Guo, Hongyang Du, Zehui Xiong, Dusit Niyato, Zhu Han

    Abstract: Recently, generative AI has attracted much attention from both academic and industrial fields, which has shown its potential, especially in the data generation and synthesis aspects. Simultaneously, secure and privacy-preserving mobile crowdsensing (SPPMCS) has been widely applied in data collection/ acquirement due to an advantage on low deployment cost, flexible implementation, and high adaptabi…

    Submitted 17 May, 2024; originally announced May 2024.

  32. arXiv:2405.08345  [pdf, other]

    cs.RO

    Multi-Robot Rendezvous in Unknown Environment with Limited Communication

    Authors: Kun Song, Gaoming Chen, Wenhang Liu, Zhenhua Xiong

    Abstract: Rendezvous aims at gathering all robots at a specific location, which is an important collaborative behavior for multirobot systems. However, in an unknown environment, it is challenging to achieve rendezvous. Previous researches mainly focus on special scenarios where communication is not allowed and each robot executes a random searching strategy, which is highly time-consuming, especially in la…

    Submitted 14 May, 2024; originally announced May 2024.

    Comments: Submit to RAL. 8 pages, 6 figures

  33. arXiv:2405.08289  [pdf, other]

    cs.GT

    Exploring Equilibrium Strategies in Network Games with Generative AI

    Authors: Yaoqi Yang, Hongyang Du, Geng Sun, Zehui Xiong, Dusit Niyato, Zhu Han

    Abstract: Game theory offers a powerful framework for analyzing strategic interactions among decision-makers, providing tools to model, analyze, and predict their behavior. However, implementing game theory can be challenging due to difficulties in deriving solutions, understanding interactions, and ensuring optimal performance. Traditional non-AI and discriminative AI approaches have made valuable contribu…

    Submitted 13 May, 2024; originally announced May 2024.

  34. arXiv:2405.07140  [pdf, other]

    cs.LG cs.AI cs.NI

    Edge Intelligence Optimization for Large Language Model Inference with Batching and Quantization

    Authors: Xinyuan Zhang, Jiang Liu, Zehui Xiong, Yudong Huang, Gaochang Xie, Ran Zhang

    Abstract: Generative Artificial Intelligence (GAI) is taking the world by storm with its unparalleled content creation ability. Large Language Models (LLMs) are at the forefront of this movement. However, the significant resource demands of LLMs often require cloud hosting, which raises issues regarding privacy, latency, and usage limitations. Although edge intelligence has long been utilized to solve these…

    Submitted 11 May, 2024; originally announced May 2024.

  35. arXiv:2405.07012  [pdf, other]

    cs.CV

    Incorporating Degradation Estimation in Light Field Spatial Super-Resolution

    Authors: Zeyu Xiao, Zhiwei Xiong

    Abstract: Recent advancements in light field super-resolution (SR) have yielded impressive results. In practice, however, many existing methods are limited by assuming fixed degradation models, such as bicubic downsampling, which hinders their robustness in real-world scenarios with complex degradations. To address this limitation, we present LF-DEST, an effective blind Light Field SR method that incorporat…

    Submitted 11 May, 2024; originally announced May 2024.

  36. arXiv:2405.04285  [pdf, other]

    cs.AI eess.SP

    On the Foundations of Earth and Climate Foundation Models

    Authors: Xiao Xiang Zhu, Zhitong Xiong, Yi Wang, Adam J. Stewart, Konrad Heidler, Yuanyuan Wang, Zhenghang Yuan, Thomas Dujardin, Qingsong Xu, Yilei Shi

    Abstract: Foundation models have enormous potential in advancing Earth and climate sciences, however, current approaches may not be optimal as they focus on a few basic features of a desirable Earth and climate foundation model. Crafting the ideal Earth foundation model, we define eleven features which would allow such a foundation model to be beneficial for any geoscientific downstream application in an en…

    Submitted 7 May, 2024; originally announced May 2024.

  37. arXiv:2405.04278  [pdf, other]

    cs.LG cs.RO

    Uncertainty Quantification Metrics for Deep Regression

    Authors: Simon Kristoffersson Lind, Ziliang Xiong, Per-Erik Forssén, Volker Krüger

    Abstract: When deploying deep neural networks on robots or other physical systems, the learned model should reliably quantify predictive uncertainty. A reliable uncertainty allows downstream modules to reason about the safety of its actions. In this work, we address metrics for evaluating such an uncertainty. Specifically, we focus on regression tasks, and investigate Area Under Sparsification Error (AUSE),…

    Submitted 22 May, 2024; v1 submitted 7 May, 2024; originally announced May 2024.

  38. arXiv:2405.04198  [pdf, other]

    cs.CR

    Enhancing Physical Layer Communication Security through Generative AI with Mixture of Experts

    Authors: Changyuan Zhao, Hongyang Du, Dusit Niyato, Jiawen Kang, Zehui Xiong, Dong In Kim, Xuemin Shen, Khaled B. Letaief

    Abstract: AI technologies have become more widely adopted in wireless communications. As an emerging type of AI technologies, the generative artificial intelligence (GAI) gains lots of attention in communication security. Due to its powerful learning ability, GAI models have demonstrated superiority over conventional AI methods. However, GAI still has several limitations, including high computational comple…

    Submitted 7 May, 2024; originally announced May 2024.

    Comments: 9 pages, 4 figures

  39. arXiv:2405.03998  [pdf, other]

    cs.HC cs.CL

    Sketch Then Generate: Providing Incremental User Feedback and Guiding LLM Code Generation through Language-Oriented Code Sketches

    Authors: Chen Zhu-Tian, Zeyu Xiong, Xiaoshuo Yao, Elena Glassman

    Abstract: Crafting effective prompts for code generation or editing with Large Language Models (LLMs) is not an easy task. Particularly, the absence of immediate, stable feedback during prompt crafting hinders effective interaction, as users are left to mentally imagine possible outcomes until the code is generated. In response, we introduce Language-Oriented Code Sketching, an interactive approach that pro…

    Submitted 10 May, 2024; v1 submitted 7 May, 2024; originally announced May 2024.

    Comments: 4 pages

  40. arXiv:2405.01326  [pdf, other]

    cs.CV

    Multi-modal Learnable Queries for Image Aesthetics Assessment

    Authors: Zhiwei Xiong, Yunfan Zhang, Zhiqi Shen, Peiran Ren, Han Yu

    Abstract: Image aesthetics assessment (IAA) is attracting wide interest with the prevalence of social media. The problem is challenging due to its subjective and ambiguous nature. Instead of directly extracting aesthetic features solely from the image, user comments associated with an image could potentially provide complementary knowledge that is useful for IAA. With existing large-scale pre-trained models…

    Submitted 2 May, 2024; originally announced May 2024.

    Comments: Accepted by ICME2024

  41. arXiv:2405.00431  [pdf, other]

    cs.CV

    Detail-Enhancing Framework for Reference-Based Image Super-Resolution

    Authors: Zihan Wang, Ziliang Xiong, Hongying Tang, Xiaobing Yuan

    Abstract: Recent years have witnessed the prosperity of reference-based image super-resolution (Ref-SR). By importing the high-resolution (HR) reference images into the single image super-resolution (SISR) approach, the ill-posed nature of this long-standing field has been alleviated with the assistance of texture transferred from reference images. Although the significant improvement in quantitative and qu…

    Submitted 1 May, 2024; originally announced May 2024.

  42. arXiv:2404.16356  [pdf, other]

    cs.NI cs.AI cs.LG

    Integration of Mixture of Experts and Multimodal Generative AI in Internet of Vehicles: A Survey

    Authors: Minrui Xu, Dusit Niyato, Jiawen Kang, Zehui Xiong, Abbas Jamalipour, Yuguang Fang, Dong In Kim, Xuemin Shen

    Abstract: Generative AI (GAI) can enhance the cognitive, reasoning, and planning capabilities of intelligent modules in the Internet of Vehicles (IoV) by synthesizing augmented datasets, completing sensor data, and making sequential decisions. In addition, the mixture of experts (MoE) can enable the distributed and collaborative execution of AI models without performance degradation between connected vehicl…

    Submitted 25 April, 2024; originally announced April 2024.

  43. arXiv:2404.13898  [pdf, other]

    cs.NI

    Cross-Modal Generative Semantic Communications for Mobile AIGC: Joint Semantic Encoding and Prompt Engineering

    Authors: Yinqiu Liu, Hongyang Du, Dusit Niyato, Jiawen Kang, Zehui Xiong, Shiwen Mao, Ping Zhang, Xuemin Shen

    Abstract: Employing massive Mobile AI-Generated Content (AIGC) Service Providers (MASPs) with powerful models, high-quality AIGC services can become accessible for resource-constrained end users. However, this advancement, referred to as mobile AIGC, also introduces a significant challenge: users should download large AIGC outputs from the MASPs, leading to substantial bandwidth consumption and potential tr…

    Submitted 22 April, 2024; originally announced April 2024.

  44. arXiv:2404.12014  [pdf, other]

    cs.CL cs.CR

    Enhance Robustness of Language Models Against Variation Attack through Graph Integration

    Authors: Zi Xiong, Lizhi Qing, Yangyang Kang, Jiawei Liu, Hongsong Li, Changlong Sun, Xiaozhong Liu, Wei Lu

    Abstract: The widespread use of pre-trained language models (PLMs) in natural language processing (NLP) has greatly improved performance outcomes. However, these models' vulnerability to adversarial attacks (e.g., camouflaged hints from drug dealers), particularly in the Chinese language with its rich character diversity/variation and complex structures, hatches vital apprehension. In this study, we propose…

    Submitted 18 April, 2024; originally announced April 2024.

    Comments: 12 pages, 4 figures, accepted by COLING 2024

  45. arXiv:2404.09134  [pdf, ps, other]

    cs.NI cs.LG

    Generative AI Agents with Large Language Model for Satellite Networks via a Mixture of Experts Transmission

    Authors: Ruichen Zhang, Hongyang Du, Yinqiu Liu, Dusit Niyato, Jiawen Kang, Zehui Xiong, Abbas Jamalipour, Dong In Kim

    Abstract: In response to the needs of 6G global communications, satellite communication networks have emerged as a key solution. However, the large-scale development of satellite communication networks is constrained by the complex system models, whose modeling is challenging for massive users. Moreover, transmission interference between satellites and users seriously affects communication performance. To s…

    Submitted 29 June, 2024; v1 submitted 13 April, 2024; originally announced April 2024.

    Comments: 15 pages, 10 figures

  46. arXiv:2404.08899  [pdf, other]

    cs.NI

    ProSecutor: Protecting Mobile AIGC Services on Two-Layer Blockchain via Reputation and Contract Theoretic Approaches

    Authors: Yinqiu Liu, Hongyang Du, Dusit Niyato, Jiawen Kang, Zehui Xiong, Abbas Jamalipour, Xuemin Shen

    Abstract: Mobile AI-Generated Content (AIGC) has achieved great attention in unleashing the power of generative AI and scaling the AIGC services. By employing numerous Mobile AIGC Service Providers (MASPs), ubiquitous and low-latency AIGC services for clients can be realized. Nonetheless, the interactions between clients and MASPs in public mobile networks, pertaining to three key mechanisms, namely MASP se…

    Submitted 13 April, 2024; originally announced April 2024.

    Comments: 17 pages

  47. arXiv:2404.06997  [pdf, other]

    cs.NI cs.LG

    Agent-driven Generative Semantic Communication with Cross-Modality and Prediction

    Authors: Wanting Yang, Zehui Xiong, Yanli Yuan, Wenchao Jiang, Tony Q. S. Quek, Merouane Debbah

    Abstract: In the era of 6G, with compelling visions of intelligent transportation systems and digital twins, remote surveillance is poised to become a ubiquitous practice. Substantial data volume and frequent updates present challenges in wireless networks. To address these challenges, we propose a novel agent-driven generative semantic communication (A-GSC) framework based on reinforcement learning. In con…

    Submitted 19 July, 2024; v1 submitted 10 April, 2024; originally announced April 2024.

  48. arXiv:2404.06182  [pdf, other]

    cs.NI

    Streamlined Transmission: A Semantic-Aware XR Deployment Framework Enhanced by Generative AI

    Authors: Wanting Yang, Zehui Xiong, Tony Q. S. Quek, Xuemin Shen

    Abstract: In the era of 6G, featuring compelling visions of digital twins and metaverses, Extended Reality (XR) has emerged as a vital conduit connecting the digital and physical realms, garnering widespread interest. Ensuring a fully immersive wireless XR experience stands as a paramount technical necessity, demanding the liberation of XR from the confines of wired connections. In this paper, we first intr…

    Submitted 9 April, 2024; originally announced April 2024.

    Comments: Under review with IEEE Network

  49. arXiv:2404.05962  [pdf, other]

    cs.IR cs.IT

    Wasserstein Dependent Graph Attention Network for Collaborative Filtering with Uncertainty

    Authors: Haoxuan Li, Yuanxin Ouyang, Zhuang Liu, Wenge Rong, Zhang Xiong

    Abstract: Collaborative filtering (CF) is an essential technique in recommender systems that provides personalized recommendations by only leveraging user-item interactions. However, most CF methods represent users and items as fixed points in the latent space, lacking the ability to capture uncertainty. While probabilistic embedding is proposed to integrate uncertainty, they suffer from several limitation…

    Submitted 29 June, 2024; v1 submitted 8 April, 2024; originally announced April 2024.

    Comments: Accepted by IEEE TCSS

  50. CEAR: Comprehensive Event Camera Dataset for Rapid Perception of Agile Quadruped Robots

    Authors: Shifan Zhu, Zixun Xiong, Donghyun Kim

    Abstract: When legged robots perform agile movements, traditional RGB cameras often produce blurred images, posing a challenge for rapid perception. Event cameras have emerged as a promising solution for capturing rapid perception and coping with challenging lighting conditions thanks to their low latency, high temporal resolution, and high dynamic range. However, integrating event cameras into agile-legged…

    Submitted 12 July, 2024; v1 submitted 6 April, 2024; originally announced April 2024.

    Comments: 8 pages, 7 figures

    Journal ref: Robot and Automation Letters, June 2024