Showing 1–50 of 4,061 results for author: Liu, X

Searching in archive cs.
  1. arXiv:2408.10921  [pdf, other]

    cs.AI

    MTFinEval:A Multi-domain Chinese Financial Benchmark with Eurypalynous questions

    Authors: Xinyu Liu, Ke Jin

    Abstract: With the emergence of more and more economy-specific LLMS, how to measure whether they can be safely invested in production becomes a problem. Previous research has primarily focused on evaluating the performance of LLMs within specific application scenarios. However, these benchmarks cannot reflect the theoretical level and generalization ability, and the backward datasets are increasingly unsuit…

    Submitted 20 August, 2024; originally announced August 2024.

  2. arXiv:2408.10852  [pdf, other]

    cs.SD eess.AS

    EELE: Exploring Efficient and Extensible LoRA Integration in Emotional Text-to-Speech

    Authors: Xin Qi, Ruibo Fu, Zhengqi Wen, Jianhua Tao, Shuchen Shi, Yi Lu, Zhiyong Wang, Xiaopeng Wang, Yuankun Xie, Yukun Liu, Guanjun Li, Xuefei Liu, Yongwei Li

    Abstract: In the current era of Artificial Intelligence Generated Content (AIGC), a Low-Rank Adaptation (LoRA) method has emerged. It uses a plugin-based approach to learn new knowledge with lower parameter quantities and computational costs, and it can be plugged in and out based on the specific sub-tasks, offering high flexibility. However, the current application schemes primarily incorporate LoRA into t…

    Submitted 20 August, 2024; originally announced August 2024.

  3. arXiv:2408.10849  [pdf, other]

    cs.SD eess.AS

    A Noval Feature via Color Quantisation for Fake Audio Detection

    Authors: Zhiyong Wang, Xiaopeng Wang, Yuankun Xie, Ruibo Fu, Zhengqi Wen, Jianhua Tao, Yukun Liu, Guanjun Li, Xin Qi, Yi Lu, Xuefei Liu, Yongwei Li

    Abstract: In the field of deepfake detection, previous studies focus on using reconstruction or mask and prediction methods to train pre-trained models, which are then transferred to fake audio detection training where the encoder is used to extract features, such as wav2vec2.0 and Masked Auto Encoder. These methods have proven that using real audio for reconstruction pre-training can better help the model…

    Submitted 20 August, 2024; originally announced August 2024.

    Comments: accepted by ISCSLP2024

  4. arXiv:2408.10679  [pdf, other]

    cs.CV

    DemMamba: Alignment-free Raw Video Demoireing with Frequency-assisted Spatio-Temporal Mamba

    Authors: Shuning Xu, Xina Liu, Binbin Song, Xiangyu Chen, Qiubo Chen, Jiantao Zhou

    Abstract: Moire patterns arise when two similar repetitive patterns interfere, a phenomenon frequently observed during the capture of images or videos on screens. The color, shape, and location of moire patterns may differ across video frames, posing a challenge in learning information from adjacent frames and preserving temporal consistency. Previous video demoireing methods heavily rely on well-designed a…

    Submitted 20 August, 2024; originally announced August 2024.

  5. arXiv:2408.10631  [pdf, other]

    cs.LG cs.AI cs.CL

    LLM-Barber: Block-Aware Rebuilder for Sparsity Mask in One-Shot for Large Language Models

    Authors: Yupeng Su, Ziyi Guan, Xiaoqun Liu, Tianlai Jin, Dongkuan Wu, Graziano Chesi, Ngai Wong, Hao Yu

    Abstract: Large language models (LLMs) have grown significantly in scale, leading to a critical need for efficient model pruning techniques. Existing post-training pruning techniques primarily focus on measuring weight importance on converged dense models to determine salient weights to retain. However, they often overlook the changes in weight importance during the pruning process, which can lead to perfor…

    Submitted 20 August, 2024; originally announced August 2024.

  6. arXiv:2408.10566  [pdf, other]

    cs.LG cs.AI

    SparseGrow: Addressing Growth-Induced Forgetting in Task-Agnostic Continual Learning

    Authors: Yuqing Zhao, Divya Saxena, Jiannong Cao, Xiaoyun Liu, Changlin Song

    Abstract: In continual learning (CL), model growth enhances adaptability over new data, improving knowledge retention for more tasks. However, improper model growth can lead to severe degradation of previously learned knowledge, an issue we name as growth-induced forgetting (GIFt), especially in task-agnostic CL using entire grown model for inference. Existing works, despite adopting model growth and random…

    Submitted 20 August, 2024; originally announced August 2024.

    Comments: This paper has been submitted to the AAAI conference. If accepted, the final version will be updated to reflect the conference proceedings

  7. arXiv:2408.10469  [pdf, other]

    cs.CV cs.IR

    LSVOS Challenge 3rd Place Report: SAM2 and Cutie based VOS

    Authors: Xinyu Liu, Jing Zhang, Kexin Zhang, Xu Liu, Lingling Li

    Abstract: Video Object Segmentation (VOS) presents several challenges, including object occlusion and fragmentation, the dis-appearance and re-appearance of objects, and tracking specific objects within crowded scenes. In this work, we combine the strengths of the state-of-the-art (SOTA) models SAM2 and Cutie to address these challenges. Additionally, we explore the impact of various hyperparameters on vide…

    Submitted 19 August, 2024; originally announced August 2024.

  8. arXiv:2408.09731  [pdf, other]

    eess.IV cs.CV

    Diff2CT: Diffusion Learning to Reconstruct Spine CT from Biplanar X-Rays

    Authors: Zhi Qiao, Xuhui Liu, Xiaopeng Wang, Runkun Liu, Xiantong Zhen, Pei Dong, Zhen Qian

    Abstract: Intraoperative CT imaging serves as a crucial resource for surgical guidance; however, it may not always be readily accessible or practical to implement. In scenarios where CT imaging is not an option, reconstructing CT scans from X-rays can offer a viable alternative. In this paper, we introduce an innovative method for 3D CT reconstruction utilizing biplanar X-rays. Distinct from previous resear…

    Submitted 19 August, 2024; originally announced August 2024.

  9. arXiv:2408.09449  [pdf, other]

    cs.CV cs.LG

    Attention Is Not What You Need: Revisiting Multi-Instance Learning for Whole Slide Image Classification

    Authors: Xin Liu, Weijia Zhang, Min-Ling Zhang

    Abstract: Although attention-based multi-instance learning algorithms have achieved impressive performances on slide-level whole slide image (WSI) classification tasks, they are prone to mistakenly focus on irrelevant patterns such as staining conditions and tissue morphology, leading to incorrect patch-level predictions and unreliable interpretability. Moreover, these attention-based MIL algorithms tend to…

    Submitted 18 August, 2024; originally announced August 2024.

  10. arXiv:2408.09348  [pdf, other]

    cs.CV

    Hyperstroke: A Novel High-quality Stroke Representation for Assistive Artistic Drawing

    Authors: Haoyun Qin, Jian Lin, Hanyuan Liu, Xueting Liu, Chengze Li

    Abstract: Assistive drawing aims to facilitate the creative process by providing intelligent guidance to artists. Existing solutions often fail to effectively model intricate stroke details or adequately address the temporal aspects of drawing. We introduce hyperstroke, a novel stroke representation designed to capture precise fine stroke details, including RGB appearance and alpha-channel opacity. Using a…

    Submitted 18 August, 2024; originally announced August 2024.

    Comments: 11 pages, 10 figures

  11. arXiv:2408.09261  [pdf, other]

    cs.CV

    Adaptify: A Refined Adaptation Scheme for Frame Classification in Atrophic Gastritis Videos

    Authors: Zinan Xiong, Shuijiao Chen, Yizhe Zhang, Yu Cao, Benyuan Liu, Xiaowei Liu

    Abstract: Atrophic gastritis is a significant risk factor for developing gastric cancer. The incorporation of machine learning algorithms can efficiently elevate the possibility of accurately detecting atrophic gastritis. Nevertheless, when the trained model is applied in real-life circumstances, its output is often not consistently reliable. In this paper, we propose Adaptify, an adaptation scheme in which…

    Submitted 17 August, 2024; originally announced August 2024.

    Comments: ISBI 2024 Proceeding

  12. arXiv:2408.08881  [pdf, other]

    eess.IV cs.AI cs.CV

    U-MedSAM: Uncertainty-aware MedSAM for Medical Image Segmentation

    Authors: Xin Wang, Xiaoyu Liu, Peng Huang, Pu Huang, Shu Hu, Hongtu Zhu

    Abstract: Medical Image Foundation Models have proven to be powerful tools for mask prediction across various datasets. However, accurately assessing the uncertainty of their predictions remains a significant challenge. To address this, we propose a new model, U-MedSAM, which integrates the MedSAM model with an uncertainty-aware loss function and the Sharpness-Aware Minimization (SharpMin) optimizer. The un…

    Submitted 3 August, 2024; originally announced August 2024.

  13. arXiv:2408.08859  [pdf, other]

    cs.LG

    Stochastic Bandits Robust to Adversarial Attacks

    Authors: Xuchuang Wang, Jinhang Zuo, Xutong Liu, John C. S. Lui, Mohammad Hajiesmaili

    Abstract: This paper investigates stochastic multi-armed bandit algorithms that are robust to adversarial attacks, where an attacker can first observe the learner's action and then alter their reward observation. We study two cases of this model, with or without the knowledge of an attack budget $C$, defined as an upper bound of the summation of the difference between the actual and altered rewards. For b…

    Submitted 16 August, 2024; originally announced August 2024.

  14. arXiv:2408.08802  [pdf, other]

    cs.CV

    PriorMapNet: Enhancing Online Vectorized HD Map Construction with Priors

    Authors: Rongxuan Wang, Xin Lu, Xiaoyang Liu, Xiaoyi Zou, Tongyi Cao, Ying Li

    Abstract: Online vectorized High-Definition (HD) map construction is crucial for subsequent prediction and planning tasks in autonomous driving. Following MapTR paradigm, recent works have made noteworthy achievements. However, reference points are randomly initialized in mainstream methods, leading to unstable matching between predictions and ground truth. To address this issue, we introduce PriorMapNet to…

    Submitted 20 August, 2024; v1 submitted 16 August, 2024; originally announced August 2024.

  15. arXiv:2408.08739  [pdf, other]

    eess.AS cs.AI cs.SD

    ASVspoof 5: Crowdsourced Speech Data, Deepfakes, and Adversarial Attacks at Scale

    Authors: Xin Wang, Hector Delgado, Hemlata Tak, Jee-weon Jung, Hye-jin Shim, Massimiliano Todisco, Ivan Kukanov, Xuechen Liu, Md Sahidullah, Tomi Kinnunen, Nicholas Evans, Kong Aik Lee, Junichi Yamagishi

    Abstract: ASVspoof 5 is the fifth edition in a series of challenges that promote the study of speech spoofing and deepfake attacks, and the design of detection solutions. Compared to previous challenges, the ASVspoof 5 database is built from crowdsourced data collected from a vastly greater number of speakers in diverse acoustic conditions. Attacks, also crowdsourced, are generated and tested using surrogat…

    Submitted 16 August, 2024; originally announced August 2024.

    Comments: 8 pages, ASVspoof 5 Workshop (Interspeech2024 Satellite)

  16. arXiv:2408.08493  [pdf, other]

    cs.LG stat.ML

    Fishers Harvest Parallel Unlearning in Inherited Model Networks

    Authors: Xiao Liu, Mingyuan Li, Xu Wang, Guangsheng Yu, Wei Ni, Lixiang Li, Haipeng Peng, Renping Liu

    Abstract: Unlearning in various learning frameworks remains challenging, with the continuous growth and updates of models exhibiting complex inheritance relationships. This paper presents a novel unlearning framework, which enables fully parallel unlearning among models exhibiting inheritance. A key enabler is the new Unified Model Inheritance Graph (UMIG), which captures the inheritance using a Directed Ac…

    Submitted 20 August, 2024; v1 submitted 15 August, 2024; originally announced August 2024.

  17. An Unsupervised Learning Framework Combined with Heuristics for the Maximum Minimal Cut Problem

    Authors: Huaiyuan Liu, Xianzhang Liu, Donghua Yang, Hongzhi Wang, Yingchi Long, Mengtong Ji, Dongjing Miao, Zhiyu Liang

    Abstract: The Maximum Minimal Cut Problem (MMCP), a NP-hard combinatorial optimization (CO) problem, has not received much attention due to the demanding and challenging bi-connectivity constraint. Moreover, as a CO problem, it is also a daunting task for machine learning, especially without labeled instances. To deal with these problems, this work proposes an unsupervised learning framework combined with h…

    Submitted 15 August, 2024; originally announced August 2024.

  18. arXiv:2408.08333  [pdf, ps, other]

    cs.SE cs.AI cs.CL

    CodeMirage: Hallucinations in Code Generated by Large Language Models

    Authors: Vibhor Agarwal, Yulong Pei, Salwa Alamir, Xiaomo Liu

    Abstract: Large Language Models (LLMs) have shown promising potentials in program generation and no-code automation. However, LLMs are prone to generate hallucinations, i.e., they generate text which sounds plausible but is incorrect. Although there has been a recent surge in research on LLM hallucinations for text generation, similar hallucination phenomenon can happen in code generation. Sometimes the gen…

    Submitted 14 August, 2024; originally announced August 2024.

    Comments: Accepted at AutoMates @ IJCAI 2024

  19. arXiv:2408.08231  [pdf, other]

    cs.IR

    DaRec: A Disentangled Alignment Framework for Large Language Model and Recommender System

    Authors: Xihong Yang, Heming Jing, Zixing Zhang, Jindong Wang, Huakang Niu, Shuaiqiang Wang, Yu Lu, Junfeng Wang, Dawei Yin, Xinwang Liu, En Zhu, Defu Lian, Erxue Min

    Abstract: Benefiting from the strong reasoning capabilities, Large language models (LLMs) have demonstrated remarkable performance in recommender systems. Various efforts have been made to distill knowledge from LLMs to enhance collaborative models, employing techniques like contrastive learning for representation alignment. In this work, we prove that directly aligning the representations of LLMs and colla…

    Submitted 15 August, 2024; originally announced August 2024.

  20. arXiv:2408.06603  [pdf, other]

    cs.AI

    Simple but Effective Compound Geometric Operations for Temporal Knowledge Graph Completion

    Authors: Rui Ying, Mengting Hu, Jianfeng Wu, Yalan Xie, Xiaoyi Liu, Zhunheng Wang, Ming Jiang, Hang Gao, Linlin Zhang, Renhong Cheng

    Abstract: Temporal knowledge graph completion aims to infer the missing facts in temporal knowledge graphs. Current approaches usually embed factual knowledge into continuous vector space and apply geometric operations to learn potential patterns in temporal knowledge graphs. However, these methods only adopt a single operation, which may have limitations in capturing the complex temporal dynamics present i…

    Submitted 12 August, 2024; originally announced August 2024.

  21. arXiv:2408.06327  [pdf, other]

    cs.AI cs.CL cs.CV

    VisualAgentBench: Towards Large Multimodal Models as Visual Foundation Agents

    Authors: Xiao Liu, Tianjie Zhang, Yu Gu, Iat Long Iong, Yifan Xu, Xixuan Song, Shudan Zhang, Hanyu Lai, Xinyi Liu, Hanlin Zhao, Jiadai Sun, Xinyue Yang, Yu Yang, Zehan Qi, Shuntian Yao, Xueqiao Sun, Siyi Cheng, Qinkai Zheng, Hao Yu, Hanchen Zhang, Wenyi Hong, Ming Ding, Lihang Pan, Xiaotao Gu, Aohan Zeng, et al. (5 additional authors not shown)

    Abstract: Large Multimodal Models (LMMs) have ushered in a new era in artificial intelligence, merging capabilities in both language and vision to form highly capable Visual Foundation Agents. These agents are postulated to excel across a myriad of tasks, potentially approaching general artificial intelligence. However, existing benchmarks fail to sufficiently challenge or showcase the full potential of LMM…

    Submitted 12 August, 2024; originally announced August 2024.

  22. arXiv:2408.05854  [pdf, other]

    stat.ML cs.LG math.ST stat.ME

    On the Robustness of Kernel Goodness-of-Fit Tests

    Authors: Xing Liu, François-Xavier Briol

    Abstract: Goodness-of-fit testing is often criticized for its lack of practical relevance; since "all models are wrong", the null hypothesis that the data conform to our model is ultimately always rejected when the sample size is large enough. Despite this, probabilistic models are still used extensively, raising the more pertinent question of whether the model is good enough for a specific task. This que…

    Submitted 11 August, 2024; originally announced August 2024.

    Comments: 50 pages, 13 figures

  23. arXiv:2408.05849  [pdf, other]

    cs.LG stat.ML

    An End-to-End Model for Time Series Classification In the Presence of Missing Values

    Authors: Pengshuai Yao, Mengna Liu, Xu Cheng, Fan Shi, Huan Li, Xiufeng Liu, Shengyong Chen

    Abstract: Time series classification with missing data is a prevalent issue in time series analysis, as temporal data often contain missing values in practical applications. The traditional two-stage approach, which handles imputation and classification separately, can result in sub-optimal performance as label information is not utilized in the imputation process. On the other hand, a one-stage approach ca…

    Submitted 11 August, 2024; originally announced August 2024.

  24. arXiv:2408.05628  [pdf, other]

    cs.LG cs.AI eess.SY

    Forecasting Day-Ahead Electricity Prices in the Integrated Single Electricity Market: Addressing Volatility with Comparative Machine Learning Methods

    Authors: Ben Harkin, Xueqin Liu

    Abstract: This paper undertakes a comprehensive investigation of electricity price forecasting methods, focused on the Irish Integrated Single Electricity Market, particularly on changes during recent periods of high volatility. The primary objective of this research is to evaluate and compare the performance of various forecasting models, ranging from traditional machine learning models to more complex neu…

    Submitted 10 August, 2024; originally announced August 2024.

  25. arXiv:2408.05160  [pdf, other]

    cs.LG

    Federated Hypergraph Learning with Hyperedge Completion

    Authors: Linfeng Luo, Fengxiao Tang, Xiyu Liu, Zhiqi Guo, Zihao Qiu, Ming Zhao

    Abstract: Hypergraph neural networks enhance conventional graph neural networks by capturing high-order relationships among nodes, which proves vital in data-rich environments where interactions are not merely pairwise. As data complexity and interconnectivity grow, it is common for graph-structured data to be split and stored in a distributed manner, underscoring the necessity of federated learning on subg…

    Submitted 9 August, 2024; originally announced August 2024.

  26. arXiv:2408.05109  [pdf, other]

    cs.DB

    A Survey of NL2SQL with Large Language Models: Where are we, and where are we going?

    Authors: Xinyu Liu, Shuyu Shen, Boyan Li, Peixian Ma, Runzhi Jiang, Yuyu Luo, Yuxin Zhang, Ju Fan, Guoliang Li, Nan Tang

    Abstract: Translating users' natural language queries (NL) into SQL queries (i.e., NL2SQL) can significantly reduce barriers to accessing relational databases and support various commercial applications. The performance of NL2SQL has been greatly enhanced with the emergence of Large Language Models (LLMs). In this survey, we provide a comprehensive review of NL2SQL techniques powered by LLMs, covering its e…

    Submitted 9 August, 2024; originally announced August 2024.

  27. arXiv:2408.05065  [pdf, other]

    cs.LG

    Masked adversarial neural network for cell type deconvolution in spatial transcriptomics

    Authors: Lin Huang, Xiaofei Liu, Shunfang Wang, Wenwen Min

    Abstract: Accurately determining cell type composition in disease-relevant tissues is crucial for identifying disease targets. Most existing spatial transcriptomics (ST) technologies cannot achieve single-cell resolution, making it challenging to accurately determine cell types. To address this issue, various deconvolution methods have been developed. Most of these methods use single-cell RNA sequencing (sc…

    Submitted 9 August, 2024; originally announced August 2024.

  28. arXiv:2408.04912  [pdf, other]

    cs.SD cs.CE cs.ET cs.LG eess.AS

    AcousAF: Acoustic Sensing-Based Atrial Fibrillation Detection System for Mobile Phones

    Authors: Xuanyu Liu, Haoxian Liu, Jiao Li, Zongqi Yang, Yi Huang, Jin Zhang

    Abstract: Atrial fibrillation (AF) is characterized by irregular electrical impulses originating in the atria, which can lead to severe complications and even death. Due to the intermittent nature of the AF, early and timely monitoring of AF is critical for patients to prevent further exacerbation of the condition. Although ambulatory ECG Holter monitors provide accurate monitoring, the high cost of these d…

    Submitted 9 August, 2024; originally announced August 2024.

    Comments: Accepted for publication in Companion of the 2024 ACM International Joint Conference on Pervasive and Ubiquitous Computing (UbiComp Companion '24)

  29. arXiv:2408.04243  [pdf, other]

    cs.CV cs.MM

    MU-MAE: Multimodal Masked Autoencoders-Based One-Shot Learning

    Authors: Rex Liu, Xin Liu

    Abstract: With the exponential growth of multimedia data, leveraging multimodal sensors presents a promising approach for improving accuracy in human activity recognition. Nevertheless, accurately identifying these activities using both video data and wearable sensor data presents challenges due to the labor-intensive data annotation, and reliance on external pretrained models or additional data. To address…

    Submitted 8 August, 2024; originally announced August 2024.

    Comments: IEEE MIPR 2024

  30. arXiv:2408.04144  [pdf, other]

    cs.CV

    Integrated Dynamic Phenological Feature for Remote Sensing Image Land Cover Change Detection

    Authors: Yi Liu, Chenhao Sun, Hao Ye, Xiangying Liu, Weilong Ju

    Abstract: Remote sensing image change detection (CD) is essential for analyzing land surface changes over time, with a significant challenge being the differentiation of actual changes from complex scenes while filtering out pseudo-changes. A primary contributor to this challenge is the intra-class dynamic changes due to phenological characteristics in natural areas. To overcome this, we introduce the InPhe…

    Submitted 7 August, 2024; originally announced August 2024.

  31. arXiv:2408.03910  [pdf, other]

    cs.SE cs.AI cs.CL

    CodexGraph: Bridging Large Language Models and Code Repositories via Code Graph Databases

    Authors: Xiangyan Liu, Bo Lan, Zhiyuan Hu, Yang Liu, Zhicheng Zhang, Fei Wang, Michael Shieh, Wenmeng Zhou

    Abstract: Large Language Models (LLMs) excel in stand-alone code tasks like HumanEval and MBPP, but struggle with handling entire code repositories. This challenge has prompted research on enhancing LLM-codebase interaction at a repository scale. Current solutions rely on similarity-based retrieval or manual tools and APIs, each with notable drawbacks. Similarity-based retrieval often has low recall in comp…

    Submitted 11 August, 2024; v1 submitted 7 August, 2024; originally announced August 2024.

    Comments: work in progress

  32. Dual-Modeling Decouple Distillation for Unsupervised Anomaly Detection

    Authors: Xinyue Liu, Jianyuan Wang, Biao Leng, Shuo Zhang

    Abstract: Knowledge distillation based on student-teacher network is one of the mainstream solution paradigms for the challenging unsupervised Anomaly Detection task, utilizing the difference in representation capabilities of the teacher and student networks to implement anomaly localization. However, over-generalization of the student network to the teacher network may lead to negligible differences in rep…

    Submitted 7 August, 2024; originally announced August 2024.

    Comments: 10 pages, 8 figures, Accepted to ACM MM '24

  33. arXiv:2408.03568  [pdf]

    cs.CV cs.LG eess.IV

    A comparative study of generative adversarial networks for image recognition algorithms based on deep learning and traditional methods

    Authors: Yihao Zhong, Yijing Wei, Yingbin Liang, Xiqing Liu, Rongwei Ji, Yiru Cang

    Abstract: In this paper, an image recognition algorithm based on the combination of deep learning and generative adversarial network (GAN) is studied, and compared with traditional image recognition methods. The purpose of this study is to evaluate the advantages and application prospects of deep learning technology, especially GAN, in the field of image recognition. Firstly, this paper reviews the basic pr…

    Submitted 7 August, 2024; originally announced August 2024.

  34. arXiv:2408.03519  [pdf, other]

    cs.SE cs.AI

    RepoMasterEval: Evaluating Code Completion via Real-World Repositories

    Authors: Qinyun Wu, Chao Peng, Pengfei Gao, Ruida Hu, Haoyu Gan, Bo Jiang, Jinhe Tang, Zhiwen Deng, Zhanming Guan, Cuiyun Gao, Xia Liu, Ping Yang

    Abstract: With the growing reliance on automated code completion tools in software development, the need for robust evaluation benchmarks has become critical. However, existing benchmarks focus more on code generation tasks in function and class level and provide rich text description to prompt the model. By contrast, such descriptive prompt is commonly unavailable in real development and code completion ca…

    Submitted 6 August, 2024; originally announced August 2024.

  35. arXiv:2408.03499  [pdf, other]

    cs.CV

    FacialPulse: An Efficient RNN-based Depression Detection via Temporal Facial Landmarks

    Authors: Ruiqi Wang, Jinyang Huang, Jie Zhang, Xin Liu, Xiang Zhang, Zhi Liu, Peng Zhao, Sigui Chen, Xiao Sun

    Abstract: Depression is a prevalent mental health disorder that significantly impacts individuals' lives and well-being. Early detection and intervention are crucial for effective treatment and management of depression. Recently, there are many end-to-end deep learning methods leveraging the facial expression features for automatic depression detection. However, most current methods overlook the temporal dy…

    Submitted 6 August, 2024; originally announced August 2024.

  36. arXiv:2408.03149  [pdf, other]

    cs.CV cs.CL

    Leveraging Entity Information for Cross-Modality Correlation Learning: The Entity-Guided Multimodal Summarization

    Authors: Yanghai Zhang, Ye Liu, Shiwei Wu, Kai Zhang, Xukai Liu, Qi Liu, Enhong Chen

    Abstract: The rapid increase in multimedia data has spurred advancements in Multimodal Summarization with Multimodal Output (MSMO), which aims to produce a multimodal summary that integrates both text and relevant images. The inherent heterogeneity of content within multimodal inputs and outputs presents a significant challenge to the execution of MSMO. Traditional approaches typically adopt a holistic pers…

    Submitted 6 August, 2024; originally announced August 2024.

    Comments: In ACL-Findings 2024

  37. arXiv:2408.03091  [pdf, other]

    cs.IR

    Modeling User Intent Beyond Trigger: Incorporating Uncertainty for Trigger-Induced Recommendation

    Authors: Jianxing Ma, Zhibo Xiao, Luwei Yang, Hansheng Xue, Xuanzhou Liu, Wen Jiang, Wei Ning, Guannan Zhang

    Abstract: To cater to users' desire for an immersive browsing experience, numerous e-commerce platforms provide various recommendation scenarios, with a focus on Trigger-Induced Recommendation (TIR) tasks. However, the majority of current TIR methods heavily rely on the trigger item to understand user intent, lacking a higher-level exploration and exploitation of user intent (e.g., popular items and complem…

    Submitted 7 August, 2024; v1 submitted 6 August, 2024; originally announced August 2024.

    Comments: Accepted at CIKM 2024

  38. arXiv:2408.02983  [pdf, other]

    cs.CV

    Diffusion Model Meets Non-Exemplar Class-Incremental Learning and Beyond

    Authors: Jichuan Zhang, Yali Li, Xin Liu, Shengjin Wang

    Abstract: Non-exemplar class-incremental learning (NECIL) is to resist catastrophic forgetting without saving old class samples. Prior methodologies generally employ simple rules to generate features for replaying, suffering from large distribution gap between replayed features and real ones. To address the aforementioned issue, we propose a simple, yet effective Diffusion-based Feature …

    Submitted 6 August, 2024; originally announced August 2024.

  39. arXiv:2408.02882  [pdf, other]

    cs.AI cs.CR cs.LG

    Compromising Embodied Agents with Contextual Backdoor Attacks

    Authors: Aishan Liu, Yuguang Zhou, Xianglong Liu, Tianyuan Zhang, Siyuan Liang, Jiakai Wang, Yanjun Pu, Tianlin Li, Junqi Zhang, Wenbo Zhou, Qing Guo, Dacheng Tao

    Abstract: Large language models (LLMs) have transformed the development of embodied intelligence. By providing a few contextual demonstrations, developers can utilize the extensive internal knowledge of LLMs to effortlessly translate complex tasks described in abstract language into sequences of code snippets, which will serve as the execution logic for embodied agents. However, this paper uncovers a signif…

    Submitted 5 August, 2024; originally announced August 2024.

  40. arXiv:2408.02769  [pdf, other]

    cs.CV

    From Recognition to Prediction: Leveraging Sequence Reasoning for Action Anticipation

    Authors: Xin Liu, Chao Hao, Zitong Yu, Huanjing Yue, Jingyu Yang

    Abstract: The action anticipation task refers to predicting what action will happen based on observed videos, which requires the model to have a strong ability to summarize the present and then reason about the future. Experience and common sense suggest that there is a significant correlation between different actions, which provides valuable prior knowledge for the action anticipation task. However, previ…

    Submitted 5 August, 2024; originally announced August 2024.

    Comments: Accepted by ACM TOMM

  41. arXiv:2408.01952  [pdf, other]

    cs.CV

    CACE-Net: Co-guidance Attention and Contrastive Enhancement for Effective Audio-Visual Event Localization

    Authors: Xiang He, Xiangxi Liu, Yang Li, Dongcheng Zhao, Guobin Shen, Qingqun Kong, Xin Yang, Yi Zeng

    Abstract: The audio-visual event localization task requires identifying concurrent visual and auditory events from unconstrained videos within a network model, locating them, and classifying their category. The efficient extraction and integration of audio and visual modal information have always been challenging in this field. In this paper, we introduce CACE-Net, which differs from most existing methods t…

    Submitted 4 August, 2024; originally announced August 2024.

    Comments: Accepted by ACM MM 2024. Code is available at https://github.com/Brain-Cog-Lab/CACE-Net

  42. arXiv:2408.01046  [pdf, other]

    cs.CL

    QUDSELECT: Selective Decoding for Questions Under Discussion Parsing

    Authors: Ashima Suvarna, Xiao Liu, Tanmay Parekh, Kai-Wei Chang, Nanyun Peng

    Abstract: Question Under Discussion (QUD) is a discourse framework that uses implicit questions to reveal discourse relationships between sentences. In QUD parsing, each sentence is viewed as an answer to a question triggered by an anchor sentence in prior context. The resulting QUD structure is required to conform to several theoretical criteria like answer compatibility (how well the question is answered)…

    Submitted 2 August, 2024; originally announced August 2024.

    Comments: 11 Pages, 5 figures

  43. arXiv:2408.01003  [pdf, other]

    cs.AI

    Piculet: Specialized Models-Guided Hallucination Decrease for MultiModal Large Language Models

    Authors: Kohou Wang, Xiang Liu, Zhaoxiang Liu, Kai Wang, Shiguo Lian

    Abstract: Multimodal Large Language Models (MLLMs) have made significant progress in bridging the gap between visual and language modalities. However, hallucinations in MLLMs, where the generated text does not align with image content, continue to be a major challenge. Existing methods for addressing hallucinations often rely on instruction-tuning, which requires retraining the model with specific data, whi…

    Submitted 2 August, 2024; originally announced August 2024.

    Comments: 14 pages, 5 figures

  44. arXiv:2408.00775  [pdf, other]

    cs.LG math.NA

    Dilated convolution neural operator for multiscale partial differential equations

    Authors: Bo Xu, Xinliang Liu, Lei Zhang

    Abstract: This paper introduces a data-driven operator learning method for multiscale partial differential equations, with a particular emphasis on preserving high-frequency information. Drawing inspiration from the representation of multiscale parameterized solutions as a combination of low-rank global bases (such as low-frequency Fourier modes) and localized bases over coarse patches (analogous to dilated…

    Submitted 16 July, 2024; originally announced August 2024.

  45. arXiv:2408.00706  [pdf, other]

    cs.CV cs.AI cs.LG eess.IV physics.med-ph

    Point-supervised Brain Tumor Segmentation with Box-prompted MedSAM

    Authors: Xiaofeng Liu, Jonghye Woo, Chao Ma, Jinsong Ouyang, Georges El Fakhri

    Abstract: Delineating lesions and anatomical structure is important for image-guided interventions. Point-supervised medical image segmentation (PSS) has great potential to alleviate costly expert delineation labeling. However, due to the lack of precise size and boundary guidance, the effectiveness of PSS often falls short of expectations. Although recent vision foundational models, such as the medical seg…

    Submitted 1 August, 2024; originally announced August 2024.

    Comments: 2024 IEEE Nuclear Science Symposium and Medical Imaging Conference

  46. arXiv:2408.00636  [pdf, other]

    cs.CV

    Deep Learning in Medical Image Classification from MRI-based Brain Tumor Images

    Authors: Xiaoyi Liu, Zhuoyue Wang

    Abstract: Brain tumors are among the deadliest diseases in the world. Magnetic Resonance Imaging (MRI) is one of the most effective ways to detect brain tumors. Accurate detection of brain tumors based on MRI scans is critical, as it can potentially save many lives and facilitate better decision-making at the early stages of the disease. Within our paper, four different types of MRI-based images have been c…

    Submitted 1 August, 2024; originally announced August 2024.

  47. arXiv:2408.00332  [pdf, other]

    cs.CV cs.RO

    Vision-based Wearable Steering Assistance for People with Impaired Vision in Jogging

    Authors: Xiaotong Liu, Binglu Wang, Zhijun Li

    Abstract: Outdoor sports pose a challenge for people with impaired vision. The demand for higher-speed mobility inspired us to develop a vision-based wearable steering assistance. To ensure broad applicability, we focused on a representative sports environment, the athletics track. Our efforts centered on improving the speed and accuracy of perception, enhancing planning adaptability for the real world, and…

    Submitted 1 August, 2024; originally announced August 2024.

    Comments: Accepted to ICRA 2024

  48. arXiv:2408.00278  [pdf, other]

    cs.LG cs.AI cs.NE

    High Performance Im2win and Direct Convolutions using Three Tensor Layouts on SIMD Architectures

    Authors: Xiang Fu, Xinpeng Zhang, Jixiang Ma, Peng Zhao, Shuai Lu, Xu T. Liu

    Abstract: Convolution is the core component within deep neural networks and it is computationally intensive and time consuming. Tensor data layouts significantly impact convolution operations in terms of memory access and computational efficiency. Yet, there is still a lack of comprehensive performance characterization on data layouts on SIMD architectures concerning convolution methods. This paper proposes…

    Submitted 1 August, 2024; originally announced August 2024.

  49. arXiv:2408.00243  [pdf, other]

    cs.CR cs.CC

    A Survey on the Applications of Zero-Knowledge Proofs

    Authors: Ryan Lavin, Xuekai Liu, Hardhik Mohanty, Logan Norman, Giovanni Zaarour, Bhaskar Krishnamachari

    Abstract: Zero-knowledge proofs (ZKPs) represent a revolutionary advance in computational integrity and privacy technology, enabling the secure and private exchange of information without revealing underlying private data. ZKPs have unique advantages in terms of universality and minimal security assumptions when compared to other privacy-sensitive computational methods for distributed systems, such as homom…

    Submitted 31 July, 2024; originally announced August 2024.

  50. arXiv:2407.21593  [pdf, other]

    cs.HC

    LLM-for-X: Application-agnostic Integration of Large Language Models to Support Personal Writing Workflows

    Authors: Lukas Teufelberger, Xintong Liu, Zhipeng Li, Max Moebus, Christian Holz

    Abstract: To enhance productivity and to streamline workflows, there is a growing trend to embed large language model (LLM) functionality into applications, from browser-based web apps to native apps that run on personal computers. Here, we introduce LLM-for-X, a system-wide shortcut layer that seamlessly augments any application with LLM services through a lightweight popup dialog. Our native layer seamles…

    Submitted 31 July, 2024; originally announced July 2024.