Showing 1–50 of 2,999 results for author: Chen, H

Searching in archive cs.
  1. arXiv:2408.11811  [pdf, other]

    cs.CV cs.RO

    EmbodiedSAM: Online Segment Any 3D Thing in Real Time

    Authors: Xiuwei Xu, Huangxing Chen, Linqing Zhao, Ziwei Wang, Jie Zhou, Jiwen Lu

    Abstract: Embodied tasks require the agent to fully understand 3D scenes simultaneously with its exploration, so an online, real-time, fine-grained and highly-generalized 3D perception model is desperately needed. Since high-quality 3D data is limited, directly training such a model in 3D is almost infeasible. Meanwhile, vision foundation models (VFM) have revolutionized the field of 2D computer vision with…

    Submitted 21 August, 2024; originally announced August 2024.

    Comments: Project page: https://xuxw98.github.io/ESAM/

  2. arXiv:2408.11540  [pdf, other]

    cs.CV

    DeRainGS: Gaussian Splatting for Enhanced Scene Reconstruction in Rainy

    Authors: Shuhong Liu, Xiang Chen, Hongming Chen, Quanfeng Xu, Mingrui Li

    Abstract: Reconstruction under adverse rainy conditions poses significant challenges due to reduced visibility and the distortion of visual perception. These conditions can severely impair the quality of geometric maps, which is essential for applications ranging from autonomous planning to environmental monitoring. In response to these challenges, this study introduces the novel task of 3D Reconstruction i…

    Submitted 21 August, 2024; originally announced August 2024.

  3. arXiv:2408.11338  [pdf, other]

    cs.AI cs.LG

    Automatic Dataset Construction (ADC): Sample Collection, Data Curation, and Beyond

    Authors: Minghao Liu, Zonglin Di, Jiaheng Wei, Zhongruo Wang, Hengxiang Zhang, Ruixuan Xiao, Haoyu Wang, Jinlong Pang, Hao Chen, Ankit Shah, Hongxin Wei, Xinlei He, Zhaowei Zhao, Haobo Wang, Lei Feng, Jindong Wang, James Davis, Yang Liu

    Abstract: Large-scale data collection is essential for developing personalized training data, mitigating the shortage of training data, and fine-tuning specialized models. However, creating high-quality datasets quickly and accurately remains a challenge due to annotation errors and the substantial time and costs associated with human labor. To address these issues, we propose Automatic Dataset Construction (A…

    Submitted 21 August, 2024; originally announced August 2024.

  4. arXiv:2408.11290  [pdf, other]

    eess.SP cs.IT

    Privacy Preservation in Delay-Based Localization Systems: Artificial Noise or Artificial Multipath?

    Authors: Yuchen Zhang, Hui Chen, Henk Wymeersch

    Abstract: Localization plays an increasingly pivotal role in 5G/6G systems, enabling various applications. This paper focuses on the privacy concerns associated with delay-based localization, where unauthorized base stations attempt to infer the location of the end user. We propose a method to disrupt localization at unauthorized nodes by injecting artificial components into the pilot signal, exploiting mod…

    Submitted 20 August, 2024; originally announced August 2024.

    Comments: 6 pages, conference paper

  5. arXiv:2408.10948  [pdf, other]

    cs.LG cs.AI

    GAIM: Attacking Graph Neural Networks via Adversarial Influence Maximization

    Authors: Xiaodong Yang, Xiaoting Li, Huiyuan Chen, Yiwei Cai

    Abstract: Recent studies show that well-devised perturbations on graph structures or node features can mislead trained Graph Neural Network (GNN) models. However, these methods often overlook practical assumptions, over-rely on heuristics, or separate vital attack components. In response, we present GAIM, an integrated adversarial attack method conducted on a node feature basis while considering the strict…

    Submitted 20 August, 2024; originally announced August 2024.

  6. arXiv:2408.10854  [pdf, other]

    physics.ao-ph cs.AI cs.CV

    MambaDS: Near-Surface Meteorological Field Downscaling with Topography Constrained Selective State Space Modeling

    Authors: Zili Liu, Hao Chen, Lei Bai, Wenyuan Li, Wanli Ouyang, Zhengxia Zou, Zhenwei Shi

    Abstract: In an era of frequent extreme weather and global warming, obtaining precise, fine-grained near-surface weather forecasts is increasingly essential for human activities. Downscaling (DS), a crucial task in meteorological forecasting, enables the reconstruction of high-resolution meteorological states for target regions from global-scale forecast results. Previous downscaling methods, inspired by CN…

    Submitted 20 August, 2024; originally announced August 2024.

  7. arXiv:2408.10777  [pdf, other]

    cs.CV cs.AI

    Just a Hint: Point-Supervised Camouflaged Object Detection

    Authors: Huafeng Chen, Dian Shao, Guangqian Guo, Shan Gao

    Abstract: Camouflaged Object Detection (COD) demands models to expeditiously and accurately distinguish objects which conceal themselves seamlessly in the environment. Owing to the subtle differences and ambiguous boundaries, COD is not only a remarkably challenging task for models but also for human annotators, requiring huge efforts to provide pixel-wise annotations. To alleviate the heavy annotation burd…

    Submitted 20 August, 2024; originally announced August 2024.

    Comments: Accepted by ECCV2024

  8. arXiv:2408.10760  [pdf, other]

    cs.CV cs.AI

    SAM-COD: SAM-guided Unified Framework for Weakly-Supervised Camouflaged Object Detection

    Authors: Huafeng Chen, Pengxu Wei, Guangqian Guo, Shan Gao

    Abstract: Most Camouflaged Object Detection (COD) methods heavily rely on mask annotations, which are time-consuming and labor-intensive to acquire. Existing weakly-supervised COD approaches exhibit significantly inferior performance compared to fully-supervised methods and struggle to simultaneously support all the existing types of camouflaged object labels, including scribbles, bounding boxes, and points…

    Submitted 20 August, 2024; originally announced August 2024.

    Comments: Accepted by ECCV2024

  9. arXiv:2408.10642  [pdf, other]

    cs.AI cs.CL

    Minor SFT loss for LLM fine-tune to increase performance and reduce model deviation

    Authors: Shiming Xie, Hong Chen, Fred Yu, Zeye Sun, Xiuyu Wu

    Abstract: Instruct LLMs provide a paradigm used in large-scale language models to align LLMs with human preference. The paradigm comprises supervised fine-tuning and reinforcement learning from human feedback. This paradigm is also used in downstream scenarios to adapt LLMs to specific corpora and applications. Compared to SFT, many efforts have focused on RLHF, and several algorithms have been proposed, such as PPO,…

    Submitted 20 August, 2024; originally announced August 2024.

    Comments: 8 pages, 5 figures

  10. arXiv:2408.10600  [pdf]

    cs.CV cs.AI

    Breast tumor classification based on self-supervised contrastive learning from ultrasound videos

    Authors: Yunxin Tang, Siyuan Tang, Jian Zhang, Hao Chen

    Abstract: Background: Breast ultrasound is prominently used in diagnosing breast tumors. At present, many automatic systems based on deep learning have been developed to help radiologists in diagnosis. However, training such systems remains challenging because they are usually data-hungry and demand large amounts of labeled data, which require professional knowledge and are expensive. Methods: We adopted a triplet n…

    Submitted 20 August, 2024; originally announced August 2024.

  11. arXiv:2408.10005  [pdf, ps, other]

    cs.IT

    Optimal Few-GHW Linear Codes and Their Subcode Support Weight Distributions

    Authors: Xu Pan, Hao Chen, Hongwei Liu, Shengwei Liu

    Abstract: Few-weight codes have been constructed and studied for many years because of their fascinating relations to finite geometries, strongly regular graphs and Boolean functions. Simplex codes are one-weight Griesmer $[\frac{q^k-1}{q-1},k ,q^{k-1}]_q$-linear codes and they meet all Griesmer bounds of the generalized Hamming weights of linear codes. All the subcodes with dimension $r$ of a…

    Submitted 19 August, 2024; originally announced August 2024.

  12. arXiv:2408.09951  [pdf]

    cs.AI eess.SP

    Principle Driven Parameterized Fiber Model based on GPT-PINN Neural Network

    Authors: Yubin Zang, Boyu Hua, Zhenzhou Tang, Zhipeng Lin, Fangzheng Zhang, Simin Li, Zuxing Zhang, Hongwei Chen

    Abstract: To cater to the needs of Beyond 5G communications, large numbers of data-driven, artificial-intelligence-based fiber models have been put forward to utilize artificial intelligence's regression ability to predict pulse evolution in fiber transmission at a much faster speed than the traditional split-step Fourier method. In order to increase the physical interpretability, principle driven fibe…

    Submitted 19 August, 2024; originally announced August 2024.

  13. arXiv:2408.09947  [pdf]

    cs.AI eess.SP

    Fiber Transmission Model with Parameterized Inputs based on GPT-PINN Neural Network

    Authors: Yubin Zang, Boyu Hua, Zhipeng Lin, Fangzheng Zhang, Simin Li, Zuxing Zhang, Hongwei Chen

    Abstract: In this manuscript, a novel principle-driven fiber transmission model for short-distance transmission with parameterized inputs is put forward. By taking into account the previously proposed principle-driven fiber model, the reduced basis expansion method and transforming the parameterized inputs into parameterized coefficients of the Nonlinear Schrodinger Equations, universal solutions w…

    Submitted 19 August, 2024; originally announced August 2024.

  14. arXiv:2408.09834  [pdf, other]

    cs.AI

    Minor DPO reject penalty to increase training robustness

    Authors: Shiming Xie, Hong Chen, Fred Yu, Zeye Sun, Xiuyu Wu, Yingfan Hu

    Abstract: Learning from human preference is a paradigm used in the large-scale language model (LLM) fine-tuning step to better align a pretrained LLM with human preference for downstream tasks. In the past, it used reinforcement learning from human feedback (RLHF) algorithms to optimize the LLM policy to align with these preferences and not to drift too far from the original model. Recently, Direct Preference Optimiza…

    Submitted 19 August, 2024; originally announced August 2024.

    Comments: 8 pages, 19 figures

  15. arXiv:2408.09815  [pdf, other]

    cs.LG cs.HC

    A Population-to-individual Tuning Framework for Adapting Pretrained LM to On-device User Intent Prediction

    Authors: Jiahui Gong, Jingtao Ding, Fanjin Meng, Guilong Chen, Hong Chen, Shen Zhao, Haisheng Lu, Yong Li

    Abstract: Mobile devices, especially smartphones, can support rich functions and have developed into indispensable tools in daily life. With the rise of generative AI services, smartphones can potentially transform into personalized assistants, anticipating user needs and scheduling services accordingly. Predicting user intents on smartphones, and reflecting anticipated activities based on past interactions…

    Submitted 19 August, 2024; originally announced August 2024.

    Comments: accepted by KDD 2024

  16. arXiv:2408.09199  [pdf, other]

    cs.IR

    TC-RAG:Turing-Complete RAG's Case study on Medical LLM Systems

    Authors: Xinke Jiang, Yue Fang, Rihong Qiu, Haoyu Zhang, Yongxin Xu, Hao Chen, Wentao Zhang, Ruizhe Zhang, Yuchen Fang, Xu Chu, Junfeng Zhao, Yasha Wang

    Abstract: In the pursuit of enhancing domain-specific Large Language Models (LLMs), Retrieval-Augmented Generation (RAG) emerges as a promising solution to mitigate issues such as hallucinations, outdated knowledge, and limited expertise in highly specialized queries. However, existing approaches to RAG fall short by neglecting system state variables, which are crucial for ensuring adaptive control, retriev…

    Submitted 17 August, 2024; originally announced August 2024.

    Comments: version 1.0

  17. arXiv:2408.09027  [pdf, other]

    cs.SD cs.AI eess.AS

    Efficient Autoregressive Audio Modeling via Next-Scale Prediction

    Authors: Kai Qiu, Xiang Li, Hao Chen, Jie Sun, Jinglu Wang, Zhe Lin, Marios Savvides, Bhiksha Raj

    Abstract: Audio generation has achieved remarkable progress with the advance of sophisticated generative models, such as diffusion models (DMs) and autoregressive (AR) models. However, due to the naturally significant sequence length of audio, the efficiency of audio generation remains an essential issue to be addressed, especially for AR models that are incorporated in large language models (LLMs). In this…

    Submitted 16 August, 2024; originally announced August 2024.

    Comments: 7 pages, 6 figures, 7 tables

  18. arXiv:2408.08796  [pdf, ps, other]

    cs.IT eess.SP

    Multi-Antenna Broadband Backscatter Communications

    Authors: Hao Chen, Zhizhi Huang, Ying-Chang Liang, Robert Schober

    Abstract: Backscatter communication offers a promising solution to connect massive Internet-of-Things (IoT) devices with low cost and high energy efficiency. Nevertheless, its inherently passive nature limits transmission reliability, thereby hindering improvements in communication range and data rate. To overcome these challenges, we introduce a bistatic broadband backscatter communication (BBBC) system, w…

    Submitted 16 August, 2024; originally announced August 2024.

  19. arXiv:2408.08456  [pdf, other]

    eess.IV cs.AI cs.CV cs.LG

    Efficient Data-Sketches and Fine-Tuning for Early Detection of Distributional Drift in Medical Imaging

    Authors: Yusen Wu, Hao Chen, Alex Pissinou Makki, Phuong Nguyen, Yelena Yesha

    Abstract: Distributional drift detection is important in medical applications as it helps ensure the accuracy and reliability of models by identifying changes in the underlying data distribution that could affect diagnostic or treatment decisions. However, current methods have limitations in detecting drift; for example, the inclusion of abnormal datasets can lead to unfair comparisons. This paper presents…

    Submitted 15 August, 2024; originally announced August 2024.

  20. arXiv:2408.07869  [pdf, other]

    cs.LG

    A Systematic Evaluation of Generated Time Series and Their Effects in Self-Supervised Pretraining

    Authors: Audrey Der, Chin-Chia Michael Yeh, Xin Dai, Huiyuan Chen, Yan Zheng, Yujie Fan, Zhongfang Zhuang, Vivian Lai, Junpeng Wang, Liang Wang, Wei Zhang, Eamonn Keogh

    Abstract: Self-supervised Pretrained Models (PTMs) have demonstrated remarkable performance in computer vision and natural language processing tasks. These successes have prompted researchers to design PTMs for time series data. In our experiments, most self-supervised time series PTMs were surpassed by simple supervised models. We hypothesize this undesired phenomenon may be caused by data scarcity. In res…

    Submitted 14 August, 2024; originally announced August 2024.

    Comments: To appear in CIKM 2024 as a short paper; the version here is the self-contained version that includes the non-mandatory supplementary material available on the paper's companion website

  21. arXiv:2408.07736  [pdf, other]

    cs.LG cs.AI

    Enhancing Model Interpretability with Local Attribution over Global Exploration

    Authors: Zhiyu Zhu, Zhibo Jin, Jiayu Zhang, Huaming Chen

    Abstract: In the field of artificial intelligence, AI models are frequently described as 'black boxes' due to the obscurity of their internal mechanisms. This has ignited research interest in model interpretability, especially in attribution methods that offer precise explanations of model decisions. Current attribution algorithms typically evaluate the importance of each parameter by exploring the sample sp…

    Submitted 14 August, 2024; originally announced August 2024.

    Comments: Accepted by ACMMM 2024

  22. arXiv:2408.07636  [pdf, other]

    q-bio.QM cs.AI cs.LG

    Drug Discovery SMILES-to-Pharmacokinetics Diffusion Models with Deep Molecular Understanding

    Authors: Bing Hu, Anita Layton, Helen Chen

    Abstract: Artificial intelligence (AI) is increasingly used in every stage of drug development. One challenge facing drug discovery AI is that drug pharmacokinetic (PK) datasets are often collected independently from each other, often with limited overlap, creating data overlap sparsity. Data sparsity makes data curation difficult for researchers looking to answer research questions in poly-pharmacy, drug c…

    Submitted 14 August, 2024; originally announced August 2024.

    Comments: 13 pages, 5 figures, 4 tables

  23. arXiv:2408.07476  [pdf, other]

    cs.CV

    One Step Diffusion-based Super-Resolution with Time-Aware Distillation

    Authors: Xiao He, Huaao Tang, Zhijun Tu, Junchao Zhang, Kun Cheng, Hanting Chen, Yong Guo, Mingrui Zhu, Nannan Wang, Xinbo Gao, Jie Hu

    Abstract: Diffusion-based image super-resolution (SR) methods have shown promise in reconstructing high-resolution images with fine details from low-resolution counterparts. However, these approaches typically require tens or even hundreds of iterative samplings, resulting in significant latency. Recently, techniques have been devised to enhance the sampling efficiency of diffusion-based SR models via knowl…

    Submitted 14 August, 2024; originally announced August 2024.

    Comments: 18 pages

  24. arXiv:2408.07422  [pdf, other]

    cs.CV cs.AI

    LLMI3D: Empowering LLM with 3D Perception from a Single 2D Image

    Authors: Fan Yang, Sicheng Zhao, Yanhao Zhang, Haoxiang Chen, Hui Chen, Wenbo Tang, Haonan Lu, Pengfei Xu, Zhenyu Yang, Jungong Han, Guiguang Ding

    Abstract: Recent advancements in autonomous driving, augmented reality, robotics, and embodied intelligence have necessitated 3D perception algorithms. However, current 3D perception methods, particularly small models, struggle with processing logical reasoning, question-answering, and handling open scenario categories. On the other hand, generative multimodal large language models (MLLMs) excel in general…

    Submitted 14 August, 2024; originally announced August 2024.

  25. arXiv:2408.07325  [pdf, other]

    eess.IV cs.GR

    RoCoSDF: Row-Column Scanned Neural Signed Distance Fields for Freehand 3D Ultrasound Imaging Shape Reconstruction

    Authors: Hongbo Chen, Yuchong Gao, Shuhang Zhang, Jiangjie Wu, Yuexin Ma, Rui Zheng

    Abstract: The reconstruction of high-quality shape geometry is crucial for developing freehand 3D ultrasound imaging. However, the shape reconstruction of multi-view ultrasound data remains challenging due to the elevation distortion caused by thick transducer probes. In this paper, we present a novel learning-based framework RoCoSDF, which can effectively generate an implicit surface through continuous sha…

    Submitted 14 August, 2024; originally announced August 2024.

    Comments: Accepted by MICCAI 2024

  26. arXiv:2408.07146  [pdf, other]

    cs.CV cs.AI

    Vision Language Model for Interpretable and Fine-grained Detection of Safety Compliance in Diverse Workplaces

    Authors: Zhiling Chen, Hanning Chen, Mohsen Imani, Ruimin Chen, Farhad Imani

    Abstract: Workplace accidents due to personal protective equipment (PPE) non-compliance raise serious safety concerns and lead to legal liabilities, financial penalties, and reputational damage. While object detection models have shown the capability to address this issue by identifying safety items, most existing models, such as YOLO, Faster R-CNN, and SSD, are limited in verifying the fine-grained attribu…

    Submitted 13 August, 2024; originally announced August 2024.

    Comments: 20 pages, 7 figures

  27. Learning Rule-Induced Subgraph Representations for Inductive Relation Prediction

    Authors: Tianyu Liu, Qitan Lv, Jie Wang, Shuling Yang, Hanzhu Chen

    Abstract: Inductive relation prediction (IRP) -- where entities can be different during training and inference -- has shown great power for completing evolving knowledge graphs. Existing works mainly focus on using graph neural networks (GNNs) to learn the representation of the subgraph induced from the target link, which can be seen as an implicit rule-mining process to measure the plausibility of the targ…

    Submitted 20 August, 2024; v1 submitted 8 August, 2024; originally announced August 2024.

    Journal ref: Advances in Neural Information Processing Systems 36 (2024)

  28. arXiv:2408.06788  [pdf, other]

    cs.CV cs.HC

    Visual Neural Decoding via Improved Visual-EEG Semantic Consistency

    Authors: Hongzhou Chen, Lianghua He, Yihang Liu, Longzhen Yang

    Abstract: Visual neural decoding refers to the process of extracting and interpreting original visual experiences from human brain activity. Recent advances in metric learning-based EEG visual decoding methods have delivered promising results and demonstrated the feasibility of decoding novel visual categories from brain activity. However, methods that directly map EEG features to the CLIP embedding space m…

    Submitted 13 August, 2024; originally announced August 2024.

  29. arXiv:2408.06361  [pdf, other]

    q-fin.TR cs.CL

    Large Language Model Agent in Financial Trading: A Survey

    Authors: Han Ding, Yinheng Li, Junhao Wang, Hang Chen

    Abstract: Trading is a highly competitive task that requires a combination of strategy, knowledge, and psychological fortitude. With the recent success of large language models (LLMs), it is appealing to apply the emerging intelligence of LLM agents in this competitive arena and to understand whether they can outperform professional traders. In this survey, we provide a comprehensive review of the current researc…

    Submitted 26 July, 2024; originally announced August 2024.

  30. arXiv:2408.05683  [pdf, other]

    cs.CV cs.MM

    Single Image Dehazing Using Scene Depth Ordering

    Authors: Pengyang Ling, Huaian Chen, Xiao Tan, Yimeng Shan, Yi Jin

    Abstract: Images captured in hazy weather generally suffer from quality degradation, and many dehazing methods have been developed to solve this problem. However, the single image dehazing problem is still challenging due to its ill-posed nature. In this paper, we propose a depth order guided single image dehazing method, which utilizes depth order in hazy images to guide the dehazing process to achieve a simil…

    Submitted 10 August, 2024; originally announced August 2024.

    Comments: 14 pages, 15 figures

  31. arXiv:2408.05617  [pdf, other]

    cs.LG cs.AI cs.CV cs.DC cs.IT

    Residual-INR: Communication Efficient On-Device Learning Using Implicit Neural Representation

    Authors: Hanqiu Chen, Xuebin Yao, Pradeep Subedi, Cong Hao

    Abstract: Edge computing is a distributed computing paradigm that collects and processes data at or near the source of data generation. The on-device learning at edge relies on device-to-device wireless communication to facilitate real-time data sharing and collaborative decision-making among multiple devices. This significantly improves the adaptability of the edge computing system to the changing environm…

    Submitted 10 August, 2024; originally announced August 2024.

    Comments: This paper has been accepted by ICCAD 2024

  32. arXiv:2408.05614  [pdf, other]

    cs.AR cs.ET eess.SY

    ICGMM: CXL-enabled Memory Expansion with Intelligent Caching Using Gaussian Mixture Model

    Authors: Hanqiu Chen, Yitu Wang, Luis Vitorio Cargnini, Mohammadreza Soltaniyeh, Dongyang Li, Gongjin Sun, Pradeep Subedi, Andrew Chang, Yiran Chen, Cong Hao

    Abstract: Compute Express Link (CXL) emerges as a solution for the wide gap between computational speed and data communication rates among host and multiple devices. It fosters a unified and coherent memory space between host and CXL storage devices such as Solid-state drive (SSD) for memory expansion, with a corresponding DRAM implemented as the device cache. However, this introduces challenges such as…

    Submitted 10 August, 2024; originally announced August 2024.

    Comments: This paper is accepted by DAC2024

  33. arXiv:2408.05524  [pdf, other]

    cs.CL cs.DB

    Context-Driven Index Trimming: A Data Quality Perspective to Enhancing Precision of RALMs

    Authors: Kexin Ma, Ruochun Jin, Xi Wang, Huan Chen, Jing Ren, Yuhua Tang

    Abstract: Retrieval-Augmented Large Language Models (RALMs) have made significant strides in enhancing the accuracy of generated responses. However, existing research often overlooks the data quality issues within retrieval results, often caused by inaccurate existing vector-distance-based retrieval methods. We propose to boost the precision of RALMs' answers from a data quality perspective through the Contex…

    Submitted 10 August, 2024; originally announced August 2024.

  34. arXiv:2408.05426  [pdf, other]

    cs.CV

    SAM-FNet: SAM-Guided Fusion Network for Laryngo-Pharyngeal Tumor Detection

    Authors: Jia Wei, Yun Li, Meiyu Qiu, Hongyu Chen, Xiaomao Fan, Wenbin Lei

    Abstract: Laryngo-pharyngeal cancer (LPC) is a highly fatal malignant disease affecting the head and neck region. Previous studies on endoscopic tumor detection, particularly those leveraging dual-branch network architectures, have shown significant advancements in tumor detection. These studies highlight the potential of dual-branch networks in improving diagnostic accuracy by effectively integrating globa…

    Submitted 14 August, 2024; v1 submitted 10 August, 2024; originally announced August 2024.

  35. arXiv:2408.03867  [pdf, other]

    cs.CV

    Surgformer: Surgical Transformer with Hierarchical Temporal Attention for Surgical Phase Recognition

    Authors: Shu Yang, Luyang Luo, Qiong Wang, Hao Chen

    Abstract: Existing state-of-the-art methods for surgical phase recognition either rely on the extraction of spatial-temporal features at a short-range temporal resolution or adopt the sequential extraction of the spatial and temporal features across the entire temporal resolution. However, these methods have limitations in modeling spatial-temporal dependency and addressing spatial-temporal redundancy: 1) T…

    Submitted 7 August, 2024; originally announced August 2024.

  36. arXiv:2408.02479  [pdf, other]

    cs.SE cs.AI cs.CL

    From LLMs to LLM-based Agents for Software Engineering: A Survey of Current, Challenges and Future

    Authors: Haolin Jin, Linghan Huang, Haipeng Cai, Jun Yan, Bo Li, Huaming Chen

    Abstract: With the rise of large language models (LLMs), researchers are increasingly exploring their applications in various vertical domains, such as software engineering. LLMs have achieved remarkable success in areas including code generation and vulnerability detection. However, they also exhibit numerous limitations and shortcomings. LLM-based agents, a novel technology with the potential for Artifi…

    Submitted 5 August, 2024; originally announced August 2024.

  37. arXiv:2408.02404  [pdf, other]

    cs.IR

    Feedback Reciprocal Graph Collaborative Filtering

    Authors: Weijun Chen, Yuanchen Bei, Qijie Shen, Hao Chen, Xiao Huang, Feiran Huang

    Abstract: Collaborative filtering on user-item interaction graphs has achieved success in the industrial recommendation. However, recommending users' truly fascinated items poses a seesaw dilemma for collaborative filtering models learned from the interaction graph. On the one hand, not all items that users interact with are equally appealing. Some items are genuinely fascinating to users, while others are…

    Submitted 5 August, 2024; originally announced August 2024.

    Comments: 9 pages, accepted by CIKM 2024

  38. arXiv:2408.02285  [pdf, other]

    cs.CV

    Joint-Motion Mutual Learning for Pose Estimation in Videos

    Authors: Sifan Wu, Haipeng Chen, Yifang Yin, Sihao Hu, Runyang Feng, Yingying Jiao, Ziqi Yang, Zhenguang Liu

    Abstract: Human pose estimation in videos has long been a compelling yet challenging task within the realm of computer vision. Nevertheless, this task remains difficult because of the complex video scenes, such as video defocus and self-occlusion. Recent methods strive to integrate multi-frame visual features generated by a backbone network for pose estimation. However, they often ignore the useful joint in…

    Submitted 5 August, 2024; originally announced August 2024.

    Comments: 10 pages, 5 figures

  39. arXiv:2408.02265  [pdf, other]

    cs.CV

    Explain via Any Concept: Concept Bottleneck Model with Open Vocabulary Concepts

    Authors: Andong Tan, Fengtao Zhou, Hao Chen

    Abstract: The concept bottleneck model (CBM) is an interpretable-by-design framework that makes decisions by first predicting a set of interpretable concepts, and then predicting the class label based on the given concepts. Existing CBMs are trained with a fixed set of concepts (concepts are either annotated by the dataset or queried from language models). However, this closed-world assumption is unrealisti…

    Submitted 5 August, 2024; originally announced August 2024.

    Comments: ECCV2024

  40. arXiv:2408.02213  [pdf, other]

    cs.DB cs.AI

    Is Large Language Model Good at Database Knob Tuning? A Comprehensive Experimental Evaluation

    Authors: Yiyan Li, Haoyang Li, Zhao Pu, Jing Zhang, Xinyi Zhang, Tao Ji, Luming Sun, Cuiping Li, Hong Chen

    Abstract: Knob tuning plays a crucial role in optimizing databases by adjusting knobs to enhance database performance. However, traditional tuning methods often follow a Try-Collect-Adjust approach, proving inefficient and database-specific. Moreover, these methods are often opaque, making it challenging for DBAs to grasp the underlying decision-making process. The emergence of large language models (LLMs…

    Submitted 4 August, 2024; originally announced August 2024.

  41. X.509 Information Security Certification Based on Post-Quantum Cryptography

    Authors: Abel C. H. Chen

    Abstract: In recent years, with the advancement of quantum computing, mainstream asymmetric cryptographic methods in the current Public Key Infrastructure (PKI) systems are gradually being threatened. Therefore, this study explores X.509 security certificates based on Post-Quantum Cryptography (PQC) and discusses implemented solutions. This study compares mainstream asymmetric cryptographic methods (includi…

    Submitted 4 August, 2024; originally announced August 2024.

    Comments: The manuscript was submitted to arXiv on 6 May 2024, but it was rejected on 11 July 2024. The appeal was submitted on 11 July 2024, and it was accepted on 2 August 2024. The manuscript is written in Chinese language

  42. arXiv:2408.01285  [pdf, other]

    cs.CL cs.CY

    The Mismeasure of Man and Models: Evaluating Allocational Harms in Large Language Models

    Authors: Hannah Chen, Yangfeng Ji, David Evans

    Abstract: Large language models (LLMs) are now being considered and even deployed for applications that support high-stakes decision-making, such as recruitment and clinical decisions. While several methods have been proposed for measuring bias, there remains a gap between predictions, which are what the proposed methods consider, and how they are used to make decisions. In this work, we introduce Rank-Allo…

    Submitted 2 August, 2024; originally announced August 2024.

  43. arXiv:2408.01038  [pdf, other]

    cs.CL

    UNER: A Unified Prediction Head for Named Entity Recognition in Visually-rich Documents

    Authors: Yi Tu, Chong Zhang, Ya Guo, Huan Chen, Jinyang Tang, Huijia Zhu, Qi Zhang

    Abstract: The recognition of named entities in visually-rich documents (VrD-NER) plays a critical role in various real-world scenarios and applications. However, the research in VrD-NER faces three major challenges: complex document layouts, incorrect reading orders, and unsuitable task formulations. To address these challenges, we propose a query-aware entity extraction head, namely UNER, to collaborate wi…

    Submitted 11 August, 2024; v1 submitted 2 August, 2024; originally announced August 2024.

    Comments: accepted by ACM Multimedia 2024

  44. Cross-domain Named Entity Recognition via Graph Matching

    Authors: Junhao Zheng, Haibin Chen, Qianli Ma

    Abstract: Cross-domain NER is a practical yet challenging problem because of data scarcity in real-world scenarios. A common practice is first to learn a NER model in a rich-resource general domain and then adapt the model to specific domains. Due to the mismatch problem between entity types across domains, the wide knowledge in the general domain cannot effectively transfer to the target domain NER mode…

    Submitted 7 August, 2024; v1 submitted 1 August, 2024; originally announced August 2024.

    Comments: Findings of ACL; available at Findings 2022 https://aclanthology.org/2022.findings-acl.210/; Improve presentation

  45. arXiv:2408.00955  [pdf, other]

    stat.ML cs.LG stat.ME

    Aggregation Models with Optimal Weights for Distributed Gaussian Processes

    Authors: Haoyuan Chen, Rui Tuo

    Abstract: Gaussian process (GP) models have received increasing attention in recent years due to their superb prediction accuracy and modeling flexibility. To address the computational burdens of GP models for large-scale datasets, distributed learning for GPs is often adopted. Current aggregation models for distributed GPs are not time-efficient when incorporating correlations between GP experts. In th…

    Submitted 1 August, 2024; originally announced August 2024.

    Comments: 25 pages, 12 figures, 3 tables

  46. arXiv:2408.00315  [pdf, other]

    cs.LG cs.AI cs.CV

    ADBM: Adversarial diffusion bridge model for reliable adversarial purification

    Authors: Xiao Li, Wenxuan Sun, Huanran Chen, Qiongxiu Li, Yining Liu, Yingzhe He, Jie Shi, Xiaolin Hu

    Abstract: Recently, Diffusion-based Purification (DiffPure) has been recognized as an effective defense method against adversarial examples. However, we find DiffPure, which directly employs the original pre-trained diffusion models for adversarial purification, to be suboptimal. This is due to an inherent trade-off between noise purification performance and data recovery quality. Additionally, the reliabilit…

    Submitted 1 August, 2024; originally announced August 2024.

    Comments: 20 pages

  47. arXiv:2408.00139  [pdf, other]

    cs.SI physics.soc-ph stat.AP

    Multiway Alignment of Political Attitudes

    Authors: Letizia Iannucci, Ali Faqeeh, Ali Salloum, Ted Hsuan Yun Chen, Mikko Kivelä

    Abstract: The related concepts of partisan belief systems, issue alignment, and partisan sorting are central to our understanding of politics. These phenomena have been studied using measures of alignment between pairs of topics, or how much individuals' attitudes toward a topic reveal about their attitudes toward another topic. We introduce a higher-order measure that extends the assessment of alignment be…

    Submitted 31 July, 2024; originally announced August 2024.

  48. arXiv:2407.21323  [pdf]

    eess.IV cs.CV

    STANet: A Novel Spatio-Temporal Aggregation Network for Depression Classification with Small and Unbalanced FMRI Data

    Authors: Wei Zhang, Weiming Zeng, Hongyu Chen, Jie Liu, Hongjie Yan, Kaile Zhang, Ran Tao, Wai Ting Siok, Nizhuan Wang

    Abstract: Accurate diagnosis of depression is crucial for timely implementation of optimal treatments, preventing complications and reducing the risk of suicide. Traditional methods rely on self-report questionnaires and clinical assessment, lacking objective biomarkers. Combining fMRI with artificial intelligence can enhance depression diagnosis by integrating neuroimaging indicators. However, the specific…

    Submitted 31 July, 2024; originally announced July 2024.

  49. arXiv:2407.21034  [pdf, other]

    cs.IR cs.CR cs.LG

    Watermarking Recommender Systems

    Authors: Sixiao Zhang, Cheng Long, Wei Yuan, Hongxu Chen, Hongzhi Yin

    Abstract: Recommender systems embody significant commercial value and represent crucial intellectual property. However, the integrity of these systems is constantly challenged by malicious actors seeking to steal their underlying models. Safeguarding against such threats is paramount to upholding the rights and interests of the model owner. While model watermarking has emerged as a potent defense mechanism…

    Submitted 14 August, 2024; v1 submitted 17 July, 2024; originally announced July 2024.

  50. arXiv:2407.20068  [pdf, other]

    cs.CR cs.AI

    Unleash the Power of Ellipsis: Accuracy-enhanced Sparse Vector Technique with Exponential Noise

    Authors: Yuhan Liu, Sheng Wang, Yixuan Liu, Feifei Li, Hong Chen

    Abstract: The Sparse Vector Technique (SVT) is one of the most fundamental tools in differential privacy (DP). It works as a backbone for adaptive data analysis by answering a sequence of queries on a given dataset, and gleaning useful information in a privacy-preserving manner. Unlike the typical private query releases that directly publicize the noisy query results, SVT is less informative -- it keeps the…

    Submitted 29 July, 2024; originally announced July 2024.