Skip to main content

Showing 1–50 of 320 results for author: Ding, W

Searching in archive cs. Search in all archives.
.
  1. arXiv:2408.04170  [pdf

    cs.CV

    M2EF-NNs: Multimodal Multi-instance Evidence Fusion Neural Networks for Cancer Survival Prediction

    Authors: Hui Luo, Jiashuang Huang, Hengrong Ju, Tianyi Zhou, Weiping Ding

    Abstract: Accurate cancer survival prediction is crucial for assisting clinical doctors in formulating treatment plans. Multimodal data, including histopathological images and genomic data, offer complementary and comprehensive information that can greatly enhance the accuracy of this task. However, the current methods, despite yielding promising results, suffer from two notable limitations: they do not eff… ▽ More

    Submitted 7 August, 2024; originally announced August 2024.

  2. FDiff-Fusion:Denoising diffusion fusion network based on fuzzy learning for 3D medical image segmentation

    Authors: Weiping Ding, Sheng Geng, Haipeng Wang, Jiashuang Huang, Tianyi Zhou

    Abstract: In recent years, the denoising diffusion model has achieved remarkable success in image segmentation modeling. With its powerful nonlinear modeling capabilities and superior generalization performance, denoising diffusion models have gradually been applied to medical image segmentation tasks, bringing new perspectives and methods to this field. However, existing methods overlook the uncertainty of… ▽ More

    Submitted 21 July, 2024; originally announced August 2024.

    Comments: This paper has been accepted by Information Fusion. Permission from Elsevier must be obtained for all other uses, in any current or future media. The final version is available at [doi:10.1016/J.INFFUS.2024.102540]

    Journal ref: Information Fusion, 2024: 102540

  3. arXiv:2408.01072  [pdf, other

    cs.AI

    A Survey on Self-play Methods in Reinforcement Learning

    Authors: Ruize Zhang, Zelai Xu, Chengdong Ma, Chao Yu, Wei-Wei Tu, Shiyu Huang, Deheng Ye, Wenbo Ding, Yaodong Yang, Yu Wang

    Abstract: Self-play, characterized by agents' interactions with copies or past versions of itself, has recently gained prominence in reinforcement learning. This paper first clarifies the preliminaries of self-play, including the multi-agent reinforcement learning framework and basic game theory concepts. Then it provides a unified framework and classifies existing self-play algorithms within this framework… ▽ More

    Submitted 2 August, 2024; originally announced August 2024.

  4. arXiv:2408.00699  [pdf, other

    cs.LG

    Granular-Balls based Fuzzy Twin Support Vector Machine for Classification

    Authors: Lixi Zhao, Weiping Ding, Duoqian Miao, Guangming Lang

    Abstract: The twin support vector machine (TWSVM) classifier has attracted increasing attention because of its low computational complexity. However, its performance tends to degrade when samples are affected by noise. The granular-ball fuzzy support vector machine (GBFSVM) classifier partly alleviates the adverse effects of noise, but it relies solely on the distance between the granular-ball's center and… ▽ More

    Submitted 1 August, 2024; originally announced August 2024.

  5. arXiv:2407.18333  [pdf, other

    cs.AR cs.AI

    AutoVCoder: A Systematic Framework for Automated Verilog Code Generation using LLMs

    Authors: Mingzhe Gao, Jieru Zhao, Zhe Lin, Wenchao Ding, Xiaofeng Hou, Yu Feng, Chao Li, Minyi Guo

    Abstract: Recently, the use of large language models (LLMs) for software code generation, e.g., C/C++ and Python, has proven a great success. However, LLMs still suffer from low syntactic and functional correctness when it comes to the generation of register-transfer level (RTL) code, such as Verilog. To address this issue, in this paper, we develop AutoVCoder, a systematic open-source framework that signif… ▽ More

    Submitted 21 July, 2024; originally announced July 2024.

  6. Cascaded two-stage feature clustering and selection via separability and consistency in fuzzy decision systems

    Authors: Yuepeng Chen, Weiping Ding, Hengrong Ju, Jiashuang Huang, Tao Yin

    Abstract: Feature selection is a vital technique in machine learning, as it can reduce computational complexity, improve model performance, and mitigate the risk of overfitting. However, the increasing complexity and dimensionality of datasets pose significant challenges in the selection of features. Focusing on these challenges, this paper proposes a cascaded two-stage feature clustering and selection algo… ▽ More

    Submitted 21 July, 2024; originally announced July 2024.

    Comments: This paper has been accepted by IEEE Transactions on Fuzzy Systems for publication. Permission from IEEE must be obtained for all other uses, in any current or future media. The final version is available at [10.1109/TFUZZ.2024.3420963]

    Journal ref: IEEE Transactions on Fuzzy Systems 2024

  7. FMDNN: A Fuzzy-guided Multi-granular Deep Neural Network for Histopathological Image Classification

    Authors: Weiping Ding, Tianyi Zhou, Jiashuang Huang, Shu Jiang, Tao Hou, Chin-Teng Lin

    Abstract: Histopathological image classification constitutes a pivotal task in computer-aided diagnostics. The precise identification and categorization of histopathological images are of paramount significance for early disease detection and treatment. In the diagnostic process of pathologists, a multi-tiered approach is typically employed to assess abnormalities in cell regions at different magnifications… ▽ More

    Submitted 21 July, 2024; originally announced July 2024.

    Comments: This paper has been accepted by IEEE Transactions on Fuzzy Systems for publication. Permission from IEEE must be obtained for all other uses, in any current or future media. The final version is available at [doi: 10.1109/TFUZZ.2024.3410929]

    Journal ref: IEEE Transactions on Fuzzy Systems ( Early Access ) 2024

  8. arXiv:2407.14653  [pdf, other

    cs.LG

    OASIS: Conditional Distribution Shaping for Offline Safe Reinforcement Learning

    Authors: Yihang Yao, Zhepeng Cen, Wenhao Ding, Haohong Lin, Shiqi Liu, Tingnan Zhang, Wenhao Yu, Ding Zhao

    Abstract: Offline safe reinforcement learning (RL) aims to train a policy that satisfies constraints using a pre-collected dataset. Most current methods struggle with the mismatch between imperfect demonstrations and the desired safe and rewarding performance. In this paper, we introduce OASIS (cOnditionAl diStributIon Shaping), a new paradigm in offline safe RL designed to overcome these critical limitatio… ▽ More

    Submitted 19 July, 2024; originally announced July 2024.

  9. arXiv:2407.12074  [pdf, other

    cs.LG cs.AI

    Enhancing Parameter Efficiency and Generalization in Large-Scale Models: A Regularized and Masked Low-Rank Adaptation Approach

    Authors: Yuzhu Mao, Siqi Ping, Zihao Zhao, Yang Liu, Wenbo Ding

    Abstract: Large pre-trained models, such as large language models (LLMs), present significant resource challenges for fine-tuning due to their extensive parameter sizes, especially for applications in mobile systems. To address this, Low-Rank Adaptation (LoRA) has been developed to reduce resource consumption while maintaining satisfactory fine-tuning results. Despite its effectiveness, the original LoRA me… ▽ More

    Submitted 16 July, 2024; originally announced July 2024.

  10. Incremental high average-utility itemset mining: survey and challenges

    Authors: Jing Chen, Shengyi Yang, Weiping Ding, Peng Li, Aijun Liu, Hongjun Zhang, Tian Li

    Abstract: The High Average Utility Itemset Mining (HAUIM) technique, a variation of High Utility Itemset Mining (HUIM), uses the average utility of the itemsets. Historically, most HAUIM algorithms were designed for static databases. However, practical applications like market basket analysis and business decision-making necessitate regular updates of the database with new transactions. As a result, researc… ▽ More

    Submitted 16 July, 2024; originally announced July 2024.

    Comments: 25 pages, 23 figures

  11. arXiv:2407.10967  [pdf, other

    cs.LG cs.AI

    BECAUSE: Bilinear Causal Representation for Generalizable Offline Model-based Reinforcement Learning

    Authors: Haohong Lin, Wenhao Ding, Jian Chen, Laixi Shi, Jiacheng Zhu, Bo Li, Ding Zhao

    Abstract: Offline model-based reinforcement learning (MBRL) enhances data efficiency by utilizing pre-collected datasets to learn models and policies, especially in scenarios where exploration is costly or infeasible. Nevertheless, its performance often suffers from the objective mismatch between model and policy learning, resulting in inferior performance despite accurate model predictions. This paper firs… ▽ More

    Submitted 15 July, 2024; originally announced July 2024.

  12. arXiv:2407.06842  [pdf, other

    cs.CV

    Chat-Edit-3D: Interactive 3D Scene Editing via Text Prompts

    Authors: Shuangkang Fang, Yufeng Wang, Yi-Hsuan Tsai, Yi Yang, Wenrui Ding, Shuchang Zhou, Ming-Hsuan Yang

    Abstract: Recent work on image content manipulation based on vision-language pre-training models has been effectively extended to text-driven 3D scene editing. However, existing schemes for 3D scene editing still exhibit certain shortcomings, hindering their further interactive design. Such schemes typically adhere to fixed input patterns, limiting users' flexibility in text input. Moreover, their editing c… ▽ More

    Submitted 9 July, 2024; v1 submitted 9 July, 2024; originally announced July 2024.

    Comments: Accepted by ECCV2024; Project Website: https://sk-fun.fun/CE3D

  13. arXiv:2407.06754  [pdf, other

    cs.DC cs.AI

    Threats and Defenses in Federated Learning Life Cycle: A Comprehensive Survey and Challenges

    Authors: Yanli Li, Zhongliang Guo, Nan Yang, Huaming Chen, Dong Yuan, Weiping Ding

    Abstract: Federated Learning (FL) offers innovative solutions for privacy-preserving collaborative machine learning (ML). Despite its promising potential, FL is vulnerable to various attacks due to its distributed nature, affecting the entire life cycle of FL services. These threats can harm the model's utility or compromise participants' privacy, either directly or indirectly. In response, numerous defense… ▽ More

    Submitted 11 July, 2024; v1 submitted 9 July, 2024; originally announced July 2024.

  14. arXiv:2407.04368  [pdf, other

    cs.CL cs.SD eess.AS

    Romanization Encoding For Multilingual ASR

    Authors: Wen Ding, Fei Jia, Hainan Xu, Yu Xi, Junjie Lai, Boris Ginsburg

    Abstract: We introduce romanization encoding for script-heavy languages to optimize multilingual and code-switching Automatic Speech Recognition (ASR) systems. By adopting romanization encoding alongside a balanced concatenated tokenizer within a FastConformer-RNNT framework equipped with a Roman2Char module, we significantly reduce vocabulary and output dimensions, enabling larger training batches and redu… ▽ More

    Submitted 5 July, 2024; originally announced July 2024.

  15. arXiv:2407.03543  [pdf, ps, other

    cs.CR

    Asymmetric Mempool DoS Security: Formal Definitions and Provable Secure Designs

    Authors: Wanning Ding, Yibo Wang, Yuzhe Tang

    Abstract: The mempool plays a crucial role in blockchain systems as a buffer zone for pending transactions before they are executed and included in a block. However, existing works primarily focus on mitigating defenses against already identified real-world attacks. This paper introduces secure blockchain-mempool designs capable of defending against any form of asymmetric eviction DoS attacks. We establish… ▽ More

    Submitted 24 July, 2024; v1 submitted 3 July, 2024; originally announced July 2024.

  16. arXiv:2406.15948  [pdf, other

    cs.CL

    Teaching LLMs to Abstain across Languages via Multilingual Feedback

    Authors: Shangbin Feng, Weijia Shi, Yike Wang, Wenxuan Ding, Orevaoghene Ahia, Shuyue Stella Li, Vidhisha Balachandran, Sunayana Sitaram, Yulia Tsvetkov

    Abstract: Multilingual LLMs often have knowledge disparities across languages, with larger gaps in under-resourced languages. Teaching LLMs to abstain in the face of knowledge gaps is thus a promising strategy to mitigate hallucinations in multilingual settings. However, previous studies on LLM abstention primarily focus on English; we find that directly applying existing solutions beyond English results in… ▽ More

    Submitted 22 June, 2024; originally announced June 2024.

  17. arXiv:2406.14434  [pdf, other

    cs.CL

    Towards Truthful Multilingual Large Language Models: Benchmarking and Alignment Strategies

    Authors: Weihao Liu, Ning Wu, Wenbiao Ding, Shining Liang, Ming Gong, Dongmei Zhang

    Abstract: In the era of large language models (LLMs), building multilingual large language models (MLLMs) that can serve users worldwide holds great significance. However, existing research seldom focuses on the truthfulness of MLLMs. Meanwhile, contemporary multilingual aligning technologies struggle to balance massive languages and often exhibit serious truthfulness gaps across different languages, especi… ▽ More

    Submitted 20 June, 2024; originally announced June 2024.

    Comments: 15 pages

  18. When Vision Meets Touch: A Contemporary Review for Visuotactile Sensors from the Signal Processing Perspective

    Authors: Shoujie Li, Zihan Wang, Changsheng Wu, Xiang Li, Shan Luo, Bin Fang, Fuchun Sun, Xiao-Ping Zhang, Wenbo Ding

    Abstract: Tactile sensors, which provide information about the physical properties of objects, are an essential component of robotic systems. The visuotactile sensing technology with the merits of high resolution and low cost has facilitated the development of robotics from environment exploration to dexterous operation. Over the years, several reviews on visuotactile sensors for robots have been presented,… ▽ More

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: Accepted by IEEE Journal of Selected Topics in Signal Processing

  19. arXiv:2406.10885  [pdf, other

    cs.CL

    On the Role of Entity and Event Level Conceptualization in Generalizable Reasoning: A Survey of Tasks, Methods, Applications, and Future Directions

    Authors: Weiqi Wang, Tianqing Fang, Haochen Shi, Baixuan Xu, Wenxuan Ding, Liyu Zhang, Wei Fan, Jiaxin Bai, Haoran Li, Xin Liu, Yangqiu Song

    Abstract: Entity- and event-level conceptualization, as fundamental elements of human cognition, plays a pivotal role in generalizable reasoning. This process involves abstracting specific instances into higher-level concepts and forming abstract knowledge that can be applied in unfamiliar or novel situations, which can enhance models' inferential capabilities and support the effective transfer of knowledge… ▽ More

    Submitted 16 June, 2024; originally announced June 2024.

  20. arXiv:2406.10701  [pdf, other

    cs.CL

    MIND: Multimodal Shopping Intention Distillation from Large Vision-language Models for E-commerce Purchase Understanding

    Authors: Baixuan Xu, Weiqi Wang, Haochen Shi, Wenxuan Ding, Huihao Jing, Tianqing Fang, Jiaxin Bai, Long Chen, Yangqiu Song

    Abstract: Improving user experience and providing personalized search results in E-commerce platforms heavily rely on understanding purchase intention. However, existing methods for acquiring large-scale intentions bank on distilling large language models with human annotation for verification. Such an approach tends to generate product-centric intentions, overlook valuable visual information from product i… ▽ More

    Submitted 15 June, 2024; originally announced June 2024.

    Comments: 8 pages, 5 figures

  21. arXiv:2406.10173  [pdf, other

    cs.CL

    IntentionQA: A Benchmark for Evaluating Purchase Intention Comprehension Abilities of Language Models in E-commerce

    Authors: Wenxuan Ding, Weiqi Wang, Sze Heng Douglas Kwok, Minghao Liu, Tianqing Fang, Jiaxin Bai, Junxian He, Yangqiu Song

    Abstract: Enhancing Language Models' (LMs) ability to understand purchase intentions in E-commerce scenarios is crucial for their effective assistance in various downstream tasks. However, previous approaches that distill intentions from LMs often fail to generate meaningful and human-centric intentions applicable in real-world E-commerce contexts. This raises concerns about the true comprehension and utili… ▽ More

    Submitted 14 June, 2024; originally announced June 2024.

  22. arXiv:2406.08455  [pdf, other

    cs.RO

    AToM-Bot: Embodied Fulfillment of Unspoken Human Needs with Affective Theory of Mind

    Authors: Wei Ding, Fanhong Li, Ziteng Ji, Zhengrong Xue, Jia Liu

    Abstract: We propose AToM-Bot, a novel task generation and execution framework for proactive robot-human interaction, which leverages the human mental and physical state inference capabilities of the Vision Language Model (VLM) prompted by the Affective Theory of Mind (AToM). Without requiring explicit commands by humans, AToM-Bot proactively generates and follows feasible tasks to improve general human wel… ▽ More

    Submitted 15 June, 2024; v1 submitted 12 June, 2024; originally announced June 2024.

  23. arXiv:2406.08160  [pdf, other

    cs.RO

    Chemistry3D: Robotic Interaction Benchmark for Chemistry Experiments

    Authors: Shoujie Li, Yan Huang, Changqing Guo, Tong Wu, Jiawei Zhang, Linrui Zhang, Wenbo Ding

    Abstract: The advent of simulation engines has revolutionized learning and operational efficiency for robots, offering cost-effective and swift pipelines. However, the lack of a universal simulation platform tailored for chemical scenarios impedes progress in robotic manipulation and visualization of reaction processes. Addressing this void, we present Chemistry3D, an innovative toolkit that integrates exte… ▽ More

    Submitted 12 June, 2024; originally announced June 2024.

  24. arXiv:2405.10288  [pdf, other

    cs.CL cs.AI

    Timeline-based Sentence Decomposition with In-Context Learning for Temporal Fact Extraction

    Authors: Jianhao Chen, Haoyuan Ouyang, Junyang Ren, Wentao Ding, Wei Hu, Yuzhong Qu

    Abstract: Facts extraction is pivotal for constructing knowledge graphs. Recently, the increasing demand for temporal facts in downstream tasks has led to the emergence of the task of temporal fact extraction. In this paper, we specifically address the extraction of temporal facts from natural language text. Previous studies fail to handle the challenge of establishing time-to-fact correspondences in comple… ▽ More

    Submitted 18 June, 2024; v1 submitted 16 May, 2024; originally announced May 2024.

    Comments: Accepted to ACL2024 main conference

  25. arXiv:2405.00334  [pdf, other

    cs.LG

    A Survey on Deep Active Learning: Recent Advances and New Frontiers

    Authors: Dongyuan Li, Zhen Wang, Yankai Chen, Renhe Jiang, Weiping Ding, Manabu Okumura

    Abstract: Active learning seeks to achieve strong performance with fewer training samples. It does this by iteratively asking an oracle to label new selected samples in a human-in-the-loop manner. This technique has gained increasing popularity due to its broad applicability, yet its survey papers, especially for deep learning-based active learning (DAL), remain scarce. Therefore, we conduct an advanced and… ▽ More

    Submitted 15 July, 2024; v1 submitted 1 May, 2024; originally announced May 2024.

    Comments: This paper is accepted by IEEE Transactions on Neural Networks and Learning Systems

  26. arXiv:2404.15384  [pdf, other

    cs.LG cs.AI

    FL-TAC: Enhanced Fine-Tuning in Federated Learning via Low-Rank, Task-Specific Adapter Clustering

    Authors: Siqi Ping, Yuzhu Mao, Yang Liu, Xiao-Ping Zhang, Wenbo Ding

    Abstract: Although large-scale pre-trained models hold great potential for adapting to downstream tasks through fine-tuning, the performance of such fine-tuned models is often limited by the difficulty of collecting sufficient high-quality, task-specific data. Federated Learning (FL) offers a promising solution by enabling fine-tuning across large-scale clients with a variety of task data, but it is bottlen… ▽ More

    Submitted 23 April, 2024; originally announced April 2024.

  27. arXiv:2404.12045  [pdf, other

    cs.AI cs.CL

    RAM: Towards an Ever-Improving Memory System by Learning from Communications

    Authors: Jiaqi Li, Xiaobo Wang, Wentao Ding, Zihao Wang, Yipeng Kang, Zixia Jia, Zilong Zheng

    Abstract: We introduce an innovative RAG-based framework with an ever-improving memory. Inspired by humans'pedagogical process, RAM utilizes recursively reasoning-based retrieval and experience reflections to continually update the memory and learn from users' communicative feedback, namely communicative learning. Extensive experiments with both simulated and real users demonstrate significant improvements… ▽ More

    Submitted 5 July, 2024; v1 submitted 18 April, 2024; originally announced April 2024.

  28. arXiv:2404.10342  [pdf, other

    cs.CV cs.MM

    Referring Flexible Image Restoration

    Authors: Runwei Guan, Rongsheng Hu, Zhuhao Zhou, Tianlang Xue, Ka Lok Man, Jeremy Smith, Eng Gee Lim, Weiping Ding, Yutao Yue

    Abstract: In reality, images often exhibit multiple degradations, such as rain and fog at night (triple degradations). However, in many cases, individuals may not want to remove all degradations, for instance, a blurry lens revealing a beautiful snowy landscape (double degradations). In such scenarios, people may only desire to deblur. These situations and requirements shed light on a new challenge in image… ▽ More

    Submitted 16 April, 2024; originally announced April 2024.

    Comments: 15 pages, 19 figures

  29. arXiv:2404.08501  [pdf, ps, other

    cs.NE cs.AI

    Analyzing and Overcoming Local Optima in Complex Multi-Objective Optimization by Decomposition-Based Evolutionary Algorithms

    Authors: Ting Dong, Haoxin Wang, Hengxi Zhang, Wenbo Ding

    Abstract: When addressing the challenge of complex multi-objective optimization problems, particularly those with non-convex and non-uniform Pareto fronts, Decomposition-based Multi-Objective Evolutionary Algorithms (MOEADs) often converge to local optima, thereby limiting solution diversity. Despite its significance, this issue has received limited theoretical exploration. Through a comprehensive geometric… ▽ More

    Submitted 12 April, 2024; originally announced April 2024.

  30. The Survey on Multi-Source Data Fusion in Cyber-Physical-Social Systems:Foundational Infrastructure for Industrial Metaverses and Industries 5.0

    Authors: Xiao Wang, Yutong Wang, Jing Yang, Xiaofeng Jia, Lijun Li, Weiping Ding, Fei-Yue Wang

    Abstract: As the concept of Industries 5.0 develops, industrial metaverses are expected to operate in parallel with the actual industrial processes to offer ``Human-Centric" Safe, Secure, Sustainable, Sensitive, Service, and Smartness ``6S" manufacturing solutions. Industrial metaverses not only visualize the process of productivity in a dynamic and evolutional way, but also provide an immersive laboratory… ▽ More

    Submitted 11 April, 2024; originally announced April 2024.

    Journal ref: Information Fusion 2024

  31. arXiv:2404.06836  [pdf, other

    cs.CV

    O2V-Mapping: Online Open-Vocabulary Mapping with Neural Implicit Representation

    Authors: Muer Tie, Julong Wei, Zhengjun Wang, Ke Wu, Shansuai Yuan, Kaizhao Zhang, Jie Jia, Jieru Zhao, Zhongxue Gan, Wenchao Ding

    Abstract: Online construction of open-ended language scenes is crucial for robotic applications, where open-vocabulary interactive scene understanding is required. Recently, neural implicit representation has provided a promising direction for online interactive mapping. However, implementing open-vocabulary scene understanding capability into online neural implicit mapping still faces three challenges: lac… ▽ More

    Submitted 10 April, 2024; originally announced April 2024.

  32. arXiv:2404.06036  [pdf, other

    cs.CV

    Space-Time Video Super-resolution with Neural Operator

    Authors: Yuantong Zhang, Hanyou Zheng, Daiqin Yang, Zhenzhong Chen, Haichuan Ma, Wenpeng Ding

    Abstract: This paper addresses the task of space-time video super-resolution (ST-VSR). Existing methods generally suffer from inaccurate motion estimation and motion compensation (MEMC) problems for large motions. Inspired by recent progress in physics-informed neural networks, we model the challenges of MEMC in ST-VSR as a mapping between two continuous function spaces. Specifically, our approach transform… ▽ More

    Submitted 9 April, 2024; originally announced April 2024.

  33. arXiv:2404.04970  [pdf, other

    cs.LG

    How to characterize imprecision in multi-view clustering?

    Authors: Jinyi Xu, Zuowei Zhang, Ze Lin, Yixiang Chen, Zhe Liu, Weiping Ding

    Abstract: It is still challenging to cluster multi-view data since existing methods can only assign an object to a specific (singleton) cluster when combining different view information. As a result, it fails to characterize imprecision of objects in overlapping regions of different clusters, thus leading to a high risk of errors. In this paper, we thereby want to answer the question: how to characterize im… ▽ More

    Submitted 7 July, 2024; v1 submitted 7 April, 2024; originally announced April 2024.

    Comments: 19 pages with 8 pages of supplementary

  34. arXiv:2404.04799  [pdf, other

    cs.CV

    Few-Shot Object Detection: Research Advances and Challenges

    Authors: Zhimeng Xin, Shiming Chen, Tianxu Wu, Yuanjie Shao, Weiping Ding, Xinge You

    Abstract: Object detection as a subfield within computer vision has achieved remarkable progress, which aims to accurately identify and locate a specific object from images or videos. Such methods rely on large-scale labeled training samples for each object category to ensure accurate detection, but obtaining extensive annotated data is a labor-intensive and expensive process in many real-world scenarios. T… ▽ More

    Submitted 6 April, 2024; originally announced April 2024.

  35. arXiv:2404.00443  [pdf, ps, other

    cs.RO

    UDE-based Dynamic Motion Force Control of Mobile Manipulators

    Authors: Songqun Gao, Wendi Ding, Qinyuan Ren, Ben M. Chen

    Abstract: Mobile manipulators are known for their superior mobility over manipulators on fixed bases, offering promising applications in smart industry and housekeeping scenarios. However, the dynamic coupling nature between the mobile base and the manipulator presents challenges for the physical interactive tasks of the mobile manipulator. Current methods suffer from complex modeling processes and poor tra… ▽ More

    Submitted 30 March, 2024; originally announced April 2024.

  36. arXiv:2403.20327  [pdf, other

    cs.CL cs.AI

    Gecko: Versatile Text Embeddings Distilled from Large Language Models

    Authors: Jinhyuk Lee, Zhuyun Dai, Xiaoqi Ren, Blair Chen, Daniel Cer, Jeremy R. Cole, Kai Hui, Michael Boratko, Rajvi Kapadia, Wen Ding, Yi Luan, Sai Meher Karthik Duddu, Gustavo Hernandez Abrego, Weiqiang Shi, Nithi Gupta, Aditya Kusupati, Prateek Jain, Siddhartha Reddy Jonnalagadda, Ming-Wei Chang, Iftekhar Naim

    Abstract: We present Gecko, a compact and versatile text embedding model. Gecko achieves strong retrieval performance by leveraging a key idea: distilling knowledge from large language models (LLMs) into a retriever. Our two-step distillation process begins with generating diverse, synthetic paired data using an LLM. Next, we further refine the data quality by retrieving a set of candidate passages for each… ▽ More

    Submitted 29 March, 2024; originally announced March 2024.

    Comments: 18 pages

  37. arXiv:2403.20159  [pdf, other

    cs.CV

    HGS-Mapping: Online Dense Mapping Using Hybrid Gaussian Representation in Urban Scenes

    Authors: Ke Wu, Kaizhao Zhang, Zhiwei Zhang, Shanshuai Yuan, Muer Tie, Julong Wei, Zijun Xu, Jieru Zhao, Zhongxue Gan, Wenchao Ding

    Abstract: Online dense mapping of urban scenes forms a fundamental cornerstone for scene understanding and navigation of autonomous vehicles. Recent advancements in mapping methods are mainly based on NeRF, whose rendering speed is too slow to meet online requirements. 3D Gaussian Splatting (3DGS), with its rendering speed hundreds of times faster than NeRF, holds greater potential in online dense mapping.… ▽ More

    Submitted 29 March, 2024; originally announced March 2024.

  38. Fusion Dynamical Systems with Machine Learning in Imitation Learning: A Comprehensive Overview

    Authors: Yingbai Hu, Fares J. Abu-Dakka, Fei Chen, Xiao Luo, Zheng Li, Alois Knoll, Weiping Ding

    Abstract: Imitation Learning (IL), also referred to as Learning from Demonstration (LfD), holds significant promise for capturing expert motor skills through efficient imitation, facilitating adept navigation of complex scenarios. A persistent challenge in IL lies in extending generalization from historical demonstrations, enabling the acquisition of new skills without re-teaching. Dynamical system-based IL… ▽ More

    Submitted 28 March, 2024; originally announced March 2024.

  39. arXiv:2403.18957  [pdf, other

    cs.CY cs.CL cs.LG cs.SI

    Moderating Illicit Online Image Promotion for Unsafe User-Generated Content Games Using Large Vision-Language Models

    Authors: Keyan Guo, Ayush Utkarsh, Wenbo Ding, Isabelle Ondracek, Ziming Zhao, Guo Freeman, Nishant Vishwamitra, Hongxin Hu

    Abstract: Online user generated content games (UGCGs) are increasingly popular among children and adolescents for social interaction and more creative online entertainment. However, they pose a heightened risk of exposure to explicit content, raising growing concerns for the online safety of children and adolescents. Despite these concerns, few studies have addressed the issue of illicit image-based promoti… ▽ More

    Submitted 12 August, 2024; v1 submitted 27 March, 2024; originally announced March 2024.

    Comments: To Appear in the 33rd USENIX Security Symposium, August 14-16, 2024

  40. arXiv:2403.13208  [pdf, other

    cs.RO

    CaDRE: Controllable and Diverse Generation of Safety-Critical Driving Scenarios using Real-World Trajectories

    Authors: Peide Huang, Wenhao Ding, Jonathan Francis, Bingqing Chen, Ding Zhao

    Abstract: Simulation is an indispensable tool in the development and testing of autonomous vehicles (AVs), offering an efficient and safe alternative to road testing by allowing the exploration of a wide range of scenarios. Despite its advantages, a significant challenge within simulation-based testing is the generation of safety-critical scenarios, which are essential to ensure that AVs can handle rare but… ▽ More

    Submitted 19 March, 2024; originally announced March 2024.

  41. arXiv:2403.05530  [pdf, other

    cs.CL cs.AI

    Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context

    Authors: Gemini Team, Petko Georgiev, Ving Ian Lei, Ryan Burnell, Libin Bai, Anmol Gulati, Garrett Tanzer, Damien Vincent, Zhufeng Pan, Shibo Wang, Soroosh Mariooryad, Yifan Ding, Xinyang Geng, Fred Alcober, Roy Frostig, Mark Omernick, Lexi Walker, Cosmin Paduraru, Christina Sorokin, Andrea Tacchetti, Colin Gaffney, Samira Daruki, Olcan Sercinoglu, Zach Gleicher, Juliette Love , et al. (1110 additional authors not shown)

    Abstract: In this report, we introduce the Gemini 1.5 family of models, representing the next generation of highly compute-efficient multimodal models capable of recalling and reasoning over fine-grained information from millions of tokens of context, including multiple long documents and hours of video and audio. The family includes two new models: (1) an updated Gemini 1.5 Pro, which exceeds the February… ▽ More

    Submitted 8 August, 2024; v1 submitted 8 March, 2024; originally announced March 2024.

  42. arXiv:2402.11496  [pdf, other

    cs.RO

    Point-Wise Vibration Pattern Production via a Sparse Actuator Array for Surface Tactile Feedback

    Authors: Xiaosa Li, Runze Zhao, Chengyue Lu, Xiao Xiao, Wenbo Ding

    Abstract: Surface vibration tactile feedback is capable of conveying various semantic information to humans via the handheld electronic devices, like smartphone, touch panel,and game controller. However, covering the whole device contacting surface with dense actuator arrangement can affect its normal use, how to produce desired vibration patterns at any contact point with only several sparse actuators depl… ▽ More

    Submitted 18 February, 2024; originally announced February 2024.

  43. arXiv:2402.07648  [pdf, other

    cs.RO

    DeformNet: Latent Space Modeling and Dynamics Prediction for Deformable Object Manipulation

    Authors: Chenchang Li, Zihao Ai, Tong Wu, Xiaosa Li, Wenbo Ding, Huazhe Xu

    Abstract: Manipulating deformable objects is a ubiquitous task in household environments, demanding adequate representation and accurate dynamics prediction due to the objects' infinite degrees of freedom. This work proposes DeformNet, which utilizes latent space modeling with a learned 3D representation model to tackle these challenges effectively. The proposed representation model combines a PointNet enco… ▽ More

    Submitted 12 February, 2024; originally announced February 2024.

    Comments: 7 pages, Submitted to 2024 IEEE International Conference on Robotics and Automation (ICRA), Japan, Yokohama

  44. arXiv:2402.05948  [pdf, other

    cs.LG cs.CL

    DE$^3$-BERT: Distance-Enhanced Early Exiting for BERT based on Prototypical Networks

    Authors: Jianing He, Qi Zhang, Weiping Ding, Duoqian Miao, Jun Zhao, Liang Hu, Longbing Cao

    Abstract: Early exiting has demonstrated its effectiveness in accelerating the inference of pre-trained language models like BERT by dynamically adjusting the number of layers executed. However, most existing early exiting methods only consider local information from an individual test sample to determine their exiting indicators, failing to leverage the global information offered by sample population. This… ▽ More

    Submitted 3 February, 2024; originally announced February 2024.

    Comments: 16 pages

  45. arXiv:2402.05725  [pdf, other

    cs.RO eess.SP

    Dual-modal Tactile E-skin: Enabling Bidirectional Human-Robot Interaction via Integrated Tactile Perception and Feedback

    Authors: Shilong Mu, Runze Zhao, Zenan Lin, Yan Huang, Shoujie Li, Chenchang Li, Xiao-Ping Zhang, Wenbo Ding

    Abstract: To foster an immersive and natural human-robot interaction, the implementation of tactile perception and feedback becomes imperative, effectively bridging the conventional sensory gap. In this paper, we propose a dual-modal electronic skin (e-skin) that integrates magnetic tactile sensing and vibration feedback for enhanced human-robot interaction. The dual-modal tactile e-skin offers multi-functi… ▽ More

    Submitted 8 February, 2024; originally announced February 2024.

    Comments: 7 pages, 8 figures. Submitted to 2024 IEEE International Conference on Robotics and Automation (ICRA), Japan, Yokohama

  46. arXiv:2402.02175  [pdf, other

    cs.CL cs.IR

    Enhancing Complex Question Answering over Knowledge Graphs through Evidence Pattern Retrieval

    Authors: Wentao Ding, Jinmao Li, Liangchuan Luo, Yuzhong Qu

    Abstract: Information retrieval (IR) methods for KGQA consist of two stages: subgraph extraction and answer reasoning. We argue current subgraph extraction methods underestimate the importance of structural dependencies among evidence facts. We propose Evidence Pattern Retrieval (EPR) to explicitly model the structural dependencies during subgraph extraction. We implement EPR by indexing the atomic adjacenc… ▽ More

    Submitted 3 February, 2024; originally announced February 2024.

    Comments: Accepted to TheWebConf'24 (WWW 2024). This is a preprint version; the CR version will include more details. Github: https://github.com/nju-websoft/EPR-KGQA

  47. arXiv:2402.00585  [pdf, other

    cs.RO

    SATac: A Thermoluminescence Enabled Tactile Sensor for Concurrent Perception of Temperature, Pressure, and Shear

    Authors: Ziwu Song, Ran Yu, Xuan Zhang, Kit Wa Sou, Shilong Mu, Dengfeng Peng, Xiao-Ping Zhang, Wenbo Ding

    Abstract: Most vision-based tactile sensors use elastomer deformation to infer tactile information, which can not sense some modalities, like temperature. As an important part of human tactile perception, temperature sensing can help robots better interact with the environment. In this work, we propose a novel multimodal vision-based tactile sensor, SATac, which can simultaneously perceive information of te… ▽ More

    Submitted 1 February, 2024; originally announced February 2024.

  48. arXiv:2402.00367  [pdf, other

    cs.CL

    Don't Hallucinate, Abstain: Identifying LLM Knowledge Gaps via Multi-LLM Collaboration

    Authors: Shangbin Feng, Weijia Shi, Yike Wang, Wenxuan Ding, Vidhisha Balachandran, Yulia Tsvetkov

    Abstract: Despite efforts to expand the knowledge of large language models (LLMs), knowledge gaps -- missing or outdated information in LLMs -- might always persist given the evolving nature of knowledge. In this work, we study approaches to identify LLM knowledge gaps and abstain from answering questions when knowledge gaps are present. We first adapt existing approaches to model calibration or adaptation… ▽ More

    Submitted 30 June, 2024; v1 submitted 1 February, 2024; originally announced February 2024.

    Comments: ACL 2024

  49. arXiv:2401.16791  [pdf, other

    cs.LG

    Accelerated Cloud for Artificial Intelligence (ACAI)

    Authors: Dachi Chen, Weitian Ding, Chen Liang, Chang Xu, Junwei Zhang, Majd Sakr

    Abstract: Training an effective Machine learning (ML) model is an iterative process that requires effort in multiple dimensions. Vertically, a single pipeline typically includes an initial ETL (Extract, Transform, Load) of raw datasets, a model training stage, and an evaluation stage where the practitioners obtain statistics of the model performance. Horizontally, many such pipelines may be required to find… ▽ More

    Submitted 30 January, 2024; originally announced January 2024.

  50. arXiv:2401.13462  [pdf, other

    cs.RO cs.AI

    Growing from Exploration: A self-exploring framework for robots based on foundation models

    Authors: Shoujie Li, Ran Yu, Tong Wu, JunWen Zhong, Xiao-Ping Zhang, Wenbo Ding

    Abstract: Intelligent robot is the ultimate goal in the robotics field. Existing works leverage learning-based or optimization-based methods to accomplish human-defined tasks. However, the challenge of enabling robots to explore various environments autonomously remains unresolved. In this work, we propose a framework named GExp, which enables robots to explore and learn autonomously without human intervent… ▽ More

    Submitted 24 January, 2024; originally announced January 2024.

    Comments: 19 pages