
Showing 1–50 of 312 results for author: Yu, R

Searching in archive cs.
  1. arXiv:2409.06277  [pdf, other]

    cs.LG cs.AI

    Ferret: Federated Full-Parameter Tuning at Scale for Large Language Models

    Authors: Yao Shu, Wenyang Hu, See-Kiong Ng, Bryan Kian Hsiang Low, Fei Richard Yu

    Abstract: Large Language Models (LLMs) have become indispensable in numerous real-world applications. Unfortunately, fine-tuning these models at scale, especially in federated settings where data privacy and communication efficiency are critical, presents significant challenges. Existing methods often resort to parameter-efficient fine-tuning (PEFT) to mitigate communication overhead, but this typically com…

    Submitted 10 September, 2024; v1 submitted 10 September, 2024; originally announced September 2024.

  2. arXiv:2409.04390  [pdf, other]

    cs.CV

    Future Does Matter: Boosting 3D Object Detection with Temporal Motion Estimation in Point Cloud Sequences

    Authors: Rui Yu, Runkai Zhao, Cong Nie, Heng Wang, HuaiCheng Yan, Meng Wang

    Abstract: Accurate and robust LiDAR 3D object detection is essential for comprehensive scene understanding in autonomous driving. Despite its importance, LiDAR detection performance is limited by inherent constraints of point cloud data, particularly under conditions of extended distances and occlusions. Recently, temporal aggregation has been proven to significantly enhance detection accuracy by fusing mul…

    Submitted 6 September, 2024; originally announced September 2024.

  3. arXiv:2409.01581  [pdf, other]

    cs.RO cs.AI

    GaussianPU: A Hybrid 2D-3D Upsampling Framework for Enhancing Color Point Clouds via 3D Gaussian Splatting

    Authors: Zixuan Guo, Yifan Xie, Weijing Xie, Peng Huang, Fei Ma, Fei Richard Yu

    Abstract: Dense colored point clouds enhance visual perception and are of significant value in various robotic applications. However, existing learning-based point cloud upsampling methods are constrained by computational resources and batch processing strategies, which often require subdividing point clouds into smaller patches, leading to distortions that degrade perceptual quality. To address this challe…

    Submitted 2 September, 2024; originally announced September 2024.

    Comments: 7 pages, 5 figures

  4. arXiv:2409.00897  [pdf, other]

    cs.NI cs.CR cs.ET

    Infiltrating the Sky: Data Delay and Overflow Attacks in Earth Observation Constellations

    Authors: Xiaojian Wang, Ruozhou Yu, Dejun Yang, Guoliang Xue

    Abstract: Low Earth Orbit (LEO) Earth Observation (EO) satellites have changed the way we monitor Earth. Acting like moving cameras, EO satellites are formed in constellations with different missions and priorities, and capture vast data that needs to be transmitted to the ground for processing. However, EO satellites have very limited downlink communication capability, limited by transmission bandwidth, nu…

    Submitted 1 September, 2024; originally announced September 2024.

  5. arXiv:2408.16982  [pdf, other]

    cs.CV cs.GR

    2DGH: 2D Gaussian-Hermite Splatting for High-quality Rendering and Better Geometry Reconstruction

    Authors: Ruihan Yu, Tianyu Huang, Jingwang Ling, Feng Xu

    Abstract: 2D Gaussian Splatting has recently emerged as a significant method in 3D reconstruction, enabling novel view synthesis and geometry reconstruction simultaneously. While the well-known Gaussian kernel is broadly used, its lack of anisotropy and deformation ability leads to dim and vague edges at object silhouettes, limiting the reconstruction quality of current Gaussian splatting methods. To enhanc…

    Submitted 29 August, 2024; originally announced August 2024.

  6. arXiv:2408.15516  [pdf, other]

    cs.NI

    Predicting Parameter Change's Effect on Cellular Network Time Series

    Authors: Mingjie Li, Yongqian Sun, Xiaolei Hua, Renkai Yu, Xinwen Fan, Lin Zhu, Junlan Feng, Dan Pei

    Abstract: The cellular network provides convenient network access for ever-growing mobile phones. During the continuous optimization, operators can adjust cell parameters to enhance the Quality of Service (QoS) flexibly. A precise prediction of the parameter change's effect can help operators make proper parameter adjustments. This work focuses on predicting cell status (like the workload and QoS) after adj…

    Submitted 28 August, 2024; originally announced August 2024.

  7. arXiv:2408.15270  [pdf, other]

    cs.CV cs.GR cs.LG cs.RO

    SkillMimic: Learning Reusable Basketball Skills from Demonstrations

    Authors: Yinhuai Wang, Qihan Zhao, Runyi Yu, Ailing Zeng, Jing Lin, Zhengyi Luo, Hok Wai Tsui, Jiwen Yu, Xiu Li, Qifeng Chen, Jian Zhang, Lei Zhang, Ping Tan

    Abstract: Mastering basketball skills such as diverse layups and dribbling involves complex interactions with the ball and requires real-time adjustments. Traditional reinforcement learning methods for interaction skills rely on labor-intensive, manually designed rewards that do not generalize well across different skills. Inspired by how humans learn from demonstrations, we propose SkillMimic, a data-drive…

    Submitted 12 August, 2024; originally announced August 2024.

  8. arXiv:2408.14997  [pdf, other]

    cs.RO cs.CV

    Depth Restoration of Hand-Held Transparent Objects for Human-to-Robot Handover

    Authors: Ran Yu, Haixin Yu, Huang Yan, Ziwu Song, Shoujie Li, Wenbo Ding

    Abstract: Transparent objects are common in daily life, while their unique optical properties pose challenges for RGB-D cameras, which struggle to capture accurate depth information. For assistant robots, accurately perceiving transparent objects held by humans is essential for effective human-robot interaction. This paper presents a Hand-Aware Depth Restoration (HADR) method for hand-held transparent objec…

    Submitted 27 August, 2024; originally announced August 2024.

    Comments: 7 pages, 7 figures, conference

  9. arXiv:2408.13740  [pdf, other]

    cs.RO

    PIE: Parkour with Implicit-Explicit Learning Framework for Legged Robots

    Authors: Shixin Luo, Songbo Li, Ruiqi Yu, Zhicheng Wang, Jun Wu, Qiuguo Zhu

    Abstract: Parkour presents a highly challenging task for legged robots, requiring them to traverse various terrains with agile and smooth locomotion. This necessitates comprehensive understanding of both the robot's own state and the surrounding terrain, despite the inherent unreliability of robot perception and actuation. Current state-of-the-art methods either rely on complex pre-trained high-level terrai…

    Submitted 3 September, 2024; v1 submitted 25 August, 2024; originally announced August 2024.

    Comments: Accepted for IEEE Robotics and Automation Letters (RA-L)

  10. arXiv:2408.13501  [pdf]

    cs.CL cs.IR

    Utilizing Large Language Models for Named Entity Recognition in Traditional Chinese Medicine against COVID-19 Literature: Comparative Study

    Authors: Xu Tong, Nina Smirnova, Sharmila Upadhyaya, Ran Yu, Jack H. Culbert, Chao Sun, Wolfgang Otto, Philipp Mayr

    Abstract: Objective: To explore and compare the performance of ChatGPT and other state-of-the-art LLMs on domain-specific NER tasks covering different entity types and domains in TCM against COVID-19 literature. Methods: We established a dataset of 389 articles on TCM against COVID-19, and manually annotated 48 of them with 6 types of entities belonging to 3 domains as the ground truth, against which the NE…

    Submitted 24 August, 2024; originally announced August 2024.

    Comments: 22 pages with 2 figures

    ACM Class: H.3.3

  11. arXiv:2408.10903  [pdf, other]

    cs.CL cs.HC

    BEYOND DIALOGUE: A Profile-Dialogue Alignment Framework Towards General Role-Playing Language Model

    Authors: Yeyong Yu, Runsheng Yu, Haojie Wei, Zhanqiu Zhang, Quan Qian

    Abstract: The rapid advancement of large language models (LLMs) has revolutionized role-playing, enabling the development of general role-playing models. However, current role-playing training has two significant issues: (I) Using a predefined role profile to prompt dialogue training for specific scenarios usually leads to inconsistencies and even conflicts between the dialogue and the profile, resulting in…

    Submitted 28 August, 2024; v1 submitted 20 August, 2024; originally announced August 2024.

  12. arXiv:2408.10774  [pdf, other]

    cs.AI cs.CL

    Flexora: Flexible Low Rank Adaptation for Large Language Models

    Authors: Chenxing Wei, Yao Shu, Ying Tiffany He, Fei Richard Yu

    Abstract: Large Language Models (LLMs) are driving advancements in artificial intelligence by increasing the scale of model parameters, which has significantly enhanced generalization ability and unlocked new capabilities in practice. However, their performance in specific downstream tasks is usually hindered by their knowledge boundaries on these tasks. Thus, fine-tuning techniques, especially the widely u…

    Submitted 21 August, 2024; v1 submitted 20 August, 2024; originally announced August 2024.

    Comments: 29 pages, 13 figures

  13. arXiv:2408.09698  [pdf, other]

    cs.IR cs.AI

    Harnessing Multimodal Large Language Models for Multimodal Sequential Recommendation

    Authors: Yuyang Ye, Zhi Zheng, Yishan Shen, Tianshu Wang, Hengruo Zhang, Peijun Zhu, Runlong Yu, Kai Zhang, Hui Xiong

    Abstract: Recent advances in Large Language Models (LLMs) have demonstrated significant potential in the field of Recommendation Systems (RSs). Most existing studies have focused on converting user behavior logs into textual prompts and leveraging techniques such as prompt tuning to enable LLMs for recommendation tasks. Meanwhile, research interest has recently grown in multimodal recommendation systems tha…

    Submitted 20 August, 2024; v1 submitted 19 August, 2024; originally announced August 2024.

  14. arXiv:2408.08201  [pdf, other]

    cs.CV

    Heavy Labels Out! Dataset Distillation with Label Space Lightening

    Authors: Ruonan Yu, Songhua Liu, Zigeng Chen, Jingwen Ye, Xinchao Wang

    Abstract: Dataset distillation or condensation aims to condense a large-scale training dataset into a much smaller synthetic one such that the training performance of distilled and original sets on neural networks is similar. Although the number of training samples can be reduced substantially, current state-of-the-art methods heavily rely on enormous soft labels to achieve satisfactory performance. As a r…

    Submitted 15 August, 2024; originally announced August 2024.

  15. arXiv:2408.07490  [pdf, other]

    cs.CV

    Attention-Guided Perturbation for Unsupervised Image Anomaly Detection

    Authors: Tingfeng Huang, Yuxuan Cheng, Jingbo Xia, Rui Yu, Yuxuan Cai, Jinhai Xiang, Xinwei He, Xiang Bai

    Abstract: Reconstruction-based methods have significantly advanced modern unsupervised anomaly detection. However, the strong capacity of neural networks often violates the underlying assumptions by reconstructing abnormal samples well. To alleviate this issue, we present a simple yet effective reconstruction framework named Attention-Guided Perturbation Network (AGPNet), which learns to add perturbation nois…

    Submitted 14 August, 2024; originally announced August 2024.

  16. arXiv:2408.06656  [pdf, other]

    cs.RO

    MAPPO-PIS: A Multi-Agent Proximal Policy Optimization Method with Prior Intent Sharing for CAVs' Cooperative Decision-Making

    Authors: Yicheng Guo, Jiaqi Liu, Rongjie Yu, Peng Hang, Jian Sun

    Abstract: Vehicle-to-Vehicle (V2V) technologies have great potential for enhancing traffic flow efficiency and safety. However, cooperative decision-making in multi-agent systems, particularly in complex human-machine mixed merging areas, remains challenging for connected and autonomous vehicles (CAVs). Intent sharing, a key aspect of human coordination, may offer an effective solution to these decision-mak…

    Submitted 26 August, 2024; v1 submitted 13 August, 2024; originally announced August 2024.

  17. arXiv:2408.04590  [pdf, other]

    cs.LG

    Learn To Learn More Precisely

    Authors: Runxi Cheng, Yongxian Wei, Xianglong He, Wanyun Zhu, Songsong Huang, Fei Richard Yu, Fei Ma, Chun Yuan

    Abstract: Meta-learning has been extensively applied in the domains of few-shot learning and fast adaptation, achieving remarkable performance. While meta-learning methods like Model-Agnostic Meta-Learning (MAML) and its variants provide a good set of initial parameters for the model, the model still tends to learn shortcut features, which leads to poor generalization. In this paper, we propose the formal c…

    Submitted 8 August, 2024; originally announced August 2024.

    Comments: 10 pages, 4 figures, meta learning

  18. arXiv:2407.16674  [pdf, other]

    cs.LG cs.AI

    KAN or MLP: A Fairer Comparison

    Authors: Runpeng Yu, Weihao Yu, Xinchao Wang

    Abstract: This paper does not introduce a novel method. Instead, it offers a fairer and more comprehensive comparison of KAN and MLP models across various tasks, including machine learning, computer vision, audio processing, natural language processing, and symbolic formula representation. Specifically, we control the number of parameters and FLOPs to compare the performance of KAN and MLP. Our main observa…

    Submitted 17 August, 2024; v1 submitted 23 July, 2024; originally announced July 2024.

    Comments: Technical Report

  19. arXiv:2407.15141  [pdf, other]

    cs.AI cs.LG physics.chem-ph

    Text-Augmented Multimodal LLMs for Chemical Reaction Condition Recommendation

    Authors: Yu Zhang, Ruijie Yu, Kaipeng Zeng, Ding Li, Feng Zhu, Xiaokang Yang, Yaohui Jin, Yanyan Xu

    Abstract: High-throughput reaction condition (RC) screening is fundamental to chemical synthesis. However, current RC screening suffers from laborious and costly trial-and-error workflows. Traditional computer-aided synthesis planning (CASP) tools fail to find suitable RCs due to data sparsity and inadequate reaction representations. Nowadays, large language models (LLMs) are capable of tackling chemistry-r…

    Submitted 21 July, 2024; originally announced July 2024.

  20. arXiv:2407.14968  [pdf, other]

    cs.LG q-bio.BM

    Technical report: Improving the properties of molecules generated by LIMO

    Authors: Vineet Thumuluri, Peter Eckmann, Michael K. Gilson, Rose Yu

    Abstract: This technical report investigates variants of the Latent Inceptionism on Molecules (LIMO) framework to improve the properties of generated molecules. We conduct ablative studies of molecular representation, decoder model, and surrogate model training scheme. The experiments suggest that an autoregressive Transformer decoder with GroupSELFIES achieves the best average properties for the random gener…

    Submitted 20 July, 2024; originally announced July 2024.

    Comments: 9 pages, 2 figures

  21. arXiv:2407.11902  [pdf, other]

    cs.CV

    Encapsulating Knowledge in One Prompt

    Authors: Qi Li, Runpeng Yu, Xinchao Wang

    Abstract: This paradigm encapsulates knowledge from various models into a solitary prompt without altering the original models or requiring access to the training data, which enables us to achieve efficient and convenient knowledge transfer in more realistic scenarios. From a practicality standpoint, this paradigm not only for the first time proves the effectiveness of Visual Prompt in data inaccessible con…

    Submitted 16 July, 2024; originally announced July 2024.

  22. arXiv:2407.11203  [pdf]

    cs.CY cs.CL

    The Life Cycle of Large Language Models: A Review of Biases in Education

    Authors: Jinsook Lee, Yann Hicke, Renzhe Yu, Christopher Brooks, René F. Kizilcec

    Abstract: Large Language Models (LLMs) are increasingly adopted in educational contexts to provide personalized support to students and teachers. The unprecedented capacity of LLM-based applications to understand and generate natural language can potentially improve instructional effectiveness and learning outcomes, but the integration of LLMs in education technology has renewed concerns over algorithmic bi…

    Submitted 3 June, 2024; originally announced July 2024.

    Comments: 20 pages, 2 figures, preprint for British Journal of Educational Technology submission

  23. arXiv:2407.11036  [pdf, other]

    cs.AI cs.NI

    Hybrid-Generative Diffusion Models for Attack-Oriented Twin Migration in Vehicular Metaverses

    Authors: Yingkai Kang, Jinbo Wen, Jiawen Kang, Tao Zhang, Hongyang Du, Dusit Niyato, Rong Yu, Shengli Xie

    Abstract: The vehicular metaverse is envisioned as a blended immersive domain that promises to bring revolutionary changes to the automotive industry. As a core component of vehicular metaverses, Vehicle Twins (VTs) are digital twins that cover the entire life cycle of vehicles, providing immersive virtual services for Vehicular Metaverse Users (VMUs). Vehicles with limited resources offload the computation…

    Submitted 5 July, 2024; originally announced July 2024.

  24. arXiv:2407.10528  [pdf, other]

    cs.CV

    Local Action-Guided Motion Diffusion Model for Text-to-Motion Generation

    Authors: Peng Jin, Hao Li, Zesen Cheng, Kehan Li, Runyi Yu, Chang Liu, Xiangyang Ji, Li Yuan, Jie Chen

    Abstract: Text-to-motion generation requires not only grounding local actions in language but also seamlessly blending these individual actions to synthesize diverse and realistic global motions. However, existing motion generation methods primarily focus on the direct synthesis of global motions while neglecting the importance of generating and controlling local actions. In this paper, we propose the local…

    Submitted 15 July, 2024; originally announced July 2024.

    Comments: Accepted by ECCV 2024

  25. PAIL: Performance based Adversarial Imitation Learning Engine for Carbon Neutral Optimization

    Authors: Yuyang Ye, Lu-An Tang, Haoyu Wang, Runlong Yu, Wenchao Yu, Erhu He, Haifeng Chen, Hui Xiong

    Abstract: Achieving carbon neutrality within industrial operations has become increasingly imperative for sustainable development. It is both a significant challenge and a key opportunity for operational optimization in industry 4.0. In recent years, Deep Reinforcement Learning (DRL) based methods offer promising enhancements for sequential optimization processes and can be used for reducing carbon emission…

    Submitted 11 July, 2024; originally announced July 2024.

  26. arXiv:2407.08882  [pdf, ps, other]

    cs.HC

    Emerging Practices for Large Multimodal Model (LMM) Assistance for People with Visual Impairments: Implications for Design

    Authors: Jingyi Xie, Rui Yu, He Zhang, Sooyeon Lee, Syed Masum Billah, John M. Carroll

    Abstract: People with visual impairments perceive their environment non-visually and often use AI-powered assistive tools to obtain textual descriptions of visual information. Recent large vision-language model-based AI-powered tools like Be My AI are more capable of understanding users' inquiries in natural language and describing the scene in audible text; however, the extent to which these tools are usef…

    Submitted 11 July, 2024; originally announced July 2024.

  27. arXiv:2407.03640  [pdf, other]

    cs.LG cs.CL cs.CV

    Generative Technology for Human Emotion Recognition: A Scope Review

    Authors: Fei Ma, Yucheng Yuan, Yifan Xie, Hongwei Ren, Ivan Liu, Ying He, Fuji Ren, Fei Richard Yu, Shiguang Ni

    Abstract: Affective computing stands at the forefront of artificial intelligence (AI), seeking to imbue machines with the ability to comprehend and respond to human emotions. Central to this field is emotion recognition, which endeavors to identify and interpret human emotional states from different modalities, such as speech, facial images, text, and physiological signals. In recent years, important progre…

    Submitted 4 July, 2024; originally announced July 2024.

    Comments: Under Review

  28. arXiv:2407.00610  [pdf, other]

    cs.LG

    Diff-BBO: Diffusion-Based Inverse Modeling for Black-Box Optimization

    Authors: Dongxia Wu, Nikki Lijing Kuang, Ruijia Niu, Yi-An Ma, Rose Yu

    Abstract: Black-box optimization (BBO) aims to optimize an objective function by iteratively querying a black-box oracle. This process demands sample-efficient optimization due to the high computational cost of function evaluations. While prior studies focus on forward approaches to learn surrogates for the unknown objective function, they struggle with high-dimensional inputs where valid inputs form a smal…

    Submitted 30 June, 2024; originally announced July 2024.

  29. arXiv:2406.14798  [pdf, other]

    cs.LG cs.AI physics.ao-ph stat.ML

    Probabilistic Emulation of a Global Climate Model with Spherical DYffusion

    Authors: Salva Rühling Cachay, Brian Henn, Oliver Watt-Meyer, Christopher S. Bretherton, Rose Yu

    Abstract: Data-driven deep learning models are on the verge of transforming global weather forecasting. It is an open question if this success can extend to climate modeling, where long inference rollouts and data complexity pose significant challenges. Here, we present the first conditional generative model able to produce global climate ensemble simulations that are accurate and physically consistent. Our…

    Submitted 20 June, 2024; originally announced June 2024.

  30. arXiv:2406.13362  [pdf, other]

    cs.CV cs.CL cs.LG

    VisualRWKV: Exploring Recurrent Neural Networks for Visual Language Models

    Authors: Haowen Hou, Peigen Zeng, Fei Ma, Fei Richard Yu

    Abstract: Visual Language Models (VLMs) have rapidly progressed with the recent success of large language models. However, there have been few attempts to incorporate efficient linear Recurrent Neural Network (RNN) architectures into VLMs. In this study, we introduce VisualRWKV, the first application of a linear RNN model to multimodal learning tasks, leveraging the pre-trained RWKV language model. We pro…

    Submitted 19 June, 2024; originally announced June 2024.

    Comments: 18 pages, 14 tables, 6 figures

  31. arXiv:2406.11129  [pdf, other]

    cs.CV

    Neural Lineage

    Authors: Runpeng Yu, Xinchao Wang

    Abstract: Given a well-behaved neural network, is it possible to identify its parent, based on which it was tuned? In this paper, we introduce a novel task known as neural lineage detection, aiming at discovering lineage relationships between parent and child models. Specifically, from a set of parent models, neural lineage detection predicts which parent model a child model has been fine-tuned from. We propos…

    Submitted 16 June, 2024; originally announced June 2024.

  32. arXiv:2406.10411  [pdf, other]

    cs.MA cs.AI

    Tree Search for Simultaneous Move Games via Equilibrium Approximation

    Authors: Ryan Yu, Alex Olshevsky, Peter Chin

    Abstract: Neural network supported tree-search has shown strong results in a variety of perfect information multi-agent tasks. However, the performance of these methods on partial information games has generally been below competing approaches. Here we study the class of simultaneous-move games, which are a subclass of partial information games which are most similar to perfect information games: both agent…

    Submitted 14 June, 2024; originally announced June 2024.

    Comments: 9 pages, 5 tables, 1 figure

  33. arXiv:2406.08096  [pdf, other]

    cs.CV

    Make Your Actor Talk: Generalizable and High-Fidelity Lip Sync with Motion and Appearance Disentanglement

    Authors: Runyi Yu, Tianyu He, Ailing Zhang, Yuchi Wang, Junliang Guo, Xu Tan, Chang Liu, Jie Chen, Jiang Bian

    Abstract: We aim to edit the lip movements in talking video according to the given speech while preserving the personal identity and visual details. The task can be decomposed into two sub-problems: (1) speech-driven lip motion generation and (2) visual appearance synthesis. Current solutions handle the two sub-problems within a single generative model, resulting in a challenging trade-off between lip-sync…

    Submitted 16 June, 2024; v1 submitted 12 June, 2024; originally announced June 2024.

    Comments: 14 pages of main text, 23 pages in total, 9 figures

  34. arXiv:2406.03324  [pdf, ps, other]

    cs.LG

    UDQL: Bridging The Gap between MSE Loss and The Optimal Value Function in Offline Reinforcement Learning

    Authors: Yu Zhang, Rui Yu, Zhipeng Yao, Wenyuan Zhang, Jun Wang, Liming Zhang

    Abstract: The Mean Square Error (MSE) is commonly utilized to estimate the solution of the optimal value function in the vast majority of offline reinforcement learning (RL) models and has achieved outstanding performance. However, we find that its principle can lead to an overestimation phenomenon for the value function. In this paper, we first theoretically analyze the overestimation phenomenon caused by MSE and pr…

    Submitted 5 June, 2024; originally announced June 2024.

  35. arXiv:2405.21040  [pdf, other]

    cs.CL cs.AI

    Direct Alignment of Language Models via Quality-Aware Self-Refinement

    Authors: Runsheng Yu, Yong Wang, Xiaoqi Jiao, Youzhi Zhang, James T. Kwok

    Abstract: Reinforcement Learning from Human Feedback (RLHF) has been commonly used to align the behaviors of Large Language Models (LLMs) with human preferences. Recently, a popular alternative is Direct Preference Optimization (DPO), which replaces an LLM-based reward model with the policy itself, thus obviating the need for extra memory and training time to learn the reward model. However, DPO does not consid…

    Submitted 31 May, 2024; originally announced May 2024.

  36. arXiv:2405.18708  [pdf, other]

    cs.AI cs.IR cs.NE

    Cognitive Evolutionary Learning to Select Feature Interactions for Recommender Systems

    Authors: Runlong Yu, Qixiang Shao, Qi Liu, Huan Liu, Enhong Chen

    Abstract: Feature interaction selection is a fundamental problem in commercial recommender systems. Most approaches equally enumerate all features and interactions by the same pre-defined operation under expert guidance. Their recommendation is unsatisfactory sometimes due to the following issues: (1) They cannot ensure the learning abilities of models because their architectures are poorly adaptable to tas…

    Submitted 28 May, 2024; originally announced May 2024.

  37. arXiv:2405.18536  [pdf, other]

    cs.LG

    Data-Driven Simulator for Mechanical Circulatory Support with Domain Adversarial Neural Process

    Authors: Sophia Sun, Wenyuan Chen, Zihao Zhou, Sonia Fereidooni, Elise Jortberg, Rose Yu

    Abstract: We propose a data-driven simulator for Mechanical Circulatory Support (MCS) devices, implemented as a probabilistic deep sequence model. Existing mechanical simulators for MCS rely on oversimplifying assumptions and are insensitive to patient-specific behavior, limiting their applicability to real-world treatment scenarios. To address these shortcomings, our model Domain Adversarial Neural Process (DANP) employs a neural process archit…

    Submitted 28 May, 2024; originally announced May 2024.

  38. arXiv:2405.18291  [pdf, other]

    cs.LG cs.AI cs.DC

    FedSAC: Dynamic Submodel Allocation for Collaborative Fairness in Federated Learning

    Authors: Zihui Wang, Zheng Wang, Lingjuan Lyu, Zhaopeng Peng, Zhicheng Yang, Chenglu Wen, Rongshan Yu, Cheng Wang, Xiaoliang Fan

    Abstract: Collaborative fairness stands as an essential element in federated learning to encourage client participation by equitably distributing rewards based on individual contributions. Existing methods primarily focus on adjusting gradient allocations among clients to achieve collaborative fairness. However, they frequently overlook crucial factors such as maintaining consistency across local models and…

    Submitted 28 May, 2024; originally announced May 2024.

    Comments: Accepted by KDD'24

  39. arXiv:2405.17816  [pdf, other]

    cs.CV cs.LG

    Pursuing Feature Separation based on Neural Collapse for Out-of-Distribution Detection

    Authors: Yingwen Wu, Ruiji Yu, Xinwen Cheng, Zhengbao He, Xiaolin Huang

    Abstract: In the open world, detecting out-of-distribution (OOD) data, whose labels are disjoint with those of in-distribution (ID) samples, is important for reliable deep neural networks (DNNs). To achieve better detection performance, one type of approach proposes to fine-tune the model with auxiliary OOD datasets to amplify the difference between ID and OOD data through a separation loss defined on model…

    Submitted 28 May, 2024; originally announced May 2024.

  40. arXiv:2405.16756  [pdf, other]

    cs.LG

    Symmetry-Informed Governing Equation Discovery

    Authors: Jianke Yang, Wang Rao, Nima Dehmamy, Robin Walters, Rose Yu

    Abstract: Despite the advancements in learning governing differential equations from observations of dynamical systems, data-driven methods are often unaware of fundamental physical laws, such as frame invariance. As a result, these algorithms may search an unnecessarily large space and discover equations that are less accurate or overly complex. In this paper, we propose to leverage symmetry in automated e…

    Submitted 26 May, 2024; originally announced May 2024.

  41. arXiv:2405.15758  [pdf, other]

    cs.CV cs.AI

    InstructAvatar: Text-Guided Emotion and Motion Control for Avatar Generation

    Authors: Yuchi Wang, Junliang Guo, Jianhong Bai, Runyi Yu, Tianyu He, Xu Tan, Xu Sun, Jiang Bian

    Abstract: Recent talking avatar generation models have made strides in achieving realistic and accurate lip synchronization with the audio, but often fall short in controlling and conveying detailed expressions and emotions of the avatar, making the generated video less vivid and controllable. In this paper, we propose a novel text-guided approach for generating emotionally expressive 2D avatars, offering f…

    Submitted 24 May, 2024; originally announced May 2024.

    Comments: Project page: https://wangyuchi369.github.io/InstructAvatar/

  42. arXiv:2405.14186  [pdf]

    cs.LG cs.CY

    Fairness Hub Technical Briefs: Definition and Detection of Distribution Shift

    Authors: Nicolas Acevedo, Carmen Cortez, Chris Brooks, Rene Kizilcec, Renzhe Yu

    Abstract: Distribution shift is a common situation in machine learning tasks, where the data used for training a model is different from the data the model is applied to in the real world. This issue arises across multiple technical settings: from standard prediction tasks, to time-series forecasting, and to more recent applications of large language models (LLMs). This mismatch can lead to performance redu…

    Submitted 23 May, 2024; originally announced May 2024.

    Comments: Learning Engineering Virtual Institute

  43. arXiv:2405.12872  [pdf, other]

    eess.IV cs.CV

    Spatial-aware Attention Generative Adversarial Network for Semi-supervised Anomaly Detection in Medical Image

    Authors: Zerui Zhang, Zhichao Sun, Zelong Liu, Bo Du, Rui Yu, Zhou Zhao, Yongchao Xu

    Abstract: Medical anomaly detection is a critical research area aimed at recognizing abnormal images to aid in diagnosis. Most existing methods adopt synthetic anomalies and image restoration on normal samples to detect anomalies. The unlabeled data consisting of both normal and abnormal data is not well explored. We introduce a novel Spatial-aware Attention Generative Adversarial Network (SAGAN) for one-class…

    Submitted 21 May, 2024; originally announced May 2024.

    Comments: Early Accept by MICCAI 2024

  44. arXiv:2405.07067  [pdf, other]

    cs.LG

    Learning Flame Evolution Operator under Hybrid Darrieus Landau and Diffusive Thermal Instability

    Authors: Rixin Yu, Erdzan Hodzic, Karl-Johan Nogenmyr

    Abstract: Recent advancements in the integration of artificial intelligence (AI) and machine learning (ML) with physical sciences have led to significant progress in addressing complex phenomena governed by nonlinear partial differential equations (PDE). This paper explores the application of novel operator learning methodologies to unravel the intricate dynamics of flame instability, particularly focusing…

    Submitted 11 May, 2024; originally announced May 2024.

    Comments: 25 pages, 10 figures

    MSC Class: 68T07; 76E30

  45. arXiv:2405.02561  [pdf, other]

    cs.LG math.NA

    Understanding the Difficulty of Solving Cauchy Problems with PINNs

    Authors: Tao Wang, Bo Zhao, Sicun Gao, Rose Yu

    Abstract: Physics-Informed Neural Networks (PINNs) have gained popularity in scientific computing in recent years. However, they often fail to achieve the same level of accuracy as classical methods in solving differential equations. In this paper, we identify two sources of this issue in the case of Cauchy problems: the use of $L^2$ residuals as objective functions and the approximation gap of neural netwo…

    Submitted 18 June, 2024; v1 submitted 4 May, 2024; originally announced May 2024.

    Comments: 13 pages and 18 figures

  46. arXiv:2405.01555  [pdf, ps, other]

    cs.NI cs.AI

    Digital Twin-Empowered Task Assignment in Aerial MEC Network: A Resource Coalition Cooperation Approach with Generative Model

    Authors: Xin Tang, Qian Chen, Rong Yu, Xiaohuan Li

    Abstract: To meet the demands for ubiquitous communication and temporary edge computing in 6G networks, aerial mobile edge computing (MEC) networks have been envisioned as a new paradigm. However, dynamic user requests pose challenges for task assignment strategies. Most of the existing research assumes that the strategy is deployed on ground-based stations or UAVs, which will be ineffective in an environme…

    Submitted 1 August, 2024; v1 submitted 17 March, 2024; originally announced May 2024.

  47. arXiv:2405.00236  [pdf, other]

    cs.RO cs.AI cs.CV cs.LG

    STT: Stateful Tracking with Transformers for Autonomous Driving

    Authors: Longlong Jing, Ruichi Yu, Xu Chen, Zhengli Zhao, Shiwei Sheng, Colin Graber, Qi Chen, Qinru Li, Shangxuan Wu, Han Deng, Sangjin Lee, Chris Sweeney, Qiurui He, Wei-Chih Hung, Tong He, Xingyi Zhou, Farshid Moussavi, Zijian Guo, Yin Zhou, Mingxing Tan, Weilong Yang, Congcong Li

    Abstract: Tracking objects in three-dimensional space is critical for autonomous driving. To ensure safety while driving, the tracker must be able to reliably track objects across frames and accurately estimate their states such as velocity and acceleration in the present. Existing works frequently focus on the association task while either neglecting the model performance on state estimation or deploying c…

    Submitted 30 April, 2024; originally announced May 2024.

    Comments: ICRA 2024

  48. arXiv:2404.14006  [pdf, other]

    cs.LG cs.CV

    Distilled Datamodel with Reverse Gradient Matching

    Authors: Jingwen Ye, Ruonan Yu, Songhua Liu, Xinchao Wang

    Abstract: The proliferation of large-scale AI models trained on extensive datasets has revolutionized machine learning. With these models taking on increasingly central roles in various applications, the need to understand their behavior and enhance interpretability has become paramount. To investigate the impact of changes in training data on a pre-trained model, a common approach is leave-one-out retraini…

    Submitted 22 April, 2024; originally announced April 2024.

    Comments: Accepted by CVPR2024

  49. arXiv:2404.13163  [pdf, other]

    econ.GN cs.CL

    A national longitudinal dataset of skills taught in U.S. higher education curricula

    Authors: Alireza Javadian Sabet, Sarah H. Bana, Renzhe Yu, Morgan R. Frank

    Abstract: Higher education plays a critical role in driving an innovative economy by equipping students with knowledge and skills demanded by the workforce. While researchers and practitioners have developed data systems to track detailed occupational skills, such as those established by the U.S. Department of Labor (DOL), much less effort has been made to document skill development in higher education at a…

    Submitted 19 April, 2024; originally announced April 2024.

    Comments: 44 pages, 21 figures, 10 tables

  50. arXiv:2404.08027  [pdf, other]

    cs.CV cs.AI cs.LG q-bio.QM

    SurvMamba: State Space Model with Multi-grained Multi-modal Interaction for Survival Prediction

    Authors: Ying Chen, Jiajing Xie, Yuxiang Lin, Yuhang Song, Wenxian Yang, Rongshan Yu

    Abstract: Multi-modal learning that combines pathological images with genomic data has significantly enhanced the accuracy of survival prediction. Nevertheless, existing methods have not fully utilized the inherent hierarchical structure within both whole slide images (WSIs) and transcriptomic data, from which better intra-modal representations and inter-modal integration could be derived. Moreover, many ex…

    Submitted 11 April, 2024; originally announced April 2024.