Skip to main content

Showing 1–50 of 211 results for author: Xie, R

Searching in archive cs. Search in all archives.
.
  1. arXiv:2409.07237  [pdf, other

    cs.IR

    Negative Sampling in Recommendation: A Survey and Future Directions

    Authors: Haokai Ma, Ruobing Xie, Lei Meng, Fuli Feng, Xiaoyu Du, Xingwu Sun, Zhanhui Kang, Xiangxu Meng

    Abstract: Recommender systems aim to capture users' personalized preferences from the cast amount of user behaviors, making them pivotal in the era of information explosion. However, the presence of the dynamic preference, the "information cocoons", and the inherent feedback loops in recommendation make users interact with a limited number of items. Conventional recommendation algorithms typically focus on… ▽ More

    Submitted 11 September, 2024; originally announced September 2024.

    Comments: 38 pages, 9 figures; Under review

  2. arXiv:2409.06130  [pdf, other

    cs.CR cs.AI

    On the Weaknesses of Backdoor-based Model Watermarking: An Information-theoretic Perspective

    Authors: Aoting Hu, Yanzhi Chen, Renjie Xie, Adrian Weller

    Abstract: Safeguarding the intellectual property of machine learning models has emerged as a pressing concern in AI security. Model watermarking is a powerful technique for protecting ownership of machine learning models, yet its reliability has been recently challenged by recent watermark removal attacks. In this work, we investigate why existing watermark embedding techniques particularly those based on b… ▽ More

    Submitted 9 September, 2024; originally announced September 2024.

  3. PIP: Detecting Adversarial Examples in Large Vision-Language Models via Attention Patterns of Irrelevant Probe Questions

    Authors: Yudong Zhang, Ruobing Xie, Jiansheng Chen, Xingwu Sun, Yu Wang

    Abstract: Large Vision-Language Models (LVLMs) have demonstrated their powerful multimodal capabilities. However, they also face serious safety problems, as adversaries can induce robustness issues in LVLMs through the use of well-designed adversarial examples. Therefore, LVLMs are in urgent need of detection tools for adversarial examples to prevent incorrect responses. In this work, we first discover that… ▽ More

    Submitted 8 September, 2024; originally announced September 2024.

    Comments: Accepted by ACM Multimedia 2024 BNI track (Oral)

  4. arXiv:2409.02657  [pdf, other

    cs.CV cs.AI cs.MM

    PoseTalk: Text-and-Audio-based Pose Control and Motion Refinement for One-Shot Talking Head Generation

    Authors: Jun Ling, Yiwen Wang, Han Xue, Rong Xie, Li Song

    Abstract: While previous audio-driven talking head generation (THG) methods generate head poses from driving audio, the generated poses or lips cannot match the audio well or are not editable. In this study, we propose \textbf{PoseTalk}, a THG system that can freely generate lip-synchronized talking head videos with free head poses conditioned on text prompts and audio. The core insight of our method is usi… ▽ More

    Submitted 4 September, 2024; originally announced September 2024.

    Comments: 7+5 pages, 15 figures

  5. arXiv:2408.10681  [pdf, other

    cs.CL cs.LG

    HMoE: Heterogeneous Mixture of Experts for Language Modeling

    Authors: An Wang, Xingwu Sun, Ruobing Xie, Shuaipeng Li, Jiaqi Zhu, Zhen Yang, Pinxue Zhao, J. N. Han, Zhanhui Kang, Di Wang, Naoaki Okazaki, Cheng-zhong Xu

    Abstract: Mixture of Experts (MoE) offers remarkable performance and computational efficiency by selectively activating subsets of model parameters. Traditionally, MoE models use homogeneous experts, each with identical capacity. However, varying complexity in input data necessitates experts with diverse capabilities, while homogeneous MoE hinders effective expert specialization and efficient parameter util… ▽ More

    Submitted 20 August, 2024; originally announced August 2024.

  6. arXiv:2408.08209  [pdf, other

    cs.IR

    Modeling Domain and Feedback Transitions for Cross-Domain Sequential Recommendation

    Authors: Changshuo Zhang, Teng Shi, Xiao Zhang, Qi Liu, Ruobing Xie, Jun Xu, Ji-Rong Wen

    Abstract: Nowadays, many recommender systems encompass various domains to cater to users' diverse needs, leading to user behaviors transitioning across different domains. In fact, user behaviors across different domains reveal changes in preference toward recommended items. For instance, a shift from negative feedback to positive feedback indicates improved user satisfaction. However, existing cross-domain… ▽ More

    Submitted 15 August, 2024; originally announced August 2024.

  7. arXiv:2408.01191  [pdf, other

    cs.CV

    A Weakly Supervised and Globally Explainable Learning Framework for Brain Tumor Segmentation

    Authors: Ruitao Xie, Limai Jiang, Xiaoxi He, Yi Pan, Yunpeng Cai

    Abstract: Machine-based brain tumor segmentation can help doctors make better diagnoses. However, the complex structure of brain tumors and expensive pixel-level annotations present challenges for automatic tumor segmentation. In this paper, we propose a counterfactual generation framework that not only achieves exceptional brain tumor segmentation performance without the need for pixel-level annotations, b… ▽ More

    Submitted 2 August, 2024; originally announced August 2024.

    Comments: 2024 IEEE International Conference on Multimedia and Expo

  8. arXiv:2407.15866  [pdf, other

    cs.LG cs.AI cs.AR

    SmartQuant: CXL-based AI Model Store in Support of Runtime Configurable Weight Quantization

    Authors: Rui Xie, Asad Ul Haq, Linsen Ma, Krystal Sun, Sanchari Sen, Swagath Venkataramani, Liu Liu, Tong Zhang

    Abstract: Recent studies have revealed that, during the inference on generative AI models such as transformer, the importance of different weights exhibits substantial context-dependent variations. This naturally manifests a promising potential of adaptively configuring weight quantization to improve the generative AI inference efficiency. Although configurable weight quantization can readily leverage the h… ▽ More

    Submitted 17 August, 2024; v1 submitted 17 July, 2024; originally announced July 2024.

  9. arXiv:2407.14386  [pdf, other

    cs.LG

    Frontiers of Deep Learning: From Novel Application to Real-World Deployment

    Authors: Rui Xie

    Abstract: Deep learning continues to re-shape numerous fields, from natural language processing and imaging to data analytics and recommendation systems. This report studies two research papers that represent recent progress on deep learning from two largely different aspects: The first paper applied the transformer networks, which are typically used in language models, to improve the quality of synthetic a… ▽ More

    Submitted 19 July, 2024; originally announced July 2024.

  10. arXiv:2407.07061  [pdf, other

    cs.CL

    Internet of Agents: Weaving a Web of Heterogeneous Agents for Collaborative Intelligence

    Authors: Weize Chen, Ziming You, Ran Li, Yitong Guan, Chen Qian, Chenyang Zhao, Cheng Yang, Ruobing Xie, Zhiyuan Liu, Maosong Sun

    Abstract: The rapid advancement of large language models (LLMs) has paved the way for the development of highly capable autonomous agents. However, existing multi-agent frameworks often struggle with integrating diverse capable third-party agents due to reliance on agents defined within their own ecosystems. They also face challenges in simulating distributed environments, as most frameworks are limited to… ▽ More

    Submitted 10 July, 2024; v1 submitted 9 July, 2024; originally announced July 2024.

    Comments: work in progress

  11. arXiv:2407.03636  [pdf, other

    cs.CV

    Diff-Restorer: Unleashing Visual Prompts for Diffusion-based Universal Image Restoration

    Authors: Yuhong Zhang, Hengsheng Zhang, Xinning Chai, Zhengxue Cheng, Rong Xie, Li Song, Wenjun Zhang

    Abstract: Image restoration is a classic low-level problem aimed at recovering high-quality images from low-quality images with various degradations such as blur, noise, rain, haze, etc. However, due to the inherent complexity and non-uniqueness of degradation in real-world images, it is challenging for a model trained for single tasks to handle real-world restoration problems effectively. Moreover, existin… ▽ More

    Submitted 4 July, 2024; originally announced July 2024.

  12. arXiv:2407.03635  [pdf, other

    cs.CV

    MRIR: Integrating Multimodal Insights for Diffusion-based Realistic Image Restoration

    Authors: Yuhong Zhang, Hengsheng Zhang, Xinning Chai, Rong Xie, Li Song, Wenjun Zhang

    Abstract: Realistic image restoration is a crucial task in computer vision, and the use of diffusion-based models for image restoration has garnered significant attention due to their ability to produce realistic results. However, the quality of the generated images is still a significant challenge due to the severity of image degradation and the uncontrollability of the diffusion model. In this work, we de… ▽ More

    Submitted 4 July, 2024; originally announced July 2024.

  13. arXiv:2407.02371  [pdf, other

    cs.CV

    OpenVid-1M: A Large-Scale High-Quality Dataset for Text-to-video Generation

    Authors: Kepan Nan, Rui Xie, Penghao Zhou, Tiehan Fan, Zhenheng Yang, Zhijie Chen, Xiang Li, Jian Yang, Ying Tai

    Abstract: Text-to-video (T2V) generation has recently garnered significant attention thanks to the large multi-modality model Sora. However, T2V generation still faces two important challenges: 1) Lacking a precise open sourced high-quality dataset. The previous popular video datasets, e.g. WebVid-10M and Panda-70M, are either with low quality or too large for most research institutions. Therefore, it is ch… ▽ More

    Submitted 2 August, 2024; v1 submitted 2 July, 2024; originally announced July 2024.

    Comments: 15 pages, 9 figures

  14. arXiv:2407.00100  [pdf, other

    cs.LG cs.AI cs.CL

    Enhancing In-Context Learning via Implicit Demonstration Augmentation

    Authors: Xiaoling Zhou, Wei Ye, Yidong Wang, Chaoya Jiang, Zhemg Lee, Rui Xie, Shikun Zhang

    Abstract: The emergence of in-context learning (ICL) enables large pre-trained language models (PLMs) to make predictions for unseen inputs without updating parameters. Despite its potential, ICL's effectiveness heavily relies on the quality, quantity, and permutation of demonstrations, commonly leading to suboptimal and unstable performance. In this paper, we tackle this challenge for the first time from t… ▽ More

    Submitted 27 June, 2024; originally announced July 2024.

    Comments: Accepted by ACL 2024 Main 19 pages,10 figures

    ACM Class: I.2.7

  15. arXiv:2406.15968  [pdf, other

    cs.CL cs.LG

    ReCaLL: Membership Inference via Relative Conditional Log-Likelihoods

    Authors: Roy Xie, Junlin Wang, Ruomin Huang, Minxing Zhang, Rong Ge, Jian Pei, Neil Zhenqiang Gong, Bhuwan Dhingra

    Abstract: The rapid scaling of large language models (LLMs) has raised concerns about the transparency and fair use of the pretraining data used for training them. Detecting such content is challenging due to the scale of the data and limited exposure of each instance during training. We propose ReCaLL (Relative Conditional Log-Likelihood), a novel membership inference attack (MIA) to detect LLMs' pretraini… ▽ More

    Submitted 22 June, 2024; originally announced June 2024.

  16. arXiv:2406.12501  [pdf, other

    cs.IR

    Improving Multi-modal Recommender Systems by Denoising and Aligning Multi-modal Content and User Feedback

    Authors: Guipeng Xv, Xinyu Li, Ruobing Xie, Chen Lin, Chong Liu, Feng Xia, Zhanhui Kang, Leyu Lin

    Abstract: Multi-modal recommender systems (MRSs) are pivotal in diverse online web platforms and have garnered considerable attention in recent years. However, previous studies overlook the challenges of (1) noisy multi-modal content, (2) noisy user feedback, and (3) aligning multi-modal content with user feedback. In order to tackle these challenges, we propose Denoising and Aligning Multi-modal Recommende… ▽ More

    Submitted 18 June, 2024; originally announced June 2024.

  17. arXiv:2406.12298  [pdf

    cs.AI physics.ao-ph

    Research on Dangerous Flight Weather Prediction based on Machine Learning

    Authors: Haoxing Liu, Renjie Xie, Haoshen Qin, Yizhou Li

    Abstract: With the continuous expansion of the scale of air transport, the demand for aviation meteorological support also continues to grow. The impact of hazardous weather on flight safety is critical. How to effectively use meteorological data to improve the early warning capability of flight dangerous weather and ensure the safe flight of aircraft is the primary task of aviation meteorological services.… ▽ More

    Submitted 18 June, 2024; originally announced June 2024.

  18. Accurate Explanation Model for Image Classifiers using Class Association Embedding

    Authors: Ruitao Xie, Jingbang Chen, Limai Jiang, Rui Xiao, Yi Pan, Yunpeng Cai

    Abstract: Image classification is a primary task in data analysis where explainable models are crucially demanded in various applications. Although amounts of methods have been proposed to obtain explainable knowledge from the black-box classifiers, these approaches lack the efficiency of extracting global knowledge regarding the classification task, thus is vulnerable to local traps and often leads to poor… ▽ More

    Submitted 22 August, 2024; v1 submitted 12 June, 2024; originally announced June 2024.

    Comments: 2024 IEEE 40th International Conference on Data Engineering (ICDE)

  19. arXiv:2406.06737  [pdf, other

    cs.CR cs.CL

    Raccoon: Prompt Extraction Benchmark of LLM-Integrated Applications

    Authors: Junlin Wang, Tianyi Yang, Roy Xie, Bhuwan Dhingra

    Abstract: With the proliferation of LLM-integrated applications such as GPT-s, millions are deployed, offering valuable services through proprietary instruction prompts. These systems, however, are prone to prompt extraction attacks through meticulously designed queries. To help mitigate this problem, we introduce the Raccoon benchmark which comprehensively evaluates a model's susceptibility to prompt extra… ▽ More

    Submitted 10 June, 2024; originally announced June 2024.

  20. arXiv:2406.04828  [pdf, other

    cs.IR

    QAGCF: Graph Collaborative Filtering for Q&A Recommendation

    Authors: Changshuo Zhang, Teng Shi, Xiao Zhang, Yanping Zheng, Ruobing Xie, Qi Liu, Jun Xu, Ji-Rong Wen

    Abstract: Question and answer (Q&A) platforms usually recommend question-answer pairs to meet users' knowledge acquisition needs, unlike traditional recommendations that recommend only one item. This makes user behaviors more complex, and presents two challenges for Q&A recommendation, including: the collaborative information entanglement, which means user feedback is influenced by either the question or th… ▽ More

    Submitted 7 June, 2024; originally announced June 2024.

  21. arXiv:2406.01873  [pdf, other

    cs.CL cs.CR cs.LG

    CR-UTP: Certified Robustness against Universal Text Perturbations on Large Language Models

    Authors: Qian Lou, Xin Liang, Jiaqi Xue, Yancheng Zhang, Rui Xie, Mengxin Zheng

    Abstract: It is imperative to ensure the stability of every prediction made by a language model; that is, a language's prediction should remain consistent despite minor input variations, like word substitutions. In this paper, we investigate the problem of certifying a language model's robustness against Universal Text Perturbations (UTPs), which have been widely used in universal adversarial attacks and ba… ▽ More

    Submitted 5 June, 2024; v1 submitted 3 June, 2024; originally announced June 2024.

    Comments: Accepted by ACL Findings 2024

  22. arXiv:2405.20701  [pdf, other

    cs.CL cs.AI

    Unveiling the Lexical Sensitivity of LLMs: Combinatorial Optimization for Prompt Enhancement

    Authors: Pengwei Zhan, Zhen Xu, Qian Tan, Jie Song, Ru Xie

    Abstract: Large language models (LLMs) demonstrate exceptional instruct-following ability to complete various downstream tasks. Although this impressive ability makes LLMs flexible task solvers, their performance in solving tasks also heavily relies on instructions. In this paper, we reveal that LLMs are over-sensitive to lexical variations in task instructions, even when the variations are imperceptible to… ▽ More

    Submitted 31 May, 2024; originally announced May 2024.

  23. arXiv:2405.18979  [pdf, other

    cs.LG stat.ML

    MANO: Exploiting Matrix Norm for Unsupervised Accuracy Estimation Under Distribution Shifts

    Authors: Renchunzi Xie, Ambroise Odonnat, Vasilii Feofanov, Weijian Deng, Jianfeng Zhang, Bo An

    Abstract: Leveraging the models' outputs, specifically the logits, is a common approach to estimating the test accuracy of a pre-trained neural network on out-of-distribution (OOD) samples without requiring access to the corresponding ground truth labels. Despite their ease of implementation and computational efficiency, current logit-based methods are vulnerable to overconfidence issues, leading to predict… ▽ More

    Submitted 24 June, 2024; v1 submitted 29 May, 2024; originally announced May 2024.

    Comments: The three first authors contributed equally

  24. arXiv:2405.15280  [pdf, other

    cs.IR cs.AI cs.LG

    DFGNN: Dual-frequency Graph Neural Network for Sign-aware Feedback

    Authors: Yiqing Wu, Ruobing Xie, Zhao Zhang, Xu Zhang, Fuzhen Zhuang, Leyu Lin, Zhanhui Kang, Yongjun Xu

    Abstract: The graph-based recommendation has achieved great success in recent years. However, most existing graph-based recommendations focus on capturing user preference based on positive edges/feedback, while ignoring negative edges/feedback (e.g., dislike, low rating) that widely exist in real-world recommender systems. How to utilize negative feedback in graph-based recommendations still remains underex… ▽ More

    Submitted 24 May, 2024; originally announced May 2024.

    Comments: Accepted by KDD 2024 Research Track

  25. arXiv:2405.14580  [pdf, other

    cs.GR

    LDM: Large Tensorial SDF Model for Textured Mesh Generation

    Authors: Rengan Xie, Wenting Zheng, Kai Huang, Yizheng Chen, Qi Wang, Qi Ye, Wei Chen, Yuchi Huo

    Abstract: Previous efforts have managed to generate production-ready 3D assets from text or images. However, these methods primarily employ NeRF or 3D Gaussian representations, which are not adept at producing smooth, high-quality geometries required by modern rendering pipelines. In this paper, we propose LDM, a novel feed-forward framework capable of generating high-fidelity, illumination-decoupled textur… ▽ More

    Submitted 20 June, 2024; v1 submitted 23 May, 2024; originally announced May 2024.

  26. arXiv:2405.11270  [pdf, other

    cs.CV

    HR Human: Modeling Human Avatars with Triangular Mesh and High-Resolution Textures from Videos

    Authors: Qifeng Chen, Rengan Xie, Kai Huang, Qi Wang, Wenting Zheng, Rong Li, Yuchi Huo

    Abstract: Recently, implicit neural representation has been widely used to generate animatable human avatars. However, the materials and geometry of those representations are coupled in the neural network and hard to edit, which hinders their application in traditional graphics engines. We present a framework for acquiring human avatars that are attached with high-resolution physically-based material textur… ▽ More

    Submitted 18 May, 2024; originally announced May 2024.

  27. arXiv:2405.10861  [pdf, other

    cs.CL cs.AI cs.CY

    Tailoring Vaccine Messaging with Common-Ground Opinions

    Authors: Rickard Stureborg, Sanxing Chen, Ruoyu Xie, Aayushi Patel, Christopher Li, Chloe Qinyu Zhu, Tingnan Hu, Jun Yang, Bhuwan Dhingra

    Abstract: One way to personalize chatbot interactions is by establishing common ground with the intended reader. A domain where establishing mutual understanding could be particularly impactful is vaccine concerns and misinformation. Vaccine interventions are forms of messaging which aim to answer concerns expressed about vaccination. Tailoring responses in this domain is difficult, since opinions often hav… ▽ More

    Submitted 23 July, 2024; v1 submitted 17 May, 2024; originally announced May 2024.

    Comments: NAACL Findings 2024

    MSC Class: 68T50 (Primary) 68T01; 68T37; 91F20 (Secondary) ACM Class: I.2; I.2.7; I.7

  28. arXiv:2405.03562  [pdf, other

    cs.IR

    ID-centric Pre-training for Recommendation

    Authors: Yiqing Wu, Ruobing Xie, Zhao Zhang, Fuzhen Zhuang, Xu Zhang, Leyu Lin, Zhanhui Kang, Yongjun Xu

    Abstract: Classical sequential recommendation models generally adopt ID embeddings to store knowledge learned from user historical behaviors and represent items. However, these unique IDs are challenging to be transferred to new domains. With the thriving of pre-trained language model (PLM), some pioneer works adopt PLM for pre-trained recommendation, where modality information (e.g., text) is considered un… ▽ More

    Submitted 7 May, 2024; v1 submitted 6 May, 2024; originally announced May 2024.

  29. arXiv:2405.00742  [pdf, other

    cs.CR cs.LG stat.ML

    Federated Graph Learning for EV Charging Demand Forecasting with Personalization Against Cyberattacks

    Authors: Yi Li, Renyou Xie, Chaojie Li, Yi Wang, Zhaoyang Dong

    Abstract: Mitigating cybersecurity risk in electric vehicle (EV) charging demand forecasting plays a crucial role in the safe operation of collective EV chargings, the stability of the power grid, and the cost-effective infrastructure expansion. However, existing methods either suffer from the data privacy issue and the susceptibility to cyberattacks or fail to consider the spatial correlation among differe… ▽ More

    Submitted 30 April, 2024; originally announced May 2024.

    Comments: 11 pages,4 figures

  30. arXiv:2404.18109  [pdf, other

    cs.CV

    Finding Beautiful and Happy Images for Mental Health and Well-being Applications

    Authors: Ruitao Xie, Connor Qiu, Guoping Qiu

    Abstract: This paper explores how artificial intelligence (AI) technology can contribute to achieve progress on good health and well-being, one of the United Nations' 17 Sustainable Development Goals. It is estimated that one in ten of the global population lived with a mental disorder. Inspired by studies showing that engaging and viewing beautiful natural images can make people feel happier and less stres… ▽ More

    Submitted 28 April, 2024; originally announced April 2024.

  31. arXiv:2404.16678  [pdf, other

    cs.CV

    Multimodal Semantic-Aware Automatic Colorization with Diffusion Prior

    Authors: Han Wang, Xinning Chai, Yiwen Wang, Yuhong Zhang, Rong Xie, Li Song

    Abstract: Colorizing grayscale images offers an engaging visual experience. Existing automatic colorization methods often fail to generate satisfactory results due to incorrect semantic colors and unsaturated colors. In this work, we propose an automatic colorization pipeline to overcome these challenges. We leverage the extraordinary generative ability of the diffusion prior to synthesize color with plausi… ▽ More

    Submitted 25 April, 2024; originally announced April 2024.

  32. arXiv:2404.16307  [pdf, other

    cs.LG cs.CV

    Boosting Model Resilience via Implicit Adversarial Data Augmentation

    Authors: Xiaoling Zhou, Wei Ye, Zhemg Lee, Rui Xie, Shikun Zhang

    Abstract: Data augmentation plays a pivotal role in enhancing and diversifying training data. Nonetheless, consistently improving model performance in varied learning scenarios, especially those with inherent data biases, remains challenging. To address this, we propose to augment the deep features of samples by incorporating their adversarial and anti-adversarial perturbation distributions, enabling adapti… ▽ More

    Submitted 1 June, 2024; v1 submitted 24 April, 2024; originally announced April 2024.

    Comments: 9 pages, 6 figures, accepted by IJCAI 2024

    ACM Class: I.2.6; I.4.3

  33. arXiv:2404.14567  [pdf, other

    cs.CL

    WangLab at MEDIQA-M3G 2024: Multimodal Medical Answer Generation using Large Language Models

    Authors: Ronald Xie, Steven Palayew, Augustin Toma, Gary Bader, Bo Wang

    Abstract: This paper outlines our submission to the MEDIQA2024 Multilingual and Multimodal Medical Answer Generation (M3G) shared task. We report results for two standalone solutions under the English category of the task, the first involving two consecutive API calls to the Claude 3 Opus API and the second involving training an image-disease label joint embedding in the style of CLIP for image classificati… ▽ More

    Submitted 22 April, 2024; originally announced April 2024.

  34. arXiv:2404.14544  [pdf, other

    cs.CL

    WangLab at MEDIQA-CORR 2024: Optimized LLM-based Programs for Medical Error Detection and Correction

    Authors: Augustin Toma, Ronald Xie, Steven Palayew, Patrick R. Lawler, Bo Wang

    Abstract: Medical errors in clinical text pose significant risks to patient safety. The MEDIQA-CORR 2024 shared task focuses on detecting and correcting these errors across three subtasks: identifying the presence of an error, extracting the erroneous sentence, and generating a corrected sentence. In this paper, we present our approach that achieved top performance in all three subtasks. For the MS dataset,… ▽ More

    Submitted 22 April, 2024; originally announced April 2024.

  35. arXiv:2404.08796  [pdf, other

    cs.IR

    The Elephant in the Room: Rethinking the Usage of Pre-trained Language Model in Sequential Recommendation

    Authors: Zekai Qu, Ruobing Xie, Chaojun Xiao, Xingwu Sun, Zhanhui Kang

    Abstract: Sequential recommendation (SR) has seen significant advancements with the help of Pre-trained Language Models (PLMs). Some PLM-based SR models directly use PLM to encode user historical behavior's text sequences to learn user representations, while there is seldom an in-depth exploration of the capability and suitability of PLM in behavior sequence modeling. In this work, we first conduct extensiv… ▽ More

    Submitted 14 August, 2024; v1 submitted 12 April, 2024; originally announced April 2024.

    Comments: Accepted at RecSys 2024

  36. arXiv:2404.04062  [pdf, other

    cs.LG math.OC

    Derivative-free tree optimization for complex systems

    Authors: Ye Wei, Bo Peng, Ruiwen Xie, Yangtao Chen, Yu Qin, Peng Wen, Stefan Bauer, Po-Yen Tung

    Abstract: A tremendous range of design tasks in materials, physics, and biology can be formulated as finding the optimum of an objective function depending on many parameters without knowing its closed-form expression or the derivative. Traditional derivative-free optimization techniques often rely on strong assumptions about objective functions, thereby failing at optimizing non-convex systems beyond 100 d… ▽ More

    Submitted 5 April, 2024; originally announced April 2024.

    Comments: 39 pages, 3 figures

  37. arXiv:2404.02078  [pdf, other

    cs.AI cs.CL cs.LG

    Advancing LLM Reasoning Generalists with Preference Trees

    Authors: Lifan Yuan, Ganqu Cui, Hanbin Wang, Ning Ding, Xingyao Wang, Jia Deng, Boji Shan, Huimin Chen, Ruobing Xie, Yankai Lin, Zhenghao Liu, Bowen Zhou, Hao Peng, Zhiyuan Liu, Maosong Sun

    Abstract: We introduce Eurus, a suite of large language models (LLMs) optimized for reasoning. Finetuned from Mistral-7B and CodeLlama-70B, Eurus models achieve state-of-the-art results among open-source models on a diverse set of benchmarks covering mathematics, code generation, and logical reasoning problems. Notably, Eurus-70B beats GPT-3.5 Turbo in reasoning through a comprehensive benchmarking across 1… ▽ More

    Submitted 2 April, 2024; originally announced April 2024.

    Comments: Models and data are available at https://github.com/OpenBMB/Eurus

  38. arXiv:2404.01717  [pdf, other

    cs.CV eess.IV

    AddSR: Accelerating Diffusion-based Blind Super-Resolution with Adversarial Diffusion Distillation

    Authors: Rui Xie, Ying Tai, Chen Zhao, Kai Zhang, Zhenyu Zhang, Jun Zhou, Xiaoqian Ye, Qian Wang, Jian Yang

    Abstract: Blind super-resolution methods based on stable diffusion showcase formidable generative capabilities in reconstructing clear high-resolution images with intricate details from low-resolution inputs. However, their practical applicability is often hampered by poor efficiency, stemming from the requirement of thousands or hundreds of sampling steps. Inspired by the efficient adversarial diffusion di… ▽ More

    Submitted 23 May, 2024; v1 submitted 2 April, 2024; originally announced April 2024.

  39. arXiv:2403.19287  [pdf, other

    cs.SE

    CoderUJB: An Executable and Unified Java Benchmark for Practical Programming Scenarios

    Authors: Zhengran Zeng, Yidong Wang, Rui Xie, Wei Ye, Shikun Zhang

    Abstract: In the evolving landscape of large language models (LLMs) tailored for software engineering, the need for benchmarks that accurately reflect real-world development scenarios is paramount. Current benchmarks are either too simplistic or fail to capture the multi-tasking nature of software development. To address this, we introduce CoderUJB, a new benchmark designed to evaluate LLMs across diverse J… ▽ More

    Submitted 28 March, 2024; originally announced March 2024.

    Comments: 11 pages, 4 figures, issta2024 accepted

    MSC Class: 68N30 (Primary) 68T20 (Secondary) ACM Class: D.2.0

  40. arXiv:2403.19185  [pdf, other

    cs.IT eess.SP

    Deep CSI Compression for Dual-Polarized Massive MIMO Channels with Disentangled Representation Learning

    Authors: Suhang Fan, Wei Xu, Renjie Xie, Shi Jin, Derrick Wing Kwan Ng, Naofal Al-Dhahir

    Abstract: Channel state information (CSI) feedback is critical for achieving the promised advantages of enhancing spectral and energy efficiencies in massive multiple-input multiple-output (MIMO) wireless communication systems. Deep learning (DL)-based methods have been proven effective in reducing the required signaling overhead for CSI feedback. In practical dual-polarized MIMO scenarios, channels in the… ▽ More

    Submitted 28 March, 2024; originally announced March 2024.

  41. arXiv:2403.18480  [pdf, other

    cs.IR

    Enhanced Generative Recommendation via Content and Collaboration Integration

    Authors: Yidan Wang, Zhaochun Ren, Weiwei Sun, Jiyuan Yang, Zhixiang Liang, Xin Chen, Ruobing Xie, Su Yan, Xu Zhang, Pengjie Ren, Zhumin Chen, Xin Xin

    Abstract: Generative recommendation has emerged as a promising paradigm aimed at augmenting recommender systems with recent advancements in generative artificial intelligence. This task has been formulated as a sequence-to-sequence generation process, wherein the input sequence encompasses data pertaining to the user's previously interacted items, and the output sequence denotes the generative identifier fo… ▽ More

    Submitted 27 March, 2024; originally announced March 2024.

  42. arXiv:2403.15747  [pdf, other

    cs.SE cs.AI

    CodeShell Technical Report

    Authors: Rui Xie, Zhengran Zeng, Zhuohao Yu, Chang Gao, Shikun Zhang, Wei Ye

    Abstract: Code large language models mark a pivotal breakthrough in artificial intelligence. They are specifically crafted to understand and generate programming languages, significantly boosting the efficiency of coding development workflows. In this technical report, we present CodeShell-Base, a seven billion-parameter foundation model with 8K context length, showcasing exceptional proficiency in code com… ▽ More

    Submitted 23 March, 2024; originally announced March 2024.

  43. From Hardware Fingerprint to Access Token: Enhancing the Authentication on IoT Devices

    Authors: Yue Xiao, Yi He, Xiaoli Zhang, Qian Wang, Renjie Xie, Kun Sun, Ke Xu, Qi Li

    Abstract: The proliferation of consumer IoT products in our daily lives has raised the need for secure device authentication and access control. Unfortunately, these resource-constrained devices typically use token-based authentication, which is vulnerable to token compromise attacks that allow attackers to impersonate the devices and perform malicious operations by stealing the access token. Using hardware… ▽ More

    Submitted 22 March, 2024; originally announced March 2024.

  44. arXiv:2403.11116  [pdf, other

    cs.CV cs.AI

    PhD: A Prompted Visual Hallucination Evaluation Dataset

    Authors: Jiazhen Liu, Yuhan Fu, Ruobing Xie, Runquan Xie, Xingwu Sun, Fengzong Lian, Zhanhui Kang, Xirong Li

    Abstract: Multimodal Large Language Models (MLLMs) hallucinate, resulting in an emerging topic of visual hallucination evaluation (VHE). We introduce in this paper PhD, a large-scale benchmark for VHE. The essence of VHE is to ask an MLLM the right questions concerning a specific image. Depending on what to ask (objects, attributes, sentiment, etc.) and how the questions are asked, we structure PhD along tw… ▽ More

    Submitted 21 August, 2024; v1 submitted 17 March, 2024; originally announced March 2024.

  45. arXiv:2403.10104  [pdf, other

    cs.CV

    CSDNet: Detect Salient Object in Depth-Thermal via A Lightweight Cross Shallow and Deep Perception Network

    Authors: Xiaotong Yu, Ruihan Xie, Zhihe Zhao, Chang-Wen Chen

    Abstract: While we enjoy the richness and informativeness of multimodal data, it also introduces interference and redundancy of information. To achieve optimal domain interpretation with limited resources, we propose CSDNet, a lightweight \textbf{C}ross \textbf{S}hallow and \textbf{D}eep Perception \textbf{Net}work designed to integrate two modalities with less coherence, thereby discarding redundant inform… ▽ More

    Submitted 15 March, 2024; originally announced March 2024.

  46. arXiv:2403.09671  [pdf, other

    cs.DC cs.AI

    CoRaiS: Lightweight Real-Time Scheduler for Multi-Edge Cooperative Computing

    Authors: Yujiao Hu, Qingmin Jia, Jinchao Chen, Yuan Yao, Yan Pan, Renchao Xie, F. Richard Yu

    Abstract: Multi-edge cooperative computing that combines constrained resources of multiple edges into a powerful resource pool has the potential to deliver great benefits, such as a tremendous computing power, improved response time, more diversified services. However, the mass heterogeneous resources composition and lack of scheduling strategies make the modeling and cooperating of multi-edge computing sys… ▽ More

    Submitted 20 May, 2024; v1 submitted 4 February, 2024; originally announced March 2024.

    Comments: Accepted by IEEE Internet of Things Journal

  47. arXiv:2403.08281  [pdf, other

    cs.CL cs.AI

    Mastering Text, Code and Math Simultaneously via Fusing Highly Specialized Language Models

    Authors: Ning Ding, Yulin Chen, Ganqu Cui, Xingtai Lv, Weilin Zhao, Ruobing Xie, Bowen Zhou, Zhiyuan Liu, Maosong Sun

    Abstract: Underlying data distributions of natural language, programming code, and mathematical symbols vary vastly, presenting a complex challenge for large language models (LLMs) that strive to achieve high performance across all three domains simultaneously. Achieving a very high level of proficiency for an LLM within a specific domain often requires extensive training with relevant corpora, which is typ… ▽ More

    Submitted 26 March, 2024; v1 submitted 13 March, 2024; originally announced March 2024.

  48. arXiv:2403.04526  [pdf, other

    cs.LG cs.AI cs.CV

    Hyperspectral unmixing for Raman spectroscopy via physics-constrained autoencoders

    Authors: Dimitar Georgiev, Álvaro Fernández-Galiana, Simon Vilms Pedersen, Georgios Papadopoulos, Ruoxiao Xie, Molly M. Stevens, Mauricio Barahona

    Abstract: Raman spectroscopy is widely used across scientific domains to characterize the chemical composition of samples in a non-destructive, label-free manner. Many applications entail the unmixing of signals from mixtures of molecular species to identify the individual components present and their proportions, yet conventional methods for chemometrics often struggle with complex mixture scenarios encoun… ▽ More

    Submitted 7 March, 2024; originally announced March 2024.

  49. arXiv:2403.04015  [pdf, other

    cs.LG cs.AI stat.ML

    Knockoff-Guided Feature Selection via A Single Pre-trained Reinforced Agent

    Authors: Xinyuan Wang, Dongjie Wang, Wangyang Ying, Rui Xie, Haifeng Chen, Yanjie Fu

    Abstract: Feature selection prepares the AI-readiness of data by eliminating redundant features. Prior research falls into two primary categories: i) Supervised Feature Selection, which identifies the optimal feature subset based on their relevance to the target variable; ii) Unsupervised Feature Selection, which reduces the feature space dimensionality by capturing the essential information within the feat… ▽ More

    Submitted 6 March, 2024; originally announced March 2024.

  50. arXiv:2403.02063  [pdf, other

    cs.CV

    Depth-Guided Robust and Fast Point Cloud Fusion NeRF for Sparse Input Views

    Authors: Shuai Guo, Qiuwen Wang, Yijie Gao, Rong Xie, Li Song

    Abstract: Novel-view synthesis with sparse input views is important for real-world applications like AR/VR and autonomous driving. Recent methods have integrated depth information into NeRFs for sparse input synthesis, leveraging depth prior for geometric and spatial understanding. However, most existing works tend to overlook inaccuracies within depth maps and have low time efficiency. To address these iss… ▽ More

    Submitted 4 March, 2024; originally announced March 2024.