Showing 1–50 of 184 results for author: Lu, K

Searching in archive cs.
  1. arXiv:2408.11077  [pdf, other]

    cs.LG cs.CV stat.ML

    Solving Oscillator ODEs via Soft-constrained Physics-informed Neural Network with Small Data

    Authors: Kai-liang Lu, Yu-meng Su, Cheng Qiu, Zhuo Bi, Wen-jun Zhang

    Abstract: This paper compared physics-informed neural network (PINN), conventional neural network (NN) and numerical discretization methods on solving differential equations through literature research. We formalized the mathematical framework and computational flow of the soft-constrained PINN method for solving differential equations (e.g., ODEs/PDEs). Its working mechanism and its accuracy and efficiency…

    Submitted 19 August, 2024; originally announced August 2024.

    Comments: 17 pages, 7 figures, 2 tables, etc

    MSC Class: 68T07 ACM Class: I.5

  2. arXiv:2408.10764  [pdf, other]

    cs.CL

    Predicting Rewards Alongside Tokens: Non-disruptive Parameter Insertion for Efficient Inference Intervention in Large Language Model

    Authors: Chenhan Yuan, Fei Huang, Ru Peng, Keming Lu, Bowen Yu, Chang Zhou, Jingren Zhou

    Abstract: Transformer-based large language models (LLMs) exhibit limitations such as generating unsafe responses, unreliable reasoning, etc. Existing inference intervention approaches attempt to mitigate these issues by finetuning additional models to produce calibration signals (such as rewards) that guide the LLM's decoding process. However, this solution introduces substantial time and space overhead due…

    Submitted 20 August, 2024; originally announced August 2024.

    Comments: 16 pages

  3. arXiv:2408.05029  [pdf, other]

    cs.CV

    Collaborative Static-Dynamic Teaching: A Semi-Supervised Framework for Stripe-Like Space Target Detection

    Authors: Zijian Zhu, Ali Zia, Xuesong Li, Bingbing Dan, Yuebo Ma, Hongfeng Long, Kaili Lu, Enhai Liu, Rujin Zhao

    Abstract: Stripe-like space target detection (SSTD) is crucial for space situational awareness. Traditional unsupervised methods often fail in low signal-to-noise ratio and variable stripe-like space target scenarios, leading to weak generalization. Although fully supervised learning methods improve model generalization, they require extensive pixel-level labels for training. In the SSTD task, manually cre…

    Submitted 9 August, 2024; originally announced August 2024.

  4. arXiv:2408.01934  [pdf, other]

    cs.CV

    A Survey and Evaluation of Adversarial Attacks for Object Detection

    Authors: Khoi Nguyen Tiet Nguyen, Wenyu Zhang, Kangkang Lu, Yuhuan Wu, Xingjian Zheng, Hui Li Tan, Liangli Zhen

    Abstract: Deep learning models excel in various computer vision tasks but are susceptible to adversarial examples: subtle perturbations in input data that lead to incorrect predictions. This vulnerability poses significant risks in safety-critical applications such as autonomous vehicles, security surveillance, and aircraft health monitoring. While numerous surveys focus on adversarial attacks in image class…

    Submitted 5 August, 2024; v1 submitted 4 August, 2024; originally announced August 2024.

    Comments: 14 pages

  5. arXiv:2407.16576  [pdf, other]

    cs.CR

    Exploring Automatic Cryptographic API Misuse Detection in the Era of LLMs

    Authors: Yifan Xia, Zichen Xie, Peiyu Liu, Kangjie Lu, Yan Liu, Wenhai Wang, Shouling Ji

    Abstract: While the automated detection of cryptographic API misuses has progressed significantly, its precision diminishes for intricate targets due to the reliance on manually defined patterns. Large Language Models (LLMs), renowned for their contextual understanding, offer a promising avenue to address existing shortcomings. However, applying LLMs in this security-critical domain presents challenges, par…

    Submitted 23 July, 2024; originally announced July 2024.

  6. arXiv:2407.10671  [pdf, other]

    cs.CL cs.AI

    Qwen2 Technical Report

    Authors: An Yang, Baosong Yang, Binyuan Hui, Bo Zheng, Bowen Yu, Chang Zhou, Chengpeng Li, Chengyuan Li, Dayiheng Liu, Fei Huang, Guanting Dong, Haoran Wei, Huan Lin, Jialong Tang, Jialin Wang, Jian Yang, Jianhong Tu, Jianwei Zhang, Jianxin Ma, Jianxin Yang, Jin Xu, Jingren Zhou, Jinze Bai, Jinzheng He, Junyang Lin, et al. (37 additional authors not shown)

    Abstract: This report introduces the Qwen2 series, the latest addition to our large language models and large multimodal models. We release a comprehensive suite of foundational and instruction-tuned language models, encompassing a parameter range from 0.5 to 72 billion, featuring dense models and a Mixture-of-Experts model. Qwen2 surpasses most prior open-weight models, including its predecessor Qwen1.5, a…

    Submitted 17 July, 2024; v1 submitted 15 July, 2024; originally announced July 2024.

    Comments: 25 pages, 1 figure

  7. arXiv:2407.09886  [pdf, other]

    eess.AS cs.CL cs.SD

    Speech-Copilot: Leveraging Large Language Models for Speech Processing via Task Decomposition, Modularization, and Program Generation

    Authors: Chun-Yi Kuan, Chih-Kai Yang, Wei-Ping Huang, Ke-Han Lu, Hung-yi Lee

    Abstract: In this work, we introduce Speech-Copilot, a modular framework for instruction-oriented speech-processing tasks that minimizes human effort in toolset construction. Unlike end-to-end methods using large audio-language models, Speech-Copilot builds speech processing-specific toolsets by analyzing pre-collected task instructions and breaking tasks into manageable sub-tasks. It features a flexible ag…

    Submitted 13 July, 2024; originally announced July 2024.

    Comments: 8 pages, 2 figures

  8. arXiv:2407.06957  [pdf, other]

    eess.AS cs.CL cs.CY

    Listen and Speak Fairly: A Study on Semantic Gender Bias in Speech Integrated Large Language Models

    Authors: Yi-Cheng Lin, Tzu-Quan Lin, Chih-Kai Yang, Ke-Han Lu, Wei-Chih Chen, Chun-Yi Kuan, Hung-yi Lee

    Abstract: Speech Integrated Large Language Models (SILLMs) combine large language models with speech perception to perform diverse tasks, ranging from emotion recognition to speaker verification, demonstrating universal audio understanding capability. However, these models may amplify biases present in training data, potentially leading to biased access to information for marginalized groups. This work introduce…

    Submitted 9 July, 2024; originally announced July 2024.

  9. arXiv:2407.04294  [pdf, other]

    cs.CR

    SQLaser: Detecting DBMS Logic Bugs with Clause-Guided Fuzzing

    Authors: Jin Wei, Ping Chen, Kangjie Lu, Jun Dai, Xiaoyan Sun

    Abstract: Database Management Systems (DBMSs) are vital components in modern data-driven systems. Their complexity often leads to logic bugs, which are implementation errors within the DBMSs that can lead to incorrect query results, data exposure, unauthorized access, etc., without necessarily causing visible system failures. Existing detection employs two strategies: rule-based bug detection and coverage-g…

    Submitted 5 July, 2024; originally announced July 2024.

  10. arXiv:2406.18871  [pdf, other]

    eess.AS cs.CL

    DeSTA: Enhancing Speech Language Models through Descriptive Speech-Text Alignment

    Authors: Ke-Han Lu, Zhehuai Chen, Szu-Wei Fu, He Huang, Boris Ginsburg, Yu-Chiang Frank Wang, Hung-yi Lee

    Abstract: Recent speech language models (SLMs) typically incorporate pre-trained speech models to extend the capabilities from large language models (LLMs). In this paper, we propose a Descriptive Speech-Text Alignment approach that leverages speech captioning to bridge the gap between speech and text modalities, enabling SLMs to interpret and generate comprehensive natural language descriptions, thereby fa…

    Submitted 26 June, 2024; originally announced June 2024.

    Comments: Accepted to Interspeech 2024

  11. arXiv:2406.14024  [pdf, other]

    cs.CL

    LLM Critics Help Catch Bugs in Mathematics: Towards a Better Mathematical Verifier with Natural Language Feedback

    Authors: Bofei Gao, Zefan Cai, Runxin Xu, Peiyi Wang, Ce Zheng, Runji Lin, Keming Lu, Dayiheng Liu, Chang Zhou, Wen Xiao, Junjie Hu, Tianyu Liu, Baobao Chang

    Abstract: Mathematical verifiers achieve success in mathematical reasoning tasks by validating the correctness of solutions. However, existing verifiers are trained with binary classification labels, which are not informative enough for the model to accurately assess the solutions. To mitigate the aforementioned insufficiency of binary labels, we introduce step-wise natural language feedback as rationale la…

    Submitted 8 July, 2024; v1 submitted 20 June, 2024; originally announced June 2024.

    Comments: 9 pages

  12. arXiv:2406.13542  [pdf, other]

    cs.CL cs.AI cs.LG

    Self-play with Execution Feedback: Improving Instruction-following Capabilities of Large Language Models

    Authors: Guanting Dong, Keming Lu, Chengpeng Li, Tingyu Xia, Bowen Yu, Chang Zhou, Jingren Zhou

    Abstract: One core capability of large language models (LLMs) is to follow natural language instructions. However, the issue of automatically constructing high-quality training data to enhance the complex instruction-following abilities of LLMs without manual annotation remains unresolved. In this paper, we introduce AutoIF, the first scalable and reliable method for automatically generating instruction-fol…

    Submitted 18 July, 2024; v1 submitted 19 June, 2024; originally announced June 2024.

    Comments: Work in progress

  13. arXiv:2406.02069  [pdf, other]

    cs.CL cs.AI

    PyramidKV: Dynamic KV Cache Compression based on Pyramidal Information Funneling

    Authors: Zefan Cai, Yichi Zhang, Bofei Gao, Yuliang Liu, Tianyu Liu, Keming Lu, Wayne Xiong, Yue Dong, Baobao Chang, Junjie Hu, Wen Xiao

    Abstract: In this study, we investigate whether attention-based information flow inside large language models (LLMs) is aggregated through noticeable patterns for long context processing. Our observations reveal that LLMs aggregate information through Pyramidal Information Funneling where attention is scattered widely in lower layers, progressively consolidating within specific contexts, and ultimately foc…

    Submitted 16 June, 2024; v1 submitted 4 June, 2024; originally announced June 2024.

  14. arXiv:2406.01252  [pdf, other]

    cs.CL cs.AI stat.ML

    Towards Scalable Automated Alignment of LLMs: A Survey

    Authors: Boxi Cao, Keming Lu, Xinyu Lu, Jiawei Chen, Mengjie Ren, Hao Xiang, Peilin Liu, Yaojie Lu, Ben He, Xianpei Han, Le Sun, Hongyu Lin, Bowen Yu

    Abstract: Alignment is the most critical step in building large language models (LLMs) that meet human needs. With the rapid development of LLMs gradually surpassing human capabilities, traditional alignment methods based on human annotation are increasingly unable to meet the scalability demands. Therefore, there is an urgent need to explore new sources of automated alignment signals and technical approach…

    Submitted 16 July, 2024; v1 submitted 3 June, 2024; originally announced June 2024.

  15. arXiv:2405.17931  [pdf, other]

    cs.CL cs.LG

    Online Merging Optimizers for Boosting Rewards and Mitigating Tax in Alignment

    Authors: Keming Lu, Bowen Yu, Fei Huang, Yang Fan, Runji Lin, Chang Zhou

    Abstract: Effectively aligning Large Language Models (LLMs) with human-centric values while preventing the degradation of abilities acquired through Pre-training and Supervised Fine-tuning (SFT) poses a central challenge in Reinforcement Learning from Human Feedback (RLHF). In this paper, we first discover that interpolating RLHF and SFT model parameters can adjust the trade-off between human preference and…

    Submitted 28 May, 2024; originally announced May 2024.

  16. arXiv:2405.01561  [pdf]

    cs.SE cs.AI cs.CY

    Rapid Mobile App Development for Generative AI Agents on MIT App Inventor

    Authors: Jaida Gao, Calab Su, Etai Miller, Kevin Lu, Yu Meng

    Abstract: The evolution of Artificial Intelligence (AI) stands as a pivotal force shaping our society, finding applications across diverse domains such as education, sustainability, and safety. Leveraging AI within mobile applications makes it easily accessible to the public, catalyzing its transformative potential. In this paper, we present a methodology for the rapid development of AI agent applications u…

    Submitted 31 March, 2024; originally announced May 2024.

    Journal ref: Journal of advances in information science and technology 2(3) 1-8, March 2024

  17. arXiv:2403.13438  [pdf, other]

    cs.CV

    SpatialPIN: Enhancing Spatial Reasoning Capabilities of Vision-Language Models through Prompting and Interacting 3D Priors

    Authors: Chenyang Ma, Kai Lu, Ta-Ying Cheng, Niki Trigoni, Andrew Markham

    Abstract: Current state-of-the-art spatial reasoning-enhanced VLMs are trained to excel at spatial visual question answering (VQA). However, we believe that higher-level 3D-aware tasks, such as articulating dynamic scene changes and motion planning, require a fundamental and explicit 3D understanding beyond current spatial VQA datasets. In this work, we present SpatialPIN, a framework designed to enhance th…

    Submitted 6 June, 2024; v1 submitted 18 March, 2024; originally announced March 2024.

    Comments: Project Page: https://dannymcy.github.io/zeroshot_task_hallucination/

  18. arXiv:2403.09747  [pdf, other]

    cs.CL cs.AI

    Re-Search for The Truth: Multi-round Retrieval-augmented Large Language Models are Strong Fake News Detectors

    Authors: Guanghua Li, Wensheng Lu, Wei Zhang, Defu Lian, Kezhong Lu, Rui Mao, Kai Shu, Hao Liao

    Abstract: The proliferation of fake news has had far-reaching implications on politics, the economy, and society at large. While fake news detection methods have been employed to mitigate this issue, they primarily depend on two essential elements: the quality and relevance of the evidence, and the effectiveness of the verdict prediction mechanism. Traditional methods, which often source information from st…

    Submitted 13 March, 2024; originally announced March 2024.

  19. arXiv:2403.08383  [pdf, other]

    cs.CV

    AFGI: Towards Accurate and Fast-convergent Gradient Inversion Attack in Federated Learning

    Authors: Can Liu, Jin Wang, Yipeng Zhou, Yachao Yuan, Quanzheng Sheng, Kejie Lu

    Abstract: Federated learning (FL) empowers privacy preservation in model training by only exposing users' model gradients. Yet, FL users are susceptible to gradient inversion attacks (GIAs) which can reconstruct ground-truth training data such as images based on model gradients. However, reconstructing high-resolution images by existing GIAs faces two challenges: inferior accuracy and slow convergence, espec…

    Submitted 31 July, 2024; v1 submitted 13 March, 2024; originally announced March 2024.

  20. arXiv:2403.08164  [pdf, other]

    cs.SD cs.LG eess.AS

    EM-TTS: Efficiently Trained Low-Resource Mongolian Lightweight Text-to-Speech

    Authors: Ziqi Liang, Haoxiang Shi, Jiawei Wang, Keda Lu

    Abstract: Recently, deep learning-based Text-to-Speech (TTS) systems have achieved high-quality speech synthesis results. Recurrent neural networks have become a standard modeling technique for sequential data in TTS systems and are widely used. However, training a TTS model which includes RNN components requires powerful GPU performance and takes a long time. In contrast, CNN-based sequence synthesis techn…

    Submitted 17 March, 2024; v1 submitted 12 March, 2024; originally announced March 2024.

    Comments: Accepted by the 27th IEEE International Conference on Computer Supported Cooperative Work in Design (IEEE CSCWD 2024). arXiv admin note: substantial text overlap with arXiv:2211.01948

  21. arXiv:2403.06946  [pdf, other]

    cs.CV

    Split to Merge: Unifying Separated Modalities for Unsupervised Domain Adaptation

    Authors: Xinyao Li, Yuke Li, Zhekai Du, Fengling Li, Ke Lu, Jingjing Li

    Abstract: Large vision-language models (VLMs) like CLIP have demonstrated good zero-shot learning performance in the unsupervised domain adaptation task. Yet, most transfer approaches for VLMs focus on either the language or visual branches, overlooking the nuanced interplay between both modalities. In this work, we introduce a Unified Modality Separation (UniMoS) framework for unsupervised domain adaptatio…

    Submitted 11 March, 2024; originally announced March 2024.

    Comments: CVPR 2024 camera ready

  22. arXiv:2403.05062  [pdf, other]

    cs.CV

    Agile Multi-Source-Free Domain Adaptation

    Authors: Xinyao Li, Jingjing Li, Fengling Li, Lei Zhu, Ke Lu

    Abstract: Efficiently utilizing rich knowledge in pretrained models has become a critical topic in the era of large models. This work focuses on adaptively utilizing knowledge from multiple source-pretrained models to an unlabeled target domain without accessing the source data. Despite being a practically useful setting, existing methods require extensive parameter tuning over each source model, which is c…

    Submitted 8 March, 2024; originally announced March 2024.

    Comments: Accepted to AAAI2024

  23. arXiv:2403.02899  [pdf, other]

    cs.AI

    Domain-Agnostic Mutual Prompting for Unsupervised Domain Adaptation

    Authors: Zhekai Du, Xinyao Li, Fengling Li, Ke Lu, Lei Zhu, Jingjing Li

    Abstract: Conventional Unsupervised Domain Adaptation (UDA) strives to minimize distribution discrepancy between domains, which neglects to harness rich semantics from data and struggles to handle complex domain shifts. A promising technique is to leverage the knowledge of large-scale pre-trained vision-language models for more guided adaptation. Despite some endeavors, current methods often learn textual p…

    Submitted 5 March, 2024; originally announced March 2024.

  24. arXiv:2402.11354  [pdf, other]

    cs.LG cs.AI cs.CV cs.DB cs.DS

    Probabilistic Routing for Graph-Based Approximate Nearest Neighbor Search

    Authors: Kejing Lu, Chuan Xiao, Yoshiharu Ishikawa

    Abstract: Approximate nearest neighbor search (ANNS) in high-dimensional spaces is a pivotal challenge in the field of machine learning. In recent years, graph-based methods have emerged as the superior approach to ANNS, establishing a new state of the art. Although various optimizations for graph-based ANNS have been introduced, they predominantly rely on heuristic methods that lack formal theoretical back…

    Submitted 10 July, 2024; v1 submitted 17 February, 2024; originally announced February 2024.

    Comments: Source code is available at https://github.com/ICML2024-code/PEOs

  25. arXiv:2402.07792  [pdf, other]

    cs.LG cs.DC

    Empowering Federated Learning for Massive Models with NVIDIA FLARE

    Authors: Holger R. Roth, Ziyue Xu, Yuan-Ting Hsieh, Adithya Renduchintala, Isaac Yang, Zhihong Zhang, Yuhong Wen, Sean Yang, Kevin Lu, Kristopher Kersten, Camir Ricketts, Daguang Xu, Chester Chen, Yan Cheng, Andrew Feng

    Abstract: In the ever-evolving landscape of artificial intelligence (AI) and large language models (LLMs), handling and leveraging data effectively has become a critical challenge. Most state-of-the-art machine learning algorithms are data-centric. However, as the lifeblood of model performance, necessary data cannot always be centralized due to various factors such as privacy, regulation, geopolitics, copy…

    Submitted 12 February, 2024; originally announced February 2024.

  26. arXiv:2401.15967  [pdf, other]

    cs.CR cs.SE

    INSTILLER: Towards Efficient and Realistic RTL Fuzzing

    Authors: Gen Zhang, Pengfei Wang, Tai Yue, Danjun Liu, Yubei Guo, Kai Lu

    Abstract: Bugs exist in hardware, such as CPUs. Unlike software bugs, these hardware bugs need to be detected before deployment. Previous fuzzing work in CPU bug detection has several disadvantages, e.g., the length of RTL input instructions keeps growing, and longer inputs are ineffective for fuzzing. In this paper, we propose INSTILLER (Instruction Distiller), an RTL fuzzer based on ant colony optimization…

    Submitted 29 January, 2024; originally announced January 2024.

    Journal ref: IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 2024

  27. MobFuzz: Adaptive Multi-objective Optimization in Gray-box Fuzzing

    Authors: Gen Zhang, Pengfei Wang, Tai Yue, Xiangdong Kong, Shan Huang, Xu Zhou, Kai Lu

    Abstract: Coverage-guided gray-box fuzzing (CGF) is an efficient software testing technique. There are usually multiple objectives to optimize in CGF. However, existing CGF methods cannot successfully find the optimal values for multiple objectives simultaneously. In this paper, we propose a gray-box fuzzer for multi-objective optimization (MOO) called MobFuzz. We model the multi-objective optimization proc…

    Submitted 29 January, 2024; originally announced January 2024.

    Journal ref: Network and Distributed Systems Security (NDSS) Symposium 2022

  28. arXiv:2401.15603  [pdf, other]

    cs.LG cs.SI

    Improving Expressive Power of Spectral Graph Neural Networks with Eigenvalue Correction

    Authors: Kangkang Lu, Yanhua Yu, Hao Fei, Xuan Li, Zixuan Yang, Zirui Guo, Meiyu Liang, Mengran Yin, Tat-Seng Chua

    Abstract: In recent years, spectral graph neural networks, characterized by polynomial filters, have garnered increasing attention and have achieved remarkable performance in tasks such as node classification. These models typically assume that eigenvalues for the normalized Laplacian matrix are distinct from each other, thus expecting a polynomial filter to have a high fitting ability. However, this paper…

    Submitted 18 March, 2024; v1 submitted 28 January, 2024; originally announced January 2024.

    Comments: Accepted by AAAI-24

  29. Color Maker: a Mixed-Initiative Approach to Creating Accessible Color Maps

    Authors: Amey Salvi, Kecheng Lu, Michael E. Papka, Yunhai Wang, Khairi Reda

    Abstract: Quantitative data is frequently represented using color, yet designing effective color mappings is a challenging task, requiring one to balance perceptual standards with personal color preference. Current design tools either overwhelm novices with complexity or offer limited customization options. We present ColorMaker, a mixed-initiative approach for creating colormaps. ColorMaker combines fluid…

    Submitted 26 January, 2024; originally announced January 2024.

    Comments: To appear at the ACM CHI '24 Conference on Human Factors in Computing Systems

  30. arXiv:2401.13714  [pdf, other]

    cs.CV cs.LG

    Value-Driven Mixed-Precision Quantization for Patch-Based Inference on Microcontrollers

    Authors: Wei Tao, Shenglin He, Kai Lu, Xiaoyang Qu, Guokuan Li, Jiguang Wan, Jianzong Wang, Jing Xiao

    Abstract: Deploying neural networks on microcontroller units (MCUs) presents substantial challenges due to their constrained computation and memory resources. Previous research has explored patch-based inference as a strategy to conserve memory without sacrificing model accuracy. However, this technique suffers from severe redundant computation overhead, leading to a substantial increase in execution lat…

    Submitted 23 January, 2024; originally announced January 2024.

    Comments: Accepted by the 27th Design, Automation and Test in Europe Conference (DATE 2024)

  31. arXiv:2401.12474  [pdf, other]

    cs.CL cs.LG

    Large Language Models are Superpositions of All Characters: Attaining Arbitrary Role-play via Self-Alignment

    Authors: Keming Lu, Bowen Yu, Chang Zhou, Jingren Zhou

    Abstract: Considerable efforts have been invested in augmenting the role-playing proficiency of open-source large language models (LLMs) by emulating proprietary counterparts. Nevertheless, we posit that LLMs inherently harbor role-play capabilities, owing to the extensive knowledge of characters and potential dialogues ingrained in their vast training corpora. Thus, in this study, we introduce Ditto, a sel…

    Submitted 22 January, 2024; originally announced January 2024.

  32. arXiv:2401.00273  [pdf, ps, other]

    eess.AS cs.CL

    Investigating Zero-Shot Generalizability on Mandarin-English Code-Switched ASR and Speech-to-text Translation of Recent Foundation Models with Self-Supervision and Weak Supervision

    Authors: Chih-Kai Yang, Kuan-Po Huang, Ke-Han Lu, Chun-Yi Kuan, Chi-Yuan Hsiao, Hung-yi Lee

    Abstract: This work evaluated several cutting-edge large-scale foundation models based on self-supervision or weak supervision, including SeamlessM4T, SeamlessM4T v2, and Whisper-large-v3, on three code-switched corpora. We found that self-supervised models can achieve performances close to the supervised model, indicating the effectiveness of multilingual self-supervised pre-training. We also observed that…

    Submitted 30 December, 2023; originally announced January 2024.

    Comments: Submitted to ICASSP 2024 Self-supervision in Audio, Speech and Beyond workshop

  33. arXiv:2312.08610  [pdf, other]

    eess.AS cs.SD

    A computationally efficient semi-blind source separation based approach for nonlinear echo cancellation based on an element-wise iterative source steering

    Authors: Kunxing Lu, Xianrui Wang, Tetsuya Ueda, Shoji Makino, Jingdong Chen

    Abstract: While the semi-blind source separation-based acoustic echo cancellation (SBSS-AEC) has received much research attention due to its promising performance during double-talk compared to the traditional adaptive algorithms, it suffers from system latency and nonlinear distortions. To circumvent these drawbacks, the recently developed ideas on convolutive transfer function (CTF) approximation and nonl…

    Submitted 13 December, 2023; originally announced December 2023.

  34. arXiv:2311.12058  [pdf, other]

    cs.CV

    FlashOcc: Fast and Memory-Efficient Occupancy Prediction via Channel-to-Height Plugin

    Authors: Zichen Yu, Changyong Shu, Jiajun Deng, Kangjie Lu, Zongdai Liu, Jiangyong Yu, Dawei Yang, Hui Li, Yan Chen

    Abstract: Given the capability of mitigating the long-tail deficiencies and intricate-shaped absence prevalent in 3D object detection, occupancy prediction has become a pivotal component in autonomous driving systems. However, the processing of three-dimensional voxel-level representations inevitably introduces large overhead in both memory and computation, obstructing the deployment of to-date occupancy pr…

    Submitted 18 November, 2023; originally announced November 2023.

    Comments: 10 pages, 4 figures

  35. arXiv:2311.08981  [pdf, other]

    cs.CL

    Speculative Contrastive Decoding

    Authors: Hongyi Yuan, Keming Lu, Fei Huang, Zheng Yuan, Chang Zhou

    Abstract: Large language models (LLMs) exhibit exceptional performance in language tasks, yet their auto-regressive inference is limited due to high computational requirements and is sub-optimal due to the exposure bias. Inspired by speculative decoding and contrastive decoding, we introduce Speculative Contrastive Decoding (SCD), a straightforward yet powerful decoding approach that leverages predictions f…

    Submitted 13 March, 2024; v1 submitted 15 November, 2023; originally announced November 2023.

    Comments: Revised version

  36. arXiv:2311.08692  [pdf, other]

    cs.CL cs.LG

    Routing to the Expert: Efficient Reward-guided Ensemble of Large Language Models

    Authors: Keming Lu, Hongyi Yuan, Runji Lin, Junyang Lin, Zheng Yuan, Chang Zhou, Jingren Zhou

    Abstract: The complementary potential of Large Language Models (LLM) assumes off-the-shelf LLMs have heterogeneous expertise in a wide range of domains and tasks so that an ensemble of LLMs can achieve consistently better performance. Existing ensemble methods for LLMs mainly focus on reward model ranking of outputs, leading to significant computation overhead. To combat this issue, we revisit the complemen…

    Submitted 14 November, 2023; originally announced November 2023.

  37. arXiv:2311.08182  [pdf, other]

    cs.CL cs.LG

    Self-Evolved Diverse Data Sampling for Efficient Instruction Tuning

    Authors: Shengguang Wu, Keming Lu, Benfeng Xu, Junyang Lin, Qi Su, Chang Zhou

    Abstract: Enhancing the instruction-following ability of Large Language Models (LLMs) primarily demands substantial instruction-tuning datasets. However, the sheer volume of these imposes a considerable computational burden and annotation cost. To investigate a label-efficient instruction tuning method that allows the model itself to actively sample subsets that are equally or even more effective, we introd…

    Submitted 14 November, 2023; originally announced November 2023.

  38. arXiv:2311.06530  [pdf, other]

    cs.SE cs.AI cs.CL cs.CR

    Exploring ChatGPT's Capabilities on Vulnerability Management

    Authors: Peiyu Liu, Junming Liu, Lirong Fu, Kangjie Lu, Yifan Xia, Xuhong Zhang, Wenzhi Chen, Haiqin Weng, Shouling Ji, Wenhai Wang

    Abstract: Recently, ChatGPT has attracted great attention from the code analysis domain. Prior works show that ChatGPT has the capabilities of processing foundational code analysis tasks, such as abstract syntax tree generation, which indicates the potential of using ChatGPT to comprehend code syntax and static behaviors. However, it is unclear whether ChatGPT can complete more complicated real-world vulner…

    Submitted 20 June, 2024; v1 submitted 11 November, 2023; originally announced November 2023.

    Comments: Accepted by USENIX Security 2024

  39. arXiv:2310.18999  [pdf, other]

    cs.CV

    DynPoint: Dynamic Neural Point For View Synthesis

    Authors: Kaichen Zhou, Jia-Xing Zhong, Sangyun Shin, Kai Lu, Yiyuan Yang, Andrew Markham, Niki Trigoni

    Abstract: The introduction of neural radiance fields has greatly improved the effectiveness of view synthesis for monocular videos. However, existing algorithms face difficulties when dealing with uncontrolled or lengthy scenarios, and require extensive training time specific to each new scenario. To tackle these limitations, we propose DynPoint, an algorithm designed to facilitate the rapid synthesis of no…

    Submitted 18 January, 2024; v1 submitted 29 October, 2023; originally announced October 2023.

  40. arXiv:2310.05506  [pdf, other]

    cs.CL cs.AI cs.LG

    MuggleMath: Assessing the Impact of Query and Response Augmentation on Math Reasoning

    Authors: Chengpeng Li, Zheng Yuan, Hongyi Yuan, Guanting Dong, Keming Lu, Jiancan Wu, Chuanqi Tan, Xiang Wang, Chang Zhou

    Abstract: In math reasoning with large language models (LLMs), fine-tuning data augmentation by query evolution and diverse reasoning paths has been empirically verified as effective, profoundly narrowing the gap between open-sourced LLMs and cutting-edge proprietary LLMs. In this paper, we conduct an investigation for such data augmentation in math reasoning and aim to answer: (1) What strategies of data a…

    Submitted 17 July, 2024; v1 submitted 9 October, 2023; originally announced October 2023.

    Comments: Accepted to ACL 2024 Main Conference

  41. arXiv:2310.05492  [pdf, other]

    cs.CL cs.AI cs.LG

    How Abilities in Large Language Models are Affected by Supervised Fine-tuning Data Composition

    Authors: Guanting Dong, Hongyi Yuan, Keming Lu, Chengpeng Li, Mingfeng Xue, Dayiheng Liu, Wei Wang, Zheng Yuan, Chang Zhou, Jingren Zhou

    Abstract: Large language models (LLMs) with enormous pre-training tokens and parameters develop diverse abilities, including math reasoning, code generation, and instruction following. These abilities are further enhanced by supervised fine-tuning (SFT). While the open-source community has explored ad-hoc SFT for enhancing individual capabilities, proprietary LLMs exhibit versatility across various skills. T…

    Submitted 7 June, 2024; v1 submitted 9 October, 2023; originally announced October 2023.

    Comments: Accepted to ACL 2024 Main Conference

  42. arXiv:2309.16681  [pdf, other]

    cs.IT cs.AI

    Alternate Learning based Sparse Semantic Communications for Visual Transmission

    Authors: Siyu Tong, Xiaoxue Yu, Rongpeng Li, Kun Lu, Zhifeng Zhao, Honggang Zhang

    Abstract: Semantic communication (SemCom) demonstrates strong superiority over conventional bit-level accurate transmission by attempting to recover only the essential semantic information of data. In this paper, in order to tackle the non-differentiability of channels, we propose an alternate learning based SemCom system for visual transmission, named SparseSBC. Specifically, SparseSBC leverages two separate…

    Submitted 30 July, 2023; originally announced September 2023.

  43. arXiv:2309.16609  [pdf, other]

    cs.CL

    Qwen Technical Report

    Authors: Jinze Bai, Shuai Bai, Yunfei Chu, Zeyu Cui, Kai Dang, Xiaodong Deng, Yang Fan, Wenbin Ge, Yu Han, Fei Huang, Binyuan Hui, Luo Ji, Mei Li, Junyang Lin, Runji Lin, Dayiheng Liu, Gao Liu, Chengqiang Lu, Keming Lu, Jianxin Ma, Rui Men, Xingzhang Ren, Xuancheng Ren, Chuanqi Tan, Sinan Tan , et al. (23 additional authors not shown)

    Abstract: Large language models (LLMs) have revolutionized the field of artificial intelligence, enabling natural language processing tasks that were previously thought to be exclusive to humans. In this work, we introduce Qwen, the first installment of our large language model series. Qwen is a comprehensive language model series that encompasses distinct models with varying parameter counts. It includes Q…

    Submitted 28 September, 2023; originally announced September 2023.

    Comments: 59 pages, 5 figures

  44. arXiv:2309.09838  [pdf, ps, other]

    cs.CL cs.SD eess.AS

    HypR: A comprehensive study for ASR hypothesis revising with a reference corpus

    Authors: Yi-Wei Wang, Ke-Han Lu, Kuan-Yu Chen

    Abstract: With the development of deep learning, automatic speech recognition (ASR) has made significant progress. To further enhance the performance of ASR, revising recognition results is a lightweight but efficient approach. Various methods can be roughly classified into N-best reranking modeling and error correction modeling. The former aims to select the hypothesis with the lowest error rate fr…

    Submitted 13 June, 2024; v1 submitted 18 September, 2023; originally announced September 2023.

    Comments: Accepted to Interspeech 2024

  45. arXiv:2309.09510  [pdf, ps, other]

    eess.AS cs.LG cs.SD

    Dynamic-SUPERB: Towards A Dynamic, Collaborative, and Comprehensive Instruction-Tuning Benchmark for Speech

    Authors: Chien-yu Huang, Ke-Han Lu, Shih-Heng Wang, Chi-Yuan Hsiao, Chun-Yi Kuan, Haibin Wu, Siddhant Arora, Kai-Wei Chang, Jiatong Shi, Yifan Peng, Roshan Sharma, Shinji Watanabe, Bhiksha Ramakrishnan, Shady Shehata, Hung-yi Lee

    Abstract: Text language models have shown remarkable zero-shot capability in generalizing to unseen tasks when provided with well-formulated instructions. However, existing studies in speech processing primarily focus on limited or specific tasks. Moreover, the lack of standardized benchmarks hinders a fair comparison across different approaches. Thus, we present Dynamic-SUPERB, a benchmark designed for bui…

    Submitted 22 March, 2024; v1 submitted 18 September, 2023; originally announced September 2023.

    Comments: To appear in the proceedings of ICASSP 2024

  46. arXiv:2309.00267  [pdf, other]

    cs.CL cs.AI cs.LG

    RLAIF: Scaling Reinforcement Learning from Human Feedback with AI Feedback

    Authors: Harrison Lee, Samrat Phatale, Hassan Mansoor, Thomas Mesnard, Johan Ferret, Kellie Lu, Colton Bishop, Ethan Hall, Victor Carbune, Abhinav Rastogi, Sushant Prakash

    Abstract: Reinforcement learning from human feedback (RLHF) has proven effective in aligning large language models (LLMs) with human preferences. However, gathering high-quality human preference labels can be a time-consuming and expensive endeavor. RL from AI Feedback (RLAIF), introduced by Bai et al., offers a promising alternative that leverages a powerful off-the-shelf LLM to generate preferences in lie…

    Submitted 30 November, 2023; v1 submitted 1 September, 2023; originally announced September 2023.

    Comments: Added two more tasks and many more experiments and analyses (e.g. same-size RLAIF, direct RLAIF, cost analysis)

  47. arXiv:2308.10451  [pdf, other]

    cs.GT math.OC

    Game-theoretical approach for task allocation problems with constraints

    Authors: Chunxia Liu, Kaihong Lu, Xiaojie Chen, Attila Szolnoki

    Abstract: The distributed task allocation problem, as one of the most interesting distributed optimization challenges, has received considerable research attention recently. Previous works mainly focused on the task allocation problem in a population of individuals, where there are no constraints for affording task amounts. The latter condition, however, cannot always hold. In this paper, we study the ta…

    Submitted 20 August, 2023; originally announced August 2023.

    Journal ref: Applied Mathematics and Computation 458 (2023) 128251

  48. arXiv:2308.07074  [pdf, other]

    cs.CL cs.AI cs.LG

    #InsTag: Instruction Tagging for Analyzing Supervised Fine-tuning of Large Language Models

    Authors: Keming Lu, Hongyi Yuan, Zheng Yuan, Runji Lin, Junyang Lin, Chuanqi Tan, Chang Zhou, Jingren Zhou

    Abstract: Foundation language models obtain the instruction-following ability through supervised fine-tuning (SFT). Diversity and complexity are considered critical factors of a successful SFT dataset, while their definitions remain obscure and lack quantitative analyses. In this work, we propose InsTag, an open-set fine-grained tagger, to tag samples within SFT datasets based on semantics and intentions an…

    Submitted 15 August, 2023; v1 submitted 14 August, 2023; originally announced August 2023.

  49. arXiv:2308.05756  [pdf, other]

    eess.SP cs.LG

    WeldMon: A Cost-effective Ultrasonic Welding Machine Condition Monitoring System

    Authors: Beitong Tian, Kuan-Chieh Lu, Ahmadreza Eslaminia, Yaohui Wang, Chenhui Shao, Klara Nahrstedt

    Abstract: Ultrasonic welding machines play a critical role in the lithium battery industry, facilitating the bonding of batteries with conductors. Ensuring high-quality welding is vital, making tool condition monitoring systems essential for early-stage quality control. However, existing monitoring methods face challenges in cost, downtime, and adaptability. In this paper, we present WeldMon, an affordable…

    Submitted 4 August, 2023; originally announced August 2023.

    Comments: 9 pages, 5 figures

  50. arXiv:2308.01825  [pdf, other]

    cs.CL

    Scaling Relationship on Learning Mathematical Reasoning with Large Language Models

    Authors: Zheng Yuan, Hongyi Yuan, Chengpeng Li, Guanting Dong, Keming Lu, Chuanqi Tan, Chang Zhou, Jingren Zhou

    Abstract: Mathematical reasoning is a challenging task for large language models (LLMs), while its scaling relationship with respect to LLM capacity is under-explored. In this paper, we investigate how the pre-training loss, supervised data amount, and augmented data amount influence the reasoning performances of a supervised LLM. We find that pre-training loss is a better indicator of the model's per…

    Submitted 12 September, 2023; v1 submitted 3 August, 2023; originally announced August 2023.

    Comments: Work in Progress