Showing 1–50 of 325 results for author: Ren, H

Searching in archive cs.
  1. arXiv:2408.11053  [pdf, other]

    cs.SE cs.AI

    Revisiting VerilogEval: Newer LLMs, In-Context Learning, and Specification-to-RTL Tasks

    Authors: Nathaniel Pinckney, Christopher Batten, Mingjie Liu, Haoxing Ren, Brucek Khailany

    Abstract: The application of large-language models (LLMs) to digital hardware code generation is an emerging field. Most LLMs are primarily trained on natural language and software code. Hardware code, such as Verilog, represents only a small portion of the training data and few hardware benchmarks exist. To address this gap, the open-source VerilogEval benchmark was released in 2023, providing a consistent…

    Submitted 20 August, 2024; originally announced August 2024.

    Comments: This paper revisits and improves the benchmark first presented in arXiv:2309.07544. Seven pages, three figures

  2. arXiv:2408.08969  [pdf, other]

    cs.AI physics.optics

    Differentiable Edge-based OPC

    Authors: Guojin Chen, Haoyu Yang, Haoxing Ren, Bei Yu, David Z. Pan

    Abstract: Optical proximity correction (OPC) is crucial for pushing the boundaries of semiconductor manufacturing and enabling the continued scaling of integrated circuits. While pixel-based OPC, termed inverse lithography technology (ILT), has gained research interest due to its flexibility and precision, its complexity and intricate features can lead to challenges in mask writing, increased defects, an…

    Submitted 16 August, 2024; originally announced August 2024.

    Comments: Accepted by ICCAD24

  3. arXiv:2408.08927  [pdf, other]

    cs.AI cs.CL

    VerilogCoder: Autonomous Verilog Coding Agents with Graph-based Planning and Abstract Syntax Tree (AST)-based Waveform Tracing Tool

    Authors: Chia-Tung Ho, Haoxing Ren, Brucek Khailany

    Abstract: Due to the growing complexity of modern Integrated Circuits (ICs), automating hardware design can prevent a significant amount of human error from the engineering process and result in fewer errors. Verilog is a popular hardware description language for designing and modeling digital systems; thus, Verilog generation is one of the emerging areas of research to facilitate the design process. In this…

    Submitted 15 August, 2024; originally announced August 2024.

    Comments: main paper 7 pages, reference 1 page, appendix 22 pages. Under review at AAAI 2025

  4. arXiv:2408.04958  [pdf, other]

    cs.CV cs.RO

    Surgical-VQLA++: Adversarial Contrastive Learning for Calibrated Robust Visual Question-Localized Answering in Robotic Surgery

    Authors: Long Bai, Guankun Wang, Mobarakol Islam, Lalithkumar Seenivasan, An Wang, Hongliang Ren

    Abstract: Medical visual question answering (VQA) bridges the gap between visual information and clinical decision-making, enabling doctors to extract understanding from clinical images and videos. In particular, surgical VQA can enhance the interpretation of surgical data, aiding in accurate diagnoses, effective education, and clinical interventions. However, the inability of VQA models to visually indicat…

    Submitted 9 August, 2024; originally announced August 2024.

    Comments: Accepted by Information Fusion. Code and data availability: https://github.com/longbai1006/Surgical-VQLAPlus

  5. arXiv:2408.04593  [pdf, other]

    cs.CV cs.RO eess.IV

    SAM 2 in Robotic Surgery: An Empirical Evaluation for Robustness and Generalization in Surgical Video Segmentation

    Authors: Jieming Yu, An Wang, Wenzhen Dong, Mengya Xu, Mobarakol Islam, Jie Wang, Long Bai, Hongliang Ren

    Abstract: The recent Segment Anything Model (SAM) 2 has demonstrated remarkable foundational competence in semantic segmentation, with its memory mechanism and mask decoder further addressing challenges in video tracking and object occlusion, thereby achieving superior results in interactive segmentation for both images and videos. Building upon our previous empirical studies, we further explore the zero-sh…

    Submitted 8 August, 2024; originally announced August 2024.

    Comments: Empirical study. Previous work "SAM Meets Robotic Surgery" is accessible at: arXiv:2308.07156

  6. arXiv:2408.04426  [pdf, other]

    cs.CV cs.RO

    A Review of 3D Reconstruction Techniques for Deformable Tissues in Robotic Surgery

    Authors: Mengya Xu, Ziqi Guo, An Wang, Long Bai, Hongliang Ren

    Abstract: As a crucial and intricate task in robotic minimally invasive surgery, reconstructing surgical scenes using stereo or monocular endoscopic video holds immense potential for clinical applications. NeRF-based techniques have recently garnered attention for the ability to reconstruct scenes implicitly. On the other hand, Gaussian splatting-based 3D-GS represents scenes explicitly using 3D Gaussians a…

    Submitted 8 August, 2024; originally announced August 2024.

    Comments: To appear in MICCAI 2024 EARTH Workshop. Code availability: https://github.com/Epsilon404/surgicalnerf

  7. arXiv:2408.01615  [pdf, other]

    cs.RO

    Three-dimensional Morphological Reconstruction of Millimeter-Scale Soft Continuum Robots based on Dual-Stereo-Vision

    Authors: Tian-Ao Ren, Wenyan Liu, Tao Zhang, Lei Zhao, Hongliang Ren, Jiewen Lai

    Abstract: Continuum robots can be miniaturized to just a few millimeters in diameter. Among these, notched tubular continuum robots (NTCR) show great potential in many delicate applications. Existing works in robotic modeling focus on kinematics and dynamics but still face challenges in reproducing the robot's morphology -- a significant factor that can expand the research landscape of continuum robots, esp…

    Submitted 15 August, 2024; v1 submitted 2 August, 2024; originally announced August 2024.

    Comments: 6 pages, 6 figures, submitted to Robio 2024

  8. arXiv:2407.20213  [pdf, other]

    cs.RO cs.CV

    Registering Neural 4D Gaussians for Endoscopic Surgery

    Authors: Yiming Huang, Beilei Cui, Ikemura Kei, Jiekai Zhang, Long Bai, Hongliang Ren

    Abstract: The recent advance in neural rendering has enabled the ability to reconstruct high-quality 4D scenes using neural networks. Although 4D neural reconstruction is popular, registration for such representations remains a challenging task, especially for dynamic scene registration in surgical planning and simulation. In this paper, we propose a novel strategy for dynamic surgical neural scene registra…

    Submitted 29 July, 2024; originally announced July 2024.

  9. arXiv:2407.19435  [pdf, other]

    cs.CV cs.AI cs.CL cs.HC cs.RO

    ASI-Seg: Audio-Driven Surgical Instrument Segmentation with Surgeon Intention Understanding

    Authors: Zhen Chen, Zongming Zhang, Wenwu Guo, Xingjian Luo, Long Bai, Jinlin Wu, Hongliang Ren, Hongbin Liu

    Abstract: Surgical instrument segmentation is crucial in surgical scene understanding, thereby facilitating surgical safety. Existing algorithms directly detected all instruments of pre-defined categories in the input image, lacking the capability to segment specific instruments according to the surgeon's intention. During different stages of surgery, surgeons exhibit varying preferences and focus toward di…

    Submitted 28 July, 2024; originally announced July 2024.

    Comments: This work is accepted by IROS 2024 (Oral)

  10. arXiv:2407.12489  [pdf, other]

    cs.CV

    Dual-level Adaptive Self-Labeling for Novel Class Discovery in Point Cloud Segmentation

    Authors: Ruijie Xu, Chuyu Zhang, Hui Ren, Xuming He

    Abstract: We tackle the novel class discovery in point cloud segmentation, which discovers novel classes based on the semantic knowledge of seen classes. Existing work proposes an online point-wise clustering method with a simplified equal class-size constraint on the novel classes to avoid degenerate solutions. However, the inherent imbalanced distribution of novel classes in point clouds typically violate…

    Submitted 17 July, 2024; originally announced July 2024.

    Comments: Accepted by ECCV2024

  11. arXiv:2407.09048  [pdf, other]

    cs.AI

    KUNPENG: An Embodied Large Model for Intelligent Maritime

    Authors: Naiyao Wang, Tongbang Jiang, Ye Wang, Shaoyang Qiu, Bo Zhang, Xinqiang Xie, Munan Li, Chunliu Wang, Yiyang Wang, Hongxiang Ren, Ruili Wang, Hongjun Shan, Hongbo Liu

    Abstract: Intelligent maritime, as an essential component of smart ocean construction, deeply integrates advanced artificial intelligence technology and data analysis methods, which covers multiple aspects such as smart vessels, route optimization, safe navigation, aiming to enhance the efficiency of ocean resource utilization and the intelligence of transportation networks. However, the complex and dynamic…

    Submitted 12 July, 2024; originally announced July 2024.

    Comments: 9 pages, 3 figures

  12. arXiv:2407.05040  [pdf, other]

    cs.SE cs.LG

    Code Less, Align More: Efficient LLM Fine-tuning for Code Generation with Data Pruning

    Authors: Yun-Da Tsai, Mingjie Liu, Haoxing Ren

    Abstract: Recent work targeting large language models (LLMs) for code generation demonstrated that increasing the amount of training data through synthetic code generation often leads to exceptional performance. In this paper we explore data pruning methods aimed at enhancing the efficiency of model training specifically for code LLMs. We present techniques that integrate various clustering and pruning metr…

    Submitted 6 July, 2024; originally announced July 2024.

  13. arXiv:2407.04579  [pdf, other]

    cs.LG

    GOALPlace: Begin with the End in Mind

    Authors: Anthony Agnesina, Rongjian Liang, Geraldo Pradipta, Anand Rajaram, Haoxing Ren

    Abstract: Co-optimizing placement with congestion is integral to achieving high-quality designs. This paper presents GOALPlace, a new learning-based general approach to improving placement congestion by controlling cell density. Our method efficiently learns from an EDA tool's post-route optimized results and uses an empirical Bayes technique to adapt this goal/target to a specific placer's solutions, effec…

    Submitted 5 July, 2024; originally announced July 2024.

    Comments: 10 pages, 7 figures, preprint

  14. arXiv:2407.03640  [pdf, other]

    cs.LG cs.CL cs.CV

    Generative Technology for Human Emotion Recognition: A Scope Review

    Authors: Fei Ma, Yucheng Yuan, Yifan Xie, Hongwei Ren, Ivan Liu, Ying He, Fuji Ren, Fei Richard Yu, Shiguang Ni

    Abstract: Affective computing stands at the forefront of artificial intelligence (AI), seeking to imbue machines with the ability to comprehend and respond to human emotions. Central to this field is emotion recognition, which endeavors to identify and interpret human emotional states from different modalities, such as speech, facial images, text, and physiological signals. In recent years, important progre…

    Submitted 4 July, 2024; originally announced July 2024.

    Comments: Under Review

  15. arXiv:2407.02933  [pdf, other]

    cs.RO

    Online Time-Informed Kinodynamic Motion Planning of Nonlinear Systems

    Authors: Fei Meng, Jianbang Liu, Haojie Shi, Han Ma, Hongliang Ren, Max Q.-H. Meng

    Abstract: Sampling-based kinodynamic motion planners (SKMPs) are powerful in finding collision-free trajectories for high-dimensional systems under differential constraints. Time-informed set (TIS) can provide the heuristic search domain to accelerate their convergence to the time-optimal solution. However, existing TIS approximation methods suffer from the curse of dimensionality, computational burden, and…

    Submitted 3 July, 2024; originally announced July 2024.

  16. arXiv:2407.00782  [pdf, other]

    cs.CL

    Step-Controlled DPO: Leveraging Stepwise Error for Enhanced Mathematical Reasoning

    Authors: Zimu Lu, Aojun Zhou, Ke Wang, Houxing Ren, Weikang Shi, Junting Pan, Mingjie Zhan, Hongsheng Li

    Abstract: Direct Preference Optimization (DPO) has proven effective at improving the performance of large language models (LLMs) on downstream tasks such as reasoning and alignment. In this work, we propose Step-Controlled DPO (SCDPO), a method for automatically providing stepwise error supervision by creating negative samples of mathematical reasoning rationales that start making errors at a specified step…

    Submitted 14 July, 2024; v1 submitted 30 June, 2024; originally announced July 2024.

  17. arXiv:2406.13705  [pdf, other]

    eess.IV cs.AI cs.CV

    EndoUIC: Promptable Diffusion Transformer for Unified Illumination Correction in Capsule Endoscopy

    Authors: Long Bai, Tong Chen, Qiaozhi Tan, Wan Jun Nah, Yanheng Li, Zhicheng He, Sishen Yuan, Zhen Chen, Jinlin Wu, Mobarakol Islam, Zhen Li, Hongbin Liu, Hongliang Ren

    Abstract: Wireless Capsule Endoscopy (WCE) is highly valued for its non-invasive and painless approach, though its effectiveness is compromised by uneven illumination from hardware constraints and complex internal dynamics, leading to overexposed or underexposed images. While researchers have discussed the challenges of low-light enhancement in WCE, the issue of correcting for different exposure levels rema…

    Submitted 8 July, 2024; v1 submitted 19 June, 2024; originally announced June 2024.

    Comments: To appear in MICCAI 2024. Code and dataset availability: https://github.com/longbai1006/EndoUIC

  18. arXiv:2406.13173  [pdf, other]

    cs.CV cs.AI cs.CL cs.LG

    Biomedical Visual Instruction Tuning with Clinician Preference Alignment

    Authors: Hejie Cui, Lingjun Mao, Xin Liang, Jieyu Zhang, Hui Ren, Quanzheng Li, Xiang Li, Carl Yang

    Abstract: Recent advancements in multimodal foundation models have showcased impressive capabilities in understanding and reasoning with visual and textual information. Adapting these foundation models trained for general usage to specialized domains like biomedicine requires large-scale domain-specific instruction datasets. While existing works have explored curating such datasets automatically, the result…

    Submitted 16 July, 2024; v1 submitted 18 June, 2024; originally announced June 2024.

    MSC Class: 68T50; 68T45; 68T37; 68T05; 68T07; 68T09; ACM Class: I.2.7; I.2.6; I.2.10

  19. arXiv:2406.13048  [pdf, other]

    cs.CV

    Head Pose Estimation and 3D Neural Surface Reconstruction via Monocular Camera in situ for Navigation and Safe Insertion into Natural Openings

    Authors: Ruijie Tang, Beilei Cui, Hongliang Ren

    Abstract: As the significance of simulation in medical care and intervention continues to grow, it is anticipated that a simplified and low-cost platform can be set up to execute personalized diagnoses and treatments. 3D Slicer can not only perform medical image analysis and visualization but can also provide surgical navigation and surgical planning functions. In this paper, we have chosen 3D Slicer as our…

    Submitted 18 June, 2024; originally announced June 2024.

    Comments: Accepted by ICBIR 2024

  20. arXiv:2406.10508  [pdf, other]

    cs.CV

    Learning to Adapt Foundation Model DINOv2 for Capsule Endoscopy Diagnosis

    Authors: Bowen Zhang, Ying Chen, Long Bai, Yan Zhao, Yuxiang Sun, Yixuan Yuan, Jianhua Zhang, Hongliang Ren

    Abstract: Foundation models have become prominent in computer vision, achieving notable success in various tasks. However, their effectiveness largely depends on pre-training with extensive datasets. Applying foundation models directly to small datasets of capsule endoscopy images from scratch is challenging. Pre-training on broad, general vision datasets is crucial for successfully fine-tuning our model fo…

    Submitted 30 June, 2024; v1 submitted 15 June, 2024; originally announced June 2024.

    Comments: To appear in ICBIR 2024

  21. arXiv:2406.06549  [pdf, other]

    cs.AR cs.AI

    Large Language Model (LLM) for Standard Cell Layout Design Optimization

    Authors: Chia-Tung Ho, Haoxing Ren

    Abstract: Standard cells are essential components of modern digital circuit designs. With process technologies advancing toward 2nm, more routability issues have arisen due to the decreasing number of routing tracks, increasing number and complexity of design rules, and strict patterning rules. The state-of-the-art standard cell design automation framework is able to automatically design standard cell layou…

    Submitted 24 May, 2024; originally announced June 2024.

    Comments: 6 pages, 8 figures, IEEE International Workshop on LLM-Aided Design (LAD'24)

  22. arXiv:2406.03177  [pdf, other]

    cs.CV

    FAPNet: An Effective Frequency Adaptive Point-based Eye Tracker

    Authors: Xiaopeng Lin, Hongwei Ren, Bojun Cheng

    Abstract: Eye tracking is crucial for human-computer interaction in different domains. Conventional cameras encounter challenges such as power consumption and image quality during different eye movements, prompting the need for advanced solutions with ultra-fast, low-power, and accurate eye trackers. Event cameras, fundamentally designed to capture information about moving objects, exhibit low power consump…

    Submitted 5 June, 2024; originally announced June 2024.

    Comments: Accepted by CVPRW 2024 (AIS)

  23. arXiv:2405.20015  [pdf, other]

    cs.AI cs.CL

    Efficient LLM-Jailbreaking by Introducing Visual Modality

    Authors: Zhenxing Niu, Yuyao Sun, Haodong Ren, Haoxuan Ji, Quan Wang, Xiaoke Ma, Gang Hua, Rong Jin

    Abstract: This paper focuses on jailbreaking attacks against large language models (LLMs), eliciting them to generate objectionable content in response to harmful user queries. Unlike previous LLM-jailbreaks that directly orient to LLMs, our approach begins by constructing a multimodal large language model (MLLM) through the incorporation of a visual module into the target LLM. Subsequently, we conduct an e…

    Submitted 30 May, 2024; originally announced May 2024.

  24. arXiv:2405.17103  [pdf, other]

    cs.CL cs.AI

    Empowering Character-level Text Infilling by Eliminating Sub-Tokens

    Authors: Houxing Ren, Mingjie Zhan, Zhongyuan Wu, Hongsheng Li

    Abstract: In infilling tasks, sub-tokens, representing instances where a complete token is segmented into two parts, often emerge at the boundaries of prefixes, middles, and suffixes. Traditional methods focused on training models at the token level, leading to sub-optimal performance in character-level infilling tasks during the inference stage. Alternately, some approaches considered character-level infil…

    Submitted 14 June, 2024; v1 submitted 27 May, 2024; originally announced May 2024.

    Comments: Accepted to ACL 2024 (main conference)

  25. arXiv:2405.17057  [pdf, other]

    cs.CL cs.AI

    ReflectionCoder: Learning from Reflection Sequence for Enhanced One-off Code Generation

    Authors: Houxing Ren, Mingjie Zhan, Zhongyuan Wu, Aojun Zhou, Junting Pan, Hongsheng Li

    Abstract: Code generation plays a crucial role in various tasks, such as code auto-completion and mathematical reasoning. Previous work has proposed numerous methods to enhance code generation performance, including integrating feedback from the compiler. Inspired by this, we present ReflectionCoder, a novel approach that effectively leverages reflection sequences constructed by integrating compiler feedbac…

    Submitted 27 May, 2024; originally announced May 2024.

  26. arXiv:2405.15161  [pdf, other]

    cs.CR cs.CV

    Are You Copying My Prompt? Protecting the Copyright of Vision Prompt for VPaaS via Watermark

    Authors: Huali Ren, Anli Yan, Chong-zhi Gao, Hongyang Yan, Zhenxin Zhang, Jin Li

    Abstract: Visual Prompt Learning (VPL) differs from traditional fine-tuning methods in reducing significant resource consumption by avoiding updating pre-trained model parameters. Instead, it focuses on learning an input perturbation, a visual prompt, added to downstream task data for making predictions. Since learning generalizable prompts requires expert design and creation, which is technically demanding…

    Submitted 23 May, 2024; originally announced May 2024.

    Comments: 11 pages, 7 figures

  27. arXiv:2405.15154  [pdf, other]

    cs.AI cs.LG

    Online Prompt Pricing based on Combinatorial Multi-Armed Bandit and Hierarchical Stackelberg Game

    Authors: Meiling Li, Hongrun Ren, Haixu Xiong, Zhenxing Qian, Xinpeng Zhang

    Abstract: Generation models have shown promising performance in various tasks, making trading around machine learning models possible. In this paper, we aim at a novel prompt trading scenario, prompt bundle trading (PBT) system, and propose an online pricing mechanism. Based on the combinatorial multi-armed bandit (CMAB) and three-stage hierarchical Stackelberg (HS) game, our pricing mechanism considers the…

    Submitted 31 May, 2024; v1 submitted 23 May, 2024; originally announced May 2024.

  28. arXiv:2405.10948  [pdf, other]

    cs.CV cs.AI cs.RO eess.IV

    Surgical-LVLM: Learning to Adapt Large Vision-Language Model for Grounded Visual Question Answering in Robotic Surgery

    Authors: Guankun Wang, Long Bai, Wan Jun Nah, Jie Wang, Zhaoxi Zhang, Zhen Chen, Jinlin Wu, Mobarakol Islam, Hongbin Liu, Hongliang Ren

    Abstract: Recent advancements in Surgical Visual Question Answering (Surgical-VQA) and related region grounding have shown great promise for robotic and medical applications, addressing the critical need for automated methods in personalized surgical mentorship. However, existing models primarily provide simple structured answers and struggle with complex scenarios due to their limited capability in recogni…

    Submitted 22 March, 2024; originally announced May 2024.

  29. arXiv:2405.10550  [pdf, other]

    eess.IV cs.CV

    LighTDiff: Surgical Endoscopic Image Low-Light Enhancement with T-Diffusion

    Authors: Tong Chen, Qingcheng Lyu, Long Bai, Erjian Guo, Huxin Gao, Xiaoxiao Yang, Hongliang Ren, Luping Zhou

    Abstract: Advances in endoscopy use in surgeries face challenges like inadequate lighting. Deep learning, notably the Denoising Diffusion Probabilistic Model (DDPM), holds promise for low-light image enhancement in the medical field. However, DDPMs are computationally demanding and slow, limiting their practical medical applications. To bridge this gap, we propose a lightweight DDPM, dubbed LighTDiff. It ad…

    Submitted 17 May, 2024; originally announced May 2024.

  30. arXiv:2405.08672  [pdf, other]

    eess.IV cs.CV

    EndoDAC: Efficient Adapting Foundation Model for Self-Supervised Depth Estimation from Any Endoscopic Camera

    Authors: Beilei Cui, Mobarakol Islam, Long Bai, An Wang, Hongliang Ren

    Abstract: Depth estimation plays a crucial role in various tasks within endoscopic surgery, including navigation, surface reconstruction, and augmented reality visualization. Despite the significant achievements of foundation models in vision tasks, including depth estimation, their direct application to the medical domain often results in suboptimal performance. This highlights the need for efficient adapt…

    Submitted 14 May, 2024; originally announced May 2024.

    Comments: early accepted by MICCAI 2024

  31. arXiv:2405.07601  [pdf, other]

    cs.LG cs.AI cs.DB cs.DC

    On-device Online Learning and Semantic Management of TinyML Systems

    Authors: Haoyu Ren, Xue Li, Darko Anicic, Thomas A. Runkler

    Abstract: Recent advances in Tiny Machine Learning (TinyML) empower low-footprint embedded devices for real-time on-device Machine Learning. While many acknowledge the potential benefits of TinyML, its practical implementation presents unique challenges. This study aims to bridge the gap between prototyping single TinyML models and developing reliable TinyML systems in production: (1) Embedded devices opera…

    Submitted 15 May, 2024; v1 submitted 13 May, 2024; originally announced May 2024.

    Comments: Accepted by Journal Transactions on Embedded Computing Systems (TECS)

  32. arXiv:2405.06116  [pdf, other]

    cs.CV

    Rethinking Efficient and Effective Point-based Networks for Event Camera Classification and Regression: EventMamba

    Authors: Hongwei Ren, Yue Zhou, Jiadong Zhu, Haotian Fu, Yulong Huang, Xiaopeng Lin, Yuetong Fang, Fei Ma, Hao Yu, Bojun Cheng

    Abstract: Event cameras, drawing inspiration from biological systems, efficiently detect changes in ambient light with low latency and high dynamic range while consuming minimal power. The most current approach to processing event data often involves converting it into frame-based representations, which is well-established in traditional vision. However, this approach neglects the sparsity of event data, lo…

    Submitted 2 July, 2024; v1 submitted 9 May, 2024; originally announced May 2024.

    Comments: Extension Journal of TTPOINT and PEPNet, modify the dataset split method

  33. arXiv:2405.05576  [pdf, other]

    cs.SI cs.IR cs.NI

    LayerPlexRank: Exploring Node Centrality and Layer Influence through Algebraic Connectivity in Multiplex Networks

    Authors: Hao Ren, Jiaojiao Jiang

    Abstract: As the calculation of centrality in complex networks becomes increasingly vital across technological, biological, and social systems, precise and scalable ranking methods are essential for understanding these networks. This paper introduces LayerPlexRank, an algorithm that simultaneously assesses node centrality and layer influence in multiplex networks using algebraic connectivity metrics. This m…

    Submitted 9 May, 2024; originally announced May 2024.

  34. arXiv:2405.03574  [pdf, other]

    cs.LG

    ILILT: Implicit Learning of Inverse Lithography Technologies

    Authors: Haoyu Yang, Haoxing Ren

    Abstract: Lithography, transferring chip design masks to the silicon wafer, is the most important phase in modern semiconductor manufacturing flow. Due to the limitations of lithography systems, extensive design optimizations are required to tackle the design and silicon mismatch. Inverse lithography technology (ILT) is one of the promising solutions to perform pre-fabrication optimization, termed mask opti…

    Submitted 6 May, 2024; originally announced May 2024.

    Comments: 12 pages, 10 figures, accepted by International Conference on Machine Learning (ICML24)

  35. arXiv:2405.00734  [pdf, other]

    eess.SP cs.AI cs.LG

    EEG-MACS: Manifold Attention and Confidence Stratification for EEG-based Cross-Center Brain Disease Diagnosis under Unreliable Annotations

    Authors: Zhenxi Song, Ruihan Qin, Huixia Ren, Zhen Liang, Yi Guo, Min Zhang, Zhiguo Zhang

    Abstract: Cross-center data heterogeneity and annotation unreliability significantly challenge the intelligent diagnosis of diseases using brain signals. A notable example is the EEG-based diagnosis of neurodegenerative diseases, which features subtler abnormal neural dynamics typically observed in small-group settings. To advance this area, in this work, we introduce a transferable framework employing Mani…

    Submitted 13 August, 2024; v1 submitted 29 April, 2024; originally announced May 2024.

  36. An Element-Wise Weights Aggregation Method for Federated Learning

    Authors: Yi Hu, Hanchi Ren, Chen Hu, Jingjing Deng, Xianghua Xie

    Abstract: Federated learning (FL) is a powerful Machine Learning (ML) paradigm that enables distributed clients to collaboratively learn a shared global model while keeping the data on the original device, thereby preserving privacy. A central challenge in FL is the effective aggregation of local model weights from disparate and potentially unbalanced participating clients. Existing methods often treat each…

    Submitted 24 April, 2024; originally announced April 2024.

    Comments: 2023 IEEE International Conference on Data Mining Workshops (ICDMW)

  37. arXiv:2404.15854  [pdf, other]

    cs.CR cs.LG cs.SD eess.AS

    CLAD: Robust Audio Deepfake Detection Against Manipulation Attacks with Contrastive Learning

    Authors: Haolin Wu, Jing Chen, Ruiying Du, Cong Wu, Kun He, Xingcan Shang, Hao Ren, Guowen Xu

    Abstract: The increasing prevalence of audio deepfakes poses significant security threats, necessitating robust detection methods. While existing detection systems exhibit promise, their robustness against malicious audio manipulations remains underexplored. To bridge the gap, we undertake the first comprehensive study of the susceptibility of the most widely adopted audio deepfake detectors to manipulation…

    Submitted 24 April, 2024; originally announced April 2024.

    Comments: Submitted to IEEE TDSC

  38. arXiv:2404.15469  [pdf, other]

    cs.IT eess.SP

    NMBEnet: Efficient Near-field mmWave Beam Training for Multiuser OFDM Systems Using Sub-6 GHz Pilots

    Authors: Wang Liu, Cunhua Pan, Hong Ren, Cheng-Xiang Wang, Jiangzhou Wang, Xiaohu You

    Abstract: Combining millimetre-wave (mmWave) communications with an extremely large-scale antenna array (ELAA) presents a promising avenue for meeting the spectral efficiency demands of the future sixth generation (6G) mobile communications. However, beam training for mmWave ELAA systems is challenged by excessive pilot overheads as well as insufficient accuracy, as the huge near-field codebook has to be ac…

    Submitted 23 April, 2024; originally announced April 2024.

  39. arXiv:2404.12091  [pdf, other]

    cs.CV

    Harnessing Joint Rain-/Detail-aware Representations to Eliminate Intricate Rains

    Authors: Wu Ran, Peirong Ma, Zhiquan He, Hao Ren, Hong Lu

    Abstract: Recent advances in image deraining have focused on training powerful models on mixed multiple datasets comprising diverse rain types and backgrounds. However, this approach tends to overlook the inherent differences among rainy images, leading to suboptimal results. To overcome this limitation, we focus on addressing various rainy images by delving into meaningful representations that encapsulate…

    Submitted 18 April, 2024; originally announced April 2024.

    Comments: 21 pages, 14 figures

    Journal ref: International Conference on Learning Representations 2024

  40. Multi-view X-ray Image Synthesis with Multiple Domain Disentanglement from CT Scans

    Authors: Lixing Tan, Shuang Song, Kangneng Zhou, Chengbo Duan, Lanying Wang, Huayang Ren, Linlin Liu, Wei Zhang, Ruoxiu Xiao

    Abstract: X-ray images play a vital role in the intraoperative processes due to their high resolution and fast imaging speed and greatly promote the subsequent segmentation, registration and reconstruction. However, over-dosed X-rays superimpose potential risks to human health to some extent. Data-driven algorithms from volume scans to X-ray images are restricted by the scarcity of paired X-ray and volume d…

    Submitted 30 July, 2024; v1 submitted 18 April, 2024; originally announced April 2024.

    Comments: 13 pages, 10 figures, ACM MM2024

  41. arXiv:2404.11770  [pdf, other]

    cs.CV cs.AI

    Event-Based Eye Tracking. AIS 2024 Challenge Survey

    Authors: Zuowen Wang, Chang Gao, Zongwei Wu, Marcos V. Conde, Radu Timofte, Shih-Chii Liu, Qinyu Chen, Zheng-jun Zha, Wei Zhai, Han Han, Bohao Liao, Yuliang Wu, Zengyu Wan, Zhong Wang, Yang Cao, Ganchao Tan, Jinze Chen, Yan Ru Pei, Sasskia Brüers, Sébastien Crouzet, Douglas McLelland, Oliver Coenen, Baoheng Zhang, Yizhao Gao, Jingyuan Li, et al. (14 additional authors not shown)

    Abstract: This survey reviews the AIS 2024 Event-Based Eye Tracking (EET) Challenge. The task of the challenge focuses on processing eye movement recorded with event cameras and predicting the pupil center of the eye. The challenge emphasizes efficient eye tracking with event cameras to achieve good task accuracy and efficiency trade-off. During the challenge period, 38 participants registered for the Kaggl…

    Submitted 17 April, 2024; originally announced April 2024.

    Comments: Qinyu Chen is the corresponding author

  42. arXiv:2404.10213  [pdf, other]

    cs.CV

    GaitPoint+: A Gait Recognition Network Incorporating Point Cloud Analysis and Recycling

    Authors: Huantao Ren, Jiajing Chen, Senem Velipasalar

    Abstract: Gait is a behavioral biometric modality that can be used to recognize individuals by the way they walk from a far distance. Most existing gait recognition approaches rely on either silhouettes or skeletons, while their joint use is underexplored. Features from silhouettes and skeletons can provide complementary information for more robust recognition against appearance changes or pose estimation e…

    Submitted 15 April, 2024; originally announced April 2024.

  43. arXiv:2404.08850  [pdf]

    cs.AI cs.CE cs.LG

    Assessing Economic Viability: A Comparative Analysis of Total Cost of Ownership for Domain-Adapted Large Language Models versus State-of-the-art Counterparts in Chip Design Coding Assistance

    Authors: Amit Sharma, Teodor-Dumitru Ene, Kishor Kunal, Mingjie Liu, Zafar Hasan, Haoxing Ren

    Abstract: This paper presents a comparative analysis of total cost of ownership (TCO) and performance between domain-adapted large language models (LLMs) and state-of-the-art (SoTA) LLMs, with a particular emphasis on tasks related to coding assistance for chip design. We examine the TCO and performance metrics of a domain-adaptive LLM, ChipNeMo, against two leading LLMs, Claude 3 Opus and ChatGPT-4 Turbo,…

    Submitted 28 May, 2024; v1 submitted 12 April, 2024; originally announced April 2024.

    Comments: Paper accepted in IEEE-ACM conference: 2024 IEEE LLM-Aided Design Workshop (LAD)

  44. arXiv:2404.05916  [pdf, other]

    cs.CV

    Prompt-driven Universal Model for View-Agnostic Echocardiography Analysis

    Authors: Sekeun Kim, Hui Ren, Peng Guo, Abder-Rahman Ali, Patrick Zhang, Kyungsang Kim, Xiang Li, Quanzheng Li

    Abstract: Echocardiography segmentation for cardiac analysis is time-consuming and resource-intensive due to the variability in image quality and the necessity to process scans from various standard views. While current automated segmentation methods in echocardiography show promising performance, they are trained on specific scan views to analyze corresponding data. However, this solution has a limitation…

    Submitted 8 April, 2024; originally announced April 2024.

  45. arXiv:2404.05334  [pdf]

    cs.SI

    Modeling the Dynamic Process of Inventions for Reducing Knowledge Search Costs

    Authors: Haiying Ren, Yuanyuan Song, Rui Peng

    Abstract: A knowledge search is a key process for inventions. However, there is inadequate quantitative modeling of dynamic knowledge search processes and associated search costs. In this study, agent-based and complex network methodologies were proposed to quantitatively describe the dynamic process of knowledge search for actual inventions. Prior knowledge networks (PKNs), the search space of historical p…

    Submitted 10 May, 2024; v1 submitted 8 April, 2024; originally announced April 2024.

    Comments: 16 pages, 8 figures

    ACM Class: J.4

  46. arXiv:2404.03446  [pdf, other]

    cs.CV cs.LG

    SP$^2$OT: Semantic-Regularized Progressive Partial Optimal Transport for Imbalanced Clustering

    Authors: Chuyu Zhang, Hui Ren, Xuming He

    Abstract: Deep clustering, which learns representation and semantic clustering without labels information, poses a great challenge for deep learning-based approaches. Despite significant progress in recent years, most existing methods focus on uniformly distributed datasets, significantly limiting the practical applicability of their methods. In this paper, we propose a more practical problem setting named…

    Submitted 4 April, 2024; originally announced April 2024.

    Comments: under review. arXiv admin note: substantial text overlap with arXiv:2401.09266

  47. arXiv:2404.03191  [pdf, other]

    cs.CV

    CORP: A Multi-Modal Dataset for Campus-Oriented Roadside Perception Tasks

    Authors: Beibei Wang, Shuang Meng, Lu Zhang, Chenjie Wang, Jingjing Huang, Yao Li, Haojie Ren, Yuxuan Xiao, Yuru Peng, Jianmin Ji, Yu Zhang, Yanyong Zhang

    Abstract: Numerous roadside perception datasets have been introduced to propel advancements in autonomous driving and intelligent transportation systems research and development. However, it has been observed that the majority of them concentrate on urban arterial roads, inadvertently overlooking residential areas such as parks and campuses that exhibit entirely distinct characteristics. In light of th…

    Submitted 6 May, 2024; v1 submitted 4 April, 2024; originally announced April 2024.

  48. arXiv:2403.19412  [pdf, other]

    cs.CV

    A Simple and Effective Point-based Network for Event Camera 6-DOFs Pose Relocalization

    Authors: Hongwei Ren, Jiadong Zhu, Yue Zhou, Haotian FU, Yulong Huang, Bojun Cheng

    Abstract: Event cameras exhibit remarkable attributes such as high dynamic range, asynchronicity, and low latency, making them highly suitable for vision tasks that involve high-speed motion in challenging lighting conditions. These cameras implicitly capture movement and depth information in events, making them appealing sensors for Camera Pose Relocalization (CPR) tasks. Nevertheless, existing CPR network…

    Submitted 28 March, 2024; originally announced March 2024.

    Comments: Accepted by CVPR 2024

  49. arXiv:2403.10009  [pdf]

    eess.IV cs.CV

    Temporal-spatial Adaptation of Promptable SAM Enhance Accuracy and Generalizability of cine CMR Segmentation

    Authors: Zhennong Chen, Sekeun Kim, Hui Ren, Quanzheng Li, Xiang Li

    Abstract: Accurate myocardium segmentation across all phases in one cardiac cycle in cine cardiac magnetic resonance (CMR) scans is crucial for comprehensive cardiac function analysis. Despite advancements in deep learning (DL) for automatic cine CMR segmentation, generalizability on unseen data remains a significant challenge. Recently, the segment-anything-model (SAM) has been invented as a segmentation…

    Submitted 15 July, 2024; v1 submitted 15 March, 2024; originally announced March 2024.

    Comments: 10 pages, 3 figures

  50. arXiv:2403.09058  [pdf, ps, other]

    cs.IT eess.SP

    Performance Analysis on RIS-Aided Wideband Massive MIMO OFDM Systems with Low-Resolution ADCs

    Authors: Xianzhe Chen, Hong Ren, Cunhua Pan, Zhangjie Peng, Kangda Zhi, Yong Liu, Xiaojun Xi, Ana Garcia Armada, Cheng-Xiang Wang

    Abstract: This paper investigates a reconfigurable intelligent surface (RIS)-aided wideband massive multiple-input multiple-output (MIMO) orthogonal frequency division multiplexing (OFDM) system with low-resolution analog-to-digital converters (ADCs). Frequency-selective Rician fading channels are considered, and the OFDM data transmission process is presented in time domain. This paper derives the closed-f…

    Submitted 13 March, 2024; originally announced March 2024.