Showing 1–50 of 63 results for author: Kwon, D

  1. arXiv:2408.00359  [pdf, other]

    cs.LG stat.ML

    Memorization Capacity for Additive Fine-Tuning with Small ReLU Networks

    Authors: Jy-yong Sohn, Dohyun Kwon, Seoyeon An, Kangwook Lee

    Abstract: Fine-tuning large pre-trained models is a common practice in machine learning applications, yet its mathematical analysis remains largely unexplored. In this paper, we study fine-tuning through the lens of memorization capacity. Our new measure, the Fine-Tuning Capacity (FTC), is defined as the maximum number of samples a neural network can fine-tune, or equivalently, as the minimum number of neur…

    Submitted 19 August, 2024; v1 submitted 1 August, 2024; originally announced August 2024.

    Comments: 10 pages, 9 figures, UAI 2024

  2. arXiv:2407.20021  [pdf, other]

    cs.LG cs.AI cs.CV

    MimiQ: Low-Bit Data-Free Quantization of Vision Transformers with Encouraging Inter-Head Attention Similarity

    Authors: Kanghyun Choi, Hye Yoon Lee, Dain Kwon, SunJong Park, Kyuyeun Kim, Noseong Park, Jinho Lee

    Abstract: Data-free quantization (DFQ) is a technique that creates a lightweight network from its full-precision counterpart without the original training data, often through a synthetic dataset. Although several DFQ methods have been proposed for vision transformer (ViT) architectures, they fail to achieve efficacy in low-bit settings. Examining the existing methods, we identify that their synthetic data p…

    Submitted 1 August, 2024; v1 submitted 29 July, 2024; originally announced July 2024.

    Comments: Author Preprint

  3. arXiv:2407.00626  [pdf, other]

    cs.LG cs.AI

    Maximum Entropy Inverse Reinforcement Learning of Diffusion Models with Energy-Based Models

    Authors: Sangwoong Yoon, Himchan Hwang, Dohyun Kwon, Yung-Kyun Noh, Frank C. Park

    Abstract: We present a maximum entropy inverse reinforcement learning (IRL) approach for improving the sample quality of diffusion generative models, especially when the number of generation time steps is small. Similar to how IRL trains a policy based on the reward function learned from expert demonstrations, we train (or fine-tune) a diffusion model using the log probability density estimated from trainin…

    Submitted 30 June, 2024; originally announced July 2024.

    Comments: Code is released at https://github.com/swyoon/Diffusion-by-MaxEntIRL

  4. arXiv:2406.15635  [pdf, other]

    cs.LG cs.CR cs.CV

    DataFreeShield: Defending Adversarial Attacks without Training Data

    Authors: Hyeyoon Lee, Kanghyun Choi, Dain Kwon, Sunjong Park, Mayoore Selvarasa Jaiswal, Noseong Park, Jonghyun Choi, Jinho Lee

    Abstract: Recent advances in adversarial robustness rely on an abundant set of training data, where using external or additional datasets has become a common setting. However, in real life, the training data is often kept private for security and privacy issues, while only the pretrained weight is available to the public. In such scenarios, existing methods that assume accessibility to the original data bec…

    Submitted 21 June, 2024; originally announced June 2024.

    Comments: ICML 2024

  5. arXiv:2406.01020  [pdf, other]

    cs.CV

    CLIP-Guided Attribute Aware Pretraining for Generalizable Image Quality Assessment

    Authors: Daekyu Kwon, Dongyoung Kim, Sehwan Ki, Younghyun Jo, Hyong-Euk Lee, Seon Joo Kim

    Abstract: In no-reference image quality assessment (NR-IQA), the challenge of limited dataset sizes hampers the development of robust and generalizable models. Conventional methods address this issue by utilizing large datasets to extract rich representations for IQA. Also, some approaches propose vision language models (VLM) based IQA, but the domain gap between generic VLM and IQA constrains their scalabi…

    Submitted 3 June, 2024; originally announced June 2024.

  6. arXiv:2405.13345  [pdf, other]

    cs.RO cs.LG

    Autonomous Algorithm for Training Autonomous Vehicles with Minimal Human Intervention

    Authors: Sang-Hyun Lee, Daehyeok Kwon, Seung-Woo Seo

    Abstract: Reinforcement learning (RL) provides a compelling framework for enabling autonomous vehicles to continue to learn and improve diverse driving behaviors on their own. However, training real-world autonomous vehicles with current RL algorithms presents several challenges. One critical challenge, often overlooked in these algorithms, is the need to reset a driving environment between every episode. W…

    Submitted 22 May, 2024; originally announced May 2024.

    Comments: 8 pages, 6 figures, 2 tables, conference

  7. arXiv:2405.04620  [pdf, ps, other]

    hep-ph cs.AI cs.CL cs.LG cs.NE

    Folded context condensation in Path Integral formalism for infinite context transformers

    Authors: Won-Gi Paeng, Daesuk Kwon

    Abstract: This short note is written for rapid communication of long context training and to share the idea of how to train it with low memory usage. In the note, we generalize the attention algorithm and neural network of Generative Pre-Trained Transformers and reinterpret it in Path integral formalism. First, the role of the transformer is understood as the time evolution of the token state and second, it…

    Submitted 9 May, 2024; v1 submitted 7 May, 2024; originally announced May 2024.

    Comments: 7 pages, 2 figures

  8. arXiv:2404.02135  [pdf, other]

    cs.CV eess.IV

    Enhancing Ship Classification in Optical Satellite Imagery: Integrating Convolutional Block Attention Module with ResNet for Improved Performance

    Authors: Ryan Donghan Kwon, Gangjoo Robin Nam, Jisoo Tak, Junseob Shin, Hyerin Cha, Seung Won Lee

    Abstract: In this study, we present an advanced convolutional neural network (CNN) architecture for ship classification based on optical satellite imagery, which significantly enhances performance through the integration of a convolutional block attention module (CBAM) and additional architectural innovations. Building upon the foundational ResNet50 model, we first incorporated a standard CBAM to direct the…

    Submitted 20 August, 2024; v1 submitted 2 April, 2024; originally announced April 2024.

    Comments: Submitted to IEEE Access on August 16, 2024

  9. arXiv:2402.19267  [pdf, other]

    cs.CL cs.AI

    Robust Guidance for Unsupervised Data Selection: Capturing Perplexing Named Entities for Domain-Specific Machine Translation

    Authors: Seunghyun Ji, Hagai Raja Sinulingga, Darongsae Kwon

    Abstract: Low-resourced data presents a significant challenge for neural machine translation. In most cases, the low-resourced environment is caused by high costs due to the need for domain experts or the lack of language experts. Therefore, identifying the most training-efficient data within an unsupervised setting emerges as a practical strategy. Recent research suggests that such effective data can be id…

    Submitted 21 May, 2024; v1 submitted 29 February, 2024; originally announced February 2024.

    Comments: 11 pages, 3 figures, 5 tables. Oral presentation was given in SIGUL 2024, a satellite workshop of LREC-COLING 2024 (https://sigul-2024.ilc.cnr.it/wp-content/uploads/2024/05/Ji-et-al.pdf)

  10. arXiv:2402.14590  [pdf, other]

    cs.IR cs.CL cs.LG

    Scaling Up LLM Reviews for Google Ads Content Moderation

    Authors: Wei Qiao, Tushar Dogra, Otilia Stretcu, Yu-Han Lyu, Tiantian Fang, Dongjin Kwon, Chun-Ta Lu, Enming Luo, Yuan Wang, Chih-Chun Chia, Ariel Fuxman, Fangzhou Wang, Ranjay Krishna, Mehmet Tek

    Abstract: Large language models (LLMs) are powerful tools for content moderation, but their inference costs and latency make them prohibitive for casual use on large datasets, such as the Google Ads repository. This study proposes a method for scaling up LLM reviews for content moderation in Google Ads. First, we use heuristics to select candidates via filtering and duplicate removal, and create clusters of…

    Submitted 7 February, 2024; originally announced February 2024.

  11. arXiv:2402.13550  [pdf, other]

    cs.CL cs.AI

    Are LLMs Effective Negotiators? Systematic Evaluation of the Multifaceted Capabilities of LLMs in Negotiation Dialogues

    Authors: Deuksin Kwon, Emily Weiss, Tara Kulshrestha, Kushal Chawla, Gale M. Lucas, Jonathan Gratch

    Abstract: A successful negotiation demands a deep comprehension of the conversation context, Theory-of-Mind (ToM) skills to infer the partner's motives, as well as strategic reasoning and effective communication, making it challenging for automated systems. Given the remarkable performance of LLMs across a variety of NLP tasks, in this work, we aim to understand how LLMs can advance different aspects of neg…

    Submitted 21 February, 2024; originally announced February 2024.

  12. arXiv:2402.09264  [pdf, other]

    cs.LG cs.HC

    UR2M: Uncertainty and Resource-Aware Event Detection on Microcontrollers

    Authors: Hong Jia, Young D. Kwon, Dong Ma, Nhat Pham, Lorena Qendro, Tam Vu, Cecilia Mascolo

    Abstract: Traditional machine learning techniques are prone to generating inaccurate predictions when confronted with shifts in the distribution of data between the training and testing phases. This vulnerability can lead to severe consequences, especially in applications such as mobile healthcare. Uncertainty estimation has the potential to mitigate this issue by assessing the reliability of a model's outp…

    Submitted 12 March, 2024; v1 submitted 14 February, 2024; originally announced February 2024.

  13. arXiv:2402.07101  [pdf, ps, other]

    math.OC cs.LG

    On the Complexity of First-Order Methods in Stochastic Bilevel Optimization

    Authors: Jeongyeol Kwon, Dohyun Kwon, Hanbaek Lyu

    Abstract: We consider the problem of finding stationary points in Bilevel optimization when the lower-level problem is unconstrained and strongly convex. The problem has been extensively studied in recent years; the main technical challenge is to keep track of lower-level solutions $y^*(x)$ in response to the changes in the upper-level variables $x$. Subsequently, all existing approaches tie their analyses…

    Submitted 10 February, 2024; originally announced February 2024.

  14. arXiv:2312.17285  [pdf, other]

    cs.CV cs.AI cs.LG

    Understanding Distributed Representations of Concepts in Deep Neural Networks without Supervision

    Authors: Wonjoon Chang, Dahee Kwon, Jaesik Choi

    Abstract: Understanding intermediate representations of the concepts learned by deep learning classifiers is indispensable for interpreting general model behaviors. Existing approaches to reveal learned concepts often rely on human supervision, such as pre-defined concept sets or segmentation processes. In this paper, we propose a novel unsupervised method for discovering distributed representations of conc…

    Submitted 5 March, 2024; v1 submitted 28 December, 2023; originally announced December 2023.

    Comments: Published in AAAI2024. First two authors contributed equally. The code is available at https://github.com/daheekwon/RDR

  15. arXiv:2312.03397  [pdf, other]

    cs.LG cs.AI

    Generalized Contrastive Divergence: Joint Training of Energy-Based Model and Diffusion Model through Inverse Reinforcement Learning

    Authors: Sangwoong Yoon, Dohyun Kwon, Himchan Hwang, Yung-Kyun Noh, Frank C. Park

    Abstract: We present Generalized Contrastive Divergence (GCD), a novel objective function for training an energy-based model (EBM) and a sampler simultaneously. GCD generalizes Contrastive Divergence (Hinton, 2002), a celebrated algorithm for training EBM, by replacing Markov Chain Monte Carlo (MCMC) distribution with a trainable sampler, such as a diffusion model. In GCD, the joint training of EBM and a di…

    Submitted 6 December, 2023; originally announced December 2023.

    Comments: NeurIPS 2023 Workshop on Diffusion Models

  16. arXiv:2311.11420  [pdf, other]

    cs.LG cs.AI cs.CV

    LifeLearner: Hardware-Aware Meta Continual Learning System for Embedded Computing Platforms

    Authors: Young D. Kwon, Jagmohan Chauhan, Hong Jia, Stylianos I. Venieris, Cecilia Mascolo

    Abstract: Continual Learning (CL) allows applications such as user personalization and household robots to learn on the fly and adapt to context. This is an important feature when context, actions, and users change. However, enabling CL on resource-constrained embedded systems is challenging due to the limited labeled data, memory, and computing capacity. In this paper, we propose LifeLearner, a hardware-aw…

    Submitted 19 November, 2023; originally announced November 2023.

    Comments: Accepted for publication at SenSys 2023

  17. arXiv:2311.10430  [pdf, other]

    eess.IV cs.CV cs.LG

    Deep Residual CNN for Multi-Class Chest Infection Diagnosis

    Authors: Ryan Donghan Kwon, Dohyun Lim, Yoonha Lee, Seung Won Lee

    Abstract: The advent of deep learning has significantly propelled the capabilities of automated medical image diagnosis, providing valuable tools and resources in the realm of healthcare and medical diagnostics. This research delves into the development and evaluation of a Deep Residual Convolutional Neural Network (CNN) for the multi-class diagnosis of chest infections, utilizing chest X-ray images. The im…

    Submitted 17 November, 2023; originally announced November 2023.

  18. arXiv:2311.09243  [pdf, ps, other]

    cs.HC cs.AI

    Evaluating the Efficacy of Interactive Language Therapy Based on LLM for High-Functioning Autistic Adolescent Psychological Counseling

    Authors: Yujin Cho, Mingeon Kim, Seojin Kim, Oyun Kwon, Ryan Donghan Kwon, Yoonha Lee, Dohyun Lim

    Abstract: This study investigates the efficacy of Large Language Models (LLMs) in interactive language therapy for high-functioning autistic adolescents. With the rapid advancement of artificial intelligence, particularly in natural language processing, LLMs present a novel opportunity to augment traditional psychological counseling methods. This research primarily focuses on evaluating the LLM's ability to…

    Submitted 12 November, 2023; originally announced November 2023.

  19. arXiv:2311.02957  [pdf, other]

    cs.RO

    Safe and Efficient Trajectory Optimization for Autonomous Vehicles using B-spline with Incremental Path Flattening

    Authors: Jongseo Choi, Hyuntai Chin, Hyunwoo Park, Daehyeok Kwon, Sanghyun Lee, Doosan Baek

    Abstract: B-spline-based trajectory optimization is widely used for robot navigation due to its computational efficiency and convex-hull property (ensures dynamic feasibility), especially as quadrotors, which have circular body shapes (enable efficient movement) and freedom to move each axis (enables convex-hull property utilization). However, using the B-spline curve for trajectory optimization is challeng…

    Submitted 29 November, 2023; v1 submitted 6 November, 2023; originally announced November 2023.

    Comments: 14 pages, 21 figures, 4 tables, 3 algorithms

  20. arXiv:2310.12189  [pdf, other]

    cs.CV

    Mesh Represented Recycle Learning for 3D Hand Pose and Mesh Estimation

    Authors: Bosang Kim, Jonghyun Kim, Hyotae Lee, Lanying Jin, Jeongwon Ha, Dowoo Kwon, Jungpyo Kim, Wonhyeok Im, KyungMin Jin, Jungho Lee

    Abstract: In general, hand pose estimation aims to improve the robustness of model performance in the real-world scenes. However, it is difficult to enhance the robustness since existing datasets are obtained in restricted environments to annotate 3D information. Although neural networks quantitatively achieve a high estimation accuracy, unsatisfied results can be observed in visual quality. This discrepanc…

    Submitted 18 October, 2023; originally announced October 2023.

  21. arXiv:2309.01753  [pdf, other]

    math.OC cs.LG

    On Penalty Methods for Nonconvex Bilevel Optimization and First-Order Stochastic Approximation

    Authors: Jeongyeol Kwon, Dohyun Kwon, Stephen Wright, Robert Nowak

    Abstract: In this work, we study first-order algorithms for solving Bilevel Optimization (BO) where the objective functions are smooth but possibly nonconvex in both levels and the variables are restricted to closed convex sets. As a first step, we study the landscape of BO through the lens of penalty methods, in which the upper- and lower-level objectives are combined in a weighted sum with penalty paramet…

    Submitted 11 February, 2024; v1 submitted 4 September, 2023; originally announced September 2023.

    Comments: ICLR 2024

  22. arXiv:2308.10269  [pdf, other]

    cs.CV eess.IV

    Domain Reduction Strategy for Non Line of Sight Imaging

    Authors: Hyunbo Shim, In Cho, Daekyu Kwon, Seon Joo Kim

    Abstract: This paper presents a novel optimization-based method for non-line-of-sight (NLOS) imaging that aims to reconstruct hidden scenes under general setups with significantly reduced reconstruction time. In NLOS imaging, the visible surfaces of the target objects are notably sparse. To mitigate unnecessary computations arising from empty regions, we design our method to render the transients through pa…

    Submitted 3 August, 2024; v1 submitted 20 August, 2023; originally announced August 2023.

    Comments: 27 pages, 15 figures

  23. arXiv:2307.09988  [pdf, other]

    cs.LG cs.CV

    TinyTrain: Resource-Aware Task-Adaptive Sparse Training of DNNs at the Data-Scarce Edge

    Authors: Young D. Kwon, Rui Li, Stylianos I. Venieris, Jagmohan Chauhan, Nicholas D. Lane, Cecilia Mascolo

    Abstract: On-device training is essential for user personalisation and privacy. With the pervasiveness of IoT devices and microcontroller units (MCUs), this task becomes more challenging due to the constrained memory and compute resources, and the limited availability of labelled user data. Nonetheless, prior works neglect the data scarcity issue, require excessively long training time (e.g. a few hours), o…

    Submitted 10 June, 2024; v1 submitted 19 July, 2023; originally announced July 2023.

    Comments: Accepted by ICML 2024

  24. arXiv:2306.03361  [pdf, other]

    cs.CL cs.AI

    WHAT, WHEN, and HOW to Ground: Designing User Persona-Aware Conversational Agents for Engaging Dialogue

    Authors: Deuksin Kwon, Sunwoo Lee, Ki Hyun Kim, Seojin Lee, Taeyoon Kim, Eric Davis

    Abstract: This paper presents a method for building a personalized open-domain dialogue system to address the WWH (WHAT, WHEN, and HOW) problem for natural response generation in a commercial setting, where personalized dialogue responses are heavily interleaved with casual response turns. The proposed approach involves weighted dataset blending, negative persona information augmentation methods, and the de…

    Submitted 3 July, 2023; v1 submitted 5 June, 2023; originally announced June 2023.

    Comments: Accepted in ACL 2023 Industry Track

    MSC Class: I.2.1; I.2.7

  25. arXiv:2306.02420  [pdf, other]

    cs.LG cs.AI math.NA math.OC

    Complexity of Block Coordinate Descent with Proximal Regularization and Applications to Wasserstein CP-dictionary Learning

    Authors: Dohyun Kwon, Hanbaek Lyu

    Abstract: We consider the block coordinate descent methods of Gauss-Seidel type with proximal regularization (BCD-PR), which is a classical method of minimizing general nonconvex objectives under constraints that has a wide range of practical applications. We theoretically establish the worst-case complexity bound for this algorithm. Namely, we show that for general nonconvex smooth objectives with block-wi…

    Submitted 4 June, 2023; originally announced June 2023.

    Comments: Proceedings of the 40th International Conference on Machine Learning

  26. arXiv:2305.13259  [pdf, other]

    cs.CR cs.CY cs.DC

    Network Participation and Accessibility of Proof-of-Stake (PoS) Blockchains: A Cross-platform Comparative Analysis

    Authors: Jiseong Noh, Donghwan Kwon, Soohwan Cho, Neo C. K. Yiu

    Abstract: The comparative analysis examined eleven Proof-of-Stake (PoS) consensus-based blockchain networks to assess their openness based on five indicative metrics. These metrics include those of decentralization-related aspects, such as the number of validators and capital concentration, and participation-related aspects, including entry capital requirements and economic network stability. This is to ass…

    Submitted 22 May, 2023; originally announced May 2023.

    Comments: 8 pages, 8 tables and 5 figures

  27. arXiv:2305.13114  [pdf, other]

    cs.CY cs.HC

    Exploring User Perspectives on ChatGPT: Applications, Perceptions, and Implications for AI-Integrated Education

    Authors: Reza Hadi Mogavi, Chao Deng, Justin Juho Kim, Pengyuan Zhou, Young D. Kwon, Ahmed Hosny Saleh Metwally, Ahmed Tlili, Simone Bassanelli, Antonio Bucchiarone, Sujit Gujar, Lennart E. Nacke, Pan Hui

    Abstract: To foster the development of pedagogically potent and ethically sound AI-integrated learning landscapes, it is pivotal to critically explore the perceptions and experiences of the users immersed in these contexts. In this study, we perform a thorough qualitative content analysis across four key social media platforms. Our goal is to understand the user experience (UX) and views of early adopters o…

    Submitted 25 November, 2023; v1 submitted 22 May, 2023; originally announced May 2023.

    Comments: This is the authors' preprint version of the paper accepted by the Journal of Computers in Human Behavior: Artificial Humans (doi: https://doi.org/10.1016/j.chbah.2023.100027)

  28. arXiv:2305.01167  [pdf, other]

    cs.CV

    Hybrid model for Single-Stage Multi-Person Pose Estimation

    Authors: Jonghyun Kim, Bosang Kim, Hyotae Lee, Jungpyo Kim, Wonhyeok Im, Lanying Jin, Dowoo Kwon, Jungho Lee

    Abstract: In general, human pose estimation methods are categorized into two approaches according to their architectures: regression (i.e., heatmap-free) and heatmap-based methods. The former one directly estimates precise coordinates of each keypoint using convolutional and fully-connected layers. Although this approach is able to detect overlapped and dense keypoints, unexpected results can be obtained by…

    Submitted 18 June, 2023; v1 submitted 1 May, 2023; originally announced May 2023.

  29. arXiv:2304.07675  [pdf, other]

    cs.CV

    Multimodal Representation Learning of Cardiovascular Magnetic Resonance Imaging

    Authors: Jielin Qiu, Peide Huang, Makiya Nakashima, Jaehyun Lee, Jiacheng Zhu, Wilson Tang, Pohao Chen, Christopher Nguyen, Byung-Hak Kim, Debbie Kwon, Douglas Weber, Ding Zhao, David Chen

    Abstract: Self-supervised learning is crucial for clinical imaging applications, given the lack of explicit labels in healthcare. However, conventional approaches that rely on precise vision-language alignment are not always feasible in complex clinical imaging modalities, such as cardiac magnetic resonance (CMR). CMR provides a comprehensive visualization of cardiac anatomy, physiology, and microstructure,…

    Submitted 15 April, 2023; originally announced April 2023.

    Comments: 24 pages

  30. arXiv:2304.04027  [pdf, other]

    eess.IV cs.CV cs.LG

    NeBLa: Neural Beer-Lambert for 3D Reconstruction of Oral Structures from Panoramic Radiographs

    Authors: Sihwa Park, Seongjun Kim, Doeyoung Kwon, Yohan Jang, In-Seok Song, Seung Jun Baek

    Abstract: Panoramic radiography (Panoramic X-ray, PX) is a widely used imaging modality for dental examination. However, PX only provides a flattened 2D image, lacking in a 3D view of the oral structure. In this paper, we propose NeBLa (Neural Beer-Lambert) to estimate 3D oral structures from real-world PX. NeBLa tackles full 3D reconstruction for varying subjects (patients) where each reconstruction is bas…

    Submitted 6 February, 2024; v1 submitted 8 April, 2023; originally announced April 2023.

    Comments: 18 pages, 16 figures, Accepted to AAAI 2024

  31. arXiv:2301.10945  [pdf, other]

    math.OC cs.AI cs.LG

    A Fully First-Order Method for Stochastic Bilevel Optimization

    Authors: Jeongyeol Kwon, Dohyun Kwon, Stephen Wright, Robert Nowak

    Abstract: We consider stochastic unconstrained bilevel optimization problems when only the first-order gradient oracles are available. While numerous optimization methods have been proposed for tackling bilevel problems, existing methods either tend to require possibly expensive calculations regarding Hessians of lower-level objectives, or lack rigorous finite-time performance guarantees. In this work, we p…

    Submitted 26 January, 2023; originally announced January 2023.

  32. arXiv:2212.06359  [pdf, other]

    cs.LG cs.AI math.NA

    Score-based Generative Modeling Secretly Minimizes the Wasserstein Distance

    Authors: Dohyun Kwon, Ying Fan, Kangwook Lee

    Abstract: Score-based generative models are shown to achieve remarkable empirical performances in various applications such as image generation and audio synthesis. However, a theoretical understanding of score-based diffusion models is still incomplete. Recently, Song et al. showed that the training objective of score-based generative models is equivalent to minimizing the Kullback-Leibler divergence of th…

    Submitted 12 December, 2022; originally announced December 2022.

    Journal ref: 36th Conference on Neural Information Processing Systems (NeurIPS 2022)

  33. arXiv:2210.13582  [pdf, other]

    cs.SI

    Causal Analysis on the Anchor Store Effect in a Location-based Social Network

    Authors: Anish K. Vallapuram, Young D. Kwon, Lik-Hang Lee, Fengli Xu, Pan Hui

    Abstract: A particular phenomenon of interest in Retail Economics is the spillover effect of anchor stores (specific stores with a reputable brand) to non-anchor stores in terms of customer traffic. Prior works in this area rely on small and survey-based datasets that are often confidential or expensive to collect on a large scale. Also, very few works study the underlying causal mechanisms between factors…

    Submitted 24 October, 2022; originally announced October 2022.

    Journal ref: The 2022 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM)

  34. arXiv:2209.11697  [pdf, other]

    cs.CV cs.AI

    Edge-oriented Implicit Neural Representation with Channel Tuning

    Authors: Wonjoon Chang, Dahee Kwon, Bumjin Park

    Abstract: Implicit neural representation, which expresses an image as a continuous function rather than a discrete grid form, is widely used for image processing. Despite its outperforming results, there are still remaining limitations on restoring clear shapes of a given signal such as the edges of an image. In this paper, we propose Gradient Magnitude Adjustment algorithm which calculates the gradient of…

    Submitted 21 September, 2022; originally announced September 2022.

  35. arXiv:2206.04385  [pdf, other]

    cs.LG

    HideNseek: Federated Lottery Ticket via Server-side Pruning and Sign Supermask

    Authors: Anish K. Vallapuram, Pengyuan Zhou, Young D. Kwon, Lik Hang Lee, Hengwei Xu, Pan Hui

    Abstract: Federated learning alleviates the privacy risk in distributed learning by transmitting only the local model updates to the central server. However, it faces challenges including statistical heterogeneity of clients' datasets and resource constraints of client devices, which severely impact the training performance and user experience. Prior works have tackled these challenges by combining personal…

    Submitted 9 June, 2022; originally announced June 2022.

  36. arXiv:2205.12429  [pdf, other]

    eess.IV cs.CV

    Interaction of a priori Anatomic Knowledge with Self-Supervised Contrastive Learning in Cardiac Magnetic Resonance Imaging

    Authors: Makiya Nakashima, Inyeop Jang, Ramesh Basnet, Mitchel Benovoy, W. H. Wilson Tang, Christopher Nguyen, Deborah Kwon, Tae Hyun Hwang, David Chen

    Abstract: Training deep learning models on cardiac magnetic resonance imaging (CMR) can be a challenge due to the small amount of expert generated labels and inherent complexity of data source. Self-supervised contrastive learning (SSCL) has recently been shown to boost performance in several medical imaging tasks. However, it is unclear how much the pre-trained representation reflects the primary organ of…

    Submitted 24 May, 2022; originally announced May 2022.

    Comments: Under review at Machine Learning in Healthcare

  37. arXiv:2205.06491  [pdf, other]

    cs.LG

    Tighter Regret Analysis and Optimization of Online Federated Learning

    Authors: Dohyeok Kwon, Jonghwan Park, Songnam Hong

    Abstract: In federated learning (FL), it is commonly assumed that all data are placed at clients in the beginning of machine learning (ML) optimization (i.e., offline learning). However, in many real-world applications, it is expected to proceed in an online fashion. To this end, online FL (OFL) has been introduced, which aims at learning a sequence of global models from decentralized streaming data such th…

    Submitted 18 January, 2023; v1 submitted 13 May, 2022; originally announced May 2022.

    Comments: v3. Compared to the previous version, tighter regret analysis and parameter optimization have been included. v4. Add comments

  38. arXiv:2204.04541  [pdf, other]

    cs.CL

    KOBEST: Korean Balanced Evaluation of Significant Tasks

    Authors: Dohyeong Kim, Myeongjun Jang, Deuk Sin Kwon, Eric Davis

    Abstract: A well-formulated benchmark plays a critical role in spurring advancements in the natural language processing (NLP) field, as it allows objective and precise evaluation of diverse models. As modern language models (LMs) have become more elaborate and sophisticated, more difficult benchmarks that require linguistic knowledge and reasoning have been proposed. However, most of these benchmarks only s…

    Submitted 9 April, 2022; originally announced April 2022.

    Comments: 9 pages

  39. arXiv:2204.02078  [pdf, other]

    cs.CV

    Semi-supervised Semantic Segmentation with Error Localization Network

    Authors: Donghyeon Kwon, Suha Kwak

    Abstract: This paper studies semi-supervised learning of semantic segmentation, which assumes that only a small portion of training images are labeled and the others remain unlabeled. The unlabeled images are usually assigned pseudo labels to be used in training, which however often causes the risk of performance degradation due to the confirmation bias towards errors on the pseudo labels. We present a nove…

    Submitted 31 May, 2022; v1 submitted 5 April, 2022; originally announced April 2022.

  40. arXiv:2203.03794  [pdf, other]

    cs.LG

    YONO: Modeling Multiple Heterogeneous Neural Networks on Microcontrollers

    Authors: Young D. Kwon, Jagmohan Chauhan, Cecilia Mascolo

    Abstract: With the advancement of Deep Neural Networks (DNN) and large amounts of sensor data from Internet of Things (IoT) systems, the research community has worked to reduce the computational and resource demands of DNN to compute on low-resourced microcontrollers (MCUs). However, most of the current work in embedded deep learning focuses on solving a single task efficiently, while the multi-tasking natu…

    Submitted 7 March, 2022; originally announced March 2022.

    Comments: Accepted for publication at IPSN 2022

  41. arXiv:2202.10100  [pdf, other]

    cs.LG cs.AR

    Enabling On-Device Smartphone GPU based Training: Lessons Learned

    Authors: Anish Das, Young D. Kwon, Jagmohan Chauhan, Cecilia Mascolo

    Abstract: Deep Learning (DL) has shown impressive performance in many mobile applications. Most existing works have focused on reducing the computational and resource overheads of running Deep Neural Networks (DNN) inference on resource-constrained mobile devices. However, the other aspect of DNN operations, i.e. training (forward and backward passes) on smartphone GPUs, has received little attention thus f…

    Submitted 21 February, 2022; originally announced February 2022.

  42. arXiv:2112.02409  [pdf, other]

    cs.LG cs.AI

    Understanding Dynamic Spatio-Temporal Contexts in Long Short-Term Memory for Road Traffic Speed Prediction

    Authors: Won Kyung Lee, Deuk Sin Kwon, So Young Sohn

    Abstract: Reliable traffic flow prediction is crucial to creating intelligent transportation systems. Many big-data-based prediction approaches have been developed but they do not reflect complicated dynamic interactions between roads considering time and location. In this study, we propose a dynamically localised long short-term memory (LSTM) model that involves both spatial and temporal dependence between…

    Submitted 16 June, 2023; v1 submitted 4 December, 2021; originally announced December 2021.

    Comments: 10 pages, 2 tables, 4 figures, 2017 KDD Cup

  43. arXiv:2111.09425  [pdf, other]

    cs.NI eess.SY

    Quality-Aware Deep Reinforcement Learning for Streaming in Infrastructure-Assisted Connected Vehicles

    Authors: Won Joon Yun, Dohyun Kwon, Minseok Choi, Joongheon Kim, Guiseppe Caire, Andreas F. Molisch

    Abstract: This paper proposes a deep reinforcement learning-based video streaming scheme for mobility-aware vehicular networks, e.g., vehicles on the highway. We consider infrastructure-assisted and mmWave-based scenarios in which the macro base station (MBS) cannot directly provide the streaming service to vehicles due to the short range of mmWave beams so that small mmWave base stations (mBSs) along the r…

    Submitted 12 October, 2021; originally announced November 2021.

    Comments: 15 pages, 8 figures, Submitted to IEEE Transactions on Vehicular Technology

  44. arXiv:2110.14150  [pdf, other]

    cs.LG cs.CV math.NA

    Training Wasserstein GANs without gradient penalties

    Authors: Dohyun Kwon, Yeoneung Kim, Guido Montúfar, Insoon Yang

    Abstract: We propose a stable method to train Wasserstein generative adversarial networks. In order to enhance stability, we consider two objective functions using the $c$-transform based on Kantorovich duality which arises in the theory of optimal transport. We experimentally show that this algorithm can effectively enforce the Lipschitz constraint on the discriminator while other standard methods fail to…

    Submitted 26 October, 2021; originally announced October 2021.

  45. arXiv:2110.13290  [pdf]

    cs.LG cs.AI cs.HC cs.PF

    Exploring System Performance of Continual Learning for Mobile and Embedded Sensing Applications

    Authors: Young D. Kwon, Jagmohan Chauhan, Abhishek Kumar, Pan Hui, Cecilia Mascolo

    Abstract: Continual learning approaches help deep neural network models adapt and learn incrementally by trying to solve catastrophic forgetting. However, whether these existing approaches, applied traditionally to image-based tasks, work with the same efficacy to the sequential time series data generated by mobile or embedded sensing systems remains an unanswered question. To address this void, we conduc…

    Submitted 23 June, 2022; v1 submitted 25 October, 2021; originally announced October 2021.

    Comments: Accepted for publication at SEC 2021

  46. arXiv:2109.12370  [pdf, other]

    cs.SI

    Interpretable Business Survival Prediction

    Authors: Anish K. Vallapuram, Nikhil Nanda, Young D. Kwon, Pan Hui

    Abstract: The survival of a business is undeniably pertinent to its success. A key factor contributing to its continuity depends on its customers. The surge of location-based social networks such as Yelp, Dianping, and Foursquare has paved the way for leveraging user-generated content on these platforms to predict business survival. Prior works in this area have developed several quantitative features to ca…

    Submitted 25 September, 2021; originally announced September 2021.

    Comments: 8 pages, 10 figures

  47. arXiv:2108.09954  [pdf]

    cs.ET cs.AI physics.app-ph

    Pulse-Width Modulation Neuron Implemented by Single Positive-Feedback Device

    Authors: Sung Yun Woo, Dongseok Kwon, Byung-Gook Park, Jong-Ho Lee, Jong-Ho Bae

    Abstract: Positive-feedback (PF) device and its operation scheme to implement pulse width modulation (PWM) function was proposed and demonstrated, and the device operation mechanism for implementing PWM function was analyzed. By adjusting the amount of the charge stored in the n- floating body (Qn), the potential of the floating body linearly changes with time. When Qn reaches to a threshold value (Qth), th…

    Submitted 23 August, 2021; originally announced August 2021.

  48. arXiv:2108.06665  [pdf, other]

    cs.CL

    Accurate, yet inconsistent? Consistency Analysis on Language Understanding Models

    Authors: Myeongjun Jang, Deuk Sin Kwon, Thomas Lukasiewicz

    Abstract: Consistency, which refers to the capability of generating the same predictions for semantically similar contexts, is a highly desirable property for a sound language understanding model. Although recent pretrained language models (PLMs) deliver outstanding performance in various downstream tasks, they should exhibit consistent behaviour provided the models truly understand language. In this paper,…

    Submitted 15 August, 2021; originally announced August 2021.

  49. arXiv:2106.07268  [pdf, other]

    cs.SD cs.LG eess.AS

    FastICARL: Fast Incremental Classifier and Representation Learning with Efficient Budget Allocation in Audio Sensing Applications

    Authors: Young D. Kwon, Jagmohan Chauhan, Cecilia Mascolo

    Abstract: Various incremental learning (IL) approaches have been proposed to help deep learning models learn new tasks/classes continuously without forgetting what was learned previously (i.e., avoid catastrophic forgetting). With the growing number of deployed audio sensing applications that need to dynamically incorporate new tasks and changing input distribution from users, the ability of IL on-device be…

    Submitted 24 June, 2021; v1 submitted 14 June, 2021; originally announced June 2021.

    Comments: Accepted for publication at INTERSPEECH 2021

  50. arXiv:2106.05872  [pdf, other]

    cs.LG

    Knowing when we do not know: Bayesian continual learning for sensing-based analysis tasks

    Authors: Sandra Servia-Rodriguez, Cecilia Mascolo, Young D. Kwon

    Abstract: Despite much research targeted at enabling conventional machine learning models to continually learn tasks and data distributions sequentially without forgetting the knowledge acquired, little effort has been devoted to account for more realistic situations where learning some tasks accurately might be more critical than forgetting previous ones. In this paper we propose a Bayesian inference based…

    Submitted 6 June, 2021; originally announced June 2021.