
Showing 1–50 of 77 results for author: Jeong, M

Searching in archive cs.
  1. arXiv:2407.14733  [pdf, other]

    cs.LG cs.AI cs.CL

    Hard Prompts Made Interpretable: Sparse Entropy Regularization for Prompt Tuning with RL

    Authors: Yunseon Choi, Sangmin Bae, Seonghyun Ban, Minchan Jeong, Chuheng Zhang, Lei Song, Li Zhao, Jiang Bian, Kee-Eung Kim

    Abstract: With the advent of foundation models, prompt tuning has positioned itself as an important technique for directing model behaviors and eliciting desired responses. Prompt tuning regards selecting appropriate keywords included into the input, thereby adapting to the downstream task without adjusting or fine-tuning the model parameters. There is a wide range of work in prompt tuning, from approaches…

    Submitted 19 July, 2024; originally announced July 2024.

  2. arXiv:2407.09014  [pdf, other]

    cs.CL

    CompAct: Compressing Retrieved Documents Actively for Question Answering

    Authors: Chanwoong Yoon, Taewhoo Lee, Hyeon Hwang, Minbyul Jeong, Jaewoo Kang

    Abstract: Retrieval-augmented generation supports language models to strengthen their factual groundings by providing external contexts. However, language models often face challenges when given extensive information, diminishing their effectiveness in solving questions. Context compression tackles this issue by filtering out irrelevant information, but current methods still struggle in realistic scenarios…

    Submitted 15 July, 2024; v1 submitted 12 July, 2024; originally announced July 2024.

    Comments: Code available at https://github.com/dmis-lab/CompAct

  3. arXiv:2407.00693  [pdf, other]

    cs.AI cs.CL cs.LG

    BAPO: Base-Anchored Preference Optimization for Personalized Alignment in Large Language Models

    Authors: Gihun Lee, Minchan Jeong, Yujin Kim, Hojung Jung, Jaehoon Oh, Sangmook Kim, Se-Young Yun

    Abstract: While learning to align Large Language Models (LLMs) with human preferences has shown remarkable success, aligning these models to meet the diverse user preferences presents further challenges in preserving previous knowledge. This paper examines the impact of personalized preference optimization on LLMs, revealing that the extent of knowledge loss varies significantly with preference heterogeneit…

    Submitted 30 June, 2024; originally announced July 2024.

    Comments: under review

  4. arXiv:2406.05965  [pdf, other]

    eess.AS cs.AI

    MakeSinger: A Semi-Supervised Training Method for Data-Efficient Singing Voice Synthesis via Classifier-free Diffusion Guidance

    Authors: Semin Kim, Myeonghun Jeong, Hyeonseung Lee, Minchan Kim, Byoung Jin Choi, Nam Soo Kim

    Abstract: In this paper, we propose MakeSinger, a semi-supervised training method for singing voice synthesis (SVS) via classifier-free diffusion guidance. The challenge in SVS lies in the costly process of gathering aligned sets of text, pitch, and audio data. MakeSinger enables the training of the diffusion-based SVS model from any speech and singing voice data regardless of its labeling, thereby enhancin…

    Submitted 9 June, 2024; originally announced June 2024.

    Comments: Accepted to Interspeech 2024

  5. arXiv:2406.02355  [pdf, other]

    cs.CV cs.AI cs.DC cs.LG

    FedDr+: Stabilizing Dot-regression with Global Feature Distillation for Federated Learning

    Authors: Seongyoon Kim, Minchan Jeong, Sungnyun Kim, Sungwoo Cho, Sumyeong Ahn, Se-Young Yun

    Abstract: Federated Learning (FL) has emerged as a pivotal framework for the development of effective global models (global FL) or personalized models (personalized FL) across clients with heterogeneous, non-iid data distribution. A key challenge in FL is client drift, where data heterogeneity impedes the aggregation of scattered knowledge. Recent studies have tackled the client drift issue by identifying s…

    Submitted 4 June, 2024; originally announced June 2024.

  6. arXiv:2405.17657  [pdf]

    cs.RO

    Robust Perception and Navigation of Autonomous Surface Vehicles in Challenging Environments

    Authors: Mingi Jeong

    Abstract: Research on coastal regions traditionally involves methods like manual sampling, monitoring buoys, and remote sensing, but these methods face challenges in spatially and temporally diverse regions of interest. Autonomous surface vehicles (ASVs) with artificial intelligence (AI) are being explored, and recognized by the International Maritime Organization (IMO) as vital for future ecosystem underst…

    Submitted 27 May, 2024; originally announced May 2024.

    Comments: rss pioneer 2024

  7. arXiv:2405.12701  [pdf, other]

    cs.CL cs.AI

    OLAPH: Improving Factuality in Biomedical Long-form Question Answering

    Authors: Minbyul Jeong, Hyeon Hwang, Chanwoong Yoon, Taewhoo Lee, Jaewoo Kang

    Abstract: In the medical domain, numerous scenarios necessitate the long-form generation ability of large language models (LLMs). Specifically, when addressing patients' questions, it is essential that the model's response conveys factual claims, highlighting the need for an automated method to evaluate those claims. Thus, we introduce MedLFQA, a benchmark dataset reconstructed using long-form question-answ…

    Submitted 21 May, 2024; originally announced May 2024.

  8. arXiv:2405.05823  [pdf, ps, other]

    cs.IT

    On the Secrecy Capacity of 1-2-1 Atomic Networks

    Authors: Mohammad Milanian, Minoh Jeong, Martina Cardone

    Abstract: We consider the problem of secure communication over a noiseless 1-2-1 network, an abstract model introduced to capture the directivity characteristic of mmWave communications. We focus on structured networks, which we refer to as 1-2-1 atomic networks. Broadly speaking, these are characterized by a source, a destination, and three layers of intermediate nodes with sparse connections. The goal is…

    Submitted 9 May, 2024; originally announced May 2024.

    Comments: Accepted to ISIT 2024

  9. arXiv:2404.18411  [pdf, other]

    cs.RO cs.CV

    Multi-modal Perception Dataset of In-water Objects for Autonomous Surface Vehicles

    Authors: Mingi Jeong, Arihant Chadda, Ziang Ren, Luyang Zhao, Haowen Liu, Monika Roznere, Aiwei Zhang, Yitao Jiang, Sabriel Achong, Samuel Lensgraf, Alberto Quattrini Li

    Abstract: This paper introduces the first publicly accessible multi-modal perception dataset for autonomous maritime navigation, focusing on in-water obstacles within the aquatic environment to enhance situational awareness for Autonomous Surface Vehicles (ASVs). This dataset, consisting of diverse objects encountered under varying environmental conditions, aims to bridge the research gap in marine robotics…

    Submitted 29 April, 2024; originally announced April 2024.

    Comments: Accepted to the IEEE ICRA Workshop on Field Robotics 2024

  10. arXiv:2404.04366  [pdf, other]

    cs.IT eess.SP

    A Comprehensive Study on Ziv-Zakai Lower Bounds on the MMSE

    Authors: Minoh Jeong, Alex Dytso, Martina Cardone

    Abstract: This paper explores Bayesian lower bounds on the minimum mean squared error (MMSE) that belong to the Ziv-Zakai (ZZ) family. The ZZ technique relies on connecting the bound to an M-ary hypothesis testing problem. Three versions of the ZZ bound (ZZB) exist: the first relies on the so-called valley-filling function (VFF), the second omits the VFF, and the third, i.e., the single-point ZZB (SZZB), us…

    Submitted 5 April, 2024; originally announced April 2024.

    Comments: Comments are welcome. arXiv admin note: substantial text overlap with arXiv:2305.02970

  11. arXiv:2404.01954  [pdf, other]

    cs.CL cs.AI

    HyperCLOVA X Technical Report

    Authors: Kang Min Yoo, Jaegeun Han, Sookyo In, Heewon Jeon, Jisu Jeong, Jaewook Kang, Hyunwook Kim, Kyung-Min Kim, Munhyong Kim, Sungju Kim, Donghyun Kwak, Hanock Kwak, Se Jung Kwon, Bado Lee, Dongsoo Lee, Gichang Lee, Jooho Lee, Baeseong Park, Seongjin Shin, Joonsang Yu, Seolki Baek, Sumin Byeon, Eungsup Cho, Dooseok Choe, Jeesung Han, et al. (371 additional authors not shown)

    Abstract: We introduce HyperCLOVA X, a family of large language models (LLMs) tailored to the Korean language and culture, along with competitive capabilities in English, math, and coding. HyperCLOVA X was trained on a balanced mix of Korean, English, and code data, followed by instruction-tuning with high-quality human-annotated datasets while abiding by strict safety guidelines reflecting our commitment t…

    Submitted 13 April, 2024; v1 submitted 2 April, 2024; originally announced April 2024.

    Comments: 44 pages; updated authors list and fixed author names

  12. A Magnetic Millirobot Walks on Slippery Biological Surfaces for Targeted Cargo Delivery

    Authors: Moonkwang Jeong, Xiangzhou Tan, Felix Fischer, Tian Qiu

    Abstract: Small-scale robots hold great potential for targeted cargo delivery in minimally-invasive medicine. However, current robots often face challenges to locomote efficiently on slippery biological tissue surfaces, especially when loaded with heavy cargos. Here, we report a magnetic millirobot that can walk on rough and slippery biological tissues by anchoring itself on the soft tissue surface altern…

    Submitted 7 March, 2024; originally announced March 2024.

    Comments: 15 pages

    ACM Class: J.3

  13. arXiv:2403.02917  [pdf]

    cs.RO physics.bio-ph

    A Miniaturized Device for Ultrafast On-demand Drug Release based on a Gigahertz Ultrasonic Resonator

    Authors: Yangchao Zhou, Moonkwang Jeong, Meng Zhang, Xuexin Duan, Tian Qiu

    Abstract: On-demand controlled drug delivery is essential for the treatment of a wide range of chronic diseases. As the drug is released at the time when required, its efficacy is boosted and the side effects are minimized. However, so far, drug delivery devices often rely on the passive diffusion process for a sustained release, which is slow and uncontrollable. Here, we present a miniaturized microfluidic…

    Submitted 5 March, 2024; originally announced March 2024.

    Comments: 19 pages, 6 figures, 1 table

    MSC Class: J.3

    Journal ref: © 2024 The Authors. Advanced Engineering Materials published by Wiley-VCH GmbH

  14. arXiv:2402.14714  [pdf, other]

    cs.CL cs.AI

    Efficient and Effective Vocabulary Expansion Towards Multilingual Large Language Models

    Authors: Seungduk Kim, Seungtaek Choi, Myeongho Jeong

    Abstract: This report introduces EEVE-Korean-v1.0, a Korean adaptation of large language models that exhibit remarkable capabilities across English and Korean text understanding. Building on recent highly capable but English-centric LLMs, such as SOLAR-10.7B and Phi-2, where non-English texts are inefficiently processed with English-centric tokenizers, we present an efficient and effective vocabula…

    Submitted 22 February, 2024; originally announced February 2024.

  15. Bayesian Multi-Task Transfer Learning for Soft Prompt Tuning

    Authors: Haeju Lee, Minchan Jeong, Se-Young Yun, Kee-Eung Kim

    Abstract: Prompt tuning, in which prompts are optimized to adapt large-scale pre-trained language models to downstream tasks instead of fine-tuning the full model parameters, has been shown to be particularly effective when the prompts are trained in a multi-task transfer learning setting. These methods generally involve individually training prompts for each source task and then aggregating them to provide…

    Submitted 13 February, 2024; originally announced February 2024.

    Comments: The first two authors equally contributed to this work. Findings of EMNLP 2023

  16. arXiv:2401.15500  [pdf, ps, other]

    cs.LG cs.IT stat.ML

    Data-Driven Estimation of the False Positive Rate of the Bayes Binary Classifier via Soft Labels

    Authors: Minoh Jeong, Martina Cardone, Alex Dytso

    Abstract: Classification is a fundamental task in many applications on which data-driven methods have shown outstanding performances. However, it is challenging to determine whether such methods have achieved the optimal performance. This is mainly because the best achievable performance is typically unknown and hence, effectively estimating it is of prime importance. In this paper, we consider binary class…

    Submitted 27 January, 2024; originally announced January 2024.

  17. arXiv:2401.15269  [pdf, other]

    cs.CL cs.AI cs.IR

    Improving Medical Reasoning through Retrieval and Self-Reflection with Retrieval-Augmented Large Language Models

    Authors: Minbyul Jeong, Jiwoong Sohn, Mujeen Sung, Jaewoo Kang

    Abstract: Recent proprietary large language models (LLMs), such as GPT-4, have achieved a milestone in tackling diverse challenges in the biomedical domain, ranging from multiple-choice questions to long-form generations. To address challenges that still cannot be handled with the encoded knowledge of LLMs, various retrieval-augmented generation (RAG) methods have been developed by searching documents from…

    Submitted 17 June, 2024; v1 submitted 26 January, 2024; originally announced January 2024.

    Comments: ISMB 2024

  18. arXiv:2401.01498  [pdf, other]

    eess.AS cs.CL cs.LG cs.SD

    Utilizing Neural Transducers for Two-Stage Text-to-Speech via Semantic Token Prediction

    Authors: Minchan Kim, Myeonghun Jeong, Byoung Jin Choi, Semin Kim, Joun Yeop Lee, Nam Soo Kim

    Abstract: We propose a novel text-to-speech (TTS) framework centered around a neural transducer. Our approach divides the whole TTS pipeline into semantic-level sequence-to-sequence (seq2seq) modeling and fine-grained acoustic modeling stages, utilizing discrete semantic tokens obtained from wav2vec2.0 embeddings. For a robust and efficient alignment modeling, we employ a neural transducer named token trans…

    Submitted 2 January, 2024; originally announced January 2024.

    Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

  19. arXiv:2401.01099  [pdf, other]

    eess.AS cs.AI cs.LG

    Efficient Parallel Audio Generation using Group Masked Language Modeling

    Authors: Myeonghun Jeong, Minchan Kim, Joun Yeop Lee, Nam Soo Kim

    Abstract: We present a fast and high-quality codec language model for parallel audio generation. While SoundStorm, a state-of-the-art parallel audio generation model, accelerates inference speed compared to autoregressive models, it still suffers from slow inference due to iterative sampling. To resolve this problem, we propose Group-Masked Language Modeling (G-MLM) and Group Iterative Parallel Decoding (G-…

    Submitted 2 January, 2024; originally announced January 2024.

    Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

  20. arXiv:2311.02898  [pdf, other]

    eess.AS cs.LG

    Transduce and Speak: Neural Transducer for Text-to-Speech with Semantic Token Prediction

    Authors: Minchan Kim, Myeonghun Jeong, Byoung Jin Choi, Dongjune Lee, Nam Soo Kim

    Abstract: We introduce a text-to-speech (TTS) framework based on a neural transducer. We use discretized semantic tokens acquired from wav2vec2.0 embeddings, which makes it easy to adopt a neural transducer for the TTS framework enjoying its monotonic alignment constraints. The proposed model first generates aligned semantic tokens using the neural transducer, then synthesizes a speech sample from the semant…

    Submitted 8 November, 2023; v1 submitted 6 November, 2023; originally announced November 2023.

    Comments: Accepted at ASRU2023

  21. arXiv:2308.12532  [pdf, other]

    cs.LG cs.AI cs.CV

    FedSOL: Stabilized Orthogonal Learning with Proximal Restrictions in Federated Learning

    Authors: Gihun Lee, Minchan Jeong, Sangmook Kim, Jaehoon Oh, Se-Young Yun

    Abstract: Federated Learning (FL) aggregates locally trained models from individual clients to construct a global model. While FL enables learning a model with data privacy, it often suffers from significant performance degradation when clients have heterogeneous data distributions. This data heterogeneity causes the model to forget the global knowledge acquired from previously sampled clients after being t…

    Submitted 28 March, 2024; v1 submitted 23 August, 2023; originally announced August 2023.

    Comments: The IEEE/CVF Conference on Computer Vision and Pattern Recognition 2024 (CVPR 2024)

  22. arXiv:2307.04427  [pdf, other]

    astro-ph.HE astro-ph.GA cs.LG

    Observation of high-energy neutrinos from the Galactic plane

    Authors: R. Abbasi, M. Ackermann, J. Adams, J. A. Aguilar, M. Ahlers, M. Ahrens, J. M. Alameddine, A. A. Alves Jr., N. M. Amin, K. Andeen, T. Anderson, G. Anton, C. Argüelles, Y. Ashida, S. Athanasiadou, S. Axani, X. Bai, A. Balagopal V., S. W. Barwick, V. Basu, S. Baur, R. Bay, J. J. Beatty, K. -H. Becker, J. Becker Tjus, et al. (364 additional authors not shown)

    Abstract: The origin of high-energy cosmic rays, atomic nuclei that continuously impact Earth's atmosphere, has been a mystery for over a century. Due to deflection in interstellar magnetic fields, cosmic rays from the Milky Way arrive at Earth from random directions. However, near their sources and during propagation, cosmic rays interact with matter and produce high-energy neutrinos. We search for neutrin…

    Submitted 10 July, 2023; originally announced July 2023.

    Comments: Submitted on May 12th, 2022; Accepted on May 4th, 2023

    Journal ref: Science 380, 6652, 1338-1343 (2023)

  23. arXiv:2306.04990  [pdf, other]

    cs.CV

    Multi-Architecture Multi-Expert Diffusion Models

    Authors: Yunsung Lee, Jin-Young Kim, Hyojun Go, Myeongho Jeong, Shinhyeok Oh, Seungtaek Choi

    Abstract: In this paper, we address the performance degradation of efficient diffusion models by introducing Multi-architecturE Multi-Expert diffusion models (MEME). We identify the need for tailored operations at different time-steps in diffusion processes and leverage this insight to create compact yet high-performing models. MEME assigns distinct architectures to different time-step intervals, balancing…

    Submitted 27 December, 2023; v1 submitted 8 June, 2023; originally announced June 2023.

    Comments: To be published in the AAAI 2024 Proceedings Main Track

  24. arXiv:2305.19051  [pdf, other]

    eess.AS cs.AI cs.SD

    Towards single integrated spoofing-aware speaker verification embeddings

    Authors: Sung Hwan Mun, Hye-jin Shim, Hemlata Tak, Xin Wang, Xuechen Liu, Md Sahidullah, Myeonghun Jeong, Min Hyun Han, Massimiliano Todisco, Kong Aik Lee, Junichi Yamagishi, Nicholas Evans, Tomi Kinnunen, Nam Soo Kim, Jee-weon Jung

    Abstract: This study aims to develop a single integrated spoofing-aware speaker verification (SASV) embeddings that satisfy two aspects. First, rejecting non-target speakers' input as well as target speakers' spoofed inputs should be addressed. Second, competitive performance should be demonstrated compared to the fusion of automatic speaker verification (ASV) and countermeasure (CM) embeddings, which outpe…

    Submitted 1 June, 2023; v1 submitted 30 May, 2023; originally announced May 2023.

    Comments: Accepted by INTERSPEECH 2023. Code and models are available in https://github.com/sasv-challenge/ASVSpoof5-SASVBaseline

  25. arXiv:2305.18977  [pdf, other]

    cs.CL

    Cross Encoding as Augmentation: Towards Effective Educational Text Classification

    Authors: Hyun Seung Lee, Seungtaek Choi, Yunsung Lee, Hyeongdon Moon, Shinhyeok Oh, Myeongho Jeong, Hyojun Go, Christian Wallraven

    Abstract: Text classification in education, usually called auto-tagging, is the automated process of assigning relevant tags to educational content, such as questions and textbooks. However, auto-tagging suffers from a data scarcity problem, which stems from two major challenges: 1) it possesses a large tag space and 2) it is multi-label. Though a retrieval approach is reportedly good at low-resource scenar…

    Submitted 30 May, 2023; v1 submitted 30 May, 2023; originally announced May 2023.

    Comments: Accepted to Findings of ACL2023

  26. arXiv:2305.16626  [pdf, other]

    cs.CL cs.AI

    Evaluation of Question Generation Needs More References

    Authors: Shinhyeok Oh, Hyojun Go, Hyeongdon Moon, Yunsung Lee, Myeongho Jeong, Hyun Seung Lee, Seungtaek Choi

    Abstract: Question generation (QG) is the task of generating a valid and fluent question based on a given context and the target answer. According to various purposes, even given the same context, instructors can ask questions about different concepts, and even the same concept can be written in different ways. However, the evaluation for QG usually depends on single reference-based similarity metrics, such…

    Submitted 26 May, 2023; originally announced May 2023.

    Comments: Accepted to Findings of ACL2023

    ACM Class: I.2.7

  27. arXiv:2305.02970  [pdf, ps, other]

    cs.IT eess.SP math.ST

    Functional Properties of the Ziv-Zakai bound with Arbitrary Inputs

    Authors: Minoh Jeong, Alex Dytso, Martina Cardone

    Abstract: This paper explores the Ziv-Zakai bound (ZZB), which is a well-known Bayesian lower bound on the Minimum Mean Squared Error (MMSE). First, it is shown that the ZZB holds without any assumption on the distribution of the estimand, that is, the estimand does not necessarily need to have a probability density function. The ZZB is then further analyzed in the high-noise and low-noise regimes and shown…

    Submitted 4 May, 2023; originally announced May 2023.

    Comments: Full version of "Functional Properties of the Ziv-Zakai bound with Arbitrary Inputs" in 2023 IEEE International Symposium on Information Theory (ISIT)

  28. arXiv:2303.03103  [pdf, other]

    cs.CL cs.AI

    Towards Zero-Shot Functional Compositionality of Language Models

    Authors: Hangyeol Yu, Myeongho Jeong, Jamin Shin, Hyeongdon Moon, Juneyoung Park, Seungtaek Choi

    Abstract: Large Pre-trained Language Models (PLM) have become the most desirable starting point in the field of NLP, as they have become remarkably good at solving many individual tasks. Despite such success, in this paper, we argue that current paradigms of working with PLMs are neglecting a critical aspect of modeling human intelligence: functional compositionality. Functional compositionality - the abili…

    Submitted 6 March, 2023; originally announced March 2023.

  29. arXiv:2303.01768  [pdf, other]

    cs.LG cs.MA

    Toward Risk-based Optimistic Exploration for Cooperative Multi-Agent Reinforcement Learning

    Authors: Jihwan Oh, Joonkee Kim, Minchan Jeong, Se-Young Yun

    Abstract: The multi-agent setting is intricate and unpredictable since the behaviors of multiple agents influence one another. To address this environmental uncertainty, distributional reinforcement learning algorithms that incorporate uncertainty via distributional output have been integrated with multi-agent reinforcement learning (MARL) methods, achieving state-of-the-art performance. However, distributi…

    Submitted 3 March, 2023; originally announced March 2023.

    Comments: AAMAS2023 camera-ready version. First two authors contributed equally

  30. arXiv:2302.01530  [pdf, other]

    cs.CL cs.AI cs.LG

    Revisiting Intermediate Layer Distillation for Compressing Language Models: An Overfitting Perspective

    Authors: Jongwoo Ko, Seungjoon Park, Minchan Jeong, Sukjin Hong, Euijai Ahn, Du-Seong Chang, Se-Young Yun

    Abstract: Knowledge distillation (KD) is a highly promising method for mitigating the computational problems of pre-trained language models (PLMs). Among various KD approaches, Intermediate Layer Distillation (ILD) has been a de facto standard KD method with its performance efficacy in the NLP field. In this paper, we find that existing ILD methods are prone to overfitting to training datasets, although the…

    Submitted 2 February, 2023; originally announced February 2023.

    Comments: The 17th Conference of the European Chapter of the Association for Computational Linguistics (Findings)

  31. arXiv:2212.05973  [pdf, other]

    cs.CV

    Towards Practical Plug-and-Play Diffusion Models

    Authors: Hyojun Go, Yunsung Lee, Jin-Young Kim, Seunghyun Lee, Myeongho Jeong, Hyun Seung Lee, Seungtaek Choi

    Abstract: Diffusion-based generative models have achieved remarkable success in image generation. Their guidance formulation allows an external model to plug-and-play control the generation process for various tasks without finetuning the diffusion model. However, the direct use of publicly available off-the-shelf models for guidance fails due to their poor performance on noisy inputs. For that, the existin…

    Submitted 27 March, 2023; v1 submitted 12 December, 2022; originally announced December 2022.

    Comments: CVPR 2023 camera-ready

  32. SNAC: Speaker-normalized affine coupling layer in flow-based architecture for zero-shot multi-speaker text-to-speech

    Authors: Byoung Jin Choi, Myeonghun Jeong, Joun Yeop Lee, Nam Soo Kim

    Abstract: Zero-shot multi-speaker text-to-speech (ZSM-TTS) models aim to generate a speech sample with the voice characteristic of an unseen speaker. The main challenge of ZSM-TTS is to increase the overall speaker similarity for unseen speakers. One of the most successful speaker conditioning methods for flow-based multi-speaker text-to-speech (TTS) models is to utilize the functions which predict the scal…

    Submitted 30 November, 2022; originally announced November 2022.

    Comments: Accepted to IEEE Signal Processing Letters

  33. arXiv:2211.11938  [pdf, other]

    cs.CV

    Supervised Contrastive Learning on Blended Images for Long-tailed Recognition

    Authors: Minki Jeong, Changick Kim

    Abstract: Real-world data often have a long-tailed distribution, where the number of samples per class is not equal over training classes. The imbalanced data form a biased feature space, which deteriorates the performance of the recognition model. In this paper, we propose a novel long-tailed recognition method to balance the latent feature space. First, we introduce a MixUp-based data augmentation techniq…

    Submitted 21 November, 2022; originally announced November 2022.

  34. Evaluating the Knowledge Dependency of Questions

    Authors: Hyeongdon Moon, Yoonseok Yang, Jamin Shin, Hangyeol Yu, Seunghyun Lee, Myeongho Jeong, Juneyoung Park, Minsam Kim, Seungtaek Choi

    Abstract: The automatic generation of Multiple Choice Questions (MCQ) has the potential to reduce the time educators spend on student assessment significantly. However, existing evaluation metrics for MCQ generation, such as BLEU, ROUGE, and METEOR, focus on the n-gram based similarity of the generated MCQ to the gold sample in the dataset and disregard their educational value. They fail to evaluate the MCQ…

    Submitted 21 November, 2022; originally announced November 2022.

    Comments: EMNLP 2022 (Main, Long)

    Journal ref: https://aclanthology.org/2022.emnlp-main.718

  35. arXiv:2210.12949  [pdf, other]

    cs.CL cs.AI cs.LG

    Enhancing Label Consistency on Document-level Named Entity Recognition

    Authors: Minbyul Jeong, Jaewoo Kang

    Abstract: Named entity recognition (NER) is a fundamental part of extracting information from documents in biomedical applications. A notable advantage of NER is its consistency in extracting biomedical entities in a document context. Although existing document NER models show consistent predictions, they still do not meet our expectations. We investigated whether the adjectives and prepositions within an e…

    Submitted 24 October, 2022; originally announced October 2022.

  36. arXiv:2210.05979  [pdf, other]

    eess.AS cs.SD

    Adversarial Speaker-Consistency Learning Using Untranscribed Speech Data for Zero-Shot Multi-Speaker Text-to-Speech

    Authors: Byoung Jin Choi, Myeonghun Jeong, Minchan Kim, Sung Hwan Mun, Nam Soo Kim

    Abstract: Several recently proposed text-to-speech (TTS) models achieved to generate the speech samples with the human-level quality in the single-speaker and multi-speaker TTS scenarios with a set of pre-defined speakers. However, synthesizing a new speaker's voice with a single reference audio, commonly known as zero-shot multi-speaker text-to-speech (ZSM-TTS), is still a very challenging task. The main c…

    Submitted 22 November, 2022; v1 submitted 12 October, 2022; originally announced October 2022.

    Comments: APSIPA 2022

  37. arXiv:2209.03042  [pdf, other]

    hep-ex astro-ph.IM cs.LG physics.data-an physics.ins-det

    Graph Neural Networks for Low-Energy Event Classification & Reconstruction in IceCube

    Authors: R. Abbasi, M. Ackermann, J. Adams, N. Aggarwal, J. A. Aguilar, M. Ahlers, M. Ahrens, J. M. Alameddine, A. A. Alves Jr., N. M. Amin, K. Andeen, T. Anderson, G. Anton, C. Argüelles, Y. Ashida, S. Athanasiadou, S. Axani, X. Bai, A. Balagopal V., M. Baricevic, S. W. Barwick, V. Basu, R. Bay, J. J. Beatty, K. -H. Becker, et al. (359 additional authors not shown)

    Abstract: IceCube, a cubic-kilometer array of optical sensors built to detect atmospheric and astrophysical neutrinos between 1 GeV and 1 PeV, is deployed 1.45 km to 2.45 km below the surface of the ice sheet at the South Pole. The classification and reconstruction of events from the in-ice detectors play a central role in the analysis of data from IceCube. Reconstructing and classifying events is a challen…

    Submitted 11 October, 2022; v1 submitted 7 September, 2022; originally announced September 2022.

    Comments: Prepared for submission to JINST

  38. arXiv:2207.07862  [pdf, other]

    cs.AR cs.DC cs.NE

    MAC-DO: An Efficient Output-Stationary GEMM Accelerator for CNNs Using DRAM Technology

    Authors: Minki Jeong, Wanyeong Jung

    Abstract: DRAM-based in-situ accelerators have shown their potential in addressing the memory wall challenge of the traditional von Neumann architecture. Such accelerators exploit charge sharing or logic circuits for simple logic operations at the DRAM subarray level. However, their throughput is limited due to low array utilization, as only a few row cells in a DRAM array participate in operations while mo…

    Submitted 7 February, 2024; v1 submitted 16 July, 2022; originally announced July 2022.

    Comments: 13 pages, 21 figures

  39. Transfer Learning Framework for Low-Resource Text-to-Speech using a Large-Scale Unlabeled Speech Corpus

    Authors: Minchan Kim, Myeonghun Jeong, Byoung Jin Choi, Sunghwan Ahn, Joun Yeop Lee, Nam Soo Kim

    Abstract: Training a text-to-speech (TTS) model requires a large scale text labeled speech corpus, which is troublesome to collect. In this paper, we propose a transfer learning framework for TTS that utilizes a large amount of unlabeled speech dataset for pre-training. By leveraging wav2vec2.0 representation, unlabeled speech can highly improve performance, especially in the lack of labeled speech. We also…

    Submitted 6 October, 2022; v1 submitted 29 March, 2022; originally announced March 2022.

    Comments: Accepted by Interspeech2022

  40. Captivate! Contextual Language Guidance for Parent-Child Interaction

    Authors: Taeahn Kwon, Minkyung Jeong, Eon-Suk Ko, Youngki Lee

    Abstract: To acquire language, children need rich language input. However, many parents find it difficult to provide children with sufficient language input, which risks delaying their language development. To aid these parents, we design Captivate!, the first system that provides contextual language guidance to parents during play. Our system tracks both visual and spoken language cues to infer targets of…

    Submitted 14 February, 2022; originally announced February 2022.

    Comments: Published as a conference paper at CHI 2022

  41. arXiv:2201.10168  [pdf, other]

    cs.CV

    Explore-And-Match: Bridging Proposal-Based and Proposal-Free With Transformer for Sentence Grounding in Videos

    Authors: Sangmin Woo, Jinyoung Park, Inyong Koo, Sumin Lee, Minki Jeong, Changick Kim

    Abstract: Natural Language Video Grounding (NLVG) aims to localize time segments in an untrimmed video according to sentence queries. In this work, we present a new paradigm named Explore-And-Match for NLVG that seamlessly unifies the strengths of two streams of NLVG methods: proposal-free and proposal-based; the former explores the search space to find time segments directly, and the latter matches the pre…

    Submitted 4 August, 2022; v1 submitted 25 January, 2022; originally announced January 2022.

    Comments: Code: https://github.com/sangminwoo/Explore-And-Match

  42. BERN2: an advanced neural biomedical named entity recognition and normalization tool

    Authors: Mujeen Sung, Minbyul Jeong, Yonghwa Choi, Donghyeon Kim, Jinhyuk Lee, Jaewoo Kang

    Abstract: In biomedical natural language processing, named entity recognition (NER) and named entity normalization (NEN) are key tasks that enable the automatic extraction of biomedical entities (e.g. diseases and drugs) from the ever-growing biomedical literature. In this article, we present BERN2 (Advanced Biomedical Entity Recognition and Normalization), a tool that improves the previous neural network-b…

    Submitted 6 October, 2022; v1 submitted 6 January, 2022; originally announced January 2022.

    Comments: Published in Bioinformatics 2022. Web service available at http://bern2.korea.ac.kr. Code available at https://github.com/dmis-lab/BERN2

  43. arXiv:2110.12172  [pdf, other]

    cs.LG cs.DC

    Scalable Smartphone Cluster for Deep Learning

    Authors: Byunggook Na, Jaehee Jang, Seongsik Park, Seijoon Kim, Joonoo Kim, Moon Sik Jeong, Kwang Choon Kim, Seon Heo, Yoonsang Kim, Sungroh Yoon

    Abstract: Various deep learning applications on smartphones have been rapidly rising, but training deep neural networks (DNNs) has too large computational burden to be executed on a single smartphone. A portable cluster, which connects smartphones with a wireless network and supports parallel computation using them, can be a potential approach to resolve the issue. However, by our findings, the limitations…

    Submitted 23 October, 2021; originally announced October 2021.

    Comments: 6 pages

  44. Connecting Low-Loss Subspace for Personalized Federated Learning

    Authors: Seok-Ju Hahn, Minwoo Jeong, Junghye Lee

    Abstract: Due to the curse of statistical heterogeneity across clients, adopting a personalized federated learning method has become an essential choice for the successful deployment of federated learning-based services. Among diverse branches of personalization techniques, a model mixture-based personalization method is preferred as each client has their own personalized model as a result of federated lear…

    Submitted 11 August, 2022; v1 submitted 15 September, 2021; originally announced September 2021.

    Comments: Appears at ACM SIGKDD 2022. Code available at http://github.com/vaseline555/SuPerFed

  45. arXiv:2109.04650  [pdf, other]

    cs.CL

    What Changes Can Large-scale Language Models Bring? Intensive Study on HyperCLOVA: Billions-scale Korean Generative Pretrained Transformers

    Authors: Boseop Kim, HyoungSeok Kim, Sang-Woo Lee, Gichang Lee, Donghyun Kwak, Dong Hyeon Jeon, Sunghyun Park, Sungju Kim, Seonhoon Kim, Dongpil Seo, Heungsub Lee, Minyoung Jeong, Sungjae Lee, Minsub Kim, Suk Hyun Ko, Seokhun Kim, Taeyong Park, Jinuk Kim, Soyoung Kang, Na-Hyeon Ryu, Kang Min Yoo, Minsuk Chang, Soobin Suh, Sookyo In, Jinseong Park, et al. (12 additional authors not shown)

    Abstract: GPT-3 shows remarkable in-context learning ability of large-scale language models (LMs) trained on hundreds of billion scale data. Here we address some remaining issues less reported by the GPT-3 paper, such as a non-English LM, the performances of different sized models, and the effect of recently introduced prompt optimization on in-context learning. To achieve this, we introduce HyperCLOVA, a K…

    Submitted 28 November, 2021; v1 submitted 9 September, 2021; originally announced September 2021.

    Comments: Accepted to EMNLP2021 as a long paper. Fixed some typos

  46. arXiv:2108.06759  [pdf, other]

    cs.RO

    A Morphing Quadrotor that Can Optimize Morphology for Transportation

    Authors: Chanyoung Kim, Hyungyu Lee, Myeongwoo Jeong, Hyun Myung

    Abstract: Multirotors can be effectively applied to various tasks, such as transportation, investigation, exploration, and lifesaving, depending on the type of payload. However, due to the nature of multirotors, the payload loaded on the multirotor is limited in its position and weight, which presents a major disadvantage when the multirotor is used in various fields. In this paper, we propose a novel metho…

    Submitted 15 August, 2021; originally announced August 2021.

    Comments: 7 pages, Accepted at 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2021)

  47. arXiv:2108.05457  [pdf, other]

    cs.RO cs.AI

    Low-level Pose Control of Tilting Multirotor for Wall Perching Tasks Using Reinforcement Learning

    Authors: Hyungyu Lee, Myeongwoo Jeong, Chanyoung Kim, Hyungtae Lim, Changgue Park, Sungwon Hwang, Hyun Myung

    Abstract: Recently, needs for unmanned aerial vehicles (UAVs) that are attachable to the wall have been highlighted. As one of the ways to address the need, researches on various tilting multirotors that can increase maneuverability has been employed. Unfortunately, existing studies on the tilting multirotors require considerable amounts of prior information on the complex dynamic model. Meanwhile, reinforc…

    Submitted 11 August, 2021; originally announced August 2021.

    Comments: 8 pages, IROS2021 contributed paper

  48. arXiv:2106.06218  [pdf, other]

    cs.LG cs.AI cs.SI

    Graph Transformer Networks: Learning Meta-path Graphs to Improve GNNs

    Authors: Seongjun Yun, Minbyul Jeong, Sungdong Yoo, Seunghun Lee, Sean S. Yi, Raehyun Kim, Jaewoo Kang, Hyunwoo J. Kim

    Abstract: Graph Neural Networks (GNNs) have been widely applied to various fields due to their powerful representations of graph-structured data. Despite the success of GNNs, most existing GNNs are designed to learn node representations on the fixed and homogeneous graphs. The limitations especially become problematic when learning representations on a misspecified graph or a heterogeneous graph that consis…

    Submitted 11 June, 2021; originally announced June 2021.

    Comments: arXiv admin note: text overlap with arXiv:1911.06455

  49. arXiv:2106.03097  [pdf, other]

    cs.LG cs.AI cs.CV

    Preservation of the Global Knowledge by Not-True Distillation in Federated Learning

    Authors: Gihun Lee, Minchan Jeong, Yongjin Shin, Sangmin Bae, Se-Young Yun

    Abstract: In federated learning, a strong global model is collaboratively learned by aggregating clients' locally trained models. Although this precludes the need to access clients' data directly, the global model's convergence often suffers from data heterogeneity. This study starts from an analogy to continual learning and suggests that forgetting could be the bottleneck of federated learning. We observe…

    Submitted 29 November, 2022; v1 submitted 6 June, 2021; originally announced June 2021.

    Comments: 36th Conference on Neural Information Processing Systems (NeurIPS 2022)

  50. arXiv:2105.11715  [pdf, other]

    cs.CV

    Improving Few-shot Learning with Weakly-supervised Object Localization

    Authors: Inyong Koo, Minki Jeong, Changick Kim

    Abstract: Few-shot learning often involves metric learning-based classifiers, which predict the image label by comparing the distance between the extracted feature vector and class representations. However, applying global pooling in the backend of the feature extractor may not produce an embedding that correctly focuses on the class object. In this work, we propose a novel framework that generates class re…

    Submitted 25 May, 2021; originally announced May 2021.

    Comments: 5 pages, 4 figures

    ACM Class: I.4.10