
Showing 1–13 of 13 results for author: Sohn, S S

  1. arXiv:2406.10478  [pdf, other]

    cs.CL cs.AI cs.GR

    From Words to Worlds: Transforming One-line Prompt into Immersive Multi-modal Digital Stories with Communicative LLM Agent

    Authors: Samuel S. Sohn, Danrui Li, Sen Zhang, Che-Jui Chang, Mubbasir Kapadia

    Abstract: Digital storytelling, essential in entertainment, education, and marketing, faces challenges in production scalability and flexibility. The StoryAgent framework, introduced in this paper, utilizes Large Language Models and generative tools to automate and refine digital storytelling. Employing a top-down story drafting and bottom-up asset generation approach, StoryAgent tackles key issues such as…

    Submitted 21 June, 2024; v1 submitted 14 June, 2024; originally announced June 2024.

    Comments: 16 pages, 13 figures

  2. arXiv:2406.05431  [pdf]

    cs.CL

    MaTableGPT: GPT-based Table Data Extractor from Materials Science Literature

    Authors: Gyeong Hoon Yi, Jiwoo Choi, Hyeongyun Song, Olivia Miano, Jaewoong Choi, Kihoon Bang, Byungju Lee, Seok Su Sohn, David Buttler, Anna Hiszpanski, Sang Soo Han, Donghun Kim

    Abstract: Efficiently extracting data from tables in the scientific literature is pivotal for building large-scale databases. However, the tables reported in materials science papers exist in highly diverse forms; thus, rule-based extractions are an ineffective approach. To overcome this challenge, we present MaTableGPT, which is a GPT-based table data extractor from the materials science literature. MaTabl…

    Submitted 8 June, 2024; originally announced June 2024.

  3. Microscopic modeling of attention-based movement behaviors

    Authors: Danrui Li, Mathew Schwartz, Samuel S. Sohn, Sejong Yoon, Vladimir Pavlovic, Mubbasir Kapadia

    Abstract: For transportation hubs, leveraging pedestrian flows for commercial activities presents an effective strategy for funding maintenance and infrastructure improvements. However, this introduces new challenges, as consumer behaviors can disrupt pedestrian flow and efficiency. To optimize both retail potential and pedestrian efficiency, careful strategic planning in store layout and facility dimension…

    Submitted 2 May, 2024; v1 submitted 21 March, 2024; originally announced March 2024.

    Comments: 27 pages, 13 figures, 6 tables. Published in Transportation Research Part C. For the project webpage, see https://danruili.github.io/AttentionMove/

    Journal ref: Transportation Research Part C: Emerging Technologies 162 (2024): 104583

  4. arXiv:2309.15311  [pdf, other]

    cs.HC cs.AI cs.GR

    The Importance of Multimodal Emotion Conditioning and Affect Consistency for Embodied Conversational Agents

    Authors: Che-Jui Chang, Samuel S. Sohn, Sen Zhang, Rajath Jayashankar, Muhammad Usman, Mubbasir Kapadia

    Abstract: Previous studies regarding the perception of emotions for embodied virtual agents have shown the effectiveness of using virtual characters in conveying emotions through interactions with humans. However, creating an autonomous embodied conversational agent with expressive behaviors presents two major challenges. The first challenge is the difficulty of synthesizing the conversational behaviors for…

    Submitted 6 December, 2023; v1 submitted 26 September, 2023; originally announced September 2023.

  5. arXiv:2306.16772  [pdf, other]

    cs.CV cs.AI cs.LG

    M3Act: Learning from Synthetic Human Group Activities

    Authors: Che-Jui Chang, Danrui Li, Deep Patel, Parth Goel, Honglu Zhou, Seonghyeon Moon, Samuel S. Sohn, Sejong Yoon, Vladimir Pavlovic, Mubbasir Kapadia

    Abstract: The study of complex human interactions and group activities has become a focal point in human-centric computer vision. However, progress in related tasks is often hindered by the challenges of obtaining large-scale labeled datasets from real-world scenarios. To address the limitation, we introduce M3Act, a synthetic data generator for multi-view multi-group multi-person human atomic actions and g…

    Submitted 2 May, 2024; v1 submitted 29 June, 2023; originally announced June 2023.

  6. arXiv:2212.04673  [pdf, other]

    cs.CV

    MSI: Maximize Support-Set Information for Few-Shot Segmentation

    Authors: Seonghyeon Moon, Samuel S. Sohn, Honglu Zhou, Sejong Yoon, Vladimir Pavlovic, Muhammad Haris Khan, Mubbasir Kapadia

    Abstract: Few-shot segmentation (FSS) aims to segment a target class using a small number of labeled images (the support set). To extract information relevant to the target class, a dominant approach in best-performing FSS methods removes background features using a support mask. We observe that this feature excision through a limiting support mask introduces an information bottleneck in several challenging FSS c…

    Submitted 10 November, 2023; v1 submitted 9 December, 2022; originally announced December 2022.

    Comments: ICCV 2023

  7. arXiv:2211.00817  [pdf, other]

    cs.LG cs.MA

    An Information-Theoretic Approach for Estimating Scenario Generalization in Crowd Motion Prediction

    Authors: Gang Qiao, Kaidong Hu, Seonghyeon Moon, Samuel S. Sohn, Sejong Yoon, Mubbasir Kapadia, Vladimir Pavlovic

    Abstract: Learning-based approaches to modeling crowd motion have become increasingly successful but require training and evaluation on large datasets, coupled with complex model selection and parameter tuning. To circumvent this tremendously time-consuming process, we propose a novel scoring method, which characterizes generalization of models trained on source crowd scenarios and applied to target crowd s…

    Submitted 1 November, 2022; originally announced November 2022.

  8. arXiv:2205.09075  [pdf]

    cond-mat.mtrl-sci cs.LG

    Predicting failure characteristics of structural materials via deep learning based on nondestructive void topology

    Authors: Leslie Ching Ow Tiong, Gunjick Lee, Seok Su Sohn, Donghun Kim

    Abstract: Accurate prediction of the failure progression of structural materials is critical for preventing failure-induced accidents. Despite considerable mechanics modeling-based efforts, accurate prediction remains a challenging task in real-world environments due to unexpected damage factors and defect evolutions. Here, we report a novel method for predicting material failure characteristics that uniqu…

    Submitted 17 May, 2022; originally announced May 2022.

  9. arXiv:2203.12826  [pdf, other]

    cs.CV

    HM: Hybrid Masking for Few-Shot Segmentation

    Authors: Seonghyeon Moon, Samuel S. Sohn, Honglu Zhou, Sejong Yoon, Vladimir Pavlovic, Muhammad Haris Khan, Mubbasir Kapadia

    Abstract: We study few-shot semantic segmentation that aims to segment a target object from a query image when provided with a few annotated support images of the target class. Several recent methods resort to a feature masking (FM) technique to discard irrelevant feature activations which eventually facilitates the reliable prediction of segmentation mask. A fundamental limitation of FM is the inability to…

    Submitted 24 July, 2022; v1 submitted 23 March, 2022; originally announced March 2022.

    Comments: 14 pages

    MSC Class: 68T45

  10. arXiv:2201.07189  [pdf, other]

    cs.CV

    MUSE-VAE: Multi-Scale VAE for Environment-Aware Long Term Trajectory Prediction

    Authors: Mihee Lee, Samuel S. Sohn, Seonghyeon Moon, Sejong Yoon, Mubbasir Kapadia, Vladimir Pavlovic

    Abstract: Accurate long-term trajectory prediction in complex scenes, where multiple agents (e.g., pedestrians or vehicles) interact with each other and the environment while attempting to accomplish diverse and often unknown goals, is a challenging stochastic forecasting problem. In this work, we propose MUSE, a new probabilistic modeling framework based on a cascade of Conditional VAEs, which tackles the…

    Submitted 18 January, 2022; originally announced January 2022.

  11. arXiv:2112.11734  [pdf, other]

    cs.LG cs.AI

    D-HYPR: Harnessing Neighborhood Modeling and Asymmetry Preservation for Digraph Representation Learning

    Authors: Honglu Zhou, Advith Chegu, Samuel S. Sohn, Zuohui Fu, Gerard de Melo, Mubbasir Kapadia

    Abstract: Digraph Representation Learning (DRL) aims to learn representations for directed homogeneous graphs (digraphs). Prior work in DRL is largely constrained (e.g., limited to directed acyclic graphs), or has poor generalizability across tasks (e.g., evaluated solely on one task). Most Graph Neural Networks (GNNs) exhibit poor performance on digraphs due to the neglect of modeling neighborhoods and pre…

    Submitted 28 September, 2022; v1 submitted 22 December, 2021; originally announced December 2021.

    Comments: CIKM 2022

  12. arXiv:1910.05810  [pdf, other]

    cs.AI cs.CV

    Deep Crowd-Flow Prediction in Built Environments

    Authors: Samuel S. Sohn, Seonghyeon Moon, Honglu Zhou, Sejong Yoon, Vladimir Pavlovic, Mubbasir Kapadia

    Abstract: Predicting the behavior of crowds in complex environments is a key requirement in a multitude of application areas, including crowd and disaster management, architectural design, and urban planning. Given a crowd's immediate state, current approaches simulate crowd movement to arrive at a future state. However, most applications require the ability to predict hundreds of possible simulation outcom…

    Submitted 13 October, 2019; originally announced October 2019.

  13. arXiv:1910.00767  [pdf]

    cs.MA cs.AI

    Cognitive Agent Based Simulation Model For Improving Disaster Response Procedures

    Authors: Rohit K. Dubey, Samuel S. Sohn, Christoph Hoelscher, Mubbasir Kapadia

    Abstract: In the event of a disaster, saving human lives is of utmost importance. For developing proper evacuation procedures and guidance systems, behavioural data on how people respond during panic and stress is crucial. In the absence of real human data on building evacuation, there is a need for a crowd simulator to model egress and decision-making under uncertainty. In this paper, we propose an agent-b…

    Submitted 1 October, 2019; originally announced October 2019.