Showing 1–12 of 12 results for author: Roman, I

  1. arXiv:2407.12260  [pdf, other]

    cs.HC

    HuBar: A Visual Analytics Tool to Explore Human Behaviour based on fNIRS in AR guidance systems

    Authors: Sonia Castelo, Joao Rulff, Parikshit Solunke, Erin McGowan, Guande Wu, Iran Roman, Roque Lopez, Bea Steers, Qi Sun, Juan Bello, Bradley Feest, Michael Middleton, Ryan Mckendrick, Claudio Silva

    Abstract: The concept of an intelligent augmented reality (AR) assistant has significant, wide-ranging applications, with potential uses in medicine, military, and mechanics domains. Such an assistant must be able to perceive the environment and actions, reason about the environment state in relation to a given task, and seamlessly interact with the task performer. These interactions typically involve an AR…

    Submitted 16 July, 2024; originally announced July 2024.

    Comments: 11 pages, 6 figures. This is the author's version of the article that has been accepted for publication in IEEE Transactions on Visualization and Computer Graphics (TVCG)

  2. arXiv:2405.14679  [pdf, other]

    cs.SD eess.AS

    Leveraging Real Electric Guitar Tones and Effects to Improve Robustness in Guitar Tablature Transcription Modeling

    Authors: Hegel Pedroza, Wallace Abreu, Ryan Corey, Iran Roman

    Abstract: Guitar tablature transcription (GTT) aims at automatically generating symbolic representations from real solo guitar performances. Due to its applications in education and musicology, GTT has gained traction in recent years. However, GTT robustness has been limited due to the small size of available datasets. Researchers have recently used synthetic data that simulates guitar performances using pr…

    Submitted 13 July, 2024; v1 submitted 23 May, 2024; originally announced May 2024.

  3. arXiv:2401.12238  [pdf, other]

    eess.AS cs.LG cs.SD

    Spatial Scaper: A Library to Simulate and Augment Soundscapes for Sound Event Localization and Detection in Realistic Rooms

    Authors: Iran R. Roman, Christopher Ick, Sivan Ding, Adrian S. Roman, Brian McFee, Juan P. Bello

    Abstract: Sound event localization and detection (SELD) is an important task in machine listening. Major advancements rely on simulated data with sound events in specific rooms and strong spatio-temporal labels. SELD data is simulated by convolving spatially-localized room impulse responses (RIRs) with sound waveforms to place sound events in a soundscape. However, RIRs require manual collection in specific…

    Submitted 19 January, 2024; originally announced January 2024.

    Comments: 5 pages, 4 figures, 1 table, to be presented at ICASSP 2024 in Seoul, South Korea

  4. arXiv:2401.08717  [pdf, other]

    cs.SD eess.AS

    Robust DOA estimation using deep acoustic imaging

    Authors: Adrian S. Roman, Iran R. Roman, Juan P. Bello

    Abstract: Direction of arrival estimation (DoAE) aims at tracking a sound in azimuth and elevation. Recent advancements include data-driven models with inputs derived from ambisonics intensity vectors or correlations between channels in a microphone array. A spherical intensity map (SIM), or acoustic image, is an alternative input representation that remains underexplored. SIMs benefit from high-resolution…

    Submitted 15 January, 2024; originally announced January 2024.

  5. arXiv:2312.14036  [pdf, other]

    cs.SD eess.AS

    Total variation in popular rap vocals from 2009-2023: extension of the analysis by Georgieva, Ripolles & McFee

    Authors: Kelvin L Walls, Iran R Roman, Bea Steers, Elena Georgieva

    Abstract: Pitch variability in rap vocals is overlooked in favor of the genre's uniquely dynamic rhythmic properties. We present an analysis of fundamental frequency (F0) variation in rap vocals over the past 14 years, focusing on song examples that represent the state of modern rap music. Our analysis aims at identifying meaningful trends over time, and is in turn a continuation of the 2023 analysis by Geo…

    Submitted 21 December, 2023; originally announced December 2023.

    Journal ref: ISMIR 2023 Hybrid Conference, 2023 Nov 5

  6. MarMot: Metamorphic Runtime Monitoring of Autonomous Driving Systems

    Authors: Jon Ayerdi, Asier Iriarte, Pablo Valle, Ibai Roman, Miren Illarramendi, Aitor Arrieta

    Abstract: Autonomous Driving Systems (ADSs) are complex Cyber-Physical Systems (CPSs) that must ensure safety even in uncertain conditions. Modern ADSs often employ Deep Neural Networks (DNNs), which may not produce correct results in every possible driving scenario. Thus, an approach to estimate the confidence of an ADS at runtime is necessary to prevent potentially dangerous situations. In this paper we p…

    Submitted 6 September, 2024; v1 submitted 11 October, 2023; originally announced October 2023.

    Comments: Accepted at ACM Transactions on Software Engineering and Methodology

  7. arXiv:2310.00870  [pdf, other]

    cs.SD cs.IR eess.AS

    F0 analysis of Ghanaian pop singing reveals progressive alignment with equal temperament over the past three decades: a case study

    Authors: Iran R. Roman, Daniel Faronbi, Isabelle Burger-Weiser, Leila Adu-Gilmore

    Abstract: Contemporary Ghanaian popular singing combines European and traditional Ghanaian influences. We hypothesize that access to technology embedded with equal temperament catalyzed a progressive alignment of Ghanaian singing with equal-tempered scales over time. To test this, we study the Ghanaian singer Daddy Lumba, whose work spans from the earliest Ghanaian electronic style in the late 1980s to the…

    Submitted 1 October, 2023; originally announced October 2023.

    Comments: Pages 27-33

  8. arXiv:2309.09288  [pdf, other]

    cs.SD eess.AS

    Sound Source Distance Estimation in Diverse and Dynamic Acoustic Conditions

    Authors: Saksham Singh Kushwaha, Iran R. Roman, Magdalena Fuentes, Juan Pablo Bello

    Abstract: Localizing a moving sound source in the real world involves determining its direction-of-arrival (DOA) and distance relative to a microphone. Advancements in DOA estimation have been facilitated by data-driven methods optimized with large open-source datasets with microphone array recordings in diverse environments. In contrast, estimating a sound source's distance remains understudied. Existing a…

    Submitted 17 September, 2023; originally announced September 2023.

    Comments: Accepted in WASPAA 2023

  9. arXiv:2308.06246  [pdf, other]

    cs.HC

    ARGUS: Visualization of AI-Assisted Task Guidance in AR

    Authors: Sonia Castelo, Joao Rulff, Erin McGowan, Bea Steers, Guande Wu, Shaoyu Chen, Iran Roman, Roque Lopez, Ethan Brewer, Chen Zhao, Jing Qian, Kyunghyun Cho, He He, Qi Sun, Huy Vo, Juan Bello, Michael Krone, Claudio Silva

    Abstract: The concept of augmented reality (AR) assistants has captured the human imagination for decades, becoming a staple of modern science fiction. To pursue this goal, it is necessary to develop artificial intelligence (AI)-based methods that simultaneously perceive the 3D environment, reason about physical tasks, and model the performer, all in real-time. Within this framework, a wide variety of senso…

    Submitted 11 August, 2023; originally announced August 2023.

    Comments: 11 pages, 8 figures. This is the author's version of the article that has been accepted for publication in IEEE Transactions on Visualization and Computer Graphics

  10. arXiv:2109.12690  [pdf, ps, other]

    cs.SD cs.DB cs.LG eess.AS

    Soundata: A Python library for reproducible use of audio datasets

    Authors: Magdalena Fuentes, Justin Salamon, Pablo Zinemanas, Martín Rocamora, Genís Paja, Irán R. Román, Marius Miron, Xavier Serra, Juan Pablo Bello

    Abstract: Soundata is a Python library for loading and working with audio datasets in a standardized way, removing the need for writing custom loaders in every project, and improving reproducibility by providing tools to validate data against a canonical version. It speeds up research pipelines by allowing users to quickly download a dataset, load it into memory in a standardized and reproducible way, valid…

    Submitted 4 October, 2021; v1 submitted 26 September, 2021; originally announced September 2021.

  11. arXiv:1910.05173  [pdf, other]

    cs.LG stat.ML

    Evolving Gaussian Process kernels from elementary mathematical expressions

    Authors: Ibai Roman, Roberto Santana, Alexander Mendiburu, Jose A. Lozano

    Abstract: Choosing the most adequate kernel is crucial in many Machine Learning applications. Gaussian Process is a state-of-the-art technique for regression and classification that heavily relies on a kernel function. However, in the Gaussian Process literature, kernels have usually been either ad hoc designed, selected from a predefined set, or searched for in a space of compositions of kernels which have…

    Submitted 14 October, 2019; v1 submitted 11 October, 2019; originally announced October 2019.

  12. arXiv:1904.00977  [pdf, ps, other]

    cs.CL cs.LG stat.ML

    Sentiment analysis with genetically evolved Gaussian kernels

    Authors: Ibai Roman, Alexander Mendiburu, Roberto Santana, Jose A. Lozano

    Abstract: Sentiment analysis consists of evaluating opinions or statements from the analysis of text. Among the methods used to estimate the degree to which a text expresses a given sentiment are those based on Gaussian Processes. However, traditional Gaussian Process methods use a predefined kernel with hyperparameters that can be tuned but whose structure cannot be adapted. In this paper, we propose t…

    Submitted 14 October, 2019; v1 submitted 1 April, 2019; originally announced April 2019.