Showing 1–50 of 115 results for author: Tripathi, S

Searching in archive cs.
  1. arXiv:2409.06010  [pdf, other]

    cs.NI eess.SY

    When Learning Meets Dynamics: Distributed User Connectivity Maximization in UAV-Based Communication Networks

    Authors: Bowei Li, Saugat Tripathi, Salman Hosain, Ran Zhang, Jiang Xie, Miao Wang

    Abstract: Distributed management over Unmanned Aerial Vehicle (UAV) based communication networks (UCNs) has attracted increasing research attention. In this work, we study a distributed user connectivity maximization problem in a UCN. The work features a horizontal study over different levels of information exchange during the distributed iteration and a consideration of dynamics in UAV set and user distrib…

    Submitted 9 September, 2024; originally announced September 2024.

    Comments: 12 pages, 12 figures, journal draft

  2. arXiv:2409.03944  [pdf, other]

    cs.CV cs.AI

    HUMOS: Human Motion Model Conditioned on Body Shape

    Authors: Shashank Tripathi, Omid Taheri, Christoph Lassner, Michael J. Black, Daniel Holden, Carsten Stoll

    Abstract: Generating realistic human motion is essential for many computer vision and graphics applications. The wide variety of human body shapes and sizes greatly impacts how people move. However, most existing motion models ignore these differences, relying on a standardized, average body. This leads to uniform motion across different body types, where movements don't match their physical characteristics…

    Submitted 5 September, 2024; originally announced September 2024.

    Comments: Accepted in ECCV'24. Project page: https://CarstenEpic.github.io/humos/

  3. arXiv:2407.19520  [pdf, other]

    cs.CV cs.LG

    Ego-VPA: Egocentric Video Understanding with Parameter-efficient Adaptation

    Authors: Tz-Ying Wu, Kyle Min, Subarna Tripathi, Nuno Vasconcelos

    Abstract: Video understanding typically requires fine-tuning the large backbone when adapting to new domains. In this paper, we leverage the egocentric video foundation models (Ego-VFMs) based on video-language pre-training and propose a parameter-efficient adaptation for egocentric video tasks, namely Ego-VPA. It employs a local sparse approximation for each video frame/text feature using the basis prompts…

    Submitted 28 July, 2024; originally announced July 2024.

  4. arXiv:2406.09462  [pdf, other]

    cs.CV cs.AI

    SViTT-Ego: A Sparse Video-Text Transformer for Egocentric Video

    Authors: Hector A. Valdez, Kyle Min, Subarna Tripathi

    Abstract: Pretraining egocentric vision-language models has become essential to improving downstream egocentric video-text tasks. These egocentric foundation models commonly use the transformer architecture. The memory footprint of these models during pretraining can be substantial. Therefore, we pretrain SViTT-Ego, the first sparse egocentric video-text transformer model integrating edge and node sparsific…

    Submitted 12 June, 2024; originally announced June 2024.

  5. A PCA based Keypoint Tracking Approach to Automated Facial Expressions Encoding

    Authors: Shivansh Chandra Tripathi, Rahul Garg

    Abstract: The Facial Action Coding System (FACS) for studying facial expressions is manual and requires significant effort and expertise. This paper explores the use of automated techniques to generate Action Units (AUs) for studying facial expressions. We propose an unsupervised approach based on Principal Component Analysis (PCA) and facial keypoint tracking to generate data-driven AUs called PCA AUs usin…

    Submitted 13 June, 2024; originally announced June 2024.

    Comments: This preprint has not undergone peer review or any post-submission improvements or corrections. The Version of Record of this contribution is published in [LNCS, volume 14301], and is available online at https://doi.org/10.1007/978-3-031-45170-6_85

  6. arXiv:2406.05434  [pdf, other]

    cs.CV cs.HC

    Unsupervised learning of Data-driven Facial Expression Coding System (DFECS) using keypoint tracking

    Authors: Shivansh Chandra Tripathi, Rahul Garg

    Abstract: The development of existing facial coding systems, such as the Facial Action Coding System (FACS), relied on manual examination of facial expression videos for defining Action Units (AUs). To overcome the labor-intensive nature of this process, we propose the unsupervised learning of an automated facial coding system by leveraging computer-vision-based facial keypoint tracking. In this novel facia…

    Submitted 8 June, 2024; originally announced June 2024.

  7. arXiv:2406.02631  [pdf, other]

    cs.CV

    Contrastive Language Video Time Pre-training

    Authors: Hengyue Liu, Kyle Min, Hector A. Valdez, Subarna Tripathi

    Abstract: We introduce LAVITI, a novel approach to learning language, video, and temporal representations in long-form videos via contrastive learning. Different from pre-training on video-text pairs like EgoVLP, LAVITI aims to align language, video, and temporal features by extracting meaningful moments in untrimmed videos. Our model employs a set of learnable moment queries to decode clip-level visual, la…

    Submitted 3 June, 2024; originally announced June 2024.

    Comments: CVPR EgoVis Workshop 2024 extended abstract

  8. arXiv:2405.15392  [pdf, other]

    cs.DC

    D-VRE: From a Jupyter-enabled Private Research Environment to Decentralized Collaborative Research Ecosystem

    Authors: Yuandou Wang, Sheejan Tripathi, Siamak Farshidi, Zhiming Zhao

    Abstract: Today, scientific research is increasingly data-centric and compute-intensive, relying on data and models across distributed sources. However, it still faces challenges in the traditional cooperation mode, due to the high storage and computing cost, geo-location barriers, and local confidentiality regulations. The Jupyter environment has recently emerged and evolved as a vital virtual research env…

    Submitted 26 June, 2024; v1 submitted 24 May, 2024; originally announced May 2024.

    Comments: We revised the manuscript draft and submitted the revised manuscript to the journal Blockchain: Research and Applications

  9. arXiv:2404.10539  [pdf, other]

    cs.CV cs.AI

    VideoSAGE: Video Summarization with Graph Representation Learning

    Authors: Jose M. Rojas Chaves, Subarna Tripathi

    Abstract: We propose a graph-based representation learning framework for video summarization. First, we convert an input video to a graph where nodes correspond to each of the video frames. Then, we impose sparsity on the graph by connecting only those pairs of nodes that are within a specified temporal distance. We then formulate the video summarization task as a binary node classification problem, precise…

    Submitted 14 April, 2024; originally announced April 2024.

    Comments: arXiv admin note: text overlap with arXiv:2207.07783

  10. Loss Regularizing Robotic Terrain Classification

    Authors: Shakti Deo Kumar, Sudhanshu Tripathi, Krishna Ujjwal, Sarvada Sakshi Jha, Suddhasil De

    Abstract: Locomotion mechanics of legged robots are suitable when pacing through difficult terrains. Recognising terrains is important for such robots to fully exploit the versatility of their movements. Consequently, robotic terrain classification becomes significant to classify terrains in real time with high accuracy. The conventional classifiers suffer from overfitting problem, low accuracy problem, high…

    Submitted 20 March, 2024; originally announced March 2024.

    Comments: Preliminary draft of the work published in IEEE conference 2023

  11. arXiv:2403.00788  [pdf]

    cs.CL cs.AI cs.HC cs.LG

    PRECISE Framework: GPT-based Text For Improved Readability, Reliability, and Understandability of Radiology Reports For Patient-Centered Care

    Authors: Satvik Tripathi, Liam Mutter, Meghana Muppuri, Suhani Dheer, Emiliano Garza-Frias, Komal Awan, Aakash Jha, Michael Dezube, Azadeh Tabari, Christopher P. Bridge, Dania Daye

    Abstract: This study introduces and evaluates the PRECISE framework, utilizing OpenAI's GPT-4 to enhance patient engagement by providing clearer and more accessible chest X-ray reports at a sixth-grade reading level. The framework was tested on 500 reports, demonstrating significant improvements in readability, reliability, and understandability. Statistical analyses confirmed the effectiveness of the PRECI…

    Submitted 19 February, 2024; originally announced March 2024.

  12. arXiv:2312.05432  [pdf, other]

    cs.LG math.OC

    Fusing Multiple Algorithms for Heterogeneous Online Learning

    Authors: Darshan Gadginmath, Shivanshu Tripathi, Fabio Pasqualetti

    Abstract: This study addresses the challenge of online learning in contexts where agents accumulate disparate data, face resource constraints, and use different local algorithms. This paper introduces the Switched Online Learning Algorithm (SOLA), designed to solve the heterogeneous online learning problem by amalgamating updates from diverse agents through a dynamic switching mechanism contingent upon thei…

    Submitted 8 December, 2023; originally announced December 2023.

    Comments: 13 pages, 3 figures

  13. arXiv:2312.03391  [pdf, other]

    cs.CV

    Action Scene Graphs for Long-Form Understanding of Egocentric Videos

    Authors: Ivan Rodin, Antonino Furnari, Kyle Min, Subarna Tripathi, Giovanni Maria Farinella

    Abstract: We present Egocentric Action Scene Graphs (EASGs), a new representation for long-form understanding of egocentric videos. EASGs extend standard manually-annotated representations of egocentric videos, such as verb-noun action labels, by providing a temporally evolving graph-based description of the actions performed by the camera wearer, including interacted objects, their relationships, and how a…

    Submitted 6 December, 2023; originally announced December 2023.

  14. arXiv:2311.10476  [pdf, other]

    cs.CV

    FRCSyn Challenge at WACV 2024: Face Recognition Challenge in the Era of Synthetic Data

    Authors: Pietro Melzi, Ruben Tolosana, Ruben Vera-Rodriguez, Minchul Kim, Christian Rathgeb, Xiaoming Liu, Ivan DeAndres-Tame, Aythami Morales, Julian Fierrez, Javier Ortega-Garcia, Weisong Zhao, Xiangyu Zhu, Zheyu Yan, Xiao-Yu Zhang, Jinlin Wu, Zhen Lei, Suvidha Tripathi, Mahak Kothari, Md Haider Zama, Debayan Deb, Bernardo Biesseck, Pedro Vidal, Roger Granada, Guilherme Fickel, Gustavo Führ, et al. (22 additional authors not shown)

    Abstract: Despite the widespread adoption of face recognition technology around the world, and its remarkable performance on current benchmarks, there are still several challenges that must be covered in more detail. This paper offers an overview of the Face Recognition Challenge in the Era of Synthetic Data (FRCSyn) organized at WACV 2024. This is the first international challenge aiming to explore the use…

    Submitted 17 November, 2023; originally announced November 2023.

    Comments: 10 pages, 1 figure, WACV 2024 Workshops

  15. arXiv:2310.02753  [pdf, other]

    cs.CV cs.AI cs.GR cs.LG

    MUNCH: Modelling Unique 'N Controllable Heads

    Authors: Debayan Deb, Suvidha Tripathi, Pranit Puri

    Abstract: The automated generation of 3D human heads has been an intriguing and challenging task for computer vision researchers. Prevailing methods synthesize realistic avatars but with limited control over the diversity and quality of rendered outputs and suffer from limited correlation between shape and texture of the character. We propose a method that offers quality, diversity, control, and realism alo…

    Submitted 4 October, 2023; originally announced October 2023.

  16. arXiv:2309.15273  [pdf, other]

    cs.CV

    DECO: Dense Estimation of 3D Human-Scene Contact In The Wild

    Authors: Shashank Tripathi, Agniv Chatterjee, Jean-Claude Passy, Hongwei Yi, Dimitrios Tzionas, Michael J. Black

    Abstract: Understanding how humans use physical contact to interact with the world is key to enabling human-centric artificial intelligence. While inferring 3D contact is crucial for modeling realistic and physically-plausible human-object interactions, existing methods either focus on 2D, consider body joints rather than the surface, use coarse 3D body regions, or do not generalize to in-the-wild images. I…

    Submitted 26 September, 2023; originally announced September 2023.

    Comments: Accepted as Oral in ICCV'23. Project page: https://deco.is.tue.mpg.de

  17. arXiv:2307.16195  [pdf, ps, other]

    cs.AR

    Implementation of Fast and Power Efficient SEC-DAEC and SEC-DAEC-TAEC Codecs on FPGA

    Authors: Sayan Tripathi, Jhilam Jana, Jaydeb Bhaumik

    Abstract: The reliability of memory devices is affected by radiation induced soft errors. Multiple cell upsets (MCUs) caused by radiation corrupt data stored in multiple cells within memories. Error correction codes (ECCs) are typically used to mitigate the effects of MCUs. Single error correction-double error detection (SEC-DED) codes are not the right choice against MCUs, but are more suitable for protect…

    Submitted 30 July, 2023; originally announced July 2023.

    Comments: 9 pages, 2 figures, 2 tables

  18. Emotional Speech-Driven Animation with Content-Emotion Disentanglement

    Authors: Radek Daněček, Kiran Chhatre, Shashank Tripathi, Yandong Wen, Michael J. Black, Timo Bolkart

    Abstract: To be widely adopted, 3D facial avatars must be animated easily, realistically, and directly from speech signals. While the best recent methods generate 3D animations that are synchronized with the input audio, they largely ignore the impact of emotions on facial expressions. Realistic facial animation requires lip-sync together with the natural expression of emotion. To that end, we propose EMOTE…

    Submitted 26 September, 2023; v1 submitted 15 June, 2023; originally announced June 2023.

    Comments: SIGGRAPH Asia 2023 Conference Paper

  19. arXiv:2306.05689  [pdf, other]

    cs.CV

    Single-Stage Visual Relationship Learning using Conditional Queries

    Authors: Alakh Desai, Tz-Ying Wu, Subarna Tripathi, Nuno Vasconcelos

    Abstract: Research in scene graph generation (SGG) usually considers two-stage models, that is, detecting a set of entities, followed by combining them and labeling all possible relationships. While showing promising results, the pipeline structure induces large parameter and computation overhead, and typically hinders end-to-end optimizations. To address this, recent research attempts to train single-stage…

    Submitted 9 June, 2023; originally announced June 2023.

    Comments: Accepted to NeurIPS 2022

  20. arXiv:2306.01652  [pdf, other]

    cs.IT eess.SP

    On the Coverage of Cognitive mmWave Networks with Directional Sensing and Communication

    Authors: Shuchi Tripathi, Abhishek K. Gupta, SaiDhiraj Amuru

    Abstract: Millimeter-waves' propagation characteristics create prospects for spatial and temporal spectrum sharing in a variety of contexts, including cognitive spectrum sharing (CSS). However, CSS along with omnidirectional sensing, is not efficient at mmWave frequencies due to their directional nature of transmission, as this limits secondary networks' ability to access the spectrum. This inspired us to c…

    Submitted 2 June, 2023; originally announced June 2023.

    Comments: 30 pages, 12 figures

  21. arXiv:2304.11827  [pdf]

    cs.CR cs.NI

    Safe and Secure Smart Home using Cisco Packet Tracer

    Authors: Shivansh Walia, Tejas Iyer, Shubham Tripathi, Akshith Vanaparthy

    Abstract: This project presents the design and implementation of a safe, secure and smart home with enhanced security features, built on IoT-based technology. We were motivated by learning about the West's movement towards smart homes and designs. This galvanized us to engage in this work, as we wanted homeowners to have greater control over their in-house environment wh…

    Submitted 24 April, 2023; originally announced April 2023.

    Comments: 11 pages

  22. arXiv:2304.08809  [pdf, other]

    cs.CV

    SViTT: Temporal Learning of Sparse Video-Text Transformers

    Authors: Yi Li, Kyle Min, Subarna Tripathi, Nuno Vasconcelos

    Abstract: Do video-text transformers learn to model temporal relationships across frames? Despite their immense capacity and the abundance of multimodal training data, recent work has revealed the strong tendency of video-text models towards frame-based spatial representations, while temporal reasoning remains largely unsolved. In this work, we identify several key challenges in temporal learning of video-t…

    Submitted 18 April, 2023; originally announced April 2023.

    Comments: CVPR 2023

  23. arXiv:2304.00733  [pdf, other]

    cs.CV

    Unbiased Scene Graph Generation in Videos

    Authors: Sayak Nag, Kyle Min, Subarna Tripathi, Amit K. Roy Chowdhury

    Abstract: The task of dynamic scene graph generation (SGG) from videos is complicated and challenging due to the inherent dynamics of a scene, temporal fluctuation of model predictions, and the long-tailed distribution of the visual relationships in addition to the already existing challenges in image-based SGG. Existing methods for dynamic SGG have primarily focused on capturing spatio-temporal context usi…

    Submitted 29 June, 2023; v1 submitted 3 April, 2023; originally announced April 2023.

    Comments: Published in IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2023

  24. arXiv:2303.18246  [pdf, other]

    cs.CV cs.AI cs.GR

    3D Human Pose Estimation via Intuitive Physics

    Authors: Shashank Tripathi, Lea Müller, Chun-Hao P. Huang, Omid Taheri, Michael J. Black, Dimitrios Tzionas

    Abstract: Estimating 3D humans from images often produces implausible bodies that lean, float, or penetrate the floor. Such methods ignore the fact that bodies are typically supported by the scene. A physics engine can be used to enforce physical plausibility, but these are not differentiable, rely on unrealistic proxy bodies, and are difficult to integrate into existing optimization and learning frameworks…

    Submitted 24 July, 2023; v1 submitted 31 March, 2023; originally announced March 2023.

    Comments: Accepted in CVPR'23. Project page: https://ipman.is.tue.mpg.de

  25. arXiv:2303.17499  [pdf, other]

    cs.CR

    Fuzzified advanced robust hashes for identification of digital and physical objects

    Authors: Shashank Tripathi, Volker Skwarek

    Abstract: With the rising number of IoT objects, it is becoming easier for adversaries to introduce counterfeit objects into the mainstream market. Such infiltration of bogus products can be addressed with third-party-verifiable identification. Generally, state-of-the-art identification schemes do not guarantee that an identifier, e.g. a barcode or RFID tag, cannot itself be forged. This paper introduces identifi…

    Submitted 30 March, 2023; originally announced March 2023.

    Comments: 9 pages, 6 figures, 3 tables

    ACM Class: E.3; E.4; H.1

  26. arXiv:2212.04360  [pdf, other]

    cs.CV cs.GR

    MIME: Human-Aware 3D Scene Generation

    Authors: Hongwei Yi, Chun-Hao P. Huang, Shashank Tripathi, Lea Hering, Justus Thies, Michael J. Black

    Abstract: Generating realistic 3D worlds occupied by moving humans has many applications in games, architecture, and synthetic data creation. But generating such scenes is expensive and labor intensive. Recent work generates human poses and motions given a 3D scene. Here, we take the opposite approach and generate 3D indoor scenes given 3D human motion. Such motions can come from archival motion capture or…

    Submitted 8 December, 2022; originally announced December 2022.

    Comments: Project Page: https://mime.is.tue.mpg.de

  27. arXiv:2211.04442  [pdf, other]

    cs.LG

    Algorithmic Bias in Machine Learning Based Delirium Prediction

    Authors: Sandhya Tripathi, Bradley A Fritz, Michael S Avidan, Yixin Chen, Christopher R King

    Abstract: Although prediction models for delirium, a commonly occurring condition during general hospitalization or post-surgery, have not gained huge popularity, their algorithmic bias evaluation is crucial due to the existing association between social determinants of health and delirium risk. In this context, using MIMIC-III and another academic hospital dataset, we present some initial experimental evid…

    Submitted 26 November, 2022; v1 submitted 8 November, 2022; originally announced November 2022.

    Comments: Extended Abstract presented at Machine Learning for Health (ML4H) symposium 2022, November 28th, 2022, New Orleans, United States & Virtual, http://www.ml4h.cc, 14 pages

  28. arXiv:2210.15923  [pdf, other]

    cs.LG

    DELFI: Deep Mixture Models for Long-term Air Quality Forecasting in the Delhi National Capital Region

    Authors: Naishadh Parmar, Raunak Shah, Tushar Goswamy, Vatsalya Tandon, Ravi Sahu, Ronak Sutaria, Purushottam Kar, Sachchida Nand Tripathi

    Abstract: The identification and control of human factors in climate change is a rapidly growing concern and robust, real-time air-quality monitoring and forecasting plays a critical role in allowing effective policy formulation and implementation. This paper presents DELFI, a novel deep learning-based mixture model to make effective long-term predictions of Particulate Matter (PM) 2.5 concentrations. A key…

    Submitted 28 October, 2022; originally announced October 2022.

    Comments: 6 pages

  29. arXiv:2210.10130  [pdf, other]

    cs.CV

    PERI: Part Aware Emotion Recognition In The Wild

    Authors: Akshita Mittel, Shashank Tripathi

    Abstract: Emotion recognition aims to interpret the emotional states of a person based on various inputs including audio, visual, and textual cues. This paper focuses on emotion recognition using visual features. To leverage the correlation between facial expression and the emotional state of a person, pioneering methods rely primarily on facial features. However, facial features are often unreliable in nat…

    Submitted 18 October, 2022; originally announced October 2022.

    Comments: Accepted at ECCVW 2022

  30. arXiv:2210.00521  [pdf, other]

    cs.LG eess.SP

    Leveraging unsupervised data and domain adaptation for deep regression in low-cost sensor calibration

    Authors: Swapnil Dey, Vipul Arora, Sachchida Nand Tripathi

    Abstract: Air quality monitoring is becoming an essential task with rising awareness about air quality. Low cost air quality sensors are easy to deploy but are not as reliable as the costly and bulky reference monitors. The low quality sensors can be calibrated against the reference monitors with the help of deep learning. In this paper, we translate the task of sensor calibration into a semi-supervised dom…

    Submitted 2 October, 2022; originally announced October 2022.

    Comments: submitted to IEEE Trans. on Neural Networks and Learning Systems as a regular article

  31. arXiv:2208.01953  [pdf, ps, other]

    cs.DS

    Maximum Minimal Feedback Vertex Set: A Parameterized Perspective

    Authors: Ajinkya Gaikwad, Hitendra Kumar, Soumen Maity, Saket Saurabh, Shuvam Kant Tripathi

    Abstract: In this paper we study a maximization version of the classical Feedback Vertex Set (FVS) problem, namely, the Max Min FVS problem, in the realm of parameterized complexity. In this problem, given an undirected graph $G$ and a positive integer $k$, the question is whether $G$ has a minimal feedback vertex set of size at least $k$. We obtain the following results for Max Min FVS. 1) We first des…

    Submitted 3 August, 2022; originally announced August 2022.

  32. arXiv:2207.07783  [pdf, other]

    cs.CV

    Learning Long-Term Spatial-Temporal Graphs for Active Speaker Detection

    Authors: Kyle Min, Sourya Roy, Subarna Tripathi, Tanaya Guha, Somdeb Majumdar

    Abstract: Active speaker detection (ASD) in videos with multiple speakers is a challenging task as it requires learning effective audiovisual features and spatial-temporal correlations over long temporal windows. In this paper, we present SPELL, a novel spatial-temporal graph learning framework that can solve complex tasks such as ASD. To this end, each person in a video frame is first encoded in a unique n…

    Submitted 12 October, 2022; v1 submitted 15 July, 2022; originally announced July 2022.

    Comments: ECCV 2022 camera ready (Supplementary videos: on ECVA soon). This paper supersedes arXiv:2112.01479

  33. arXiv:2207.03536  [pdf, other]

    cs.DB cs.LG

    Deep Learning to Jointly Schema Match, Impute, and Transform Databases

    Authors: Sandhya Tripathi, Bradley A. Fritz, Mohamed Abdelhack, Michael S. Avidan, Yixin Chen, Christopher R. King

    Abstract: An applied problem facing all areas of data science is harmonizing data sources. Joining data from multiple origins with unmapped and only partially overlapping features is a prerequisite to developing and testing robust, generalizable algorithms, especially in health care. We approach this issue in the common but difficult case of numeric features such as nearly Gaussian and binary features, wher…

    Submitted 22 June, 2022; originally announced July 2022.

  34. arXiv:2205.08440  [pdf, other]

    cs.CR cs.DC cs.MA cs.SE

    Moving Smart Contracts -- A Privacy Preserving Method for Off-Chain Data Trust

    Authors: Simon Tschirner, Shashank Shekher Tripathi, Mathias Roeper, Markus M. Becker, Volker Skwarek

    Abstract: Blockchains provide environments where parties can interact transparently and securely peer-to-peer without needing a trusted third party. Parties can trust the integrity and correctness of transactions and the verifiable execution of binary code on the blockchain (smart contracts) inside the system. Including information from outside of the blockchain remains challenging. A challenge is data priv…

    Submitted 18 May, 2022; v1 submitted 17 May, 2022; originally announced May 2022.

    Comments: 10 pages, 6 figures

    ACM Class: C.2.4; E.2; E.3

  35. arXiv:2204.08695  [pdf, other]

    cs.SE

    Automated Application Processing

    Authors: Eshita Sharma, Keshav Gupta, Lubaina Machinewala, Samaksh Dhingra, Shrey Tripathi, Shreyas V S, Sujit Kumar Chakrabarti

    Abstract: Recruitment in large organisations often involves interviewing a large number of candidates. The process is resource intensive and complex. Therefore, it is important to carry it out efficiently and effectively. Planning the selection process consists of several problems, each of which maps to one or the other well-known computing problem. Research that looks at each of these problems in isolation…

    Submitted 19 April, 2022; originally announced April 2022.

  36. arXiv:2204.07066  [pdf, other]

    cs.NE cs.LG eess.SP

    EvoSTS Forecasting: Evolutionary Sparse Time-Series Forecasting

    Authors: Ethan Jacob Moyer, Alisha Isabelle Augustin, Satvik Tripathi, Ansh Aashish Dholakia, Andy Nguyen, Isamu Mclean Isozaki, Daniel Schwartz, Edward Kim

    Abstract: In this work, we highlight our novel evolutionary sparse time-series forecasting algorithm, also known as EvoSTS. The algorithm attempts to evolutionarily prioritize the weights of a Long Short-Term Memory (LSTM) network that best minimize the reconstruction loss of a predicted signal using a learned sparse coded dictionary. In each generation of our evolutionary algorithm, a set number of children with th…

    Submitted 14 April, 2022; originally announced April 2022.

    Comments: 5 pages, 2 figures, 2 tables

  37. arXiv:2204.01918  [pdf, other]

    cs.CV

    Text Spotting Transformers

    Authors: Xiang Zhang, Yongwen Su, Subarna Tripathi, Zhuowen Tu

    Abstract: In this paper, we present TExt Spotting TRansformers (TESTR), a generic end-to-end text spotting framework using Transformers for text detection and recognition in the wild. TESTR builds upon a single encoder and dual decoders for the joint text-box control point regression and character recognition. Unlike most existing literature, our method is free from Region-of-Interest operations and heu…

    Submitted 4 April, 2022; originally announced April 2022.

    Comments: Accepted to CVPR 2022

  38. arXiv:2204.01696  [pdf, other]

    cs.CV cs.LG

    Joint Hand Motion and Interaction Hotspots Prediction from Egocentric Videos

    Authors: Shaowei Liu, Subarna Tripathi, Somdeb Majumdar, Xiaolong Wang

    Abstract: We propose to forecast future hand-object interactions given an egocentric video. Instead of predicting action labels or pixels, we directly predict the hand motion trajectory and the future contact points on the next active object (i.e., interaction hotspots). This relatively low-dimensional representation provides a concrete description of future interactions. To tackle this task, we first provi…

    Submitted 4 April, 2022; originally announced April 2022.

    Comments: CVPR 2022, Project page: https://stevenlsw.github.io/hoi-forecast

  39. arXiv:2203.13349  [pdf, other]

    cs.CV cs.AI

    Occluded Human Mesh Recovery

    Authors: Rawal Khirodkar, Shashank Tripathi, Kris Kitani

    Abstract: Top-down methods for monocular human mesh recovery have two stages: (1) detect human bounding boxes; (2) treat each bounding box as an independent single-human mesh recovery task. Unfortunately, the single-human assumption does not hold in images with multi-human occlusion and crowding. Consequently, top-down methods have difficulties in recovering accurate 3D human meshes under severe person-pers…

    Submitted 24 March, 2022; originally announced March 2022.

  40. arXiv:2203.10636  [pdf, other]

    cs.CV

    Transform your Smartphone into a DSLR Camera: Learning the ISP in the Wild

    Authors: Ardhendu Shekhar Tripathi, Martin Danelljan, Samarth Shukla, Radu Timofte, Luc Van Gool

    Abstract: We propose a trainable Image Signal Processing (ISP) framework that produces DSLR quality images given RAW images captured by a smartphone. To address the color misalignments between training image pairs, we employ a color-conditional ISP network and optimize a novel parametric color mapping between each input RAW and reference DSLR image. During inference, we predict the target color image by des…

    Submitted 12 July, 2022; v1 submitted 20 March, 2022; originally announced March 2022.

    Comments: Accepted at ECCV 2022

  41. arXiv:2202.10701  [pdf, other]

    cs.CV

    Bag of Visual Words (BoVW) with Deep Features -- Patch Classification Model for Limited Dataset of Breast Tumours

    Authors: Suvidha Tripathi, Satish Kumar Singh, Lee Hwee Kuan

    Abstract: Currently, computational complexity limits the training of Convolutional Neural Networks (CNNs) on high-resolution gigapixel images. Therefore, such images are divided into patches or tiles. Since these high-resolution patches encode discriminative information, CNNs are trained on them to perform patch-level predictions. However, the problem with patch-level predictio…

    Submitted 22 February, 2022; originally announced February 2022.

  42. Ensembling Handcrafted Features with Deep Features: An Analytical Study for Classification of Routine Colon Cancer Histopathological Nuclei Images

    Authors: Suvidha Tripathi, Satish Kumar Singh

    Abstract: The use of Deep Learning (DL) based methods on medical histopathology images has been one of the most sought-after solutions to classify, segment, and detect diseased biopsy samples. However, given the complex nature of medical datasets due to the presence of intra-class variability and heterogeneity, the use of complex DL models might not give the optimal performance up to the level which is sui…

    Submitted 22 February, 2022; originally announced February 2022.

    Journal ref: Multimedia Tools and Applications 79, 34931-34954 (2020)

  43. arXiv:2202.10691  [pdf, other]

    eess.IV cs.CV

    An Object Aware Hybrid U-Net for Breast Tumour Annotation

    Authors: Suvidha Tripathi, Satish Kumar Singh

    Abstract: In the clinical settings, during digital examination of histopathological slides, the pathologist annotate the slides by marking the rough boundary around the suspected tumour region. The marking or annotation is generally represented as a polygonal boundary that covers the extent of the tumour in the slide. These polygonal markings are difficult to imitate through CAD techniques since the tumour…

    Submitted 22 February, 2022; originally announced February 2022.

  44. Cell nuclei classification in histopathological images using hybrid OLConvNet

    Authors: Suvidha Tripathi, Satish Kumar Singh

    Abstract: Computer-aided histopathological image analysis for cancer detection is a major research challenge in the medical domain. Automatic detection and classification of nuclei for cancer diagnosis impose a lot of challenges in developing state of the art algorithms due to the heterogeneity of cell nuclei and data set variability. Recently, a multitude of classification algorithms has used complex deep…

    Submitted 21 February, 2022; originally announced February 2022.

    Journal ref: ACM Trans. Multimedia Comput. Commun. Appl., vol. 16, no. 1s, article 32, 22 pages, 2020 (ISSN 1551-6857; citation key 10.1145/3345318)

  45. arXiv:2112.09828  [pdf, other]

    cs.CV

    Exploiting Long-Term Dependencies for Generating Dynamic Scene Graphs

    Authors: Shengyu Feng, Subarna Tripathi, Hesham Mostafa, Marcel Nassar, Somdeb Majumdar

    Abstract: Dynamic scene graph generation from a video is challenging due to the temporal dynamics of the scene and the inherent temporal fluctuations of predictions. We hypothesize that capturing long-term temporal dependencies is the key to effective generation of dynamic scene graphs. We propose to learn the long-term dependencies in a video by capturing the object-level consistency and inter-object relat…

    Submitted 19 October, 2022; v1 submitted 17 December, 2021; originally announced December 2021.

    Comments: WACV 2023

  46. arXiv:2112.01479  [pdf, other]

    cs.CV

    Learning Spatial-Temporal Graphs for Active Speaker Detection

    Authors: Sourya Roy, Kyle Min, Subarna Tripathi, Tanaya Guha, Somdeb Majumdar

    Abstract: We address the problem of active speaker detection through a new framework, called SPELL, that learns long-range multimodal graphs to encode the inter-modal relationship between audio and visual data. We cast active speaker detection as a node classification task that is aware of longer-term dependencies. We first construct a graph from a video so that each node corresponds to one person. Nodes re…

    Submitted 3 December, 2021; v1 submitted 2 December, 2021; originally announced December 2021.

    Comments: 10 pages

  47. arXiv:2111.03039  [pdf, other]

    cs.CV

    Towards Panoptic 3D Parsing for Single Image in the Wild

    Authors: Sainan Liu, Vincent Nguyen, Yuan Gao, Subarna Tripathi, Zhuowen Tu

    Abstract: Performing single image holistic understanding and 3D reconstruction is a central task in computer vision. This paper presents an integrated system that performs dense scene labeling, object detection, instance segmentation, depth estimation, 3D shape reconstruction, and 3D layout estimation for indoor and outdoor scenes from a single RGB image. We name our system panoptic 3D parsing (Panoptic3D)…

    Submitted 29 November, 2021; v1 submitted 4 November, 2021; originally announced November 2021.

  48. arXiv:2111.01414  [pdf, other]

    cs.CL cs.AI

    A Review of Dialogue Systems: From Trained Monkeys to Stochastic Parrots

    Authors: Atharv Singh Patlan, Shiven Tripathi, Shubham Korde

    Abstract: In spoken dialogue systems, we aim to deploy artificial intelligence to build automated dialogue agents that can converse with humans. Dialogue systems are increasingly being designed to move beyond just imitating conversation and also improve from such interactions over time. In this survey, we present a broad overview of methods developed to build dialogue systems over the years. Different use c…

    Submitted 2 November, 2021; originally announced November 2021.

  49. arXiv:2110.09436  [pdf]

    cs.LG

    Early Diagnostic Prediction of Covid-19 using Gradient-Boosting Machine Model

    Authors: Satvik Tripathi

    Abstract: With the huge spike in the COVID-19 cases across the globe and reverse transcriptase-polymerase chain reaction (RT-PCR) test remains a key component for rapid and accurate detection of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). In recent months there has been an acute shortage of medical supplies in developing countries, especially a lack of RT-PCR testing resulting in delayed p…

    Submitted 18 October, 2021; v1 submitted 12 October, 2021; originally announced October 2021.

    Comments: Presented at the Drexel Society of Artificial Intelligence Research Conference, 2021 (arXiv:2110.05263)

    Report number: drexelai/2021/06

  50. A Context-aware Radio Resource Management in Heterogeneous Virtual RANs

    Authors: Sharda Tripathi, Corrado Puligheddu, Carla Fabiana Chiasserini, Federico Mungari

    Abstract: New-generation wireless networks are designed to support a wide range of services with diverse key performance indicators (KPIs) requirements. A fundamental component of such networks, and a pivotal factor to the fulfillment of the target KPIs, is the virtual radio access network (vRAN), which allows high flexibility on the control of the radio link. However, to fully exploit the potentiality of v…

    Submitted 23 September, 2021; v1 submitted 22 September, 2021; originally announced September 2021.

    Comments: Accepted for publication in IEEE Transactions on Cognitive Communications and Networking