Frontiers of Digital Transformation
Applications of the Real-World Data Circulation Paradigm

Editors

Kazuya Takeda
Institutes of Innovation for Future Society
Nagoya University
Nagoya, Aichi, Japan

Ichiro Ide
Mathematical and Data Science Center
Nagoya University
Nagoya, Japan

Victor Muhandiki
Institutes of Innovation for Future Society
Nagoya University
Nagoya, Japan

ISBN 978-981-15-1357-2 ISBN 978-981-15-1358-9 (eBook)


https://doi.org/10.1007/978-981-15-1358-9

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature
Singapore Pte Ltd. 2021
This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether
the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse
of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and
transmission or information storage and retrieval, electronic adaptation, computer software, or by similar
or dissimilar methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication
does not imply, even in the absence of a specific statement, that such names are exempt from the relevant
protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this book
are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or
the editors give a warranty, expressed or implied, with respect to the material contained herein or for any
errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional
claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Singapore Pte Ltd.
The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721,
Singapore
Preface

The speed of data transmission between computers surpassed that of human commu-
nications long ago, and has since expanded exponentially. As a result, the origin of
the majority of data has become non-human, mechanical, or natural sources; in fact,
humans are merely the source of a small part of the current data explosion. Such
expanding data transmission does not simply consist of single source and desti-
nation pairs, but actually circulates over a complex network connecting numerous
sources and destinations. We should note that such circulation is an important aspect
of the underlying systems. For example, in engineering, it is well known that a feedback
loop can stabilize a dynamic system. This fact implies that the torrential flow of data
circulation could be controlled through human intervention on even a small amount of
data. Based on this concept, in order to tame and control the massive
amount of data originating from non-human sources, we have been considering the
insertion of “acquisition,” “analysis,” and “implementation” processes in the flow of
data circulation.
Although this approach has the potential to provide many societal benefits, data
circulation has not typically been the target of academic research. Thus, in 2013, we
started a new degree program in this domain, namely, Real-World Data Circulation
(RWDC), gathering faculty and students from the Graduate Schools of Information
Science, Engineering, Medicine, and Economics in Nagoya University, Japan.
This book is the first volume of a series of publications summarizing the outcome
of the RWDC degree program, collecting the relevant chapters of graduate students’
dissertations from various research fields targeting various applications, as well as
lecture notes in relevant fields. Throughout the book, we present examples of real-
world data circulation and then illustrate the resulting creation of social value.

Nagoya, Japan
January 2021

Kazuya Takeda
Ichiro Ide
Victor Muhandiki
Contents

Introduction

Introduction to the Real-World Data Circulation Paradigm
Kazuya Takeda

Frontiers in Human Data Domain

A Study on Environmental Sound Modeling Based on Deep Learning
Tomoki Hayashi

A Study on Utilization of Prior Knowledge in Underdetermined Source Separation and Its
Application
Shogo Seki

A Study on Recognition of Students’ Multiple Mental States During Discussion Using
Multimodal Data
Shimeng Peng

Towards Practically Applicable Quantitative Information Flow Analysis
Bao Trung Chu

Frontiers in Mechanical Data Domain

Research on High-Performance High-Precision Elliptical Vibration Cutting
Hongjin Jung

A Study on Efficient Light Field Coding
Kohei Isechi

Point Cloud Compression for 3D LiDAR Sensor
Chenxi Tu

Integrated Planner for Autonomous Driving in Urban Environments Including Driving
Intention Estimation
Hatem Darweesh

Direct Numerical Simulation on Turbulent/Non-turbulent Interface in Compressible
Turbulent Boundary Layers
Xinxian Zhang

Frontiers in Social Data Domain

Efficient Text Autocompletion for Online Services
Sheng Hu

Coordination Analysis and Term Correction for Statutory Sentences Using Machine Learning
Takahiro Yamakoshi

Research of ICT Utilization for the Consideration of Townscapes
Mari Endo

Measuring Efficiency and Productivity of Japanese Manufacturing Industry Considering
Spatial Interdependence of Production Activities
Takahiro Tsukamoto
Contributors

Bao Trung Chu Super Zero Lab, Tokyo, Japan


Hatem Darweesh Nagoya University, Nagoya, Japan
Mari Endo Kinjo Gakuin University, Nagoya, Japan
Tomoki Hayashi Human Dataware Lab, Nagoya, Japan
Sheng Hu Hokkaido University, Sapporo, Japan
Kohei Isechi Nagoya University, Nagoya, Japan
Hongjin Jung Korean Institute of Machinery & Materials, Daejeon, South Korea
Shimeng Peng Nagoya University, Nagoya, Japan
Shogo Seki Nagoya University, Nagoya, Japan
Kazuya Takeda Nagoya University, Nagoya, Japan
Takahiro Tsukamoto Nagoya University, Nagoya, Japan
Chenxi Tu Huawei Technologies, Shenzhen, China
Takahiro Yamakoshi Nagoya University, Nagoya, Japan
Xinxian Zhang Beihang University, Beijing, China

Introduction
Introduction to the Real-World Data
Circulation Paradigm

Kazuya Takeda

1 Real-World Data Circulation (RWDC)

Essential social value is formed by widely sharing fundamental values such as convenience,
enjoyability, well-being, and affluence with other people. Such values are not simply
delivered from the creators of products and services to consumers; they are created through
interaction between creators’ ideas and consumers’ demands. Consumers’ demands are usually
not visible, and they grow and change as new products and services are used. Circulations
that connect users and creators, and thereby enable new products and services that closely
reflect users’ ever-changing or not-yet-formed demands, are the essential processes of
social value creation. The lack of attention to this circulation may be one of the reasons
for Japan’s decline in the global competitiveness ranking¹ from 1st position (1990) to
24th (2013) and 34th (2020).
We believe that a new research paradigm is needed to create such a circulation, for two
reasons: (1) creating new social value essentially implies a multi-disciplinary study
involving at least engineering (convenience), computer science (enjoyment), medicine
(health), and economics (abundance); and (2) connecting creators and consumers inevitably
requires three steps: sensing demands through measurement of the real world (data
acquisition), analyzing the data to understand those demands (data analysis), and, based on
hypotheses derived from that understanding, modifying or even newly creating
products/services (implementation). Needless to say, the implementation of new
products/services may affect customers’ behaviors, which are then measured again in the
first step (data acquisition). The new academic discipline, Real-World Data Circulation
(RWDC), studies this circulation as it arises in both academic and industrial applications
(Fig. 1).

Fig. 1 The RWDC process is indispensable for transforming the production process and/or
consumer services using digital technologies

¹ International Institute for Management Development, “World competitiveness ranking,”
https://worldcompetitiveness.imd.org/ [Accessed: Jan. 11, 2021].

K. Takeda (✉)
Nagoya University, 1 Furo-cho, Chikusa-ku, Nagoya 464-8601, Japan
e-mail: [email protected]

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021
K. Takeda et al. (eds.), Frontiers of Digital Transformation,
https://doi.org/10.1007/978-981-15-1358-9_1

Fostering Ph.D.-level experts in RWDC is crucially important for Japanese industry, where
less attention has been paid to system technologies than to production technologies. For
example, it is well known that although many parts inside Apple’s iPhone are “Made in
Japan,” very few apps are provided by Japanese companies. This industrial standing
contrasts clearly with that of the US and China.
Here, let me show one example of the RWDC skill set: a job description for a growth
engineer (more generally, a growth hacker) posted by Dropbox almost a decade ago. Dropbox
was an emerging startup at the time, and its hiring needs therefore foreshadowed the
current social demand for human resources.
A growth engineer would substantially contribute to Dropbox’s continued success. The
process is simple: measure everything to understand it, come up with new ideas, test the best
ones, launch the best performing, and repeat this all as quickly as possible.

Inspired by this job description, as well as by a common understanding of the importance
of RWDC, four graduate schools of Nagoya University, Japan, jointly submitted a unique
proposal in response to the call for the “Program for Leading Graduate Schools” by the
Ministry of Education, Culture, Sports, Science and Technology (MEXT) in 2012.
This book introduces the essence of RWDC as considered in the Ph.D. theses of 13 students
from various research fields and disciplines who joined the program between 2014 and 2021,
in order to showcase actual cases of RWDC.

2 Real-World Data Circulation in the Human, Machine, and Society Domains

This section gives an overview of the dissemination of the RWDC concept in each of the
human, machine, and social domains that are showcased in this book.

2.1 RWDC in the Human Data Domain

As examples of RWDC in the human data domain, four research topics are showcased
in Part II of this book. First, two topics on auditory scene analysis titled “A study on
environmental sound modeling based on deep learning” and “A study on utilization
of prior knowledge in underdetermined source separation and its application” are
introduced. Next, a topic on computer-aided education titled “A study on recognition
of students’ multiple mental states during discussion using multimodal data” is
introduced. Finally, a topic on information security titled “Towards practically
applicable quantitative information flow analysis” is introduced.

2.2 RWDC in the Machine Data Domain

As examples of RWDC in the machine data domain, five research topics are show-
cased in Part III of this book. First, a topic on material processing titled “Research
on high-performance high-precision elliptical vibration cutting” is introduced. Next,
two topics on coding of optical sensor data titled “A study on efficient light field
coding” and “Point cloud compression for 3D LiDAR sensor” are introduced. Thirdly,
a topic on route planning for autonomous driving titled “Integrated planner for
autonomous driving in urban environments including driving intention estimation”
is introduced. Finally, a topic on fluid dynamics analysis titled “Direct numerical
simulation on turbulent/non-turbulent interface in compressible turbulent boundary
layers” is introduced.

2.3 RWDC in the Social Data Domain

As examples of RWDC in the social data domain, four research topics are showcased
in Part IV of this book. First, two topics on correction of real-world text data titled
“Efficient text autocompletion for online services” and “Coordination analysis and
term correction for statutory sentences using machine learning” are introduced. Next,
a topic on cityscape analysis titled “Research of ICT utilization for the consideration
of townscapes” is introduced. Finally, a topic on industrial science titled “Measuring
efficiency and productivity of Japanese manufacturing industry considering spatial
interdependence of production activities” is introduced.

3 Human-Resource Development in the RWDC Domain

This section introduces the four pillars of human-resource development in the RWDC
domain as a degree program.

3.1 Ph.D. Research

Human resource development in the RWDC Ph.D. program rests on four pillars (Fig. 2). The
first pillar, of course, is Ph.D. research experience in one of the four graduate schools.
The second pillar is knowledge and skills in RWDC, which will be discussed in detail in
Sec. 3.2. The third pillar is global experience, such as student collaboration, visiting
research, and networking. Finally, the fourth pillar is industrial experience, including
startup experience. For most students, fulfilling all four requirements within a 5-year
period is not easy, since the latter three pillars must be fulfilled on top of the first
pillar, i.e., Ph.D. research. As in most Ph.D. programs in Japanese graduate schools, our
students are required to publish multiple journal papers. In other words, the RWDC Ph.D.
program requires achievements beyond those needed for the Ph.D. degree itself.

Fig. 2 The four pillars of the education in the RWDC program
[Figure labels: years M1, M2, D1, D2, D3; final evaluation; lectures and hands-on;
knowledge and skills; Ph.D. research; global experiences; industrial experiences]

In the RWDC Ph.D. program, acquisition, analysis, and implementation are not independent
procedures but form one general process as a whole. Whatever the topic of a student’s
Ph.D. thesis may be, the new findings should be connected to an RWDC and therefore must
have an impact on our society. In order to guarantee the value of the degree, a degree
committee is formed and asks the applicant to add a chapter to the Ph.D. thesis that
discusses the research achievement in terms of RWDC. Some experts from industry are also
invited to review that chapter.

3.2 Knowledge and Skills in RWDC

Dealing with real-world data has grown in importance over the past 20 years, and will
surely continue to do so in the future. We believe that our RWDC Ph.D. program was one of
the pioneering attempts to emphasize the importance of real-world data circulation. In
general, the RWDC consists of three steps. The first step is data acquisition, i.e.,
acquiring data from the real world. In a wider view, this process can be generalized as
“real to virtual information transformation.” There are various means of data acquisition,
such as physical sensing with IoT devices, SNS statistics, public open data, and private
management indicators. Students are asked to study the fundamental theories and skills of
data acquisition in their two chosen target domains through recommended lectures taught in
the four member graduate schools.
In the RWDC program, we categorize the target of RWDC into three domains:
human, machine, and society. Students are requested to select two out of the three
categories as primary and secondary domains. According to their target domains,
students can place emphasis on particular acquisition methods.
The second process is data analysis. Students start with learning fundamental
theories of statistics and signal and pattern information processing as well as machine
learning. Then, they study various cases of data analysis applications through lectures
taught by professors in the four member graduate schools. This lecture series, namely,
Real-World Data Circulation Systems I, connects theories and methods of data analysis
to various real problems in research and technology.
The third process of the RWDC is implementation. In order to interact with the real
world, we finally implement the analysis results in it, which initiates the next
circulation. This part is taught mainly by invited lecturers from industry.
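
The three processes described above (acquisition, analysis, implementation) can be sketched as a simple loop. This is only a minimal illustration under stated assumptions: the scalar "demand" and "product" values, the Gaussian measurement noise, the update step size, and all function names are hypothetical and are not taken from the RWDC program or from any chapter of this book.

```python
import random
import statistics


def acquire(demand, product, rng, n=50):
    """Data acquisition: noisy real-world measurements of unmet demand.

    Assumption: each measurement is the true gap plus Gaussian noise.
    """
    return [demand - product + rng.gauss(0.0, 1.0) for _ in range(n)]


def analyze(samples):
    """Data analysis: estimate the unmet demand from the measurements."""
    return statistics.mean(samples)


def implement(product, estimated_gap, step=0.5):
    """Implementation: modify the product according to the analysis result."""
    return product + step * estimated_gap


def circulate(cycles=20, seed=0):
    """Run the acquisition -> analysis -> implementation loop repeatedly."""
    rng = random.Random(seed)
    demand, product = 10.0, 0.0  # hypothetical starting values
    history = []
    for _ in range(cycles):
        samples = acquire(demand, product, rng)
        gap = analyze(samples)
        product = implement(product, gap)
        demand += 0.2  # demand keeps changing while the product is in use
        history.append((demand, product))
    return history


if __name__ == "__main__":
    for demand, product in circulate():
        print(f"demand={demand:.2f} product={product:.2f}")
```

Because the demand keeps drifting, the loop never reaches a final state: each implementation changes what the next acquisition step observes, which is the sense in which one circulation initiates the next.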

3.3 Global/Industrial Leadership

Unlike other Ph.D. programs, the RWDC Ph.D. program puts emphasis on skill development,
particularly the skills needed to become a global leader. It is not easy to enhance such
skills based solely on classroom lessons such as practicums, group work, and discussions.
The designed education style is, so to speak, “giving the students opportunities and
evaluating their experiences.” Thus, the program requires all students to experience
research activities in a global environment through visiting research of at least 2 months
(some students actually stay longer, such as 1 year) outside their home country. In
addition, students must participate in a 2-week summer school held in an Asian country
during their second year in the program. The summer schools have been held in Istanbul,
Turkey (2015); Hanoi, Vietnam (2016 and 2018); and Bangkok, Thailand (2017 and 2019).
Collaborating with students of Istanbul Technological University (ITU), Hanoi Institute of
Science and Technology (HIST), and Chulalongkorn University, our students designed cultural
lectures with hands-on activities, together with short projects involving group work.
We found that these global experiences brought a big change to our students, in the sense
that they eliminated the fear of communicating and collaborating in English with people
from different cultures and backgrounds. In fact, doing so is not necessarily difficult,
but the students had not been aware of how easy it is until they actually tried. Of
course, staying in the world’s top-level laboratories, even for a short period of time,
connected our students with the premier research community, which will be a lasting asset
for these young researchers.

3.4 Industrial Experience

The program also requires all students to take part in industrial internships for at
least 2 months. The experience of time and goal management in a corporate project
sometimes changed the students’ attitudes drastically.

4 Achievements of the RWDC Program

The research achievements obtained through the RWDC program will be evident in the
following chapters. We will see various data circulations built upon Ph.D. research in
different disciplines, spanning from material science to social economics. Each of these
studies has one or more associated circulations, and is therefore tightly connected to the
real world. These chapters will showcase the effectiveness of our systematic scheme of
education, which combines practical experience and deep research. This is obviously the
most important achievement of the program. In the future, the accumulated research
discussions in the following chapters will serve as the foundation of a new research
discipline, Real-World Data Circulation, fostering young leaders in that area.
Another symbolic achievement of the first 7 years is the young talents themselves. For
instance, we are proud that ten startups were launched from the program. As of summer
2020, they had raised more than ten million USD in funding and had created more than 100
job opportunities. Many students in the program are closely connected with those startups,
and they have experienced how research achievements can work as parts of the industrial
ecosystem once they are used in the context of RWDC. The importance of, and the social
demand for, our targeted talent, RWDC leaders, are still growing. In fact, such talent is
attracting more attention now than at the time we designed the program, as we can see from
the fact that data scientist still seems to be one of the most popular job titles among
employers: talents who can collect data, analyze them, and recommend tactics; who
understand statistics; who know which analysis model works on which problem; and who can
find the appropriate computer tools or write code for the task. On top of that, with an
understanding of the crucial importance of RWDC, as well as deep knowledge of a specific
discipline, our students can surely pioneer the next step needed for innovation toward our
future society.

5 Future Prospects of the Program: Beyond Digital Transformation

The concept of RWDC may not sound very new to some people. They may wonder
how it differs from the concept of Plan-Do-Check-Act (PDCA). To me, RWDC is
different from PDCA in its final goal. PDCA is a self-governing process toward a
given goal, while RWDC is an ever-continuing innovation process. PDCA, or feedback
control, is a nice idea if we wish to stabilize complex systems, while RWDC is a
fundamental policy of trying to do new things, where we wish to change as much as
possible.
The concept of Digital Transformation (DX) has recently become popular in Japan. Sometimes
it simply indicates the application of digital technologies such as IoT, big data, and AI
to improve the efficiency of an organization, particularly a company’s production or
service systems. But to me, it is obvious that the biggest advantage of using digital
technologies is enlarging and accelerating the transformation process, which does not take
the form of a single pipeline but is rather a circulation, or an interactive process,
between the technologies and human society. We are proud that the RWDC program has deeply
recognized the fact that transformation is the result of interaction, and that we have
continuously worked on educating leaders who share this concept.

Although the number of chapters in this volume, i.e., Ph.D. theses produced through the
RWDC program, is limited to 13, our continuing efforts will add further chapters in
succeeding volumes, which will keep contributing to building the new discipline of RWDC,
as well as to producing global leaders in industrial science.
Frontiers in Human Data Domain
A Study on Environmental Sound
Modeling Based on Deep Learning

Tomoki Hayashi

Abstract Recent improvements in machine learning techniques have opened new opportunities
to analyze every possible sound in real-world situations, namely, understanding
environmental sound. This is a challenging problem because the goal is to understand every
possible sound in a given environment, from the sound of glass breaking to the crying of
children. This chapter focuses on Sound Event Detection (SED), one of the most important
tasks in the field of understanding environmental sound, and addresses three problems that
affect the performance of monophonic, polyphonic, and anomalous SED. The first problem is
how to combine multi-modal signals to extend the range of detectable sound events into
human activities. The second is how to model the duration of sound events, one of their
essential characteristics, to improve polyphonic SED performance. The third is how to
model normal environments in the time domain to improve anomalous SED systems. This
chapter introduces how the proposed methods solve each problem and shows their
effectiveness in improving the performance of each SED task. Furthermore, discussions of
the relationship between each work and Real-World Data Circulation (RWDC) reveal what kind
of data circulation each work accomplishes.

1 Introduction

Humans encounter many kinds of sounds in daily life, such as speech, music, the
singing of birds, and keyboard typing. Over the past several decades, the main targets
of acoustic research have been speech and music, while other sounds have gener-
ally been treated as background noise. However, recent improvements in machine
learning techniques have opened new opportunities to analyze such sounds in detail,
namely, understanding environmental sound. Understanding environmental sound is
challenging because the goal is to understand every possible sound in a given environment,
from the sound of glass breaking to the crying of children. To accelerate this research
field, several competitions have been held in recent years, including CLEAR AED [41],
TRECVID MED [34], and the DCASE Challenge [31]. Moreover, several datasets have been
developed, such as urban sound datasets [39] and AudioSet [13].

T. Hayashi (✉)
Human Dataware Lab. Co., Ltd, Nagoya, Japan
e-mail: [email protected]

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021
K. Takeda et al. (eds.), Frontiers of Digital Transformation,
https://doi.org/10.1007/978-981-15-1358-9_2

One of the essential tasks in this field is Sound Event Detection (SED), which is the task
of detecting the beginning and the end of sound events and identifying their labels. SED
has many applications, such as retrieval from multimedia databases [48], life-logging
[36, 42], automatic control of devices in smart homes [45], and audio-based surveillance
systems [6, 7, 44]. The SED task can be divided into three types: monophonic, polyphonic,
and anomalous SED. An overview of each SED task is shown in Fig. 1. An acoustic feature
vector (e.g., log Mel filterbanks or Mel-frequency cepstral coefficients) is extracted
from the input audio clip, which is typically several minutes long. Then, the SED model
estimates the beginning and the end of sound events and identifies their labels from the
given feature vector. Whether the target sound events are predefined depends on the SED
type. In the monophonic SED case, multiple sound events cannot appear at the same time. In
the polyphonic SED case, on the other hand, any number of sound events can overlap at the
same time. In the anomalous SED case, no prior information on the target sound events is
given; therefore, the system tries to detect novel or anomalous sound events that do not
appear in the training data. Sound events include a wide range of phenomena that vary
widely in acoustic characteristics, duration, and volume, such as the sound of glass
breaking, typing on a keyboard, knocking on doors, and human speech. This diversity of
targets makes SED challenging. Although recent advances in machine learning techniques
have improved the performance of SED systems, various problems remain to be solved.
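
To make the task definition above concrete, the following sketch converts frame-wise class scores into sound events with a beginning, an end, and a label. It is a hedged illustration only: the fixed threshold, the frame hop, the labels, and the function names are assumptions chosen for the example, and the code does not reproduce the models proposed in this chapter.

```python
def binarize(posteriors, threshold=0.5):
    """Turn frame-wise class scores into a binary activity matrix.

    Assumption: a fixed threshold is enough for illustration.
    """
    return [[1 if p >= threshold else 0 for p in frame] for frame in posteriors]


def activity_to_events(activity, labels, hop_sec):
    """Convert a (frames x classes) activity matrix into (label, onset, offset) events."""
    events = []
    for k, label in enumerate(labels):
        onset = None
        for t, frame in enumerate(activity):
            active = bool(frame[k])
            if active and onset is None:
                onset = t  # the event begins at this frame
            elif not active and onset is not None:
                events.append((label, onset * hop_sec, t * hop_sec))
                onset = None
        if onset is not None:  # the event is still active at the end of the clip
            events.append((label, onset * hop_sec, len(activity) * hop_sec))
    return sorted(events, key=lambda e: (e[1], e[0]))


def is_monophonic(activity):
    """True if at most one class is active in every frame."""
    return all(sum(1 for v in frame if v) <= 1 for frame in activity)


if __name__ == "__main__":
    labels = ["speech", "door"]  # hypothetical event classes
    posteriors = [[0.1, 0.0], [0.9, 0.2], [0.8, 0.7],
                  [0.6, 0.9], [0.2, 0.8], [0.1, 0.1]]
    activity = binarize(posteriors)
    print(activity_to_events(activity, labels, hop_sec=0.5))
    print("monophonic:", is_monophonic(activity))
```

In this example the two events overlap in time, so the activity matrix is polyphonic; a monophonic system would be constrained to at most one active class per frame.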
This study addresses three problems that affect the performance of monophonic,
polyphonic, and anomalous SED systems. The first problem is how to combine multi-
modal signals to extend the range of detectable sound events into human activities.
The second is how to model the duration of sound events, one of their essential
characteristics, to improve polyphonic SED performance. The third is how to model
normal environments in the time domain to improve anomalous SED systems.
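
The anomalous SED setting, in which only normal data are available for training, can be illustrated with a very simple novelty detector. Note that this is a stand-in: a mean-and-standard-deviation threshold on a per-frame score replaces the chapter's actual model of normal environments, which is not described in this part of the text. The score values, the factor `k`, and the function names are hypothetical.

```python
import statistics


def fit_normal_model(normal_scores):
    """Summarize per-frame scores observed under normal conditions only."""
    return statistics.mean(normal_scores), statistics.pstdev(normal_scores)


def detect_anomalies(scores, model, k=3.0):
    """Flag frames whose score deviates from the normal mean by more than k std."""
    mean, std = model
    return [abs(s - mean) > k * std for s in scores]


if __name__ == "__main__":
    normal = [1.0, 1.1, 0.9, 1.05, 0.95]  # hypothetical scores from normal recordings
    model = fit_normal_model(normal)
    print(detect_anomalies([1.0, 1.1, 3.0, 0.95], model))
```

The design point carried over from the text is that no anomalous examples are needed at training time: the detector only learns what "normal" looks like and reports whatever falls outside it.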
First, toward the development of a life-logging system, this study focuses on the use of
multi-modal signals recorded under realistic conditions [16, 19]. In this way, sound events
related to typical human activities can be detected, including discrete sound events
such as a door closing as well as sounds related to more extended human activities such as
cooking. The key to realizing this application is finding associations between different
types of signals that facilitate the detection of various human activities. To address
this issue, a large database of human activities recorded under realistic conditions
was created. The database consists of over 1,400 h of data, including the outdoor and
indoor activities of 19 subjects under practical conditions. Two Deep Neural Network
(DNN)-based fusion methods using multi-modal signals are proposed to detect various
human activities. Furthermore, speaker adaptation techniques from Automatic
Speech Recognition (ASR) [32] are introduced to address the subject individuality
problem, which degrades the detection performance. Experimental results using the
constructed database demonstrate that the use of multi-modal signals is effective
and that speaker adaptation techniques can improve performance, especially when
using only a limited amount of training data.

A Study on Environmental Sound Modeling Based on Deep Learning 15

Fig. 1 Overview of the sound event detection task. First, the system extracts a feature vector from
the input audio clip. Then, the SED model estimates the beginning and the end of sound events
and their labels from the feature vector. Here, the target types of sound events are predefined. In
the monophonic case, sound events cannot overlap. Meanwhile, in the polyphonic case, any
number of sound events can overlap. In the anomalous case, no prior information on the
target sound events is given; therefore, the system detects novel or anomalous sound events
that do not appear in the training data
16 T. Hayashi

Second, this study focuses on modeling the duration of sound events to improve the
performance of polyphonic SED systems [17]. Duration is one of the most important
characteristics of sound events, but conventional methods have not yet modeled it
explicitly [21, 24]. To address this issue, a novel hybrid approach using duration-controlled
Long Short-Term Memory (LSTM) [14, 22] is proposed. The proposed
model consists of two components: a Bidirectional LSTM recurrent neural network
(BLSTM), which performs frame-by-frame detection, and a Hidden Markov Model
(HMM) or a Hidden Semi-Markov Model (HSMM) [49] that models the duration of
each sound event. The proposed approach makes it possible to model the duration of
each sound event precisely and to perform sequence-by-sequence detection without
thresholding. Furthermore, to effectively reduce insertion errors, a post-processing
method using binary masks is also introduced. This post-processing step
uses a Sound Activity Detection (SAD) network to identify segments in which
any sound event is active. Experimental evaluation with the DCASE2016 task2
dataset [31] demonstrates that the proposed method outperforms conventional polyphonic
SED methods and can effectively model sound event duration for polyphonic
SED.
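The idea of replacing frame-wise thresholding with sequence decoding can be illustrated with a small sketch. The Python code below is not the proposed BLSTM-HMM/HSMM system: it assumes that frame-wise posteriors for a single sound event are already available, uses them directly as emission scores (a simplification), and decodes them with a two-state HMM whose self-transition probability `p_stay` is a made-up stand-in for a learned duration model. The masking helper only mimics the role of a SAD mask; all names and values are hypothetical.

```python
import math
from typing import List


def viterbi_binary(posteriors: List[float], p_stay: float = 0.85) -> List[int]:
    """Most likely inactive(0)/active(1) state path for one sound event.

    posteriors[t] is the frame-wise probability that the event is active.
    A larger p_stay penalises state changes more, which favours longer segments.
    """
    if not posteriors:
        return []
    eps = 1e-10  # avoids log(0)
    log_stay, log_switch = math.log(p_stay), math.log(1.0 - p_stay)

    def emit(t: int, state: int) -> float:
        p = posteriors[t] if state == 1 else 1.0 - posteriors[t]
        return math.log(max(p, eps))

    # Uniform initial state prior; its constant term is dropped.
    score = [emit(0, 0), emit(0, 1)]
    back: List[List[int]] = []
    for t in range(1, len(posteriors)):
        new_score, pointers = [], []
        for s in (0, 1):
            cands = [score[r] + (log_stay if r == s else log_switch) for r in (0, 1)]
            prev = 0 if cands[0] >= cands[1] else 1
            pointers.append(prev)
            new_score.append(cands[prev] + emit(t, s))
        score = new_score
        back.append(pointers)

    # Backtrack from the best final state.
    state = 0 if score[0] >= score[1] else 1
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]


def apply_sad_mask(path: List[int], sad_mask: List[int]) -> List[int]:
    """Keep an event frame only where the (assumed) SAD mask reports activity."""
    return [a & m for a, m in zip(path, sad_mask)]


posteriors = [0.05, 0.2, 0.9, 0.4, 0.9, 0.8, 0.1, 0.05]  # made-up network outputs
thresholded = [int(p > 0.5) for p in posteriors]
decoded = viterbi_binary(posteriors, p_stay=0.85)
masked = apply_sad_mask(decoded, [0, 0, 1, 1, 1, 1, 1, 0])
```

With these toy values, thresholding at 0.5 splits the event at the frame whose posterior dips to 0.4, whereas the decoded path keeps a single continuous segment. An HSMM would go further by modeling the distribution of segment lengths explicitly, which this two-state sketch does not do.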
Third, this study focuses on modeling the normal acoustic environment to improve
anomalous SED systems [18, 25]. In conventional approaches [29, 38], the modeling
is performed in the acoustic feature domain. However, this results in a loss of
information about temporal structure, such as the phase of the sounds. To address
this issue, a new anomalous SED method based on WaveNet [47] is proposed.
WaveNet is an autoregressive convolutional neural network that directly models
acoustic signals in the time domain, which enables us to model detailed temporal
structures such as the phase of waveform signals. The proposed method uses WaveNet as
a predictor rather than a generator and detects waveform segments that cause
large prediction errors as unknown acoustic patterns. Furthermore, an i-vector [8] is
used as an auxiliary feature of WaveNet to account for differences in environmental
situations. The i-vector extractor should allow the system to discriminate
sound patterns depending on the time, location, and surrounding environment.
Experimental evaluation with a database of sounds recorded in public spaces shows
that the proposed method outperforms conventional feature-based approaches and
that time-domain modeling in conjunction with the i-vector extractor is effective for
anomalous SED.
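The predictor-based detection principle can be sketched without a neural network. In the Python example below, a fixed linear extrapolation stands in for WaveNet (an explicit substitution, chosen only to keep the sketch self-contained), and the threshold rule (mean plus three standard deviations of the errors on normal data) is an assumption rather than the rule used in the study. The signals are synthetic.

```python
import math
from statistics import mean, pstdev
from typing import Callable, List, Sequence


def frame_errors(signal: Sequence[float],
                 predict: Callable[[Sequence[float]], float],
                 order: int,
                 frame_len: int) -> List[float]:
    """Mean squared one-step prediction error for each non-overlapping frame."""
    squared = [(signal[t] - predict(signal[t - order:t])) ** 2
               for t in range(order, len(signal))]
    return [mean(squared[i:i + frame_len])
            for i in range(0, len(squared) - frame_len + 1, frame_len)]


def linear_extrapolation(history: Sequence[float]) -> float:
    """Toy stand-in predictor: extend the line through the last two samples."""
    return 2 * history[-1] - history[-2]


# "Normal" environment: a smooth sinusoid (synthetic data).
normal = [math.sin(0.1 * t) for t in range(400)]
# Test signal: the same sinusoid with one click added at sample 260.
test = list(normal)
test[260] += 1.0

train_err = frame_errors(normal, linear_extrapolation, order=2, frame_len=50)
threshold = mean(train_err) + 3 * pstdev(train_err)  # hypothetical threshold rule
test_err = frame_errors(test, linear_extrapolation, order=2, frame_len=50)
anomalous_frames = [i for i, e in enumerate(test_err) if e > threshold]
```

Frames whose prediction error exceeds the threshold are flagged as containing unknown acoustic patterns. Conditioning the predictor on an auxiliary environment feature, as the i-vector does in the proposed method, is outside the scope of this sketch.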
In the following sections, the relationship between each work and Real-World
Data Circulation (RWDC) is discussed. Section 2 explains how the developed
tools for human activity recognition can foster RWDC. Section 3 describes how
data analysis during research on polyphonic SED inspires a new method for SED
based on duration modeling. Section 4 explains how the output from the research
on anomalous SED can be used to improve the performance of polyphonic SED
systems, demonstrating the application of discovered knowledge to another process.
Finally, this chapter is summarized in Sect. 5.

2 Human Activity Recognition with Multi-modal Signals

The goal of this work is to develop a method that promotes the cycle shown
in Fig. 2. This cycle assumes that intellectual and physical activities result in
experiences, such as the discovery of new knowledge or a sense of accomplishment.
These experiences enhance people’s abilities, expanding the range of activities open
to them. Continuously repeating this cycle makes it possible for us to develop
ourselves and improve our quality of life. To help people keep repeating
this cycle, it is necessary to monitor them and to understand their activities and
experiences. To achieve this, a life-logging system was developed, which automatically
records signals and recognizes human activities, from simple movements such as
walking to complex tasks such as cooking. An overview of the target life-logging
system is shown in Fig. 3. Users wear a smartphone that continuously records
environmental sound and acceleration signals, and the signals are sent to the
server. The server receives the signals and recognizes the subject’s current activity
using the proposed human activity recognition model. Finally, the results are sent
to the subject’s smartphone. The subjects can not only view their activity history but
also send feedback to improve recognition performance. Furthermore, the system
can recommend activities based on their history.
Two points are important in developing such applications: the usability of
the system and the range of recognizable activities under realistic conditions. To
address the first point, a smartphone-based recording system was developed. Because the
smartphone-based system does not require attaching a large number of sensors, it is easy
to use and places little burden on the user. To address the second point, a large database
of human activity consisting of multi-modal signals recorded under realistic conditions
was created. Then, two DNN-based fusion methods that use multi-modal signals were
developed to recognize complex human activities. Furthermore, speaker adaptation techniques
were introduced to address the problem of subject individuality, which degrades
system performance when a model constructed for a particular subject is used
to classify the activities of another subject. Experimental results with the constructed
database demonstrated that using multi-modal signals is effective and that the speaker
adaptation techniques can improve performance, especially when using only a
limited amount of training data.
Finally, a human activity visualization tool, shown in Fig. 4, was developed by
integrating the above systems. In Fig. 4, the right side shows the results of activity
recognition, the left side displays the recorded signals, and the center shows
the monitored video and the geographic location of the smartphone user. The system
can collect and analyze individual data and then feed the results back to improve the
activity recognition performance. Furthermore, it can provide recommendations based
on the analyzed data that encourage users to be more active. Thus, the
developed systems enable us to promote not only RWDC, i.e., the cycle of data
acquisition, data analysis, and implementation, but also the loop in Fig. 2, which
improves our quality of life.

Fig. 2 Overview of the target cycle. Activities yield experiences, such as the discovery of new
knowledge or a sense of accomplishment. These experiences enhance people’s abilities, expanding
the range of activities that can be attempted

Fig. 3 Overview of the life-logging system. The system uses a smartphone to record environmental
sound and acceleration signals continuously. The server receives the signals and recognizes the
subject’s current activity by the proposed human activity recognition model. Finally, the results are
sent to the subject’s smartphone. The subjects can not only view their activity history but also send
feedback to improve recognition performance

Fig. 4 Developed human activity visualization tool. On the right, the results of activity recognition
are shown. On the left, the recorded signals are displayed. A monitored video is shown at the top
center, and the location of the subject with the smartphone is indicated on the map in the center

3 Polyphonic SED Based on Duration Modeling

To realize practical applications, we need to analyze the characteristics of acquired
data in detail and to develop a method based on these characteristics. In SED, sound
events have various characteristics, and one of the most important
is duration, which represents how long a sound event lasts.
Histograms of sound event durations are shown in Fig. 5. The horizontal axis represents
the number of frames, i.e., the duration, and the vertical axis represents the number
of occurrences. A larger number of frames means a longer
sound event. The figure shows that each sound event has a different duration distribution,
and this information should help to improve detection performance. However, conventional
methods did not utilize this important information explicitly.
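A duration histogram of this kind can be computed from frame-wise annotations with a few lines of Python. The sketch below is illustrative only: the annotation values are made up, and the bin width and function names are assumptions.

```python
from collections import Counter
from itertools import groupby
from typing import Dict, List


def run_durations(frames: List[int]) -> List[int]:
    """Length in frames of every consecutive run of active (1) frames."""
    return [len(list(run)) for active, run in groupby(frames) if active]


def duration_histogram(durations: List[int], bin_width: int) -> Dict[int, int]:
    """Count durations per bin; each key is the lower bin edge in frames."""
    counts = Counter((d // bin_width) * bin_width for d in durations)
    return dict(sorted(counts.items()))


# Frame-wise annotation of one sound event class (made-up data).
frames = [0, 1, 1, 0, 1, 1, 1, 1, 0, 0, 1, 1, 1, 0]
durations = run_durations(frames)
hist = duration_histogram(durations, bin_width=2)
```

Computing one such histogram per sound event class gives the per-class duration statistics that a duration model could be fitted to.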

Fig. 5 Histograms of the durations of three different sound events. The horizontal axis (0 to 150)
represents the number of frames, i.e., the duration; the larger the value, the longer the sound
event. The vertical axis (0 to 15,000) represents the number of occurrences
The Project Gutenberg eBook of Essays on
things
This ebook is for the use of anyone anywhere in the United
States and most other parts of the world at no cost and with
almost no restrictions whatsoever. You may copy it, give it away
or re-use it under the terms of the Project Gutenberg License
included with this ebook or online at www.gutenberg.org. If you
are not located in the United States, you will have to check the
laws of the country where you are located before using this
eBook.

Title: Essays on things

Author: William Lyon Phelps

Release date: December 13, 2023 [eBook #72395]

Language: English

Original publication: New York: The Macmillan Company, 1930

Credits: Aaron Adrignola, Tim Lindell and the Online Distributed
Proofreading Team at https://www.pgdp.net (This book
was produced from images made available by the
HathiTrust Digital Library.)

*** START OF THE PROJECT GUTENBERG EBOOK ESSAYS ON THINGS ***
Transcriber’s Note
New original cover art included with this eBook is granted
to the public domain. It uses an image of the Title Page of the
original book.
Additional notes will be found near the end of this ebook.
ESSAYS ON THINGS
By WILLIAM LYON PHELPS
Essays on Modern Novelists
Essays on Russian Novelists
Essays on Books
Essays on Modern Dramatists
Essays on Things
Howells, James, Bryant and Other Essays
Reading the Bible
Teaching in School and College
Some Makers of American Literature
The Advance of the English Novel
The Advance of English Poetry
The Beginnings of the English Romantic Movement
Human Nature in the Bible
Human Nature and the Gospel
Adventures and Confessions
As I Like It, First, Second, Third Series
Archibald Marshall
Happiness
Love
Memory
Music
A Dash at the Pole
Browning—How to Know Him
ESSAYS ON THINGS
By WILLIAM LYON PHELPS
NEW YORK
THE MACMILLAN COMPANY
1930
Copyright, 1930,
By THE MACMILLAN COMPANY.
All rights reserved—no part of this
book may be reproduced in any form
without permission in writing from
the publisher.
Set up and printed. Published September, 1930.
· PRINTED IN THE UNITED STATES OF AMERICA ·
CONTENTS
Sunrise
Molasses
Resolutions When I Come to Be Old
English and American Humour
A Pair of Socks
An Inspiring Cemetery
Ancient Football
Rivers
One Day at a Time
City and Country
Age Before Beauty
Church Unity
Political History
A Room Without a View
Tea
The Weather
War
Man and Boy
Ambition
Birds and Statesmen
Russia Before the Revolution
The Devil
The Forsyte Saga
Profession and Practice
London as a Summer Resort
What the Man Will Wear
Dreams
Eating Breakfast
The Mother Tongue
Our South as Cure for Flu
Going to Church in Paris
Optimism and Pessimism
Translations
Music of the Spheres
Dog Books
Going to Honolulu
Hymns
Old-Fashioned Snobs
A Fair City
Traditions
Spooks
Trial by Jury
Athletics
A Private Library All Your Own
The Greatest Common Divisor
The Great American Game
Ten Sixty-Six
Going Abroad the First Time
Spiritual Healing
Superstition
The Importance of the Earth
What Shall I Think About?
ESSAYS ON THINGS
I
SUNRISE

At an uncertain hour before dawn in February 1912, as I lay
asleep in my room on the top floor of a hotel in the town of Mentone,
in Southern France, I was suddenly awakened by the morning star. It
was shining with inquisitive splendour directly into my left eye. At that
quiet moment, in the last stages of the dying night, this star seemed
enormous. It hung out of the velvet sky so far that I thought it was
going to fall, and I went out on the balcony of my room to see it drop.
The air was windless and mild, and, instead of going back to bed, I
decided to stay on the balcony and watch the unfolding drama of the
dawn. For every clear dawn in this spectacular universe is a
magnificent drama, rising to a superb climax.
The morning stars sang together and I heard the sons of God
shouting for joy. The chief morning star, the one that had roused me
from slumber, recited a splendid prologue. Then, as the night paled
and the lesser stars withdrew, some of the minor characters in the
play began to appear and take their respective parts. The grey
background turned red, then gold. Long shafts of preliminary light
shot up from the eastern horizon, and then, when the stage was all
set, and the minor characters had completed their assigned rôles,
the curtains suddenly parted and the sun—the Daystar—the star of
the play, entered with all the panoply of majesty. And as I stood there
and beheld this incomparable spectacle, and gazed over the
mountains, the meadows and the sea, the words of Shakespeare
came into my mind:

Full many a glorious morning have I seen,
Flatter the mountain tops with sovereign eye.
Kissing with golden face the meadows green.
Gilding pale streams with heavenly alchemy.

It is a pity that more people do not see the sunrise. Many do not
get up early enough, many do not stay up late enough. Out of the
millions and millions of men, women and children on this globe only
a comparatively few see the sunrise, and I dare say there are many
respectable persons who have never seen it at all. One really should
not go through life without seeing the sun rise at least once,
because, even if one is fortunate enough to be received at last into
heaven, there is one sight wherein this vale of tears surpasses the
eternal home of the saints. “There is no night there,” hence there can
be no dawn, no sunrise; it is therefore better to make the most of it
while we can.
As a man feels refreshed after a night’s sleep and his morning
bath, so the sun seems to rise out of the water like a giant renewed.
Milton gave us an excellent description:

So sinks the daystar in the ocean bed,
And yet anon repairs his drooping head,
And tricks his beams, and with new-spangled ore
Flames in the forehead of the morning sky.

Browning, in his poem, Pippa Passes, compares the sunrise to a
glass of champagne, a sparkling wine overflowing the world:

DAY!
Faster and more fast,
O’er night’s brim, day boils at last:
Boils, pure gold, o’er the cloud-cup’s brim,
Where spurting and suppressed it lay,
For not a froth-flake touched the rim
Of yonder gap in the solid gray
Of the eastern cloud, an hour away;
But forth one wavelet, then another, curled.
Till the whole sunrise, not to be suppressed,
Rose, reddened, and its seething breast
Flickered in bounds, grew gold, then overflowed the world.

The sunset has a tranquil beauty but to me there is in it always a
tinge of sadness, of the sadness of farewell, of the approach of
darkness. This mood is expressed in the old hymn which in my
childhood I used to hear so often in church:

Fading, still fading, the last beam is shining,
Father in heaven! the day is declining.
Safety and innocence fly with the light,
Temptation and danger walk forth with the night.

Sorrow may endure for a night, but joy cometh in the morning,
saith the Holy Book. The sunrise has not only inexpressible majesty
and splendour, but it has the rapture of promise, the excitement of
beginning again. Yesterday has gone forever, the night is over and
we may start anew. To how many eyes, weary with wakefulness in
the long watches of the night, or flushed with fever, is the first
glimmer of the dawn welcome. The night makes every fear and
worry worse than the reality, it magnifies every trivial distress. Mark
Twain said the night brought madness—none of us is quite sane in
the darkness. That particular regret for yesterday or apprehension
for tomorrow that strikes you like a whiplash in the face at 2:45 a.m.
dwindles into an absurdity in the healthy dawn.
Mark Twain, who had expressed the difference between the
night and the morning tragically, also expressed it humorously. He
said that when he was lying awake in the middle of the night he felt
like an awful sinner, he hated himself with a horrible depression and
made innumerable good resolutions; but when at 7:30 he was
shaving himself he felt just as cheerful, healthy and unregenerate as
ever.
I am a child of the morning. I love the dawn and the sunrise.
When I was a child I saw the sunrise from the top of Whiteface and it
seemed to me that I not only saw beauty but heard celestial music.
Ever since reading in George Moore’s Evelyn Innes the nun’s
description of her feelings while listening to Wagner’s Prologue to
Lohengrin I myself never hear that lovely music rising to a
tremendous climax without seeing in imagination what was revealed
to the Sister of Mercy. I am on a mountain top before dawn; the
darkness gives way; the greyness strengthens, and finally my whole
mind and soul are filled with the increasing light.
II
MOLASSES

Before both the word molasses and the thing it signifies disappear
forever from the earth, I wish to recall its flavour and its importance to
the men and women of my generation. By any other name it would
taste as sweet; it is by no means yet extinct; but for many years maple
syrup and other commodities have taken its place on the breakfast
table. Yet I was brought up on molasses. Do you remember, in that
marvellous book, Helen’s Babies, when Toddie was asked what he had
in his pantspocket, his devastating reply to that tragic question? He
calmly answered, “Bread and molasses.”
Well, I was brought up on bread and molasses. Very often that was
all we had for supper. I well remember, in the sticky days of childhood,
being invited out to supper by my neighbour Arthur Greene. My table
manners were primitive and my shyness in formal company
overwhelming. When I was ushered into the Greene dining room not
only as the guest of honour but as the only guest, I felt like Fra Lippo
Lippi in the most august presence in the universe, only I lacked his
impudence to help me out.
The conditions of life in those days may be estimated from the fact
that the entire formal supper, even with “company,” consisted wholly
and only of bread, butter and molasses. Around the festive board sat
Mr. Greene, a terrifying adult who looked as if he had never been
young; Mrs. Greene, tight-lipped and serious; Arthur Greene, his sister
Alice, and his younger brother, Freddy. As I was company I was helped
first and given a fairly liberal supply of bread, which I unthinkingly (as
though I were used to such luxuries) spread with butter and then
covered with a thick layer of molasses. Ah, I was about to learn
something.
Mr. Greene turned to his eldest son, and enquired grimly, “Arthur,
which will you have, bread and butter or bread and molasses?”
The wretched Arthur, looking at my plate, and believing that his
father, in deference to the “company,” would not quite dare to enforce
what was evidently the regular evening choice, said, with what I
recognised as a pitiful attempt at careless assurance, “I’ll take both.”
“No, you don’t!” countered his father, with a tone as final as that of
a judge in court. His father was not to be bluffed by the presence of
company; he evidently regarded discipline as more important than
manners. The result was I felt like a voluptuary, being the only person
at the table who had the luxury of both butter and molasses. They
stuck in my throat; I feel them choking me still, after an interval of more
than fifty years.

* * * * *
The jug of molasses was on our table at home at every breakfast
and at every supper. The only variety lay in the fact (do you
remember?) that there were two distinct kinds of molasses—
sometimes we had one, sometimes the other. There was Porto Rico
molasses and there was New Orleans molasses—brunette and
blonde. The Porto Rico molasses was so dark it was almost black, and
New Orleans molasses was golden brown.
The worst meal of the three was invariably supper, and I imagine
this was fairly common among our neighbours. Breakfast was a hearty
repast, starting usually with oatmeal, immediately followed by
beefsteak and potatoes or mutton chops, sometimes ham and eggs;
but usually beef or chops. It had a glorious coda with griddle cakes or
waffles; and thus stuffed, we rose from the table like condors from their
prey, and began the day’s work. Dinner at one was a hearty meal, with
soup, roast, vegetables and pie.
Supper consisted of “remainders.” There was no relish in it, and I
remember that very often my mother, who never complained vocally,
looking at the unattractive spread with lack-lustre eye, would either
speak to our one servant or would disappear for a moment and return
with a cold potato, which it was clear she distinctly preferred to the
sickening sweetish “preserves” and cookies or to the bread and
molasses which I myself ate copiously.
However remiss and indifferent and selfish I may have been in my
conduct toward my mother—and what man does not suffer as he
thinks of this particular feature of the irrecoverable past?—it does me
good to remember that, after I came to man’s estate, I gave my mother
what it is clear she always and in vain longed for in earlier years, a
good substantial dinner at night.
At breakfast we never put cream and sugar on our porridge; we
always put molasses. Then, if griddle cakes followed the meat, we
once more had recourse to molasses. And as bread and molasses
was the backbone of the evening meal, you will see what I mean when
I say I swam to manhood through this viscous sea. In those days youth
was sweet.
The transfer of emphasis from breakfast to supper is the chief
distinguishing change in the procession of meals as it was and as it
became. It now seems incredible that I once ate large slabs of steak or
big chops at breakfast, but I certainly did. And supper, which
approached the vanishing point, turned into dinner in later years.
Many, many years ago we banished the molasses jug and even
the lighter and more patrician maple syrup ceased to flow at the
breakfast table. I am quite aware that innumerable persons still eat
griddle cakes or waffles and syrup at the first meal of the day. It is
supposed that the poet-artist Dante Gabriel Rossetti ruined his health
by eating huge portions of ham and eggs, followed by griddle cakes
and molasses, for breakfast. To me there has always been something
incongruous between syrup and coffee; they are mutually destructive;
one spoils the taste of the other.
Yet waffles and syrup are a delectable dish; and I am quite certain
that nectar and ambrosia made no better meal. What to do, then? The
answer is simple. Eat no griddle cakes, no waffles and no syrup at
breakfast; but use these commodities for dessert at lunch. Then comes
the full flavour.
Many taverns now have hit upon the excellent idea of serving only
two dishes for lunch or dinner—chicken and waffles. This obviates the
expense of waste, the worry of choice, the time lost in plans. And what
combination could possibly be better?
One of the happiest recollections of my childhood is the marvelous
hot, crisp waffle lying on my plate, and my increasing delight as I
watched the molasses filling each square cavity in turn. As the English
poet remarked, “I hate people who are not serious about their meals.”
