Review
Data Fusion in Earth Observation and the Role of Citizen as a
Sensor: A Scoping Review of Applications, Methods and
Future Trends
Aikaterini Karagiannopoulou * , Athanasia Tsertou , Georgios Tsimiklis and Angelos Amditis
Abstract: Recent advances in Earth Observation (EO) have placed Citizen Science (CS) in a prominent position, underlining its essential provision of information in every discipline that serves the SDGs and the 2050 climate neutrality targets. However, so far, none of the published literature reviews has investigated the models and tools that assimilate these data sources. To address this knowledge gap, we conducted this scoping systematic literature review (SSLR), aiming to highlight the benefits and the future directions that remain unexplored. Following the SSLR guidelines, a double, two-level hybrid screening process identified 66 articles that met the eligibility criteria. These articles present methods in which data were fused and evaluated with respect to performance, scalability level and computational efficiency. We then report on the EO data, their corresponding conversions, the digital tools for citizens’ participation, and the Data Fusion (DF) models that are predominantly exploited. Preliminary results showed a preference for multispectral satellite sensors, with microwave sensors used as a supplementary data source. Approaches such as the “brute-force approach” and super-resolution models indicate an effective way to overcome spatio-temporal gaps and the reliance, so far, on commercial satellite sensors. Passive crowdsensing observations are expected to gain a greater audience, as in most cases they are described as a low-cost and easily applicable solution, even during the unprecedented COVID-19 pandemic. Immersive platforms and decentralised systems should play a vital role in citizens’ engagement and training process. Regarding the DF models, the majority of the selected articles followed a data-driven method, with traditional algorithms still receiving significant attention. An exception is revealed in the smaller-scale studies, which showed a preference for deep learning models. Several studies enhanced their methods with active- and transfer-learning approaches, constructing a scalable model. In the end, we strongly support that interaction with citizens is of paramount importance to achieve a climate-neutral Earth.

Keywords: earth observation; citizen science; crowdsourcing; data fusion; data assimilation; data models; data curation; citizen engagement; scoping review

Citation: Karagiannopoulou, A.; Tsertou, A.; Tsimiklis, G.; Amditis, A. Data Fusion in Earth Observation and the Role of Citizen as a Sensor: A Scoping Review of Applications, Methods and Future Trends. Remote Sens. 2022, 14, 1263. https://doi.org/10.3390/rs14051263

Academic Editor: Gregory Giuliani
Received: 25 January 2022
Accepted: 1 March 2022
Published: 4 March 2022
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Copyright: © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

1. Introduction
During the last few decades, rapid and aggressive changes to the global climate have placed citizens in the spotlight as the main drivers of Climate Change (CC) [1]. Increased greenhouse gas (GHG) emissions, global temperature, and mean sea level rise are producing a domino effect at various levels. Indeed, the temperature rise, natural hazards of increasing intensity, and extreme food demand due to the population increase and the perturbation of natural resources [2] are already visible challenges. The earth population is expanding, expected to exceed 1.2 billion by 2100 [2,3], with 75% of citizens living in urban regions [4]. Under these conditions, cities are exposed to high concentrations of GHGs and to local events such as urban flash floods, intense droughts on land, long-standing forest fires [5], and extreme heatwaves generated by Urban Heat Islands (UHI) [6]. However, citizens are still predominantly living in urban areas [7], further aggravating the visible impacts of CC [8].
Stakeholders need to be able to make use of open access data at high spatio-temporal resolution to mobilise relevant authorities and strengthen the impact of resilience-building efforts [9]. Duro et al. [10] stated that economic costs and human losses could be enormous if we continue to face fragmented data sources and a lack of real-time observations. Recent technological advances in the two inter-related areas of Earth Observation (EO) and Citizen Science (CS) have shown great potential to address these challenges. The increased spatio-temporal resolution of satellite sensors could offer detailed observations even in remote regions, but such observations are still of limited use due to their high cost. In contrast, the sudden rise of citizen sensing can serve as a complementary observational stream that can eliminate the temporal gap of satellite observations, or as reference data supporting the analysis of remotely sensed images [11]. Providing such a geospatial perspective of aggregated observations received from satellite systems and near-surface volunteered geographic information (VGI) [12] is considered a promising solution to describe the complexity of the earth’s system. Rigorous data transformation and assimilation processes are being identified in order to generate accurate and “actionable” information that could assist decision-makers in strengthening their policies and support a climate-resilient strategy.
Addressing this knowledge gap, this scoping review aims to present the data fusion algorithms and methodological procedures explored thus far, focusing on demonstrable applications of this synthesis. Additional research questions related to the EO datasets and the crowdsourcing tools and platforms that are used are included in this review, as well as methods that aim to tackle the heterogeneous and noisy nature of measurements received from citizens. In this context, achievements, barriers and remaining challenges are also outlined. The remainder of this paper is organised as follows: Sections 2 and 3 present the background and the methodological framework, based on the principles of the systematic scoping review. Section 4 illustrates the research findings and addresses our research questions, while Sections 5 and 6 conclude the analysis, emphasising research gaps and future directions.
2. Background
2.1. Earth Observation and Citizen Science Data Proliferation: The “Footprint” of the Digital Age
Two significant landmarks of the 21st century can facilitate a better understanding of the world’s needs: the provision of global-scale, open-access satellite data, and Web 2.0, which has culminated in the rise of crowdsourcing and of citizens as sensors [13–15]. Recent solutions have placed EO in the highest position of the data landscape as a cost-efficient solution that could provide more accurate estimations of the future dynamics of the human-Earth system [16]. Within this frame, various international organisations such as the Committee on Earth Observation Satellites (CEOS), the Global Climate Observing System (GCOS), and the Group on Earth Observations (GEO) were established to design and further certify the scalable and interoperable nature of EO systems [17]. In 2016, the European Space Agency (ESA) initiated the Earth Observation for Sustainable Development (EO4SD) programme to explore the existing EO missions in numerous applications, such as agriculture and rural-urban development, water resource management, climate resilience and natural hazards reduction. With the combination of EO and ancillary data from static sensors, model simulation outputs, and other sources, we can now claim that we are in the most convenient position to monitor our planet accurately [18]. Nevertheless, even though such a tremendous amount of information surrounds us, it usually lacks semantic meaning. Traditional methods of visual inspection and photo interpretation are still used for the acquisition of reference data, and are therefore described among researchers as a bottleneck and an unsustainable way to extract meaningful outcomes from the heterogeneous, complex, and imperfect big EO data [19].
To overcome this obstacle, citizens have proven a meaningful addition to the environmental sciences, as they are capable of creating content in various ways, i.e., through image interpretation, collection of in situ data, and social media [20]. Mialhe et al. [21] declared
that the VGI derived from citizens and stakeholders could reveal rich and complex information about the local environment and its change through time, knowledge that is hard to gain from other data forms or from experts. Capitalising on the pioneering work of Goodchild [12,22], a significant number of initiatives and projects in the last two decades have incorporated VGI as a valuable source of information. Indicative examples are the EU-funded projects hackAIR [23], SCENT [24], ARGOMARINE [25], E2mC [26], and GROW Observatory [27]. The hackAIR project (www.hackair.eu; accessed on 28 February 2022) has developed an air quality data warehouse, where large communities of citizens can provide air quality measurements through easily deployed low-cost sensors. Subsequently, the air quality conditions are further expanded using a combination of official data and sky-depicting images. SCENT (https://scent-project.eu/; accessed on 28 February 2022) demonstrated a collaborative citizen engagement tool enabling the collection of land-use/land-cover change (LULCC) data. Semi-automated and machine learning classification methods were utilised to evaluate the collected observations and extract semantic descriptions of the ground-level images. An innovative contribution of CS to the improved monitoring of marine water quality is offered by the ARGOMARINE mobile application (http://www.argomarine.eu/; accessed on 28 February 2022), which allows citizens to report detected oil spills, assisting efficient and responsive mitigation actions. E2mC (http://www.e2mc-project.eu/; accessed on 28 February 2022) aims to expand the Emergency Management Service (EMS) of Copernicus, exploiting the contribution of social media data to the rapid evaluation of satellite-based map products and the subsequent reduction of the time and effort needed to produce reliable information. One of the fundamental CS projects is the GROW Observatory (https://growobservatory.org/; accessed on 28 February 2022), which demonstrated a complete “Citizen Observatory”, engaging a targeted audience of different stakeholders (i.e., smallholders, community groups, etc.) and creating a soil and LULC monitoring system.
Despite the broad applicability of volunteer crowdsourcing in several domains, it is still essential to frame the citizen’s role not only as a data collector but also as a contributor and collaborator in a citizen science project [28]. Many terms have appeared across the literature [15,20,29] to describe crowdsourced data based on the role of the citizen within the project, with “passive and active crowdsourcing” [30] found in the majority of studies. Earlier studies defined the role of the citizen in EO depending on the nature of the crowdsourcing task [31] and the technological equipment that was used, identifying two additional types of crowdsourcing: “Mobile Crowdsourcing” [32], where citizens act as moving sensors [33,34], and social media. The recent growth in the number of publications on social media testifies to the tendency of the research community to invest in passive (or opportunistic) crowdsourcing and in semi-autonomous or autonomous information extraction [35]. An example of opportunistic sensing is the rapidly increasing pools of geotagged photos, with almost 2 million public photographs uploaded to Flickr, and around 58 million images per day to Instagram [36]. Nevertheless, the inclusion of such vast volumes of EO and CS data generates significant difficulties in producing meaningful outcomes for decision-makers, related to the Big Data characteristics (i.e., the 5Vs: Volume, Variety, Velocity, Veracity and Value) [19].
schemes have been used to effectively describe data associations. Meng et al. [19] and Salcedo-Sanz et al. [18] examined the main building blocks of DF. In particular, these blocks comprise (i) the diverse data sources from different positions and moments, (ii) the operational processing that refines (or transforms) these data so that they can be ingested into the DF model, and (iii) the purpose, which defines the model that is implemented to gain improved information with fewer errors. In the second article, post-processing was mentioned as an additional methodological step that is applied to update the model’s outcomes and enhance their accuracy. More broadly, several methods have been designed in an attempt to unify the terminology in use and to better understand a system as complex as data fusion. However, a dedicated analysis of the DF models exceeds the scope of this study, and thus Figure 1 reports the most widespread architectures and their corresponding divisions.
Figure 1. A brief description of the main Data Fusion architecture schemas. Data from: Castanedo [40].
In Remote Sensing (RS), DF is usually defined by the level at which the EO data are combined, and is categorised into pixel, feature, and decision levels. The first, the raw/pixel level, refers to procedures that utilise different modalities in order to generate a new, enhanced modality. This includes applications of pan-sharpening, super-resolution and reconstruction of a 3D model. The feature DF level aims to augment the initial set of observations with linear or spatial transformations of the initial data. Finally, at the decision DF level, data represent a specific decision (e.g., an LC class, or the presence/absence of an event) and are combined with additional layers to increase the robustness of the prior decision. Schmitt and Zhu [41] presented a state-of-the-art (StoA) review of data fusion techniques and algorithms for Remote Sensing and claimed that, among the most sophisticated fusion tasks, the real challenge is to combine significantly heterogeneous data. The fusion of EO with crowdsourced data appears to be of great interest. Obvious benefits are the large magnitude of available data, the fact that CS offers timely and (near) real-time monitoring and analysis, as well as its availability in online, publicly accessible repositories [35]. There is a growing interest in applications that use CS data from social media, crowdsourced open access data repositories (e.g., OSM, GeoWiki, Zooniverse, etc.), geotagged shared photographs (e.g., Picasa, Flickr, etc.), web scrapers and many more [42,43]. Fritz et al. [20] listed a remarkable number of CS projects in different disciplines of RS, including air quality, collection of environmental data, natural hazards, land-use/land-cover, and humanitarian and crisis response. On the other hand, CS
data has been criticised due to its intensely noisy nature, thematic inconsistency, and lack of usability. Various studies have tried to solve these challenges and enhance the performance and credibility of CS data, using noise-tolerant classification algorithms [44], rule-based thresholds and methods that eliminate errors and biases in the training data [45], with significant results [46]. Citizens have thus proven a viable data-gathering resource, able to compensate for the limited quantity of training data, an issue that is often the cause of poor model performance [45].
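The three fusion levels described above (pixel, feature, decision) can be illustrated with a minimal Python sketch. It is a toy example under stated assumptions, not a method taken from the reviewed articles: the function names, the equal-weight blending and the majority vote are hypothetical choices made only to show where in the processing chain each level operates.

```python
from collections import Counter


def fuse_pixels(band_a, band_b, weight_a=0.5):
    """Pixel level: blend two co-registered rasters into one new raster.

    Assumption: a plain weighted average stands in for real techniques
    such as pan-sharpening or super-resolution.
    """
    return [
        [weight_a * a + (1.0 - weight_a) * b for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(band_a, band_b)
    ]


def fuse_features(eo_features, cs_features):
    """Feature level: extend each sample's EO feature vector with CS-derived features."""
    return [eo + cs for eo, cs in zip(eo_features, cs_features)]


def fuse_decisions(labels):
    """Decision level: majority vote over class labels from independent sources."""
    return Counter(labels).most_common(1)[0][0]


# Hypothetical inputs: two 1x2 rasters, one sample with two EO features and
# one citizen-derived feature, and three class labels for the same location.
fused_raster = fuse_pixels([[0.2, 0.4]], [[0.4, 0.8]])
stacked = fuse_features([[0.31, 0.55]], [[1.0]])
decision = fuse_decisions(["forest", "forest", "cropland"])
```

In this sketch the citizen observation enters either as an extra feature (feature level) or as one more vote on the final class (decision level), which mirrors the two ways CS data are most often combined with EO data in classification settings.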
3. Methodological Framework
In this analysis, a systematic scoping review is presented, following the steps of Daudt et al. [47] and Arksey and O’Malley [48], to explore the current state of knowledge regarding the pre-determined research questions [49,50]. Data fusion algorithms, data streams and tools are examined with a focus on research works that assimilated data from crowd and EO (satellite, aerial and in situ) sensors. An a priori research protocol was designed, based on the steps proposed by Arksey and O’Malley [48] and Peters et al. [51].
Table 1. Groups of selected keywords used in the four selected databases. The operator “AND” was used to combine the two sections.
A double, two-level screening method (abstract screening and full-text screening) was adopted, using an iterative review team and the systematic review software Swift-Review [53]. This software is built on the latent Dirichlet allocation (LDA) model, where a few training documents are manually labelled (i.e., as “relevant” or “non-relevant”) and are used to predict the conditional probability of each document being “relevant” [54]. As a result, the most “representative” documents get the highest scores and are presented at the top of the list. The double screening procedure consists of two stages: the first triages the publications into “Relevant/Non-Relevant” and thus performs the priority ranking, and the second evaluates the predicted ranking scores, setting the threshold of relevance. This method included screening the title and abstract of each publication, where a 50/50 ratio of training and test samples was randomly selected, ensuring the objectivity of the data sample. The model’s results were evaluated by investigating the quality of the ranking scores against the authors’ criterion of “Inclusion/Exclusion” (Inc/Exc), using the visual-inspection process, the confusion matrix post-processing analysis and the sensitivity evaluation metric, formulated in Equation (1). For the validation process, documents with a prioritisation ranking score of 0.6 or higher were denoted as “Relevant”. Subsequently, we obtained the proportions of true positive
(TP) and false positive (FP) predictions and computed the sensitivity score. The dataset was updated to include the validated “Relevant” articles and the remaining articles that were denoted as “included” during the first abstract screening process. Thereafter, the final dataset was confirmed through the full-text screening procedure, which verified the efficiency of the corpus selection. The selection procedure and the adoption of the selection criteria are presented in the section below.
$$\mathrm{Sensitivity} = \frac{TP}{TP + FP}, \qquad (1)$$
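A minimal Python sketch of this evaluation step is given here, following Equation (1) exactly as it is written in the text. Note that the ratio TP / (TP + FP) is what is more commonly called precision (positive predictive value), whereas sensitivity is usually defined with false negatives in the denominator; the sketch keeps the paper's naming. The scores and inclusion flags in the usage example are hypothetical, as is the function name.

```python
def sensitivity_eq1(scores, included, threshold=0.6):
    """Compute TP / (TP + FP) for documents predicted "Relevant".

    scores    -- prioritisation ranking score of each document
    included  -- True if the reviewers marked the document for inclusion
    threshold -- threshold of relevance (0.6 in the text)
    """
    # Keep the reviewers' decisions only for documents at or above the threshold.
    predicted_relevant = [inc for s, inc in zip(scores, included) if s >= threshold]
    if not predicted_relevant:
        raise ValueError("no document reaches the threshold of relevance")
    tp = sum(1 for inc in predicted_relevant if inc)   # predicted and included
    fp = len(predicted_relevant) - tp                  # predicted but excluded
    return tp / (tp + fp)


# Hypothetical example: five documents score >= 0.6, four of them were included.
example_scores = [0.92, 0.81, 0.75, 0.66, 0.61, 0.40]
example_included = [True, True, True, True, False, True]
score = sensitivity_eq1(example_scores, example_included)
```

With these made-up inputs, four of the five documents above the threshold were included by the reviewers, so the function returns 0.8.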
(Table 2), following Tobler’s rule [56], which associates the mapping scale with the spatial resolution of RS images. Where multiple EO datasets with different spatial resolutions were used, the category was determined by the dataset with the finest spatial resolution, taken as the minimum size of an area that can be detected, or by the authors’ pre-defined minimum mapping unit (MMU).
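The relation between spatial resolution and mapping scale can be sketched in a few lines of Python. The sketch assumes the commonly cited formulation of Tobler's rule, in which the map scale denominator equals the pixel resolution in metres multiplied by 2 and by 1000; the class boundaries of Table 2 are not reproduced, and the function names and the resolutions in the usage example are illustrative assumptions.

```python
def tobler_scale_denominator(resolution_m):
    """Map scale denominator for a given pixel resolution (metres).

    Assumed formulation: detectable size = 2 x resolution, and
    scale denominator = detectable size x 1000.
    """
    if resolution_m <= 0:
        raise ValueError("resolution must be positive")
    return resolution_m * 2 * 1000


def study_scale_denominator(resolutions_m):
    """Scale of a study that uses several EO datasets.

    As described in the text, the dataset with the finest (smallest)
    spatial resolution determines the category.
    """
    return tobler_scale_denominator(min(resolutions_m))


# Illustrative case: a study combining 30 m and 10 m imagery.
scale = study_scale_denominator([30, 10])  # corresponds to a 1:20,000 map
```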
Table 2. Associating the mapping scale capabilities of the proposed models in the selected literature,
with the spatial resolution of the Earth Observation images.
4. Results
4.1. General Overview of Process and Findings
The association of RS and CS data was initially identified in 2205 published articles, retrieved from the four electronic databases and covering the period from 1 January 2015 to 31 December 2020. The articles were reduced to 732 after the exclusion of grey literature and duplicate records. During the double abstract-screening process, 15 additional publications were found relevant, following the prioritisation modelling and the statistical evaluation. The ranking scores of the examined literature varied from 0.094 to 0.921, and approximately 90% of the “Relevant” documents were listed within the top 50% of the ranked documents. Relevant documents were identified in 70% of the top-ranked data (Figure 2). The statistical findings resulting from the prioritisation modelling were evaluated with a threshold of relevance (i.e., ToR = 0.6), using the confusion matrix method and the sensitivity evaluation metric [64]. Note that the ToR was empirically defined based on the authors’ observations on the selected literature. The sensitivity score of 80% was estimated using titles and abstracts only and was thus considered acceptable for further processing (Table 3).
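The ranking statistic quoted above (the share of “Relevant” documents found within a given top fraction of the ranked list) can be computed as in the following Python sketch. It is an illustration under assumptions: the function name is hypothetical, the ten documents in the usage example are made up, and it does not reproduce the internal ranking model of the review software.

```python
def recall_at_fraction(scores, relevant, fraction):
    """Share of relevant documents found in the top `fraction` of the ranking.

    scores   -- prioritisation score per document
    relevant -- True where reviewers judged the document relevant
    fraction -- portion of the ranked list that is screened (0..1)
    """
    total_relevant = sum(1 for r in relevant if r)
    if total_relevant == 0:
        raise ValueError("no relevant documents in the sample")
    # Rank documents from the highest to the lowest score.
    ranked = sorted(zip(scores, relevant), key=lambda pair: pair[0], reverse=True)
    # Number of documents screened, rounded to the nearest document.
    cutoff = round(len(ranked) * fraction)
    found = sum(1 for _, is_relevant in ranked[:cutoff] if is_relevant)
    return found / total_relevant


# Hypothetical sample of ten documents, five of which are relevant.
sample_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
sample_relevant = [True, True, False, True, True, False, False, False, True, False]
half = recall_at_fraction(sample_scores, sample_relevant, 0.5)
```

Evaluating this function over a range of fractions gives the points of a ranking performance curve; in the made-up sample, four of the five relevant documents fall in the top half of the list.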
Figure 2. Ranking performance curve of the prioritisation modelling, using a randomly selected 50/50 training-test split. The yellow line shows the baseline of the predicted performance. The green line denotes the performance based on the test dataset, while the blue line represents the training set.
Table 3. Confusion matrix based on visual inspection analysis, indicating that 80% of the predicted
publications, with prioritisation score of 0.6 or higher, are included in the final dataset.
During the full-text screening, different data fusion algorithms and processes were identified. Such processes have been organised by Bleiholder and Naumann [65] into two categories, i.e., data integration and data assimilation. The first refers to the notion of a unified information system, where data are stored and presented to the user in a unified view, while the second focuses not only on a common data interpretation but also on the generation of new real-world objects. In a data assimilation system, information from various sources is harmonised, based on a pre-designed three-level mapping schema (i.e., transform, remove duplicates, and make concise), and integrated into numerical models and algorithms in order to produce a new decision [66]. Considering the above, this scoping review focuses predominantly on articles describing data assimilation models that involve EO and CS data as well as any auxiliary geospatial information. Adhering to the latter criterion, a total of 66 scientific articles were finally selected as they met the eligibility criteria (Figure 3), and are briefly presented in Tables A1–A8 of Appendix A.
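The three-step harmonisation mentioned above (transform, remove duplicates, make concise) can be illustrated with a minimal Python sketch. All field names (`site_id`, `site`, `date`, `ndvi`, `label`), the records and both helper functions are hypothetical and are not taken from [65,66] or from the reviewed articles; the sketch only shows the order of the steps.

```python
def transform(record, field_map):
    """Step 1: rename source-specific fields to a common schema."""
    return {field_map.get(key, key): value for key, value in record.items()}


def harmonise(records, key_fields=("site", "date")):
    """Steps 2 and 3: collapse records that describe the same object.

    Records sharing the same key are treated as duplicates and merged into
    one concise record. Assumption: on conflicting values, the record that
    appears later in the list wins.
    """
    merged = {}
    for record in records:
        key = tuple(record[field] for field in key_fields)
        target = merged.setdefault(key, {})
        target.update({k: v for k, v in record.items() if v is not None})
    return list(merged.values())


# Hypothetical EO-derived and citizen-derived observations of the same site.
eo_record = transform(
    {"site_id": "A", "date": "2020-05-01", "ndvi": 0.61},
    {"site_id": "site"},
)
cs_record = {"site": "A", "date": "2020-05-01", "label": "maize"}
unified = harmonise([eo_record, cs_record, dict(cs_record)])
```

The merged record would then be the input to a numerical model or classifier, which is the part of the process that distinguishes data assimilation from plain integration in the sense used in the text.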
An analysis of the 66 selected articles provides a thorough view of their general characteristics. Particular attention is given to the peer-reviewed journals, the number of publications addressing the most common thematic categories described in Section 3.2, the number of articles published annually, and the mapping scale levels at which models were tested.
Figure 3. Modified Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA)
flowchart illustrating the different phases of the scoping review and the number of selected articles at
each stage of the process [67].
The majority of the selected articles (56%) were published in the nine peer-reviewed journals presented in Table 4. A total of 29 articles were published in journals that appeared only once and thus are aggregated under the “Other” category (44%). The majority of the aforementioned journals focus on environmental or climate change topics, whereas the journals ranked in the top four positions (i.e., Remote Sensing|n = 14, ISPRS Journal of Photogrammetry and Remote Sensing|n = 5, Remote Sensing of Environment|n = 4, IEEE Transactions on Geoscience and Remote Sensing|n = 3, and International Journal of Applied Earth Observation and Geoinformation|n = 3) are predominantly oriented towards RS applications.
Following Fritz et al.’s [20] findings, we examined the articles published between 2015 and 2020 (Figure 4a), and those related to the thematic categories presented in Section 3.2 (Figure 4b). A categorisation with respect to the economic status of the selected regions was performed, revealing two maxima in 2017 (n = 14) and 2019 (n = 14), with a decrease in the last year of the search period. However, the continuously high numbers of published articles indicate that the contribution of EO and CS data to decision-making and management processes continues to grow [68]. With respect to the results in Figure 4b, the greatest number
of articles refers to the categories of Vegetation monitoring (n = 19) and Land Use/Land Cover (n = 17), with a small difference between them, whereas Humanitarian and Crisis Response (n = 2) gained little attention. Finally, only one article is related to the Soil moisture domain, indicating that the fusion of EO and CS data for this specific application is still in its infancy [69].
Table 4. Most preferred peer-reviewed journals, according to the number of publications. For simplicity, the journals that appeared only once were grouped under the “Other” class.
Figure 4. (a) Number of published articles per examined year, (b) Number of publications for each
thematic domain, and (c) Percentage of selected papers per mapping scale, indicating the approximate
spatial resolution of the EO data that were exploited, and the replicability status of the proposed
methods as a corollary of the open-source or commercial data providers.
Considering the ratio of articles per mapping scale (Figure 4c), a greater preference is shown for studies with high and medium mapping scales (local|regional|global = 31|42|27%). More detailed findings regarding this research question are presented in Section 4.3.3. The primary conclusions are as follows. First, the tendency to exploit regional-scale data was an expected result, as data at finer resolution are predominantly not freely distributed. Additionally, citizens might feel more engaged with studies at familiar scales or with phenomena that directly affect them, showing a lower interest in global-scale applications. This basic assumption is also supported by Tobler’s first law of geography (TFL),
which implies a direct and strong relationship between things that are closely located [46]. Recent studies on the quality of Volunteered Geographic Information (VGI) indicate that volunteers tend to perform VGI tasks with greater success when the tasks concern areas close to their homes [70]. Yet, the continental or even global-scale effects of humanitarian or natural disasters are still unknown to the public, or presumably not evident in citizens’ everyday life.
A clustering visualisation network was used as an alternative to the word cloud method, as it was able to elicit more meaningful results from unknown patterns, containing keywords and phrases that were mentioned in combination in most documents. The clustering network graph was generated after several trials, evaluated by the authors, and is displayed in Figure 5. A total of 49 keywords are visualised in eight core clusters, coloured in red, green, blue, light blue, yellow, purple, orange and brown, extracting the main research domains of this paper. Each cluster is structured with closely located nodes and the linkages among them, where each node is described by its assigned colour, which indicates the cluster category, and by its relative size, which presents the “keyword’s importance” and the frequency of occurrence. Links between the terms indicate the strength of the term and its association with various terms. Therefore, nodes with higher “keyword importance” tend to have higher linkage strength, whereas the colour of the link is related only to the node’s class.
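The node and link quantities described above can be sketched in Python as follows. This is a simplified illustration, not the tool used for Figure 5: it only counts in how many documents each keyword occurs (node size) and how often two keywords occur together (link strength), and it leaves out the layout and clustering step. The function name and the three keyword lists are hypothetical.

```python
from collections import Counter
from itertools import combinations


def keyword_network(documents):
    """Build node frequencies and link strengths from per-document keywords.

    documents -- iterable of keyword lists, one list per document
    Returns (node_frequency, link_strength), where link keys are
    alphabetically ordered keyword pairs.
    """
    node_frequency = Counter()
    link_strength = Counter()
    for keywords in documents:
        unique = sorted(set(keywords))  # count each keyword once per document
        node_frequency.update(unique)
        link_strength.update(combinations(unique, 2))
    return node_frequency, link_strength


# Hypothetical keyword lists for three documents.
docs = [
    ["model", "image", "osm"],
    ["model", "image"],
    ["model", "citizen"],
]
nodes, links = keyword_network(docs)
```

In this toy input, "model" occurs in all three documents and is linked to every other keyword, which is the same pattern the text reports for the most frequent terms: nodes with higher frequency tend to accumulate more and stronger links.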
Figure 5. Clustering network map of the most frequently occurring keywords in the selected case studies. The size of the circles denotes the frequency of occurrence of the depicted terms. The distance between terms indicates the “level of strength” of relevance among the selected journals.
Viewing the results, we could conclude that the keywords with the highest frequencies were “classification/class” (61), “model” (59), “image” (42), “observation” (37), “open street map-OSM” (31), “phenology” (18), “volunteer” (16), “land cover product” (14), and “citizen” (14), revealing the two basic terms of our review, i.e., “Earth Observation” and “Crowdsource/Citizen Science”, and an increased interest in applications related to vegetation species and their phenological stages, as well as to land cover products in rural and urbanised environments. However, in some cases, the Lin-Log/Modularity method was
not able to identify similarities in common terms, and thus terms such as “classification” and “class”, or “open street map” and “osm”, appear as different nodes. To deal with this issue, both the keywords’ importance and the strengths of links were calculated and are presented in an aggregated form. Moreover, the highest number of links occurred for the same terms that presented the highest frequencies, which was expected. In particular, the keywords “model”, “classification”, and “image” revealed 42, 32, and 24 links, respectively, associating themselves with almost all terms in the graph. Therefore, we could assume that the majority of approaches addressed a supervised data-driven classification problem, where patterns among features are known and defined by the reference dataset, collected either through citizen science or by experts (scientists or national data). Investigating the clusters depicted in Figure 5, three major groups could be defined, on the left (red), right (green and blue) and upper side of the graph (purple), separated from each other by the largest distances. In these groups, keywords revealed an association with the most referenced thematic categories, i.e., Land Cover/Urban monitoring, Vegetation monitoring, and Climate change. This result is also in agreement with the results depicted in Figure 4b.
Figure 6. Treemap chart of studies (n = 66) according to the fields of applications. Different colours
reflect the seven thematic categories of the application. Rectangle areas are proportional to the
number of studies in the specific sub-category.
algorithms and 2 machine learning models (i.e., the random forest regression and the neural network) regarding their ability to identify the two phenological phases of SOS and EOS, as expressed through the Moderate Resolution Imaging Spectroradiometer (MODIS) Enhanced Vegetation Index (EVI) observations. In four articles (n = 4|[46,76–78]), similar approaches were applied, focusing on specific vegetation species and the identification of their biophysical characteristics (i.e., tree height, diameter at breast height, basal area, stem volume, etc.). In terms of physical models, the Spring Plant Phenology (SPP) models [71] combine meteorological observations (i.e., temperature and precipitation) and EO vegetation data [72] to predict the timing of leaf emergence.
Considering the two categories of crop classification (n = 5|[3,79–82]) and forest cover map (n = 3|[83–85]), most articles attempted to combine the already available LC data and provide updated maps identifying any uncovered areas [83]. Within this frame, multiple datasets were used, i.e., RS land cover datasets [3], regional or national maps in several countries [85], EO data at different spatial resolutions (e.g., commercial and open-source satellite and VGI data), and auxiliary data (e.g., FAO cropland statistics [84] and sub-national crop area statistics [82]). A noteworthy study is that of Baker et al. [79], which exploited the advantages of citizen science for the identification of urban green areas at a finer scale, namely domestic gardens. In this study, a participatory citizen campaign was organised along with a social media campaign to encourage citizens to participate and provide information related to garden spaces. One publication [86] exploited computer vision models (i.e., object detection) and Unmanned Aerial Vehicle (UAV) RGB images, proposing a disease diagnosis tool that could recognise changes in plants’ foliage (i.e., symptoms) at a sub-leaf level.
statistical methods, as well as probabilistic models (e.g., Weights of Evidence [102]) were
applied for the estimation of the flood susceptibility rate over an area. In these articles,
various factors (e.g., environmental and meteorological factors) and enablers, including
web search engines [100], photo-sharing platforms (i.e., Flickr site) [102] and participatory
campaigns [101] revealed their potential contribution to monitor flood occurrences, and
to generate up-to-date and accurate datasets of flooded areas. Furthermore, the study by
Ahmad et al. presented an automated flood detection system, called JORD [103], which is
the first system that collects, analyses, and combines data from multi-modal sources (i.e.,
text, images, videos from social media platforms) and associates them with RS-based data
in real-time, in order to give estimations of areas that were affected by a disaster event.
Panteras and Cervone [104] explored the level of significance of crowdsourced geotagged
photos in flood detection methods and thus gave an alternative in cases where the EO
data are not available. Finally, Olthof and Svacina [105] evaluated the efficiency of various
data sources for receiving information on a flood event, such as passive and active satellite
sensors, high-resolution DEM, traffic RGB cameras, and crowdsourced geotagged photos.
They also investigated associated methods, i.e., rule-based image thresholds, the coherent
change detection proposed by Chini et al. [106] and a one-dimensional flood simulation
algorithm, with the intention of providing accurate urban flood maps within the first 4 h of a
flood event. Subpixel mixing in optical sensors, the coarse spatial resolution of Sentinel-1
data and the occasional nature of citizens’ contributions appeared to be the main limitations of real-time
flood monitoring.
In addition to flooding events, three articles explored the potential
contributions of crowdsourced data in cases related to fire, earthquake, and nuclear accident
events. The first presents the benefits of using smartphone applications and
a large audience of volunteers to estimate the forest fuel loading in areas close to urban
environments (i.e., wildland-urban interface (WUI) areas) [107]. In general, forest fuels
are key structural components in wildfire risk monitoring, and therefore accurate
data collection is of paramount importance [108]. Furthermore, Frank et al. [109] examined
the contribution of citizen scientists to
rapid damage assessment after an earthquake event; in particular, they investigated the
noise resistance of two classification methods (i.e., object-based and pixel-based), and the
effect of using different labelling methodologies and crowdsourcing tools. Last, Hultquist
and Cervone [110] demonstrated, in the Safecast VGI project, the production of a complete
footprint for the radiological release over the Fukushima area after the nuclear accident.
In this project, 200,000 measurements were collected by citizens and then associated with
reference data collected by in situ sensors.
map that has been produced so far [85]. Table 5 illustrates the LULC and forest/cropland
maps found in the literature. It seems that most of the products are referenced mainly
in four articles [83–85,96]. Two articles [74,75] exploited the annual vegetation dynamics
of MODIS data (i.e., MCD12Q2) to identify the different phenophase stages (e.g., Start of
Season or End of Season), expressed as day of the year (DOY).
Table 5. EO-generated LULC products that were found across the literature, alongside their data
sources, access locations and creation dates (all websites were accessed on 28 February 2022).
Furthermore, the LULC products were identified also in the flood susceptibility cases
(n = 2), where the LC vegetation classes were proven to have a negative association with
flood events [100] or to correspond with the surface roughness properties of the floodplain,
with which the Quasi-2D models propagate the flow. This different classification schema
over the initial land use categories is denoted as Manning’s roughness coefficient [98].
The second LULC classification schema was developed by the World Urban Database and
Access Portal Tools (WUDAPT) and adopted by [2,97,136], in which the urban areas are
denoted with 17-LCZ typologies, presenting different micro-climatic conditions. More
information on the LCZ classification schema is available in Stewart and Oke [137].
Finally, LULC was used to reveal information regarding the land cover/urban objects
(e.g., [95,96,112]), and specifically considering the effect of the agriculture activities on
water quality [122,123], or the signal loss due to dense vegetation presence [116].
EO spectral indices are important indicators, present in most examined categories,
except for the Humanitarian and Crisis Response, and Ocean and Marine monitoring
applications. Table 6 shows a preference in Vegetation indices (11/18 total number of
identified indices), and especially (n = 12) in the Normalised Difference Vegetation Index
(NDVI). Maximum NDVI images per annum were obtained in three articles [45,90,91]
to eliminate cloud contamination over the scenes and ameliorate any seasonal and inter-
annual fluctuations [138]. NDVI was not restricted only to LULC studies but was also
used to detect vegetation species (e.g., invasive buffelgrass [78] and urban orchards [46]),
and fuel loads [107], as it can capture variations in chlorophyll absorption patterns of
plants, even in cases of understory vegetation (e.g., shrubs, grass), where the reflectance
response in Near InfraRed (NIR) is lower. The Enhanced Vegetation Index (EVI)
was present mostly in time-series models (n = 4), identifying the seasonal disparities in
deciduous forests. An interesting method was that of Delbart et al. [73], where the Normalised
Difference Water Index (NDWI) was exploited to estimate the green-up date, as an
effective way to avoid false detections due to snowmelt.
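The two operations described above, computing the NDVI and taking its annual maximum per pixel, can be sketched as follows. This is a minimal illustration only: the function names, the NumPy array layout (time on the first axis) and the toy reflectance values are assumptions and are not taken from the reviewed studies.

```python
import numpy as np


def ndvi(nir, red):
    """NDVI = (NIR - RED) / (NIR + RED); NaN where the denominator is zero."""
    nir = np.asarray(nir, dtype=float)
    red = np.asarray(red, dtype=float)
    denom = nir + red
    out = np.full(denom.shape, np.nan)
    np.divide(nir - red, denom, out=out, where=denom != 0)
    return out


def annual_max_composite(ndvi_stack):
    """Per-pixel maximum over the time axis (axis 0), ignoring NaN values."""
    return np.nanmax(np.asarray(ndvi_stack, dtype=float), axis=0)


# Hypothetical two-date, two-pixel example: each pixel is depressed
# (e.g., by cloud or haze) on a different date.
date_1 = ndvi(nir=[0.5, 0.1], red=[0.1, 0.1])
date_2 = ndvi(nir=[0.3, 0.3], red=[0.3, 0.1])
composite = annual_max_composite(np.stack([date_1, date_2]))
```

Because contaminated observations tend to lower the NDVI, keeping the per-pixel maximum retains the clearest observation of the year for each pixel.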
Two studies produced multiple vegetation indices calculated from Very High Resolution
(VHR) data of the RapidEye satellite [107] and the High-Resolution True-colour Aerial
Imagery (TAI) of Getmapping [139], with the latter leveraging the pixels’ colour intensity
in visible red and green. In land cover classification studies, additional indices
were incorporated to distinguish impervious surfaces and bare soil. Examples
are the well-known (a) Normalised Difference Built-Up Index (NDBI), and the newer
indices of (b) built-up index (BuEI), (c) soil index (SoEI), (d) bare soil index (BSI), and (e)
index-based built-up index (IBI). Finally, the Modified Normalised Difference Water
Index (MNDWI) seemed to be preferable to the NDWI when water areas are to be
extracted, as it is able to suppress the reflectance response of vegetation and
built-up areas, and thus to perform better in water detection [102,104]. In addition, when
data from multiple satellite sensors were used, the pre-processing technique of atmospheric
correction (e.g., dark subtraction) was applied [88,102,107] in order to avoid inconsistencies
related to the sensor’s viewing angle and the different illumination conditions.
Remote Sens. 2022, 14, 1263 20 of 66
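Table 6. Characteristics of the EO-generated indices that were found across the literature.
Index | Formula | Reference | Study IDs
SoEI | $0.03 \times (\mathrm{GREEN} - 0.11 \times \mathrm{BLUE}) + (1.56 \times \mathrm{RED} + 1.1 \times \mathrm{NIR}) + 1.37 \times \mathrm{MIR} + (1.37 \times \mathrm{MIR} - 0.61 \times \mathrm{SWIR})$ | Feyisa et al. [148] | [90]
IBI | $\dfrac{\frac{2\,\mathrm{SWIR}}{\mathrm{SWIR} + \mathrm{NIR}} - \left[\frac{\mathrm{NIR}}{\mathrm{NIR} + \mathrm{RED}} + \frac{\mathrm{GREEN}}{\mathrm{GREEN} + \mathrm{SWIR}}\right]}{\frac{2\,\mathrm{SWIR}}{\mathrm{SWIR} + \mathrm{NIR}} + \left[\frac{\mathrm{NIR}}{\mathrm{NIR} + \mathrm{RED}} + \frac{\mathrm{GREEN}}{\mathrm{GREEN} + \mathrm{SWIR}}\right]}$ | Xu [149] | [120]
BSI | $\dfrac{\mathrm{SWIR} - \mathrm{RED}}{\mathrm{NIR} + \mathrm{BLUE}}$ | Rikimaru et al. [150] | [89]
Water
NDWI | $\dfrac{\mathrm{NIR} - \mathrm{SWIR}}{\mathrm{NIR} + \mathrm{SWIR}}$ | Gao [151] | [73,89,94]
MNDWI | $\dfrac{\mathrm{GREEN} - \mathrm{SWIR}}{\mathrm{GREEN} + \mathrm{SWIR}}$ | Xu [152] | [90,102,104]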
LiDAR 3D point measurements [118], and combined signals of Navstar GPS and Russian
GLONASS Global Navigation Satellite Systems (GNSS) [77,116,117,154] were used.
Landsat multispectral sensors (e.g., 5 TM, 7 ETM, or 8 OLI) were exploited in 16 articles
(LULC = 10; Air monitoring = 2; Natural Hazards = 3; Vegetation monitoring = 1), with
the Landsat-8 OLI dominating among the others. This is related to the search period
as well as to the “scan line corrector off” (SLC-off) failure of Landsat-7 ETM+, which drastically
decreased its usability [45]. Across the different application fields, Landsat data were used to
derive spectral indices (e.g., NDVI = 8|[45,69,81,90–92,100,120]; IBI = [120]; LST = [120];
NDBI = 2|[90,92]; MNDWI = 3|[90,102,104]; BuEI = [90]; SoEI = [90]), as well as to generate
LULC maps (n = 7|[2,21,89,97,122,123,136]). Multispectral satellites with similar
spatio-spectral characteristics, such as Sentinel-2 (10 m) and the Earth Observation-1 Advanced
Land Imager (EO-1 ALI, 25 m), were used as alternatives by five articles [81,89,94,104,120].
Figure 7. Cross-matrix analysis: Summarized view of the sensors identified in the examined thematic
domains. Zero values indicate an absence of use of the specific sensor in the domain. The EO satellite
data are categorised following the thematic domains that were chosen to be analysed (e.g., Air: Air
quality, HuCR: Humanitarian crisis and response, LULC: Land Use/ Land Cover, NatHaz: Natural
Hazards, Urban and Vegetation: Urban and Vegetation monitoring, and Soil: Soil monitoring).
Satellite sensors at lower spatial resolutions (≥250 m), such as MODIS, appeared in
regional or global scale studies (n = 5), providing “analysis-ready data” (ARD) of surface
reflectance, phenology-related indices such as EVI [60,93], NDVI [78], LC (NLCD2011) [60],
and the Aerosol Optical Thickness/Depth (AOT/AOD) measurements at 550 nm [119]
(MYD04/MOD04). Examples of the MODIS datasets that were used across the litera-
ture are the Nadir Bidirectional Reflectance Distribution Function-Adjusted Reflectance
(NBAR) (MCD43A4, Version 005) [72], 8-day surface reflectance data at 500 m spatial
resolution (MOD09A1 Version 006), the twice-daily surface reflectance data (MOD09GA
and MYD09GA, Version 006) [60], the Vegetation Dynamics dataset (MCD12Q2 Version
006) and the Annual Land Cover Type at 500 m (MCD12Q1 Version 006). Furthermore,
multispectral imagery from the MERIS instrument (300 m spatial resolution) was used
by Garaba et al. [124] to derive FUI colour maps over the North Sea. VHR
EO data were exploited by eight studies [44,46,87,103,104,107,111], all of them
focusing on urbanized environments and rarely covering the whole
extent of the examined city. In smaller-scale applications, the VHR data were used as
they were able to distinguish urban objects (i.e., buildings, roads and their state) and
extract complex semantic contents, including building state, population density, road net-
work, flooded areas in dense urban environments and others [114]. Following the above,
GaoFen-2 (GF-2) [44,87] and WorldView-2&3 [46,104,111,114] satellite sensors were used
to derive high-level semantic objects and high-resolution population density information.
In addition, Ahmad et al. [103] and Olthof and Svacina [105] exploited PlanetScope’s
spectral bands in the visible and near-infrared to identify image patches that could be
characterised as flooded. RapidEye imagery at a spatial resolution of 5 m was
used in two articles [105,107] to estimate the fuel properties in small forest canopies and the
maximum flood extent over a region with a diverse land cover. Finally, DigitalGlobe
satellite imagery (0.3 m spatial resolution) was employed by Wang et al. [82] to ensure that
the majority of the crowdsourced geotagged photos were located inside the crop field and
therefore that the selected features were assigned the correct labels.
A significant number of studies (n = 30) chose to work on the raw satellite data, in
contrast with the aforementioned indices and products. In such cases, in addition to the
initial spectral information (n = 8), textural and colour features (n = 5), spectral features and
their derivatives (n = 17) were identified. Among the textural features, those of the Grey-Level
Co-occurrence Matrix (GLCM) [109], such as dissimilarity, entropy and angular second moment [90],
together with intensity and brightness [46], were used in object-based classification problems, with
the most prominent approach being Geographic Object-Based Image Analysis (GEOBIA) [109]. In
particular, Puissant et al. [155] and Baker et al. [79] referenced them as effective indicators for
describing different LC types, leading to significant improvements in image-segmentation
problems. Subsequent transformations of EO data expanded the already large feature set
presented in the previous section, for instance through dimensionality
reduction techniques, such as Principal Component Analysis (PCA) and Minimum
Noise Fraction (MNF) [79], simple band ratios, multispectral surface reflectance values [89],
and vertical and horizontal image rotations (e.g., 0° and 90°) [86].
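The GLCM texture measures named above (dissimilarity, entropy and angular second moment) can be sketched as follows. This is a minimal NumPy example that assumes a single horizontal neighbour offset, a symmetric normalised matrix and a base-2 entropy; the function name and these settings are illustrative assumptions and not the configuration used in [90] or [109].

```python
import numpy as np


def glcm_features(image, levels):
    """Texture features from a symmetric, normalised GLCM.

    Assumes integer grey levels in [0, levels) and the horizontal
    neighbour offset (0, 1); both are illustrative choices.
    """
    img = np.asarray(image, dtype=int)
    glcm = np.zeros((levels, levels), dtype=float)
    left = img[:, :-1].ravel()    # reference pixels
    right = img[:, 1:].ravel()    # their right-hand neighbours
    np.add.at(glcm, (left, right), 1.0)  # count co-occurring grey-level pairs
    glcm = glcm + glcm.T          # count each pair in both directions
    p = glcm / glcm.sum()         # joint probabilities
    i, j = np.indices(p.shape)
    nonzero = p > 0
    return {
        "dissimilarity": float(np.sum(p * np.abs(i - j))),
        "entropy": float(-np.sum(p[nonzero] * np.log2(p[nonzero]))),
        "asm": float(np.sum(p ** 2)),
    }


# A flat patch has no texture; an alternating patch is maximally "busy"
# for two grey levels.
flat = glcm_features([[0, 0, 0, 0]], levels=2)
striped = glcm_features([[0, 1, 0, 1]], levels=2)
```

In an object-based workflow, such features would typically be computed per image segment and appended to the spectral features of that segment.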
Furthermore, conversion to backscatter coefficients (in decibels, dB) is a common
transformation when SAR data are exploited. In two articles [81,91], SAR amplitude images
in dual co- and cross-polarization modes (e.g., ALOS PALSAR: HH, HV; Sentinel-1A&B:
VV, VH; Radarsat-2: HH, HV) were utilised to discover different LC types (e.g., cropland
areas), as they compensate for potential data losses due to weather conditions (e.g., cloud
or haze). Correspondingly, band ratios (e.g., HV/HH) and the backscatter
transformations into gamma naught ($\gamma^0_E$) [156] and sigma naught ($\sigma^0_E$) [69,105] were applied
in both cases. Olthof and Svacina [105] capitalised on the ability of SAR data to capture
slight movements or changes in the landscape between two different moments. In
particular, flooded regions were detected using the interferometric coherence between four
complex Sentinel-1 images acquired before the event (pre-event coherence, $\gamma_{pre}$) and during
or after the event (co-event coherence, $\gamma_{co}$). Among the vegetation monitoring
articles, GNSS bistatic signals were used in two articles [77,154], enabling the calculation
of the signal strength loss (SSL) for estimating the forest canopy. The distributions of SSL
(denoted by the carrier-to-noise ratio, C/N0) were estimated by subtracting the two
signals acquired by two independent receivers over the same period and under
the same sky conditions, the first placed in an open-space region and the second inside the
forested area. Finally, a single article [118] constructed an EO-image frame by interpolating
values received from LiDAR point cloud density data.
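Two of the transformations in this paragraph reduce to simple arithmetic and can be sketched as follows: the conversion of linear backscatter to decibels, and the signal strength loss obtained from two paired GNSS receivers. The function names are hypothetical, and the sketch assumes that C/N0 is already expressed in dB-Hz, that the two series are time-aligned, and that the loss is taken as the open-sky value minus the under-canopy value.

```python
import numpy as np


def to_decibel(linear_backscatter):
    """Convert linear-scale backscatter to decibels: 10 * log10(x)."""
    return 10.0 * np.log10(np.asarray(linear_backscatter, dtype=float))


def signal_strength_loss(cn0_open, cn0_forest):
    """Per-epoch SSL, assumed here to be open-sky C/N0 minus under-canopy C/N0."""
    return np.asarray(cn0_open, dtype=float) - np.asarray(cn0_forest, dtype=float)


# Hypothetical, time-aligned C/N0 readings (dB-Hz) from the two receivers.
ssl = signal_strength_loss(cn0_open=[45.0, 44.0], cn0_forest=[38.0, 40.0])
mean_ssl = float(ssl.mean())
```

A larger mean loss would then indicate stronger attenuation by the canopy between the satellite and the forest receiver.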
4.3.4. Web Services and Benchmark Datasets Assisting with the Data Fusion Models’ Data Needs
Publicly available datasets and online web services with EO data at VHR were also
exploited and are presented below. In particular, up-to-date and free-of-charge RGB
image tiles (256 × 256 pixels) extracted from the Google Earth and Microsoft Bing map servers
at zoom level 18 (i.e., approximate resolution of 0.6 m) were used by six
articles [7,64,112,113,115,117] and fed into deep learning models.
In addition, four articles used already available datasets, denoted as “Gold-Standard
benchmark datasets”. Kaiser et al. [115] incorporated large datasets downloaded from
Google Maps and OSM for the cities of Chicago, Paris, Zurich and Berlin, and the 2D Potsdam
International Society for Photogrammetry and Remote Sensing (ISPRS) benchmark
dataset, which consists of 38 true-colour patches at 5 cm spatial resolution, under a fully
convolutional network (FCN) architecture. Chen et al. [113] implemented a customised
CNN model based on the MapSwipe dataset, the OSM and the Offline Mobile Maps and
Navigation (OsmAnd) GPS tracking data, attempting to overcome the challenges of
incompleteness and heterogeneity. Herfort et al. [7] built a building footprint model based
on a pre-trained model and the Microsoft COCO dataset. The viability of the residential
classification model of Chew et al. [112] was evaluated based on the ImageNet dataset. The
ImageNet benchmark comprises over 1.2 million labelled HR images in
1000 categories, collected through Amazon’s Mechanical Turk [157]. Finally, Li et al. [64]
used the four benchmark datasets UC-Merced, SAT-4, SAT-6 and the Aerial Image
Dataset (AID) to evaluate the effectiveness of their own benchmark dataset (i.e., RSI-CB). Table 8
presents all the available datasets that were used by the selected studies, along with the
generated ones.
Table 8. List of the available benchmark datasets that were found across the literature and can be
used for future research (all websites were accessed on 28 February 2022).
Datasets | Source | Study IDs
Gold-Standard benchmark
ImageNet (ILSVRC) | https://image-net.org/download.php | [112]
UC-Merced | http://weegee.vision.ucmerced.edu/datasets/landuse.html | [64]
SAT-4 and SAT-6 | https://csc.lsu.edu/~saikat/deepsat/ | [64]
PASCAL VOC | http://host.robots.ox.ac.uk/pascal/VOC/ | [115]
ISPRS semantic labeling benchmark | https://www2.isprs.org/commissions/comm2/wg4/benchmark/2d-sem-label-potsdam/ | [115]
Microsoft COCO | https://paperswithcode.com/dataset/coco | [7]
AID | https://captain-whu.github.io/AID/ | [64]
RESISC45 | https://www.tensorflow.org/datasets/catalog/resisc45 | [64]
MapSwipe | https://mapswipe.org/en/data.html | [113]
Produced datasets in the literature
RSI-CB | https://github.com/lehaifeng/RSI-CB | [64]
MC-CNN | https://github.com/ChenJiaoyan/DeepVGI-0.2 & DeepVGI-0.3 | [113]
Land cover change dataset | https://pub.iges.or.jp/pub/openstreetmap-land-cover-change-dataset-laguna | [91]
Brick kilns locations | https://doi.org/10.1016/j.isprsjprs.2018.02.012 | [125]
Plantation map | Dataset: https://doi.pangaea.de/10.1594/PANGAEA.894892, Script: https://github.com/utu-tanzania/sh-plantations | [81]
Global croplands | http://cropland.geo-wiki.org | [3]
Maize annotated images with disease symptoms | https://osf.io/p67rz/ | [86]
LULC dataset | http://pub.iges.or.jp/modules/envirolib/view.php?docid=6201 | [45]
Tair dataset | https://github.com/NINAnor/cityTairMapping | [120]
Hybrid forest map | http://Russia.geo-wiki.org | [83]
tool was used by Juan and Bank [126] to collect spatial records of violations in Syria. Syria
Tracker (https://www.humanitariantracker.org/syria-tracker; accessed on 28 February
2022) is part of the Humanitarian Tracker project that incorporates data from citizens as
well as news reports, social media, etc. in order to provide updated crisis maps over Syria.
Figure 8. The share of the selected papers organised with respect to the participation type (active
or passive), and the crowdsourcing platforms and tools that were used in both cases. The selected
papers may contain at least one or more tools and therefore are included in each of the tools.
Mainstream social media such as Twitter (n = 3), Sina Weibo (n = 2),
Baidu (n = 3), and YouTube (n = 3) were used to cover the absence of reference data in
urban planning applications [92,114] and in studies on hazard preparedness and
management [98,100,103,104]. In those cases, unconventional data (text or images),
accompanied by their location, were extracted through standardised APIs. Panteras and
Cervone [104] retrieved 2393 geotagged tweets from the USA MongoDB server, using the
R twitter library. Hashtags and the area of interest were used to assist with the filtering
process and the identification of flooded areas. Subsequently, Ahmad et al. [103] integrated
social media sources (e.g., YouTube, Twitter, etc.) and the photosharing repositories of
Google, and Flickr into an automated flood detection system. In this system, images, videos,
web crawlers and text translators (i.e., Google Translator API) were used along with the
international disaster database EM-DAT (supported by the World Health Organisation, WHO)
in order to collect, link and analyse an unlimited number of reports on natural hazard
events. Additionally, the TextBlob NLP Python library was used to discard irrelevant
tweets and to identify the names and locations of cities where incidents were reported. Annis
and Nardi [98] adopted a similar approach. Five articles [87,88,98,102,103] illustrated the
effectiveness of photo-sharing services, the best known being Flickr. Flickr geotagged
photos can be retrieved through the public API by performing queries related to their
title, description, tags, date, and image location [88]. Sitthi et al. [88] explored, apart from
the location and content, their colour characteristics (e.g., RGB histogram and identified
edges), generating LC features. In this case, the Otsu binary segmentation algorithm, the
high-pass Sobel kernel filter and colour vegetation indices were implemented to first
isolate the land cover classes from the background image content and generate the
features; the probabilistic Naïve Bayes (NB) algorithm was then used to extract the CS land cover
map. Misdetections in the foreground/background segmentation were identified due to
background illumination, different acquisition dates, and image angles, leading to 12.2%
of incorrect classifications; even so, the NB classifier performed with an accuracy greater
than 82% (evaluated with the kappa coefficient, precision, recall, and F-measure). Two additional
websites worth mentioning are Panoramio [87] and the PhenoCam Network
(https://phenocam.sr.unh.edu/webcam/; accessed on 28 February 2022). Having access to
a great number of raw vegetation photographs, Melaas et al. [72] monitored the phenologi-
cal dynamics over various vegetation species using time-series of the retrieved images and
the green chromatic coordinate (GCC) index.
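For reference, the green chromatic coordinate is the share of the green channel in the total RGB brightness, GCC = G / (R + G + B). The following minimal sketch computes it per pixel; the function name and the toy pixel values are assumptions for illustration, and the region-of-interest handling used in practice is omitted.

```python
import numpy as np


def green_chromatic_coordinate(rgb):
    """Per-pixel GCC = G / (R + G + B) for an array whose last axis is (R, G, B).

    Pixels with zero total brightness are returned as NaN.
    """
    rgb = np.asarray(rgb, dtype=float)
    total = rgb.sum(axis=-1)
    gcc = np.full(total.shape, np.nan)
    np.divide(rgb[..., 1], total, out=gcc, where=total > 0)
    return gcc


# Hypothetical 1 x 2 image: one vegetated pixel and one black pixel.
image = [[[50, 100, 50], [0, 0, 0]]]
gcc = green_chromatic_coordinate(image)
```

Averaging the GCC over a fixed region of interest for each image date yields the time series from which green-up and senescence dates can be derived.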
Exploring the studies that correspond to the bottom of Poblet’s pyramid [32], raw
crowdsourced data were utilised by three articles [97,116,117] in which GPS traces from
smartphone devices and data from low-cost weather stations were used. In the first case,
the GPS traces of hikers, bikers, or taxi drivers were used as combined trajectory points to
identify missing footpaths and road networks. Li et al. [117] presented four methods for
constructing a road map from GPS trajectory data: (i) trace incrementing, (ii) clustering,
(iii) intersection linking and (iv) rasterization. Trace incrementing requires
an initial GPS trajectory of high quality, to which the remaining data are concatenated using certain
models (e.g., weighted Delaunay triangular model). This method is sensitive to low-frequency
and noisy data. Clustering methods leverage unsupervised algorithms,
such as Density-based spatial clustering of applications with noise (DBSCAN) and K-means,
whereas the third method finds centerlines and nodes which have to be linked. In the last method,
data are first converted to a greyscale raster image, with the different tones of grey
depicting the number of GPS traces. Then, morphological operations, such as erosion, can
be applied to construct the final layer. Additionally, geotagged photographs were used
for crop type identification [82]. Through the Plantix Android geotag application, created
by Progressive Environmental and Agricultural Technologies (PEAT) in 2015, farmers are
able to collect photos of their crops and identify pests, diseases, and nutrient deficiencies
using their mobile phone camera and image recognition software. Hammerberg et al. [97]
incorporated weather data received from the Personal Weather Station Network (PWSN) of
the Weather Underground database (https://www.wunderground.com/; accessed on 28
February 2022), in which citizens voluntarily provide measurements via a simple sign-in
process and a set of Netatmo stations with a temperature accuracy of ±0.3 °C, a humidity
accuracy of ±3%, and a barometer accuracy of ±1 mbar.
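The rasterization method summarised above (GPS fixes counted per grid cell, followed by morphological erosion) can be sketched as follows. This is a minimal NumPy illustration under stated assumptions: the function names, the grid bounds, the 3 × 3 structuring element and the toy coordinates are all hypothetical and do not reproduce the implementation of Li et al. [117].

```python
import numpy as np


def rasterize_traces(points, bounds, shape):
    """Count GPS fixes per grid cell.

    points: (n, 2) array of (x, y); bounds: (xmin, ymin, xmax, ymax).
    Row 0 of the result corresponds to ymin (not a north-up image).
    """
    xmin, ymin, xmax, ymax = bounds
    pts = np.asarray(points, dtype=float)
    counts, _, _ = np.histogram2d(
        pts[:, 1], pts[:, 0], bins=shape, range=((ymin, ymax), (xmin, xmax))
    )
    return counts


def binary_erosion_3x3(mask):
    """Keep a cell only if it and its 8 neighbours are set (borders count as unset)."""
    mask = np.asarray(mask, dtype=bool)
    padded = np.pad(mask, 1, constant_values=False)
    out = np.ones(mask.shape, dtype=bool)
    rows, cols = mask.shape
    for dr in range(3):
        for dc in range(3):
            out &= padded[dr:dr + rows, dc:dc + cols]
    return out


# Hypothetical fixes: three in one cell, one in another.
counts = rasterize_traces(
    [(0.5, 0.5), (0.5, 0.5), (0.5, 0.5), (2.5, 2.5)],
    bounds=(0.0, 0.0, 3.0, 3.0),
    shape=(3, 3),
)
road_mask = counts >= 2  # cells crossed by at least two traces (assumed threshold)
```

The count grid plays the role of the greyscale image described in the text; thresholding it gives a binary road mask to which erosion can be applied to thin or clean the result.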
app to assist citizens in the collection of optical colours of water. Shupe [122] and Thorn-
hill et al. [123] performed field-based workshops to train volunteers on the collection
of water samples, whereas in the Olthof and Svacina [105] article, an Android mobile
application was developed to collect geotagged photos of flooding when a
satellite sensor is passing over the examined area, along with a short survey on the local
impacts of the event. In the WeSenseIT (WSI) project, the WSI mobile phone app was developed
and exploited by citizens to collect both static and dynamic measurements of water
level and precipitation after a flood event in the Bacchiglione catchment (Italy). Another
smartphone application, called ForestFuels, was utilised to collect observations of forest
fuel loading for different vegetation species. The application was constructed to include
GPS, compass, accelerometer, camera and a training guide to be followed prior to the image collection.
Wallace et al. [78] recruited and trained 10 citizens through the USA-NPN
education coordinator to provide locations where buffelgrass was identified. Finally, two
articles [21,46] built their analysis on focus groups, without the use of any technological
equipment. Mialhe et al. [21] mentioned gender balance and the fair representation of
age classes and ethnic groups as the most critical elements in those applications.
Massive Open Online Courses (MOOC). Promotional activities on social media secured
the participation of more than a hundred university students, expanding the collected data
sample and the subsequent evaluation based on the majority of votes. Government agen-
cies can give functional support in the CS campaigns, providing an appropriate funding
mechanism to extend their duration, representing a benefit to the society and an increase in
environmental awareness [123].
tioned a general limitation regarding the CS campaigns. The satellite images that
are used usually vary across seasons, and these variations produce errors.
Chew et al. [112] pointed out that different RS images might complicate the analysis, affecting
both the crowdsourced labelling and the model’s performance. Residences share a similar colour
with bare soil features in an image, and thus an accurate representation of their
locations is of paramount importance. RS images from different seasons or times of day
might confuse coders’ decisions [81].
(B) Digitisation-Conflation tasks (DCT)
Articles suggesting DCT revealed several drawbacks related to class imbalance and
digitization errors due to citizens’ amateurism or lack of motivation (e.g., incompleteness-omission
errors and heterogeneity) [60,113]. Concerning the first limitation, the Synthetic
Minority Oversampling Technique (SMOTE) [89,90] and the Kernel Density Function (KDF) [92]
were proposed as solutions to imbalanced distributions. Majority
voting and neighbouring spatial aggregation proved efficient in dealing with different
and inconsistent labels [114]. Wan et al. [44] explored a combination of image processing
and statistical methods to obtain a more reliable training dataset. In this approach,
morphological erosion was used as the generalisation technique in order to maximize
differences between classes and eliminate OSM’s boundary offsets. A cluster analysis
was performed using the fuzzy c-means (FCM) unsupervised algorithm to associate features
with similar characteristics and maximize the variability between LC classes.
Chen et al. [113] integrated an iterative loss calculation approach to overcome artefacts
related to VGI incompleteness and label heterogeneity. The method seems to support VGI
data collection, as it executed this task in a much shorter time (i.e., 8.3 times faster on
average).
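The majority-voting step mentioned above for reconciling inconsistent citizen labels can be sketched as follows. The function name and the agreement threshold are assumptions made for illustration; the reviewed studies may apply different tie-breaking or weighting rules.

```python
from collections import Counter


def majority_vote(labels, min_agreement=0.5):
    """Return the most frequent label if its share exceeds min_agreement, else None.

    min_agreement is a hypothetical parameter: with the default of 0.5,
    a label must be chosen by a strict majority of contributors.
    """
    if not labels:
        return None
    label, count = Counter(labels).most_common(1)[0]
    return label if count / len(labels) > min_agreement else None


# Hypothetical labels assigned to the same image tile by several volunteers.
agreed = majority_vote(["building", "building", "road"])
undecided = majority_vote(["building", "road"])
```

Tiles for which no label reaches the threshold can be discarded or sent back for additional annotation, which is one simple way to limit label noise in the training set.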
two-tailed median absolute deviation (MAD) [97] attempted to detect errors and abnormal
values in temperature records [121]. An interesting open-source tool, called CrowdQC (R package),
was proposed in the study by Venter et al. [120]; it can identify, without
any reference data, statistically implausible temperature values due to misplacement
of sensors, solar exposure, data inconsistencies and device malfunctions. Subsequently,
the e-SOTER methodology was adopted by Kibirige and Dobos [69] to determine
the spatial distribution of the soil sensors in places with specific geomorphologic units,
ensuring a lower impact on the collected soil moisture measurements.
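A MAD-based screening of crowdsourced temperature records can be sketched as follows. This is a generic robust z-score filter that flags both tails symmetrically; the function name, the threshold of 3 and the scaling constant 0.6745 (which makes the MAD comparable to a standard deviation under normality) are assumptions, and the sketch reproduces neither the exact procedure of [97] nor the CrowdQC package.

```python
import numpy as np


def mad_outliers(values, threshold=3.0):
    """Flag values whose robust z-score exceeds the threshold in either tail."""
    x = np.asarray(values, dtype=float)
    median = np.median(x)
    mad = np.median(np.abs(x - median))
    if mad == 0:
        # No spread around the median: nothing can be flagged reliably.
        return np.zeros(x.shape, dtype=bool)
    robust_z = 0.6745 * (x - median) / mad
    return np.abs(robust_z) > threshold


# Hypothetical simultaneous readings (°C) from five citizen weather stations;
# the last one could be a sensor exposed to direct sunlight.
flags = mad_outliers([20.1, 20.3, 19.9, 20.0, 35.0])
```

Because the median and the MAD are barely affected by a few extreme readings, such a filter needs no reference station, which matches the reference-free setting described above.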
According to Figure 9, 32 studies chose to use traditional models, including spatial associations
and statistical significance tests [21,46,69,91,92,104,121,125], regression and spatial interpolation
models with single or multiple explanatory variables [60,69,73–75,110,119,122,124],
and probabilistic and decision-making models [3,98,99,102]. Classification problems were addressed
in object-based [79] and pixel-based schemas, with the first providing additional
image information about the scene, such as shape, length, and spectral details [164]. Empirical
rule-based [75,105] and statistical models, such as logistic regression prediction
models [57,100,101,126], were explored, in four cases [83–85,96] using the spatial proximity of the
model’s predictors. Advanced machine- and deep-learning techniques were adopted in 22
articles, incorporating high-level features or features from various data fusion levels (i.e.,
pixel-decision DF, object-decision DF), while 15 studies introduced ensemble models, allowing
multiple classifiers to complement each other and overcome limitations related to performance gaps
and small training sets [89]. Three articles [71,72,97] integrated satellite and crowdsourced
data into physical models. Two additional cases [98,99] demonstrated the usefulness
of the assimilated EO/CS observations in flood model predictions, as they give more
up-to-date, accurate and less sparse observations. Those articles are listed in both the statistical and
mechanistic model categories, as the assimilation of EO and CS data produced outputs at
two stages, with the output of the first being the input of the second.
In the following sections, an overview of the most noteworthy findings in the data
fusion models is given, commenting on their effectiveness and evaluation performance.
During this process, comparisons among different evaluation metrics’ rates have been
made and documented. Appendix A (Tables A1–A8) presents the critical characteristics
of each such article, and the evaluation metrics and validation methods used to verify the
efficacy of the DF methods.
Table 10. Summary of methods used in the high-level data fusion category.
network classification maps. A typical LeNet architecture [167] is defined by two layers
using ReLU as the activation function, two fully-connected (FC)-Dropout-ReLU layers, and
one softmax or logistic regression classifier. In each convolutional layer, the ReLU function
truncates all the negative values to zero and leaves the positive values unchanged, whereas
the max-pooling kernel operator is responsible for the dimensional reduction of each image
layer, preventing the model from overfitting [115]. Following a similar architecture, the
AlexNet [168] and the VggNet [169] incorporate more hidden Conv-ReLU layers [113].
In this experiment, the LeNet-CNN had the highest performance, whereas all of them
had issues in the road map extraction, regardless of the evaluation metric selected (i.e., F1,
Accuracy, and AUC).
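The behaviour of the ReLU and max-pooling operators described above can be illustrated with a minimal Python sketch. The 4 × 4 feature map, the 2 × 2 window and the stride of 2 are assumptions chosen for illustration; they are not taken from the reviewed LeNet, AlexNet or VggNet configurations.

```python
def relu(feature_map):
    """Set negative activations to zero and keep positive values unchanged."""
    return [[max(0.0, v) for v in row] for row in feature_map]


def max_pool_2x2(feature_map):
    """Keep the maximum of each non-overlapping 2x2 window (stride 2)."""
    rows, cols = len(feature_map), len(feature_map[0])
    return [
        [
            max(
                feature_map[r][c], feature_map[r][c + 1],
                feature_map[r + 1][c], feature_map[r + 1][c + 1],
            )
            for c in range(0, cols - 1, 2)
        ]
        for r in range(0, rows - 1, 2)
    ]


# Hypothetical 4x4 output of a convolutional layer (illustrative values only).
conv_output = [
    [-1.0, 2.0, 0.5, -3.0],
    [4.0, -2.0, -0.5, 1.0],
    [0.0, -1.0, -4.0, -2.0],
    [3.0, 1.5, -1.0, -0.5],
]
activated = relu(conv_output)     # negatives become 0.0
pooled = max_pool_2x2(activated)  # 4x4 map reduced to 2x2
```

In this sketch the pooled map has a quarter of the original cells, which is the dimensional reduction referred to in the text.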
Subsequently, a refined version of the U-Net CNN architecture was designed by
Li et al. [117], formulating two experiments. The first, referred to as input-based data
fusion model, concatenated the extracted CS-road and RS-road features and then trained
the images to extract the road network, whereas in the second case, both CS- and RS-road
feature datasets are separately trained using the refined U-Net model (RUNET), and then
the models’ output was fused according to a certain weight. In the RUNET architecture,
two modifications were made; first, excluding the first and the last convolutional stages
leading to a reduced complexity, and the second using a rectangle convolution kernel
at each stage instead of the square kernel, considered more suitable for the geometrical
shapes of the roads. The input-based RUNET model performed better (Sensitivity =
0.840 and OA = 0.672), compared to the state-of-the-art CNN models for road extraction,
e.g., LinkNet34, D-LinkNet34 and U-Net, but had challenges in shorter road segments.
The last two articles incorporated the transfer learning technology using datasets of the
same type [170], to construct larger datasets [171]. Finally, the Single Shot Detection
(SSD) network was applied in Herfort et al.’s [7] study in order to extract human
settlements with lower effort. SSD networks follow the tiling concept, where the
model is trained in determining boxes. This way, the feature maps learn to be
responsive to particular scales. Additional experiments were performed to identify both
the spatial and the non-spatial characteristics of the misclassifications, applying Scott’s
rule [172], to calculate the probability of a task to be misclassified. This method was
evaluated introducing various metrics, such as false negatives (FN), false positives (FP),
true negatives (TN), and true positives (TP), specificity (TNR), sensitivity (TPR), accuracy
(ACC), and the Matthews correlation coefficient (MCC). In particular, the MCC was used
as an alternative metric of F1 and precision, as the latter have been noted to provide highly
biased outcomes for imbalanced datasets [173].
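The sensitivity of F1 to class imbalance, and the more conservative behaviour of the MCC, can be illustrated with a short Python sketch. The confusion-matrix counts below are hypothetical and are not taken from Herfort et al. [7]; the function name is likewise an assumption.

```python
import math


def confusion_metrics(tp, fp, tn, fn):
    """Compute common binary-classification metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)          # TPR
    specificity = tn / (tn + fp)          # TNR
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "TPR": sensitivity,
        "TNR": specificity,
        "ACC": accuracy,
        "precision": precision,
        "F1": f1,
        "MCC": mcc,
    }


# Hypothetical imbalanced case: 95 positive and 5 negative tasks.
metrics = confusion_metrics(tp=90, fp=4, tn=1, fn=5)
```

With these assumed counts, accuracy and F1 remain above 0.9 although only one of the five negative tasks is recognised, whereas the MCC stays close to zero (roughly 0.14).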
were identified in two articles; the first deploying a logistic regression model [126], and
the second the Bayesian probabilistic method “Weights-of-Evidence” (WoE) [102]. Both
cases introduced multiple geospatial variables, obtained from both the CS and RS data, and
established relations between them and the presence/absence of an incidence. The most
widely used probabilistic classification method, that of maximum likelihood, was utilised
by Sitthi et al. [88] in order to produce an LC map over Sapporo city, achieving an overall
accuracy of 70% and a kappa coefficient of 0.65.
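For readers less familiar with the two scores reported above, the following Python sketch shows how the overall accuracy and Cohen’s kappa coefficient are derived from a confusion matrix. The three-class matrix is hypothetical and does not reproduce the results of Sitthi et al. [88].

```python
def overall_accuracy_and_kappa(matrix):
    """Return (overall accuracy, Cohen's kappa) for a square confusion matrix.

    Rows are assumed to hold reference classes and columns predicted classes.
    """
    total = sum(sum(row) for row in matrix)
    observed = sum(matrix[i][i] for i in range(len(matrix))) / total
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    # Agreement expected by chance, from the marginal totals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Hypothetical 3-class land-cover confusion matrix (100 validation samples).
confusion = [
    [35, 10, 5],
    [8, 20, 2],
    [2, 3, 15],
]
oa, kappa = overall_accuracy_and_kappa(confusion)
```

The kappa coefficient is lower than the overall accuracy because it discounts the agreement that would be expected by chance.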
Table 11. Summary of methods used in the multiple-level data fusion category.
proving that additional features can assist in noise resistance, as the algorithm performed
with a lower drop. Wang et al. [94] tried to discriminate the crop types in small-holding
parcels in India, leveraging the RF classifier and the seasonal patterns of the Sentinel-1
and 2 data values. Specifically, a discrete Fourier transform, also known as a “harmonic
regression”, was applied to the S2 observations, even for cloudy images.
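The idea behind a harmonic regression can be sketched as a least-squares fit of a constant plus one cosine and one sine term to the cloud-free observations of a pixel. The sketch below is a simplified illustration under stated assumptions: a single harmonic, a 365-day period, synthetic noise-free values and hypothetical function names. It does not reproduce the configuration of Wang et al. [94].

```python
import math


def design_row(day, period=365.0):
    """Regressors of a first-order harmonic model for a given day of year."""
    angle = 2.0 * math.pi * day / period
    return [1.0, math.cos(angle), math.sin(angle)]


def solve_linear_system(a, b):
    """Solve a small dense system by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def fit_harmonic(days, values, period=365.0):
    """Least-squares harmonic coefficients via the normal equations."""
    rows = [design_row(d, period) for d in days]
    k = 3
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    aty = [sum(r[i] * v for r, v in zip(rows, values)) for i in range(k)]
    return solve_linear_system(ata, aty)


def predict(coeffs, day, period=365.0):
    return sum(c * x for c, x in zip(coeffs, design_row(day, period)))


# Synthetic example: days 100-200 are treated as cloudy and left out of the fit.
true_coeffs = [0.5, 0.3, -0.2]
clear_days = [d for d in range(0, 365, 10) if not 100 <= d <= 200]
observations = [predict(true_coeffs, d) for d in clear_days]
fitted = fit_harmonic(clear_days, observations)
gap_estimate = predict(fitted, 150)  # value inside the cloudy gap
```

Because the fit only needs the cloud-free dates, the fitted coefficients can serve as classifier features or be used to estimate values inside cloudy gaps.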
Additional DT and ensemble DT learners were found in the examined literature,
including the CART [96] and the multivariate C4.5 DT classifier (denoted in Weka as
J48) [45], the AdaBoost [112], which according to Miao et al. [177] is based on the C5.0 DT
algorithm, Rotation Forest (RoF), and Canonical Correlation Forests (CCFs) [89]. RoF is an
ensemble classification method, introduced by Rodriguez et al., that has been shown to outperform the
most known ensemble methods (e.g., Bagging, AdaBoost, and RF) [175] by a significant
margin. In the CCFs [178], a Canonical Correlation Analysis (CCA) is applied to find
the maximum correlations between the features and the class labels. An ensemble-based
classifier was developed by Yokoya et al. [89], who combined three state-of-the-art models,
CNN (3 layers with ReLU as the activation function), RF, and gradient boosting machines
(GBM), under two frameworks: the first fusing relevant features and the second fusing the
models’ predictions into an LCZ map. The Markov random field (MRF) was applied as
a post-processing spatial smoothing filter, increasing the OA by 7.1% and the kappa
coefficient by 0.08. Additional ML models were identified, such as the k-NN [107] and the
SVM-RBF [44,81] classification, and the space-partitioning data structure method of the
kd-tree-based hierarchical clustering [117]. In the k-NN, the Mahalanobis distance was
used instead of the Euclidean distance, as it provides a better estimation due to its ability
to account for possible correlations and trends among the features. In the presented study,
the CCA was applied to investigate the relation among the explanatory variables and the
predicted ones.
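The difference between the two distances mentioned for the k-NN classifier can be shown with a small Python sketch. The two-dimensional samples and query points are hypothetical; the helper names are assumptions introduced for illustration only.

```python
import math


def covariance_2d(points):
    """Sample covariance matrix of two-dimensional feature vectors."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return [[sxx, sxy], [sxy, syy]]


def invert_2x2(m):
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det], [-m[1][0] / det, m[0][0] / det]]


def mahalanobis(u, v, inv_cov):
    dx, dy = u[0] - v[0], u[1] - v[1]
    q = (dx * (inv_cov[0][0] * dx + inv_cov[0][1] * dy)
         + dy * (inv_cov[1][0] * dx + inv_cov[1][1] * dy))
    return math.sqrt(q)


def euclidean(u, v):
    return math.hypot(u[0] - v[0], u[1] - v[1])


# Hypothetical, strongly correlated training features.
samples = [(1.0, 1.1), (2.0, 1.9), (3.0, 3.2), (4.0, 3.9), (5.0, 5.1)]
inv_cov = invert_2x2(covariance_2d(samples))

query = (3.0, 3.0)
along_trend = (4.0, 4.0)    # follows the correlation between the features
against_trend = (4.0, 2.0)  # breaks the correlation between the features
```

Both candidates lie at the same Euclidean distance from the query, but the Mahalanobis distance ranks the candidate that follows the feature correlation as the closer neighbour.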
Several DCNN techniques appeared in eight articles [64,86,94,103,112,114,115] intend-
ing to extract image contents related to crop types and urbanised elements, such as urban
land use interpretations, archaeological monuments and others. Starting with the first
article, a variant of the fully-connected CNN (FCN) was adopted by Kaiser et al. [115],
generating a spatially structured labelled image of noisy crowdsourced data. The FCN
first convolves the input, transforming the pixels into 1D label probabilities, and subsequently
deconvolves the result to regain the initial image size. The model was trained in equally sized
mini-batches using stochastic gradient descent, reaching an average F1-score of 0.84.
Lambers et al. [118] identified the Dutch archaeological sites, using an adapted version
of the object-detection and image classification CNN denoted as Region-based CNN or
Regions with CNN features (R-CNN). In particular, within the R-CNN model, the object
identification is conducted at the beginning, generating the required features, which are
then fed into the SVM classifier to determine the credibility of the objects. Furthermore,
the Dropout strategy was adopted by two articles [112,114] omitting feature detectors that
could lead to complex co-adaptations over the training data and model overfitting [166].
In the first study, Chew et al. [112] predicted the probabilities of presence/absence of a
residency with the CNN performance exceeding 85%. The transfer learning approach was
subsequently used and tested over the pre-trained Inception V3 and VGG16 networks
and several shallow-learning models. Zhao et al. [114] designed a 5-layer CNN using
a feedforward activation function and the softmax algorithm, predicting labels over un-
known semantic EO-elements. Wang et al. [94] constructed two CNN models of 1 and
3 dimensions, exploiting both the spatial and the temporal dimensions over GCVI time-series.
The 1D-CNN takes an input of 18 rows (14 S2 features and 4 S1 features) by
365 columns and comprises multiple convolutional layers with max-pooling, each layer
using the ReLU activation. The cross-entropy loss function
was used during training to obtain the best-fitted model for discriminating the diverse crop
types. Attempting to integrate both the spatial and temporal dimensions in the CNN
model, a 3D-UNet segmentation network was also tested, exploring the spectral infor-
mation through time and the corresponding pixels’ behaviour at the maximum distance
of 200 m. However, the evaluation metrics show that the 1D-CNN produced slightly better
results; a potential reason for this seems to be the higher computational time and
the greater number of hyperparameters of the 3D network. Additional variations in CNN algorithms such as
the AlexNet, VGGNet, GoogLeNet, ResNet, and the ResNet34 were found in Li et al. [58]
and Wiesner-Hanks et al. [86], respectively. Finally, the Generative Adversarial Networks
(GAN) model was exploited by Ahmad et al. [103], aiming to detect the flooded regions
over crowd-selected satellite scenes. According to the aforementioned article, GANs can
be considered as unsupervised classification learners, consisting of two competing NNs.
More details on GAN models can be found in Abdollahi et al.’s StoA review [179].
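Several of the CNNs above end with a softmax layer and are trained with the cross-entropy loss. A minimal Python sketch of both operations is given here; the three-class logits are hypothetical and the function names are assumptions, not part of any reviewed implementation.

```python
import math


def softmax(logits):
    """Convert raw class scores into probabilities that sum to one."""
    shift = max(logits)  # subtracting the maximum avoids overflow in exp()
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probabilities, true_class):
    """Negative log-probability assigned to the reference class."""
    return -math.log(probabilities[true_class])


# Hypothetical network output for one pixel and three crop classes.
logits = [2.0, 0.5, -1.0]
probs = softmax(logits)
loss_if_class_0 = cross_entropy(probs, 0)  # reference label matches the top score
loss_if_class_2 = cross_entropy(probs, 2)  # reference label has the lowest score
```

The loss is small when the network assigns a high probability to the reference class and grows as that probability decreases, which is what the training procedure minimises.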
5. Discussion
Within this scoping review, we attempted to provide a holistic overview on research
studies assimilating information acquired by Earth Observation and Crowdsourced/Citizen
Science data. We have comprehensively reviewed the selected articles according to the
collected measurements, the sensors and technological tools or equipment which were
used, and the methods that attempted to transform these data into meaningful information
regarding the research problems at hand. We adopted a scoping review methodological
framework, as a systematic literature review was neither applicable nor realistic, due to the
diverse nature of the scope of referenced articles. These variations make the quantitative
analysis and comparisons a challenge, illustrating sometimes the loose connections among
articles. Concluding, this scoping review aimed to extend previous outcomes [20,52],
highlighting gaps and future directions, in research domains that have not been explored
so far. In line with the previous works, we claim that, to the best of our
knowledge, this is the first literature review that examined the data/tools/algorithms and
their association with data fusion types, demonstrating their use and performance in given
scenarios. Based on the examined categories, we summarise our review in the following
sections.
outdated scenes, or the complete absence of observations in remote regions have been
noted as the most common limitations in Remote Sensing. The Savitzky-Golay filter is
also used in time-series models to fill in cloudy or missing data, applying a 7-point
median moving method [75]. Moreover, the so-called “brute force approach” [45] has been
proposed, noting that a greater volume of data could overcome noise issues related to
clouds, haze, and shadows resulting from the spectral variations in regions with intense
inclinations. This approach was applied only with data by the same satellite sensor [90] or
explicitly with sensors, which monitor the same spectral range [104].
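As an illustration of the time-series gap filling mentioned above, the following Python sketch interpolates missing observations linearly and then smooths the series with the classical 7-point quadratic Savitzky-Golay weights. It is a simplified sketch under stated assumptions: the linear interpolation step, the polynomial order and the function names are choices made here, and the exact configuration used in [75] is not reproduced.

```python
# Classical 7-point Savitzky-Golay smoothing weights (quadratic/cubic fit),
# to be divided by 21.
SG7_WEIGHTS = [-2, 3, 6, 7, 6, 3, -2]


def fill_gaps_linear(series):
    """Replace None entries lying between two valid values by linear interpolation."""
    filled = list(series)
    valid = [i for i, v in enumerate(series) if v is not None]
    for left, right in zip(valid, valid[1:]):
        for i in range(left + 1, right):
            weight = (i - left) / (right - left)
            filled[i] = series[left] + weight * (series[right] - series[left])
    return filled


def savgol7(series):
    """Smooth interior points with a 7-point window; the three edge values
    at each end are left unchanged."""
    smoothed = list(series)
    for i in range(3, len(series) - 3):
        window = series[i - 3:i + 4]
        smoothed[i] = sum(w * v for w, v in zip(SG7_WEIGHTS, window)) / 21.0
    return smoothed


# Hypothetical vegetation-index series with two cloudy (missing) dates.
raw = [float(x * x) for x in range(12)]
raw[4] = None
raw[5] = None
filled = fill_gaps_linear(raw)
smoothed = savgol7(filled)
```

The sketch assumes that the first and last observations of the series are valid; leading or trailing gaps would need a separate treatment.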
On the other hand, articles rarely incorporated SAR data in their analysis. John-
son et al. [91] have stated that, in addition to the advantage of SAR sensors to acquire data
in every weather condition, they are sensitive to the vegetation structure, as microwaves
at specific wavelengths (e.g., L-band) can penetrate the vegetation canopy receiving in-
formation of stems and branches. Recent studies have examined the explicit use of the
polarimetric [181], as well as the interferometric synthetic aperture radar (InSAR) data
to discriminate different backscatter behaviours of vegetation types, such as forests, crop
types, etc. [182]. One of the greatest challenges of using SAR data so far has been the geometric
distortions (e.g., foreshortening, layover, shadowing) caused by the local topography and
the right orientation of the satellite. A solution to these variations in the backscatter signal
is the terrain-flattening algorithm, which can be used to reduce the terrain effects and thus to
retrieve “flattened” gamma observations [156]. Complementary studies were even able to
tackle the cloud occurrence problem in optical sensors, by fusing SAR and MSI data. Specif-
ically, the DSen2-CR model was based on an image reconstruction task, aiming to predict
the missing pixels, denoting as missing the ones covered by clouds. Moreover, we observed
that researchers have formulated their models with data from satellite missions that are
at the end of their operation. One reason for this is that in some cases it was important to
obtain knowledge regarding the temporal change of the environment, and thus satellite
missions such as Landsat (present in most of the studies) were used the most, as the
whole mission (from Landsat 4 to 8) was designed with the same characteristics, enabling
time-series observations. However, we claim that the StoA satellite missions, such as the
Sentinel missions and the CubeSat-based platforms (e.g., PlanetScope), pose a new challenge in
the analysis of these vast data volumes and offer new perspectives regarding the possibility
of observations on an almost daily basis.
addition to the general tendency to describe social media as a satisfactory data source,
limitations of citizen-generated news, the potential data loss in cases of internet connection
absence [104], and the uncertainty of the credibility of such content revealed the need to
address the lack of control that still exists [100].
From the model’s perspective, several studies (n = 8; [7,64,112–115,117,118]) applied
the transfer learning method (or domain adaptation [184]) as a potential solution to
overcome sparse and small reference data. The major benefit of this method is that,
with limited data samples, the model can be trained and fine-tuned and can achieve good
performance on each new task. Yuan et al. [165] mentioned that the transfer learning
(TL) framework can be applied using two models; the regional-based TL model, which
generates a robust model that can be adapted in various regions, and the data-based TL
model that attempts to solve the problem of model generalisation and train with features of
multiple sensor images (indicating both satellites and the crowds). However, one limitation
was discussed by Chen et al. [113], in cases where the examined objects (e.g., buildings)
reveal a spectral response similar to that of neighbouring objects (e.g., the road network) or a
completely different one because of the diversity of the materials that are used. Both cases are
met in low-income regions, as a substantial percentage of the urban population lives in
slums, which are constructed with bricks and other materials spectrally similar to unpaved
roads. Additionally, the active learning (AL) algorithm was adopted by two studies [87,113]
with the first including both TL and AL in the deep-learning DF model. AL algorithms
propose an iterative process of data querying in order to identify the most representative
data, using only a small data sample. AL has been applied in studies that lacked labelled
data, giving users the samples that had the higher chance of being correctly annotated [87].
However, two challenges are observed when AL and DL are combined, namely
the labelling cost, as the AL initiates the training process with a small amount of data,
and the data uncertainty, on which the AL model is based. An additional constraint is the
shortage of feature data. Obtaining training data from existing generalised datasets was
adopted in most articles, including data augmentation schemes of image rotations, features
with certain properties (e.g., spectral indices, textural features) and the use of generalised
datasets (e.g., gold-standard benchmark datasets). In addition, deep-learning models, such
as the Fast Region-Based CNN or Faster R-CNN [118], the You-Only-Look-Once (YOLO),
and the Single-Shot-Detection (SSD) [7], have been introduced to reduce the burden of
manual annotation in object detection. Nevertheless, a growing research field targets the
development of more elaborate techniques to generate accurate training data that could
assist models to perform more efficient predictions.
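The iterative data-querying step of active learning can be sketched in a few lines of Python. The sketch uses entropy-based uncertainty sampling, a common generic query strategy that is named here as an assumption; it is not necessarily the strategy of the reviewed studies [87,113], and the tile identifiers and probabilities are hypothetical.

```python
import math


def entropy(probabilities):
    """Shannon entropy of a predicted class distribution."""
    return -sum(p * math.log(p) for p in probabilities if p > 0)


def select_queries(predictions, budget):
    """Return the ids of the `budget` samples whose predictions are most uncertain."""
    ranked = sorted(predictions, key=lambda sid: entropy(predictions[sid]), reverse=True)
    return ranked[:budget]


# Hypothetical model outputs (building / no building) for unlabelled image tiles.
predictions = {
    "tile_a": [0.98, 0.02],
    "tile_b": [0.55, 0.45],
    "tile_c": [0.70, 0.30],
    "tile_d": [0.50, 0.50],
}
to_annotate = select_queries(predictions, budget=2)
```

In a full loop, the selected tiles would be sent to annotators, added to the training set, and the model retrained before the next query round.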
perspective, the ability to assess the quality of the CS observations “on-the-spot” during
data collection campaigns seems to be an emerging trend. In the 5G era, the increasing
need for emergent crowdsourcing applications promotes the use of decentralised mobile
systems, as they can consume in real-time the “wisdom of the crowds” and provide a
direct validation mechanism [187]. Eventually, immersive platforms (e.g., Virtual Reality
(VR)/Augmented Reality (AR)) could have a critical role in citizens’ engagement and
training process. AR/VR mobile or web applications could be the “virtual globes” that
could assist citizens in familiarising themselves with the crowdsourcing task and thus
provide less noisy measurements [188].
As was observed in the majority of articles, a paradox is still
present: VGI data are, on the one hand, embraced by scientists as a new and auspicious source of
information and, on the other hand, frequently criticised for their discrepancies in quality.
Several approaches have been proposed to overcome this challenge, and are described
in detail in Section 4.4, including initial statistical methods (e.g., majority voting,
etc.), probabilities (Naïve Bayes binary classification), and the more complex frameworks
of deep-learning NNs. However, only a small number of articles investigated the impact of
the active involvement of citizens and the behavioural and emotional factors that could
contribute to an influential citizens’ engagement. Hence, it might be fruitful to shape
the direction of future research towards the identification of these societal needs. Ideally,
participation in CS projects should have a goal to strengthen citizens’ trust in science and
encourage multi-stakeholder partnerships. In this direction, it is mandatory to prioritise
citizens’ concerns and apply methods that could obtain a better understanding of the
engaged citizens. Such scenarios can result in person-centric approaches (e.g., Latent
Class Analysis-LCA and Experience Sampling Method-ESM) [189] and self-assessment
questions [136], formulated on the basis of different socioeconomic profiles (i.e., geographic,
demographic, non-cognitive and cognitive personal characteristics) [190] and incentives.
Therefore, Vohland et al. [189] stated that fruitful outcomes might be generated if such
methods could be leveraged in order to trace the heterogeneity in a group or between
homogeneous subgroups.
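Among the quality-control approaches listed earlier in this section, majority voting is the simplest to illustrate. The Python sketch below aggregates the labels given by several contributors to the same task; the labels, the function name and the tie-handling rule (the first label encountered wins) are assumptions made for illustration.

```python
from collections import Counter


def majority_vote(labels):
    """Return the most frequent label and the share of contributors who chose it."""
    counts = Counter(labels)
    label, votes = counts.most_common(1)[0]
    return label, votes / len(labels)


# Hypothetical crowdsourced labels for two mapping tasks.
contributions = {
    "task_1": ["building", "building", "road"],
    "task_2": ["forest", "water", "forest", "forest"],
}
consensus = {task: majority_vote(labels) for task, labels in contributions.items()}
```

The agreement share returned alongside each label can serve as a simple confidence indicator, for example to route low-agreement tasks to expert review.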
The future of citizen science should go beyond the limits of academic science, in order
to be in the position to shape innovation policies and sustainable governmental plans. Social
and governmental plans that aim to tangibly contribute to the achievement of Sustainable
Development Goals (SDGs) require a collaborative and holistic approach with all the
actors (i.e., citizens, policymakers, scientific communities, industrial and social actors) to
be engaged. Citizens can offer both bottom-up and top-down perspectives, generating
valuable knowledge to scientists and contributing to SDGs’ diversity (bottom-up). On the
other hand, citizens may be able to support the initiation of local and global communities
that could mobilise governments and businesses to take action (top-down). The role of
CS for SDGs has already been acknowledged. Nevertheless, a proper transformation of
the SDGs into a “common language” is a mandatory activity in order to invest in citizens’
active and continuous involvement [191]. However, without funding schemes and tangible
benefits, long-term commitment of CS in SDGs pathways could be at risk. Finally, such
activities should be accompanied by relevant dissemination and democratisation of the
research findings, targeting the preservation of citizens’ independent views, their social
innovation and self-reflection on SDGs [192].
6. Conclusions
In this scoping review, we attempted to provide an overview of the data fusion models
that assimilated the remotely sensed and crowdsourced data streams, both of which have
emerged as promising, scalable and low-cost ways to provide insights in many domains.
The extraordinary increase in articles targeting this research field seized our interest to
explore the use cases, where the data is used, the technological equipment and tools,
under which a CS survey is leveraged, and eventually, the algorithms that integrate this
information, categorised under the data fusion abstraction levels. We carefully reviewed
the literature, following the guidelines of the systematic scoping review, and emphasised
the strengths and challenges of identified methods overcoming concerns related to data
quality, data sparsity, biases related to human cognitive level, and big data related obstacles.
Our analysis revealed the necessity of deploying a multi-sensory approach, combining the
traditional satellite observations and the next generation StoA missions to confront the
spatiotemporal gaps. In addition, big data solutions, such as cloud-based platforms,
high-performance computing, and datacubes, are a mandatory pathway in order
to handle this vast data volume and to exploit EO data in less common
research domains.
The beneficial contribution of CS data is depicted across the literature, characterised
as a low-cost, low-intensity and scalable data source, which could overcome the limited
sample condition. Furthermore, automated object-based or regional-based classifiers and
active and transfer learning technologies seemed insightful methods, enabling the provision
of an immense amount of annotated training datasets in only a fraction of the time.
However, Frank et al. [109] stated their concern regarding the future role of crowdsourcing
as the primary source of labelling data and the tendency to ultimately substitute users in
the mapping tasks with automated means [115]. Addressing this conclusion, we attempted
to shape the future role of citizens by introducing the concepts of collaborative partnerships
with multiple actors, person-centred behavioural examination approaches, and
most importantly, the democratisation and popularisation of scientific-related information,
promoting their self-reflection on the existing criticalities and their actual contribution to
the mitigation and adaptation policies. Even though the combined exploitation of CS and
EO is still underrepresented in the literature, we have shown the huge potential that the
aforementioned could achieve. An exciting addition to AI-DF models would be to see the
valuable contribution of the geographical information to the quality improvement of the
data-driven models and their benefits to the physical environmental models.
Author Contributions: Conceptualization, A.T.; methodology, A.K.; software, A.K.; validation, A.K.
and G.T.; formal analysis, A.K.; investigation, A.K. and G.T.; resources, A.A.; writing—original draft
preparation, A.K.; writing—review and editing, A.K. and G.T.; visualization, A.K.; supervision, A.T.,
G.T. and A.A.; project administration, G.T.; funding acquisition, A.A. All authors have read and
agreed to the published version of the manuscript.
Funding: This research was funded by the DIONE project under grant number 8703788, supported
by European Union’s Horizon 2020 research and innovation program.
Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.
Data Availability Statement: Not applicable.
Conflicts of Interest: The authors declare no conflict of interest.
Abbreviations
The following abbreviations are used in this manuscript:
ACC Accuracy
AFOLU Agriculture, Forestry And Other Land Use
AI Artificial Intelligence
AL Active Learning
AlexNet Convolutional Neural Network (CNN) Architecture, designed By Alex Krizhevsky
ALI Advanced Land Imager
ALS Airborne Laser Scanning
AOD Aerosol Optical Depth
AOT Aerosol Optical Thickness
API Application Programming Interface
APS Active Participatory Sensing
AR Augmented Reality
AR6 6th Assessment Report
ARC American Red Cross
ARD Analysis-Ready Data
AWS Amazon Web Service
BDF Bayesian Data Fusion
BME Bayesian Maximum Entropy
BSI Bare Soil Index
BuEI Built-Up Index
C/N0 Carrier-To-Noise Ratio
CART Classification And Regression Trees
CC Climate Change
CCA Canonical Correlation Analysis
CCF Canonical Correlation Forests
CCI-LC Climate Change Initiative Land Cover
CDOM Coloured Dissolved Organic Matter
CEOS Committee On Earth Observation Satellites
CGD Crowdsourced Geographic Data
CGI Crowdsourced Geographic Information
chl-a Chlorophyll-A
CHM Canopy Height Model
CIL Citizen Involved Level
CNN Convolutional Neural Network
CO Citizen’s Observatory
CS Citizen Science
CT Classification Task
CTI Compound Topographic Index
DA Data Assimilation
DB Database
DBSCAN Density-Based Spatial Clustering Of Applications With Noise
DCT Digitisation-Conflation Tasks
DEM Digital Elevation Model
DF Data Fusion
DMS-OLS Defense Meteorological Satellite Program’s Operational Linescan System
DP Developed Platforms
DSM Digital Surface Model
DTM Digital Terrain Model
EO Earth Observation
EO4SD Earth Observation For Sustainable Development
EOS End Of Season
EVI Enhanced Vegetation Index
EXIF Exchangeable Image File
FAO Food And Agriculture Organization
FCM Fuzzy C-Means
FCN Fully Connected Neural Network
FETA Fast And Effective Aggregator
FN False Negative
FP False Positive
FUI Forel-Ule Colour Index
FWW Freshwater Watch
GAN Generative Adversarial Networks
GBM Gradient Boosting Machines
GCC Green Chromatic Coordinate
GCOS Global Climate Observing System
GEE Google Earth Engine
GEO Group On Earth Observations
Appendix A
Table A1. Summary table of Vegetation Monitoring studies included in this scoping review. Abstraction levels of data fusion and Technological equipment of
CS data collection are described as assisted by the categories and abbreviations presented in Sections 3.5.1 and 3.5, respectively. Methods are categorised as the
following, AI: Artificial Intelligence, S: Statistical method, ES: Ensemble methods, FS: Fuzzy Association, and M: Mechanistic models.
Reference EO Data Source of CS DF Level Method Validation Evaluation Metric and Score
Overall Accuracy using Multinomial Law
DP/My Back Yard A manually digitised reference layer and a
[79] Aerial HF GeoBIA/S equation (73.5% classification result, and 76.63%
tool stratified separation of the dataset was used
CS valid decision)
RMSEs: mean tree height = 14.77–20.98% Mean
Sensors/GPS Out-of-Bag error estimation leveraging on DBH = 15.76–22.49%; plot-based basal areas =
[154] Satellite/GNSS LF RF Regression/ES
receivers reference data of forest attributes 15.76% and 33.95%, stem volume and AGB =
27.76–40.55% and 26.21–37.92%; R = 0.7 and 0.8
The combination of the GLONASS and GPS
Sensors/GPS Out-of-Bag error estimation comprising of
[77] Satellite/GNSS LF RF Regression/ES GNSS signals slightly better results than the
receivers reference data with forest attributes
single GPS values.
RMSE, bias, MAE and Pearson’s R evaluation
Linear change metrics were used, revealing a weak significance
[60] Satellite/MSI DP/NPN LF Ground-based phenological dates
point estimation/S when the CS data were used (Best result: r = 0.24,
MAE = 23.0, bias = −14.8, and RMSE = 28.0)
Green-up date RMSE unsystematic (RMSEu) and
Stratification based on the GLC2000 land cover
[73] Satellite/MSI DP/PlantWatch MUF L-Regression/S systematic (RMSEs) were 13.6 to 15.6 days, R and
map
p < 0.0001
Binary classification: Apparent error rate (GWR:
AER = 0.15), Sensitivity (CART: Sen = 0.844),
k-NN, Naïve Bayes,
Specificity (GWR: Spe = 0.925), pairwise
logistic regression,
[85] Satellite/LULC DP/Geo-Wiki HF 10-fold cross validation McNemar’s statistically significance test (GWR: p
GWR, CART/AI +
= 0.001); % Forest cover: GWR: AER = 0.099,
S
CART: Sen = 0.844, GWR: Spe = 0.927, GWR: p =
0.001
Remote Sens. 2022, 14, 1263 50 of 66
Reference EO Data Source of CS DF Level Method Validation Evaluation Metric and Score
LULC OA forest recognition (89 %), specificity =
[84] Satellite/LULC DP/Geo-Wiki HF GWR/S Equal-area stratified random sampling 0.93, Sensitivity = 96%; Estimation of % forest
cover R2 = 0.81, RMSE = 19
Apparent error % Hybrid predicted dataset = 4
Validated with a random distribution of reference
[83] Satellite/LULC DP/Geo-Wiki HF GWR/S (Best score for Russia), and 5 for the Sakha
data
Republic
RMSE and MAE of the Day of the Year (DOY)
Hybrid/in situ and Sprint Plant Resubstitution error rate between the observed
[71] DP/NPN LF resulted in 11.48 and 6.61, respectively (Best
DEM Phenolog/M and the estimated days
model)
DP/ 53,000 CS data Convergence of (1) Stratified dataset from 2 CS campaigns. (1) Stratified dataset from 2 CS campaigns.
samples were evidence approach Finalised with experts’ evaluation. (2) 1033 pixels Finalised with experts’ evaluation. (2) 1033 pixels
[3] Satellite/LULC HF
gathered on the (IDW with high disagreement randomly classified by 2 with high disagreement randomly classified by 2
Geo-wiki platform interpolation)/S experts. experts.
(1) R2 >0.5 between field observations and
L-Regression
independent variables; (2) T-tests between CLaRe,
Spearman’s
Hybrid/Rainfall DP/ CS campaign (1) Reference data collected on the field, (2) native and invasive Buffelgrass depicted
[78] MUF correlation, Binary-
Satellite/MSI archived in NPN CLaRe metrics significant differences. (3) OA Buffelgrass
thresholding
classification was equal to 66–77%, with
classification/S
trade-offs in UA, PA accuracies
Completeness (C = 69.2 to 96.7), Thematic
accuracy metrics: positive predictive value (PPV
= 0.926 to 1.000), false positive (FDR = 0.037 to
0.550), false negative rate (FNR = 0.033 to 0.308),
Template match Validated with reference training samples
Other/CS F1 = 0.545 to 0.964, Positional accuracy metrics:
[46] Satellite/MSI MUF binary applying hold-out dividing the dataset in
campaign Diameter error for a correctly identified ITC (ed)
thresholding/S train-testing
RMSE for tree crown diameter measurement for
crowdsourced observation (RMSEcd = 0.67 to
0.93) and tree crown centroid position
measurement (RMSEcc = 0.28 to 1.01)
Remote Sens. 2022, 14, 1263 51 of 66
Reference | EO Data | Source of CS | DF Level | Method | Validation | Evaluation Metric and Score
[86] | Aerial/MSI | DP/Amazon's Mechanical Turk | MUF | ResNet34 CNN model/AI | Hold-out dataset separation in train, validation and test based on [193] method | OA = 0.9741, precision = 0.9351, recall = 0.9694, F1 = 0.9520
[72] | Hybrid/MSI and rainfall (gridded and in situ) | DP/NPN and PH/PhenoCam | MUF | 13 SPPs + Monte Carlo optimisation/M | in situ phenological data | Best optimised parameters were determined with 95% confidence intervals for each parameter, using the chi-squared test. AICc was used to select the best model for all the species. R = 0.7–0.8, RMSE = 5–15, and Bias = ±10 days
[81] | Satellite/SAR, MSI, DEM | Other/2 week CS campaign | MUF | CART/AI | Reference data set of 7500 sample points was generated and stratified based on the first stage land cover classes | Forest plantation area was estimated with high overall accuracy (85%)
[74] | Satellite/MSI | DP/NPN | LF | - | Non-significant results, R² = 0.014; p = 0.052; Rule #1 R² = 0.19; p < 0.001 | Rule #2 R² = 0.29; p < 0.001. Median MODIS values improved R² = 0.67; p < 0.001. Lilac revealed the greatest results R² = 0.32–0.73; p < 0.001. R² = 0.23 to 0.89 across species.
[80] | Satellite/LC product | DP/Geo-Wiki | HF | Bayesian data fusion (BDF)/AI | Stratification keeping only those where CS data were at hand (i.e., 500) | BDF has proven a suitable method, as the log-likelihood ratio (G²) and the chi-squared (χ²) were 0.2511 and 5.991, corresponding to a p-value = 0.8820. The OA of the cropland = 98.00%.
[75] | Satellite/MSI | DP/NPN | LF | 6 S models ¹, NN/AI and RF regression/ES | 10-fold cross-validation with an 80/20 ratio | RF outperformed with r = 0.69, WIA = 0.79, MAE = 12.46 days and RMSE = 17.26 days (for deciduous forest)
[82] | Satellite/MSI and SAR | SP/Plantix | MUF | 1D and 3D CNN/AI and RF harmonic coefficients/ES | Feature importance via permutation; using also governance district statistics | (A) 3-crop type classes: Accuracy = 74.2 ± 1.4%, Precision = 75.9 ± 1.7%, Recall = 74.2 ± 1.4%, F1 = 73.7 ± 1.4%; (B) crop-type map accuracy = 72%
¹ S: Threshold, amplitude threshold, delayed-moving window, first and higher-order derivative, function fitting method.
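Several rows in these summary tables report precision, recall and F1 together. Since F1 is the harmonic mean of precision and recall, the three values can be cross-checked against each other. The following is a minimal Python sketch of that check; the function name `f1_score` is illustrative and is not taken from any of the reviewed studies. It uses the precision and recall reported for [86].

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported for [86] in Table A1.
f1 = f1_score(0.9351, 0.9694)
print(round(f1, 3))  # 0.952, matching the reported F1 = 0.9520
```

The same check can be applied to any other row that lists all three metrics; small differences are expected where the source values were rounded or averaged over classes.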
Table A2. Summary table of Land Use/Land Cover studies included in this scoping review. Abstraction levels of data fusion and technological equipment of CS data collection follow the categories and abbreviations presented in Sections 3.5.1 and 3.5, respectively. Methods are categorised as follows: AI: Artificial Intelligence, S: Statistical method, ES: Ensemble methods, FS: Fuzzy Association, and M: Mechanistic models.
Reference | EO Data | Source of CS | DF Level | Method | Validation | Evaluation Metric and Score
[90] | Satellite/MSI | DP/OSM | MUF | CART, RF/AI + ES | Stratified random validation using ground truth (1), Landsat-based LC maps (2) | OA and kappa coefficient of urban land reach 88.80% and 0.74 (1), and 97.08% and 0.92 (2)
[44] | Satellite/MSI | | MUF | SVM RBF/AI | 5-fold CV, using a manually digitized LC | OA = 93.4 and kappa coefficient = 0.913
[95] | Satellite/LC | DP/OSM | HF | Probabilistic Cluster Graphs/ES | Resubstitution using the ATKIS national Digital Landscape Model (1:250,000) | Scenario 4 with the general class-weighting factor predicted with OA = 78% and kappa = 0.67. Class-wise balanced accuracy produced more accurate results
[96] | Satellite/LC | DP/Geo-Wiki | HF | Logistic-GWR/S | Stratification scheme based on Zhao et al. [194] using an independent validation dataset, and 2 Geo-Wiki campaigns | Two hybrid maps were generated, with the first performing better (OA: 87.9%). The tree-cover land class presented the best results among all classes in both maps, PA_1: 0.95, UA_1: 0.94 and PA_2: 0.97, UA_2: 0.93
[91] | Satellite/SAR and MSI | DP/OSM | MUF | ISODATA and estimation of fractional cover of LC/S | Stratified random sampling approach (50 validation points per LCC class) | The confusion matrix of the LCC classification map shows an OA = 90.2%. “Tree cover” and “cropland/grassland” reveal the greatest confusion in change classes.
[88] | Satellite/MSI | PH/Flickr | MUF | Naïve Bayes CS LULC, Maximum likelihood/AI | Stratified train/test dataset with a 70/30 ratio | LULC (Landsat-5 TM + CS LULC map), with OA and kappa equal to 70% and 0.625, respectively. The accuracy level was mostly driven by urban areas.
[45] | Satellite/MSI | DP/OSM | HF | NB, DT-C4.5/AI, RF/ES | (a) Random Subsampling-Hold out: 50/50 (b) SMOTE stratification method | DT C4.5 gave the lowest OA values. In the LULC system, NB achieved OA = 72.0%, followed by RF in the 5 and 4-classes LULC system (OA = 81.0% and OA = 84.0%)
[92] | Satellite/MSI | SM/Sina Weibo and Baidu, DP/OSM | HL | Binary Thresholding, Similarity Index/S | Random Subsampling-Hold out, based on 289 visually inspected testing parcels | LU Level I classes: OA = 81.04% and kappa-coefficient = 0.78; LU Level II classes: OA = 69.89% and kappa-coefficient = 0.68
[21] | Satellite/MSI | Other/Focus groups | HF | Statistical analysis/S | Resubstitution method using reference data | Confusion matrices were computed to measure the deviation in LU-class coverage
[89] | Satellite/MSI | DP/OSM | MUF | RF (RoFs and DT), XGBoost, and CNN/ES | 15 randomly defined training datasets (10 uniformly divided, 5 last extracting 500 samples for all the classes) | LCZ OA = 74.94%, kappa = 0.71 (the greater score using RF method, and experts' handcrafted features). The classification accuracy increased with the imbalanced dataset
[93] | Hybrid/MSI and DEM | DP/OSM | HF | RF Classifier/ES | Random Subsampling-Hold out randomly selecting validation points, split into training and testing datasets | Out-of-bag error = 4.8% and OA = 95.2%, with 95.1% of the forested areas correctly classified. Low user's accuracy of 57.4% was revealed on commercial lands
[136] | Satellite/MSI | DP/Google Earth | MUF | RF Classifier/ES | Reference dataset by LCZ expert | OA = 0 to 0.75 (produced by several iterations)
[64] | Satellite/MSI | DP/OSM, SM/FB and Twitter | MUF | SVM and DCNN/AI | Transfer performance learning with UC-Merced dataset and 10-fold Cross-Validation | OA is based on the mean and standard deviation of all the results. OA_train = >90% and OA_test = 86.32% and 74.13%
[87] | Satellite/MSI | DP/Crowd4RS | MUF | Ac-L-PBAL and SVM RBF, RF, k-NN/AI | 260 labelled samples were provided by RS experts | Three-cross-validation technique of the 3 algorithms. OA = 60.18%, 59.23% and 55%
[2] | Satellite/MSI | DP/OSM | HF | Fuzzy logic/FS | Stratified random sampling of 200 points per class | 5 tests were evaluated, with the final one revealing OA = 70%. Population true proportions (pi) [195], UA and PA per class also calculated.
[94] | Satellite/MSI | DP/OSM | MUF | RF Classifier/ES | Stratified random sampling | OA = 78% (study area A) and OA = 69% (study area B)
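Most Land Use/Land Cover rows report overall accuracy (OA) together with the kappa coefficient. Both are derived from the same confusion matrix: OA is the share of correctly classified samples, and Cohen's kappa corrects OA for the agreement expected by chance. The Python sketch below illustrates the computation under stated assumptions: the function name and the 2 × 2 matrix are hypothetical and do not come from any of the reviewed studies.

```python
from typing import List, Tuple


def overall_accuracy_and_kappa(cm: List[List[int]]) -> Tuple[float, float]:
    """Return (OA, Cohen's kappa) for a square confusion matrix.

    Rows are reference classes, columns are predicted classes.
    """
    k = len(cm)
    n = sum(sum(row) for row in cm)
    if n == 0 or any(len(row) != k for row in cm):
        raise ValueError("confusion matrix must be square and non-empty")

    observed = sum(cm[i][i] for i in range(k)) / n  # OA
    row_totals = [sum(row) for row in cm]
    col_totals = [sum(cm[i][j] for i in range(k)) for j in range(k)]
    # Agreement expected by chance from the row and column marginals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa


# Hypothetical two-class example (not data from the reviewed studies).
oa, kappa = overall_accuracy_and_kappa([[40, 10], [5, 45]])
print(f"OA = {oa:.2f}, kappa = {kappa:.2f}")  # OA = 0.85, kappa = 0.70
```

Because kappa discounts chance agreement, it is always lower than OA for an imperfect classifier, which is consistent with the paired values listed in Table A2.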
Table A3. Summary table of Natural Hazards studies included in this scoping review. Abstraction levels of data fusion and technological equipment of CS data collection follow the categories and abbreviations presented in Sections 3.5.1 and 3.5, respectively. Methods are categorised as follows: AI: Artificial Intelligence, S: Statistical method, ES: Ensemble methods, FS: Fuzzy Association, and M: Mechanistic models.
Reference | EO Data | Source of CS | DF Level | Method | Validation | Evaluation Metric and Score
[101] | Hybrid/Satellite and in situ | DP/VGI mappers using Google Earth | HF | Multivariate Logistic regression/S | Resubstitution DFO map | Log-Likelihood, chi-squared, and p-value tests revealed −15952, 465.94, and 2.2e−16 for the outside of the city model, and −7534, 273.41 and 2.2e−16 for the inside.
[110] | Aerial | Sensors | LF | Univariate L-Regression/S | Resubstitution between CS and DOE | R² = 0.87
[109] | Aerial | DP/QGIS, Web-based eCognition | MUF | RF Classifier/ES | Hold out human-labelled training data | Estimation of feature importance of the model. Pixel-based classification seemed more noise resistant, with the AUC dropping by 2.02, 7.3, and 0.85% for NAR, label noise and random noise when 40% of the damage is contaminated.
[104] | Satellite/DEM | SM/Twitter | HF | Getis–Ord Gi*-statistic/S | Using an elevation threshold | Evaluation of statistical significance based on z-score values ± 1.64
[100] | Hybrid/in situ, DEM | SM/Baidu | HF | Multivariate logistic regression/S | Resubstitution with Wuhan water authority flood risk maps | Backward stepwise significance method (Wald chi-square test = 0.001), AUC = 0.954, asymptotic significance = 0.000
[98] | Satellite/DTM, LiDAR | SM/YouTube and Twitter | LF | EnKF/S | Resubstitution using the observed water depth values | Nash–Sutcliffe efficiency (NSE) > 0.90 and R > 0.97 (Best results)
[102] | Satellite/MSI, DEM | PH/Flickr | MUF | WoE/S | Random Subsampling-Hold out with 120 points | ROC for Flickr images was 0.6 (train) and 0.55 (test). EO was 0.88 and 0.90, and AUROC posterior = 0.95 and 0.93. Pair-wise conditional independence ratio = 6.38
[103] | Satellite/MSI | DP/JORD app, SM/Twitter, YouTube, PH/Flickr, Google | MUF | V-GAN/AI | Two-fold cross-validation strategy | Matthews correlation coefficient for the binary classification = 0.805. OA = 0.913, precision = 0.883, recall = 0.862, specificity = 0.943, and F1 = 0.870
[99] | in situ | Sensor, Smartphone | LF | Kalman Filter/S | Resubstitution error rate between the reference and simulated values | 4 experiments were evaluated using NSE and Bias indexes. However, no general conclusions can be derived.
[107] | Satellite/MSI, DEM | Smartphone | MUF | Mahalanobis K-NN/AI | LOOCV leveraging on the reference dataset | Canonical correlation (F = 2.77 and p < 0.001), Wilks' lambda = 0.007. RMSD investigated professional (P), non-professional (NP), raw data (R) and corrected by experts (C) ²
[105] | Hybrid/Satellite and in situ | Smartphone | MUF | Flood-fill simulation/S | in situ reference measurements and experts' quality control | Kappa-coefficient = 0 to 0.746
² Best results: RMSD was estimated for the indicators related to the wildland-urban interface (WUI) fuel model, which are Conifer crown closure (All = 22.5%), Conifer crown base (P = 1.9 m), Suppressed conifers (All = 114.0 stems/ha), Surface vegetation (All = 21.6%), Large woody debris (All = 28.3%), Fine woody debris (All = 33.2%).
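Two of the Natural Hazards studies ([98] and [99]) are evaluated with the Nash–Sutcliffe efficiency (NSE). NSE compares the squared error of the simulated values with the variance of the observations: a value of 1 indicates a perfect match, and 0 indicates that the model performs no better than the mean of the observations. The Python sketch below shows the standard definition; the function name and the water-depth values are hypothetical and are not taken from the reviewed studies.

```python
from typing import Sequence


def nash_sutcliffe(observed: Sequence[float], simulated: Sequence[float]) -> float:
    """NSE = 1 - sum((obs - sim)^2) / sum((obs - mean(obs))^2)."""
    if len(observed) != len(simulated) or len(observed) == 0:
        raise ValueError("series must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    if variance == 0:
        raise ValueError("NSE is undefined for constant observations")
    return 1.0 - residual / variance


# Hypothetical water depths in metres (illustrative values only).
obs = [1.0, 2.0, 3.0, 4.0]
sim = [1.1, 1.9, 3.2, 3.8]
print(round(nash_sutcliffe(obs, sim), 2))  # 0.98
```

Under this definition, the threshold NSE > 0.90 reported for [98] means that the residual error is less than 10% of the variance of the observed water depths.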
Table A4. Summary table of Urban Monitoring studies included in this scoping review. Abstraction levels of data fusion and technological equipment of CS data collection follow the categories and abbreviations presented in Sections 3.5.1 and 3.5, respectively. Methods are categorised as follows: AI: Artificial Intelligence, S: Statistical method, ES: Ensemble methods, FS: Fuzzy Association, and M: Mechanistic models.
Reference | EO Data | Source of CS | DF Level | Method | Validation | Evaluation Metric and Score
[57] | Satellite/MSI | DP/OSM | HF | Logistic regression/S | Validation by experts | OA = 89%, Sensitivity = 73%, Precision = 89%, F1-score = 0.80, and 25% of volunteers contribute 80% of the classification result
[115] | Satellite/MSI | DP/OSM | MUF | FCN (VGG-16)/AI | Hold-out equally divided the dataset into train-test-validation | The best model, “ISPRS Gold standard data”, with average F1 = 0.874, Precision = 0.910, Recall = 0.841 and F1 = 0.913, 0.764, 0.923 for the building, road, and background classes.
[116] | Hybrid/GNSS, DEM, LC, Aerial | Smartphone | LF | RIPPER, PART, M5Rule, OneR/AI | 10-fold Cross-Validation (CV) and visual inspection | Evaluation of CS traces using Shapiro-Wilk, skewness and kurtosis tests. RIPPER's best results: Precision = 0.79, recall = 0.79, F1 = 0.79
[113] | Satellite/MSI | DP/OSM, MapSwipe, OsmAnd | HL | LeNet, AlexNet, VggNet/AI | Random Subsampling Hold-out with MapSwipe and OsmAnd as ground-truth | Precision, Recall, F1, and AUC of combined CS data were higher with an average of 9.0%, 4.0%, and 5.7%
[114] | Satellite/MSI | DP/OSM; SM/Baidu | MUF | FCNN, LSTM, RNN (1)/AI; SVM, LDA (2)/AI | Random Subsampling Hold-out (4:1 train/test) | OA was increased from 74.13% to 91.44%. Building and road categories were increased by around 1% (1); OA_SVM = 72.5%, LDA = 81.6%, PM = 93.1% (2)
[117] | Satellite/MSI | Smartphone/GNSS | HF | U-Net, LinkNet, D-LinkNet/AI | Fusion RS with the outputs of road centerline extraction | Completeness (COM), Correctness (COR) and Quality (Q) equal to 0.761, 0.840, 0.672
[118] | Satellite/LiDAR | Other/Focus Groups and DP/Zooniverse | MUF | Faster R-CNN/AI | Model division in Train, Test, Validation | Recall, Precision, F1, MaF1 evaluation metrics presented in [196]
[111] | Satellite/MSI | DP/Tomnod | MUF | kd-tree-based hierarchical classification/AI | Stratification based on population density classes, and reference data created by citizens' visual inspection | Logistic and beta regressions were applied for the precision (0.70 ± 0.11) and recall (0.84 ± 0.12) metrics, evaluating the village boundaries of (1) the examined areas, and (2) six cities of the globe. Evaluation of the population density by the normalised correlation (NC), and the normalised absolute difference (ND).
[7] | Satellite/MSI | DP/OSM, MapSwipe | HF | DeepVGI model (Single Shot Detection (SSD) CNN)/AI | Hold out dataset division in train and test. Maximum training epochs was set to 60000 | Specificity, Sensitivity, OA (ACC), Matthews correlation coefficient (MCC) metrics. DeepVGI model revealed similar results to MapSwipe, i.e., OA = 91–96%, MCC = 74–84%, specificity = 95–97%, and sensitivity = 81–89%
[112] | Hybrid/MSI and LULC | DP/OSM | MUF | Bagging CNN, V3, VGG16/ES (1); SL models ³/AI (2) | Hold-out (85/15 train/test) using a reference dataset. | Ensemble model revealed the best results in both Nigeria and Guatemala, OA_N = 94.4%, OA_G = 96.4%, and F1_N = 92.2%, F1_G = 96.3%. Transfer Learning models performed with OA_N = 93% and OA_G = 95%, Human benchmark (OA_N = 94.5%, OA_G = 96.4%). AdaBoost and logistic regression performed better.
³ 7 classification algorithms: Decision trees, Gradient boosting trees, AdaBoost, Random Forest, logistic regression, support vector machines, and k-nearest neighbour.
Table A5. Summary table of Air monitoring studies included in this scoping review. Abstraction levels of data fusion and technological equipment of CS data collection follow the categories and abbreviations presented in Sections 3.5.1 and 3.5, respectively. Methods are categorised as follows: AI: Artificial Intelligence, S: Statistical method, ES: Ensemble methods, FS: Fuzzy Association, and M: Mechanistic models.
Reference | EO Data | Source of CS | DF Level | Method | Validation | Evaluation Metric and Score
[121] | Hybrid/DEM, LULC and in situ | Sensors | MUF | Non-parametric significance tests LCZ/S | Reference weather stations | 1-way ANOVA (Kruskal-Wallis test), pairwise 2-sided Wilcoxon-Mann-Whitney and Holm-Bonferroni. Difference CS = 1.0 K and Reference = 1.8 K; Difference between the 2 datasets ≤ ± 0.2–2.9 K
[119] | Hybrid/in situ and Satellite | Sensors | LF | L-Regression/S | in situ PM2.5 US EPA Air Quality concentrations, and Aeronet AOD | R² = 0.04
[120] | Satellite/MSI + LiDAR | Sensors | LF | RF Regression; ANOVA test for the explanatory variables/ES | Hold-out (70/30) with bootstrapping repeated 25 times | Annual mean, daily maximum and minimum Tair can be mapped with an RMSE = 0.52 °C (R² = 0.5), 1.85 °C (R² = 0.5) and 1.46 °C (R² = 0.33); Transferability revealed an RMSE = 0.02 °C difference.
[97] | Hybrid/LULC and in situ | S/Personal Weather Station | MUF | WRF model/M | Compared with measurements from in situ weather stations | RMSE = 0.04 K and Mean Bias = 0.06 K (p = 0.001). Weak positive correlation (r = 0.28) of the elevation with variations in model's performance
Table A6. Summary table of Ocean/marine monitoring studies included in this scoping review. Abstraction levels of data fusion and Technological equipment of CS data collection are described as assisted by the categories and abbreviations presented in Sections 3.5.1 and 3.5, respectively. Methods are categorised as the following, AI: Artificial Intelligence, S: Statistical method, ES: Ensemble methods, FS: Fuzzy Association, and M: Mechanistic models.
Reference | EO Data | Source of CS | DF Level | Method | Validation | Evaluation Metric and Score
[124] | Satellite/MSI | Smartphone | HF | Multi-Linear Regression/S | - | Spearman rank coefficient; R² = 0.25, 0.32; mean absolute difference = 21 ± 16%, 22 ± 16%, mean difference = 6 ± 26%, 6 ± 27%
[122] | Satellite/MSI | Smartphone | MUF | Backward step Multi-Linear-Regression/S | Field measurements | Spearman rank coefficient; Adjusted R² = 0.50, standard error 0.274 with stream distance weighted mixed forest and urban, and riparian percentage
[123] | Satellite/MSI, DEM | Sensor | MUF | RF Regression/ES | out-of-bag using 10-fold cross-validation | Turbidity: OA = 65.8%, kappa = 0.32, Error good/bad evaluation 34.6% and 36.0%; Nitrate: 70.3%, 0.26, 58.8%, 14.7%; Phosphate: 71.8%, 0.39, 50.0%, 22.6%
Table A7. Summary table of Humanitarian and crisis response studies included in this scoping review. Abstraction levels of data fusion and Technological equipment of CS data collection are described as assisted by the categories and abbreviations presented in Sections 3.5.1 and 3.5, respectively. Methods are categorised as the following, AI: Artificial Intelligence, S: Statistical method, ES: Ensemble methods, FS: Fuzzy Association, and M: Mechanistic models.
Reference | EO Data | Source of CS | DF Level | Method | Validation | Evaluation Metric and Score
[125] | Satellite/MSI | DP/Zooniverse | HF | Averaged density of kilns/S | Random sample selection; Visual inspection by an Independent adjudicator | OA = 99.6%; Error of experts = 0.4%; of CS = 6.1%
[126] | Satellite/Nightlights | DP/Syria tracker | MUF | Logistic regression/S | - | Statistical significance of all models is below 5%. Best model result AIC = 127967
Table A8. Summary table of Soil moisture studies included in this scoping review. Abstraction levels of data fusion and Technological equipment of CS data collection are described as assisted by the categories and abbreviations presented in Sections 3.5.1 and 3.5, respectively. Methods are categorised as the following, AI: Artificial Intelligence, S: Statistical method, ES: Ensemble methods, FS: Fuzzy Association, and M: Mechanistic models.
Reference | EO Data | Source of CS | DF Level | Method | Validation | Evaluation Metric and Score
[69] | Satellite/SAR, MSI, DEM | low-cost soil moisture sensors (S) | LF | multiple regression analysis (MLR), regression-kriging and cokriging/S | leave-one-out cross-validation | MLR_R² = 0.19 to 0.35 and MLR_RMSE = 5.86 to 4.14; regression kriging RMSE = 1.92–4.39; cokriging RMSE = 4.61–6.16
References
1. Rogelj, J.; Shindell, D.; Jiang, K.; Fifita, S. Mitigation Pathways Compatible with 1.5 °C in the Context of Sustainable Development; Technical Report; Intergovernmental Panel on Climate Change: Geneva, Switzerland, 2018.
2. Fonte, C.C.; Lopes, P.; See, L.; Bechtel, B. Using OpenStreetMap (OSM) to enhance the classification of local climate zones in the framework of WUDAPT. Urban Clim. 2019, 28, 100456.
3. Fritz, S.; See, L.; McCallum, I.; You, L.; Bun, A.; Moltchanova, E.; Duerauer, M.; Albrecht, F.; Schill, C.; Perger, C.; et al. Mapping global cropland and field size. Glob. Chang. Biol. 2015, 21, 1980–1992.
4. Song, Y.; Huang, B.; Cai, J.; Chen, B. Dynamic assessments of population exposure to urban greenspace using multi-source big data. Sci. Total Environ. 2018, 634, 1315–1325.
5. Huang, Q.; Cervone, G.; Zhang, G. A cloud-enabled automatic disaster analysis system of multi-sourced data streams: An example synthesizing social media, remote sensing and Wikipedia data. Comput. Environ. Urban Syst. 2017, 66, 23–37. https://doi.org/10.1016/j.compenvurbsys.2017.06.004.
6. Masson, V.; Heldens, W.; Bocher, E.; Bonhomme, M.; Bucher, B.; Burmeister, C.; de Munck, C.; Esch, T.; Hidalgo, J.; Kanani-Sühring, F.; et al. City-descriptive input data for urban climate models: Model requirements, data sources and challenges. Urban Clim. 2020, 31, 100536.
7. Herfort, B.; Li, H.; Fendrich, S.; Lautenbach, S.; Zipf, A. Mapping human settlements with higher accuracy and less volunteer efforts by combining crowdsourcing and deep learning. Remote Sens. 2019, 11, 1799.
8. Leal Filho, W.; Echevarria Icaza, L.; Neht, A.; Klavins, M.; Morgan, E.A. Coping with the impacts of urban heat islands. A literature based study on understanding urban heat vulnerability and the need for resilience in cities in a global climate change context. J. Clean. Prod. 2018, 171, 1140–1149. https://doi.org/10.1016/j.jclepro.2017.10.086.
9. Donratanapat, N.; Samadi, S.; Vidal, J.M.; Sadeghi Tabas, S. A national scale big data analytics pipeline to assess the potential impacts of flooding on critical infrastructures and communities. Environ. Model. Softw. 2020, 133, 104828. https://doi.org/10.1016/j.envsoft.2020.104828.
10. Duro, R.; Gasber, T.; Chen, M.M.; Sippl, S.; Auferbauer, D.; Kutschera, P.; Bojor, A.I.; Andriychenko, V.; Chuang, K.Y.S. Satellite imagery and on-site crowdsourcing for improved crisis resilience. In Proceedings of the 2019 15th International Conference on Telecommunications (ConTEL), Graz, Austria, 3–5 July 2019; pp. 1–6.
11. Foody, G.M.; Ling, F.; Boyd, D.S.; Li, X.; Wardlaw, J. Earth observation and machine learning to meet Sustainable Development Goal 8.7: Mapping sites associated with slavery from space. Remote Sens. 2019, 11, 266.
12. Goodchild, M.F. Citizens as sensors: The world of volunteered geography. GeoJournal 2007, 69, 211–221. https://doi.org/10.1007/s10708-007-9111-y.
13. See, L.; Fritz, S.; Perger, C.; Schill, C.; McCallum, I.; Schepaschenko, D.; Duerauer, M.; Sturn, T.; Karner, M.; Kraxner, F.; et al. Harnessing the power of volunteers, the internet and Google Earth to collect and validate global spatial information using Geo-Wiki. Technol. Forecast. Soc. Chang. 2015, 98, 324–335.
14. Dong, J.; Metternicht, G.; Hostert, P.; Fensholt, R.; Chowdhury, R.R. Remote sensing and geospatial technologies in support of a normative land system science: Status and prospects. Curr. Opin. Environ. Sustain. 2019, 38, 44–52. https://doi.org/10.1016/j.cosust.2019.05.003.
15. Mazumdar, S.; Wrigley, S.; Ciravegna, F. Citizen science and crowdsourcing for earth observations: An analysis of stakeholder opinions on the present and future. Remote Sens. 2017, 9, 87. https://doi.org/10.3390/rs9010087.
16. Sudmanns, M.; Tiede, D.; Lang, S.; Bergstedt, H.; Trost, G.; Augustin, H.; Baraldi, A.; Blaschke, T. Big Earth data: Disruptive changes in Earth observation data management and analysis? Int. J. Digit. Earth 2020, 13, 832–850. https://doi.org/10.1080/17538947.2019.1585976.
17. Hsu, A.; Khoo, W.; Goyal, N.; Wainstein, M. Next-Generation Digital Ecosystem for Climate Data Mining and Knowledge Discovery: A Review of Digital Data Collection Technologies. Front. Big Data 2020, 3, 29. https://doi.org/10.3389/fdata.2020.00029.
18. Salcedo-Sanz, S.; Ghamisi, P.; Piles, M.; Werner, M.; Cuadra, L.; Moreno-Martínez, A.; Izquierdo-Verdiguier, E.; Muñoz-Marí, J.; Mosavi, A.; Camps-Valls, G. Machine learning information fusion in Earth observation: A comprehensive review of methods, applications and data sources. Inf. Fusion 2020, 63, 256–272. https://doi.org/10.1016/j.inffus.2020.07.004.
19. Meng, T.; Jing, X.; Yan, Z.; Pedrycz, W. A survey on machine learning for data fusion. Inf. Fusion 2020, 57, 115–129. https://doi.org/10.1016/j.inffus.2019.12.001.
20. Fritz, S.; Fonte, C.C.; See, L. The role of Citizen Science in Earth Observation. Remote Sens. 2017, 9, 357.
21. Mialhe, F.; Gunnell, Y.; Ignacio, J.A.F.; Delbart, N.; Ogania, J.L.; Henry, S. Monitoring land-use change by combining participatory land-use maps with standard remote sensing techniques: Showcase from a remote forest catchment on Mindanao, Philippines. Int. J. Appl. Earth Obs. Geoinfor. 2015, 36, 69–82.
22. Goodchild, M. NeoGeography and the nature of geographic expertise. J. Locat. Based Serv. 2009, 3, 82–96. https://doi.org/10.1080/17489720902950374.
23. Kosmidis, E.; Syropoulou, P.; Tekes, S.; Schneider, P.; Spyromitros-Xioufis, E.; Riga, M.; Charitidis, P.; Moumtzidou, A.; Papadopoulos, S.; Vrochidis, S.; et al. HackAIR: Towards raising awareness about air quality in Europe by developing a collective online platform. ISPRS Int. J. Geo-Inf. 2018, 7, 187.
24. Tsertou, A. SCENT: Citizen sourced data in support of environmental monitoring. In Proceedings of the 2017 21st International Conference on Control Systems and Computer Science, Bucharest, Romania, 29–31 May 2017; pp. 612–616. https://doi.org/10.1109/CSCS.2017.93.
25. Martinelli, M.; Moroni, D. Volunteered geographic information for enhanced marine environment monitoring. Appl. Sci. 2018, 8, 1743.
26. Havas, C.; Resch, B.; Francalanci, C.; Pernici, B.; Scalia, G.; Fernandez-Marquez, J.L.; Van Achte, T.; Zeug, G.; Mondardini, M.R.R.; Grandoni, D.; et al. E2mC: Improving emergency management service practice through social media and crowdsourcing analysis in near real time. Sensors 2017, 17, 2766.
27. Kovács, K.Z.; Hemment, D.; Woods, M.; Velden, N.K.V.D.; Xaver, A.; Giesen, R.H.; Burton, V.J.; Garrett, N.L.; Zappa, L.; Long,
D.; et al. Citizen observatory based soil moisture monitoring—The GROW example. Hung. Geogr. Bull. 2019, 2, 119–139.
28. Grainger, A. Citizen observatories and the new Earth observation science. Remote Sens. 2017, 9, 153. https://doi.org/10.3390/rs9020153.
29. See, L.; Mooney, P.; Foody, G.; Bastin, L.; Comber, A.; Estima, J.; Fritz, S.; Kerle, N.; Jiang, B.; Laakso, M.; et al. Crowdsourcing, citizen science or volunteered geographic information? The current state of crowdsourced geographic information. ISPRS Int. J. Geo-Inf. 2016, 5, 55. https://doi.org/10.3390/ijgi5050055.
30. Dell’Acqua, F.; De Vecchi, D. Potentials of active and passive geospatial crowdsourcing in complementing sentinel data and supporting copernicus service portfolio. Proc. IEEE 2017, 105, 1913–1925. https://doi.org/10.1109/JPROC.2017.2727284.
31. See, L.; Fritz, S.; Dias, E.; Hendriks, E.; Mijling, B.; Snik, F.; Stammes, P.; Vescovi, F.D.; Zeug, G.; Mathieu, P.P.; et al. Supporting Earth-Observation Calibration and Validation: A new generation of tools for crowdsourcing and citizen science. IEEE Geosci. Remote Sens. Mag. 2016, 4, 38–50. https://doi.org/10.1109/MGRS.2015.2498840.
32. Poblet, M.; García-Cuesta, E.; Casanovas, P. Crowdsourcing roles, methods and tools for data-intensive disaster management. Inf. Syst. Front. 2018, 20, 1363–1379. https://doi.org/10.1007/s10796-017-9734-6.
33. Sagl, G.; Resch, B.; Blaschke, T. Contextual Sensing: Integrating Contextual Information with Human and Technical Geo-Sensor Information for Smart Cities. Sensors 2015, 15, 13. https://doi.org/10.3390/s150717013.
34. Resch, B. People as sensors and collective sensing-contextual observations complementing geo-sensor network measurements. In Lecture Notes in Geoinformation and Cartography; Springer: Berlin/Heidelberg, Germany, 2013; pp. 391–406. https://doi.org/10.1007/978-3-642-34203-5_22.
35. Ghermandi, A.; Sinclair, M. Passive crowdsourcing of social media in environmental research: A systematic map. Glob. Environ. Chang. 2019, 55, 36–47. https://doi.org/10.1016/j.gloenvcha.2019.02.003.
36. Tracewski, L.; Bastin, L.; Fonte, C.C. Repurposing a deep learning network to filter and classify volunteered photographs for land cover and land use characterization. Geo-Spat. Inf. Sci. 2017, 20, 252–268.
37. Ghamisi, P.; Rasti, B.; Yokoya, N.; Wang, Q.; Hofle, B.; Bruzzone, L.; Bovolo, F.; Chi, M.; Anders, K.; Gloaguen, R.; et al. Multisource and multitemporal data fusion in remote sensing: A comprehensive review of the state of the art. IEEE Geosci. Remote Sens. Mag. 2019, 7, 6–39. https://doi.org/10.1109/MGRS.2018.2890023.
38. Dalla Mura, M.; Prasad, S.; Pacifici, F.; Gamba, P.; Chanussot, J.; Benediktsson, J.A. Challenges and Opportunities of Multimodality and Data Fusion in Remote Sensing. Proc. IEEE 2015, 103, 1585–1601. https://doi.org/10.1109/JPROC.2015.2462751.
39. White, F.E. Data Fusion Lexicon; Joint Directors of Labs: Washington, DC, USA, 1991.
40. Castanedo, F. A review of data fusion techniques. Sci. World J. 2013, 2013, 19. https://doi.org/10.1155/2013/704504.
41. Schmitt, M.; Zhu, X.X. Data Fusion and Remote Sensing: An ever-growing relationship. IEEE Geosci. Remote Sens. Mag. 2016, 4, 6–23. https://doi.org/10.1109/MGRS.2016.2561021.
42. Estes, L.D.; McRitchie, D.; Choi, J.; Debats, S.; Evans, T.; Guthe, W.; Luo, D.; Ragazzo, G.; Zempleni, R.; Caylor, K.K. A platform for crowdsourcing the creation of representative, accurate landcover maps. Environ. Model. Softw. 2016, 80, 41–53. https://doi.org/10.1016/j.envsoft.2016.01.011.
43. Li, J.; Benediktsson, J.A.; Zhang, B.; Yang, T.; Plaza, A. Spatial Technology and Social Media in Remote Sensing: A Survey. Proc. IEEE 2017, 105, 1855–1864. https://doi.org/10.1109/JPROC.2017.2729890.
44. Wan, T.; Lu, H.; Lu, Q.; Luo, N. Classification of High-Resolution Remote-Sensing Image Using OpenStreetMap Information. IEEE Geosci. Remote Sens. Lett. 2017, 14, 2305–2309.
45. Johnson, B.A.; Iizuka, K. Integrating OpenStreetMap crowdsourced data and Landsat time-series imagery for rapid land use/land cover (LULC) mapping: Case study of the Laguna de Bay area of the Philippines. Appl. Geogr. 2016, 67, 140–149. https://doi.org/10.1016/j.apgeog.2015.12.006.
46. Vahidi, H.; Klinkenberg, B.; Johnson, B.A.; Moskal, L.M.; Yan, W. Mapping the individual trees in urban orchards by incorporating Volunteered Geographic Information and very high resolution optical remotely sensed data: A template matching-based approach. Remote Sens. 2018, 10, 1134.
47. Daudt, H.M.; Van Mossel, C.; Scott, S.J. Enhancing the scoping study methodology: A large, inter-professional team’s experience with Arksey and O’Malley’s framework. BMC Med. Res. Methodol. 2013, 13, 48. https://doi.org/10.1186/1471-2288-13-48.
48. Arksey, H.; O’Malley, L. Scoping studies: Towards a methodological framework. Int. J. Soc. Res. Methodol. Theory Pract. 2005, 8, 19–32. https://doi.org/10.1080/1364557032000119616.
49. Tricco, A.C.; Lillie, E.; Zarin, W.; Brien, K.O.; Colquhoun, H.; Kastner, M.; Levac, D.; Ng, C.; Sharpe, J.P.; Wilson, K.; et al. A scoping
review on the conduct and reporting of scoping reviews. BMC Med. Res. Methodol. 2016, 16, 15. https://doi.org/10.1186/s12874-
016-0116-4.
50. Munn, Z.; Peters, M.D.; Stern, C.; Tufanaru, C.; McArthur, A.; Aromataris, E. Systematic review or scoping review? Guidance
for authors when choosing between a systematic or scoping review approach. BMC Med. Res. Methodol. 2018, 18, 143.
https://doi.org/10.1186/s12874-018-0611-x.
51. Peters, M.D.; Godfrey, C.M.; Khalil, H.; McInerney, P.; Parker, D.; Soares, C.B. Guidance for conducting systematic scoping
reviews. Int. J. -Evid.-Based Healthc. 2015, 13, 141–146. https://doi.org/10.1097/XEB.0000000000000050.
52. Saralioglu, E.; Gungor, O. Crowdsourcing in Remote Sensing: A review of applications and future directions. IEEE Geosci. Remote
Sens. Mag. 2020, 8, 1–23. https://doi.org/10.1109/MGRS.2020.2975132.
53. Howard, B.E.; Phillips, J.; Miller, K.; Tandon, A.; Mav, D.; Shah, M.R.; Holmgren, S.; Pelch, K.E.; Walker, V.; Rooney, A.A.; et al.
SWIFT-Review: A text-mining workbench for systematic review. Syst. Rev. 2016, 5, 87. https://doi.org/10.1186/s13643-016-
0263-z.
54. Jonnalagadda, S.; Petitti, D. A new iterative method to reduce workload in systematic review process. Int. J. Comput. Biol. Drug
Des. 2013, 6, 5–17. https://doi.org/10.1504/IJCBDD.2013.052198.
55. Levac, D.; Colquhoun, H.; O’Brien, K.K. Scoping studies: Advancing the methodology. Implement. Sci. 2010, 5, 69.
https://doi.org/10.1017/cbo9780511814563.003.
56. Tobler, W.; Barbara, S. Measuring Spatial Resolution. In Proceedings of the Land Resources Information Systems Conference,
Beijing, China, 1987; pp. 12–16. Available online: https://www.researchgate.net/profile/Waldo-Tobler/publication/291877360_
Measuring_spatial_resolution/links/595ef94ba6fdccc9b17fe8ee/Measuring-spatial-resolution.pdf (accessed on 25 January 2022).
57. de Albuquerque, J.P.; Herfort, B.; Eckle, M. The tasks of the crowd: A typology of tasks in geographic information crowdsourcing
and a case study in humanitarian mapping. Remote Sens. 2016, 8, 859.
58. Zhang, J. Multi-source remote sensing data fusion: Status and trends. Int. J. Image Data Fusion 2010, 1, 5–24. https://doi.org/
10.1080/19479830903561035
59. Emilien, A.V.; Thomas, C.; Thomas, H. UAV & satellite synergies for optical remote sensing applications: A literature review. Sci.
Remote Sens. 2021, 3, 100019. https://doi.org/10.1016/j.srs.2021.100019.
60. Xie, Y.; Wilson, A.M. Change point estimation of deciduous forest land surface phenology. Remote Sens. Environ. 2020, 240,
111698.
61. Cavanillas, J.M.; Curry, E.; Wahlster, W. New Horizons for a Data-Driven Economy, 1st ed.; Springer: Berlin/Heidelberg, Germany,
2016; p. 312. https://doi.org/10.1007/978-3-319-21569-3.
62. Belgiu, M.; Csillik, O. Sentinel-2 cropland mapping using pixel-based and object-based time-weighted dynamic time warping
analysis. Remote Sens. Environ. 2018, 204, 509–523. https://doi.org/10.1016/j.rse.2017.10.005.
63. Lapierre, N.; Neubauer, N.; Miguel-Cruz, A.; Rios Rincon, A.; Liu, L.; Rousseau, J. The state of knowledge on technologies and
their use for fall detection: A scoping review. Int. J. Med. Inform. 2018, 111, 58–71. https://doi.org/10.1016/j.ijmedinf.2017.12.015.
64. Li, H.; Dou, X.; Tao, C.; Wu, Z.; Chen, J.; Peng, J.; Deng, M.; Zhao, L. RSI-CB: A large-scale remote sensing image classification
benchmark using crowdsourced data. Sensors 2020, 20, 28–32.
65. Bleiholder, J.; Naumann, F. Data Fusion. ACM Comput. Surv. 2009, 41, 1–41. https://doi.org/10.1145/1456650.1456651.
66. Wald, L. A conceptual approach to the fusion of earth observation data. Surv. Geophys. 2000, 21, 177–186.
https://doi.org/10.1023/A:1006760101519.
67. Moher, D.; Liberati, A.; Tetzlaff, J.; Altman, D.G.; Group, T.P. Preferred Reporting Items for Systematic Reviews and Meta-Analyses:
The PRISMA Statement. PLoS Med. 2009, 6, e1000097. https://doi.org/10.1371/journal.pmed.1000097.
68. Leibrand, A.; Sadoff, N.; Maslak, T.; Thomas, A. Using Earth Observations to Help Developing Countries Improve Access to
Reliable, Sustainable, and Modern Energy. Front. Environ. Sci. 2019, 7, 123. https://doi.org/10.3389/fenvs.2019.00123.
69. Kibirige, D.; Dobos, E. Soil moisture estimation using citizen observatory data, microwave satellite imagery, and environmental
covariates. Water 2020, 12, 2160. https://doi.org/10.3390/W12082160.
70. Salk, C.; Sturn, T.; See, L.; Fritz, S. Local knowledge and professional background have a minimal impact on volunteer citizen
science performance in a land-cover classification task. Remote Sens. 2016, 8, 774.
71. Mehdipoor, H.; Zurita-Milla, R.; Augustijn, E.W.; Izquierdo-Verdiguier, E. Exploring differences in spatial patterns and temporal
trends of phenological models at continental scale using gridded temperature time-series. Int. J. Biometeorol. 2020, 64, 409–421.
72. Melaas, E.K.; Friedl, M.A.; Richardson, A.D. Multiscale modeling of spring phenology across Deciduous Forests in the Eastern
United States. Glob. Chang. Biol. 2016, 22, 792–805.
73. Delbart, N.; Beaubien, E.; Kergoat, L.; Le Toan, T. Comparing land surface phenology with leafing and flowering observations
from the PlantWatch citizen network. Remote Sens. Environ. 2015, 160, 273–280.
74. Elmore, A.J.; Stylinski, C.D.; Pradhan, K. Synergistic use of citizen science and remote sensing for continental-scale measurements
of forest tree phenology. Remote Sens. 2016, 8, 502.
75. Xin, Q.; Li, J.; Li, Z.; Li, Y.; Zhou, X. Evaluations and comparisons of rule-based and machine-learning-based methods to retrieve
satellite-based vegetation phenology using MODIS and USA National Phenology Network data. Int. J. Appl. Earth Obs. Geoinf.
2020, 93, 102189. https://doi.org/10.1016/j.jag.2020.102189.
76. Liu, X.; He, J.; Yao, Y.; Zhang, J.; Liang, H.; Wang, H.; Hong, Y. Classifying urban land use by integrating remote sensing and
social media data. Int. J. Geogr. Inf. Sci. 2017, 31, 1675–1696. https://doi.org/10.1080/13658816.2017.1324976.
77. Liu, J.; Hyyppä, J.; Yu, X.; Jaakkola, A.; Liang, X.; Kaartinen, H.; Kukko, A.; Zhu, L.; Wang, Y.; Hyyppä, H. Can global navigation
satellite system signals reveal the ecological attributes of forests? Int. J. Appl. Earth Obs. Geoinf. 2016, 50, 74–79.
78. Wallace, C.S.A.; Walker, J.J.; Skirvin, S.M.; Patrick-Birdwell, C.; Weltzin, J.F.; Raichle, H. Mapping presence and predicting
phenological status of invasive buffelgrass in Southern Arizona using MODIS, climate and citizen science observation data.
Remote Sens. 2016, 8, 524.
79. Baker, F.; Smith, C.L.; Cavan, G. A combined approach to classifying land surface cover of urban domestic gardens using citizen
science data and high resolution image analysis. Remote Sens. 2018, 10, 537.
80. Gengler, S.; Bogaert, P. Integrating crowdsourced data with a land cover product: A Bayesian data fusion approach. Remote Sens.
2016, 8, 545.
81. Koskinen, J.; Leinonen, U.; Vollrath, A.; Ortmann, A.; Lindquist, E.; D’Annunzio, R.; Pekkarinen, A.; Käyhkö, N. Participatory
mapping of forest plantations with Open Foris and Google Earth Engine. ISPRS J. Photogramm. Remote Sens. 2019, 148, 63–74.
82. Wang, S.; Di Tommaso, S.; Faulkner, J.; Friedel, T.; Kennepohl, A.; Strey, R.; Lobell, D.B. Mapping crop types in Southeast India
with smartphone crowdsourcing and deep learning. Remote Sens. 2020, 12, 2957. https://doi.org/10.3390/RS12182957.
83. Schepaschenko, D.G.; Shvidenko, A.Z.; Lesiv, M.Y.; Ontikov, P.V.; Shchepashchenko, M.V.; Kraxner, F. Estimation of forest area
and its dynamics in Russia based on synthesis of remote sensing products. Contemp. Probl. Ecol. 2015, 8, 811–817.
84. Schepaschenko, D.; See, L.; Lesiv, M.; McCallum, I.; Fritz, S.; Salk, C.; Moltchanova, E.; Perger, C.; Shchepashchenko, M.;
Shvidenko, A.; et al. Development of a global hybrid forest mask through the synergy of remote sensing, crowdsourcing and
FAO statistics. Remote Sens. Environ. 2015, 162, 208–220.
85. Lesiv, M.; Moltchanova, E.; Schepaschenko, D.; See, L.; Shvidenko, A.; Comber, A.; Fritz, S. Comparison of data fusion methods
using crowdsourced data in creating a hybrid forest cover map. Remote Sens. 2016, 8, 261.
86. Wiesner-Hanks, T.; Wu, H.; Stewart, E.; DeChant, C.; Kaczmar, N.; Lipson, H.; Gore, M.A.; Nelson, R.J. Millimeter-Level Plant
Disease Detection From Aerial Photographs via Deep Learning and Crowdsourced Data. Front. Plant Sci. 2019, 10, 1550.
87. Chi, M.; Sun, Z.; Qin, Y.; Shen, J.; Benediktsson, J.A. A Novel Methodology to Label Urban Remote Sensing Images Based on
Location-Based Social Media Photos. Proc. IEEE 2017, 105, 1926–1936.
88. Sitthi, A.; Nagai, M.; Dailey, M.; Ninsawat, S. Exploring land use and land cover of geotagged social-sensing images using naive
Bayes classifier. Sustainability 2016, 8, 921.
89. Yokoya, N.; Ghamisi, P.; Xia, J.; Sukhanov, S.; Heremans, R.; Tankoyeu, I.; Bechtel, B.; Le Saux, B.; Moser, G.; Tuia, D. Open Data
for Global Multimodal Land Use Classification: Outcome of the 2017 IEEE GRSS Data Fusion Contest. IEEE J. Sel. Top. Appl.
Earth Obs. Remote Sens. 2018, 11, 1363–1377.
90. Liu, D.; Chen, N.; Zhang, X.; Wang, C.; Du, W. Annual large-scale urban land mapping based on Landsat time series in Google
Earth Engine and OpenStreetMap data: A case study in the middle Yangtze River basin. ISPRS J. Photogramm. Remote Sens. 2020,
159, 337–351.
91. Johnson, B.A.; Iizuka, K.; Bragais, M.A.; Endo, I.; Magcale-Macandog, D.B. Employing crowdsourced geographic data and
multi-temporal/multi-sensor satellite imagery to monitor land cover change: A case study in an urbanizing region of the
Philippines. Comput. Environ. Urban Syst. 2017, 64, 184–193. https://doi.org/10.1016/j.compenvurbsys.2017.02.002.
92. Hu, T.; Yang, J.; Li, X.; Gong, P. Mapping urban land use by using Landsat images and open social data. Remote Sens. 2016, 8, 151.
93. Yang, D.; Fu, C.S.; Smith, A.C.; Yu, Q. Open land-use map: A regional land-use mapping strategy for incorporating OpenStreetMap
with earth observations. Geo-Spat. Inf. Sci. 2017, 20, 269–281.
94. Fonte, C.C.; Patriarca, J.; Jesus, I.; Duarte, D. Automatic extraction and filtering of OpenStreetMap data to generate training
datasets for land use land cover classification. Remote Sens. 2020, 12, 3428. https://doi.org/10.3390/rs12203428.
95. Hughes, L.H.; Streicher, S.; Chuprikova, E.; du Preez, J. A cluster graph approach to land cover classification boosting. Data 2019,
4, 1–24.
96. See, L.; Schepaschenko, D.; Lesiv, M.; McCallum, I.; Fritz, S.; Comber, A.; Perger, C.; Schill, C.; Zhao, Y.; Maus, V.; et al. Building a
hybrid land cover map with crowdsourcing and geographically weighted regression. ISPRS J. Photogramm. Remote Sens. 2015,
103, 48–56. https://doi.org/10.1016/j.isprsjprs.2014.06.016.
97. Hammerberg, K.; Brousse, O.; Martilli, A.; Mahdavi, A. Implications of employing detailed urban canopy parameters for
mesoscale climate modelling: A comparison between WUDAPT and GIS databases over Vienna, Austria. Int. J. Climatol. 2018, 38,
1241–1257.
98. Annis, A.; Nardi, F. Integrating VGI and 2D hydraulic models into a data assimilation framework for real time flood forecasting
and mapping. Geo-Spat. Inf. Sci. 2019, 22, 223–236.
99. Mazzoleni, M.; Cortes Arevalo, V.J.; Wehn, U.; Alfonso, L.; Norbiato, D.; Monego, M.; Ferri, M.; Solomatine, D.P. Towards
assimilation of crowdsourced observations for different levels of citizen engagement: The flood event of 2013 in the Bacchiglione
catchment. Hydrol. Earth Syst. Sci. Discuss. 2018, 22, 391–416.
100. Zeng, Z.; Lan, J.; Hamidi, A.R.; Zou, S. Integrating Internet media into urban flooding susceptibility assessment: A case study in
China. Cities 2020, 101, 102697.
101. Yang, D.; Yang, A.; Qiu, H.; Zhou, Y.; Herrero, H.; Fu, C.S.; Yu, Q.; Tang, J. A Citizen-Contributed GIS Approach for Evaluating
the Impacts of Land Use on Hurricane-Harvey-Induced Flooding in Houston Area. Land 2019, 8, 25.
102. Rosser, J.F.; Leibovici, D.G.; Jackson, M.J. Rapid flood inundation mapping using social media, remote sensing and topographic
data. Nat. Hazards 2017, 87, 103–120.
103. Ahmad, K.; Pogorelov, K.; Riegler, M.; Conci, N.; Halvorsen, P. Social media and satellites: Disaster event detection, linking and
summarization. Multimed. Tools Appl. 2019, 78, 2837–2875.
104. Panteras, G.; Cervone, G. Enhancing the temporal resolution of satellite-based flood extent generation using crowdsourced data
for disaster monitoring. Int. J. Remote Sens. 2018, 39, 1459–1474.
105. Olthof, I.; Svacina, N. Testing urban flood mapping approaches from satellite and in situ data collected during 2017 and 2019
events in Eastern Canada. Remote Sens. 2020, 12, 3141. https://doi.org/10.3390/RS12193141.
106. Chini, M.; Pelich, R.; Pulvirenti, L.; Pierdicca, N.; Hostache, R.; Matgen, P. Sentinel-1 InSAR coherence to detect floodwater in
urban areas: Houston and Hurricane Harvey as a test case. Remote Sens. 2019, 11, 107.
107. Ferster, C.J.; Coops, N.C. Integrating volunteered smartphone data with multispectral remote sensing to estimate forest fuels. Int.
J. Digit. Earth 2016, 9, 171–196.
108. Hardy, C.C. Wildland fire hazard and risk: Problems, definitions, and context. For. Ecol. Manag. 2005, 211, 73–82.
https://doi.org/10.1016/j.foreco.2005.01.029.
109. Frank, J.; Rebbapragada, U.; Bialas, J.; Oommen, T.; Havens, T.C. Effect of label noise on the machine-learned classification of
earthquake damage. Remote Sens. 2017, 9, 803.
110. Hultquist, C.; Cervone, G. Citizen monitoring during hazards: Validation of Fukushima radiation measurements. GeoJournal
2018, 83, 189–206.
111. Gueguen, L.; Koenig, J.; Reeder, C.; Barksdale, T.; Saints, J.; Stamatiou, K.; Collins, J.; Johnston, C. Mapping Human Settlements
and Population at Country Scale from VHR Images. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2017, 10, 524–538.
112. Chew, R.F.; Amer, S.; Jones, K.; Unangst, J.; Cajka, J.; Allpress, J.; Bruhn, M. Residential scene classification for gridded population
sampling in developing countries using deep convolutional neural networks on satellite imagery. Int. J. Health Geogr. 2018, 17,
1–17.
113. Chen, J.; Zhou, Y.; Zipf, A.; Fan, H. Deep Learning from Multiple Crowds: A Case Study of Humanitarian Mapping. IEEE Trans.
Geosci. Remote Sens. 2019, 57, 1713–1722.
114. Zhao, W.; Bo, Y.; Chen, J.; Tiede, D.; Thomas, B.; Emery, W.J. Exploring semantic elements for urban scene recognition: Deep
integration of high-resolution imagery and OpenStreetMap (OSM). ISPRS J. Photogramm. Remote Sens. 2019, 151, 237–250.
115. Kaiser, P.; Wegner, J.D.; Lucchi, A.; Jaggi, M.; Hofmann, T.; Schindler, K. Learning Aerial Image Segmentation From
Online Maps. IEEE Trans. Geosci. Remote Sens. 2017, 55, 6050–6068.
116. Ivanovic, S.S.; Olteanu-Raimond, A.M.; Mustière, S.; Devogele, T. A filtering-based approach for improving crowdsourced GNSS
traces in a data update context. ISPRS Int. J. Geo-Inf. 2019, 8, 380. https://doi.org/10.3390/ijgi8090380.
117. Li, Y.; Xiang, L.; Zhang, C.; Wu, H. Fusing taxi trajectories and RS images to build road map via DCNN. IEEE Access 2019, 7,
161487–161498.
118. Lambers, K.; Verschoof-van der Vaart, W.B.; Bourgeois, Q.P.J. Integrating remote sensing, machine learning, and citizen science in
Dutch archaeological prospection. Remote Sens. 2019, 11, 794.
119. Ford, B.; Pierce, J.R.; Wendt, E.; Long, M.; Jathar, S.; Mehaffy, J.; Tryner, J.; Quinn, C.; Van Zyl, L.; L’Orange, C.; et al. A low-cost
monitor for measurement of fine particulate matter and aerosol optical depth-Part 2: Citizen-science pilot campaign in northern
Colorado. Atmos. Meas. Tech. 2019, 12, 6385–6399.
120. Venter, Z.S.; Brousse, O.; Esau, I.; Meier, F. Hyperlocal mapping of urban air temperature using remote sensing and crowdsourced
weather data. Remote Sens. Environ. 2020, 242, 111791.
121. Fenner, D.; Meier, F.; Bechtel, B.; Otto, M.; Scherer, D. Intra and inter ‘local climate zone’ variability of air temperature as observed
by crowdsourced citizen weather stations in Berlin, Germany. Meteorol. Z. 2017, 26, 525–547.
122. Shupe, S.M. High resolution stream water quality assessment in the Vancouver, British Columbia region: A citizen science study.
Sci. Total Environ. 2017, 603–604, 745–759.
123. Thornhill, I.; Ho, J.G.; Zhang, Y.; Li, H.; Ho, K.C.; Miguel-Chinchilla, L.; Loiselle, S.A. Prioritising local action for water quality
improvement using citizen science; a study across three major metropolitan areas of China. Sci. Total Environ. 2017, 584–585,
1268–1281.
124. Garaba, S.P.; Friedrichs, A.; Voß, D.; Zielinski, O. Classifying natural waters with the Forel-Ule colour index system: Results,
applications, correlations and crowdsourcing. Int. J. Environ. Res. Public Health 2015, 12, 16096–16109.
125. Boyd, D.S.; Jackson, B.; Wardlaw, J.; Foody, G.M.; Marsh, S.; Bales, K. Slavery from Space: Demonstrating the role for satellite
remote sensing to inform evidence-based action related to UN SDG number 8. ISPRS J. Photogramm. Remote Sens. 2018, 142,
380–388.
126. Juan, A.D.; Bank, A. The Ba‘athist blackout? Selective goods provision and political violence in the Syrian civil war. J. Peace Res.
2015, 52, 91–104.
127. United Nations. United Nations Department of Economic and Social Affairs. Sustainable Development Knowledge Platform.
Sustainable Development Goals. Available online: https://sdgs.un.org/goals/goal8 (accessed on 28 February 2022).
128. Butler, B.W.; Anderson, W.R.; Catchpole, E.A. Influence of Slope on Fire Spread Rate. In Proceedings of the USDA Forest Service
Proceedings, Destin, FL, USA, 26–30 March 2007; pp. 75–82.
129. Beven, K.J.; Kirkby, M.J. A physically based, variable contributing area model of basin hydrology. Hydrol. Sci. Bull. 1979, 24,
43–69. https://doi.org/10.1080/02626667909491834.
130. Hjerdt, K.N.; McDonnell, J.J.; Seibert, J.; Rodhe, A. A new topographic index to quantify downslope controls on local drainage.
Water Resour. Res. 2004, 40, 1–6. https://doi.org/10.1029/2004WR003130.
131. Kindermann, G.; Obersteiner, M.; Rametsteiner, E.; McCallum, I. Predicting the deforestation-trend under different carbon-prices.
Carbon Balance Manag. 2006, 1, 15. https://doi.org/10.1186/1750-0680-1-15.
132. Nelson, G.C.; Valin, H.; Sands, R.D.; Havlík, P.; Ahammad, H.; Deryng, D.; Elliott, J.; Fujimori, S.; Hasegawa, T.; Heyhoe, E.; et al.
Climate change effects on agriculture: Economic responses to biophysical shocks. Proc. Natl. Acad. Sci. USA 2014, 111, 3274–3279.
https://doi.org/10.1073/pnas.1222465110.
133. Santoro, M.; Beaudoin, A.; Beer, C.; Cartus, O.; Fransson, J.E.; Hall, R.J.; Pathe, C.; Schmullius, C.; Schepaschenko, D.; Shvidenko,
A.; et al. Forest growing stock volume of the northern hemisphere: Spatially explicit estimates for 2010 derived from Envisat
ASAR. Remote Sens. Environ. 2015, 168, 316–334. https://doi.org/10.1016/j.rse.2015.07.005.
134. Schepaschenko, D.; McCallum, I.; Shvidenko, A.; Fritz, S.; Kraxner, F.; Obersteiner, M. A new hybrid land cover dataset for
Russia: A methodology for integrating statistics, remote sensing and in situ information. J. Land Use Sci. 2011, 6, 245–259.
https://doi.org/10.1080/1747423X.2010.511681.
135. Potapov, P.; Turubanova, S.; Hansen, M.C. Regional-scale boreal forest cover and change mapping using Landsat data composites
for European Russia. Remote Sens. Environ. 2011, 115, 548–561. https://doi.org/10.1016/j.rse.2010.10.001.
136. Bechtel, B.; Demuzere, M.; Sismanidis, P.; Fenner, D.; Brousse, O.; Beck, C.; Van Coillie, F.; Conrad, O.; Keramitsoglou, I.; Middel,
A.; et al. Quality of Crowdsourced Data on Urban Morphology—The Human Influence Experiment (HUMINEX). Urban Sci.
2017, 1, 15.
137. Stewart, I.D.; Oke, T.R. Local climate zones for urban temperature studies. Bull. Am. Meteorol. Soc. 2012, 93, 1879–1900.
https://doi.org/10.1175/BAMS-D-11-00019.1.
138. Tsendbazar, N.E.; de Bruin, S.; Herold, M. Assessing global land cover reference datasets for different user communities. ISPRS J.
Photogramm. Remote Sens. 2015, 103, 93–114. https://doi.org/10.1016/J.ISPRSJPRS.2014.02.008.
139. Getmapping. Aerial Data—High Resolution Imagery. Available online: https://www.getmapping.com/products/aerial-
imagery-data/aerial-data-infrared-imagery (accessed on 28 February 2022).
140. Rahman, M.S.; Di, L. A systematic review on case studies of remote-sensing-based flood crop loss assessment. Agriculture 2020,
10, 131. https://doi.org/10.3390/agriculture10040131.
141. Heiskanen, J. Estimating aboveground tree biomass and leaf area index in a mountain birch forest using ASTER satellite data. Int.
J. Remote Sens. 2006, 27, 1135–1158. https://doi.org/10.1080/01431160500353858.
142. Tucker, C.J. Red and photographic infrared linear combinations for monitoring vegetation. Remote Sens. Environ. 1979, 8, 127–150.
https://doi.org/10.1016/0034-4257(79)90013-0.
143. Barnes, E.M.; Clarke, T.R.; Richards, S.E.; Colaizzi, P.D.; Haberland, J.; Kostrzewski, M.; Waller, P.; Choi, C.; Riley, E.; Thompson,
T.; et al. Coincident detection of crop water stress, nitrogen status and canopy density using ground-based multispectral data. In
Proceedings of the Fifth International Conference on Precision Agriculture, Bloomington, MN, USA, 16–19 July 2000.
144. Buschmann, C.; Nagel, E. In vivo spectroscopy and internal optics of leaves as basis for remote sensing of vegetation. Int. J.
Remote Sens. 1993, 14, 711–722. https://doi.org/10.1080/01431169308904370.
145. Gitelson, A.A.; Viña, A.; Ciganda, V.; Rundquist, D.C.; Arkebauer, T.J. Remote estimation of canopy chlorophyll content in crops.
Geophys. Res. Lett. 2005, 32, 1–4. https://doi.org/10.1029/2005GL022688.
146. Meyer, G.E.; Neto, J.C. Verification of color vegetation indices for automated crop imaging applications. Comput. Electron. Agric.
2008, 63, 282–293. https://doi.org/10.1016/j.compag.2008.03.009.
147. Zha, Y.; Gao, J.; Ni, S. Use of normalized difference built-up index in automatically mapping urban areas from TM imagery. Int. J.
Remote Sens. 2003, 24, 583–594. https://doi.org/10.1080/01431160304987.
148. Feyisa, G.L.; Meilby, H.; Darrel Jenerette, G.; Pauleit, S. Locally optimized separability enhancement indices for urban land cover
mapping: Exploring thermal environmental consequences of rapid urbanization in Addis Ababa, Ethiopia. Remote Sens. Environ.
2016, 175, 14–31. https://doi.org/10.1016/j.rse.2015.12.026.
149. Xu, H. A new index for delineating built-up land features in satellite imagery. Int. J. Remote Sens. 2008, 29, 4269–4276.
https://doi.org/10.1080/01431160802039957.
150. Rikimaru, A.; Roy, P.S.; Miyatake, S. Tropical forest cover density mapping. Trop. Ecol. 2002, 43, 39–47.
151. Gao, B.C. NDWI: A Normalized Difference Water Index for Remote Sensing of Vegetation Liquid Water From Space. Remote Sens.
Environ. 1996, 58, 257–266. https://doi.org/10.1016/S0034-4257(96)00067-3.
152. Xu, H. Modification of normalised difference water index (NDWI) to enhance open water features in remotely sensed imagery.
Int. J. Remote Sens. 2006, 27, 3025–3033. https://doi.org/10.1080/01431160600589179.
153. Leal Filho, W.; Wolf, F.; Castro-Díaz, R.; Li, C.; Ojeh, V.N.; Gutiérrez, N.; Nagy, G.J.; Savić, S.; Natenzon, C.E.; Al-Amin, A.Q.; et al.
Addressing the urban heat islands effect: A cross-country assessment of the role of green infrastructure. Sustainability 2021, 13,
753. https://doi.org/10.3390/su13020753.
154. Liu, J.; Hyyppä, J.; Yu, X.; Jaakkola, A.; Kukko, A.; Kaartinen, H.; Zhu, L.; Liang, X.; Wang, Y.; Hyyppä, H. A Novel GNSS
Technique for Predicting Boreal Forest Attributes at Low Cost. IEEE Trans. Geosci. Remote Sens. 2017, 55, 4855–4867.
155. Puissant, A.; Hirsch, J.; Weber, C. The utility of texture analysis to improve per-pixel classification for high to very high spatial
resolution imagery. Int. J. Remote Sens. 2005, 26, 733–745. https://doi.org/10.1080/01431160512331316838.
156. Small, D. Flattening gamma: Radiometric terrain correction for SAR imagery. IEEE Trans. Geosci. Remote Sens. 2011, 49, 3081–3093.
https://doi.org/10.1109/TGRS.2011.2120616.
157. Deng, J.; Dong, W.; Socher, R.; Li, L.; Li, K.; Fei-Fei, L. ImageNet: A large-scale hierarchical image database. In Proceedings
of the 2009 IEEE Conference on Computer Vision and Pattern Recognition, Miami, FL, USA, 20–25 June 2009; pp. 248–255.
https://doi.org/10.1109/CVPR.2009.5206848.
158. Neto, F.R.A.; Santos, C.A.S. Understanding crowdsourcing projects: A systematic review of tendencies, work flow, and quality
management. Inf. Process. Manag. 2018, 54, 490–506. https://doi.org/10.1016/j.ipm.2018.03.006.
159. Chaves, R.; Schneider, D.; Correia, A.; Borges, M.R.; Motta, C. Understanding crowd work in online crowdsourcing
platforms for urban planning: Systematic review. In Proceedings of the 2019 IEEE 23rd International Conference
on Computer Supported Cooperative Work in Design, CSCWD 2019, Porto, Portugal, 6–8 May 2019; pp. 273–278.
https://doi.org/10.1109/CSCWD.2019.8791936.
160. Patriarca, J.; Fonte, C.C.; Estima, J.; de Almeida, J.P.; Cardoso, A. Automatic conversion of OSM data into LULC maps:
Comparing FOSS4G based approaches towards an enhanced performance. Open Geospat. Data, Softw. Stand. 2019, 4, 11.
https://doi.org/10.1186/s40965-019-0070-2.
161. See, L.; Fritz, S.; You, L.; Ramankutty, N.; Herrero, M.; Justice, C.; Becker-Reshef, I.; Thornton, P.; Erb, K.; Gong,
P.; et al. Improved global cropland data as an essential ingredient for food security. Glob. Food Secur. 2015, 4, 37–45.
https://doi.org/10.1016/j.gfs.2014.10.004.
162. Sarı, A.; Tosun, A.; Alptekin, G.I. A systematic literature review on crowdsourcing in software engineering. J. Syst. Softw. 2019, 153,
200–219. https://doi.org/10.1016/j.jss.2019.04.027.
163. Bubalo, M.; Zanten, B.T.V.; Verburg, P.H. Crowdsourcing geo-information on landscape perceptions and preferences: A review.
Landsc. Urban Plan. 2019, 184, 101–111. https://doi.org/10.1016/j.landurbplan.2019.01.001.
164. Reba, M.; Seto, K.C. A systematic review and assessment of algorithms to detect, characterize, and monitor urban land change.
Remote Sens. Environ. 2020, 242, 111739. https://doi.org/10.1016/j.rse.2020.111739.
165. Yuan, Q.; Shen, H.; Li, T.; Li, Z.; Li, S.; Jiang, Y.; Xu, H.; Tan, W.; Yang, Q.; Wang, J.; et al. Deep learning in environmental remote
sensing: Achievements and challenges. Remote Sens. Environ. 2020, 241, 111716. https://doi.org/10.1016/j.rse.2020.111716.
166. Guo, Y.; Liu, Y.; Oerlemans, A.; Lao, S.; Wu, S.; Lew, M.S. Deep learning for visual understanding: A review. Neurocomputing
2016, 187, 27–48. https://doi.org/10.1016/j.neucom.2015.09.116.
167. LeCun, Y. LeNet-5, Convolutional Neural Networks. Available online: http://yann.lecun.com/exdb/lenet/ (accessed on 28
February 2022).
168. Krizhevsky, A.; Sutskever, I.; Hinton, G.E. ImageNet Classification with Deep Convolutional Neural Networks; Curran Associates Inc.:
Lake Tahoe, Nevada, 2012; pp. 1097–1105. https://doi.org/10.1201/9781420010749.
169. Simonyan, K.; Zisserman, A. Very deep convolutional networks for large-scale image recognition. arXiv 2015, arXiv:1409.1556.
170. Liu, J.; Li, T.; Xie, P.; Du, S.; Teng, F.; Yang, X. Urban big data fusion based on deep learning: An overview. Inf. Fusion 2020, 53,
123–133. https://doi.org/10.1016/j.inffus.2019.06.016.
171. Zhu, X.X.; Tuia, D.; Mou, L.; Xia, G.S.; Zhang, L.; Xu, F.; Fraundorfer, F. Deep Learning in Remote Sensing: A Comprehensive
Review and List of Resources. IEEE Geosci. Remote Sens. Mag. 2017, 5, 8–36. https://doi.org/10.1109/MGRS.2017.2762307.
172. Scott, D.W. Multivariate Density Estimation and Visualization; Springer: Berlin/Heidelberg, Germany, 2012; pp. 549–569.
173. Luque, A.; Carrasco, A.; Martín, A.; de las Heras, A. The impact of class imbalance in classification performance metrics based on
the binary confusion matrix. Pattern Recognit. 2019, 91, 216–231. https://doi.org/10.1016/j.patcog.2019.02.023.
174. Belgiu, M.; Drăguţ, L. Random forest in remote sensing: A review of applications and future directions. ISPRS J. Photogramm.
Remote Sens. 2016, 114, 24–31. https://doi.org/10.1016/j.isprsjprs.2016.01.011.
175. Du, P.; Samat, A.; Waske, B.; Liu, S.; Li, Z. Random Forest and Rotation Forest for fully polarized SAR image classification using
polarimetric and spatial features. ISPRS J. Photogramm. Remote Sens. 2015, 105, 38–53. https://doi.org/10.1016/j.isprsjprs.2015.03.002.
176. Yang, C.; Yu, M.; Hu, F.; Jiang, Y.; Li, Y. Utilizing Cloud Computing to address big geospatial data challenges. Comput. Environ.
Urban Syst. 2017, 61, 120–128. https://doi.org/10.1016/j.compenvurbsys.2016.10.010.
177. Miao, X.; Heaton, J.S.; Zheng, S.; Charlet, D.A.; Liu, H. Applying tree-based ensemble algorithms to the classification
of ecological zones using multi-temporal multi-source remote-sensing data. Int. J. Remote Sens. 2012, 33, 1823–1849.
https://doi.org/10.1080/01431161.2011.602651.
178. Rainforth, T.; Wood, F. Canonical Correlation Forests. arXiv 2015, arXiv:1507.05444.
179. Abdollahi, A.; Pradhan, B.; Shukla, N.; Chakraborty, S. Deep learning approaches applied to remote sensing datasets for road
extraction: A state-of-the-art review. Remote Sens. 2020, 12, 1444.
180. Arsanjani, J.J.; Vaz, E. An assessment of a collaborative mapping approach for exploring land use patterns for several European
metropolises. Int. J. Appl. Earth Obs. Geoinf. 2015, 35, 329–337.
181. Rüetschi, M.; Schaepman, M.E.; Small, D. Using multitemporal Sentinel-1 C-band backscatter to monitor phenology and classify
deciduous and coniferous forests in Northern Switzerland. Remote Sens. 2018, 10, 55. https://doi.org/10.3390/rs10010055.
182. Wang, K.; Chen, J.; Kiaghadi, A.; Dawson, C. A New Algorithm for Land-Cover Classification Using PolSAR and InSAR Data
and Its Application to Surface Roughness Mapping Along the Gulf Coast. IEEE Trans. Geosci. Remote Sens. 2021, 60, 4502915.
https://doi.org/10.1109/TGRS.2021.3083492.
183. Owuor, I.; Hochmair, H.H. An Overview of Social Media Apps and their Potential Role in Geospatial Research. ISPRS Int. J. Geo-Inf.
2020, 9, 526.
184. Razavian, A.S.; Azizpour, H.; Sullivan, J.; Carlsson, S. CNN Features Off-the-Shelf: An Astounding Baseline for Recognition. In
Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition Workshops, Columbus, OH, USA, 23–28
June 2014; pp. 512–519. https://doi.org/10.1109/CVPRW.2014.131.
185. Hagenauer, J.; Helbich, M. A geographically weighted artificial neural network. Int. J. Geogr. Inf. Sci. 2021, 36, 215–235.
https://doi.org/10.1080/13658816.2021.1871618.
186. Goodchild, M.F.; Li, L. Assuring the quality of volunteered geographic information. Spat. Stat. 2012, 1, 110–120.
https://doi.org/10.1016/j.spasta.2012.03.002.
187. Chi, C.; Wang, Y.; Li, Y.; Tong, X. Multistrategy Repeated Game-Based Mobile Crowdsourcing Incentive Mechanism for Mobile
Edge Computing in Internet of Things. Wirel. Commun. Mob. Comput. 2021, 2021, 6695696. https://doi.org/10.1155/2021/6695696.
188. Schuir, J.; Brinkhege, R.; Anton, E.; Oesterreich, T.; Meier, P.; Teuteberg, F. Augmenting Humans in the Loop: Towards an
Augmented Reality Object Labeling Application for Crowdsourcing Communities. In Proceedings of the International Conference
on Wirtschaftsinformatik, Essen, Germany, 9–11 March 2021.
189. Vohland, K.; Land-Zandstra, A.; Ceccaroni, L.; Lemmens, R.; Perelló, J.; Ponti, M.; Samson, R.; Wagenknecht, K. The Science of
Citizen Science: Theories, Methodologies and Platforms. In Proceedings of the CSCW ’17: Computer Supported Cooperative Work
and Social Computing Portland, OR, USA, 25 February–1 March 2017; pp. 395–400. https://doi.org/10.1145/3022198.3022652.
190. Van Coillie, F.M.; Gardin, S.; Anseel, F.; Duyck, W.; Verbeke, L.P.; De Wulf, R.R. Variability of operator performance in
remote-sensing image interpretation: The importance of human and external factors. Int. J. Remote Sens. 2014, 35, 754–778.
https://doi.org/10.1080/01431161.2013.873152.
191. Shulla, K.; Leal Filho, W.; Sommer, J.H.; Lange Salvia, A.; Borgemeister, C. Channels of collaboration for citizen science and the
sustainable development goals. J. Clean. Prod. 2020, 264, 121735. https://doi.org/10.1016/j.jclepro.2020.121735.
192. Moczek, N.; Voigt-Heucke, S.L.; Mortega, K.G.; Fabó Cartas, C.; Knobloch, J. A self-assessment of European citizen
science projects on their contribution to the UN sustainable development goals (SDGs). Sustainability 2021, 13, 1774.
https://doi.org/10.3390/su13041774.
193. Bell, S.; Upchurch, P.; Snavely, N.; Bala, K. Material recognition in the wild with the Materials in Context Database. In Proceedings
of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 7–12 June 2015;
pp. 3479–3487. https://doi.org/10.1109/CVPR.2015.7298970.
194. Zhao, Y.; Gong, P.; Yu, L.; Hu, L.; Li, X.; Li, C.; Zhang, H.; Zheng, Y.; Wang, J.; Zhao, Y.; et al. Towards a common validation sample
set for global land-cover mapping. Int. J. Remote Sens. 2014, 35, 4795–4814. https://doi.org/10.1080/01431161.2014.930202.
195. Card, D.H. Using Known Map Category Marginal Frequencies To Improve Estimates of Thematic Map Accuracy. Photogramm.
Eng. Remote Sens. 1982, 48, 431–439.
196. Verschoof-van der Vaart, W.B.; Lambers, K. Learning to Look at LiDAR: The Use of R-CNN in the Automated Detection of
Archaeological Objects in LiDAR Data from the Netherlands. J. Comput. Appl. Archaeol. 2019, 2, 31–40. https://doi.org/10.5334/jcaa.32.