Advanced Geographic Information Systems

“Grid-based Analysis of the Relationship Between Land Cover and Population in the City of Tehran, Iran”

Created By:

Sima Neyrizi 6016221017

Lecturer: Hepi Hapsari Handayani, ST, M.Sc, PhD



Chapter I Introduction
1.1 Background
1.2 Problem Limitations
1.3 Objectives And Benefits
Chapter II Theoretical Basis
2.1 Geographic Information System
2.2 Spatial Analysis
2.3 Grid-based Analysis
2.4 Linear Regression
2.5 ArcGIS
2.6 Tehran City
Chapter III Methodology
3.1 Location
3.2 Tools And Materials
3.3 Flowchart
3.4 Explanation of Stages
Chapter IV Results And Discussions
4.1 Results
4.2 Discussions
Chapter V Closing
5.1 Conclusion
5.2 Suggestion
Chapter I
Introduction
1.1 Background

GIS is a system designed to capture, store, manipulate, analyze, manage, and present spatial
or geographic data. GIS technology allows us to visualize, analyze, and interpret complex data
sets in a geographic context, enabling us to better understand and solve problems related to
geography, environment, and urban planning. GIS can be used in a variety of applications such
as resource management, land use planning, transportation planning, emergency management,
and more. GIS works by integrating various data sources, such as satellite imagery, aerial
photography, census data, and digital maps, into a single system where they can be viewed and
analyzed together. GIS software provides tools for data manipulation, such as querying,
editing, and analysis, as well as tools for visualization, such as mapping and charting. GIS is a
powerful tool for decision-making and has become an integral part of many industries and
fields, including environmental science, public health, urban planning, natural resource
management, and many others.

GIS can be used to analyze population density, land use patterns, and transportation
infrastructure. This information can be used to plan for the development of cities and to ensure
that infrastructure and services are accessible to the population.

In this practicum, a grid-based spatial analysis was carried out in ArcGIS to determine the
influence of population in Tehran, the capital city of Iran, on the extent of bareland, urban
area, and water area.

1.2 Problem Limitations

This report addresses the following questions:
1. What is the concept of grid-based analysis?
2. How do we process population and land cover data using the grid-based analysis
method using ArcGIS software?
3. What are the results of the analysis of the relationship between population data and
land cover in the city of Tehran, Iran using regression analysis?

1.3 Objectives And Benefits

The objectives and benefits of this report are:
1. To understand the basic concept of grid-based analysis.
2. To process population and land cover data for the City of Tehran, Iran, using ArcGIS
software.
3. To analyze the relationship between population data and land cover in Tehran City,
Iran using regression analysis.
Chapter II
Theoretical Basis

2.1 Geographic Information System

GIS stands for Geographic Information System. It is a system designed to capture, store,
manipulate, analyze, manage and present spatial or geographical data. In other words, GIS
is a computer-based tool that helps users to collect, process, visualize and analyze
geospatial data.
The basic concept of GIS is to integrate various types of geospatial data, such as maps,
aerial photographs, satellite imagery, and other types of data such as demographic,
environmental, and economic data, into a single system. The system allows users to create
maps, perform spatial analysis, and generate reports and visualizations based on the data.
GIS is used in a variety of applications such as urban planning, resource management,
environmental analysis, emergency management, and business intelligence. It helps users
to make informed decisions by providing them with accurate and timely information about
the location and distribution of resources, infrastructure, and environmental conditions.

GIS can be classified into two main types, digital and manual:
a. Digital GIS: Digital GIS involves the use of computer software to capture, store,
manipulate, analyze, and present geospatial data. It is a highly automated process
that uses digital technologies such as satellite imagery, aerial photography, and GPS
to collect and analyze data. Digital GIS can be used for a wide range of applications,
including environmental analysis, urban planning, and natural resource management.
b. Manual GIS: Manual GIS involves the use of traditional methods and tools such as
paper maps, pencils, and rulers to capture, store, and analyze geospatial data. It is a
time-consuming and labour-intensive process that is primarily used in areas where
digital technologies are not available or practical. Manual GIS can be used for
simple mapping and data collection tasks, but it is not suitable for complex spatial
analysis or large-scale projects.
Overall, digital GIS is more efficient and accurate than manual GIS, but it requires
specialized software and hardware to operate. Manual GIS, on the other hand, is a
low-cost and simple method of data collection but is limited in its ability to analyze
and manage large amounts of data.
GIS consists of five main components:
1. Hardware: The hardware component of GIS includes computers, servers, printers, scanners,
GPS receivers, tablets, and other peripheral devices used to store, process, and display
geographic information.
2. Software: The software component of GIS includes computer programs and
applications that enable users to capture, store, manipulate, analyze, and display
geospatial data. Popular GIS software includes ArcGIS, QGIS, and Google Earth.
3. Data: The data component of GIS includes all geospatial data used in GIS, such as
maps, satellite imagery, aerial photographs, and other types of data such as
demographic, environmental, and economic data. GIS data is often stored in databases
or file formats such as shapefiles, geodatabases, and KML files.
4. Methods: The methods component of GIS includes the procedures and techniques used
to collect, process, analyze, and present geospatial data. This includes methods for data
capture, spatial analysis, cartography, and data management.
5. People: The people component of GIS includes the professionals who use GIS, such as
GIS analysts, cartographers, surveyors, and geographers. It also includes end-users who
use GIS data and applications to make decisions and solve problems related to
geographic information.

Overall, these five components work together to create a GIS system that enables users
to analyze, manage, and present spatial data in a meaningful way.

2.2 Spatial Analysis

Spatial analysis is the process of analyzing and interpreting patterns, relationships, and
trends in geospatial data. It involves the use of GIS tools and techniques to explore and
understand the spatial relationships between different features and phenomena on the
earth's surface. Spatial analysis can help answer questions such as:

• What is the relationship between land use and environmental quality?
• Where are the areas with the highest concentration of a particular demographic group?
• What is the distance between two points and how long would it take to travel that
distance?
• What is the best location for a new facility based on factors such as accessibility, cost,
and environmental impact?

Spatial analysis can be used in a variety of applications such as urban planning, natural
resource management, epidemiology, and market research. Some common techniques used
in spatial analysis include overlay analysis, spatial clustering, network analysis, and
interpolation. Overall, spatial analysis enables users to gain insights and make informed
decisions about the distribution, patterns, and relationships of geospatial data.
Spatial analysis is a crucial aspect of GIS data processing, as it allows users to extract
valuable insights and knowledge from geospatial data. Some common applications of
spatial analysis in GIS data processing include:

1. Spatial Query and Selection: Spatial analysis tools can be used to query and select
features based on their location, attributes, or spatial relationship with other features.
For example, one could select all buildings within a certain distance of a road or river,
or all households with a specific income range within a particular region.
2. Spatial Interpolation: Spatial interpolation involves estimating values for locations
where no data is available by using information from nearby locations. This technique
is often used to estimate rainfall, temperature, and other environmental variables.
Spatial interpolation techniques include inverse distance weighting, kriging, and radial
basis functions.
3. Network Analysis: Network analysis involves using spatial data to solve problems
related to transportation, communication, and logistics. Network analysis tools can be
used to find the shortest path between two points, optimize routes for vehicles, and
analyze traffic patterns.
4. Spatial Statistics: Spatial statistics involves applying statistical methods to geospatial
data to analyze patterns and relationships between different features. This can be used
to identify areas with high or low concentrations of a particular feature, detect spatial
clustering, and assess spatial autocorrelation.
5. Spatial Modeling and Prediction: Spatial modelling involves creating mathematical
models that represent the spatial relationships between different features. These models
can be used to predict future outcomes, such as the spread of a disease or the impact of
a policy intervention.
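
As an illustration of the spatial interpolation technique in item 2, the following minimal sketch applies inverse distance weighting in plain Python. The function name `idw_estimate`, the power parameter of 2, and the station values are assumptions made for illustration only; they are not data or tools from this practicum.

```python
import math

def idw_estimate(samples, target, power=2.0):
    """Estimate a value at `target` from (x, y, value) samples using
    inverse distance weighting: nearer samples get larger weights."""
    weighted_sum = 0.0
    weight_total = 0.0
    for x, y, value in samples:
        distance = math.hypot(x - target[0], y - target[1])
        if distance == 0.0:
            return value  # target coincides with a sample point
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total

# Hypothetical rainfall readings (mm) at three stations: (x, y, value)
stations = [(0.0, 0.0, 10.0), (10.0, 0.0, 20.0), (0.0, 10.0, 30.0)]
estimate = idw_estimate(stations, (5.0, 5.0))
print(estimate)
```

Because the target point in this example is equally far from all three stations, the weights are equal and the estimate is simply the mean of the three readings.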

In processing GIS data, spatial analysis can be used to provide solutions to spatial
problems. The benefits of spatial analysis depend on the functions performed. In
summary, it allows users to:

• Create, select, map, and analyze cell-based raster data.
• Carry out integrated vector/raster data analysis.
• Derive new information from existing data.
• Select information from several data layers.
• Integrate raster data sources with vector data.

Overall, spatial analysis is a critical component of GIS data processing, as it enables users
to extract valuable insights and make informed decisions about spatially-related issues.

2.3 Grid-based Analysis

Grid-based analysis is a type of spatial analysis that involves dividing an area into a
regular grid of cells and analyzing data based on the characteristics of each cell. This
approach is most common in raster-based GIS, which represents geospatial data
as a grid of cells, each with its own value.
Grid-based analysis can be applied to both raster and vector data. Its application
differs between the two because of their fundamental differences in data
representation.
Raster data is represented as a grid of cells, where each cell represents a specific
geographic location and has a value representing a certain attribute of that location, such as
elevation or land use. Grid-based analysis is well suited to raster data, as it allows for
the analysis of each cell's attributes and the relationships between them. Some common
applications of grid-based analysis for raster data include:

1. Zonal statistics: This involves calculating summary statistics for a specific geographic
zone, such as the mean elevation or the total population within each grid cell.
2. Interpolation: This involves estimating values for locations where data is missing, using
values from surrounding grid cells.
3. Viewshed analysis: This involves determining the visibility of a given point or area by
analyzing the view from each grid cell.
4. Terrain analysis: This involves analyzing terrain features such as slope, aspect, and
elevation, and using this information to model processes such as erosion or wildfire.

On the other hand, vector data is represented as discrete objects with distinct boundaries
and attributes, such as points, lines, and polygons. While grid-based analysis is not as well-
suited for vector data as it is for raster data, it can still be useful for some applications.
Some common applications of grid-based analysis for vector data include:

1. Density analysis: This involves creating a grid-based representation of point or line
features and analyzing the density of these features in different areas.
2. Distance analysis: This involves creating a grid-based representation of vector features
and analyzing the distance between different features.
3. Interpolation: This involves creating a grid-based representation of vector features and
estimating values for locations where data is missing, using values from surrounding
grid cells.

In summary, grid-based analysis is a powerful tool for analyzing both raster and vector
data, and it can be applied in a variety of applications, including environmental modelling,
land use planning, and natural resource management.
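
The density analysis described above for vector data can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions: the function name `point_density_grid`, the grid origin, the 100-unit cell size, and the point coordinates are all hypothetical and are not taken from the Tehran data.

```python
def point_density_grid(points, x_min, y_min, cell_size, n_cols, n_rows):
    """Count how many (x, y) points fall inside each cell of a regular grid.
    Row 0 is the row of cells that starts at y_min."""
    grid = [[0] * n_cols for _ in range(n_rows)]
    for x, y in points:
        col = int((x - x_min) // cell_size)
        row = int((y - y_min) // cell_size)
        # Points outside the grid extent are ignored
        if 0 <= col < n_cols and 0 <= row < n_rows:
            grid[row][col] += 1
    return grid

# Hypothetical point features (for example, building centroids)
points = [(10.0, 20.0), (50.0, 60.0), (120.0, 30.0), (130.0, 40.0), (250.0, 10.0)]
density = point_density_grid(points, 0.0, 0.0, 100.0, n_cols=2, n_rows=1)
print(density)
```

Each grid cell ends up with one summary value, which is the same idea that the fishnet and zonal statistics steps in Chapter III apply to raster data.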

2.4 Linear Regression

The simple linear regression equation is a model that describes the
relationship between one independent variable or predictor (x) and one dependent
variable or response (y), usually represented by a straight line. The general equation
of linear regression is:

y = ax + b

y = dependent variable
x = independent variable
a = slope of the regression line
b = intercept of the regression line
n = number of observations

The coefficients a and b are determined from the equations:

\[ a = \frac{n\sum xy - \sum x \sum y}{n\sum x^2 - \left(\sum x\right)^2} \]

\[ b = \frac{\sum y \sum x^2 - \sum x \sum xy}{n\sum x^2 - \left(\sum x\right)^2} \]

To measure the strength of the relationship between variables x and y, a correlation
analysis is carried out, and the result is expressed by a number known as the
correlation coefficient. Regression analysis is usually done together with correlation
analysis. The Pearson correlation coefficient (r) is expressed by:

\[ r = \frac{n\sum xy - \sum x \sum y}{\sqrt{\left[n\sum x^2 - \left(\sum x\right)^2\right]\left[n\sum y^2 - \left(\sum y\right)^2\right]}} \]

The accuracy of the predicted values can also be checked using the coefficient of
determination, R². The greater the value of R², the better the regression line fits the
data and the more accurate the predicted values. In simple linear regression, the
coefficient of determination (R²) can be obtained by squaring the correlation
coefficient. The sign of the correlation describes how the two variables relate to
each other. If the correlation is positive, the two variables increase together, while
if it is negative, one variable decreases as the other increases.
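
The sum-based formulas in this section can be checked with a short script. The following is a minimal sketch in plain Python, assuming the standard Pearson formula for r; the function name `simple_linear_regression` and the five sample values are made up for illustration and are not results from this report.

```python
import math

def simple_linear_regression(x, y):
    """Return slope a, intercept b, and Pearson r for the line y = a*x + b,
    computed from the sums used in the formulas of this section."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2 = sum(xi * xi for xi in x)
    sum_y2 = sum(yi * yi for yi in y)
    denom = n * sum_x2 - sum_x ** 2
    a = (n * sum_xy - sum_x * sum_y) / denom
    b = (sum_y * sum_x2 - sum_x * sum_xy) / denom
    r = (n * sum_xy - sum_x * sum_y) / math.sqrt(denom * (n * sum_y2 - sum_y ** 2))
    return a, b, r

# Made-up values for five grid cells (not the Tehran data)
urban_cells = [1, 2, 3, 4, 5]
population = [2, 4, 5, 4, 5]
a, b, r = simple_linear_regression(urban_cells, population)
r_squared = r ** 2  # coefficient of determination for simple linear regression
print(a, b, r, r_squared)
```

For these sample values the sums are Σx = 15, Σy = 20, Σxy = 66, Σx² = 55, and Σy² = 86, which give a = 0.6, b = 2.2, and R² = 0.6.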

2.5 ArcGIS
ArcGIS is a suite of geographic information system (GIS) software products developed
and marketed by Esri, a leading provider of GIS technology. ArcGIS provides tools for
managing, analyzing, and visualizing geospatial data, allowing users to create maps and
perform spatial analysis on data from various sources.

ArcGIS consists of several components, including:

1. ArcGIS Desktop: This is the primary software used for creating and analyzing
geospatial data, including tools for data editing, geoprocessing, and visualization.
2. ArcGIS Online: This is a cloud-based platform that provides access to web mapping
applications and geospatial data, allowing users to create and share maps and other
geographic content.
3. ArcGIS Server: This is a software component that enables users to publish and share
GIS data and services over the internet or within an organization's network.
4. ArcGIS Pro: This is a modern, 64-bit desktop application that provides advanced 2D
and 3D mapping and analysis capabilities.
5. ArcGIS Mobile: This is a mobile application that provides access to GIS data and maps
on mobile devices.
6. ArcGIS Enterprise: This is a comprehensive GIS platform that allows organizations to
create and manage their own GIS infrastructure, including data storage, analysis, and
visualization tools.

ArcGIS is widely used by a variety of industries and organizations, including
government agencies, environmental organizations, natural resource management
agencies, and businesses. Its robust capabilities and user-friendly interface make it a
popular choice for GIS professionals and enthusiasts alike.

2.6 Tehran city

Tehran is a mountainside city situated at an altitude of 900-1700 m above sea level. It
covers an area of 1500 km² on the slopes of the Alborz Mountains. Its urban area spreads
entirely over the Iranian plateau, on the slopes of a very high and dense mountain barrier
(known as Towchal) whose peak of 3933 m is 2200 m higher than the city’s residential
areas.
As the capital city of Iran, Tehran has the largest population of any city in the country
and is the centre of cultural, economic, political and social activities. In 1996, Tehran
had a population of 6,758,845. By 2011 its population had increased to
8,154,051, and it grew to around 8.5 million by the middle
of 2014. About 30% of Iran’s public-sector workforce and 45% of large industrial firms
are located in Tehran.
Since Tehran has a dry climate, bareland is the dominant land use/land cover (LULC) type
in the area. The central area is dominated by the urban dense and urban sparse classes,
which also spread along the roads toward the west, southwest and southeast. The north side
of Tehran is almost entirely bareland because of the Alborz Mountains. Cropland and
grassland are found close to residential areas (urban dense or urban sparse).
Chapter III
Methodology
3.1 Location
The location of the practicum and grid-based analysis in this report is Tehran City, Iran.

Practicum Location Tehran, Iran

3.2 Tools And Materials

The tools and materials needed to carry out the analysis and prepare the report are as
follows:
3.2.1 Tools
1. Hardware
a. Laptop
b. Mouse
2. Software
a. ArcGIS 10.4
b. Microsoft Office

3.2.2 Materials

Materials used in this practicum include

1. Land cover data for the City of Tehran in 2014 downloaded from
2. Population data for the City of Tehran in 2014 downloaded from
3.3 Flowchart


1. Data gathering: land cover data of Tehran city and population data of Iran (2014)
2. Resampling of land cover data
3. Making fishnets
4. Reclassifying land cover into urban area, bareland area, and water area
5. Zonal statistics
6. Making graphs: population-urban graph, population-bareland graph, and population-water graph


3.4 Explanation of Stages
The stages in the analysis of the relationship between land cover and
population using ArcGIS software are as follows:
1. Download the land cover data for the City of Tehran in 2014.
2. Download the population data of Iran in 2014 from the source website.
3. Open the ArcGIS software.
4. Add the land cover data for the City of Tehran in 2014 and the population data for Iran
in 2014 to ArcGIS.
5. Change the cell size of the land cover data from 30 m x 30 m to 10 m x 10 m so that
it matches the cell size of the population data. To do this, search for the Resample
tool, fill in the parameters, and click OK.
6. Create a square area of interest (AOI) that covers both the population and land cover
data, then clip both datasets to the AOI.
7. Create a grid (fishnet) with a cell size of 100 x 100 meters for the land cover and
population data by clicking ArcToolbox → Data Management Tools → Sampling
→ Create Fishnet.
8. Reclassify the land cover into urban, bareland, and water layers. For each layer, enter
1 for the land cover class being extracted and 0 for all other classes.
9. Repeat the reclassification until a reclassified raster is obtained for each land cover
class.
10. Perform zonal statistics with ArcToolbox → Spatial Analyst Tools
→ Zonal → Zonal Statistics as Table.
In the Zonal Statistics as Table window, enter the fishnet in the first input box,
select FID in the second, enter the reclassified raster in the third, and select SUM
in the last, then click OK. Do this three times, once for each reclassified land cover
raster, and once for the population data.
11. Combine the table for each land cover class with the population table: right-click
the zonal statistics result to be combined → click Join → specify the field on which
the join is based → specify the table to be joined → click OK. The joined table is
then ready to be exported to Excel for creating the graphs.
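
Steps 8 to 10 above can be approximated outside ArcGIS to make the logic concrete. The following minimal sketch in plain Python reclassifies a small land cover raster into a binary mask and sums it per square zone, which mimics the SUM statistic of Zonal Statistics as Table on a fishnet. The function names, the class codes (1 = urban, 2 = bareland, 3 = water), and the 4 x 4 raster are hypothetical, and real fishnet cells need not align exactly with raster blocks as assumed here.

```python
def reclassify(raster, target_classes):
    """Return a binary raster: 1 where the cell's class is in target_classes, else 0."""
    return [[1 if value in target_classes else 0 for value in row] for row in raster]

def zonal_sum(raster, block_size):
    """Sum raster values inside each block_size x block_size zone (one fishnet cell)."""
    n_rows, n_cols = len(raster), len(raster[0])
    sums = []
    for top in range(0, n_rows, block_size):
        row_sums = []
        for left in range(0, n_cols, block_size):
            total = sum(
                raster[r][c]
                for r in range(top, min(top + block_size, n_rows))
                for c in range(left, min(left + block_size, n_cols))
            )
            row_sums.append(total)
        sums.append(row_sums)
    return sums

# Hypothetical class codes: 1 = urban, 2 = bareland, 3 = water
land_cover = [
    [1, 1, 2, 2],
    [1, 2, 2, 2],
    [3, 3, 1, 1],
    [3, 2, 1, 1],
]
urban_mask = reclassify(land_cover, {1})
urban_per_zone = zonal_sum(urban_mask, block_size=2)
print(urban_per_zone)
```

With the 10 m cell size and the 100 m fishnet used in this practicum, each zone would contain 10 x 10 = 100 pixels, so the SUM of a binary mask ranges from 0 to 100 per fishnet cell.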
Chapter IV
Results And Discussions
4.1 Results
Based on the practicum that has been carried out, the following results are obtained:

1. Reclassified Land Cover Data (Urban Sparse and Urban Dense)

Result of urban reclassification

2. Reclassified Land Cover Data (Bareland)

Result of bareland reclassification

3. Reclassified Land Cover Data (Water)

Result of water reclassification

4.2 Discussions
Based on the Zonal Statistics as Table process, SUM values were obtained for each
fishnet cell. For the reclassified land cover rasters, the SUM is the number of pixels
of that class within the cell. These data were used to analyze the relationship between
land cover and population in the city of Tehran, Iran. The analysis was carried out by
plotting the data on a scatter chart, fitting a linear regression line, and calculating
the correlation value using RStudio software, with the following details:
a. Linear regression Urban – Population

Multiple R-squared: 0.3831

According to the scatter plot, the linear regression line has a
positive slope, meaning that urban area increases as population increases.
The multiple R-squared value (0.3831) shows that urban area explains about 38% of
the variation in population, which is a moderate relationship. The remaining
variation is likely due to other factors that influence population and are not
captured by the urban area variable alone.

In addition, Pearson’s correlation test was carried out between population
and urban area to confirm the result of the graph. The correlation value was
around 0.62, which indicates a positive and moderately strong correlation between
the two variables. A plausible explanation is that as a population grows, there is a
greater demand for housing, infrastructure, and services, which leads to the
expansion of urban areas. This applies especially to Tehran, which as the capital
city attracts residents with its facilities, job opportunities, and many other
advantages.

b. Linear regression Bareland– Population

Multiple R-squared: 0.1991

According to the scatter plot, the linear regression line has a
negative slope, meaning that bareland area decreases as population increases.
This suggests a negative correlation between population and
bareland area. Pearson’s correlation test was carried out between population and
bareland area to confirm the visualization, and the correlation value was about
-0.44, which indicates a negative, moderate-to-weak correlation between the two
variables in Tehran city. A plausible explanation is that as a population grows,
there is a greater demand for housing, infrastructure, and services, which leads to
the expansion of urban areas and consequently to a decrease in the area of
bareland.

The multiple R-squared value of 0.1991 between population and bareland
indicates a weak relationship: bareland area explains only about 20% of the
variation in population. R-squared is never negative, so the negative direction of
the relationship is shown by the slope and the correlation coefficient, not by
R-squared. It is important to note that correlation does not imply causation, and the
relationship between population and bareland could be influenced by other variables,
such as economic development, migration patterns, or government policies.
Therefore, it is important to interpret the results carefully and consider other factors
before drawing conclusions about the relationship between the variables.

As mentioned before, Tehran is a large, developed city that attracts people
to live there, so little bareland remains where the demand for living space is
high.

c. Linear regression Water – Population

Multiple R-squared: 3.052e-06

According to the scatter plot, the regression line has an almost zero slope, which
indicates that there is no linear correlation between water area and population.
Pearson’s test gave a correlation value of around 0.001 for these
two variables, which is almost zero. In Pearson’s test,
zero represents no linear correlation between two variables. A plausible explanation
is that the availability of water bodies does not necessarily determine the size or
density of a population. Other factors, such as the availability of resources, climate,
and economic conditions, can significantly affect population growth and density.
The multiple R-squared value of 3.052e-06 means that only a very small fraction
(0.000003052, or about 0.0003%) of the variation in population can be explained by
water area alone in the given model.

Although Tehran does not have many water resources, there are
many other reasons why people prefer to live there.
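
As a consistency check on the three regressions above: in simple linear regression with one predictor, R-squared equals the square of Pearson's r, so the reported R-squared values should roughly reproduce the reported correlation coefficients. The following minimal sketch in Python uses only the values reported in this section; the variable names are chosen for illustration.

```python
import math

# Multiple R-squared values reported for the three regressions above
reported_r_squared = {"urban": 0.3831, "bareland": 0.1991, "water": 3.052e-06}

# |r| implied by each R-squared; the sign of r comes from the slope of the line
implied_abs_r = {name: math.sqrt(value) for name, value in reported_r_squared.items()}

for name, abs_r in implied_abs_r.items():
    print(f"{name}: |r| = {abs_r:.4f}")
```

The implied magnitudes are close to the reported coefficients of about 0.62 for urban area, about -0.44 for bareland (negative because the slope is negative), and near zero for water.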

d. Transportation and commercial facilities, urban area, population, and buildings

The availability of transportation and commercial facilities can have a
significant impact on the strength of the correlation between urban area and
population. Some of the effects are:

1. Increased mobility: Good transportation facilities, such as highways, railways, and
airports, make it easier for people to move in and out of urban areas. This can lead to
an increase in population as people are more willing to move to urban areas if they can
easily travel to other locations for work, education, or entertainment. Additionally, the
availability of commercial facilities such as shopping centres and supermarkets can
make it more convenient for people to live in urban areas by providing easy access to
goods and services.
2. Greater accessibility: Good transportation facilities also make it easier for people within
urban areas to access different parts of the city. This can lead to a more efficient
distribution of the population within the city, as people are more likely to live and work
in different areas of the city. The availability of commercial facilities can also increase
accessibility within urban areas, as people can easily access shopping and other services
without needing to travel long distances.
3. Improved economic development: Good transportation and commercial facilities can
also have a positive impact on economic development, as they make it easier for
businesses to transport goods and services and for people to access job opportunities.
The availability of commercial facilities can attract businesses to urban areas, which
can create job opportunities and stimulate economic growth.
4. Increased demand for housing: Improved transportation and commercial facilities can
increase the demand for housing in urban areas, as people are more willing to live in
these areas if they can easily travel to other locations and have access to commercial
facilities. This can lead to an increase in housing prices, which can have both positive
and negative effects on the population. On the one hand, higher housing prices can
make it more difficult for people to afford to live in urban areas. On the other hand,
higher housing prices can attract higher-income residents, which can further stimulate
economic development.

Overall, as shown in the figures below, the availability of transportation and
commercial facilities can have a significant impact on the strength of the correlation
between urban area and population. By increasing mobility, accessibility, and
economic development, these facilities can attract more people to urban areas and
create a more efficient distribution of the population and buildings within cities.

Transportation facilities map of Tehran city

Commercial facilities map of Tehran city

Buildings dense map of Tehran city

Chapter V
Closing
5.1 Conclusion

Based on the practicum and analysis of this report, it can be concluded:

1. The total population in the City of Tehran, Iran is positively associated with
the extent of urban area. This can be seen from the upward-sloping regression line
and the correlation coefficient (0.62): an increase in the x variable (Urban) is
accompanied by an increase in the y variable (Population).
2. The total population in the City of Tehran, Iran is negatively associated with
the extent of bareland. This can be seen from the regression line that descends
from upper left to lower right and the correlation coefficient
(-0.44): an increase in the x variable (Bareland) is accompanied by a decrease
in the y variable (Population).
3. The total population in Tehran City, Iran shows no linear relationship with the
extent of water area. This can be seen from the scatter plot, which is
widely spread, the regression line, which is almost parallel to the x-axis, and
the correlation coefficient (0.001).
4. Analysis using the grid-based method allows us to identify and analyze the
relationship between population size and land cover in an area.

5.2 Suggestion
The author's suggestions for the analysis and preparation of
this report are:
1. Study more literature on grid-based analysis so that the analysis process is
easier to understand and the data are easier to process.
2. Try Spearman's correlation analysis to check whether there is a monotonic but
non-linear relationship between the variables.
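
To illustrate suggestion 2, the following minimal sketch in plain Python computes Spearman's rank correlation as the Pearson correlation of ranks and compares it with Pearson's r on a made-up monotonic, non-linear dataset. The function names and the cubic sample data are assumptions for illustration and are not taken from the Tehran data.

```python
def ranks(values):
    """Assign 1-based ranks, giving tied values the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

def spearman(x, y):
    """Spearman's rho: Pearson correlation applied to the ranks of x and y."""
    return pearson(ranks(x), ranks(y))

# Made-up monotonic but non-linear relationship: y = x cubed
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
pearson_r = pearson(x, y)
spearman_rho = spearman(x, y)
print(pearson_r, spearman_rho)
```

On this sample the ranks of x and y are identical, so Spearman's rho reaches 1 while Pearson's r stays below 1, which is the kind of difference the suggested comparison would reveal.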

