
Introduction to Trees

What are trees?

 A tree is a hierarchical data structure that stores information naturally in the form of a hierarchy.
 Tree is one of the most powerful and advanced data structures.
 It is a non-linear data structure, unlike arrays, linked lists, stacks and queues.
 It represents nodes connected by edges.

The above figure represents the structure of a tree. The tree has 2 subtrees.

A is the parent of B and C.
B is called a child of A and is also the parent of D, E, F.

A tree is a collection of elements called nodes, where each node can have an
arbitrary number of children.

Field Description

Root: The root is a special node in a tree. The entire tree is referenced through it. It does not have a parent.

Parent Node: A parent node is the immediate predecessor of a node.

Child Node: All immediate successors of a node are its children.

Siblings: Nodes with the same parent are called siblings.

Path: A path is a sequence of successive edges from a source node to a destination node.

Height of Node: The height of a node is the number of edges on the longest path between that node and a leaf.

Height of Tree: The height of a tree is the height of its root node.

Depth of Node: The depth of a node is the number of edges from the tree's root node to that node.

Degree of Node: The degree of a node is the number of children of that node.

Edge: An edge is a connection between one node and another. It is a line between two nodes, or between a node and a leaf.

In the above figure, D, F, H, G are leaves. B and C are siblings. Each node,
excluding the root, is connected by a direct edge from exactly one other node,
its parent (parent → children).

Levels of a node
The level of a node represents the number of connections between the node and
the root. It represents the generation of a node. If the root node is at level 0, its
child is at level 1, its grandchild is at level 2, and so on. The levels of a node
can be shown as follows:

Note:

- If a node has no children, it is called a leaf or external node.

- Nodes which are not leaves are called internal nodes. Internal nodes have
at least one child.

- A tree can be empty with no nodes, or it can consist of a single node called
the root.

Height of a Node

As we studied, the height of a node is the number of edges on the longest path
between that node and a leaf. Each node has a height.

In the above figure, A, B, C, D have a height. A leaf has no height, as there is
no downward path starting from a leaf. Node A's height is the number of
edges on the path to K, not to D, and its height is 3.

Note:

- The height of a node is the length of the longest path from the node to a leaf.

- The path can only be downward.



Depth of a Node

While height is measured from a node down to a leaf, depth is measured from the
top, i.e., from the root level, down to the node; that is why we call it the depth of a node.

In the above figure, node G's depth is 2. For the depth of a node, we simply count
how many edges lie between the target node and the root, ignoring the directions.

Note: Depth of the root is 0.

Advantages of Tree
 A tree reflects structural relationships in the data.
 It is used to represent hierarchies.
 It provides efficient insertion and searching operations.
 Trees are flexible. They allow subtrees to be moved around with minimal effort.
Binary Tree- Representation in
Memory
A binary tree is a non-linear data structure used to maintain binary relationships among
elements. Binary trees are special trees where a node can have a maximum of two child
nodes. These lie on the left and right side of a given node and are therefore called the left
child and right child nodes. These trees are well suited to storing decision trees, which
represent decisions involving yes or no, true or false, or 0 or 1. They are frequently used in
gaming applications where only two moves are possible for a player; the tree stores the
various states that may be reached after a move is taken by a player.

Memory Representation-Array
A small and almost complete binary tree can be easily stored in a linear array. Only a small
tree is preferably stored in a linear array, because the searching process in a linear array is
expensive. Complete here means that most of the nodes have two child nodes.

To store a binary tree in a linear array, you need to consider the positional indexes of the
nodes. This indexing starts with 1 at the root node and proceeds from left to right as you go
down from one level to the next.

Indexes are assigned in this way:

Index of parent = INT[index of child node / 2]

Index of left child = 2 * index of parent
Index of right child = 2 * index of parent + 1

These rules are used to store the tree of the above example in an array.

If a binary tree contains few elements but is deep in structure, memory
under-utilization is a major issue.
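As a small illustration of these index rules (the tree contents and array size below are assumed for the example, not taken from the figure), a C sketch might look like this:

#include <stdio.h>

#define MAX 16          /* capacity of the array (assumed for this example) */

int main(void)
{
    /* 1-based positional indexing: index 1 is the root,
       index 2*i is the left child of i, index 2*i + 1 is the right child. */
    char tree[MAX] = {0};
    tree[1] = 'A';      /* root             */
    tree[2] = 'B';      /* left child of A  */
    tree[3] = 'C';      /* right child of A */
    tree[4] = 'D';      /* left child of B  */
    tree[5] = 'E';      /* right child of B */

    int child = 5;
    printf("Parent of index %d is index %d (%c)\n", child, child / 2, tree[child / 2]);

    int parent = 2;
    printf("Left child of index %d is index %d (%c)\n", parent, 2 * parent, tree[2 * parent]);
    printf("Right child of index %d is index %d (%c)\n", parent, 2 * parent + 1, tree[2 * parent + 1]);
    return 0;
}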

Binary Tree in a Linked Representation


In the linked (dynamic) representation, the linked list data structure is used. Each node
consists of a data part and two link parts. The two link parts store the addresses of the left
and right child nodes, and the data part stores the information of the binary tree element.
This is a better representation because nodes can be added or deleted at any location, and
memory utilization is better in this binary tree representation.
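A minimal C sketch of such a node, with an illustrative helper newNode for dynamic allocation (the helper name is an assumption for the example), might look like this:

#include <stdio.h>
#include <stdlib.h>

/* One node of the binary tree: a data part and two link parts. */
struct node
{
    int data;
    struct node *left;   /* address of the left child (NULL if absent)  */
    struct node *right;  /* address of the right child (NULL if absent) */
};

/* Allocate a node dynamically and initialise its fields. */
struct node *newNode(int value)
{
    struct node *n = (struct node *)malloc(sizeof(struct node));
    n->data = value;
    n->left = NULL;
    n->right = NULL;
    return n;
}

int main(void)
{
    struct node *root = newNode(1);   /* build a tiny 3-node tree */
    root->left  = newNode(2);
    root->right = newNode(3);
    printf("root=%d left=%d right=%d\n", root->data, root->left->data, root->right->data);
    return 0;
}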
Tree Data Structure
We have read about linear data structures like arrays, linked lists, stacks and queues, in which
all the elements are arranged in a sequential manner. Different data structures are used
for different kinds of data.

Some factors are considered for choosing the data structure:

o What type of data needs to be stored?: It might be a possibility that a certain data
structure can be the best fit for some kind of data.
o Cost of operations: We may want to minimize the cost of the most
frequently performed operations. For example, if we have a simple list on which we have to
perform the search operation, then we can create an array in which elements are stored
in sorted order to perform a binary search. Binary search works very fast for the
simple list as it divides the search space into half.
o Memory usage: Sometimes, we want a data structure that utilizes less memory.

A tree is also one of the data structures that represent hierarchical data. Suppose we
want to show the employees and their positions in the hierarchical form then it can be
represented as shown below:
The above tree shows the organization hierarchy of some company. In the above
structure, John is the CEO of the company, and John has two direct reports named
Steve and Rohan. Steve has three direct reports named Lee, Bob and Ella, where Steve is
a manager. Bob has two direct reports named Sal and Emma. Emma has two direct
reports named Tom and Raj. Tom has one direct report named Bill. This particular
logical structure is known as a Tree. Its structure is similar to the real tree, so it is named
a Tree. In this structure, the root is at the top, and its branches are moving in a
downward direction. Therefore, we can say that the Tree data structure is an efficient
way of storing the data in a hierarchical way.

Let's understand some key points of the Tree data structure.

o A tree data structure is defined as a collection of objects or entities known as nodes that
are linked together to represent or simulate hierarchy.
o A tree data structure is a non-linear data structure because it does not store data in a
sequential manner. It is a hierarchical structure, as elements in a Tree are arranged in
multiple levels.
o In the Tree data structure, the topmost node is known as a root node. Each node
contains some data, and data can be of any type. In the above tree structure, the node
contains the name of the employee, so the type of data would be a string.
o Each node contains some data and the link or reference of other nodes that can be
called children.

Some basic terms used in Tree data structure.

Let's consider the tree structure, which is shown below:

In the above structure, each node is labeled with some number. Each arrow shown in the
above figure is known as a link between the two nodes.

o Root: The root node is the topmost node in the tree hierarchy. In other words, the root
node is the one that doesn't have any parent. In the above structure, node numbered 1
is the root node of the tree. If a node is directly linked to some other node, it would be
called a parent-child relationship.
o Child node: If the node is a descendant of any node, then the node is known as a child
node.
o Parent: If the node contains any sub-node, then that node is said to be the parent of
that sub-node.
o Sibling: The nodes that have the same parent are known as siblings.
o Leaf Node:- The node of the tree, which doesn't have any child node, is called a leaf
node. A leaf node is the bottom-most node of the tree. There can be any number of leaf
nodes present in a general tree. Leaf nodes can also be called external nodes.
o Internal nodes: A node that has at least one child node is known as an internal node.
o Ancestor node:- An ancestor of a node is any predecessor node on a path from the root
to that node. The root node doesn't have any ancestors. In the tree shown in the above
image, nodes 1, 2, and 5 are the ancestors of node 10.
o Descendant: Any node reachable from a given node by following child links (a child,
grandchild, and so on) is known as a descendant of that node. In the above figure, 10 is a
descendant of node 5.

Properties of Tree data structure


o Recursive data structure: The tree is also known as a recursive data structure. A tree
can be defined recursively because the distinguished node in a tree data structure is
known as a root node. The root node of the tree contains a link to all the roots of its
subtrees. The left subtree is shown in the yellow color in the below figure, and the right
subtree is shown in the red color. The left subtree can be further split into subtrees
shown in three different colors. Recursion means reducing something in a self-similar
manner. So, this recursive property of the tree data structure is implemented in various
applications.

o Number of edges: If there are n nodes, then there will be n-1 edges. Each arrow in the
structure represents a link or path. Each node, except the root node, has exactly
one incoming link known as an edge. There would be one link for the parent-child
relationship.
o Depth of node x: The depth of node x can be defined as the length of the path from the
root to the node x. One edge contributes one-unit length in the path. So, the depth of
node x can also be defined as the number of edges between the root node and the node
x. The root node has 0 depth.
o Height of node x: The height of node x can be defined as the length of the longest path
from node x to a leaf node.

Based on the properties of the Tree data structure, trees are classified into various
categories.
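As a hedged illustration of the depth and height properties described above, the following C sketch computes both for a small hand-built binary tree; the node layout and helper names are assumptions for the example.

#include <stdio.h>
#include <stdlib.h>

struct node { int data; struct node *left, *right; };

static struct node *newNode(int v)
{
    struct node *n = malloc(sizeof *n);
    n->data = v; n->left = n->right = NULL;
    return n;
}

/* Height of a node: number of edges on the longest downward path to a leaf. */
int height(struct node *n)
{
    if (n == NULL || (n->left == NULL && n->right == NULL))
        return 0;                              /* a leaf has height 0 */
    int hl = height(n->left), hr = height(n->right);
    return 1 + (hl > hr ? hl : hr);
}

/* Depth of the node holding 'key': number of edges from the root, or -1 if absent. */
int depth(struct node *root, int key)
{
    if (root == NULL) return -1;
    if (root->data == key) return 0;           /* the root has depth 0 */
    int d = depth(root->left, key);
    if (d == -1) d = depth(root->right, key);
    return (d == -1) ? -1 : d + 1;
}

int main(void)
{
    struct node *root = newNode(1);
    root->left = newNode(2); root->right = newNode(3);
    root->left->left = newNode(4);
    printf("height(root)=%d depth(4)=%d\n", height(root), depth(root, 4));  /* 2 and 2 */
    return 0;
}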

Implementation of Tree
The tree data structure can be created by creating the nodes dynamically with the help
of the pointers. The tree in the memory can be represented as shown below:
The above figure shows the representation of the tree data structure in the memory. In
the above structure, the node contains three fields. The second field stores the data; the
first field stores the address of the left child, and the third field stores the address of the
right child.

In programming, the structure of a node can be defined as:

struct node
{
    int data;
    struct node *left;
    struct node *right;
};

The above structure can only be defined for binary trees, because a binary tree can
have at most two children, and generic trees can have more than two children. The
structure of the node for generic trees would be different as compared to the binary
tree.

Applications of trees
The following are the applications of trees:

o Storing naturally hierarchical data: Trees are used to store data in a hierarchical
structure. For example, the file system: the files and folders stored on a disc drive form
naturally hierarchical data and are stored in the form of trees.
o Organize data: Trees are used to organize data for efficient insertion, deletion and searching.
For example, a binary search tree takes O(log N) time on average to search for an element.
o Trie: It is a special kind of tree that is used to store the dictionary. It is a fast and efficient
way for dynamic spell checking.
o Heap: It is also a tree data structure implemented using arrays. It is used to implement
priority queues.
o B-Tree and B+Tree: B-Tree and B+Tree are the tree data structures used to implement
indexing in databases.
o Routing table: The tree data structure is also used to store the data in routing tables in
the routers.

Types of Tree data structure


The following are the types of a tree data structure:

o General tree: The general tree is one of the types of tree data structure. In a general
tree, a node can have either 0 or at most n child nodes. There is no restriction
imposed on the degree of a node (the number of children that a node can have). The
topmost node in a general tree is known as a root node. The children of the parent node
are known as subtrees.
There can be n number of subtrees in a general tree. In the general tree, the subtrees are
unordered as the nodes in the subtree cannot be ordered.
Every non-empty tree has a downward edge, and these edges are connected to the
nodes known as child nodes. The root node is labeled with level 0. The nodes that have
the same parent are known as siblings.
o Binary tree: Here, the name binary itself suggests two numbers, i.e., 0 and 1. In a binary tree,
each node in a tree can have at most two child nodes. Here, at most means that the
node has 0 nodes, 1 node or 2 nodes.

To know more about the binary tree, click on the link given below:
https://www.javatpoint.com/binary-tree
o Binary Search tree: A binary search tree is a non-linear, node-based data structure. A node
in a binary search tree is represented with three fields, i.e., a data part, a left child, and a
right child. A node can be connected to at most two child nodes in a binary search tree, so
the node contains two pointers (a left child and a right child pointer).
Every node in the left subtree must contain a value less than the value of the root node,
and the value of each node in the right subtree must be greater than the value of the
root node.

A node can be created with the help of a user-defined data type known as struct, as
shown below:

struct node
{
    int data;
    struct node *left;
    struct node *right;
};

The above node structure has three fields: the first field is the data, the second field is the
left pointer of node type, and the third field is the right pointer of node type.

To know more about the binary search tree, click on the link given below:

https://www.javatpoint.com/binary-search-tree

o AVL tree

It is one of the types of binary tree, or we can say that it is a variant of the binary
search tree. An AVL tree satisfies the properties of a binary tree as well as of a binary
search tree. It is a self-balancing binary search tree that was invented by Adelson-Velsky
and Landis. Here, self-balancing means balancing the heights of the left subtree and the
right subtree. This balancing is measured in terms of the balancing factor.

We can consider a tree as an AVL tree if the tree obeys the binary search tree as well as
a balancing factor. The balancing factor can be defined as the difference between the
height of the left subtree and the height of the right subtree. The balancing factor's
value must be either 0, -1, or 1; therefore, each node in the AVL tree should have the
value of the balancing factor either as 0, -1, or 1.
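As a small illustration only (not a full AVL implementation with rotations), the balancing factor can be computed from subtree heights as in the C sketch below; here height is counted so that an empty subtree has height 0, and the node layout is assumed.

#include <stdio.h>

struct node { int data; struct node *left, *right; };

/* Height counted in nodes here: an empty subtree has height 0. */
int height(struct node *n)
{
    if (n == NULL) return 0;
    int hl = height(n->left), hr = height(n->right);
    return 1 + (hl > hr ? hl : hr);
}

/* Balance factor = height(left subtree) - height(right subtree).
   For every node of an AVL tree this must be -1, 0 or 1. */
int balanceFactor(struct node *n)
{
    return n ? height(n->left) - height(n->right) : 0;
}

int main(void)
{
    struct node a = {10, NULL, NULL}, b = {20, NULL, NULL}, root = {15, &a, &b};
    printf("balance factor of root = %d\n", balanceFactor(&root));   /* prints 0 */
    return 0;
}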

To know more about the AVL tree, click on the link given below:

https://www.javatpoint.com/avl-tree

o Red-Black Tree
The Red-Black tree is a binary search tree. The prerequisite of the Red-Black tree is
that we should know about the binary search tree. In a binary search tree, the values in
the left subtree should be less than the value of that node, and the values in the right
subtree should be greater than the value of that node. As we know, the time
complexity of searching in a binary search tree is O(log2 n) in the average case, O(1) in the
best case, and O(n) in the worst case.

When any operation is performed on the tree, we want our tree to be balanced so that
all the operations like searching, insertion, deletion, etc., take less time, and all these
operations will have a time complexity of O(log2 n).

The Red-Black tree is a self-balancing binary search tree. The AVL tree is also a height-
balanced binary search tree, so why do we require a Red-Black tree? In the AVL tree,
we do not know how many rotations will be required to balance the tree, but in the
Red-Black tree, a maximum of 2 rotations is required to balance the tree. It contains
one extra bit that represents either the red or black color of a node to ensure the
balancing of the tree.

o Splay tree

The splay tree data structure is also a binary search tree, in which the most recently
accessed element is placed at the root position of the tree by performing some rotation
operations. Here, splaying means moving the recently accessed node to the root. It is a
self-balancing binary search tree that has no explicit balance condition like the AVL tree.

It is possible that the height of a splay tree is not balanced, i.e., the heights of the left
and right subtrees may differ, but the operations on a splay tree take O(log n)
amortized time, where n is the number of nodes.

A splay tree is a balanced tree, but it cannot be considered a height-balanced tree,
because after each operation a rotation is performed, which tends toward a balanced tree.

o Treap

The Treap data structure comes from the Tree and Heap data structures, so it combines the
properties of both. In a binary search tree, each node in the left subtree must be less than
or equal to the value of the root node, and each node in the right subtree must be greater
than or equal to the value of the root node. In a min-heap data structure, both the right and
left subtrees contain larger keys than the root; therefore, we can say that the root node
contains the lowest value.
In the treap data structure, each node has both a key and a priority, where the key follows
the binary search tree ordering and the priority follows the heap ordering.

The Treap data structure follows two properties which are given below:

o Right child of a node >= current node and left child of a node <= current node (binary
search tree property, on keys)
o Children of any node must be greater than the node (min-heap property, on priorities)
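As an illustrative sketch only (field and helper names are assumptions, not a full treap with rotations), a treap node simply carries both orderings, with the priority commonly chosen at random:

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* A treap node: 'key' obeys the binary-search-tree order,
   'priority' obeys the heap order. */
struct treapNode
{
    int key;
    int priority;
    struct treapNode *left, *right;
};

struct treapNode *newTreapNode(int key)
{
    struct treapNode *n = malloc(sizeof *n);
    n->key = key;
    n->priority = rand();        /* random priority keeps the treap balanced on average */
    n->left = n->right = NULL;
    return n;
}

int main(void)
{
    srand((unsigned)time(NULL));
    struct treapNode *n = newTreapNode(42);
    printf("key=%d priority=%d\n", n->key, n->priority);
    return 0;
}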

o B-tree

A B-tree is a balanced m-way tree, where m defines the order of the tree. Until now, we have
read that a node contains only one key, but a B-tree node can have more than one key and
more than 2 children. It always maintains sorted data. In a binary tree, it is possible for leaf
nodes to be at different levels, but in a B-tree, all the leaf nodes must be at the same
level.

If the order is m, then a node has the following properties:

o Each node in a B-tree can have a maximum of m children.


o For minimum children: a leaf node has 0 children, the root node has a minimum of 2 children,
and an internal node has a minimum of ceiling(m/2) children. For example, if the value of m is
5, then a node can have at most 5 children and internal nodes must contain a minimum of 3
children (the ceiling of 5/2).
o Each node has a maximum of (m-1) keys.

The root node must contain a minimum of 1 key, and all other nodes must contain
at least ceiling(m/2) - 1 keys.
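As an illustration of this layout only (not a full B-tree with splitting and merging), a node of an assumed order m = 5 might be declared in C as follows; the field names are assumptions for the example.

#include <stdio.h>
#include <stdlib.h>

#define M 5                      /* assumed order of the B-tree */

struct btreeNode
{
    int numKeys;                 /* how many keys are currently stored (at most M-1) */
    int keys[M - 1];             /* keys kept in sorted order */
    struct btreeNode *child[M];  /* up to M children */
    int isLeaf;                  /* 1 if the node has no children */
};

struct btreeNode *newBTreeNode(int isLeaf)
{
    struct btreeNode *n = calloc(1, sizeof *n);   /* children start out NULL */
    n->isLeaf = isLeaf;
    return n;
}

int main(void)
{
    struct btreeNode *root = newBTreeNode(1);
    root->keys[0] = 10;
    root->numKeys = 1;           /* the root may hold as few as 1 key */
    printf("root holds %d key(s), first key = %d\n", root->numKeys, root->keys[0]);
    return 0;
}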
Binary Tree
A binary tree means that a node can have a maximum of two children. Here, the name binary
itself suggests 'two'; therefore, each node can have either 0, 1 or 2 children.

Let's understand the binary tree through an example.

The above tree is a binary tree because each node contains at most two children. The
logical representation of the above tree is given below:
In the above tree, node 1 contains two pointers, i.e., a left and a right pointer pointing to
its left and right nodes respectively. Node 2 contains both nodes (a left and a right
node); therefore, it has two pointers (left and right). Nodes 3, 5 and 6 are the leaf
nodes, so all these nodes contain NULL pointers in both their left and right parts.

Properties of Binary Tree


o At each level i, the maximum number of nodes is 2^i.
o The height of the tree is defined as the longest path from the root node to a
leaf node. The tree which is shown above has a height equal to 3. Therefore, the
maximum number of nodes at height 3 is equal to (1 + 2 + 4 + 8) = 15. In general,
the maximum number of nodes possible at height h is (2^0 + 2^1 + 2^2 + … + 2^h) =
2^(h+1) - 1.
o The minimum number of nodes possible at height h is equal to h + 1.
o If the number of nodes is minimum, then the height of the tree would be
maximum. Conversely, if the number of nodes is maximum, then the height of the
tree would be minimum.

If there are 'n' nodes in the binary tree:

The minimum height can be computed as:

As we know that,

n = 2^(h+1) - 1

n + 1 = 2^(h+1)

Taking log on both the sides,

log2(n+1) = log2(2^(h+1))

log2(n+1) = h + 1

h = log2(n+1) - 1

The maximum height can be computed as:

As we know that,

n = h + 1

h = n - 1

Types of Binary Tree

There are four types of Binary tree:

o Full/ proper/ strict Binary tree


o Complete Binary tree
o Perfect Binary tree
o Degenerate Binary tree
o Balanced Binary tree

1. Full/ proper/ strict Binary tree

The full binary tree is also known as a strict binary tree. A tree can only be considered
a full binary tree if every node contains either 0 or 2 children. The full binary
tree can also be defined as a tree in which each node must have 2 children except
the leaf nodes.

Let's look at the simple example of the Full Binary tree.


In the above tree, we can observe that each node is either containing zero or two
children; therefore, it is a Full Binary tree.

Properties of Full Binary Tree

o The number of leaf nodes is equal to the number of internal nodes plus 1. In the
above example, the number of internal nodes is 5; therefore, the number of leaf
nodes is equal to 6.
o The maximum number of nodes is the same as the maximum number of nodes in a
binary tree, i.e., 2^(h+1) - 1.
o The minimum number of nodes in the full binary tree is 2*h - 1.
o The minimum height of the full binary tree is log2(n+1) - 1.
o The maximum height of the full binary tree can be computed as:

n = 2*h - 1

n + 1 = 2*h

h = (n+1)/2

Complete Binary Tree


A complete binary tree is a tree in which all the levels are completely filled except possibly
the last level. In the last level, all the nodes must be as far left as possible. In a complete
binary tree, the nodes should be added from the left.

Let's create a complete binary tree.

The above tree is a complete binary tree because all the levels are completely filled, and
all the nodes in the last level are added from the left first.

Properties of Complete Binary Tree

o The maximum number of nodes in a complete binary tree is 2^(h+1) - 1.


o The minimum number of nodes in a complete binary tree is 2^h.
o The minimum height of a complete binary tree is log2(n+1) - 1.
o The maximum height of a complete binary tree is log2(n).

Perfect Binary Tree

A tree is a perfect binary tree if all the internal nodes have 2 children, and all the leaf
nodes are at the same level.
Let's look at a simple example of a perfect binary tree.

The below tree is not a perfect binary tree because all the leaf nodes are not at the same
level.
Note: All perfect binary trees are complete binary trees as well as full binary
trees, but the converse is not true, i.e., not all complete binary trees and full binary trees
are perfect binary trees.

Degenerate Binary Tree

The degenerate binary tree is a tree in which all the internal nodes have only one
child.

Let's understand the Degenerate binary tree through examples.


The above tree is a degenerate binary tree because all the nodes have only one child. It
is also known as a right-skewed tree as all the nodes have a right child only.
The above tree is also a degenerate binary tree because all the nodes have only one
child. It is also known as a left-skewed tree as all the nodes have a left child only.

Balanced Binary Tree

A balanced binary tree is a tree in which, for every node, the heights of the left and right
subtrees differ by at most 1. For example, AVL and Red-Black trees are balanced binary trees.

Let's understand the balanced binary tree through examples.


The above tree is a balanced binary tree because the difference between the heights of the
left subtree and the right subtree is zero.
The above tree is not a balanced binary tree because the difference between the heights of
the left subtree and the right subtree is greater than 1.

Binary Tree Implementation

A Binary tree is implemented with the help of pointers. The first node in the tree is
represented by the root pointer. Each node in the tree consists of three parts, i.e., data,
left pointer and right pointer. To create a binary tree, we first need to create the node.
We will create a node of a user-defined type as shown below:

struct node
{
    int data;
    struct node *left, *right;
};
In the above structure, data is the value, left pointer contains the address of the left
node, and right pointer contains the address of the right node.

Binary Tree program in C

#include <stdio.h>
#include <stdlib.h>

struct node
{
    int data;
    struct node *left, *right;
};

struct node *create(void);       /* prototype so that main() can call create() */

int main(void)
{
    struct node *root;
    root = create();
    return 0;
}

struct node *create(void)
{
    struct node *temp;
    int data, choice;
    printf("Press 0 to exit");
    printf("\nPress 1 for new node");
    printf("\nEnter your choice : ");
    scanf("%d", &choice);
    if(choice == 0)
    {
        return NULL;                       /* no node is created for this position */
    }
    else
    {
        temp = (struct node *)malloc(sizeof(struct node));
        printf("Enter the data:");
        scanf("%d", &data);
        temp->data = data;
        printf("Enter the left child of %d\n", data);
        temp->left = create();             /* recursively build the left subtree  */
        printf("Enter the right child of %d\n", data);
        temp->right = create();            /* recursively build the right subtree */
        return temp;
    }
}

The above code calls the create() function recursively and creates a new node on
each recursive call. When all the nodes have been created, they form a binary tree structure.
The process of visiting the nodes is known as tree traversal. There are three types of
traversals used to visit the nodes:

o Inorder traversal
o Preorder traversal
o Postorder traversal
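A minimal C sketch of these three traversals, assuming the usual node structure with data, left and right fields, is shown below.

#include <stdio.h>
#include <stdlib.h>

struct node { int data; struct node *left, *right; };

static struct node *newNode(int v)
{
    struct node *n = malloc(sizeof *n);
    n->data = v; n->left = n->right = NULL;
    return n;
}

/* Inorder: left subtree, node, right subtree. */
void inorder(struct node *n)
{
    if (n == NULL) return;
    inorder(n->left);  printf("%d ", n->data);  inorder(n->right);
}

/* Preorder: node, left subtree, right subtree. */
void preorder(struct node *n)
{
    if (n == NULL) return;
    printf("%d ", n->data);  preorder(n->left);  preorder(n->right);
}

/* Postorder: left subtree, right subtree, node. */
void postorder(struct node *n)
{
    if (n == NULL) return;
    postorder(n->left);  postorder(n->right);  printf("%d ", n->data);
}

int main(void)
{
    struct node *root = newNode(1);
    root->left = newNode(2); root->right = newNode(3);
    inorder(root);   printf("\n");   /* 2 1 3 */
    preorder(root);  printf("\n");   /* 1 2 3 */
    postorder(root); printf("\n");   /* 2 3 1 */
    return 0;
}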
Searching

Searching means finding or locating a specific element or node within a
data structure. Searching for a specific node in a binary search
tree is quite easy due to the fact that elements in a BST are stored in a
particular order.

1. Compare the element with the root of the tree.


2. If the item matches, then return the location of the node.
3. Otherwise, check if the item is less than the element present at the root; if so,
then move to the left sub-tree.
4. If not, then move to the right sub-tree.
5. Repeat this procedure recursively until a match is found.
6. If the element is not found, then return NULL.
Algorithm:

Search (ROOT, ITEM)

o Step 1: IF ROOT = NULL OR ROOT -> DATA = ITEM


Return ROOT
ELSE
IF ITEM < ROOT -> DATA
Return search(ROOT -> LEFT, ITEM)
ELSE
Return search(ROOT -> RIGHT, ITEM)
[END OF IF]
[END OF IF]
o Step 2: END
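A direct C translation of this algorithm might look like the following sketch, assuming the usual node structure; the hand-built tree in main() is only for demonstration.

#include <stdio.h>
#include <stdlib.h>

struct node { int data; struct node *left, *right; };

/* Returns a pointer to the node holding 'item', or NULL if it is not present. */
struct node *search(struct node *root, int item)
{
    if (root == NULL || root->data == item)
        return root;                       /* empty subtree or match */
    if (item < root->data)
        return search(root->left, item);   /* item can only be in the left subtree */
    return search(root->right, item);      /* otherwise it is in the right subtree */
}

int main(void)
{
    /* Tiny hand-built BST:   40
                             /  \
                           30    50   */
    struct node l = {30, NULL, NULL}, r = {50, NULL, NULL}, root = {40, &l, &r};
    printf("50 %s\n", search(&root, 50) ? "found" : "not found");
    printf("99 %s\n", search(&root, 99) ? "found" : "not found");
    return 0;
}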
Binary Search tree
In this article, we will discuss the binary search tree. This article will be very
helpful and informative for students with a technical background, as it is
an important topic of their course.

Before moving directly to the binary search tree, let's first see a brief
description of the tree.

What is a tree?
A tree is a kind of data structure that is used to represent data in
hierarchical form. It can be defined as a collection of objects or entities
called nodes that are linked together to simulate a hierarchy. A tree is a
non-linear data structure, as the data in a tree is not stored linearly or
sequentially.

Now, let's start the topic, the Binary Search tree.

What is a Binary Search tree?


A binary search tree follows some order to arrange the elements. In a
binary search tree, the value of the left node must be smaller than the parent
node, and the value of the right node must be greater than the parent node.
This rule is applied recursively to the left and right subtrees of the root.

Let's understand the concept of Binary search tree with an example.


In the above figure, we can observe that the root node is 40, and all the
nodes of the left subtree are smaller than the root node, and all the nodes
of the right subtree are greater than the root node.

Similarly, we can see that the left child of the root node is greater than its own
left child and smaller than its own right child. So, it also satisfies the property
of a binary search tree. Therefore, we can say that the tree in the above image is a
binary search tree.

Suppose we change the value of node 35 to 55 in the above tree; check


whether the tree will still be a binary search tree or not.
In the above tree, the value of the root node is 40. Its left child is 30, but the right
child of 30 is now 55, which is greater than 40 even though it lies in the left subtree
of the root. So, the above tree does not satisfy the property of a binary search tree.
Therefore, the above tree is not a binary search tree.

Advantages of Binary search tree


o Searching an element in the binary search tree is easy, as we always
have a hint of which subtree contains the desired element.

o As compared to array and linked lists, insertion and deletion


operations are faster in BST.

Example of creating a binary search tree


Now, let's see the creation of binary search tree using an example.

Suppose the data elements are - 45, 15, 79, 90, 10, 55, 12, 20, 50

o First, we have to insert 45 into the tree as the root of the tree.

o Then, read the next element; if it is smaller than the root node, insert
it as the root of the left subtree, and move to the next element.

o Otherwise, if the element is larger than the root node, then insert it as
the root of the right subtree.

Now, let's see the process of creating the Binary search tree using the given
data element. The process of creating the BST is shown below -

Step 1 - Insert 45.


Step 2 - Insert 15.

As 15 is smaller than 45, insert it as the root node of the left subtree.

Step 3 - Insert 79.

As 79 is greater than 45, insert it as the root node of the right subtree.
Step 4 - Insert 90.

90 is greater than 45 and 79, so it will be inserted as the right subtree of 79.

Step 5 - Insert 10.

10 is smaller than 45 and 15, so it will be inserted as a left subtree of 15.

Step 6 - Insert 55.


55 is larger than 45 and smaller than 79, so it will be inserted as the left
subtree of 79.

Step 7 - Insert 12.

12 is smaller than 45 and 15 but greater than 10, so it will be inserted as the
right subtree of 10.
Step 8 - Insert 20.

20 is smaller than 45 but greater than 15, so it will be inserted as the right
subtree of 15.

Step 9 - Insert 50.

50 is greater than 45 but smaller than 79 and 55. So, it will be inserted as a
left subtree of 55.
Now, the creation of binary search tree is completed. After that, let's move
towards the operations that can be performed on Binary search tree.

We can perform insert, delete and search operations on the binary search
tree.

Let's understand how a search is performed on a binary search tree.

Searching in Binary search tree


Searching means to find or locate a specific element or node in a data
structure. In Binary search tree, searching a node is easy because elements
in BST are stored in a specific order. The steps of searching a node in Binary
Search tree are listed as follows -

1. First, compare the element to be searched with the root element of


the tree.
2. If root is matched with the target element, then return the node's
location.

3. If it is not matched, then check whether the item is less than the root
element, if it is smaller than the root element, then move to the left
subtree.

4. If it is larger than the root element, then move to the right subtree.

5. Repeat the above procedure recursively until the match is found.

6. If the element is not found or not present in the tree, then return
NULL.

Now, let's understand the searching in binary tree using an example. We


are taking the binary search tree formed above. Suppose we have to find
node 20 from the below tree.

Step1:

Step2:
Step3:

Now, let's see the algorithm to search an element in the Binary search tree.

Algorithm to search an element in Binary search


tree
1. Search (root, item)
2. Step 1 - if (item = root → data) or (root = NULL)
3. return root
4. else if (item < root → data)
5. return Search(root → left, item)
6. else
7. return Search(root → right, item)
8. END if
9. Step 2 - END

Now let's understand how the deletion is performed on a binary search


tree. We will also see an example to delete an element from the given tree.

Deletion in Binary Search tree


In a binary search tree, we must delete a node from the tree while keeping in
mind that the property of the BST is not violated. To delete a node from a BST,
there are three possible situations that occur -

o The node to be deleted is the leaf node, or,

o The node to be deleted has only one child, and,

o The node to be deleted has two children

We will understand the situations listed above in detail.

When the node to be deleted is the leaf node

It is the simplest case to delete a node in BST. Here, we have to replace the
leaf node with NULL and simply free the allocated space.

We can see the process to delete a leaf node from BST in the below image.
In the below image, suppose we have to delete node 90. As the node to be
deleted is a leaf node, it will be replaced with NULL, and the allocated
space will be freed.
When the node to be deleted has only one child

In this case, we have to replace the target node with its child, and then
delete the child node. It means that after replacing the target node with its
child node, the child node will now contain the value to be deleted. So, we
simply have to replace the child node with NULL and free up the allocated
space.

We can see the process of deleting a node with one child from BST in the
below image. In the below image, suppose we have to delete node 79. As the
node to be deleted has only one child, it will be replaced with its
child 55.

So, the replaced node 79 will now be a leaf node that can be easily deleted.

When the node to be deleted has two children


This case of deleting a node in BST is a bit complex among other two cases.
In such a case, the steps to be followed are listed as follows -

o First, find the inorder successor of the node to be deleted.

o After that, replace that node with the inorder successor until the
target node is placed at the leaf of tree.

o And at last, replace the node with NULL and free up the allocated
space.

The inorder successor is required when the right child of the node is not
empty. We can obtain the inorder successor by finding the minimum
element in the right child of the node.

We can see the process of deleting a node with two children from BST in
the below image. In the below image, suppose we have to delete node 45
that is the root node, as the node to be deleted has two children, so it will
be replaced with its inorder successor. Now, node 45 will be at the leaf of
the tree so that it can be deleted easily.
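The inorder successor used here is simply the minimum (leftmost) node of the right subtree. A small C sketch of that helper is shown below, using a tiny hand-built tree with some of the values from the example above; the complete C++ program later in this chapter uses the same idea in its findMinimum() function.

#include <stdio.h>

struct node { int data; struct node *left, *right; };

/* The inorder successor of 'n' (when n->right exists) is the leftmost,
   i.e. the minimum, node of its right subtree. */
struct node *inorderSuccessor(struct node *n)
{
    struct node *cur = n->right;   /* assumed non-NULL in the two-children case */
    while (cur->left != NULL)
        cur = cur->left;
    return cur;
}

int main(void)
{
    /* 45 has a right subtree rooted at 79 whose left child is 55:
       in this small tree the successor of 45 is 55. */
    struct node n55 = {55, NULL, NULL}, n79 = {79, &n55, NULL}, n30 = {30, NULL, NULL};
    struct node root = {45, &n30, &n79};
    printf("inorder successor of 45 is %d\n", inorderSuccessor(&root)->data);
    return 0;
}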

Now let's understand how insertion is performed on a binary search tree.


Insertion in Binary Search tree
A new key in a BST is always inserted at a leaf. To insert an element in a BST,
we have to start searching from the root node; if the value to be inserted is
less than the root node, then search for an empty location in the left
subtree. Otherwise, search for an empty location in the right subtree and insert
the value. Insertion in a BST is similar to searching, as we always have to maintain
the rule that the left subtree is smaller than the root, and the right subtree is
larger than the root.

Now, let's see the process of inserting a node into BST using an example.
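As an illustrative sketch of this insertion logic (not the example figure itself), a recursive C version might look like the following, using the same key sequence as the construction example above; the helper names are assumptions.

#include <stdio.h>
#include <stdlib.h>

struct node { int data; struct node *left, *right; };

struct node *newNode(int v)
{
    struct node *n = malloc(sizeof *n);
    n->data = v; n->left = n->right = NULL;
    return n;
}

/* Insert 'item' into the BST rooted at 'root' and return the (possibly new) root.
   New keys always end up at a leaf position. */
struct node *insert(struct node *root, int item)
{
    if (root == NULL)
        return newNode(item);                     /* empty spot found: place the key here */
    if (item < root->data)
        root->left = insert(root->left, item);    /* smaller keys go to the left  */
    else
        root->right = insert(root->right, item);  /* larger or equal keys go right */
    return root;
}

int main(void)
{
    int keys[] = {45, 15, 79, 90, 10, 55, 12, 20, 50};   /* the sequence used above */
    struct node *root = NULL;
    for (int i = 0; i < 9; i++)
        root = insert(root, keys[i]);
    printf("root=%d, left=%d, right=%d\n", root->data, root->left->data, root->right->data);
    return 0;
}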
The complexity of the Binary Search tree
Let's see the time and space complexity of the Binary search tree. We will
see the time complexity for insertion, deletion, and searching operations in
best case, average case, and worst case.

1. Time Complexity
Operations    Best case time complexity    Average case time complexity    Worst case time complexity

Insertion     O(log n)                     O(log n)                        O(n)

Deletion      O(log n)                     O(log n)                        O(n)

Search        O(log n)                     O(log n)                        O(n)

Where 'n' is the number of nodes in the given tree.


2. Space Complexity
Operations Space complexity

Insertion O(n)

Deletion O(n)

Search O(n)

o The space complexity of all operations of Binary search tree is O(n).

Implementation of Binary search tree


Now, let's see the program to implement the operations of Binary Search
tree.

Program: Write a program to perform operations of Binary Search tree in


C++.

In this program, we will see the implementation of the operations of binary


search tree. Here, we will see the creation, inorder traversal, insertion, and
deletion operations of tree.

Here, we will see the inorder traversal of the tree to check whether the
nodes of the tree are in their proper location or not. We know that the
inorder traversal always gives us the data in ascending order. So, after
performing the insertion and deletion operations, we perform the inorder
traversal, and after traversing, if we get data in ascending order, then it is
clear that the nodes are in their proper location.

#include <iostream>
using namespace std;
struct Node {
    int data;
    Node *left;
    Node *right;
};
Node* create(int item)
{
    Node* node = new Node;
    node->data = item;
    node->left = node->right = NULL;
    return node;
}
/*Inorder traversal of the tree formed*/
void inorder(Node *root)
{
    if (root == NULL)
        return;
    inorder(root->left);          //traverse left subtree
    cout << root->data << " ";    //traverse root node
    inorder(root->right);         //traverse right subtree
}
Node* findMinimum(Node* cur)      /*To find the inorder successor*/
{
    while(cur->left != NULL) {
        cur = cur->left;
    }
    return cur;
}
Node* insertion(Node* root, int item)   /*Insert a node*/
{
    if (root == NULL)
        return create(item);      /*return new node if tree is empty*/
    if (item < root->data)
        root->left = insertion(root->left, item);
    else
        root->right = insertion(root->right, item);
    return root;
}
void search(Node* &cur, int item, Node* &parent)
{
    while (cur != NULL && cur->data != item)
    {
        parent = cur;
        if (item < cur->data)
            cur = cur->left;
        else
            cur = cur->right;
    }
}
void deletion(Node*& root, int item)    /*function to delete a node*/
{
    Node* parent = NULL;
    Node* cur = root;
    search(cur, item, parent);    /*find the node to be deleted*/
    if (cur == NULL)
        return;
    if (cur->left == NULL && cur->right == NULL)   /*When node has no children*/
    {
        if (cur != root)
        {
            if (parent->left == cur)
                parent->left = NULL;
            else
                parent->right = NULL;
        }
        else
            root = NULL;
        delete cur;
    }
    else if (cur->left && cur->right)
    {
        Node* succ = findMinimum(cur->right);
        int val = succ->data;
        deletion(root, succ->data);
        cur->data = val;
    }
    else
    {
        Node* child = (cur->left)? cur->left: cur->right;
        if (cur != root)
        {
            if (cur == parent->left)
                parent->left = child;
            else
                parent->right = child;
        }
        else
            root = child;
        delete cur;
    }
}
int main()
{
    Node* root = NULL;
    root = insertion(root, 45);
    root = insertion(root, 30);
    root = insertion(root, 50);
    root = insertion(root, 25);
    root = insertion(root, 35);
    root = insertion(root, 45);
    root = insertion(root, 60);
    root = insertion(root, 4);
    cout << "The inorder traversal of the given binary tree is - \n";
    inorder(root);
    deletion(root, 25);
    cout << "\nAfter deleting node 25, the inorder traversal of the given binary tree is - \n";
    inorder(root);
    insertion(root, 2);
    cout << "\nAfter inserting node 2, the inorder traversal of the given binary tree is - \n";
    inorder(root);
    return 0;
}

Output

After the execution of the above code, the output will be -


What is a Tree in Data Structure?

A tree is a hierarchical data structure in which each node is linked to child nodes
(sometimes referred to as successors); in a binary tree, each node has a maximum of
two child nodes. The topmost node in the tree is called the root node, and the
bottommost nodes are called leaves. Between the root and the leaves are the branch
nodes. A tree can be empty, have just a root node, or have many levels of
nodes. The root node is unique, and every other node has exactly one parent
node.

For example, in a family tree, each person has only one biological mother
and father, but they may have multiple grandparents, aunts, and uncles,
etc. In computer science and software programming, trees are often used
to represent the structure of HTML documents or file systems. They can also be used
to store data such as DNA sequences or mathematical expressions. Trees
are often implemented using pointers in programming languages such as
C++.

Types of Trees

After introducing trees in the data structure, we know they are used for
different purposes. Here is an overview of some of the most popular
types of trees in the data structure.

1. General Tree

A general tree is the most basic type of tree. It is made up of nodes that
can have any number of child nodes. There is no specific relationship
between the nodes; they can be traversed in any order. General trees are
used when the relationship between the nodes is not important.
2. Binary tree

A binary tree is a special type of tree where each and every node can
have no more than two child nodes. When the tree is kept ordered (as in a
binary search tree), the left child node is always less than the parent node,
and the right child node is greater than or equal to the parent node. Binary
trees are used when the nodes' relationship is important and needs to be kept in order.

3. Binary Search Tree

A binary search tree (BST) is a binary tree where every node has a value
greater than all the values in its left subtree and less than all the values in
its right subtree. BSTs are used when quickly searching for a value in a
large dataset is important.

4. AVL Tree

An AVL tree (named after its inventors Adelson-Velsky and Landis) is a


type of BST where each node has a value that is greater than all the
values in its left subtree and less than all the values in its right subtree. In
addition, an AVL tree must also be balanced, meaning that the difference
between the heights of the left subtree and right subtree must be no more
than 1. AVL trees are used when quickly searching for a value in a large
dataset is important and when maintaining balance is also important.

5. Red-Black Tree

A red-black tree (RBT) is another type of self-balancing BST where each


node has an extra bit associated with it that denotes its color (red or
black). In addition, certain constraints must be met for an RBT to be valid:
1) every leaf (NULL) node is black, 2) if a node is red, then both its
children must be black, 3) every simple path from any given node to any
of its descendant leaves contains the same number of black nodes. RBTs
are used when quickly searching for a value in a large dataset while also
maintaining balance and ensuring constraint satisfaction is important.

6. N-ary Tree

An n-ary tree (or k-ary tree) generalizes binary trees by allowing each
node to have no more than k children instead of just 2 children as in
BSTs/RBTs. N-ary trees are used when quick search times are
important and when the data does not fit well into a traditional binary tree
structure (i.e., when k > 2).

Basic Terminologies Used in Tree Data Structure

Understanding basic tree data structure terminologies is crucial for


anyone who wants to work with this type of data. Here, we will review
some of the most important terms that you need to know.

 Root: The root is the topmost node in a tree. It does not have a parent and
typically has zero or more child nodes.
 Child Node: A child node is any node that has a parent. Child nodes can have
their own children (sub-nodes), which makes them parent nodes as well.
 Parent: A parent is a node that has at least one child node. Parent nodes can
also have their own parents (super-nodes), making them child nodes.

 Sibling: Siblings are nodes that share the same parent node. They can be
thought of as "brothers and sisters" within the tree structure.
 Leaf Node: A leaf node is any node with no child nodes. Leaf nodes are
typically the "end" of a tree branch.
 Internal Nodes: An internal node is a node that has at least one child node.
Internal nodes are typically found "in-between" other nodes in a tree structure.
 Ancestor Node: An ancestor node is any node that is on the path from the root
to the current node. Ancestor nodes can be thought of as "parents, grandparents,
etc."
 Descendant: A descendant is a node that is a child, grandchild, great-
grandchild, etc., of the current node. In other words, a descendant is any node
that is "below" the current node in the tree structure.
 Height of a Node: The height of a node is the number of edges from the node to the
deepest leaf descendant. To put it another way, it is the "distance" from the
node to the bottom of the tree.
 Depth of a Node: The depth of a node is the number of edges from the root to the
node. Therefore, it is the "distance" from the root to the node.
 Height of a Tree: The height of a tree is the height of its root node.
To get an insider's view into the advanced terminologies of trees in the
data structure, you can look for Python Programming for beginners and
experts in online courses. You can get the best out of your knowledge with
the most reliable resources around.

Properties of Tree Data Structure

In computer science, a tree is a widely used data structure that simulates


a hierarchical tree structure with a set of linked nodes. A tree data
structure has the following properties:

1. Recursive data structure


2. Number of edges
3. Depth of node x
4. Height of node x
Read on to learn more about each of these properties of tree data
structure in detail!

1. Recursive Data Structure: A tree is a recursive data structure because it has a


root node, and every node has zero or more child nodes. The root node is the
topmost node in the tree, and the child nodes are located below the root node. If
each node in the tree has zero or more child nodes, then the tree is said to be an
n-ary tree.
2. Number of Edges: The number of edges in a tree is always one less than the
number of nodes. This is because every node, except the root, has exactly one edge
connecting it to its parent.
3. Depth of Node x: The depth of a node is defined as the length of the shortest
path from the root to that node. In other words, it is simply the number of edges
on the path from the root to that particular node.
4. Height of Node x: The height of a node is expressed as the length of the
longest path from the node to any leaf node. In other words, it is simply the
number of edges on the path from that particular node to the deepest leaf node.
By understanding these four properties, you will have a strong foundation
on which to build more complex applications using trees!

Applications
In computer science, the tree data structure can be used to store
hierarchical data. A tree traversal is a process of visiting each node in a
tree. This can be done in different ways like pre-order, post-order, or in-
order. Trees are also used to store data that naturally have hierarchical
relationships, like the file system on our computers. Besides that, trees
are also used in several applications like heaps, tries, and suffix trees.
Let's take a look at some of these applications:

1) Storing Naturally Hierarchical Data

One of the main applications of tree data structure is to store hierarchical


data. A lot of real-world data falls into this category. For instance, think
about the file system on your computer. The files and folders are stored
in a hierarchical fashion with a root folder (usually denoted by /). Each
subfolder can further have more subfolders and so on. So when you want
to store such data, a tree data structure is the most intuitive way to do it.

2) Organize Data

Trees can also be used as an organizational tool. For instance, a family


tree is one such example where family relationships are represented
using a tree-like structure. Similarly, trees can also be used to represent
geographical features like states and cities in the USA or Countries and
Continents in the world etc.

3) Trie

Trie is an efficient information retrieval data structure that is based on the


key concept of words being prefixes of other words. It's also known as a
radix or prefix tree. A Trie has three main properties – keys have
consistent length, keys are in lexicographical order, and no key is a prefix
of another key at the same level.

With these three conditions met, Trie provides an efficient way to retrieve
strings from a dataset with a time complexity of O(M), where M is the
length of the string retrieved. This makes it suitable for dictionary
operations like autocomplete or spell check etc., which have become very
popular these days with internet users all over the world.
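As a hedged sketch of how such a trie is commonly laid out for lowercase English words (the 26-way child array and field names are assumptions, and this is not a full dictionary implementation):

#include <stdio.h>
#include <stdlib.h>

#define ALPHABET 26              /* assumed: keys are lowercase English words */

struct trieNode
{
    struct trieNode *child[ALPHABET];  /* one branch per possible next letter */
    int isEndOfWord;                   /* 1 if a key ends at this node */
};

struct trieNode *newTrieNode(void)
{
    return calloc(1, sizeof(struct trieNode));   /* all children NULL, not end of word */
}

/* Insert a lowercase word: one node per character, O(M) for a word of length M. */
void trieInsert(struct trieNode *root, const char *word)
{
    for (; *word; word++)
    {
        int i = *word - 'a';
        if (root->child[i] == NULL)
            root->child[i] = newTrieNode();
        root = root->child[i];
    }
    root->isEndOfWord = 1;
}

int main(void)
{
    struct trieNode *root = newTrieNode();
    trieInsert(root, "tree");
    printf("inserted \"tree\" into the trie\n");
    return 0;
}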

4) Heap

A heap is a special type of complete binary tree in which every node satisfies one
heap property – the min-heap or the max-heap property. The
min-heap property states that every parent node must have a value less
than or equal to its child nodes, while the max-heap property specifies that
the parent node's value must be greater than or equal to the values of its
child nodes (in the case of two children).

So depending upon which property we need to enforce upon our nodes,


we call it a min heap or max heap accordingly. Since heaps are complete
binary trees, they provide a good performance guarantee for insertion
and deletion operations, with a time complexity of O(log N), where N is the
number of elements currently present inside our heap data structure.
Heaps also play an important role behind recommendation algorithms like
the "People Also Watched" section on Netflix or Amazon Prime.

5) B-Tree and T-Tree

B-trees and T-trees are two types of trees in the data structure that are
used to efficiently store large amounts of data. These trees are often
used in databases because they allow for quick insertion and deletion of
records while still maintaining fast access times.

6) Routing Table

A routing table maintains information about routes to particular network
destinations, possibly via multiple network links/routers. Storing the routes in a
tree allows efficient route discovery based on a partial match, instead of a
comparison against all known routes leading up to the destination, which reduces
routing overhead considerably, especially in high-volume traffic networks.

Difference Between the Binary Tree and the Binary


Search Tree

There are two main types of binary trees: the binary tree and the binary
search tree. Both types of trees in data structure have their own unique
characteristics and drawbacks.

The biggest difference between the two types of trees in the data
structure is in how they are structured. A binary tree is made up of nodes,
each of which can have zero, one, or two child nodes, with no ordering imposed
on their values. A binary search tree is made up of the same kind of nodes, but the
values are kept ordered: keys in the left subtree are smaller and keys in the right
subtree are larger. This difference in structure means that binary search trees are
more efficient when searching for data, while plain binary trees place fewer
constraints on insertions and deletions.

Another difference between the two types of trees in the data structure is
how they are traversed. Binary trees can be traversed in either a breadth-
first or a depth-first manner, while for binary search trees an in-order
(depth-first) traversal is usually preferred, because it visits the keys in sorted
order. The choice of traversal can be significant for performance, depending on
the shape of the tree and the operation being performed.

Overall, the choice of which tree to use depends on the application's


specific needs. Both kinds of trees have their own advantages and
disadvantages, so it is important to choose the type that best suits the
application's needs.

Advantages of Tree Data Structure


1. Speed

Trees offer quicker search, insertion, and deletion than other data
structures, such as linked lists, because of their shorter depth. For
example, to delete an element from a linked list, you need to traverse the
entire list until you find the element you want to delete, which could take
O(n) time.

However, if you know the value of the element you want to delete
beforehand, deleting it from a balanced search tree would only take O(log n) time,
since you can simply search for it and then remove it.

2. Flexibility

Trees do not have a fixed size like arrays, so they can grow and shrink as
needed, making them very flexible, especially when dealing with dynamic
data sets. For example, let's say you have an array of integers that can
hold 100 elements, and you want to add the 101st element to it. Still,
unfortunately, there's no more space left in the array, so you have to
create a larger array big enough to hold all 101 elements and then copy
all elements from the old array into a new one, which could be inefficient.
Furthermore, since trees do not have a fixed size, you could simply add
the 101st element without worrying about creating larger arrays or
copying data, making them more flexible than arrays.

3. Space Efficiency

Trees only require extra space for pointers since each node only needs to
store the address or reference of its child nodes, unlike arrays which
require extra space for every single element even if some of those
elements are not used yet. For example, let's say we have an array of
integers that can hold 1000 elements, but we only store 500 values. Then
half of our array's memory would go wasted, which is not very space
efficient.

In contrast, with trees, since each node only needs extra space for the
address or reference of its child nodes, we don't waste any memory even
if some parts of our tree are empty.
Tree Data Structure
We read the linear data structures like an array, linked list, stack and queue in which all
the elements are arranged in a sequential manner. The different data structures are used
for different kinds of data.

Some factors are considered for choosing the data structure:

o What type of data needs to be stored?: It might be a possibility that a certain


data structure can be the best fit for some kind of data.

o Cost of operations: If we want to minimize the cost for the operations for the
most frequently performed operations. For example, we have a simple list on
which we have to perform the search operation; then, we can create an array in
which elements are stored in sorted order to perform the binary search. The
binary search works very fast for the simple list as it divides the search space into
half.

o Memory usage: Sometimes, we want a data structure that utilizes less memory.

A tree is also one of the data structures that represent hierarchical data. Suppose we
want to show the employees and their positions in the hierarchical form then it can be
represented as shown below:
The above tree shows the organization hierarchy of some company. In the above
structure, john is the CEO of the company, and John has two direct reports named
as Steve and Rohan. Steve has three direct reports named Lee, Bob, Ella where Steve is
a manager. Bob has two direct reports named Sal and Emma. Emma has two direct
reports named Tom and Raj. Tom has one direct report named Bill. This particular
logical structure is known as a Tree. Its structure is similar to the real tree, so it is named
a Tree. In this structure, the root is at the top, and its branches are moving in a
downward direction. Therefore, we can say that the Tree data structure is an efficient
way of storing the data in a hierarchical way.

Let's understand some key points of the Tree data structure.

o A tree data structure is defined as a collection of objects or entities known as


nodes that are linked together to represent or simulate hierarchy.

o A tree data structure is a non-linear data structure because it does not store in a
sequential manner. It is a hierarchical structure as elements in a Tree are arranged
in multiple levels.
o In the Tree data structure, the topmost node is known as a root node. Each node
contains some data, and data can be of any type. In the above tree structure, the
node contains the name of the employee, so the type of data would be a string.

o Each node contains some data and the link or reference of other nodes that can
be called children.

Some basic terms used in Tree data structure.

Let's consider the tree structure, which is shown below:

In the above structure, each node is labeled with some number. Each arrow shown in the
above figure is known as a link between the two nodes.

o Root: The root node is the topmost node in the tree hierarchy. In other words,
the root node is the one that doesn't have any parent. In the above structure,
node numbered 1 is the root node of the tree. If a node is directly linked to
some other node, it would be called a parent-child relationship.

o Child node: If the node is a descendant of any node, then the node is known as a
child node.

o Parent: If the node contains any sub-node, then that node is said to be the
parent of that sub-node.

o Sibling: The nodes that have the same parent are known as siblings.

o Leaf Node:- The node of the tree, which doesn't have any child node, is called a
leaf node. A leaf node is the bottom-most node of the tree. There can be any
number of leaf nodes present in a general tree. Leaf nodes can also be called
external nodes.

o Internal nodes: A node that has at least one child node is known as an internal
node.
o Ancestor node:- An ancestor of a node is any predecessor node on a path from
the root to that node. The root node doesn't have any ancestors. In the tree
shown in the above image, nodes 1, 2, and 5 are the ancestors of node 10.

o Descendant: Any successor of a given node, i.e., any node on a downward path
from that node, is known as a descendant of the node. In the above figure, 10 is a
descendant of node 5.

Properties of Tree data structure


o Recursive data structure: The tree is also known as a recursive data structure.
A tree can be defined recursively because the distinguished node in a tree data
structure is known as a root node. The root node of the tree contains a link to all
the roots of its subtrees. The left subtree is shown in the yellow color in the below
figure, and the right subtree is shown in the red color. The left subtree can be
further split into subtrees shown in three different colors. Recursion means
reducing something in a self-similar manner. So, this recursive property of the
tree data structure is implemented in various applications.

o Number of edges: If there are n nodes, then there would be n-1 edges. Each arrow
in the structure represents the link or path. Each node, except the root node, has
exactly one incoming link, known as an edge, coming from its parent.

o Depth of node x: The depth of node x can be defined as the length of the path
from the root to the node x. One edge contributes one-unit length in the path.
So, the depth of node x can also be defined as the number of edges between the
root node and the node x. The root node has 0 depth.

o Height of node x: The height of node x can be defined as the number of edges on
the longest path from the node x to a leaf node.

Based on the properties of the Tree data structure, trees are classified into various
categories.

Implementation of Tree
The tree data structure can be created by creating the nodes dynamically with the help
of the pointers. The tree in the memory can be represented as shown below:
The above figure shows the representation of the tree data structure in the memory. In
the above structure, the node contains three fields. The second field stores the data; the
first field stores the address of the left child, and the third field stores the address of the
right child.

In programming, the structure of a node can be defined as:

1. struct node
2. {
3. int data;
4. struct node *left;
5. struct node *right;
6. };

The above structure can only be defined for binary trees because a binary tree can
have at most two children, and generic trees can have more than two children. The
structure of the node for generic trees would be different as compared to the binary
tree.
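As a rough illustration of that difference, one common way to declare a generic tree node in C is the first-child / next-sibling representation, where each node stores a pointer to its first child and to its next sibling, so it can have any number of children. This is only a sketch; the type name gtnode and its fields are illustrative and are not defined elsewhere in this text.

struct gtnode
{
    int data;                      /* data stored in the node */
    struct gtnode *firstChild;     /* pointer to the leftmost child */
    struct gtnode *nextSibling;    /* pointer to the next child of this node's parent */
};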

Applications of trees
The following are the applications of trees:

o Storing naturally hierarchical data: Trees are used to store the data in the
hierarchical structure. For example, the file system. The file system stored on the
disc drive, the file and folder are in the form of the naturally hierarchical data and
stored in the form of trees.

o Organize data: It is used to organize data for efficient insertion, deletion and
searching. For example, a balanced binary search tree takes O(log N) time to search for an element.

o Trie: It is a special kind of tree that is used to store the dictionary. It is a fast and
efficient way for dynamic spell checking.

o Heap: It is also a tree data structure implemented using arrays. It is used to


implement priority queues.

o B-Tree and B+Tree: B-Tree and B+Tree are the tree data structures used to
implement indexing in databases.

o Routing table: The tree data structure is also used to store the data in routing
tables in the routers.

Types of Tree data structure


The following are the types of a tree data structure:

o General tree: The general tree is one of the types of tree data structure. In a
general tree, a node can have any number of children, from 0 up to n. There is
no restriction imposed on the degree of a node (the number of children that a
node can have). The topmost node in a general tree is known as the root node.
The trees rooted at the children of a node are known as its subtrees.
There can be n number of subtrees in a general tree. In the general tree, the
subtrees are unordered because the children of a node cannot be ordered.
Every non-empty tree has a downward edge, and these edges are connected to
the nodes known as child nodes. The root node is labeled with level 0. The nodes
that have the same parent are known as siblings.

o Binary tree: Here, the name binary itself suggests two. In a binary tree, each node
can have at most two child nodes, i.e., a node may have 0, 1 or 2 children.
To know more about the binary tree, click on the link given below:
https://www.javatpoint.com/binary-tree

o Binary Search tree: A binary search tree is a non-linear, node-based data structure
in which each node is connected to at most two other nodes. A node can be
represented in a binary search tree with three fields, i.e., data part,
left-child, and right-child. A node can be connected to at most two child
nodes in a binary search tree, so the node contains two pointers (left child and
right child pointer).
Every node in the left subtree must contain a value less than the value of the root
node, and the value of each node in the right subtree must be bigger than the
value of the root node.

A node can be created with the help of a user-defined data type known as struct, as
shown below:
1. struct node
2. {
3. int data;
4. struct node *left;
5. struct node *right;
6. };

The above is the node structure with three fields: the first field is the data, the second
field is the left pointer of the node type, and the third field is the right pointer of the node type.
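To illustrate how the ordering property above is used, the following is a minimal sketch of searching a binary search tree built from the struct node shown above; the function name search() is an illustrative assumption, not something defined earlier in this text.

struct node* search(struct node* root, int key)
{
    if (root == NULL || root->data == key)
        return root;                        /* empty subtree, or the key was found */
    if (key < root->data)
        return search(root->left, key);     /* smaller keys lie in the left subtree */
    return search(root->right, key);        /* larger keys lie in the right subtree */
}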

To know more about the binary search tree, click on the link given below:

https://www.javatpoint.com/binary-search-tree

o AVL tree

It is one of the types of the binary tree, or we can say that it is a variant of the binary
search tree. AVL tree satisfies the property of the binary tree as well as of the binary
search tree. It is a self-balancing binary search tree that was invented by Adelson-Velsky
and Landis. Here, self-balancing means balancing the heights of the left subtree and the
right subtree. This balancing is measured in terms of the balancing factor.

We can consider a tree as an AVL tree if the tree obeys the binary search tree property as well
as the balancing factor condition. The balancing factor can be defined as the difference between the
height of the left subtree and the height of the right subtree. The balancing factor's
value must be either 0, -1, or 1; therefore, each node in the AVL tree should have the
value of the balancing factor either as 0, -1, or 1.
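As a small sketch of how the balancing factor could be computed for the struct node type shown earlier, the helper functions below return the height of a subtree and the difference between the left and right subtree heights; height() and balanceFactor() are illustrative names, not part of the text above.

int height(struct node *n)
{
    if (n == NULL)
        return -1;                      /* an empty subtree has height -1, so a leaf has height 0 */
    int lh = height(n->left);
    int rh = height(n->right);
    return (lh > rh ? lh : rh) + 1;     /* one more than the taller subtree */
}

int balanceFactor(struct node *n)
{
    /* height of the left subtree minus height of the right subtree;
       in an AVL tree this must be -1, 0 or 1 at every node */
    return height(n->left) - height(n->right);
}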

To know more about the AVL tree, click on the link given below:

https://www.javatpoint.com/avl-tree

o Red-Black Tree

The Red-Black tree is a binary search tree. The prerequisite of the Red-Black tree is
that we should know about the binary search tree. In a binary search tree, the value of
the left-subtree should be less than the value of that node, and the value of the right-
subtree should be greater than the value of that node. As we know, the time
complexity of searching in a binary search tree in the average case is log2n, the best case is O(1),
and the worst case is O(n).

When any operation is performed on the tree, we want our tree to be balanced so that
all the operations like searching, insertion, deletion, etc., take less time, and all these
operations will have the time complexity of log2n.

The red-black tree is a self-balancing binary search tree. The AVL tree is also a height-
balanced binary search tree, so why do we require a Red-Black tree? In the AVL tree,
we do not know in advance how many rotations will be required to balance the tree, but in the
Red-Black tree, a maximum of 2 rotations is required to rebalance the tree. It contains
one extra bit that represents either the red or black color of a node to ensure the
balancing of the tree.

o Splay tree

The splay tree data structure is also a binary search tree in which a recently accessed
element is moved to the root position of the tree by performing some rotation operations.
Here, splaying means moving the recently accessed node to the root. It is a self-balancing
binary search tree having no explicit balance condition like the AVL tree.

It might be possible that the height of the splay tree is not balanced, i.e., the heights of the
left and right subtrees may differ, but the operations in a splay tree take O(log n)
amortized time, where n is the number of nodes.

A splay tree is balanced only in this amortized sense; it cannot be considered a height-balanced
tree, because the rotations performed after each operation move recently accessed nodes
towards the root rather than keeping the overall height within a fixed bound.

o Treap

Treap data structure came from the Tree and Heap data structure. So, it comprises the
properties of both Tree and Heap data structures. In Binary search tree, each node on
the left subtree must be equal or less than the value of the root node and each node on
the right subtree must be equal or greater than the value of the root node. In a min-heap data
structure, both the right and left subtrees contain larger keys than the root; therefore, we
can say that the root node contains the lowest value.

In treap data structure, each node has both key and priority where key is derived from
the Binary search tree and priority is derived from the heap data structure.

The Treap data structure follows two properties which are given below:
o Right child of a node >= current node and left child of a node <= current node
(binary search tree property on the keys)

o The priority of the children of any node must be greater than or equal to the priority of
that node (min-heap property on the priorities)
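Based on this description, a minimal treap node could be declared as below; the type and field names are illustrative assumptions.

struct TreapNode
{
    int key;                       /* ordered according to the binary search tree property */
    int priority;                  /* ordered according to the (min-)heap property */
    struct TreapNode *left;
    struct TreapNode *right;
};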

o B-tree

B-tree is a balanced m-way tree where m defines the order of the tree. Till now, we read
that the node contains only one key but b-tree can have more than one key, and more
than 2 children. It always maintains the sorted data. In binary tree, it is possible that leaf
nodes can be at different levels, but in b-tree, all the leaf nodes must be at the same
level.

If order is m then node has the following properties:

o Each node in a b-tree can have maximum m children

o For minimum children, a leaf node has 0 children, the root node has a minimum of 2
children and an internal node has a minimum of ceiling of m/2 children. For example, if the
value of m is 5, then a node can have at most 5 children and an internal node must
contain at least 3 children.

o Each node has maximum (m-1) keys.

The root node must contain a minimum of 1 key and all other nodes must contain
at least ceiling of m/2 minus 1 keys.
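Putting these properties together, a node of a b-tree of order m could be sketched in C as below (here M stands for m and is set to 5 only for illustration; the type and field names are assumptions, not taken from the text above).

#define M 5                               /* order of the b-tree */

struct BTreeNode
{
    int numKeys;                          /* number of keys currently stored, at most M - 1 */
    int keys[M - 1];                      /* keys, kept in sorted order */
    struct BTreeNode *children[M];        /* at most M child pointers (unused in a leaf) */
    int isLeaf;                           /* 1 if this node is a leaf, 0 otherwise */
};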
Graph
A graph can be defined as a group of vertices and edges that are used to connect these
vertices. A graph can be seen as a generalization of a tree that may contain cycles, where the
vertices (nodes) can maintain any complex relationship among them instead of only a
parent-child relationship.

Definition
A graph G can be defined as an ordered set G(V, E) where V(G) represents the set of
vertices and E(G) represents the set of edges which are used to connect these vertices.

A Graph G(V, E) with 5 vertices (A, B, C, D, E) and six edges ((A,B), (B,C), (C,E), (E,D), (D,B),
(D,A)) is shown in the following figure.

Directed and Undirected Graph


A graph can be directed or undirected. In an undirected graph, edges are not
associated with any direction. An undirected graph is shown in the above
figure since its edges are not attached with any of the directions. If an edge exists
between vertex A and B then the vertices can be traversed from B to A as well as A to B.

In a directed graph, edges form an ordered pair. Edges represent a specific path from
some vertex A to another vertex B. Node A is called initial node while node B is called
terminal node.

A directed graph is shown in the following figure.


Graph Terminology
Path
A path can be defined as the sequence of nodes that are followed in order to reach
some terminal node V from the initial node U.

Closed Path
A path will be called a closed path if the initial node is the same as the terminal node, i.e.,
if V0 = VN.

Simple Path
If all the nodes of the path are distinct, the path is called a simple path. If all the nodes are
distinct with the exception V0 = VN, then such a path P is called a closed simple path.

Cycle
A cycle can be defined as the path which has no repeated edges or vertices except the
first and last vertices.

Connected Graph
A connected graph is the one in which some path exists between every two vertices (u,
v) in V. There are no isolated nodes in connected graph.
Complete Graph
A complete graph is the one in which every node is connected with all other nodes. A
complete graph contains n(n-1)/2 edges where n is the number of nodes in the graph.

Weighted Graph
In a weighted graph, each edge is assigned with some data such as length or weight.
The weight of an edge e can be given as w(e) which must be a positive (+) value
indicating the cost of traversing the edge.

Digraph
A digraph is a directed graph in which each edge of the graph is associated with some
direction and the traversing can be done only in the specified direction.

Loop
An edge whose two end points are the same node is called a loop.

Adjacent Nodes
If two nodes u and v are connected via an edge e, then the nodes u and v are called as
neighbours or adjacent nodes.

Degree of the Node


The degree of a node is the number of edges that are connected with that node. A node
with degree 0 is called an isolated node.
Graph representation
In this article, we will discuss the ways to represent the graph. By Graph representation,
we simply mean the technique to be used to store some graph into the computer's
memory.

A graph is a data structure that consists of a set of vertices (called nodes) and a set of edges. There
are two ways to store Graphs into the computer's memory:

o Sequential representation (or, Adjacency matrix representation)

o Linked list representation (or, Adjacency list representation)

In sequential representation, an adjacency matrix is used to store the graph, whereas in
linked list representation, an adjacency list is used to store the graph.

In this tutorial, we will discuss each one of them in detail.

Now, let's start discussing the ways of representing a graph in the data structure.

Sequential representation
In sequential representation, there is a use of an adjacency matrix to represent the
mapping between vertices and edges of the graph. We can use an adjacency matrix to
represent the undirected graph, directed graph, weighted directed graph, and weighted
undirected graph.

If adj[i][j] = w, it means that an edge exists from vertex i to vertex j with weight
w.

An entry Aij in the adjacency matrix representation of an undirected graph G will be 1 if
an edge exists between Vi and Vj. If an undirected graph G consists of n vertices, then
the adjacency matrix for that graph is n x n, and the matrix A = [aij] can be defined as -

aij = 1 {if there is an edge from Vi to Vj}

aij = 0 {Otherwise}

It means that, in an adjacency matrix, 0 represents that no edge exists between two
nodes, whereas 1 represents the existence of an edge between two vertices.
If there is no self-loop present in the graph, it means that the diagonal entries of the
adjacency matrix will be 0.

Now, let's see the adjacency matrix representation of an undirected graph.

In the above figure, an image shows the mapping among the vertices (A, B, C, D, E), and
this mapping is represented by using the adjacency matrix.
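The figure itself is not reproduced here, but for the undirected graph G(V, E) defined earlier, with vertices (A, B, C, D, E) and edges (A,B), (B,C), (C,E), (E,D), (D,B), (D,A), the adjacency matrix would look like this:

      A   B   C   D   E
 A    0   1   0   1   0
 B    1   0   1   1   0
 C    0   1   0   0   1
 D    1   1   0   0   1
 E    0   0   1   1   0

The matrix is symmetric because the graph is undirected, and the diagonal entries are 0 because the graph has no self-loops.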

There exist different adjacency matrices for the directed and undirected graph. In a
directed graph, an entry Aij will be 1 only when there is an edge directed from Vi to Vj.

Adjacency matrix for a directed graph


In a directed graph, edges represent a specific path from one vertex to another vertex.
Suppose an edge goes from vertex A to another vertex B; it means that node A is the
initial node, while node B is the terminal node.

Consider the below-directed graph and try to construct the adjacency matrix of it.
In the above graph, we can see there is no self-loop, so the diagonal entries of the
adjacency matrix are 0.

Adjacency matrix for a weighted directed graph

It is similar to an adjacency matrix representation of a directed graph except that, instead
of using '1' for the existence of an edge, here we have to use the weight associated
with the edge. The weights on the graph edges will be represented as the entries of the
adjacency matrix. We can understand it with the help of an example. Consider the below
graph and its adjacency matrix representation. In the representation, we can see that the
weight associated with the edges is represented as the entries in the adjacency matrix.

In the above image, we can see that the adjacency matrix representation of the
weighted directed graph is different from other representations. It is because, in this
representation, the non-zero values are replaced by the actual weight assigned to the
edges.

An adjacency matrix is easier to implement and follow. An adjacency matrix can be used
when the graph is dense and the number of edges is large.

Although it is advantageous to use an adjacency matrix, it consumes more space.
Even if the graph is sparse, the matrix still consumes the same amount of space.

Linked list representation


An adjacency list is used in the linked representation to store the Graph in the
computer's memory. It is efficient in terms of storage as we only have to store the values
for edges.

Let's see the adjacency list representation of an undirected graph.

In the above figure, we can see that there is a linked list or adjacency list for every node
of the graph. From vertex A, there are paths to vertex B and vertex D. These nodes are
linked to node A in the given adjacency list.

An adjacency list is maintained for each node present in the graph, which stores the
node value and a pointer to the next adjacent node to the respective node. If all the
adjacent nodes are traversed, then store the NULL in the pointer field of the last node of
the list.

The sum of the lengths of adjacency lists is equal to twice the number of edges present
in an undirected graph.
Now, consider the directed graph, and let's see the adjacency list representation of that
graph.

For a directed graph, the sum of the lengths of adjacency lists is equal to the number of
edges present in the graph.

Now, consider the weighted directed graph, and let's see the adjacency list
representation of that graph.

In the case of a weighted directed graph, each node contains an extra field that stores
the weight of the edge.
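A small sketch of such an adjacency-list node, extended with the edge weight, is shown below; the type and field names are illustrative assumptions.

struct WeightedAdjNode
{
    int dest;                            /* the adjacent vertex */
    int weight;                          /* weight of the edge leading to dest */
    struct WeightedAdjNode *next;        /* next adjacent vertex, or NULL */
};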

In an adjacency list, it is easy to add a vertex. Because of using the linked list, it also
saves space.

Implementation of adjacency matrix representation of Graph


Now, let's see the implementation of adjacency matrix representation of graph in C.
In this program, there is an adjacency matrix representation of an undirected graph. It
means that if an edge exists from vertex A to vertex B, an edge also exists
from vertex B to vertex A.

Here, the graph is undirected with four vertices, and five edge insertions are made (one of them repeats the edge between vertices 0 and 2).

1. /* Adjacency Matrix representation of an undirected graph in C */


2.
3. #include <stdio.h>
4. #define V 4 /* number of vertices in the graph */
5.
6. /* function to initialize the matrix to zero */
7. void init(int arr[][V]) {
8. int i, j;
9. for (i = 0; i < V; i++)
10. for (j = 0; j < V; j++)
11. arr[i][j] = 0;
12. }
13.
14. /* function to add edges to the graph */
15. void insertEdge(int arr[][V], int i, int j) {
16. arr[i][j] = 1;
17. arr[j][i] = 1;
18. }
19.
20. /* function to print the matrix elements */
21. void printAdjMatrix(int arr[][V]) {
22. int i, j;
23. for (i = 0; i < V; i++) {
24. printf("%d: ", i);
25. for (j = 0; j < V; j++) {
26. printf("%d ", arr[i][j]);
27. }
28. printf("\n");
29. }
30. }
31.
32. int main() {
33. int adjMatrix[V][V];
34.
35. init(adjMatrix);
36. insertEdge(adjMatrix, 0, 1);
37. insertEdge(adjMatrix, 0, 2);
38. insertEdge(adjMatrix, 1, 2);
39. insertEdge(adjMatrix, 2, 0);
40. insertEdge(adjMatrix, 2, 3);
41.
42. printAdjMatrix(adjMatrix);
43.
44. return 0;
45. }

Output:

After the execution of the above code, the output will be -
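Tracing the insertions above (edges 0-1, 0-2, 1-2 and 2-3, with the edge between 0 and 2 inserted twice), the program prints one row of the matrix per vertex:

0: 0 1 1 0
1: 1 0 1 0
2: 1 1 0 1
3: 0 0 1 0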

Implementation of adjacency list representation of Graph


Now, let's see the implementation of adjacency list representation of graph in C.

In this program, there is an adjacency list representation of an undirected graph. It
means that if an edge exists from vertex A to vertex B, an edge also exists
from vertex B to vertex A.
1. /* Adjacency list representation of a graph in C */
2. #include <stdio.h>
3. #include <stdlib.h>
4.
5. /* structure to represent a node of adjacency list */
6. struct AdjNode {
7. int dest;
8. struct AdjNode* next;
9. };
10.
11. /* structure to represent an adjacency list */
12. struct AdjList {
13. struct AdjNode* head;
14. };
15.
16. /* structure to represent the graph */
17. struct Graph {
18. int V; /*number of vertices in the graph*/
19. struct AdjList* array;
20. };
21.
22.
23. struct AdjNode* newAdjNode(int dest)
24. {
25. struct AdjNode* newNode = (struct AdjNode*)malloc(sizeof(struct AdjNode));
26. newNode->dest = dest;
27. newNode->next = NULL;
28. return newNode;
29. }
30.
31. struct Graph* createGraph(int V)
32. {
33. struct Graph* graph = (struct Graph*)malloc(sizeof(struct Graph));
34. graph->V = V;
35. graph->array = (struct AdjList*)malloc(V * sizeof(struct AdjList));
36.
37. /* Initialize each adjacency list as empty by making head as NULL */
38. int i;
39. for (i = 0; i < V; ++i)
40. graph->array[i].head = NULL;
41. return graph;
42. }
43.
44. /* function to add an edge to an undirected graph */
45. void addEdge(struct Graph* graph, int src, int dest)
46. {
47. /* Add an edge from src to dest. The node is added at the beginning */
48. struct AdjNode* check = NULL;
49. struct AdjNode* newNode = newAdjNode(dest);
50.
51. if (graph->array[src].head == NULL) {
52. newNode->next = graph->array[src].head;
53. graph->array[src].head = newNode;
54. }
55. else {
56.
57. check = graph->array[src].head;
58. while (check->next != NULL) {
59. check = check->next;
60. }
61. // graph->array[src].head = newNode;
62. check->next = newNode;
63. }
64.
65. /* Since graph is undirected, add an edge from dest to src also */
66. newNode = newAdjNode(src);
67. if (graph->array[dest].head == NULL) {
68. newNode->next = graph->array[dest].head;
69. graph->array[dest].head = newNode;
70. }
71. else {
72. check = graph->array[dest].head;
73. while (check->next != NULL) {
74. check = check->next;
75. }
76. check->next = newNode;
77. }
78. }
79. /* function to print the adjacency list representation of graph*/
80. void print(struct Graph* graph)
81. {
82. int v;
83. for (v = 0; v < graph->V; ++v) {
84. struct AdjNode* pCrawl = graph->array[v].head;
85. printf("\n The Adjacency list of vertex %d is: \n head ", v);
86. while (pCrawl) {
87. printf("-> %d", pCrawl->dest);
88. pCrawl = pCrawl->next;
89. }
90. printf("\n");
91. }
92. }
93.
94. int main()
95. {
96.
97. int V = 5; /* vertices are numbered 0 to 4, matching the edges added below */
98. struct Graph* g = createGraph(V);
99. addEdge(g, 0, 1);
100. addEdge(g, 0, 3);
101. addEdge(g, 1, 2);
102. addEdge(g, 1, 3);
103. addEdge(g, 2, 4);
104. addEdge(g, 2, 3);
105. addEdge(g, 3, 4);
106. print(g);
107. return 0;
108. }

Output:

In the output, we will see the adjacency list representation of all the vertices of the
graph. After the execution of the above code, the output will be -
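With V set to 5 as in the corrected program above, tracing the addEdge() calls gives the following adjacency lists:

 The Adjacency list of vertex 0 is:
 head -> 1 -> 3

 The Adjacency list of vertex 1 is:
 head -> 0 -> 2 -> 3

 The Adjacency list of vertex 2 is:
 head -> 1 -> 4 -> 3

 The Adjacency list of vertex 3 is:
 head -> 0 -> 1 -> 2 -> 4

 The Adjacency list of vertex 4 is:
 head -> 2 -> 3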
Introduction to Graphs – Data Structure and
Algorithm
A Graph is a non-linear data structure consisting of vertices and edges. The vertices are sometimes
also referred to as nodes and the edges are lines or arcs that connect any two nodes in the graph.
More formally a Graph is composed of a set of vertices( V ) and a set of edges( E ). The graph is
denoted by G(V, E).
Graph data structures are a powerful tool for representing and analyzing complex relationships
between objects or entities. They are particularly useful in fields such as social network analysis,
recommendation systems, and computer networks. In the field of sports data science, graph data
structures can be used to analyze and understand the dynamics of team performance and player
interactions on the field.

Components of a Graph
 Vertices: Vertices are the fundamental units of the graph. Sometimes, vertices are also known as
vertex or nodes. Every node/vertex can be labeled or unlabelled.
 Edges: Edges are drawn or used to connect two nodes of the graph. It can be ordered pair of
nodes in a directed graph. Edges can connect any two nodes in any possible way. There are no
rules. Sometimes, edges are also known as arcs. Every edge can be labelled/unlabelled.

Types Of Graph
1. Null Graph
A graph is known as a null graph if there are no edges in the graph.
2. Trivial Graph
Graph having only a single vertex, it is also the smallest graph possible.
3. Undirected Graph
A graph in which edges do not have any direction. That is the nodes are unordered
pairs in the definition of every edge.
4. Directed Graph
A graph in which edge has direction. That is the nodes are ordered pairs in the
definition of every edge.

5. Connected Graph
The graph in which from one node we can visit any other node in the graph is known
as a connected graph.
6. Disconnected Graph
The graph in which at least one node is not reachable from a node is known as a
disconnected graph.
7. Regular Graph
The graph in which the degree of every vertex is equal to K is called K regular graph.
8. Complete Graph
The graph in which from each node there is an edge to each other node.

9. Cycle Graph
The graph in which the graph is a cycle in itself, the degree of each vertex is 2.
10. Cyclic Graph
A graph containing at least one cycle is known as a Cyclic graph.
11. Directed Acyclic Graph
A Directed Graph that does not contain any cycle.
12. Bipartite Graph
A graph in which the vertices can be divided into two sets such that no edge connects
two vertices within the same set.

13. Weighted Graph


 A graph in which the edges are already specified with suitable weight is known as
a weighted graph.
 Weighted graphs can be further classified as directed weighted graphs and
undirected weighted graphs.
Representation of Graphs
There are two ways to store a graph:
 Adjacency Matrix
 Adjacency List
Adjacency Matrix
In this method, the graph is stored in the form of the 2D matrix where rows and
columns denote vertices. Each entry in the matrix represents the weight of the edge
between those vertices.

Adjacency List
This graph is represented as a collection of linked lists. There is an array of pointers
in which each pointer points to the list of edges connected to that vertex.

Comparison between Adjacency Matrix and Adjacency List


When the graph contains a large number of edges (i.e., it is dense), it is good to store it as a
matrix because only a few entries in the matrix will be empty. For algorithms such
as Prim's and Dijkstra's on dense graphs, the adjacency matrix is used to have less complexity.
Action Adjacency Matrix Adjacency List

Adding Edge O(1) O(1)

Removing an edge O(1) O(N)

Initializing O(N*N) O(N)

Basic Operations on Graphs


Below are the basic operations on the graph:
 Insertion of Nodes/Edges in the graph – Insert a node into the graph.
 Deletion of Nodes/Edges in the graph – Delete a node from the graph.
 Searching on Graphs – Search an entity in the graph.
 Traversal of Graphs – Traversing all the nodes in the graph.
Usage of graphs
 Maps can be represented using graphs and then can be used by computers to
provide various services like the shortest path between two cities.
 When various tasks depend on each other then this situation can be represented
using a Directed Acyclic graph and we can find the order in which tasks can be
performed using topological sort.
 State Transition Diagram represents what can be the legal moves from current
states. In the game of tic-tac-toe, this can be used.
Real-Life Applications of Graph

Following are the real-life applications:


 Graph data structures can be used to represent the interactions between players
on a team, such as passes, shots, and tackles. Analyzing these interactions can
provide insights into team dynamics and areas for improvement.
 Commonly used to represent social networks, such as networks of friends on
social media.
 Graphs can be used to represent the topology of computer networks, such as the
connections between routers and switches.
 Graphs are used to represent the connections between different places in a
transportation network, such as roads and airports.
 Neural Networks: Vertices represent neurons and edges represent the synapses
between them. Neural networks are used to understand how our brain works and
how connections change when we learn. The human brain has about 10^11
neurons and close to 10^15 synapses.
 Compilers: Graphs are used extensively in compilers. They can be used for type
inference, for so-called data flow analysis, register allocation, and many other
purposes. They are also used in specialized compilers, such as query
optimization in database languages.
 Robot planning: Vertices represent states the robot can be in and the edges the
possible transitions between the states. Such graph plans are used, for example,
in planning paths for autonomous vehicles.
When to use Graphs:
 When you need to represent and analyze the relationships between different
objects or entities.
 When you need to perform network analysis.
 When you need to identify key players, influencers or bottlenecks in a system.
 When you need to make predictions or recommendations.
 Modeling networks: Graphs are commonly used to model various types of
networks, such as social networks, transportation networks, and computer
networks. In these cases, vertices represent nodes in the network, and edges
represent the connections between them.
Advantages and Disadvantages:

Advantages:

1. Graphs are a versatile data structure that can be used to represent a wide range
of relationships and data structures.
2. They can be used to model and solve a wide range of problems, including
pathfinding, data clustering, network analysis, and machine learning.
3. Graph algorithms are often very efficient and can be used to solve complex
problems quickly and effectively.
4. Graphs can be used to represent complex data structures in a simple and intuiti ve
way, making them easier to understand and analyze.

Disadvantages:

1. Graphs can be complex and difficult to understand, especially for people who are
not familiar with graph theory or related algorithms.
2. Creating and manipulating graphs can be computationally expensive, especially
for very large or complex graphs.
3. Graph algorithms can be difficult to design and implement correctly, and can be
prone to bugs and errors.
4. Graphs can be difficult to visualize and analyze, especially for very large or
complex graphs, which can make it challenging to extract meaningful insights
from the data.
Applications of Graphs in Data Structure
Graphs data structures have a variety of applications. Some of the most popular
applications are:

 Helps to define the flow of computation of software programs.


 Used in Google maps for building transportation systems. In google maps, the
intersection of two or more roads represents the node while the road connecting two
nodes represents an edge. Google maps algorithm uses graphs to calculate the
shortest distance between two vertices.
 Used in social networks such as Facebook and Linkedin.
 Operating Systems use Resource Allocation Graph where every process and
resource acts as a node. While we draw edges from resources to the allocated
process.
 Used in the world wide web where the web pages represent the nodes.
 Blockchains also use graphs. The nodes are blocks that store many transactions
while the edges connect subsequent blocks.
 Used in modeling data.
Applications of Dijkstra’s shortest path
algorithm
Dijkstra’s algorithm is one of the most popular algorithms for solving the single-
source shortest path problem in graphs with non-negative edge weights, i.e., finding
the shortest distance from one vertex to the other vertices of a graph. It was conceived by
computer scientist Edsger W. Dijkstra in 1956 and published three years later.
Dijkstra’s Algorithm has several real-world use cases, some of which are as follows:
1. Digital Mapping Services in Google Maps: Many times we have tried to find the
distance in G-Maps, from one city to another, or from your location to the nearest
desired location. There encounters the Shortest Path Algorithm, as there are
various routes/paths connecting them but it has to show the minimum distance, so
Dijkstra’s Algorithm is used to find the minimum distance between two locations
along the path. Consider India as a graph and represent a city/place with a vertex
and the route between two cities/places as an edge, then by using this algorithm,
the shortest routes between any two cities/places or from one city/place to
another city/place can be calculated.
2. Social Networking Applications: In many applications you might have seen that the
app suggests a list of friends that a particular user may know. How do you think
many social media companies implement this feature efficiently, especially when
the system has over a billion users? The standard Dijkstra algorithm can be
applied using the shortest path between users measured through handshakes or
connections among them. When the social networking graph is very small, it uses
the standard Dijkstra’s algorithm along with some other features to find the shortest
paths; however, when the graph becomes bigger and bigger, the standard
algorithm takes several seconds and alternative advanced algorithms are used.
3. Telephone Network: As we know, in a telephone network, each line has a
bandwidth, ‘b’. The bandwidth of the transmission line is the highest frequency
that line can support. Generally, if the frequency of the signal is higher in a certain
line, the signal is reduced by that line. Bandwidth represents the amount of
information that can be transmitted by the line. If we imagine a city to be a graph,
the vertices represent the switching stations, and the edges represent the
transmission lines and the weight of the edges represents ‘b’. So, as you can see,
it falls into the category of shortest distance problems, for which Dijkstra’s algorithm
can be used.
4. IP routing to find Open shortest Path First: Open Shortest Path First (OSPF) is
a link-state routing protocol that is used to find the best path between the source
and the destination router using its own Shortest Path First. Dijkstra’s algorithm is
widely used in the routing protocols required by the routers to update their
forwarding table. The algorithm provides the shortest cost path from the source
router to other routers in the network.
5. Flighting Agenda: For example, If a person needs software for making an
agenda of flights for customers. The agent has access to a database with all
airports and flights. Besides the flight number, origin airport, and destination, the
flights have departure and arrival time. Specifically, the agent wants to determine
the earliest arrival time for the destination given an origin airport and start time.
There this algorithm comes into use.
6. Designate file server: To designate a file server in a LAN (local area network),
Dijkstra’s algorithm can be used. Consider that transmitting a file from one computer
to another computer takes a significant amount of time. Therefore, to minimize the
number of “hops” from the file server to every other computer on the network, the
idea is to use Dijkstra’s algorithm to find the shortest paths between the computers,
resulting in the minimum number of hops.
7. Robotic Path: Nowadays, drones and robots have come into existence, some of
which are manual, some automated. The drones/robots which are automated and
are used to deliver the packages to a specific location or used for a task are
loaded with this algorithm module so that when the source and destination is
known, the robot/drone moves in the ordered direction by following the shortest
path to keep delivering the package in a minimum amount of time.
What is Dijkstra’s Algorithm?

What if you are provided with a graph of nodes where every node is linked to several

other nodes with varying distances? Now, if you begin from one of the nodes in the graph,

what is the shortest path to every other node in the graph?

Well simply explained, an algorithm that is used for finding the shortest distance, or

path, from starting node to target node in a weighted graph is known as Dijkstra’s

Algorithm.

This algorithm makes a tree of the shortest path from the starting node, the source, to all

other nodes (points) in the graph.

Dijkstra's algorithm makes use of weights of the edges for finding the path that

minimizes the total distance (weight) among the source node and all other nodes. This

algorithm is also known as the single-source shortest path algorithm.

Dijkstra’s algorithm is the iterative algorithmic process to provide us with the shortest

path from one specific starting node to all other nodes of a graph. It is different from

the minimum spanning tree as the shortest distance among two vertices might not

involve all the vertices of the graph.


It is important to note that Dijkstra’s algorithm is only applicable when all weights are

positive because, during the execution, the weights of the edges are added to find the

shortest path.

And therefore if any of the weights are introduced to be negative on the edges of the

graph, the algorithm would never work properly. However, some algorithms like

the Bellman-Ford Algorithm can be used in such cases.

It is also a known fact that breadth-first search(BFS) could be used for calculating the

shortest path for an unweighted graph, or for a weighted graph that has the same cost at

all its edges.

But if the weighted graph has unequal costs on its edges, then BFS generalizes to uniform-cost

search. Now what?

Instead of extending nodes in order of their depth from the root, uniform-cost search

develops the nodes in order of their costs from the root. And a variant of this algorithm is

accepted as Dijkstra’s Algorithm.

Generally, Dijkstra’s algorithm works on the principle of relaxation where an

approximation of the accurate distance is steadily displaced by more suitable values until

the shortest distance is achieved.


Also, the estimated distance to every node is always an overestimate of the true distance and

is generally replaced by the minimum of its previous value and the distance of a recently

determined path.

It uses a priority queue to greedily choose the nearest node that has not been visited yet

and executes the relaxation process on all of its edges.

Example Involved

For example, an individual wants to calculate the shortest distance between the source, A,

and the destination, D, while calculating a subpath which is also the shortest path

between its source and destination. Let’s see here how Dijkstra’s algorithm works;

It works on the fact that any subpath, let say a subpath B to D of the shortest path

between vertices A and D is also the shortest path between vertices B and D, i.e., each

subpath is the shortest path.

Here, Dijkstra’s algorithm uses this property in the reverse direction, that means, while

determining distance, we overestimate the distance of each vertex from the starting vertex

then inspect each node and its neighbours for detecting the shortest subpath to those

neighbours.
This way the algorithm deploys a greedy approach by searching for the next plausible

solution and expects that the end result would be the appropriate solution for the entire

problem.

How to Implement the Dijkstra Algorithm?

Before proceeding the step by step process for implementing the algorithm, let us

consider some essential characteristics of Dijkstra’s algorithm;

 Basically, the Dijkstra’s algorithm begins from the node to be selected, the source node,

and it examines the entire graph to determine the shortest path between that node and all

the other nodes in the graph.

 The algorithm maintains the track of the currently recognized shortest distance from each

node to the source node and updates these values if it identifies a shorter path.

 Once the algorithm has determined the shortest path between the source node and another

node, the node is marked as “visited” and can be added to the path.

 This process is being continued till all the nodes in the graph have been added to the path,

as this way, a path gets created that connects the source node to all the other nodes

following the plausible shortest path to reach each node.


Now explaining the step by step process of algorithm implementation;


1. The very first step is to mark all nodes as unvisited,

2. Mark the picked starting node with a current distance of 0 and the rest nodes with

infinity,

3. Now, fix the starting node as the current node,

4. For the current node, analyse all of its unvisited neighbours and measure their

distances by adding the current distance of the current node to the weight of the

edge that connects the neighbour node and current node,

5. Compare the recently measured distance with the current distance assigned to the

neighbouring node and, if it is smaller, set it as the new current distance of the

neighbouring node,

6. After all of the unvisited neighbours of the current node have been considered, mark the

current node as visited,

7. If the destination node has been marked visited then stop, an algorithm has ended,

and

8. Else, choose the unvisited node that is marked with the least distance, fix it as the

new current node, and repeat the process again from step 4.

Working Example of Dijkstra's Algorithm

In the above section, you have gained the step by step process of Dijkstra’s algorithm,

now let’s study the algorithm with an explained example.


We will calculate the shortest path between node C and the other nodes in the graph.

Example of Dijkstra's Algorithm

1. During the execution of the algorithm, each node will be marked with its

minimum distance to node C as we have selected node C.

In this case, the minimum distance is 0 for node C. Also, for the rest of the nodes,

as we don’t know this distance, they will be marked as infinity (∞), except node C

(currently marked as red dot).


Graphical Representation of Node C as Current Node

2. Now the neighbours of node C will be checked, i.e, node A, B, and D. We start

with B, here we will add the minimum distance of current node (0) with the weight

of the edge (7) that linked the node C to node B and get 0+ 7= 7.

Now, this value will be compared with the minimum distance of B (infinity), the

least value is the one that remains the minimum distance of B, like in this case, 7 is

less than infinity, and marks the least value to node B.

Assign Node B a minimum distance value


3. Now, the same process is checked with neighbour A. We add 0 with 1 (weight of

edge that connects node C to A), and get 1. Again, 1 is compared with the

minimum distance of A (infinity), and marks the lowest value.

Assign Node A a minimum distance value

The same is repeated with node D, and marked 2 as lowest value at D.

Assign Node D a minimum distance value

Since all the neighbours of node C have been checked, node C is marked as visited

with a green check mark.


Marked Node C as visited

4. Now, we will select the new current node such that the node must be unvisited

with the lowest minimum distance, or the node with the least number and no check

mark. Here, node A is the unvisited with minimum distance 1, marked as current

node with red dot.

Graphical Representation of Node A as Current Node

We repeat the algorithm, checking the neighbour of the current node while

ignoring the visited node, so only node B will be checked.


For node B, we add 1 with 3 (weight of the edge connecting node A to B) and

obtain 4. This value, 4, will be compared with the minimum distance of B, 7, and

mark the lowest value at B as 4.

Assign Node B a minimum distance value

5. After this, node A is marked as visited with a green check mark. The current node is

now node D, as it is unvisited and has the smallest current distance. We repeat the

algorithm and check for node B and E.

Graphical Representation of Node D as Current Node


For node B, we add 2 to 5, get 7 and compare it with the minimum distance value

of B, since 7>4, so leave the smallest distance value at node B as 4.

For node E, we obtain 2+ 7= 9, and compare it with the minimum distance of E

which is infinity, and mark the smallest value as node E as 9. The node D is

marked as visited with a green check mark.

Marked Node D as visited

6. The current node is set as node B, here we need to check only node E as it is

unvisited and the node D is visited. We obtain 4+ 1=5, compare it with the

minimum distance of node E.

As 9 > 5, we set the minimum distance at node E to 5.

We mark B as a visited node with a green check mark, and node E is set as the

current node.
Marked Node B as visited

7. Since node E doesn’t have any unvisited neighbours, there is nothing left to

check. Node E is marked as a visited node with a green mark.

Marked Node E as visited

So, we are done as no unvisited node is left. The minimum distance of each node now

represents the minimum distance of that node from node C.
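Putting the steps together, the final minimum distances from node C are:

C = 0, A = 1, D = 2, B = 4, E = 5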

Applications of Dijkstra’s Algorithm


Before learning any algorithm, we should know the fundamental purpose of using an

algorithm that could help us in real-world applications. Such as, for Dijkstra’s algorithm,

we are trying to find the solutions to least path based problems.

For example, if a person wants to travel from city A to city B where both cities are

connected with various routes. Which route should he/she choose?

Undoubtedly, we would adopt the route through which we could reach the destination

with the least possible time, distance and even cost.

Further, with the discussion, it has various real-world use cases, some of the applications

are the following:

 For map applications, it is hugely deployed in measuring the least possible distance and

checking directions between two geographical regions, as in Google Maps, discovering map

locations corresponding to the vertices of a graph, calculating traffic and delay-timing, etc.

 For telephone networks, it is also extensively implemented for routing data in the

networking and telecommunication domains, decreasing the obstacles faced during

transmission.

 Wherever there is a need for shortest path solutions, whether in the domain of

robotics, transport, embedded systems, laboratories or production plants, etc., this

algorithm is applied.

 Besides that, other applications are road conditions, road closures and construction, and

IP routing to detect Open Shortest Path First.


Advantages and Disadvantages of Dijkstra’s Algorithm

Discussing some advantages of Dijkstra’s algorithm;

1. One of its main advantages is its relatively low time complexity, which is nearly linear for sparse graphs when a heap-based priority queue is used.

2. It can be used to calculate the shortest path between a single node to all other

nodes and a single source node to a single destination node by stopping the

algorithm once the shortest distance is achieved for the destination node.

3. It works for both directed and undirected weighted graphs, as long as all edges have non-

negative weights.

Despite various applications and advantages, Dijkstra’s algorithm has


disadvantages also, such as;

1. It does a blind (uninformed) exploration that consumes a lot of time while processing,

2. It is unable to handle negative edge weights,

3. Because it cannot handle negative edges, it may not find the correct shortest path when

such edges are present, and

4. Also, there is a need to maintain a record of the vertices that have been visited.


Shortest Path Problem-

In data structures,
 Shortest path problem is a problem of finding the shortest path(s) between
vertices of a given graph.
 Shortest path between two vertices is a path that has the least cost as
compared to all other existing paths.

Shortest Path Algorithms-

Shortest path algorithms are a family of algorithms used for solving the
shortest path problem.

Applications-

Shortest path algorithms have a wide range of applications such as in-


 Google Maps
 Road Networks
 Logistics Research

Types of Shortest Path Problem-

Various types of shortest path problem are-


1. Single-pair shortest path problem
2. Single-source shortest path problem
3. Single-destination shortest path problem
4. All pairs shortest path problem

Single-Pair Shortest Path Problem-

 It is a shortest path problem where the shortest path between a given pair of
vertices is computed.
 A* Search Algorithm is a famous algorithm used for solving single-pair
shortest path problem.

Single-Source Shortest Path Problem-


 It is a shortest path problem where the shortest path from a given source
vertex to all other remaining vertices is computed.
 Dijkstra’s Algorithm and Bellman Ford Algorithm are the famous algorithms
used for solving single-source shortest path problem.

Single-Destination Shortest Path Problem-

 It is a shortest path problem where the shortest path from all the vertices to
a single destination vertex is computed.
 By reversing the direction of each edge in the graph, this problem reduces
to single-source shortest path problem.
 Dijkstra’s Algorithm is a famous algorithm adapted for solving single-
destination shortest path problem.

All Pairs Shortest Path Problem-

 It is a shortest path problem where the shortest path between every pair of
vertices is computed.
 Floyd-Warshall Algorithm and Johnson’s Algorithm are the famous
algorithms used for solving All pairs shortest path problem.

The shortest path problem is the problem of finding a path between two vertices (or
nodes) in a graph such that the sum of the weights of its constituent edges is
minimized. The shortest path between any two nodes of the graph can be found
using many algorithms, such as Dijkstra’s algorithm, Bellman-Ford algorithm, Floyd
Warshall algorithm. There are some properties of finding the shortest paths based on
which the algorithm to find the shortest path works:
1. Optimal Substructure Property
 All the sub-paths of the shortest path must also be the shortest paths.
 If a shortest path between two nodes U and S passes through a node V, then the
portion of the path from U to V and the portion from V to S are themselves
shortest paths between those nodes.
 All the algorithms listed above work based on this property.
 For example, let P1 be a sub-path from (X → Y) of the shortest path (S →X
→Y → V) of graph G. And let P2 be any other path (X → Y) in graph G.
Then, the cost of P1 must be less than or equal to the cost of P2.
Otherwise, the path (S →X →Y → V) will not be the shortest path between
nodes S and V.

Graph G

2. Triangle Inequality
 Let d(a, b) be the length of the shortest path from a to b in graph G1. Then,
 d(a, b) ≤ d(a, x) + d(x, b)
Dijkstra Algorithm-

 Dijkstra Algorithm is a very famous greedy algorithm.


 It is used for solving the single source shortest path problem.
 It computes the shortest path from one particular source node to all other remaining nodes of
the graph.

Also Read- Shortest Path Problem

Conditions-

It is important to note the following points regarding Dijkstra Algorithm-

 Dijkstra algorithm works only for connected graphs.


 Dijkstra algorithm works only for those graphs that do not contain any negative weight edge.
 The actual Dijkstra algorithm does not output the shortest paths.
 It only provides the value or cost of the shortest paths.
 By making minor modifications in the actual algorithm, the shortest paths can be easily
obtained.
 Dijkstra algorithm works for directed as well as undirected graphs.

Dijkstra Algorithm-

dist[S] ← 0 // The distance to source vertex is set to 0

Π[S] ← NIL // The predecessor of source vertex is set as NIL

for all v ∈ V - {S} // For all other vertices

do dist[v] ← ∞ // All other distances are set to ∞

Π[v] ← NIL // The predecessor of all other vertices is set as NIL

S ← ∅ // The set of vertices that have been visited 'S' is initially empty

Q ← V // The queue 'Q' initially contains all the vertices

while Q ≠ ∅ // While loop executes till the queue is not empty

do u ← mindistance (Q, dist) // A vertex u from Q with the least distance is selected and removed from Q

S ← S ∪ {u} // Vertex 'u' is added to 'S' list of vertices that have been visited

for all v ∈ neighbors[u] // For all the neighboring vertices of vertex 'u'

do if dist[v] > dist[u] + w(u,v) // if any new shortest path is discovered

then dist[v] ← dist[u] + w(u,v), Π[v] ← u // The new shortest path value and the predecessor are recorded

return dist
Implementation-

The implementation of above Dijkstra Algorithm is explained in the following steps-

Step-01:

In the first step, two sets are defined-

 One set contains all those vertices which have been included in the shortest path tree.
 In the beginning, this set is empty.
 Other set contains all those vertices which are still left to be included in the shortest path tree.
 In the beginning, this set contains all the vertices of the given graph.

Step-02:

For each vertex of the given graph, two variables are defined as-

 Π[v] which denotes the predecessor of vertex ‘v’


 d[v] which denotes the shortest path estimate of vertex ‘v’ from the source vertex.

Initially, the value of these variables is set as-

 The value of variable ‘Π’ for each vertex is set to NIL i.e. Π[v] = NIL
 The value of variable ‘d’ for source vertex is set to 0 i.e. d[S] = 0
 The value of variable ‘d’ for remaining vertices is set to ∞ i.e. d[v] = ∞

Step-03:

The following procedure is repeated until all the vertices of the graph are processed-

 Among unprocessed vertices, a vertex with minimum value of variable ‘d’ is chosen.
 Its outgoing edges are relaxed.
 After relaxing the edges for that vertex, the sets created in step-01 are updated.

What is Edge Relaxation?


Consider the edge (a,b) in the following graph-

Here, d[a] and d[b] denotes the shortest path estimate for vertices a and b respectively from the
source vertex ‘S’.

Now,

If d[a] + w < d[b]

then d[b] = d[a] + w and Π[b] = a

This is called edge relaxation.
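As a minimal C sketch of this rule (the array names dist[] and pred[], standing for d[] and Π[], and the function name relax() are assumptions, not taken from the text above), relaxing the edge (a, b) with weight w could look like this:

void relax(int a, int b, int w, int dist[], int pred[])
{
    /* in practice, also check that dist[a] is not "infinity" before adding */
    if (dist[a] + w < dist[b])
    {
        dist[b] = dist[a] + w;     /* a shorter path to b has been found through a */
        pred[b] = a;               /* record a as the predecessor of b */
    }
}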

Time Complexity Analysis-

Case-01:

This case is valid when-

 The given graph G is represented as an adjacency matrix.


 Priority queue Q is represented as an unordered list.

Here,

 A[i,j] stores the information about edge (i,j).


 Time taken for selecting i with the smallest dist is O(V).
 For each neighbor of i, time taken for updating dist[j] is O(1) and there will be maximum V
neighbors.
 Time taken for each iteration of the loop is O(V) and one vertex is deleted from Q.
 Thus, the total time complexity becomes O(V²).
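To make this Case-01 analysis concrete, the following is a minimal C sketch (not taken from the text above) of Dijkstra's algorithm using an adjacency matrix and a linear scan for the minimum-distance vertex, which is exactly what gives the O(V²) behaviour. The vertex count V, the sample graph and the function names are illustrative assumptions.

#include <stdio.h>
#include <limits.h>
#define V 6                                /* number of vertices, chosen for illustration */

/* pick the unvisited vertex with the smallest distance estimate, O(V) */
int minDistance(int dist[], int visited[])
{
    int min = INT_MAX, minIndex = -1, v;
    for (v = 0; v < V; v++)
        if (!visited[v] && dist[v] < min)
        {
            min = dist[v];
            minIndex = v;
        }
    return minIndex;
}

void dijkstra(int graph[V][V], int src)
{
    int dist[V];                           /* dist[v] = current shortest-path estimate from src to v */
    int visited[V];                        /* visited[v] = 1 once the distance of v is final */
    int i, count, u, v;

    for (i = 0; i < V; i++) { dist[i] = INT_MAX; visited[i] = 0; }
    dist[src] = 0;

    for (count = 0; count < V - 1; count++)
    {
        u = minDistance(dist, visited);    /* select the closest unvisited vertex */
        if (u == -1) break;                /* remaining vertices are unreachable */
        visited[u] = 1;
        for (v = 0; v < V; v++)            /* relax all outgoing edges of u */
            if (!visited[v] && graph[u][v] && dist[u] != INT_MAX
                && dist[u] + graph[u][v] < dist[v])
                dist[v] = dist[u] + graph[u][v];
    }

    for (i = 0; i < V; i++)
        printf("Vertex %d : distance %d\n", i, dist[i]);
}

int main()
{
    /* weight 0 means "no edge"; this directed sample graph is only illustrative */
    int graph[V][V] = {
        {0, 1, 5, 0, 0, 0},
        {0, 0, 2, 2, 1, 0},
        {0, 0, 0, 0, 2, 0},
        {0, 0, 0, 0, 0, 1},
        {0, 0, 0, 0, 0, 2},
        {0, 0, 0, 0, 0, 0}
    };
    dijkstra(graph, 0);                    /* prints distances 0 1 3 3 2 4 from vertex 0 */
    return 0;
}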

Case-02:

This case is valid when-

 The given graph G is represented as an adjacency list.


 Priority queue Q is represented as a binary heap.

Here,

 With adjacency list representation, all vertices of the graph can be traversed using BFS in
O(V+E) time.
 In min heap, operations like extract-min and decrease-key value takes O(logV) time.
 So, overall time complexity becomes O(E+V) x O(logV) which is O((E + V) x logV) = O(ElogV)
 This time complexity can be reduced to O(E+VlogV) using Fibonacci heap.

PRACTICE PROBLEM BASED ON DIJKSTRA ALGORITHM-

Problem-

Using Dijkstra’s Algorithm, find the shortest distance from source vertex ‘S’ to remaining vertices
in the following graph-
Also, write the order in which the vertices are visited.

Solution-

Step-01:

The following two sets are created-

 Unvisited set : {S , a , b , c , d , e}
 Visited set : { }

Step-02:

The two variables Π and d are created for each vertex and initialized as-

 Π[S] = Π[a] = Π[b] = Π[c] = Π[d] = Π[e] = NIL


 d[S] = 0
 d[a] = d[b] = d[c] = d[d] = d[e] = ∞

Step-03:

 Vertex ‘S’ is chosen.


 This is because shortest path estimate for vertex ‘S’ is least.
 The outgoing edges of vertex ‘S’ are relaxed.

Before Edge Relaxation-

Now,

 d[S] + 1 = 0 + 1 = 1 < ∞
∴ d[a] = 1 and Π[a] = S

 d[S] + 5 = 0 + 5 = 5 < ∞
∴ d[b] = 5 and Π[b] = S

After edge relaxation, our shortest path tree is-


Now, the sets are updated as-

 Unvisited set : {a , b , c , d , e}
 Visited set : {S}

Step-04:

 Vertex ‘a’ is chosen.


 This is because shortest path estimate for vertex ‘a’ is least.
 The outgoing edges of vertex ‘a’ are relaxed.

Before Edge Relaxation-


Now,

 d[a] + 2 = 1 + 2 = 3 < ∞
∴ d[c] = 3 and Π[c] = a

 d[a] + 1 = 1 + 1 = 2 < ∞
∴ d[d] = 2 and Π[d] = a

 d[a] + 2 = 1 + 2 = 3 < 5
∴ d[b] = 3 and Π[b] = a

After edge relaxation, our shortest path tree is-


Now, the sets are updated as-

 Unvisited set : {b , c , d , e}
 Visited set : {S , a}

Step-05:

 Vertex ‘d’ is chosen.


 This is because shortest path estimate for vertex ‘d’ is least.
 The outgoing edges of vertex ‘d’ are relaxed.

Before Edge Relaxation-


Now,

 d[d] + 2 = 2 + 2 = 4 < ∞
∴ d[e] = 4 and Π[e] = d

After edge relaxation, our shortest path tree is-

Now, the sets are updated as-

 Unvisited set : {b , c , e}
 Visited set : {S , a , d}

Step-06:

 Vertex ‘b’ is chosen.


 This is because shortest path estimate for vertex ‘b’ is least.
 Vertex ‘c’ may also be chosen since for both the vertices, shortest path estimate is least.
 The outgoing edges of vertex ‘b’ are relaxed.

Before Edge Relaxation-

Now,

 d[b] + 2 = 3 + 2 = 5 > 2
∴ No change

After edge relaxation, our shortest path tree remains the same as in Step-05.

Now, the sets are updated as-

 Unvisited set : {c , e}
 Visited set : {S , a , d , b}

Step-07:

 Vertex ‘c’ is chosen.


 This is because shortest path estimate for vertex ‘c’ is least.
 The outgoing edges of vertex ‘c’ are relaxed.

Before Edge Relaxation-


Now,

 d[c] + 1 = 3 + 1 = 4 = 4
∴ No change

After edge relaxation, our shortest path tree remains the same as in Step-05.

Now, the sets are updated as-

 Unvisited set : {e}


 Visited set : {S , a , d , b , c}

Step-08:

 Vertex ‘e’ is chosen.


 This is because shortest path estimate for vertex ‘e’ is least.
 The outgoing edges of vertex ‘e’ are relaxed.
 There are no outgoing edges for vertex ‘e’.
 So, our shortest path tree remains the same as in Step-05.

Now, the sets are updated as-

 Unvisited set : { }
 Visited set : {S , a , d , b , c , e}

Now,
 All vertices of the graph are processed.
 Our final shortest path tree is as shown below.
 It represents the shortest path from source vertex ‘S’ to all other remaining vertices.

The order in which all the vertices are processed is :

S , a , d , b , c , e.



What is Sorting?

A Sorting Algorithm is used to rearrange a given array or list of elements according to a comparison operator on the elements. The comparison operator is used to decide the new order of the elements in the respective data structure.
For example, a list of characters can be sorted in increasing order of their ASCII values; that is, a character with a lower ASCII value is placed before a character with a higher ASCII value.

Sorting is the process of ordering or placing a list of elements from a collection in some kind of order. It is nothing but storage of data in sorted order. Sorting can be done in ascending or descending order. It arranges the data in a sequence which makes searching easier.

For example, suppose we have a record of employees. It has the following data:

Employee No.
Employee Name
Employee Salary
Department Name

Here, the employee number can be taken as the key for sorting the records in ascending or descending order. Now, if we have to search for an employee with employee no. 116, we don't need to search the complete record; we can simply search among the employees with employee numbers 100 to 120.

Sorting Techniques
The choice of sorting technique depends on the situation. It depends on two parameters.

1. Execution time of the program, that is, the time taken for execution of the program.
2. Space, that is, the memory taken by the program.

Sorting techniques are differentiated by their efficiency and space requirements.

Sorting can be performed using several techniques or methods, as follows:

1. Bubble Sort
2. Insertion Sort
3. Selection Sort
4. Quick Sort
5. Heap Sort

1. Bubble Sort
 Bubble sort is a simple comparison-based sorting technique.
 It is used for sorting 'n' (number of items) elements.
 It compares the elements one by one and swaps them based on their values.

 The above diagram represents how bubble sort actually works. This sort takes O(n²) time. It starts with the first two elements and sorts them in ascending order.
 Bubble sort starts with the first two elements. It compares them to check which one is greater.
 In the above diagram, element 40 is greater than 10, so these values must be swapped. This operation continues until the array is sorted in ascending order.
Example: Program for Bubble Sort

#include <stdio.h>
void bubble_sort(long [], long);

int main()
{
    long array[100], n, c;

    printf("Enter number of elements\n");
    scanf("%ld", &n);
    printf("Enter %ld integers\n", n);
    for (c = 0; c < n; c++)
        scanf("%ld", &array[c]);

    bubble_sort(array, n);

    printf("Sorted list in ascending order:\n");
    for (c = 0; c < n; c++)
        printf("%ld\n", array[c]);

    return 0;
}

void bubble_sort(long list[], long n)
{
    long c, d, t;

    for (c = 0; c < (n - 1); c++)
    {
        for (d = 0; d < n - c - 1; d++)
        {
            if (list[d] > list[d+1])
            {
                /* Swapping */
                t = list[d];
                list[d] = list[d+1];
                list[d+1] = t;
            }
        }
    }
}

Output:

2. Insertion Sort
 Insertion sort is a simple sorting algorithm.
 This sorting method sorts the array by shifting elements one by one.
 It builds the final sorted array one item at a time.
 Insertion sort has one of the simplest implementations.
 This sort is efficient for smaller data sets, but it is inefficient for larger lists.
 Like bubble sort, it has low space complexity; it requires only a single additional memory location.
 Insertion sort is stable, that is, it does not change the relative order of elements with equal keys.

 The above diagram represents how insertion sort works. Insertion sort works like the way we sort playing cards in our hands. It always starts with the second element as the key. The key is compared with the elements before it and is put in the right place.
 In the above figure, 40 has nothing before it. Element 10 is compared to 40 and is inserted before 40. Element 9 is smaller than 40 and 10, so it is inserted before 10, and this operation continues until the array is sorted in ascending order.
Example: Program for Insertion Sort

#include <stdio.h>

int main()
{
    int n, array[1000], c, d, t;

    printf("Enter number of elements\n");
    scanf("%d", &n);
    printf("Enter %d integers\n", n);
    for (c = 0; c < n; c++)
    {
        scanf("%d", &array[c]);
    }

    for (c = 1; c <= n - 1; c++)
    {
        d = c;
        while (d > 0 && array[d] < array[d-1])
        {
            t = array[d];
            array[d] = array[d-1];
            array[d-1] = t;
            d--;
        }
    }

    printf("Sorted list in ascending order:\n");
    for (c = 0; c <= n - 1; c++)
    {
        printf("%d\n", array[c]);
    }
    return 0;
}

Output:
Selection Sort

 Selection sort is a simple sorting algorithm which finds the smallest element in the array and exchanges it with the element in the first position. It then finds the second smallest element and exchanges it with the element in the second position, and continues in this way until the entire array is sorted.
 In the above diagram, the smallest element found in the first pass is 9, and it is placed at the first position. In the second pass, the smallest element is searched for among the rest of the elements, excluding the first element. Selection sort keeps doing this until the array is sorted.
Example: Program for Selection Sort

#include <stdio.h>

int main()
{
    int array[100], n, c, d, position, swap;

    printf("Enter number of elements\n");
    scanf("%d", &n);
    printf("Enter %d integers\n", n);
    for (c = 0; c < n; c++)
        scanf("%d", &array[c]);

    for (c = 0; c < (n - 1); c++)
    {
        position = c;
        for (d = c + 1; d < n; d++)
        {
            if (array[position] > array[d])
                position = d;
        }
        if (position != c)
        {
            swap = array[c];
            array[c] = array[position];
            array[position] = swap;
        }
    }

    printf("Sorted list in ascending order:\n");
    for (c = 0; c < n; c++)
        printf("%d\n", array[c]);

    return 0;
}

Output:

Quick Sort

 Quick sort is also known as partition-exchange sort and is based on the rule of Divide and Conquer.
 It is a highly efficient sorting algorithm.
 Quick sort is among the fastest comparison-based sorting algorithms in practice.
 It is very fast and requires only a small amount of additional space: O(log n) on average, for the recursion stack.
 Quick sort picks an element as pivot and partitions the array around the picked pivot.
There are different versions of quick sort which choose the pivot in
different ways:

1. First element as pivot


2. Last element as pivot

3. Random element as pivot

4. Median as pivot

1. First element as pivot


2. Last element as pivot

3. Random element as pivot

4. Median as pivot

Algorithm for Quick Sort


Step 1: Choose the highest index value as pivot.

Step 2: Take two variables to point left and right of the list excluding pivot.

Step 3: Left points to the low index.

Step 4: Right points to the high index.

Step 5: While the value at left is less than the pivot, move right.

Step 6: While the value at right is greater than the pivot, move left.

Step 7: If neither Step 5 nor Step 6 applies, swap the values at left and right.

Step 8: If left >= right, the point where they met is the new position of the pivot.
The above diagram represents how to find the pivot value in an array. As we can see, the pivot value divides the list into two parts (partitions), and then each part is processed by quick sort. Quick sort is a recursive function. We can call the partition function again.

Example: Demonstrating Quick Sort


#include <stdio.h>

/* Quick sort function to sort an integer array */
void quicksort(int array[], int firstIndex, int lastIndex)
{
    /* declaring index variables */
    int pivotIndex, temp, index1, index2;

    if (firstIndex < lastIndex)
    {
        /* assigning the first element's index as pivot element */
        pivotIndex = firstIndex;
        index1 = firstIndex;
        index2 = lastIndex;

        /* Sorting in ascending order with quick sort */
        while (index1 < index2)
        {
            while (array[index1] <= array[pivotIndex] && index1 < lastIndex)
            {
                index1++;
            }
            while (array[index2] > array[pivotIndex])
            {
                index2--;
            }
            if (index1 < index2)
            {
                /* Swapping operation */
                temp = array[index1];
                array[index1] = array[index2];
                array[index2] = temp;
            }
        }

        /* At the end of the first iteration, swap pivot element with index2 element */
        temp = array[pivotIndex];
        array[pivotIndex] = array[index2];
        array[index2] = temp;

        /* Recursive calls for quick sort, with partitioning */
        quicksort(array, firstIndex, index2 - 1);
        quicksort(array, index2 + 1, lastIndex);
    }
}

int main()
{
    /* Declaring variables */
    int array[100], n, i;

    /* Number of elements in array from user input */
    printf("Enter the number of elements you want to sort : ");
    scanf("%d", &n);

    /* Ask the user to enter n elements */
    printf("Enter elements in the list : ");
    for (i = 0; i < n; i++)
    {
        scanf("%d", &array[i]);
    }

    /* Calling the quicksort function defined above */
    quicksort(array, 0, n - 1);

    /* Print sorted array */
    printf("Sorted elements: ");
    for (i = 0; i < n; i++)
        printf(" %d", array[i]);

    return 0;
}

Output:

Heap Sort

 Heap sort is a comparison based sorting algorithm.
 It uses the heap, a special tree-based data structure.
 Heap sort is similar to selection sort. The only difference is that it finds the largest element and places it at the end.
 This sort is not a stable sort. It requires a constant amount of extra space for sorting a list.
 It is very fast and widely used for sorting.
A heap has the following two properties:

1. Shape Property
2. Heap Property

1. The shape property means that all the levels of the tree are fully filled (except possibly the last, which is filled from the left); that is, the heap data structure is a complete binary tree.
2. The heap property gives the binary tree its special ordering characteristic. Based on it, heaps can be classified into two types:

I. Max-Heap
II. Min-Heap

I. Max Heap: If every parent node is greater than its child nodes, it is called a Max-Heap.

II. Min Heap: If every parent node is smaller than its child nodes, it is called a Min-Heap.

Example: Program for Heap Sort

#include <stdio.h>

int main()
{
    int heap[10], no, i, j, c, root, temp;

    printf("\n Enter no of elements :");
    scanf("%d", &no);
    printf("\n Enter the nos : ");
    for (i = 0; i < no; i++)
        scanf("%d", &heap[i]);

    for (i = 1; i < no; i++)
    {
        c = i;
        do
        {
            root = (c - 1) / 2;
            if (heap[root] < heap[c]) /* to create MAX heap array */
            {
                temp = heap[root];
                heap[root] = heap[c];
                heap[c] = temp;
            }
            c = root;
        } while (c != 0);
    }

    printf("Heap array : ");
    for (i = 0; i < no; i++)
        printf("%d\t ", heap[i]);

    for (j = no - 1; j >= 0; j--)
    {
        temp = heap[0];
        heap[0] = heap[j]; /* swap max element with rightmost leaf element */
        heap[j] = temp;

        root = 0;
        do
        {
            c = 2 * root + 1;                    /* left child of root element */
            if (c < j - 1 && heap[c] < heap[c + 1])
                c++;                             /* pick the larger of the two children */
            if (c < j && heap[root] < heap[c])   /* again rearrange to max heap array */
            {
                temp = heap[root];
                heap[root] = heap[c];
                heap[c] = temp;
            }
            root = c;
        } while (c < j);
    }

    printf("\n The sorted array is : ");
    for (i = 0; i < no; i++)
        printf("\t %d", heap[i]);

    printf("\n Complexity : \n Best case = Avg case = Worst case = O(n logn) \n");
    return 0;
}

Output:
Floyd Warshall Algorithm-

 Floyd Warshall Algorithm is a famous algorithm.


 It is used to solve All Pairs Shortest Path Problem.
 It computes the shortest path between every pair of vertices of the given
graph.
 Floyd Warshall Algorithm is an example of dynamic programming approach.


Advantages-

Floyd Warshall Algorithm has the following main advantages-


 It is extremely simple.
 It is easy to implement.

Algorithm-

Floyd Warshall Algorithm is as shown below-

Create a |V| x |V| matrix // It represents the distance between every pair of vertices as given
For each cell (i,j) in M do-
if i == j
M[ i ][ j ] = 0 // For all diagonal elements, value = 0
if (i , j) is an edge in E
M[ i ][ j ] = weight(i,j) // If there exists a direct edge between the vertices, value = weight of edge
else
M[ i ][ j ] = infinity // If there is no direct edge between the vertices, value = ∞
for k from 1 to |V|
for i from 1 to |V|
for j from 1 to |V|
if M[ i ][ j ] > M[ i ][ k ] + M[ k ][ j ]
M[ i ][ j ] = M[ i ][ k ] + M[ k ][ j ]
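The pseudocode above translates almost line-for-line into C. Below is a short sketch, assuming the initial distance matrix has already been filled in as described; the 4-vertex matrix in main() is only an illustrative assumption, not the graph from the practice problem.

#include <stdio.h>

#define V   4
#define INF 99999   /* acts as ∞ for pairs with no direct edge */

/* Floyd-Warshall: dist[i][j] starts as the initial distance matrix and
   ends as the shortest-path distance between every pair of vertices */
void floyd_warshall(int dist[V][V])
{
    for (int k = 0; k < V; k++)
        for (int i = 0; i < V; i++)
            for (int j = 0; j < V; j++)
                if (dist[i][k] + dist[k][j] < dist[i][j])
                    dist[i][j] = dist[i][k] + dist[k][j];
}

int main(void)
{
    /* Initial distance matrix: 0 on the diagonal, edge weight if a direct
       edge exists, INF otherwise (values chosen only for illustration) */
    int dist[V][V] = {
        { 0,   3,   INF, 7   },
        { 8,   0,   2,   INF },
        { 5,   INF, 0,   1   },
        { 2,   INF, INF, 0   }
    };

    floyd_warshall(dist);

    for (int i = 0; i < V; i++) {
        for (int j = 0; j < V; j++)
            printf("%7d", dist[i][j]);
        printf("\n");
    }
    return 0;
}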

Time Complexity-

 Floyd Warshall Algorithm consists of three loops over all the nodes.
 The innermost loop consists of only constant-complexity operations.
 Hence, the asymptotic complexity of Floyd Warshall algorithm is O(n³).
 Here, n is the number of nodes in the given graph.

When Floyd Warshall Algorithm Is Used?

 Floyd Warshall Algorithm is best suited for dense graphs.


 This is because its complexity depends only on the number of vertices in
the given graph.
 For sparse graphs, Johnson’s Algorithm is more suitable.

PRACTICE PROBLEM BASED ON FLOYD WARSHALL


ALGORITHM-

Problem-

Consider the following directed weighted graph-


Using Floyd Warshall Algorithm, find the shortest path distance between every
pair of vertices.

Solution-

Step-01:

 Remove all the self loops and parallel edges (keeping the lowest weight
edge) from the graph.
 In the given graph, there are neither self edges nor parallel edges.

Step-02:

 Write the initial distance matrix.


 It represents the distance between every pair of vertices in the form of given
weights.
 For diagonal elements (representing self-loops), distance value = 0.
 For vertices having a direct edge between them, distance value = weight of
that edge.
 For vertices having no direct edge between them, distance value = ∞.

Initial distance matrix for the given graph is-


Step-03:

Using Floyd Warshall Algorithm, write the following 4 matrices-


The last matrix D4 represents the shortest path distance between every pair of
vertices.

Remember-

 In the above problem, there are 4 vertices in the given graph.


 So, there will be total 4 matrices of order 4 x 4 in the solution excluding the
initial distance matrix.
 Diagonal elements of each matrix will always be 0.
Radix Sort Algorithm
In this article, we will discuss the Radix sort algorithm. Radix sort is a linear sorting algorithm that is used for integers. In Radix sort, digit-by-digit sorting is performed, starting from the least significant digit and moving to the most significant digit.

The process of radix sort works similarly to sorting students' names in alphabetical order. In this case, the radix is 26, because there are 26 letters in the English alphabet. In the first pass, the names of students are grouped according to the ascending order of the first letter of their names. After that, in the second pass, their names are grouped according to the ascending order of the second letter of their names. And the process continues until we find the sorted list.

Now, let's see the algorithm of Radix sort.

Algorithm

radixSort(arr)
    max = largest element in the given array
    d = number of digits in the largest element (or, max)
    Now, create d buckets of size 0 - 9
    for i -> 0 to d
        sort the array elements using counting sort (or any stable sort)
        according to the digits at the ith place

Working of Radix sort Algorithm


Now, let's see the working of Radix sort Algorithm.

The steps used in the sorting of radix sort are listed as follows -
o First, we have to find the largest element (suppose max) from the given array. Let 'x' be the number of digits in max. The 'x' is calculated because we need to go through all the significant places of every element.
o After that, go through each significant place one by one. Here, we have to use any stable sorting algorithm to sort the digits of each significant place.

Now let's see the working of radix sort in detail by using an example. To
understand it more clearly, let's take an unsorted array and try to sort it using
radix sort. It will make the explanation clearer and easier.

In the given array, the largest element is 736, which has 3 digits. So, the loop will run up to three times (i.e., to the hundreds place). That means three passes are required to sort the array.

Now, first sort the elements on the basis of unit place digits (i.e., x = 0). Here,
we are using the counting sort algorithm to sort the elements.

Pass 1:
In the first pass, the list is sorted on the basis of the digits at 0's place.
After the first pass, the array elements are -

Pass 2:
In this pass, the list is sorted on the basis of the next significant digits (i.e.,
digits at 10th place).
After the second pass, the array elements are -

Pass 3:
In this pass, the list is sorted on the basis of the next significant digits (i.e.,
digits at 100th place).
After the third pass, the array elements are -

Now, the array is sorted in ascending order.

Radix sort complexity


Now, let's see the time complexity of Radix sort in best case, average case, and
worst case. We will also see the space complexity of Radix sort.

1. Time Complexity

Case Time Complexity

Best Case Ω(n+k)

Average Case θ(nk)


Worst Case O(nk)

o Best Case Complexity - It occurs when there is no sorting required, i.e.


the array is already sorted. The best-case time complexity of Radix sort
is Ω(n+k).
o Average Case Complexity - It occurs when the array elements are in
jumbled order that is not properly ascending and not properly
descending. The average case time complexity of Radix sort is θ(nk).
o Worst Case Complexity - It occurs when the array elements are
required to be sorted in reverse order. That means suppose you have to
sort the array elements in ascending order, but its elements are in
descending order. The worst-case time complexity of Radix sort
is O(nk).

Radix sort is a non-comparative sorting algorithm that is better than the


comparative sorting algorithms. It has linear time complexity that is better
than the comparative algorithms with complexity O(n logn).

2. Space Complexity

Space Complexity O(n + k)

Stable YES

o The space complexity of Radix sort is O(n + k).

Implementation of Radix sort


Now, let's see the programs of Radix sort in different programming languages.

Program: Write a program to implement Radix sort in C language.

#include <stdio.h>

int getMax(int a[], int n) {
    int max = a[0];
    for (int i = 1; i < n; i++) {
        if (a[i] > max)
            max = a[i];
    }
    return max;   // maximum element from the array
}

// function to implement counting sort
void countingSort(int a[], int n, int place)
{
    int output[n + 1];
    int count[10] = {0};

    // Calculate count of elements
    for (int i = 0; i < n; i++)
        count[(a[i] / place) % 10]++;

    // Calculate cumulative frequency
    for (int i = 1; i < 10; i++)
        count[i] += count[i - 1];

    // Place the elements in sorted order
    for (int i = n - 1; i >= 0; i--) {
        output[count[(a[i] / place) % 10] - 1] = a[i];
        count[(a[i] / place) % 10]--;
    }

    for (int i = 0; i < n; i++)
        a[i] = output[i];
}

// function to implement radix sort
void radixsort(int a[], int n) {

    // get maximum element from array
    int max = getMax(a, n);

    // Apply counting sort to sort elements based on place value
    for (int place = 1; max / place > 0; place *= 10)
        countingSort(a, n, place);
}

// function to print array elements
void printArray(int a[], int n) {
    for (int i = 0; i < n; ++i) {
        printf("%d ", a[i]);
    }
    printf("\n");
}

int main() {
    int a[] = {181, 289, 390, 121, 145, 736, 514, 888, 122};
    int n = sizeof(a) / sizeof(a[0]);
    printf("Before sorting array elements are - \n");
    printArray(a, n);
    radixsort(a, n);
    printf("After applying Radix sort, the array elements are - \n");
    printArray(a, n);
    return 0;
}

Output:

After the execution of the above code, the output will be -


Heap Sort Algorithm
In this article, we will discuss the Heapsort Algorithm. Heap sort processes the
elements by creating the min-heap or max-heap using the elements of the
given array. Min-heap or max-heap represents the ordering of array in which
the root element represents the minimum or maximum element of the array.

Heap sort basically recursively performs two main operations -

o Build a heap H, using the elements of array.


o Repeatedly delete the root element of the heap formed in 1st phase.

Before knowing more about the heap sort, let's first see a brief description
of Heap.

What is a heap?
A heap is a complete binary tree, and a binary tree is a tree in which a node can have at most two children. A complete binary tree is a binary tree in which all the levels, except possibly the last (leaf) level, are completely filled, and all the nodes are as far left as possible.

What is heap sort?


Heapsort is a popular and efficient sorting algorithm. The concept of heap sort
is to eliminate the elements one by one from the heap part of the list, and
then insert them into the sorted part of the list.

Heapsort is the in-place sorting algorithm.

Now, let's see the algorithm of heap sort.

Algorithm

HeapSort(arr)
    BuildMaxHeap(arr)
    for i = length(arr) downto 2
        swap arr[1] with arr[i]
        heap_size[arr] = heap_size[arr] - 1
        MaxHeapify(arr, 1)
End

BuildMaxHeap(arr)

BuildMaxHeap(arr)
    heap_size(arr) = length(arr)
    for i = length(arr)/2 downto 1
        MaxHeapify(arr, i)
End

MaxHeapify(arr, i)

MaxHeapify(arr, i)
    L = left(i)
    R = right(i)
    if L ≤ heap_size[arr] and arr[L] > arr[i]
        largest = L
    else
        largest = i
    if R ≤ heap_size[arr] and arr[R] > arr[largest]
        largest = R
    if largest != i
        swap arr[i] with arr[largest]
        MaxHeapify(arr, largest)
End

Working of Heap sort Algorithm


Now, let's see the working of the Heapsort Algorithm.
In heap sort, basically, there are two phases involved in the sorting of
elements. By using the heap sort algorithm, they are as follows -

o The first step includes the creation of a heap by adjusting the elements
of the array.
o After the creation of heap, now remove the root element of the heap
repeatedly by shifting it to the end of the array, and then store the heap
structure with the remaining elements.

Now let's see the working of heap sort in detail by using an example. To
understand it more clearly, let's take an unsorted array and try to sort it using
heap sort. It will make the explanation clearer and easier.

First, we have to construct a heap from the given array and convert it into max
heap.

After converting the given heap into max heap, the array elements are -
Next, we have to delete the root element (89) from the max heap. To delete
this node, we have to swap it with the last node, i.e. (11). After deleting the
root element, we again have to heapify it to convert it into max heap.

After swapping the array element 89 with 11, and converting the heap into
max-heap, the elements of array are -

In the next step, again, we have to delete the root element (81) from the max
heap. To delete this node, we have to swap it with the last node, i.e. (54). After
deleting the root element, we again have to heapify it to convert it into max
heap.
After swapping the array element 81 with 54 and converting the heap into
max-heap, the elements of array are -

In the next step, we have to delete the root element (76) from the max heap
again. To delete this node, we have to swap it with the last node, i.e. (9). After
deleting the root element, we again have to heapify it to convert it into max
heap.

After swapping the array element 76 with 9 and converting the heap into
max-heap, the elements of array are -

In the next step, again we have to delete the root element (54) from the max
heap. To delete this node, we have to swap it with the last node, i.e. (14). After
deleting the root element, we again have to heapify it to convert it into max
heap.
After swapping the array element 54 with 14 and converting the heap into
max-heap, the elements of array are -

In the next step, again we have to delete the root element (22) from the max
heap. To delete this node, we have to swap it with the last node, i.e. (11). After
deleting the root element, we again have to heapify it to convert it into max
heap.

After swapping the array element 22 with 11 and converting the heap into
max-heap, the elements of array are -
In the next step, again we have to delete the root element (14) from the max
heap. To delete this node, we have to swap it with the last node, i.e. (9). After
deleting the root element, we again have to heapify it to convert it into max
heap.

After swapping the array element 14 with 9 and converting the heap into
max-heap, the elements of array are -

In the next step, again we have to delete the root element (11) from the max
heap. To delete this node, we have to swap it with the last node, i.e. (9). After
deleting the root element, we again have to heapify it to convert it into max
heap.

After swapping the array element 11 with 9, the elements of array are -

Now, heap has only one element left. After deleting it, heap will be empty.
After completion of sorting, the array elements are -

Now, the array is completely sorted.

Heap sort complexity


Now, let's see the time complexity of Heap sort in the best case, average case,
and worst case. We will also see the space complexity of Heapsort.

1. Time Complexity

Case Time Complexity

Best Case O(n logn)

Average Case O(n log n)

Worst Case O(n log n)

o Best Case Complexity - It occurs when there is no sorting required, i.e.


the array is already sorted. The best-case time complexity of heap sort
is O(n logn).
o Average Case Complexity - It occurs when the array elements are in
jumbled order that is not properly ascending and not properly
descending. The average case time complexity of heap sort is O(n log
n).
o Worst Case Complexity - It occurs when the array elements are
required to be sorted in reverse order. That means suppose you have to
sort the array elements in ascending order, but its elements are in
descending order. The worst-case time complexity of heap sort is O(n
log n).

The time complexity of heap sort is O(n logn) in all three cases (best case,
average case, and worst case). The height of a complete binary tree having n
elements is logn.

2. Space Complexity

Space Complexity O(1)

Stable NO

o The space complexity of Heap sort is O(1).

Implementation of Heapsort
Now, let's see the programs of Heap sort in different programming languages.

Program: Write a program to implement heap sort in C language.

#include <stdio.h>
/* function to heapify a subtree. Here 'i' is the
   index of the root node in array a[], and 'n' is the size of the heap. */
void heapify(int a[], int n, int i)
{
    int largest = i;        // Initialize largest as root
    int left = 2 * i + 1;   // left child
    int right = 2 * i + 2;  // right child
    // If left child is larger than root
    if (left < n && a[left] > a[largest])
        largest = left;
    // If right child is larger than root
    if (right < n && a[right] > a[largest])
        largest = right;
    // If root is not largest
    if (largest != i) {
        // swap a[i] with a[largest]
        int temp = a[i];
        a[i] = a[largest];
        a[largest] = temp;

        heapify(a, n, largest);
    }
}
/* Function to implement the heap sort */
void heapSort(int a[], int n)
{
    for (int i = n / 2 - 1; i >= 0; i--)
        heapify(a, n, i);
    // One by one extract an element from heap
    for (int i = n - 1; i >= 0; i--) {
        /* Move current root element to end */
        // swap a[0] with a[i]
        int temp = a[0];
        a[0] = a[i];
        a[i] = temp;

        heapify(a, i, 0);
    }
}
/* function to print the array elements */
void printArr(int arr[], int n)
{
    for (int i = 0; i < n; ++i)
    {
        printf("%d", arr[i]);
        printf(" ");
    }
}
int main()
{
    int a[] = {48, 10, 23, 43, 28, 26, 1};
    int n = sizeof(a) / sizeof(a[0]);
    printf("Before sorting array elements are - \n");
    printArr(a, n);
    heapSort(a, n);
    printf("\nAfter sorting array elements are - \n");
    printArr(a, n);
    return 0;
}

Output
Comparison Between Various Sorting Algorithms
Time Complexities of all Sorting Algorithms
The efficiency of an algorithm depends on two parameters:
1. Time Complexity
2. Space Complexity
Time Complexity: Time Complexity is defined as the number of times a
particular instruction set is executed rather than the total time taken. This is
because the total time taken also depends on some external factors like
the compiler used, the processor's speed, etc.
Space Complexity: Space Complexity is the total memory space
required by the program for its execution.
Both are calculated as a function of the input size (n).
One important thing here is that apart from these parameters, the
efficiency of an algorithm also depends upon the nature and size
of the input.
Types Of Time Complexity :
1. Best Time Complexity: Defined by the input for which the algorithm takes
the least or minimum time. In the best case we calculate the lower bound
of an algorithm. Example: In linear search, when the search data is
present at the first location of large data, the best case occurs.
2. Average Time Complexity: In the average case we take all random
inputs and calculate the computation time for all inputs,
and then divide it by the total number of inputs.
3. Worst Time Complexity: Defined by the input for which the algorithm takes a
long or maximum time. In the worst case we calculate the upper bound of
an algorithm. Example: In linear search, when the search data is
present at the last location of large data, the worst case occurs.
Following is a quick revision sheet that you may refer to at the last
minute

Algorithm          Time Complexity                                  Space Complexity
                   Best           Average         Worst            Worst

Selection Sort     Ω(n^2)         θ(n^2)          O(n^2)           O(1)
Bubble Sort        Ω(n)           θ(n^2)          O(n^2)           O(1)
Insertion Sort     Ω(n)           θ(n^2)          O(n^2)           O(1)
Heap Sort          Ω(n log(n))    θ(n log(n))     O(n log(n))      O(1)
Quick Sort         Ω(n log(n))    θ(n log(n))     O(n^2)           O(n)
Merge Sort         Ω(n log(n))    θ(n log(n))     O(n log(n))      O(n)
Bucket Sort        Ω(n+k)         θ(n+k)          O(n^2)           O(n)
Radix Sort         Ω(nk)          θ(nk)           O(nk)            O(n+k)
Count Sort         Ω(n+k)         θ(n+k)          O(n+k)           O(k)
Shell Sort         Ω(n log(n))    θ(n log(n))     O(n^2)           O(1)
Tim Sort           Ω(n)           θ(n log(n))     O(n log(n))      O(n)
Tree Sort          Ω(n log(n))    θ(n log(n))     O(n^2)           O(n)
Cube Sort          Ω(n)           θ(n log(n))     O(n log(n))      O(n)


What is a File ?
A file can be defined as a data structure which stores the sequence of records. Files are
stored in a file system, which may exist on a disk or in the main memory. Files can be
simple (plain text) or complex (specially-formatted).

The collection of files is known as Directory. The collection of directories at the different
levels, is known as File System.

Attributes of the File


1.Name

Every file carries a name by which the file is recognized in the file system. One directory
cannot have two files with the same name.

2.Identifier
Along with the name, Each File has its own extension which identifies the type of the file.
For example, a text file has the extension .txt, A video file can have the extension .mp4.

3.Type

In a File System, the Files are classified in different types such as video files, audio files,
text files, executable files, etc.

4.Location

In the File System, there are several locations on which, the files can be stored. Each file
carries its location as its attribute.

5.Size

The size of the file is one of its most important attributes. By size of the file, we mean the number of bytes occupied by the file in memory.

6.Protection

The Admin of the computer may want the different protections for the different files.
Therefore each file carries its own set of permissions to the different group of Users.

7.Time and Date

Every file carries a time stamp which contains the time and date on which the file is last
modified.

Operations on the File


A file is a collection of logically related data that is recorded on the secondary storage in
the form of sequence of operations. The content of the files are defined by its creator
who is creating the file. The various operations which can be implemented on a file such
as read, write, open and close etc. are called file operations. These operations are
performed by the user by using the commands provided by the operating system. Some
common operations are as follows:
1.Create operation:

This operation is used to create a file in the file system. It is the most widely used
operation performed on the file system. To create a new file of a particular type the
associated application program calls the file system. This file system allocates space to
the file. As the file system knows the format of directory structure, so entry of this new
file is made into the appropriate directory.

2. Open operation:

This operation is the common operation performed on the file. Once the file is created,
it must be opened before performing the file processing operations. When the user
wants to open a file, it provides a file name to open the particular file in the file system.
It tells the operating system to invoke the open system call and passes the file name to
the file system.

3. Write operation:

This operation is used to write information into a file. A write system call is issued that specifies the name of the file and the data to be written to the file. The file length is increased by the amount written, and the file pointer is repositioned after the last byte written.

4. Read operation:

This operation reads the contents from a file. A Read pointer is maintained by the OS,
pointing to the position up to which the data has been read.
5. Re-position or Seek operation:

The seek system call re-positions the file pointers from the current position to a specific
place in the file i.e. forward or backward depending upon the user's requirement. This
operation is generally performed with those file management systems that support
direct access files.

6. Delete operation:

Deleting the file not only deletes all the data stored inside the file, it also frees the disk space occupied by it. In order to delete the specified file, the directory is searched. When the directory entry is located, all the associated file space and the directory entry are released.

7. Truncate operation:

Truncating deletes the contents of a file without deleting the file itself or its attributes. The file is not completely removed; only the information stored inside it is released.

8. Close operation:

When the processing of the file is complete, it should be closed so that all the changes made become permanent and all the resources occupied are released. On closing, the system deallocates all the internal descriptors that were created when the file was opened.

9. Append operation:

This operation adds data to the end of the file.

10. Rename operation:

This operation is used to rename the existing file.


Sorting Algorithm

Sorting Algorithms
A Sorting Algorithm is used to rearrange a given array or list elements according to a
comparison operator on the elements. The comparison operator is used to decide the
new order of element in the respective data structure.

Selection Sort
The selection sort algorithm sorts an array by repeatedly finding the minimum element
(considering ascending order) from unsorted part and putting it at the beginning. The
algorithm maintains two subarrays in a given array.
1) The subarray which is already sorted.
2) Remaining subarray which is unsorted.
In every iteration of selection sort, the minimum element (considering ascending order)
from the unsorted subarray is picked and moved to the sorted subarray.

Following example explains the above steps:


arr[] = 64 25 12 22 11

// Find the minimum element in arr[0...4]


// and place it at beginning
11 25 12 22 64

// Find the minimum element in arr[1...4]


// and place it at beginning of arr[1...4]
11 12 25 22 64

// Find the minimum element in arr[2...4]


// and place it at beginning of arr[2...4]
11 12 22 25 64

// Find the minimum element in arr[3...4]


// and place it at beginning of arr[3...4]
11 12 22 25 64


Bubble Sort
Bubble Sort is the simplest sorting algorithm that works by repeatedly swapping the
adjacent elements if they are in wrong order.
Example:
First Pass:
( 5 1 4 2 8 ) –> ( 1 5 4 2 8 ), Here, algorithm compares the first two elements, and
swaps since 5 > 1.
( 1 5 4 2 8 ) –> ( 1 4 5 2 8 ), Swap since 5 > 4
( 1 4 5 2 8 ) –> ( 1 4 2 5 8 ), Swap since 5 > 2
( 1 4 2 5 8 ) –> ( 1 4 2 5 8 ), Now, since these elements are already in order (8 > 5),
algorithm does not swap them.
Second Pass:
( 1 4 2 5 8 ) –> ( 1 4 2 5 8 )
( 1 4 2 5 8 ) –> ( 1 2 4 5 8 ), Swap since 4 > 2
( 1 2 4 5 8 ) –> ( 1 2 4 5 8 )
( 1 2 4 5 8 ) –> ( 1 2 4 5 8 )
Now, the array is already sorted, but our algorithm does not know if it is completed. The
algorithm needs one whole pass without any swap to know it is sorted.
Third Pass:
( 1 2 4 5 8 ) –> ( 1 2 4 5 8 )
( 1 2 4 5 8 ) –> ( 1 2 4 5 8 )
( 1 2 4 5 8 ) –> ( 1 2 4 5 8 )
( 1 2 4 5 8 ) –> ( 1 2 4 5 8 )

Following is the implementation of Bubble Sort.

Insertion Sort
Insertion sort is a simple sorting algorithm that works the way we sort playing cards in
our hands.
Algorithm
// Sort an arr[] of size n
insertionSort(arr, n)
Loop from i = 1 to n-1.
……a) Pick element arr[i] and insert it into sorted sequence arr[0…i-1]
Example:

Another Example:
12, 11, 13, 5, 6
Let us loop for i = 1 (second element of the array) to 4 (last element of the array)
i = 1. Since 11 is smaller than 12, move 12 and insert 11 before 12
11, 12, 13, 5, 6
i = 2. 13 will remain at its position as all elements in A[0..I-1] are smaller than 13
11, 12, 13, 5, 6

i = 3. 5 will move to the beginning and all other elements from 11 to 13 will move one
position ahead of their current position.
5, 11, 12, 13, 6
i = 4. 6 will move to position after 5, and elements from 11 to 13 will move one position
ahead of their current position.
5, 6, 11, 12, 13

HeapSort
Heap sort is a comparison based sorting technique based on Binary Heap data
structure. It is similar to selection sort where we first find the maximum element and
place the maximum element at the end. We repeat the same process for remaining
element.
What is Binary Heap?
Let us first define a Complete Binary Tree. A complete binary tree is a binary tree in
which every level, except possibly the last, is completely filled, and all nodes are as far
left as possible
A Binary Heap is a Complete Binary Tree where items are stored in a special order
such that value in a parent node is greater(or smaller) than the values in its two children
nodes. The former is called as max heap and the latter is called min heap. The heap
can be represented by binary tree or array.

Why array based representation for Binary Heap?


Since a Binary Heap is a Complete Binary Tree, it can be easily represented as array
and array based representation is space efficient. If the parent node is stored at index I,
the left child can be calculated by 2 * I + 1 and right child by 2 * I + 2 (assuming the
indexing starts at 0).
Heap Sort Algorithm for sorting in increasing order:
1. Build a max heap from the input data.
2. At this point, the largest item is stored at the root of the heap. Replace it with the last
item of the heap followed by reducing the size of heap by 1. Finally, heapify the root of
tree.
3. Repeat above steps while size of heap is greater than 1.
How to build the heap?
Heapify procedure can be applied to a node only if its children nodes are heapified. So
the heapification must be performed in the bottom up order.
Lets understand with the help of an example:

Input data: 4, 10, 3, 5, 1


4(0)
/ \

10(1) 3(2)
/ \
5(3) 1(4)

The numbers in bracket represent the indices in the array


representation of data.

Applying heapify procedure to index 1:


4(0)
/ \
10(1) 3(2)
/ \
5(3) 1(4)

Applying heapify procedure to index 0:


10(0)
/ \
5(1) 3(2)
/ \
4(3) 1(4)
The heapify procedure calls itself recursively to build heap
in top down manner.

Merge Sort
Like QuickSort, Merge Sort is a Divide and Conquer algorithm. It divides input array in
two halves, calls itself for the two halves and then merges the two sorted halves. The
merge() function is used for merging two halves. The merge(arr, l, m, r) is key process
that assumes that arr[l..m] and arr[m+1..r] are sorted and merges the two sorted sub-
arrays into one. See following C implementation for details.
MergeSort(arr[], l, r)
If r > l

1. Find the middle point to divide the array into two halves:
middle m = (l+r)/2
2. Call mergeSort for first half:
Call mergeSort(arr, l, m)
3. Call mergeSort for second half:
Call mergeSort(arr, m+1, r)
4. Merge the two halves sorted in step 2 and 3:
Call merge(arr, l, m, r)
The following diagram shows the complete merge sort process for an example
array {38, 27, 43, 3, 9, 82, 10}. If we take a closer look at the diagram, we can see that
the array is recursively divided in two halves till the size becomes 1. Once the size
becomes 1, the merge processes comes into action and starts merging arrays back till
the complete array is merged.
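Since the text above refers to a C implementation, here is a compact sketch consistent with the MergeSort/merge pseudocode; the example array in main() reuses the {38, 27, 43, 3, 9, 82, 10} example mentioned above.

#include <stdio.h>

/* Merges the two sorted subarrays arr[l..m] and arr[m+1..r] */
void merge(int arr[], int l, int m, int r)
{
    int n1 = m - l + 1, n2 = r - m;
    int L[n1], R[n2];                       /* temporary copies of the two halves */

    for (int i = 0; i < n1; i++) L[i] = arr[l + i];
    for (int j = 0; j < n2; j++) R[j] = arr[m + 1 + j];

    int i = 0, j = 0, k = l;
    while (i < n1 && j < n2)                /* pick the smaller head element each time */
        arr[k++] = (L[i] <= R[j]) ? L[i++] : R[j++];
    while (i < n1) arr[k++] = L[i++];       /* copy any leftovers from the left half */
    while (j < n2) arr[k++] = R[j++];       /* copy any leftovers from the right half */
}

void mergeSort(int arr[], int l, int r)
{
    if (r > l) {
        int m = (l + r) / 2;                /* middle point */
        mergeSort(arr, l, m);               /* sort first half */
        mergeSort(arr, m + 1, r);           /* sort second half */
        merge(arr, l, m, r);                /* merge the two sorted halves */
    }
}

int main(void)
{
    int arr[] = {38, 27, 43, 3, 9, 82, 10};
    int n = sizeof(arr) / sizeof(arr[0]);

    mergeSort(arr, 0, n - 1);

    for (int i = 0; i < n; i++)
        printf("%d ", arr[i]);
    printf("\n");
    return 0;
}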



Searching Algorithm

Searching Algorithms
Searching Algorithms are designed to check for an element or retrieve an element from
any data structure where it is stored. Based on the type of search operation, these
algorithms are generally classified into two categories:
1. Sequential Search: In this, the list or array is traversed sequentially and every
   element is checked. For example: Linear Search.
2. Interval Search: These algorithms are specifically designed for searching in sorted
   data structures. These types of searching algorithms are much more efficient than
   Linear Search as they repeatedly target the center of the search structure and
   divide the search space in half. For example: Binary Search.

Linear Search to find the element “20” in a given list of numbers

Binary Search to find the element “23” in a given list of numbers
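As a small illustration of the two categories, here is a C sketch of linear and binary search. The array contents are assumptions (the figures referred to above are not reproduced); only the searched keys 20 and 23 come from the captions.

#include <stdio.h>

/* Sequential search: check every element until the key is found */
int linear_search(const int a[], int n, int key)
{
    for (int i = 0; i < n; i++)
        if (a[i] == key)
            return i;
    return -1;                  /* not found */
}

/* Interval search: repeatedly halve a sorted range */
int binary_search(const int a[], int n, int key)
{
    int low = 0, high = n - 1;
    while (low <= high) {
        int mid = low + (high - low) / 2;
        if (a[mid] == key)      return mid;
        else if (a[mid] < key)  low = mid + 1;
        else                    high = mid - 1;
    }
    return -1;                  /* not found */
}

int main(void)
{
    int a[] = {2, 5, 8, 12, 16, 20, 23, 38, 56, 72};   /* sorted list, values assumed */
    int n = sizeof(a) / sizeof(a[0]);

    printf("20 found at index %d by linear search\n", linear_search(a, n, 20));
    printf("23 found at index %d by binary search\n", binary_search(a, n, 23));
    return 0;
}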



Introduction to Files

What are Files & Types of Files? Types of File Operations.

Files: As we know, computers are used for storing information permanently; files are used for storing the data of users for a long period of time. Files can contain any type of information: they can store text, images or pictures, or data in any format. So there must be some mechanism for storing the information, accessing the information and performing operations on the files.
There are many files, each with its own type and its own name. When we store a file in the system, we must specify the name and the type of the file. The name of the file can be any valid name, and the type links the file with an application.
So we can say that every file has a type, meaning every file belongs to a particular type of application software. When we provide a name to a file, we also specify the extension of the file, because the system will open the contents of the file in the associated application software. For example, if there is a file which contains some paintings, it will be opened in the Paint software.

1) Ordinary Files or Simple Files: An ordinary file may belong to any type of application, for example Notepad, Paint, a C program, songs etc. All the files created by a user are ordinary files. Ordinary files are used for storing information about user programs. With the help of ordinary files we can store information that contains text, a database, an image or any other type of data.
2) Directory Files: The files that are stored inside a particular directory or folder are directory files, because they belong to a directory and are stored inside it. For example, a folder named Songs which contains many songs: all the files of those songs are known as directory files.
3) Special Files: Special files are those which are not created by the user, or which are necessary to run the system; they are created by the system itself. All the files of an operating system, such as Windows, are special files. There are many types of special files: system files, Windows files, input-output files. System files are stored in the system using the .sys extension.
4) FIFO Files: First-In-First-Out files are used by the system for executing processes in some order, meaning the requests that come first will be executed first, and the system maintains a sequence order. When a user requests a service from the system, the requests of the users are arranged into files, and all the requests will be performed by the system in the order in which they were received. This is called First-In-First-Out or FIFO order.

Types of File Operations


Files are not made just for reading their contents; we can also perform some other
operations on files, as explained below:

1) Read Operation: Meant to read the information stored in a file.
2) Write Operation: For inserting some new contents into a file.
3) Rename: Change the name of a file.
4) Copy: Copy the file from one location to another.
5) Sort: Arrange the contents of the file.
6) Move or Cut: Move the file from one place to another.
7) Delete a file.
8) Execute: Run the file so that it displays its output.
We can also link a file with another file. These are called symbolic links; in a symbolic link, files are linked by using some text or some alias.

File Organization

What is File?
File is a collection of records related to each other. The file size is limited by the
size of memory and storage medium.

There are two important features of file:

1. File Activity
2. File Volatility

File activity specifies the percentage of actual records processed in a single run.

File volatility addresses the properties of record changes. It helps to increase the efficiency of disk design over tape.

File Organization

File organization ensures that records are available for processing. It is used to
determine an efficient file organization for each base relation.

For example, if we want to retrieve employee records in alphabetical order of name, sorting the file by employee name is a good file organization. However, if we want to retrieve all employees whose marks are in a certain range, a file ordered by employee name would not be a good file organization.

Types of File Organization


There are three types of organizing the file:

1. Sequential access file organization


2. Direct access file organization
3. Indexed sequential access file organization

1. Sequential access file organization

 Storing and sorting in contiguous blocks within files on tape or disk is called sequential access file organization.
 In sequential access file organization, all records are stored in a sequential order. The records are arranged in the ascending or descending order of a key field.


 Sequential file search starts from the beginning of the file and the records can
be added at the end of the file.
 In sequential file, it is not possible to add a record in the middle of the file
without rewriting the file.
Advantages of sequential file

 It is simple to program and easy to design.
 Sequential files make the best use of storage space.
Disadvantages of sequential file

 Accessing a sequential file is a time-consuming process.
 It has high data redundancy.
 Random searching is not possible.
2. Direct access file organization
 Direct access file is also known as random access or relative file organization.
 In a direct access file, all records are stored on a direct access storage device (DASD), such as a hard disk. The records are randomly placed throughout the file.
 The records do not need to be in sequence because they are updated directly and rewritten back in the same location.
 This file organization is useful for immediate access to a large amount of information. It is used in accessing large databases.
 It is also called hashing.
Advantages of direct access file organization

 Direct access file helps in online transaction processing system (OLTP) like
online railway reservation system.
 In direct access file, sorting of the records are not required.
 It accesses the desired records immediately.
 It updates several files quickly.
 It has better control over record allocation.
Disadvantages of direct access file organization

 Direct access file does not provide a back up facility.
 It is expensive.
 It is less efficient in the use of storage space as compared to a sequential file.


3. Indexed sequential access file organization
 Indexed sequential access file combines both sequential file and direct access file organization.
 In an indexed sequential access file, records are stored randomly on a direct access device such as a magnetic disk, by a primary key.
 This file may have multiple keys. These keys can be alphanumeric; the key on which the records are ordered is called the primary key.
 The data can be accessed either sequentially or randomly using the index. The index is stored in a file and read into memory when the file is opened.
Advantages of Indexed sequential access file organization

 In indexed sequential access file, sequential file and random file access is
possible.
 It accesses the records very fast if the index table is properly organized.
 The records can be inserted in the middle of the file.
 It provides quick access for sequential and direct processing.
 It reduces the degree of the sequential search.
Disadvantages of Indexed sequential access file organization

 Indexed sequential access file requires unique keys and periodic reorganization.
 Indexed sequential access file takes a longer time to search the index for data access or retrieval.
 It requires more storage space.
 It is expensive because it requires special software.
 It is less efficient in the use of storage space as compared to other file
organizations.



FILES AND FILE ORGANIZATION
TYPES OF STORAGE

Primary Storage Secondary Storage


• Faster Access • Slower Access
• Expensive • Cheaper
• Lesser storage capacity • Greater Storage Capacity
• Temporary Storage • Permanent Storage
INTRODUCING FILES

• Secondary Storage Structure


• Used for permanent storage
• It’s a collection of records or a stream of bytes
• Every Record is a collection of fields
• A particular field is chosen as a Key
• Records are organised in file by using the key. Primary and secondary keys.
AN EXAMPLE

• Consider a student database.


• Every student has a unique record
• Record has details of student-i.e name, Student ID etc. These are the fields.
• The unique key can be the Student ID. The records can be organised in the file on basis
of student ID.
WHY ARE FILES NECESSARY

• Sometimes data is too large to be stored in main memory.


• Maintaining permanent record is possible only by using a secondary storage. Hence
files.
PHYSICAL AND LOGICAL FILES

• Physical Files:
A collection of bits stored in the secondary storage device
• Logical File:
A channel that connects the program to the physical file (a stream).

An example
FILE *out;
out = fopen("sample.txt", "w");
Here out is the logical file and sample.txt is the physical file.
BASIC OPERATIONS IN FILES

• Opening a File: A logical file is associated with the physical file


• Closing a File: The logical file associated with the physical file is freed.
fclose(file pointer)
• Reading from file: Data present in physical file is read by using the logical file
• Writing to a File: Data can be written to physical file by using the logical file
FILE POSITION

• Every logical file has a file position pointer.


• When we open a new stream the position pointer is set to beginning of the file.
• As data is read or written the file position pointer is moved accordingly.
OPERATIONS ON FILE POINTER

• To move file pointer to required position.


fseek(file pointer,offset, position);
• To display current location of pointer.
long position=ftell(pointer);
• To check for end of file
while(!feof(pointer));
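A small C sketch tying these calls together is shown below; it reuses the sample.txt name from the earlier example, and the text written to the file is an assumption for illustration.

#include <stdio.h>

int main(void)
{
    /* Opening a file: the logical file 'out' is associated with the physical file sample.txt */
    FILE *out = fopen("sample.txt", "w");
    if (out == NULL)
        return 1;
    fputs("hello files\n", out);    /* Writing to the file through the logical file */
    fclose(out);                    /* Closing the file: the logical file is freed */

    FILE *in = fopen("sample.txt", "r");
    if (in == NULL)
        return 1;

    fseek(in, 0, SEEK_END);         /* move the file position pointer to the end */
    long size = ftell(in);          /* current location of the pointer = file size in bytes */
    printf("file size: %ld bytes\n", size);

    fseek(in, 0, SEEK_SET);         /* move the pointer back to the beginning */
    int ch;
    while (!feof(in)) {             /* check for end of file */
        ch = fgetc(in);
        if (ch != EOF)
            putchar(ch);            /* Reading from the file */
    }
    fclose(in);
    return 0;
}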
TYPES OF FILES

• Sequential File - Stored in the order entered


• Random Access Files - A record is accessed using an index (hashing).
• Direct Access Files - The records are stored based on their relative position with respect
to first record.
Record with key 50 is placed at location 50
SEQUENTIAL ACCESS FILES

• Records are stored in the order entered.


• Used when all the records have to be processed.
• Complexity for searching O(n)
RANDOM ACCESS FILES

• A record is accessed using an index.


• The index of record position in file has to be maintained in the main memory.
• The Index can be created using hashing.
• The search complexity is low; it depends on the indexing method used.
• Disadvantage: while handling very large databases it is not always possible to maintain
the index in main memory.
INDEXED ACCESS FILES

• A record is accessed using an index that maps key values to record positions.

• The index is kept as a separate structure (an index table or index file) and is read into
memory when the file is opened.
• Through the index, the records can be accessed both sequentially and directly.
• The search complexity depends on how the index is organised.
• Disadvantage: the index requires periodic reorganisation, and for very large databases it
may not be possible to keep the whole index in main memory.
DIRECT ACCESS FILES

• The records are stored based on their relative position with respect to the first record.
• A record with key 50 is placed at location 50 (see the sketch below).
• The search complexity is O(1).
• The disadvantage is that a lot of space can be wasted.
• For example, if no record has key 100, position 100 is wasted.
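A sketch of direct access under the stated assumptions: records are of fixed length and the record with key k occupies relative position k. The file name records.dat and the record layout are hypothetical.

#include <stdio.h>

struct record
{
    int  key;
    char data[46];
};

/* Read the record whose key is 'key'. Its byte offset is simply
   key * sizeof(struct record), so the search cost is O(1). */
int read_record(FILE *fp, int key, struct record *out)
{
    long offset = (long)key * (long)sizeof(struct record);
    if (fseek(fp, offset, SEEK_SET) != 0)
        return 0;
    return fread(out, sizeof(struct record), 1, fp) == 1;
}

int main(void)
{
    FILE *fp = fopen("records.dat", "rb");
    if (fp == NULL)
        return 1;

    struct record r;
    if (read_record(fp, 50, &r))      /* record with key 50 at location 50 */
        printf("%d %s\n", r.key, r.data);

    fclose(fp);
    return 0;
}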
FILE ORGANIZATION AND STRUCTURE

• "File organization" refers to the logical relationships among the various records that
constitute the file, particularly with respect to the means of identification and access to
any specific record. "File structure" refers to the format of the label and data blocks and
of any logical record control information.
• The organization of a given file may be sequential, relative, or indexed.
FILE ORGANIZATION AND STRUCTURE

• Sequential Files
• A sequential file is organized such that each record in the file except the first has a unique predecessor record and each record
except the last has a unique successor record. These predecessor-successor relationships are established by the order in which
the records are written when the file is created. Once established, the predecessor-successor relationships do not change except
in the case where records are added to the end of the file.
• A file that is organized sequentially must be accessed sequentially.
• Variable- or Fixed-Length Sequential Files
• Sequential files may be recorded in variable-length or fixed-length record form. If a file consists of variable-length records, each
logical record is preceded by control information that indicates the size of the logical record. The control information is recorded
when the logical record is written, based on the size of the internal record specified in the WRITE statement, and is subsequently
used by the input-output control system to determine the location of successive logical records. If a file consists of fixed-length
records, the record size is established at the time the file is opened and is the same for every logical record on the file. Therefore,
there is no need to record any control information with the logical record.
CONT…….

• Relative Files
• A relative file, which must be allocated to random mass storage file space in the execution activity, is organized
such that each record location is uniquely identified by an integer value greater than zero which specifies ordinal
position on the file. In the RELATIVE KEY phrase of the SELECT clause, the source program specifies a numeric
integer data item as the relative key item.
• Indexed Files
• An indexed file, which must be allocated in the execution activity to two or more random mass storage files (one
for the index, and one or more for the data), is organized such that each record is uniquely identified by the value
of a key within the record. In the RECORD KEY phrase of the SELECT clause, the source program specifies one
of the data items within one of the records associated with the file as the record key data item. Each attempt to
access a record based on the record key item causes a search of the index file for a key that matches the current
contents of the record key data item in the file record area. The matching index record in turn points to the
location of the associated data record.
FILES
• File: A file is a collection of related data that is treated as a single unit on a peripheral device, for example a text document in word
processing.
• TYPES OF FILES:
• Master file: it contains records of permanent data types. Master files are created at the time when you set up your business. If you
wish to convert your company into a computerised one, you need to create master files, which can be created by taking your manual
file folders and keying the data onto storage devices, for example the name of the customer, DOB, gender, etc. These are permanent data types.
• Transaction file: it contains data which is used to update the records of the master file, for example the address of the customer. A
transaction file is a collection of transaction records. The data in transaction files is used to update the master files, which contain the
data about the subjects of the organization (customers, employees, vendors, etc.). Transaction files also serve as audit trails and
history for the organization. Where before they were transferred to offline storage after some period of time, they are increasingly
being kept online for routine analyses.
• A report is a textual work (usually of writing, speech, television, or film) made with the specific intention of relaying information or
recounting certain events in a widely presentable form.
• Written reports are documents which present focused, salient content to a specific audience. Reports are often used to display the
result of an experiment, investigation, or inquiry. The audience may be public or private, an individual or the public in general.
Reports are used in government, business, education, science, and other fields.
• A report file is a file that describes how a report is printed.
FILES

A work file is a temporary file containing documents, drafts, records, rough notes, and sketches employed
in the analysis or preparation of plans, projects, or other documents.
Program Files is a folder in Microsoft Windows operating systems where applications that are not part of
the operating system are installed by default.
A text file (sometimes spelled "textfile": an old alternate name is "flatfile") is a kind of computer file that is
structured as a sequence of lines of electronic text. A text file exists within a computer file system. The end
of a text file is often denoted by placing one or more special characters, known as an end-of-file marker,
after the last line in a text file.
"Text file" refers to a type of container, while plain text refers to a type of content. Text files can contain
plain text, but they are not limited to such.
At a generic level of description, there are two kinds of computer files: text files and binary files



Binary Tree Traversing

Traversal technique for Binary Tree


Binary Tree
A binary tree is a finite collection of elements; it is made up of nodes, where each node
contains a left pointer, a right pointer, and a data element. The root pointer points to the
topmost node in the tree. When the binary tree is not empty, it has a root element and the
remaining elements are partitioned into two binary trees, which are called the left and
right subtrees of the root.

Traversing in the Binary Tree


Tree traversal is the process of visiting each node in the tree exactly once. Visiting each
node should be done in a systematic manner; if the search results in a visit to all the
vertices, it is called a traversal. There are basically three traversal techniques for a
binary tree:

1. Preorder traversal
2. Inorder traversal
3. Postorder traversal

1) Preorder traversal
To traverse a binary tree in preorder, following operations are carried out:

1. Visit the root.


2. Traverse the left sub tree of root.
3. Traverse the right sub tree of root.

Note: Preorder traversal is also known as NLR traversal.

Algorithm:

Algorithm preorder(t)
/* t is a binary tree. Each node of t has three fields: lchild, data, and rchild. */
{
  If t != 0 then
  {
    Visit(t);
    Preorder(t->lchild);
    Preorder(t->rchild);
  }
}

Example: Let us consider the given binary tree,

Therefore, the preorder traversal of the above tree will be: 7, 1, 0, 3, 2, 5, 4, 6, 9, 8, 10

2) Inorder traversal
To traverse a binary tree in inorder traversal, following operations are carried
out:

1. Traverse the left sub tree.

2. Visit the root.
3. Traverse the right sub tree.

Note: Inorder traversal is also known as LNR traversal.



Algorithm:

Algorithm inorder(t)
/* t is a binary tree. Each node of t has three fields: lchild, data, and rchild. */
{
  If t != 0 then
  {
    Inorder(t->lchild);
    Visit(t);
    Inorder(t->rchild);
  }
}

Example: Let us consider a given binary tree.

Therefore the inorder traversal of above tree will be: 0,1,2,3,4,5,6,7,8,9,10

3) Postorder traversal
To traverse a binary tree in postorder traversal, following operations are
carried out:

1. Traverse the left sub tree of root.

2. Traverse the right sub tree of root.
3. Visit the root.

Note: Postorder traversal is also known as LRN traversal.

Algorithm:

Algorithm postorder(t)
/* t is a binary tree. Each node of t has three fields: lchild, data, and rchild. */
{
  If t != 0 then
  {
    Postorder(t->lchild);
    Postorder(t->rchild);
    Visit(t);
  }
}

Example: Let us consider a given binary tree.

Therefore, the postorder traversal of the above tree will be: 0, 2, 4, 6, 5, 3, 1, 8, 10, 9, 7
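The three traversal orders listed above can be checked with a short C program that builds the same example tree (root 7, shaped as implied by the listed orders). The node layout and the helper name newNode are assumptions of this sketch.

#include <stdio.h>
#include <stdlib.h>

struct node
{
    int data;
    struct node *lchild, *rchild;
};

struct node *newNode(int data)
{
    struct node *n = malloc(sizeof *n);
    n->data = data;
    n->lchild = n->rchild = NULL;
    return n;
}

void preorder(struct node *t)
{
    if (t) { printf("%d ", t->data); preorder(t->lchild); preorder(t->rchild); }
}

void inorder(struct node *t)
{
    if (t) { inorder(t->lchild); printf("%d ", t->data); inorder(t->rchild); }
}

void postorder(struct node *t)
{
    if (t) { postorder(t->lchild); postorder(t->rchild); printf("%d ", t->data); }
}

int main(void)
{
    /* Build the example tree used in the three examples above */
    struct node *root = newNode(7);
    root->lchild = newNode(1);
    root->rchild = newNode(9);
    root->lchild->lchild = newNode(0);
    root->lchild->rchild = newNode(3);
    root->lchild->rchild->lchild = newNode(2);
    root->lchild->rchild->rchild = newNode(5);
    root->lchild->rchild->rchild->lchild = newNode(4);
    root->lchild->rchild->rchild->rchild = newNode(6);
    root->rchild->lchild = newNode(8);
    root->rchild->rchild = newNode(10);

    preorder(root);  printf("\n");   /* 7 1 0 3 2 5 4 6 9 8 10 */
    inorder(root);   printf("\n");   /* 0 1 2 3 4 5 6 7 8 9 10 */
    postorder(root); printf("\n");   /* 0 2 4 6 5 3 1 8 10 9 7 */
    return 0;
}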



Binary Tree

BINARY TREE

A tree is said to be binary tree when,

1. A binary tree has a root node. It may not have any child nodes(0 child nodes,
NULL tree).

2. A root node may have one or two child nodes. Each node forms a binary tree
itself.

3. The number of child nodes cannot be more than two.

4. It has a unique path from the root to every other node.

There are four types of binary tree:

1. Full Binary Tree


2. Complete Binary Tree
3. Skewed Binary Tree
4. Extended Binary Tree

1. Full Binary Tree


 If each node of binary tree has either two children or no child at all, is said to
be a Full Binary Tree.
 Full binary tree is also called as Strictly Binary Tree.


 Every node in the tree has either 0 or 2 children.


 Full binary tree is used to represent mathematical expressions.
2. Complete Binary Tree
 If all levels of the tree are completely filled except the last level, and the last level
has all keys as far left as possible, the tree is said to be a Complete Binary Tree.
 When every level, including the last, is completely filled, the tree is also called a
Perfect Binary Tree.

 In a perfect binary tree, every internal node has exactly two children and all leaf
nodes are at the same level.
 For example, at level 2 there must be 2² = 4 nodes and at level 3 there must
be 2³ = 8 nodes.
3. Skewed Binary Tree
 A tree which is dominated by left child nodes or by right child nodes is said to be
a Skewed Binary Tree.
 In a skewed binary tree, all nodes except one have only one child node. The
remaining node has no child.


 In a left skewed tree, most of the nodes have the left child without
corresponding right child.
 In a right skewed tree, most of the nodes have the right child without
corresponding left child.
4. Extended Binary Tree
 Extended binary tree consists of replacing every null subtree of the original tree
with special nodes.
 Empty circle represents internal node and filled circle represents external node.
 The nodes from the original tree are internal nodes and the special nodes are
external nodes.
 Every internal node in the extended binary tree has exactly two children and
every external node is a leaf. The resulting tree is a 2-tree, i.e. a binary tree in
which every node has either 0 or 2 children.

AVL Tree

 AVL tree is a height balanced tree.


 It is a self-balancing binary search tree.
 AVL tree is another balanced binary search tree.
 It was invented by Adelson-Velskii and Landis.
 AVL trees have faster retrieval.
 It takes O(log n) time for insertion and deletion operations.
 In an AVL tree, the difference between the heights of the left and right subtrees
cannot be more than one for any node.

 The above tree is AVL tree because the difference between heights of left and
right subtrees for every node is less than or equal to 1.

 The above tree is not AVL because the difference between heights of left and
right subtrees for 9 and 19 is greater than 1.
 It checks the height of the left and right subtree and assures that the
difference is not more than 1. The difference is called balance factor.
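The balance factor can be computed recursively. A minimal sketch, assuming the same lchild/rchild node layout used elsewhere in these notes; it is not a full AVL implementation (no rotations are shown).

#include <stddef.h>

struct node
{
    int data;
    struct node *lchild, *rchild;
};

/* Height counted in edges: an empty tree has height -1, a leaf has height 0. */
int height(struct node *t)
{
    if (t == NULL)
        return -1;
    int hl = height(t->lchild);
    int hr = height(t->rchild);
    return (hl > hr ? hl : hr) + 1;
}

/* Balance factor = height(left subtree) - height(right subtree).
   The tree is an AVL tree when every node has a balance factor of -1, 0 or +1. */
int balance_factor(struct node *t)
{
    return height(t->lchild) - height(t->rchild);
}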



Huffman’s Algorithm
Path Lengths; Huffman Algorithm
Extended binary tree or 2-tree is a binary
tree T in which each node has either 0 or 2
children.
The nodes with 0 children are called
external nodes, and the nodes with 2
children are called internal nodes.
In the diagram shown on the next slide, the internal nodes are denoted by
circles and the external nodes are denoted by squares.
In any 2-tree, the number N_E of external nodes is 1 more than the
number N_I of internal nodes; that is,
N_E = N_I + 1
Example: N_I = 6, so N_E = N_I + 1 = 7.
What is Huffman’s Algorithm?
 Huffman’s algorithm is a method for building
an extended binary tree with minimum weighted path
length from a set of given weights
It is a technique for compressing data.
What is Huffman’s Algorithm?

Huffman coding is a lossless data compression


algorithm. The idea is to assign variable-length
codes to input characters, lengths of the
assigned codes are based on the frequencies of
corresponding characters.

The most frequent character gets the smallest


code and the least frequent character gets the
largest code.
Tree Construction Rules

• There are mainly two major parts in Huffman Coding

1) Build a Huffman Tree from input characters.


2) Traverse the Huffman Tree and assign codes to characters.

• Steps to build Huffman Tree


1. Create a leaf node for each unique character.
2. Extract the two nodes with the minimum frequency.
3. Create a new internal node with frequency equal to the sum of the two nodes' frequencies.
4. Repeat steps 2 and 3 until the heap contains only one node. The remaining node is the
root node and the tree is complete.
Huffman Tree Construction (example)

Characters and frequencies: A = 3, C = 5, E = 8, H = 2, I = 7.

Combining the two lowest-frequency nodes at every step: first A (3) and H (2) are merged into
a node of weight 5; that node and C (5) are merged into a node of weight 10; E (8) and I (7)
are merged into a node of weight 15; finally the nodes of weight 10 and 15 are merged into
the root of weight 25.

Reading the 0/1 labels from the root down gives the codes:
E = 01, I = 00, C = 10, A = 111, H = 110

Exercise: try to build the Huffman tree for the following characters and frequencies.
A = 22, B = 5, C = 11, D = 19, E = 2, F = 11, G = 25, H = 5
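A compact C sketch of the construction. Instead of a heap it uses a simple array scan to find the two minimum-frequency nodes, which is enough for small alphabets; the character set and frequencies are taken from the example above, and everything else (names, array sizes) is an assumption of this sketch. The 0/1 labelling is arbitrary, so the printed codes may differ from the example while the code lengths agree.

#include <stdio.h>
#include <stdlib.h>

struct hnode
{
    char symbol;                  /* meaningful only for leaves */
    int  freq;
    struct hnode *left, *right;
};

static struct hnode *make(char s, int f, struct hnode *l, struct hnode *r)
{
    struct hnode *n = malloc(sizeof *n);
    n->symbol = s; n->freq = f; n->left = l; n->right = r;
    return n;
}

/* Index of the node with the smallest frequency among nodes[0..n-1]. */
static int min_index(struct hnode **nodes, int n)
{
    int best = 0;
    for (int i = 1; i < n; i++)
        if (nodes[i]->freq < nodes[best]->freq)
            best = i;
    return best;
}

static struct hnode *build_huffman(const char *syms, const int *freqs, int n)
{
    struct hnode *nodes[64];
    for (int i = 0; i < n; i++)
        nodes[i] = make(syms[i], freqs[i], NULL, NULL);

    while (n > 1)
    {
        /* Extract the two nodes with the minimum frequency ... */
        int a = min_index(nodes, n);
        struct hnode *x = nodes[a];
        nodes[a] = nodes[--n];

        int b = min_index(nodes, n);
        struct hnode *y = nodes[b];

        /* ... and replace them by an internal node whose frequency
           equals the sum of the two frequencies. */
        nodes[b] = make('*', x->freq + y->freq, x, y);
    }
    return nodes[0];              /* the remaining node is the root */
}

/* Walk the tree and print the code of every leaf (left = 0, right = 1). */
static void print_codes(struct hnode *t, char *code, int depth)
{
    if (t->left == NULL && t->right == NULL)
    {
        code[depth] = '\0';
        printf("%c = %s\n", t->symbol, code);
        return;
    }
    code[depth] = '0'; print_codes(t->left,  code, depth + 1);
    code[depth] = '1'; print_codes(t->right, code, depth + 1);
}

int main(void)
{
    char syms[]  = { 'A', 'C', 'E', 'H', 'I' };
    int  freqs[] = {  3,   5,   8,   2,   7  };
    char code[64];

    struct hnode *root = build_huffman(syms, freqs, 5);
    print_codes(root, code, 0);
    return 0;
}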
Binary Search Tree

For a binary tree to be a binary search tree, the data of all the nodes in the left sub-tree of the root
node should be ≤ the data of the root. The data of all the nodes in the right subtree of the root node
should be > the data of the root.

Example

In Fig. 1, consider the root node with data = 10.

 Data in the left sub tree is: [5,1,6]


 All data elements are < 10
 Data in the right sub tree is: [19,17]
 All data elements are > 10

Also, considering the root node with data=5, its children also satisfy the specified ordering. Similarly,
the root node with data=19 also satisfies this ordering. Applied recursively, all subtrees satisfy the
left and right subtree ordering.

The tree is known as a Binary Search Tree or BST.

Traversing the tree

There are mainly three types of tree traversals.

Pre-order traversal

In this traversal technique the traversal order is root-left-right i.e.

 First, process the data of the root node

 Then, traverse the left sub tree completely
 Finally, traverse the right sub tree

void preorder(struct node *root)
{
    if (root)
    {
        printf("%d ", root->data);   //Print root->data
        preorder(root->left);        //Go to left subtree
        preorder(root->right);       //Go to right subtree
    }
}

Post-order traversal

In this traversal technique the traversal order is left-right-root.

 First, traverse the left subtree completely
 Then, traverse the right subtree
 Finally, process the data of the root node

void postorder(struct node*root)


{
if(root)
{
postorder(root->left); //Go to left sub tree
postorder(root->right); //Go to right sub tree
printf("%d ", root->data); //Print root->data
}
}

In-order traversal

In in-order traversal, do the following:

 First process left subtree (before processing root node)


 Then, process current root node
 Process right subtree

void inorder(struct node*root)


{
if(root)
{
inorder(root->left); //Go to left subtree
printf("%d ", root->data); //Print root->data
inorder(root->right); //Go to right subtree
}
}


Consider the in-order traversal of a sample BST

 The 'inorder( )' procedure is called with root equal to node with data=10
 Since the node has a left subtree, 'inorder( )' is called with root equal to node with data=5
 Again, the node has a left subtree, so 'inorder( )' is called with root=1

The function call stack is as follows:

 Node with data=1 does not have a left subtree. Hence, this node is processed.
 Node with data=1 does not have a right subtree. Hence, nothing is done.
 inorder(1) gets completed and this function call is popped from the call stack.

The stack is as follows:

 Left subtree of node with data=5 is completely processed. Hence, this node gets processed.
 Right subtree of this node with data=5 is non-empty. Hence, the right subtree gets processed
now. 'inorder(6)' is then called.

Note

'inorder(6)' is only equivalent to saying inorder(pointer to node with data=6). The notation has been
used for brevity.

The function call stack is as follows:


Again, the node with data=6 has no left subtree; therefore, it can be processed. It also has no
right subtree, so 'inorder(6)' is then completed.

Both the left and right subtrees of node with data=5 have been completely processed.
Hence, inorder(5) is then completed.

 Now, node with data=10 is processed


 Right subtree of this node gets processed in a similar way as described until step 10
 After right subtree of this node is completely processed, entire traversal of the BST is
complete

The order in which BST in Fig. 1 is visited is: 1, 5, 6, 10, 17, 19. The in-order traversal of a BST
gives a sorted ordering of the data elements that are present in the BST. This is an important
property of a BST.

Insertion in BST


Consider the insertion of data=20 in the BST.

Algorithm

Compare data of the root node and element to be inserted.

1. If the data of the root node is greater than or equal to the element, and a left subtree exists,
then repeat step 1 with root = root of the left subtree. Else, insert the element as the left child of the current root.
2. If the data of the root node is smaller than the element, and a right subtree exists, then repeat step 2 with
root = root of the right subtree. Else, insert the element as the right child of the current root.
Implementation

struct node* insert(struct node* root, int data)
{
    if (root == NULL)        //If the tree is empty, return a new, single node
        return newNode(data);
    else
    {
        //Otherwise, recur down the tree
        if (data <= root->data)
            root->left = insert(root->left, data);
        else
            root->right = insert(root->right, data);
        //return the (unchanged) root pointer
        return root;
    }
}
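The insert routine above relies on a newNode helper that is not shown. Below is a minimal sketch of that helper, together with a small driver that inserts the values from the example BST (10, 5, 1, 6, 19, 17) plus the new element 20 and prints the in-order traversal, which comes out sorted. The node layout matches the traversal code earlier in this section; the rest is an assumption of the sketch.

#include <stdio.h>
#include <stdlib.h>

struct node
{
    int data;
    struct node *left, *right;
};

/* Allocate a new leaf node holding 'data'. */
struct node *newNode(int data)
{
    struct node *n = malloc(sizeof *n);
    n->data = data;
    n->left = n->right = NULL;
    return n;
}

struct node *insert(struct node *root, int data)
{
    if (root == NULL)              /* empty spot found: place the node here */
        return newNode(data);
    if (data <= root->data)
        root->left = insert(root->left, data);
    else
        root->right = insert(root->right, data);
    return root;                   /* root itself is unchanged */
}

void inorder(struct node *root)
{
    if (root)
    {
        inorder(root->left);
        printf("%d ", root->data);
        inorder(root->right);
    }
}

int main(void)
{
    int keys[] = { 10, 5, 1, 6, 19, 17, 20 };
    struct node *root = NULL;

    for (int i = 0; i < 7; i++)
        root = insert(root, keys[i]);

    inorder(root);                 /* prints: 1 5 6 10 17 19 20 */
    printf("\n");
    return 0;
}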

