MCA Data Structures With Algorithms 15
Names of Sub-Units
Introduction, Data Hierarchy, File Attributes, Text Files, Binary Files, Basic File Operations, File
Organizations and Indexing
Overview
This unit begins by discussing the concept of files, file organization and data hierarchy.
Next, the unit discusses file attributes, text files and binary files. Further, the unit explains the
basic file operations. Towards the end, the unit discusses file organizations and indexing.
Learning Objectives
http://cre8te.co.uk/wp-content/uploads/2014/07/Files-And-Folders-Updated-June-2015.pdf
15.1 INTRODUCTION
A file is a collection of records. It is used for storing large amounts of information on devices
other than the internal memory of the computer; that is, the records are kept in secondary storage.
A file should be organized so that operations on it can be carried out efficiently, given the features of
the secondary storage device that holds the file. Operations on a file are to insert and delete records,
update or process records and search for records. These operations also apply to lists, trees, arrays, list
structures and complex lists. The time spent managing the information should result in fast operations,
and this efficiency should be weighed against the care needed to maintain the organized data structure.
When high-level language programs execute, their operations treat files as data. These operations
include traversing and processing all records as well as individual records chosen in random order.
The primary function of a file system is to offer storage facilities and to enable files to be searched
conveniently so that records may be retrieved sequentially.
All data has its own place in a data hierarchy, starting at a comprehensive top level and continuing
down to a definite bottom level. Suppose, for example, that someone is looking for a video game title in a database.
The video game console type comes first, followed by the game creator, the genre, the first letter of the
game’s name, and lastly the game itself. This method of cataloguing data makes it easier to locate. It
also makes it easier for the database to process new data by ensuring that each datum is recorded only in the
appropriate category.
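The video game example can be sketched as a nested dictionary in which each level of the hierarchy is one level of nesting. This is only an illustration: the language (Python), the function names `find_title` and `add_title`, and the console, creator and title names are all hypothetical and do not come from the text.

```python
# Hypothetical catalogue: console -> creator -> genre -> first letter -> titles
catalogue = {
    "ConsoleA": {
        "StudioX": {
            "Racing": {
                "S": ["Speed Lanes"],
                "T": ["Turbo Track"],
            },
        },
    },
}


def find_title(catalogue, console, creator, genre, title):
    """Walk the hierarchy from the top level down to the title itself."""
    level = catalogue
    for key in (console, creator, genre, title[0].upper()):
        level = level.get(key)
        if level is None:
            return False  # this branch of the hierarchy does not exist
    return title in level


def add_title(catalogue, console, creator, genre, title):
    """Record a new title only in its own category, creating levels as needed."""
    level = catalogue
    for key in (console, creator, genre):
        level = level.setdefault(key, {})
    level.setdefault(title[0].upper(), []).append(title)


add_title(catalogue, "ConsoleA", "StudioX", "Puzzle", "Block Drop")
```

Because each lookup narrows the search one level at a time, only the matching branch is ever inspected.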
A file is a data structure that contains a series of records in a logical order. Files are kept in a file system,
which might be located on a drive or in main memory. Files may be simple (plain text) or complicated
(specially formatted). The term “directory” refers to a group of files.
The file system is a collection of directories organised at various levels as shown in Figure 1:
Figure 1 (levels shown): Data, Files, Directory, File System
Execute file: This allows or disallows the execution of executable files.
List folder: This allows or disallows viewing of the folder’s file names and subfolder names. List
Folder affects only the contents of the folder; it has no bearing on whether the folder on which you are
setting the permission will itself be listed.
Read data: This allows or disallows viewing of data in files.
Read attributes: This allows or disallows viewing of a file’s or folder’s attributes, such as “read-only”
and “hidden”.
Read extended attributes: This allows or disallows viewing of the extended attributes of a file or folder.
Extended attributes are defined by programs and may vary by program.
Create files: This allows or disallows creating files within the folder.
Write data: This allows or disallows making changes to a file and overwriting existing content.
Create folders: This allows or disallows creating subfolders within the folder.
Append data: This allows or disallows making changes to the end of the file, but not changing,
deleting or overwriting existing data.
Write attributes: This allows or disallows changing the attributes of a file or folder, for example, “read-only”
or “hidden”.
Write extended attributes: This allows or disallows modifying a file’s or folder’s extended attributes.
Extended attributes are defined by programs and may differ from one program to the next. The
Write Extended Attributes permission does not grant the ability to create or delete files or folders;
rather, it grants the ability to modify the extended attributes of an existing file or folder.
Delete subfolders and files: This allows or disallows the deletion of subfolders and files, even if the
Delete permission on the subfolder or file has not been given.
Delete: This allows or disallows the deletion of a file or folder. Even if you do not have Delete permission
on a file or folder, you may still delete it if you have Delete Subfolders and Files permission on the
parent folder.
Read permissions: This allows or disallows reading the permissions of a file or folder.
Change permissions: This allows or disallows changing the permissions of the file or folder.
Take ownership: This allows or disallows the user to take ownership of a file or folder. Regardless of
any current permissions that protect the file or folder, the owner of the file or folder can always alter its
permissions.
Synchronise: This allows or disallows different threads to wait on the handle for the file or folder and
synchronise with another thread that may signal it. This permission applies only to multithreaded,
multiprocess programs.
different formats, including ANSI for Windows-based operating systems and ASCII for cross-platform
use.
In a Windows operating system (OS), a text editor such as Notepad or Word is used to create a text file
with the extension .txt. Nearly all computer languages, including PHP and Java, employ text files to write
and store source code. By changing the file extension from .txt to .php or .cpp, the generated file can be
saved as a source file for the corresponding programming language.
translation into binary. The increase in size may be offset by lower-level link compression, because
the resulting text data has less entropy in proportion to its increased size.
On Microsoft Windows, the standard libraries enable the programmer to specify, as a parameter when
opening a file, whether the file is expected to be binary or plain text. On Unix, the standard libraries
likewise enable the programmer to state whether a file is expected to be binary or text.
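The text does not name a programming language, so the following sketch uses Python as an assumption. Its built-in `open` takes a mode string in which `"b"` requests binary access and its absence requests text access, which plays the role of the parameter described above. The file name `sample.dat` is hypothetical.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "sample.dat")

# Binary mode ("wb"): the bytes are written exactly as given.
with open(path, "wb") as f:
    f.write(b"line one\r\nline two\r\n")

# Binary mode ("rb"): the bytes come back untouched.
with open(path, "rb") as f:
    raw = f.read()

# Text mode ("r"): bytes are decoded to a string and, by default,
# "\r\n" line endings are translated to "\n".
with open(path, "r", encoding="ascii") as f:
    text = f.read()

print(raw)
print(repr(text))
```

The same file therefore yields a `bytes` object in binary mode and a `str` with normalised line endings in text mode.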
Viewing
A hex viewer is used to view the data of a binary file as a sequence of hexadecimal values. If a binary file is
opened in a text editor, each group of eight bits is translated as a character and the user sees textual characters,
most of them unintelligible. If the file is opened in another application, that application has its own use for
each byte; for example, it may treat each byte as a number and output a stream of numbers between 0 and 255.
Some viewers replace the unprintable characters with spaces, revealing only the human-readable text. This can be
helpful when inspecting a binary file to find passwords in games, to find hidden text and to recover corrupted
documents. It can also be used to examine suspicious files for unwanted effects. If the file is treated as an
executable and run, the operating system will interpret the file as a sequence of instructions in machine language.
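The two kinds of view described above can be sketched in a few lines of Python. The function names `hex_view` and `printable_view` and the sample bytes are hypothetical, and the choice of which characters count as printable is an assumption made for this illustration.

```python
import string

# Assumption: letters, digits, punctuation and the space count as printable.
PRINTABLE = set(string.ascii_letters + string.digits + string.punctuation + " ")


def hex_view(data: bytes) -> str:
    """Show each byte as a two-digit hexadecimal value."""
    return " ".join(f"{b:02x}" for b in data)


def printable_view(data: bytes) -> str:
    """Replace every unprintable byte with a space, keeping readable text."""
    return "".join(chr(b) if chr(b) in PRINTABLE else " " for b in data)


sample = b"\x00\x01PASS=abc\xff\x10"  # hypothetical binary content
print(hex_view(sample))        # 00 01 50 41 53 53 3d 61 62 63 ff 10
print(printable_view(sample))  # the text PASS=abc padded by two spaces on each side
```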
Interpretation
Standards are significant to binary files. For example, a binary file interpreted with the ASCII character set
is displayed as text, whereas another application may treat a byte as a pixel, a sound or an entire word.
Binary data is meaningless until an executed algorithm defines what is to be done with each bit, byte or word.
Consequently, merely examining the binary and matching it against known formats can lead to a wrong conclusion
about what it represents. This fact is used in steganography, where an algorithm interprets a binary file
differently to reveal hidden content.
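As a small illustration of this point, the same four bytes can be read as text, as individual numbers or as wider integers. The sketch uses Python's standard `struct` module; the byte values are arbitrary and chosen only for the example.

```python
import struct

data = bytes([72, 105, 33, 0])  # four arbitrary bytes

# Interpreted with the ASCII character set, the first three bytes are text.
as_text = data[:3].decode("ascii")

# Interpreted as a stream of numbers between 0 and 255.
as_numbers = list(data)

# Interpreted as one little-endian unsigned 32-bit integer.
(as_uint32,) = struct.unpack("<I", data)

# Interpreted as two little-endian unsigned 16-bit integers.
as_uint16_pair = struct.unpack("<HH", data)

print(as_text, as_numbers, as_uint32, as_uint16_pair)
```

None of these readings is more correct than the others; the format that the reading program assumes decides what the bytes mean.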
Delete operation: Removing a file not only deletes all of the data it contains but also frees
up disk space. To delete the selected file, the directory is searched for it. Once the directory entry is found,
all associated file space and the directory entry are released.
Close operation: When a file has been processed, it should be closed so that all of the changes are
made permanent and all of the resources used are released. Closing the file deallocates all of the
internal descriptors that were created when it was opened.
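The basic file operations can be tried out with a short Python sketch. The language and the file name `records.txt` are assumptions made for illustration; `open`, `close` and `os.remove` are standard Python calls.

```python
import os
import tempfile

directory = tempfile.mkdtemp()
path = os.path.join(directory, "records.txt")  # hypothetical file name

# Create the file and write one record to it.
f = open(path, "w", encoding="utf-8")
f.write("1,John,25\n")

# Close: buffered changes are written out and the descriptor is released.
f.close()
closed_after_close = f.closed

# Read the record back to confirm that the change was made permanent.
with open(path, encoding="utf-8") as f:
    content = f.read()

# Delete: the directory entry is removed and the file space is released.
os.remove(path)
exists_after_delete = os.path.exists(path)

print(content, closed_after_close, exists_after_delete)
```

The `with` statement used for reading closes the file automatically, which is usually the safer pattern because the close cannot be forgotten.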
15.7.1 Indexing
Indexing is a data structure technique that helps to speed up data retrieval. Because it lets us quickly
locate and access data in a database, it is a must-know technique for database optimization. Indexing
minimizes the number of disk accesses required when a query is processed. An index is made up of
two columns:
First column: The search key is in the first column. It holds a copy of the table’s primary key or
candidate key. The values in this column may or may not be sorted. However, if the values are sorted, the
related data can be accessed more easily.
Second column: The data reference, or pointer, is in the second column. It contains the address of the disk
block where the relevant key value can be found. Figure 3 depicts the structure of an index:
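A minimal sketch of such a two-column index follows, written in Python as an assumption. The block addresses 100 to 102 and the name `lookup` are hypothetical; the sample records reuse the names that appear in this unit's figures.

```python
from bisect import bisect_left

# Hypothetical data blocks: block address -> records (id, name, age).
blocks = {
    100: [(1, "John", 25), (2, "Jack", 24)],
    101: [(3, "Amey", 18), (4, "Ellena", 29)],
    102: [(5, "Kate", 31), (6, "Will", 26)],
}

# The index: a sorted list of (search key, block address) pairs.
index = sorted((rec[0], addr) for addr, recs in blocks.items() for rec in recs)
keys = [key for key, _ in index]


def lookup(key):
    """Binary-search the index, then read only the one block it points to."""
    i = bisect_left(keys, key)
    if i == len(keys) or keys[i] != key:
        return None  # key is not in the index
    block_address = index[i][1]
    for record in blocks[block_address]:
        if record[0] == key:
            return record
    return None


print(lookup(4))  # (4, 'Ellena', 29)
print(lookup(7))  # None
```

Without the index, finding a record could mean reading every block in turn; with it, one block is read per lookup.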
Clustered indexing
Multilevel indexing
Primary Indexing
There are only two columns in primary indexing. The primary key values, which are the search keys, are
in the first column. The pointers in the second column contain the address of the data block that matches
the search key value. The table should be sorted, and the records in the index file and the data blocks
should have a one-to-one relationship. This is a slower but more traditional mechanism. Primary
indexing is further classified into two types, as follows:
Dense index: For each search key value in the data file, there is an index record that contains the
search key and a pointer. Although a dense index gives quick lookups, it requires more
memory to store an index record for every key value. Figure 4 depicts a dense index:
Index record    Data block
1               1  John    25
2               2  Jack    24
3               3  Amey    18
4               4  Ellena  29
5               5  Kate    31
6               6  Will    26
Index record    Data block
1               1  John    25
                2  Jack    24
                3  Amey    18
4               4  Ellena  29
                5  Kate    31
6               6  Will    26
An intermediate node is a communication medium between the index and the data, as shown in Figure 6:
Figure 6 (parts shown): Index file, Intermediate node, Data block
Clustered Indexing
In clustered indexing the table is well-organized. When an index is created on a non-primary key,
whose values may not be unique, two or more columns are combined to obtain unique values that
identify each record, and the index is created on that combination, as shown in Figure 7:
Figure 7 (data block records): 1 Mill 19, 1 Gim 20, 4 Ellena 29, 4 Ronald 19, 6 Will 26, 6 Ruby 22
Multilevel Indexing
Multilevel indexing is used when the primary index does not fit in memory. The indices grow
as the size of the database grows. In fact, even a single-level index can become too large to hold
in main memory. In multilevel indexing, the index is broken down into smaller blocks so that the
outermost block is small enough to be stored in main memory.
Multilevel indexing is further classified into two methods:
B+ tree indexing
B-tree indexing
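The idea of splitting a large index into smaller blocks can be sketched with a two-level index. This is a simplified Python illustration under stated assumptions, not an implementation of a B+ tree or a B-tree: the block size of 3, the keys, the pointer strings and the name `lookup` are all hypothetical.

```python
from bisect import bisect_right

BLOCK_SIZE = 3  # hypothetical number of index entries per block

# A single-level index: sorted (search key, data pointer) pairs.
flat_index = [(key, f"block-{key}") for key in range(10, 100, 10)]

# Inner level: the flat index split into small blocks (imagined on disk).
inner_blocks = [
    flat_index[i:i + BLOCK_SIZE] for i in range(0, len(flat_index), BLOCK_SIZE)
]

# Outer level: the first key of each inner block (small enough for memory).
outer_keys = [block[0][0] for block in inner_blocks]


def lookup(key):
    """Use the outer level to pick one inner block, then search only that block."""
    position = bisect_right(outer_keys, key) - 1
    if position < 0:
        return None  # key is smaller than every indexed key
    for entry_key, pointer in inner_blocks[position]:
        if entry_key == key:
            return pointer
    return None


print(outer_keys)  # [10, 40, 70]
print(lookup(50))  # block-50
```

Only the outer level has to stay in main memory; each lookup then touches a single inner block rather than the whole index.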
Data hierarchy is the systematic organisation of data, which is typically done in a hierarchical
manner.
A file can be a “free formed,” “indexed” or “organised” collection of linked bytes that is only understood
by the person who generated it.
A text file is a form of digital file that is non-executable and only contains text.
A file organization guarantees that records are prepared for processing.
Indexing is a data structure technique that helps to speed up data retrieval.
In secondary indexing, the index column holds the candidate key values, each with a pointer
to the address where the matching record is located.
There are only two columns in primary indexing.
In clustered indexing the table is well-organized.
Multilevel indexing is used when the primary index does not fit in memory.
15.9 GLOSSARY
Data hierarchy: The systematic organisation of data, which is typically done in a hierarchical
manner.
File: It can be a “free formed,” “indexed” or “organised” collection of linked bytes that is only
understood by the person who generated it.
Text file: It is a form of digital file that is non-executable and only contains text.
File organization: It guarantees that records are prepared for processing.
Indexing: It is a data structure technique that helps to speed up data retrieval.
Secondary indexing: The index column holds the candidate key values, each with a pointer
to the address where the matching record is located.
Clustered indexing: The table is well-organized in clustered indexing.
Multilevel indexing: It is used when the primary index does not fit in memory.
5. Indexing minimizes the number of disk accesses required when a query is processed. What is the
concept of indexing?
https://limbd.org/objectives-factors-to-be-consider-of-file-organization/
https://www.indeed.com/career-advice/career-development/types-of-files
You can discuss the concept of files and file organization with your friends. Also, discuss
the concept of data hierarchy in real life.