Lecture 1 - Introduction


Distributed System

Dr. D.S. Kushwaha


Computer Architectures
 Computer architectures consisting of multiple interconnected processors are basically of two types:
 Tightly coupled systems
 Loosely coupled systems
Tightly coupled systems
 There is a single system-wide primary memory (address space) that is shared by all the processors.
 These are also known as parallel processing systems.

[Figure: tightly coupled system, in which multiple CPUs access a single system-wide shared memory through interconnection hardware]
Loosely coupled systems
 The processors do not share memory, and
each processor has its own local memory.
 These are also known as distributed
computing systems or simply distributed
systems.

[Figure: loosely coupled system, in which each CPU has its own local memory and the CPUs communicate over a communication network]
Distributed Computing System
 A DCS is a collection of independent computers that appears to its users as a single coherent system, or
 A collection of processors interconnected by a communication network in which:
 Each processor has its own local memory and other peripherals, and
 The communication between any two processors of the system takes place by message passing (a minimal sketch follows).
 For a particular processor, its own resources are local, whereas the other processors and their resources are remote.
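
To make the message-passing idea concrete, below is a minimal sketch of two nodes exchanging a message over TCP sockets in Python. The host name and port are hypothetical placeholders, not part of any real system.

```python
# Minimal message-passing sketch between two nodes (hypothetical host/port).
import socket

def server(port=5000):
    """Receive one message from a remote node and send a reply."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind(("0.0.0.0", port))
        s.listen(1)
        conn, _ = s.accept()
        with conn:
            request = conn.recv(1024).decode()
            conn.sendall(f"ack: {request}".encode())

def client(host="node-b.example.com", port=5000):
    """Send a message to another node and print its reply."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.connect((host, port))
        s.sendall(b"hello from node A")
        print(s.recv(1024).decode())
```

Note that the two processes share no memory; all cooperation happens through the explicit send and receive calls.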
Cont…
 Together, a processor and its resources are
usually referred to as a node or site or machine
of the distributed computing system.

 A distributed system is organized as a layer of middleware.

 Note that the middleware layer extends over multiple machines.

[Figure: a distributed system organized as middleware extending over multiple machines]
Distributed Computing System Models
1. Minicomputer Model
2. Workstation model
3. Workstation-server model
4. Processor-pool model
5. Hybrid model
Minicomputer model
 Extension of the centralized time-sharing system.
 It consists of a few minicomputers interconnected by a
communication network.
 Each user is logged on to one specific minicomputer,
with remote access to other minicomputers.
 The network allows a user to access remote resources that are available on some machine other than the one onto which the user is currently logged.
 This model may be used when resource sharing with
remote users is desired.
Cont…

[Figure: minicomputer model, with several minicomputers connected by a communication network and terminals attached to each minicomputer]
Workstation Model
 It consists of several workstations interconnected by a
communication network.

 The idea of the workstation model is to interconnect all workstations by a high-speed LAN so that:
 Idle workstations may be used to process jobs of users who are logged onto other workstations, and
 Users who do not have sufficient processing power at their own workstations may send their processes elsewhere for execution.
Cont…

[Figure: workstation model, with several workstations interconnected by a communication network]
Cont… : Issues
1. How does the system find an idle workstation? (a discovery sketch follows this list)
2. How is a process transferred from one
workstation to get it executed on another
workstation?
3. What happens to a remote process if a user
logs onto a workstation that was idle until now
and was being used to execute a process of
another workstation?
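
As an illustration of the first issue, here is one possible sketch of idle-workstation discovery using a UDP broadcast probe. The port number and the load-average threshold are illustrative assumptions, not a standard protocol.

```python
# Hypothetical idle-workstation discovery via UDP broadcast.
import os
import socket

PROBE_PORT = 6000  # illustrative port choice

def respond_if_idle():
    """Runs on every workstation; answers probes only when lightly loaded."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("0.0.0.0", PROBE_PORT))
    while True:
        _, addr = sock.recvfrom(64)
        if os.getloadavg()[0] < 0.5:      # crude idleness test (Unix only)
            sock.sendto(b"IDLE", addr)

def find_idle_workstation(timeout=1.0):
    """Broadcasts a probe and returns the first workstation that answers."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
    sock.settimeout(timeout)
    sock.sendto(b"PROBE", ("<broadcast>", PROBE_PORT))
    try:
        _, addr = sock.recvfrom(64)
        return addr[0]                    # IP address of an idle workstation
    except socket.timeout:
        return None                       # no idle workstation responded
```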
Cont… : Issues
 Three approaches to handle the third issue:
 Allow the remote process to share the resources of the workstation along with its own logged-on user's processes.

 Kill the remote process.

 Migrate the remote process back to its home workstation, so that its execution can be continued there.
Workstation Server Model
 It consists of a few minicomputers and several workstations, which may be diskless.
 When diskless workstations are used on a network, the file system to be used by these workstations must be implemented either by a diskful workstation or by a minicomputer equipped with a disk for file storage.
 Each minicomputer is used as a server machine to provide one or more types of service.
Cont…
 In addition to workstations, there are specialized
machines/workstations for running server processes
(called servers) for managing and providing access to
shared resources.
 The user’s processes need not be migrated to the
server machines for getting the work done by those
machines.
 For better overall system performance, the local disk of
a diskful workstation is normally used for such purposes
as
 Storage of temporary files,
 Storage of unshared files,
 Storage of shared files that are rarely changed,
 Paging activity in virtual-memory management, and
 Caching of remotely accessed data.
Cont…

[Figure: workstation-server model, with workstations and minicomputers (used as a file server, a database server, and a print server) connected by a communication network]
Cont…
 Advantages of the workstation-server model over the workstation model:
 It is much cheaper to use a few minicomputers equipped with large and fast disks.
 System maintenance is easy.
 Flexibility to use any workstation and access the same files, because all files are managed by the file servers.
 A request-response protocol is mainly used to access the services of the server machines (a sketch follows this list).
 Both client and server can run on the same computer.
 A user has guaranteed response time, because workstations are not used for executing remote processes.
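
To illustrate the request-response protocol mentioned above, here is a minimal sketch of a file server answering client requests. The JSON encoding, operation name, and port are illustrative assumptions rather than a real protocol.

```python
# Hypothetical request-response file server (JSON over TCP).
import json
import socket

def handle_request(request):
    """Dispatch a decoded request to the matching server operation."""
    if request.get("op") == "read":
        with open(request["path"]) as f:
            return {"status": "ok", "data": f.read()}
    return {"status": "error", "reason": "unknown operation"}

def serve(port=7000):
    """Accept one request per connection and send back one response."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind(("0.0.0.0", port))
        s.listen(5)
        while True:
            conn, _ = s.accept()
            with conn:
                request = json.loads(conn.recv(4096).decode())
                conn.sendall(json.dumps(handle_request(request)).encode())
```

A client sends one request, such as {"op": "read", "path": ...}, and waits for the matching response, which is the essence of the request-response pattern.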
Processor Pool Model
 Based on the observation that most of the time a user needs no computing power, but occasionally needs a large amount of it for a short time.
 The pool of processors consists of a large number of microcomputers and minicomputers attached to the network.
 Each processor in the pool has its own memory to load and run programs.
 A user does not log onto a particular machine but onto the system as a whole.
Cont…

[Figure: processor-pool model, with terminals connected through a communication network to a run server, a file server, and the pool of processors]
Cont…
 Advantages over workstation server model:
 Better utilization
 The entire processing power of the system is
available for use by the currently logged-on users.

 Greater flexibility
 The system’s services can be easily expanded
without the need to install any more computers;
 The processors in the pool can be allocated to act
as extra servers to carry any additional load arising
from an increased user population or to provide new
services.
Cont…
 Disadvantage
 Unsuitable for high-performance interactive applications,
 because of the slow communication between the computer on which a user's application program is executed and the terminal via which the user interacts with the system.
Hybrid Model
 The workstation-server model is the most widely used for small programs/applications.
 The processor-pool model is ideal for massive computation jobs.

 The hybrid model combines the two models above:

 A basic workstation-server model
 Plus an added pool of processors
 The processors in the pool can be allocated dynamically for computations that are too large for workstations.
 Gives guaranteed response to interactive jobs.
 More expensive.
Major Issues
 In contrast to standalone or centralized systems, the following pose a challenge:
 Communication, and
 Security
 Access control
 Privacy constraints

 The reliability and performance of a distributed system rest on the underlying communication network.
Factors that led to the emergence of
distributed computing system
 Inherently distributed applications
 Organizations once based in a particular location have gone
global.

 Information sharing among distributed users

 e.g., CSCW (computer-supported cooperative work)

 Resource sharing
 Such as software libraries, databases, and hardware resources.
Factors that led to the emergence of
distributed computing system
 Better price-performance ratio

 Shorter response times and higher throughput

 Higher reliability
 Reliability refers to the degree of tolerance against errors
and component failures in a system.
 Achieved by multiplicity of resources.
Factors that led to the emergence of
distributed computing system
 Extensibility and incremental growth
 By adding additional resources to the system as and when the need arises.
 Such systems are termed open distributed systems.

 Better flexibility in meeting users' needs

Distributed Operating System
 OS controls the resources of a computer system and
provides its users with an interface or virtual machine
that is more convenient to use than the bare machine.
 Primary tasks of an operating system:
 To present users with a virtual machine that is easier to
program than the underlying hardware.
 To manage the various resources of the system.
 Types of operating systems used for distributed
computing systems:
 Network operating systems
 Distributed operating systems.
Uniprocessor Operating Systems

 Separating applications from operating system code through a microkernel
Multicomputer Operating Systems
[Figure: message transfer in a multicomputer operating system]
Network Operating System
[Figure: general structure of a network operating system]
Distributed System as Middleware
[Figure: a distributed system organized as middleware]
Middleware and Openness

 In an open middleware-based distributed system, the protocols used by each middleware layer should be the same, as should the interfaces they offer to applications.
Comparison between Systems
(Multiprocessor DOS and multicomputer DOS are the two forms of distributed OS.)

Item                      Multiprocessor DOS   Multicomputer DOS     Network OS   Middleware-based OS
Degree of transparency    Very high            High                  Low          High
Same OS on all nodes      Yes                  Yes                   No           No
Number of copies of OS    1                    N                     N            N
Basis for communication   Shared memory        Messages              Files        Model specific
Resource management       Global, central      Global, distributed   Per node     Per node
Scalability               No                   Moderately            Yes          Varies
Openness                  Closed               Closed                Open         Open
Major Differences
1. System image
 In a NOS, the user perceives the DCS as a group of nodes connected by a communication network; hence the user is aware of multiple computers.

 A DOS hides the existence of multiple computers and provides a single system image to its users; hence the group of networked nodes acts as a virtual uniprocessor.

 In a NOS, by default a user's job is executed on the machine on which the user is currently logged; otherwise the user has to log in remotely.

 A DOS dynamically allocates jobs to the various machines of the system for processing.
Cont…
2. AUTONOMY
 Different nodes in a NOS may use different operating systems but communicate with each other using a mutually agreed-upon communication protocol.
 In a DOS, there exists a single system-wide OS, and each node of the DCS runs a part of it (i.e., identical kernels run on all nodes of the DCS).
 This ensures that the same set of system calls is globally valid.

3. FAULT TOLERANCE
 Low in a network OS
 High in a distributed OS

 A DCS using a NOS is generally referred to as a network system.
 A DCS using a DOS is generally referred to as a distributed system.
Issues
 Transparency
 Reliability
 Flexibility
 Performance
 Scalability
 Heterogeneity
 Security
 Emulation to existing operating systems
TRANSPARENCY
 To make the existence of multiple computers transparent, and
 To provide a single system image to its users.

 ISO 1992 identifies eight types of transparency:
 Access Transparency
 Location Transparency
 Replication Transparency
 Failure Transparency
 Migration Transparency
 Concurrency Transparency
 Performance Transparency
 Scaling Transparency
Access Transparency

 Allows users to access remote resources in the same way as local ones, i.e., the user interface, which is in the form of a set of system calls, should not distinguish between local and remote resources.
Location transparency
 Aspects of location transparency
 Name Transparency
 Name of a resource should not reveal any information
about physical location of resource.
 Even movable resources such as files must be
allowed to move without having their name changed.

 User Mobility
 No matter which machine a user is logged onto, he
should be able to access a resource with the same
name.
Replication Transparency
 Replicas are used for better performance and reliability.

 The replicated resource and the replication activity should be transparent to the user.

 Two important issues related to replication transparency are:

 Naming of replicas
 Replication control

 There should be a mapping from a user-supplied name of the resource to an appropriate replica of the resource.
Failure Transparency
 Under partial failure, the system continues to function, perhaps with degraded performance.

 Complete failure transparency is not achievable.

 Failure of the communication network of a distributed system normally disrupts the work of its users and is noticeable by them.
Migration Transparency
 Migration may be done for performance, reliability and
security reasons.

 Aim of migration transparency:
 To ensure that the movement of an object is handled automatically by the system in a user-transparent manner.

 Issues for migration transparency:

 Migration decisions
 Migration of an object
 Migration of an object that is a process
Concurrency transparency
 Creates the illusion that:
 Each user is the sole user of the system, and
 Other users do not exist in the system.

 Properties for providing concurrency transparency:

 Event ordering property
 It ensures that all access requests to various system resources are properly ordered to provide a consistent view to all users of the system (a logical clock sketch follows these properties).

 Mutual exclusion property
 At any time, at most one process accesses a shared resource that must not be used simultaneously.
Cont…
 No starvation property
 It ensures that if every process that is granted a resource (one that must not be used simultaneously by multiple processes) eventually releases it, then every request for that resource is eventually granted.

 No deadlock
 It ensures that a situation will never occur in which
competing processes prevent their mutual progress even
though no single one requests more resources than
available in the system.
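
One standard way to realize the event ordering property is Lamport's logical clock, sketched below. This is a general textbook technique, offered here as an illustration rather than something prescribed by this lecture.

```python
# Sketch of Lamport's logical clock for ordering events across nodes.
class LamportClock:
    def __init__(self):
        self.time = 0

    def local_event(self):
        """Advance the clock for an internal event."""
        self.time += 1
        return self.time

    def send_event(self):
        """Advance the clock and return the timestamp to attach to a message."""
        self.time += 1
        return self.time

    def receive_event(self, msg_time):
        """On receipt, merge the sender's timestamp: C = max(C, t) + 1."""
        self.time = max(self.time, msg_time) + 1
        return self.time
```

Because every receive pushes the local clock past the sender's timestamp, causally related requests carry increasing timestamps, giving all nodes a consistent order over them.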
Performance transparency
 The aim of performance transparency is to allow the
system to be automatically reconfigured to improve
performance, as loads vary dynamically in the system.

 A situation in which one processor of the system is overloaded with jobs while another processor is idle should not be allowed to occur.
Scaling transparency
 Allows the system to expand and scale without
disrupting the activities of the users.

 Requires an open system architecture and the use of scalable algorithms.
RELIABILITY

 Distributed systems are required to be more reliable than centralized systems due to
 The multiple instances of resources.
 Failures should be avoided; their types are:
 Fail-stop failure: the system stops functioning after changing to a state in which its failure can be detected.
 Byzantine failure: the system continues to function but produces wrong results.
Cont…
 Methods to handle failures in a distributed system:
 Fault avoidance
 Fault tolerance
 Fault detection and recovery
Fault Avoidance

 Deals with designing the components of the system in such a way that the occurrence of faults is minimized.

 For this, reliable hardware components and intensive software testing are used.
Fault Tolerance
 Ability of a system to continue functioning in the event
of partial system failure.
 Method for tolerating the faults:
 Redundancy techniques:
 Avoid single point of failure by replicating critical
hardware and software components,
 Additional overhead is needed to maintain multiple
copies of replicated resource & consistency issue.
 Distributed control
 A highly available distributed file system should have multiple and independent file servers controlling multiple and independent storage devices.
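
As a small illustration of redundancy, the sketch below has a client mask a failed server by trying each replica in turn. The replica addresses and payload handling are hypothetical.

```python
# Sketch of masking a server failure by failing over to a replica.
import socket

REPLICAS = [("fs1.example.com", 7000), ("fs2.example.com", 7000)]  # hypothetical

def request_with_failover(payload, replicas=REPLICAS, timeout=2.0):
    """Return the first successful reply; raise only if every replica fails."""
    for host, port in replicas:
        try:
            with socket.create_connection((host, port), timeout=timeout) as s:
                s.sendall(payload)
                return s.recv(4096)
        except OSError:
            continue                  # this replica failed; try the next one
    raise ConnectionError("all replicas unavailable")
```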
Fault detection and recovery
 Atomic transactions

 Use of stateless servers
 With a stateless server, the history of serviced requests between client and server does not affect the execution of a new request, which makes crash recovery easy.

 Acknowledgements and timeout-based retransmissions of messages (a sketch follows)
 For IPC between two processes, the system must have ways to detect lost messages so that they can be retransmitted.
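
A minimal sketch of timeout-based retransmission over an unreliable channel (UDP) follows; the retry count and timeout are illustrative choices.

```python
# Sketch of a stop-and-wait sender: retransmit until an ACK arrives.
import socket

def reliable_send(message, addr, retries=5, timeout=1.0):
    """Send a datagram and wait for an ACK, retransmitting on timeout."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.settimeout(timeout)
    for _ in range(retries):
        sock.sendto(message, addr)
        try:
            reply, _ = sock.recvfrom(64)
            if reply == b"ACK":
                return True           # delivery confirmed
        except socket.timeout:
            continue                  # message or ACK was lost; resend
    return False                      # give up after all retries
```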
FLEXIBILITY
 The design of a distributed operating system should be flexible due to:
 Ease of modification
 Ease of enhancement

 The kernel is the most important design factor; it operates in a separate address space that is inaccessible to user processes.

 Commonly used models for kernel design in distributed operating systems:
 Monolithic kernel model
 Microkernel model
Monolithic kernel model
 Most operating system services, such as process management, memory management, device management, file management, name management, and inter-process communication, are provided by the kernel.

 Result: The kernel has a large monolithic structure.

 The large size of the kernel reduces the overall flexibility and
configurability of the resulting OS.

 A request may be serviced faster:
 No message passing and no context switching are required while the kernel performs the job.
Cont…

[Figure: monolithic kernel model, where each node runs user applications on top of a monolithic kernel that includes most OS services, and nodes are connected by the network hardware]
Microkernel model
 Goal is to keep the kernel as small as possible.

 The kernel provides only the minimal facilities necessary for implementing additional operating system services, such as:
 inter-process communication,
 low-level device management, and
 some memory management.

 All other OS services are implemented as user-level server processes, so it is easy to modify the design or add new services.
 Each server has its own address space and can be
programmed separately.
Cont…

[Figure: microkernel model, where each node runs user applications and server/manager modules on top of a microkernel that has only minimal facilities, and nodes are connected by the network hardware]
Cont…
 Being modular in nature, the OS is easy to design, implement, and install.
 For adding or changing a service, there is no need to stop the system and boot a new kernel, as in the case of a monolithic kernel.
 Performance penalty:
 Each server module has its own address space; hence some form of message-based IPC is required while performing a job.
 Message passing between server processes and the microkernel requires context switches, resulting in additional performance overhead.
Cont…
 Advantages of microkernel model over monolithic kernel
model:
 Flexibility in design, maintenance, and portability.
 In practice, the performance penalty is not too severe: given other factors, the small overhead involved in exchanging messages is usually negligible.
PERFORMANCE
Design principles for better performance are:
 Batch if possible
 Transfer of data in large chunks is more efficient than in individual pages.
 Caching whenever possible
 Saves a large amount of time and network bandwidth (a small cache sketch follows these principles).

 Minimize copying of data
 While a message is transferred from sender to receiver, it takes the following path:
 From the sender's stack to its message buffer
 From the message buffer in the sender's address space to the message buffer in the kernel's address space
Cont…
 Finally, from the kernel to the NIC
 A similar path is followed on receipt; hence six copy operations are required in all.

 Minimize network traffic
 Migrate processes closer to the resources they use.

 Take advantage of fine-grain parallelism for multiprocessing
 Use threads for structuring server processes.
 Fine-grained concurrency control of simultaneous accesses by multiple processes to a shared resource gives better performance.
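
To illustrate the caching principle from the list above, below is a tiny sketch of a client-side cache in front of a remote fetch; the fetch function is a hypothetical stand-in for any remote access.

```python
# Sketch of client-side caching to avoid repeated remote accesses.
_cache = {}

def fetch_remote(name):
    """Hypothetical stand-in for an expensive remote read."""
    return f"data-for-{name}"

def cached_read(name):
    """Serve repeated reads from the local cache, fetching only on a miss."""
    if name not in _cache:
        _cache[name] = fetch_remote(name)   # one network round trip on miss
    return _cache[name]
```

A real system must also invalidate cached entries when the remote copy changes, which is the consistency issue noted under replication.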
SCALABILITY
 Scalability refers to the capability of a system to adapt to
increased service load.

 Some principles for designing scalable distributed systems are:
 Avoid centralized entities
 Central file server
 Centralized database
 Avoid centralized algorithms
 Perform most operations on the client workstation
Heterogeneity
 Dissimilar hardware or software systems.

 Some incompatibilities in a heterogeneous distributed system are:
 Internal formatting schemes
 Communication protocols and topologies of different networks
 Different servers at different nodes

 Some form of data translation is necessary for interaction.

 An intermediate standard data format can be used (a sketch follows).
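
As a sketch of such an intermediate standard format, the snippet below uses JSON for structured records and network byte order for raw binary fields; the record layout is illustrative.

```python
# Sketch of translating data through an intermediate standard format.
import json
import struct

def to_wire(record):
    """Translate a native record into the standard interchange format."""
    return json.dumps(record, sort_keys=True).encode("utf-8")

def from_wire(data):
    """Translate the interchange format back into a native record."""
    return json.loads(data.decode("utf-8"))

def pack_counter(value):
    """Pin a binary integer to network (big-endian) byte order."""
    return struct.pack("!I", value)   # '!' marks network byte order
```

Each node converts to and from this common format at its boundary, so nodes with different internal representations can still interoperate.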


Security
 More difficult than in a centralized system because of:
 The lack of a single point of control &
 The use of insecure networks for data communication.

 Requirements for security:

 It should be possible for the sender of a message to know that the message was received by the intended receiver.
 It should be possible for the receiver of a message to know that the message was sent by the genuine sender.
Cont…
 It should be possible for both the sender and receiver of a message to be guaranteed that the contents of the message were not changed while it was in transfer.

 Cryptography is the solution (a sketch follows).
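
As one concrete instance of the cryptographic solution, the sketch below uses an HMAC so the receiver can verify that a message was not altered in transit. It assumes sender and receiver already share a secret key; key distribution is left out.

```python
# Sketch of message integrity using a keyed hash (HMAC-SHA256).
import hashlib
import hmac

SECRET_KEY = b"shared-secret"   # illustrative; must be distributed securely

def sign(message: bytes) -> bytes:
    """Prefix the message with a keyed digest so tampering is detectable."""
    tag = hmac.new(SECRET_KEY, message, hashlib.sha256).digest()
    return tag + message

def verify(packet: bytes) -> bytes:
    """Return the message if its digest checks out; raise otherwise."""
    tag, message = packet[:32], packet[32:]
    expected = hmac.new(SECRET_KEY, message, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("message was modified in transit")
    return message
```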


Challenges
 Alignment with the needs of the business / user / non-
computer specialists / community and society
 Need to address the scalability issue:
 large scale data,
 high performance computing,
 response time,
 rapid prototyping, and rapid time to production.
 Need to effectively address
 ever shortening cycle of obsolescence,
 heterogeneity and
 rapid changes in requirements
 What about providing all this in a cost-effective
manner?
Enter the cloud

 Cloud computing is Internet-based computing, whereby shared resources, software, and information are provided to computers and other devices on demand, like the electricity grid.
 Cloud computing is the culmination of numerous attempts at large-scale computing with seamless access to virtually limitless resources:
 on-demand computing, utility computing, ubiquitous computing, autonomic computing, platform computing, edge computing, elastic computing, grid computing, …
Enabling Technologies
[Figure: enabling-technologies stack. From top to bottom: cloud applications (data-intensive, compute-intensive, storage-intensive); a web-services interface (web services, SOA, WS standards); virtual machines VM0, VM1, …, VMn over virtualization (bare metal, hypervisor, …); storage models (S3, BigTable, BlobStore, …); multi-core architectures and 64-bit processors; all tied together by bandwidth]
What is a Cloud?

[Figure: a cloud as the combination of SLAs, web services, and virtualization]
Why Cloud Computing?
 Cloud computing is an important model for the distribution of, and access to, computing resources.
 As-needed availability:
 Aligns resource expenditure with actual resource usage, thus allowing the organization to pay only for the resources required, when they are required.
Fog, Edge & Cloud
 Local data processing helps to mitigate the weaknesses of cloud computing.
 In addition, fog computing also brings new advantages, such as greater context-awareness, real-time processing, lower bandwidth requirements, etc.
Fog Computing
 A standard that defines how edge computing should work; it facilitates the operation of compute, storage, and networking services between end devices and cloud computing data centers.
 Additionally, many use fog as a jumping-off point for edge computing.
Why Fog Computing?

 Handles the large data volumes created by IoT devices.
 Reduces latency compared with traditional IoT networks.
 Avoids data bottlenecks at the cloud.
 Dense geographical distribution and local resource pooling.
 Better security and failure resistance.
 Savings in backbone bandwidth.
Edge Computing
 Edge computing brings processing close to the data source, so that data does not need to be sent to a remote cloud or other centralized systems for processing.
 By eliminating the distance and time it takes to send data to centralized sources, we can improve the speed and performance of data transport, as well as of devices and applications on the edge.
