Introduction To Distributed Operating Systems Communication in Distributed Systems
Solutions
1) Write-through cache: whenever a word is written to the
cache, it is written to memory as well.
All writes, hits and misses alike, cause bus traffic.
This introduces the snoopy cache (snooping
cache, i.e. eavesdropping):
the cache continuously monitors the bus for updates.
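The write-through and snooping behaviour can be sketched in a few lines. This is a minimal simulation, not a hardware description: the `Bus` and `SnoopyCache` classes and the `traffic` counter are hypothetical names, and the snooping cache is assumed to update its copy (invalidating it would be another option).

```python
class Bus:
    """Shared bus with main memory and the caches attached to it."""

    def __init__(self):
        self.memory = {}
        self.caches = []
        self.traffic = 0  # number of bus transactions so far

    def write(self, source, addr, value):
        self.traffic += 1
        self.memory[addr] = value
        # Every other cache eavesdrops on the write.
        for cache in self.caches:
            if cache is not source:
                cache.snoop(addr, value)

    def read(self, addr):
        self.traffic += 1
        return self.memory.get(addr, 0)


class SnoopyCache:
    def __init__(self, bus):
        self.bus = bus
        self.lines = {}
        bus.caches.append(self)

    def read(self, addr):
        if addr not in self.lines:             # read miss: go to the bus
            self.lines[addr] = self.bus.read(addr)
        return self.lines[addr]                # read hit: no bus traffic

    def write(self, addr, value):
        self.lines[addr] = value
        self.bus.write(self, addr, value)      # write-through: always on the bus

    def snoop(self, addr, value):
        if addr in self.lines:                 # assumed policy: update our copy
            self.lines[addr] = value
```

With two caches on one bus, a write by the first is seen by the second without an extra read, while every write still costs one bus transaction.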
HARDWARE CONCEPTS..
Switched Multiprocessors
Limitation
Solution
NUMA (Non-Uniform Memory Access) machine.
Each CPU can access its own local memory
quickly, but accessing anybody else's memory is
slower.
GRID
Best suited to problems that have an inherent two-dimensional
nature, such as graph theory or vision.
HYPERCUBE
Tightly-coupled
SOFTWARE CONCEPTS..
Network Operating Systems
/work/news.
/games/work/news
True Distributed Systems
Tightly-coupled software on
loosely-coupled (i.e., multicomputer)
hardware.
The name of the resource must not secretly encode the location
of the resource.
What happens if two users try to access the same resource at the same time? For example, what
happens if two users try to update the same file at the same time? If the
system is concurrency transparent, the users will not notice the existence of
other users. One mechanism for achieving this form of transparency would be
for the system to lock a resource automatically once someone had started to
use it, unlocking it only when the access was finished. In this manner, all
resources would only be accessed sequentially, never concurrently.
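The automatic locking described above can be sketched as follows. The `LockedResource` class and `run_users` helper are hypothetical names; the sketch assumes threads in a single process stand in for users, whereas a real distributed system would need a distributed lock.

```python
import threading


class LockedResource:
    """A resource that locks itself for the duration of every access."""

    def __init__(self, value=0):
        self._value = value
        self._lock = threading.Lock()

    def update(self, fn):
        # The lock is taken and released by the system, not by the user,
        # so each user sees the resource as if no one else existed.
        with self._lock:
            self._value = fn(self._value)
            return self._value


def run_users(resource, users=4, updates=1000):
    """Let several users update the same resource at the same time."""

    def user():
        for _ in range(updates):
            resource.update(lambda v: v + 1)

    threads = [threading.Thread(target=user) for _ in range(users)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return resource.update(lambda v: v)  # read the final value
```

Because every access is serialized, no update is lost: four users making 1000 increments each leave the value at 4000.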
1. Response time
2. Throughput (number of jobs per hour)
3. System utilization, and
4. Amount of network capacity consumed
network capacity,
not fault tolerant
saturate all the
communication lines
key issues
Transmits 0s and 1s.
How many volts to use for 0 and 1,
how many bits per second can be sent,
whether transmission can take place in both
directions simultaneously
the size and shape of the network connector (plug),
the number of pins and meaning of each
The physical layer protocol deals with standardizing
the electrical, mechanical, and signalling interfaces
LAYERED PROTOCOLS..
The Data Link Layer
Groups the bits into units, called frames, and
sees that each frame is correctly received.
Puts a special bit pattern on the start and end
of each frame, to mark them, and
computes a checksum by adding up all the
bytes in the frame in a certain way.
When the frame arrives, the receiver
recomputes the checksum from the data and
compares the result to the checksum following
the frame.
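A rough sketch of this framing scheme is shown below. The flag value `0x7e`, the one-byte checksum (sum of bytes modulo 256), and the function names are assumptions chosen for illustration; real data link protocols typically use stronger checks such as a CRC and also escape flag bytes inside the payload, which this sketch omits.

```python
FLAG = b"\x7e"  # assumed marker pattern for the start and end of a frame


def checksum(payload: bytes) -> int:
    """Add up all the bytes of the payload, keeping one byte of the sum."""
    return sum(payload) % 256


def make_frame(payload: bytes) -> bytes:
    """Sender side: flag, data, checksum, flag."""
    return FLAG + payload + bytes([checksum(payload)]) + FLAG


def receive_frame(frame: bytes):
    """Receiver side: return the payload, or None if the frame is damaged."""
    if len(frame) < 3 or frame[:1] != FLAG or frame[-1:] != FLAG:
        return None
    payload, received = frame[1:-2], frame[-2]
    # Recompute the checksum from the data and compare with the one sent.
    return payload if checksum(payload) == received else None
```

A receiver that gets `None` back would normally ask for the frame to be retransmitted.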
The Network Layer
Routing: for a message to get from the
sender to the receiver, it may have to make a
number of hops, at each one choosing an
outgoing line to use. Choosing the best path
is called routing.
connection-oriented connection-less
Setup
required
The Transport Layer
the session layer should be able to deliver a
message to the transport layer with the
expectation that it will be delivered without
loss.
the transport layer breaks it into pieces small
enough for each to fit in a single packet, assigns
each one a sequence number, and then sends
them all.
Network protocols: X.25 (packets arrive in order) or
IP (packets may arrive out of order).
The transport layer puts all the pieces back in order.
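The splitting and reordering can be illustrated with a short sketch. The function names `segment` and `reassemble` and the `(sequence number, data)` packet layout are hypothetical; loss and retransmission are left out.

```python
def segment(message: bytes, max_payload: int):
    """Break a message into numbered pieces that each fit in one packet."""
    offsets = range(0, len(message), max_payload)
    return [(seq, message[i:i + max_payload]) for seq, i in enumerate(offsets)]


def reassemble(packets):
    """Put the pieces back in order, however they arrived."""
    ordered = sorted(packets, key=lambda packet: packet[0])
    return b"".join(data for _, data in ordered)
```

For example, `segment(b"abcdefghij", 4)` yields three packets numbered 0 to 2, and `reassemble` restores the original bytes even if the list is shuffled first.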
International Standards Organization (ISO)
protocols
TP0 to TP4 - differences relate to error handling
and the ability to send several transport
connections over a single X.25 connection
Department of Defense (DoD) Network Model
TCP (connection-oriented)
UDP (connectionless)
The Session Layer
enhanced version of the transport layer.
It provides dialog control and it provides
synchronization facilities.
The latter are useful to allow users to insert
checkpoints into long transfers, so that in the
event of a crash it is only necessary to go back to
the last checkpoint, rather than all the way back
to the beginning
The Presentation Layer
is concerned with the meaning of the bits.
Connection(virtual circuit)
transport connections
Solutions
Cells are copied into a queue associated with the
output buffer and wait there.
A pool of buffers can be used for both
input and output buffering.
Buffer on the input side, but allow the second or third
cell in line to be switched, even if the first one cannot
be.
ASYNCHRONOUS TRANSFER MODE
NETWORKS...
Some Implications of ATM for Distributed
Systems
(a) Machine.process addressing. (b) Process addressing with broadcasting. (c) Address lookup via a name
THE CLIENT-SERVER MODEL..
ISSUES
Blocking versus Nonblocking Primitives
Blocking send primitive: the client is blocked
until the message has been sent.
Blocking receive primitive: the client is blocked
until a message has been received.
Nonblocking send primitive: the client is blocked
only until the message is copied to the
message buffer.
Disadvantage of nonblocking send: the sender must not modify the
message buffer until the message has been sent; otherwise the message
can be overwritten.
Since the sending process has no idea when the transmission is done,
it never knows when it is safe to reuse the buffer.
Solutions
1) The kernel copies the message to an internal kernel buffer and then
allows the process to continue.
Drawback: the extra copy to the kernel buffer costs performance.
2) Interrupt the sender when the message has been sent to inform
it that the buffer is once again available.
Drawback: user-level interrupts make programming tricky.
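The first solution, copying into a kernel buffer, can be sketched like this. The `Kernel` class, its `send` and `transmit` methods, and the `copy` switch are hypothetical; the switch exists only to contrast the safe and unsafe variants.

```python
from collections import deque


class Kernel:
    """Toy kernel with a queue of messages waiting to be transmitted."""

    def __init__(self):
        self.outgoing = deque()

    def send(self, buffer: bytearray, copy: bool = True) -> None:
        """Nonblocking send: returns before the message is on the wire."""
        if copy:
            # Solution 1: the extra copy makes the caller's buffer reusable.
            self.outgoing.append(bytes(buffer))
        else:
            # No copy: the kernel still refers to the caller's buffer.
            self.outgoing.append(buffer)

    def transmit(self):
        """Later, the network hardware picks up the next queued message."""
        return self.outgoing.popleft()
```

With the copy, the sender may overwrite its buffer right after `send` returns and the original message still goes out; without it, the overwritten contents are what gets transmitted.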
Buffered versus Unbuffered Primitives
Unbuffered primitives
receive(addr, &m) - tells the kernel of the machine on which it is running that the
calling process is listening to address addr and is prepared to receive one message
sent to that address and stored in a single message buffer, pointed to by m.
Problem
This works fine as long as the server calls receive (which tells the server's kernel which address
the server is using and where to put the incoming message) before the client calls send.
If the reply takes too long, the sending kernel can resend the request to
guard against the possibility of a lost message.
An acknowledgement from the client's kernel to the server's kernel is
sometimes used.
Solution
Devise a network standard or canonical form for integers, characters,
booleans, floating-point numbers, and so on, and require all senders to
convert their internal representation to this form while marshaling.
Drawback: the conversions are sometimes unnecessary, for example when
both machines use the same native format.
Alternative solution
The client uses its own native format and indicates the format in the
first byte of the message.
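The format-byte approach can be sketched with Python's `struct` module. The format codes `LITTLE` and `BIG`, the single 32-bit integer parameter, and the function names are assumptions made for the example; a real stub would cover every parameter type.

```python
import struct

# Hypothetical format codes carried in the first byte of the message.
LITTLE, BIG = 0, 1
_PREFIX = {LITTLE: "<", BIG: ">"}


def marshal(value: int, native: int) -> bytes:
    """Client stub: send the integer in the sender's own byte order."""
    return bytes([native]) + struct.pack(_PREFIX[native] + "i", value)


def unmarshal(message: bytes) -> int:
    """Server stub: read the format byte, convert only if needed."""
    byte_order = _PREFIX[message[0]]
    (value,) = struct.unpack(byte_order + "i", message[1:5])
    return value
```

The receiver gets the same integer whichever byte order the sender used, and no conversion to a canonical form happens when both sides already agree.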
REMOTE PROCEDURE CALL..
Parameter Passing
How are pointers passed?
Drawbacks
Not every language has exceptions or signals (e.g., Pascal).
Having to write an exception or signal handler destroys
the transparency.
Lost Request Messages.
The kernel starts a timer when sending the request. If
the timer expires before a reply or
acknowledgement comes back, the kernel sends the
message again.
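A minimal sketch of this retransmission loop follows. It assumes the timer is modelled by the `send` callback returning `None` when no reply arrived in time; `call_with_retransmission`, `lossy_channel`, and `max_attempts` are hypothetical names, and a real kernel would use a clock rather than a counter.

```python
def call_with_retransmission(send, max_attempts=5):
    """Send a request, resending it each time the timer expires.

    `send(attempt)` returns the reply, or None if the timer expired first.
    """
    for attempt in range(1, max_attempts + 1):
        reply = send(attempt)
        if reply is not None:
            return reply, attempt
    raise TimeoutError("no reply after %d attempts" % max_attempts)


def lossy_channel(drops):
    """Test channel (assumption): the first `drops` requests are lost."""

    def send(attempt):
        return None if attempt <= drops else "reply"

    return send
```

With two lost requests, the reply comes back on the third attempt; if every attempt is lost, the caller is told about the failure instead of waiting forever.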
Solution
Actions to be taken
(b) the system has to report failure back to the client (e.g., raise an
exception)
Result
Orphans waste CPU cycles.
They can also lock files or otherwise tie up valuable
resources.
If the client reboots and does the RPC again, but the
reply from the orphan comes back immediately
afterward, confusion can result.
Client Crashes
Solutions
1) Extermination
Before a client stub sends an RPC message, it makes a log entry
telling what it is about to do. The log is kept on disk (a medium that
survives crashes).
After a reboot, the log is checked and the orphan is explicitly
killed off.
Disadvantages
The horrendous expense of writing a disk record for every RPC.
The network may be partitioned, due to a failed gateway,
making it impossible to kill the orphans, even if they can be located.
2) Reincarnation
Divide time up into sequentially numbered epochs.
When a client reboots, it broadcasts a message to
all machines declaring the start of a new epoch.
When such a broadcast comes in, all remote
computations are killed.
If the network is partitioned, some orphans may
survive. However, when they report back, their
replies will contain an obsolete epoch number,
making them easy to detect.
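The epoch mechanism can be sketched as below. The `Client` and `Server` classes, the `reachable` flag used to model a partition, and the `(epoch, name)` reply format are hypothetical; the sketch also assumes the client's epoch number survives the reboot (for example, because it is kept on stable storage).

```python
class Server:
    def __init__(self, reachable=True):
        self.reachable = reachable      # False models a network partition
        self.computations = []          # list of (epoch, name) pairs

    def start(self, epoch, name):
        self.computations.append((epoch, name))

    def new_epoch(self, epoch):
        # Kill every remote computation started in an earlier epoch.
        self.computations = [c for c in self.computations if c[0] >= epoch]


class Client:
    def __init__(self):
        self.epoch = 0

    def reboot(self, servers):
        """Declare a new epoch to every machine the broadcast can reach."""
        self.epoch += 1
        for server in servers:
            if server.reachable:
                server.new_epoch(self.epoch)

    def accept(self, reply):
        """Replies carrying an obsolete epoch number come from orphans."""
        epoch, _name = reply
        return epoch == self.epoch
```

An orphan on a partitioned server survives the broadcast, but its reply is tagged with the old epoch and is rejected when it finally arrives.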
3) Gentle reincarnation
Disadvantage
performance loss.
All that extra software gets in the way. Besides, the main
advantage (no lost packets) is hardly needed on a LAN.
IMPLEMENTATION ISSUES
RPC Protocols.
Disadvantages of IP: performance.
IP was not designed as an end-user protocol.
Packet switching is not required in a LAN.
Too much overhead in the header of the packet.
1. Error control
easy to implement requires more administration
2. Flow control
overrun errors are impossible receiver overrun is a possibility
III) Critical Path
The sequence of instructions that is executed on
every RPC.
As per Schroeder and Burrows (1990)
Then the kernel inspects the packet and maps the page
containing it into the server's address space. If this type
of mapping is not possible, the kernel copies the packet
to the server stub (copy 5).
IV) Copying
Worst case
The kernel copies the message from the client stub into a kernel buffer for
subsequent transmission, either because it is not convenient to transmit
directly from user space or the network is currently busy (copy 1).
At this point, the hardware is started, causing the packet to be moved over the
network to the interface board on the destination machine (copy 3).
When the packet-arrived interrupt occurs on the server's machine, the kernel
copies it to a kernel buffer, probably because it cannot tell where to put it
until it has examined it, which is not possible until it has extracted it from the
hardware buffer (copy 4).
The kernel scans the entire process table, checking each timer value
against the current time. Any nonzero value that is less than or equal to
the current time corresponds to an expired timer, which is then processed
and reset.
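The table scan described above might look like the sketch below. The dictionary-based process table, the `timer` field, and resetting an expired timer to zero (meaning "no timer pending") are assumptions for illustration.

```python
def scan_timers(process_table, now):
    """Scan the whole process table and return the pids whose timer expired.

    Each entry holds the absolute time at which its timer goes off;
    zero means the process has no timer pending.
    """
    expired = []
    for pid, entry in process_table.items():
        deadline = entry["timer"]
        if deadline != 0 and deadline <= now:
            expired.append(pid)    # process the timeout (here: just report it)
            entry["timer"] = 0     # reset so it is not reported again
    return expired
```

The cost of setting or clearing a timer is a single store into the table; the price is that every scan touches all entries, expired or not.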
Problem Areas
allowing local procedures unconstrained access to remote global
variables, and vice versa, cannot be implemented, yet prohibiting
this access violates the transparency principle (that programs
should not have to act differently due to RPC).
GROUP COMMUNICATION
DESIGN ISSUES
Peer Groups versus Hierarchical Groups
Peer groups
All the processes are equal.
All decisions are made collectively.
Advantage
symmetric and has no single point of failure
Disadvantage
decision making is more complicated. To
decide anything, a vote has to be taken,
incurring some delay and overhead.
Hierarchical Groups
One process is the coordinator and all the
others are workers.
When a request for work is generated, it is sent to the coordinator,
which then decides which worker is
best suited to carry it out.
Advantage
As long as the coordinator is running, it can make decisions
without bothering everyone else.
Disadvantage
Loss of the coordinator brings the entire group
to a grinding halt.
Group Membership
Group server approach
Advantage
It is straightforward, efficient, and easy to
implement.
Disadvantage
A single point of failure at the group server.
Issues
1) if a member crashes, it leaves the group. The other members
have to discover this experimentally by noticing that the
crashed member no longer responds to anything.
Group Addressing
all-or-nothing delivery.
The sender starts out by sending a message to all members of the group.
When a process receives a message, if it has not yet seen this particular
message, it, too, sends the message to all members of the group (again
with timers and retransmissions if necessary).
If it has already seen the message, this step is not necessary and the
message is discarded.
No matter how many machines crash or how many packets are lost,
eventually all the surviving processes will get the message
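The forwarding scheme above can be sketched as a small simulation. The `flood` function and its `crashed` and `lost` parameters are hypothetical; for simplicity a lost link is modelled as permanently dead instead of being retried with timers, so the sketch only shows that forwarding by other members routes around such failures.

```python
def flood(sender, members, crashed=frozenset(), lost=frozenset()):
    """Return the set of processes that end up with the message.

    `lost` holds (source, destination) pairs whose packets never arrive.
    """
    seen = {sender}
    pending = [sender]                 # processes that still have to forward
    while pending:
        src = pending.pop()
        for dst in members:
            if dst == src or dst in crashed or (src, dst) in lost:
                continue
            if dst not in seen:        # first time: keep it and forward it too
                seen.add(dst)
                pending.append(dst)
            # otherwise the duplicate is simply discarded
    return seen
```

In a five-member group where process 2 has crashed and several packets from processes 0 and 1 are lost, every survivor still receives the message as long as some chain of forwarders reaches it.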
Message Ordering
If processes 0 and 4 are both trying to update the same record in a
database, processes 1 and 3 can end up with different final values because
the two updates reach them in different orders.
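The ordering problem is easy to reproduce in a few lines. The `final_value` function and the update strings are hypothetical; each update is assumed to overwrite the record, so the last one to arrive wins.

```python
def final_value(arrival_order):
    """Apply overwriting updates in the order they arrive at one process."""
    record = None
    for update in arrival_order:
        record = update
    return record


# Updates multicast by processes 0 and 4 (made-up values).
from_0, from_4 = "written by 0", "written by 4"

# Process 1 happens to receive 0's update first, process 3 receives 4's first.
at_1 = final_value([from_0, from_4])
at_3 = final_value([from_4, from_0])
```

Here `at_1` and `at_3` differ, which is exactly the inconsistency that a consistent delivery order across all members is meant to prevent.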
Only one packet can be on a LAN at any instant. With gateways and
multiple networks, it is possible for two packets to be "on the wire"
simultaneously.