
Distributed Systems (IT 441)

Chapter 2 – Communication

A distributed system is a system whose components are located on
different networked computers, which communicate and coordinate
their actions by passing messages to one another.
Objectives of the Chapter
 review of how processes communicate in a network (the
rules or the protocols) and their structures
 introduce the four widely used communication models for
distributed systems:
 Remote Procedure Call (RPC)
 Remote Method Invocation (RMI)
 Message-Oriented Middleware (MOM)
 Streams

Introduction
 Interprocess communication is at the heart of all distributed
systems
 communication in distributed systems is based on message
passing as offered by the underlying network as opposed to
using shared memory
 modern distributed systems consist of thousands of
processes scattered across an unreliable network such as
the Internet
 unless the primitive communication facilities of the network
are replaced by more advanced ones, developing large-scale
distributed systems becomes extremely difficult

2.1 Layered Protocols
 two computers, possibly from different manufacturers, must
be able to talk to each other
 for such a communication, there has to be a standard
 The ISO OSI (Open Systems Interconnection) Reference
Model is one such standard; it has 7 layers
 The TCP/IP protocol suite is another; it has 4 or 5 layers,
depending on how the layers are counted
 OSI
 Open – to connect open systems or systems that are open
for communication with other open systems using standard
rules that govern the format, contents, and meaning of the
messages sent and received
 these rules are called protocols
 two types of protocols: connection-oriented and
connectionless

layers, interfaces, and protocols in the OSI model
Media (lower) Layers
Physical: Physical characteristics of the media
Data Link: Reliable data delivery across the link
Network: Managing connections across the network, or routing
Host (upper) Layers
Transport: End-to-end connection and reliability (handles
   lost packets); TCP (connection-oriented),
   UDP (connectionless), etc.
Session: Managing sessions between applications
   (dialog control and synchronization); rarely supported
Presentation: Data presentation to applications; concerned
   with the syntax and semantics of the information transmitted
Application: Network services to applications; contains
   protocols that are commonly needed by users;
   FTP, HTTP, SMTP, ...
a typical message as it appears on the network

 Transport Protocols: Client-Server TCP

assuming no messages are lost,
 the client initiates connection setup using a three-way
handshake (1-3)
 the client sends its request (4)
 it then sends a message to close the connection (5)
 the server acknowledges receipt and informs the client
that the connection will be closed down (6)
 the server then sends the answer (7), followed by a request
to close the connection (8)
 the client responds with an ack to finish the conversation (9)
normal operation of TCP
 much of the overhead in TCP is for managing the connection
 the remedy is to combine connection setup with the request,
and connection release with the answer
 such a protocol is called TCP for Transactions (T/TCP)
 the client sends a single message consisting of a setup
request, the service request, and information telling the
server that the connection will be closed down immediately
after the answer is received (1)
 the server sends acceptance of the connection request, the
answer, and a connection release (2)
 the client acknowledges teardown of the connection (3)
transactional TCP
2.2 Remote Procedure Call
 the first distributed systems were based on explicit message
exchange between processes through the use of explicit
send and receive procedures; these do not provide access
transparency
 in 1984, Birrell and Nelson introduced a different way of
handling communication: RPC
 it allows a program to call a procedure located on another
machine
 simple and elegant, but there are implementation problems
 the calling and called procedures run in different address
spaces
 parameters and results have to be exchanged; what if the
machines are not identical?
 what happens if either or both machines crash?

2.2.1 Basic RPC Operation
 Conventional Procedure Call, i.e., on a single machine
 e.g., count = read(fd, buf, bytes); a C-like statement, where
fd is an integer indicating a file
buf is an array of characters into which data are read
bytes is the number of bytes to be read
parameter passing in a local procedure call: (a) the stack before the
call to read; (b) the stack while the called procedure is active
(both panels mark the stack pointer)
 parameters can be call-by-value (fd and bytes) or
call-by-reference (buf) or, in some languages, call-by-copy/restore
 Client and Server Stubs
 RPC would like to make a remote procedure call look the
same as a local one; it should be transparent, i.e., the calling
procedure should not know that the called procedure is
executing on a different machine or vice versa

principle of RPC between a client and server program


 when the client program is built, it is linked with a different
version of the library function, called a client stub, which packs
the parameters into a message instead of doing the work locally
 a server stub is the server-side equivalent of a client stub
RPC Model

 A server defines its interface using an interface definition
language (IDL)
 The IDL specifies the names, parameters, and types for all
client-callable server procedures.
 A stub compiler reads the IDL and produces two stub
procedures for each server procedure, one for the client and
one for the server:
 The server program implements the server procedures and links them
with the server-side stubs.
 The client program implements the client logic and links it with the
client-side stubs.
 The stubs are responsible for managing all details of the remote
communication between client and server.
RPC Stubs

 A client-side stub looks to the client as if it were a callable server
procedure.
 A server-side stub looks to the server as if a client called it.
 The client program thinks it is calling the server (in fact, it is
calling the client stub).
 The server program thinks it is called by the client (in fact, it is
called by the server stub).
 The stubs send messages to each other to make the RPC happen
“transparently”.
 Steps of a Remote Procedure Call
1. Client procedure calls client stub in the normal way
2. Client stub builds a message and calls the local OS
(packing parameters into a message is called parameter
marshaling)
3. Client's OS sends the message to the remote OS
4. Remote OS gives the message to the server stub
5. Server stub unpacks the parameters and calls the server
6. Server does the work and returns the result to the stub
7. Server stub packs it in a message and calls the local OS
8. Server's OS sends the message to the client's OS
9. Client's OS gives the message to the client stub
10. Stub unpacks the result and returns to client
 hence, for the client remote services are accessed by making
ordinary (local) procedure calls; not by calling send and
receive
 server machine vs server process; client machine vs client process
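The ten steps can be traced in a minimal, single-process sketch. Everything here is an assumption made for illustration: the procedure name `add`, the JSON message format, and the `transport` function that stands in for the two operating systems and the network. A real RPC system would generate the stubs from an interface definition and send the bytes over a network.

```python
import json

# Hypothetical server procedure (not from the slides' figures).
def add(i, j):
    return i + j

# Table the server stub uses to find the procedure named in a request.
PROCEDURES = {"add": add}

def server_stub(request_bytes):
    """Steps 5-7: unpack the parameters, call the server, pack the result."""
    request = json.loads(request_bytes.decode("utf-8"))
    result = PROCEDURES[request["proc"]](*request["args"])
    return json.dumps({"result": result}).encode("utf-8")

def transport(request_bytes):
    """Stand-in for steps 3-4 and 8-9: here the bytes never leave the process."""
    return server_stub(request_bytes)

def client_stub_add(i, j):
    """Steps 2 and 10: marshal the parameters, send, unmarshal the reply."""
    message = json.dumps({"proc": "add", "args": [i, j]}).encode("utf-8")
    reply = transport(message)
    return json.loads(reply.decode("utf-8"))["result"]

# Step 1: the client makes what looks like an ordinary local call.
total = client_stub_add(2, 3)
```

The caller never touches send or receive; only the two stubs know that a message is involved.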
2.2.2 Parameter Passing
o The function of the client stub is to take its parameters, pack them into a
message, and send them to the server stub. While this sounds straightforward, it is
not quite as simple as it first appears.
1. Passing Value Parameters
 e.g., consider a remote procedure add(i, j), where i and j are
integer parameters

steps involved in doing remote computation through RPC


 the above discussion applies if the server and the client
machines are identical
 but that is not the case in large distributed systems
 the machines may differ in data representation (e.g., IBM
mainframes use EBCDIC whereas IBM PCs use ASCII)
 there are also differences in representing integers (1’s
complement or 2’s complement) and floating-point numbers
 byte numbering may be different (from right to left on the Pentium,
called little endian, and from left to right on the SPARC, called
big endian)
 e.g.
 consider a procedure with two parameters, an integer and a
four-character string; each one 32-bit word (5, “JILL”)
 the sender is Intel and the receiver is SPARC

original message on the Pentium
(the numbers in boxes indicate the address of each byte)

the message after receipt on the SPARC; wrong integer (5 × 2^24 = 83,886,080) but
correct string

 one approach is to invert the bytes of each word after receipt

the message after being inverted (correct integer but wrong string)

 additional information is required to tell which is an
integer and which is a string
 Solution: use a standard representation
Example: external data representation (XDR)
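The (5, “JILL”) example can be reproduced with Python's `struct` module. This is a sketch under stated assumptions: the helper names `marshal` and `unmarshal` are hypothetical, and the agreed format (a big-endian 32-bit integer followed by four ASCII bytes) follows the spirit of XDR, which uses big-endian integers, without being a full XDR encoder.

```python
import struct

# The integer 5 as a little-endian 32-bit word (the Intel sender),
# followed by the four characters of the string.
wire = struct.pack("<i", 5) + b"JILL"

# A big-endian receiver (the SPARC) reads the bytes in arrival order.
wrong_int = struct.unpack(">i", wire[:4])[0]   # 5 * 2**24
text_ok = wire[4:].decode("ascii")             # the string survives

# Inverting the bytes of every word repairs the integer but breaks the string.
inverted = wire[:4][::-1] + wire[4:][::-1]
right_int = struct.unpack(">i", inverted[:4])[0]
text_bad = inverted[4:].decode("ascii")

# Standard representation: both sides agree on one wire format,
# regardless of the byte order of their own hardware.
def marshal(number, four_chars):
    return struct.pack(">i4s", number, four_chars.encode("ascii"))

def unmarshal(data):
    number, raw = struct.unpack(">i4s", data)
    return number, raw.decode("ascii")
```

Because the format string names the type of each field, the receiver knows which bytes form an integer and which form a string, which is exactly the additional information the blind inversion lacked.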

2.3 Remote Object (Method) Invocation (RMI)
 resulted from object-based technology that has proven its
value in developing nondistributed applications
 it is an expansion of the RPC mechanisms
 it enhances distribution transparency, because an object
hides its internals from the outside world by means of a
well-defined interface
 Distributed Objects
 an object encapsulates data, called the state, and the
operations on those data, called methods
 methods are made available through interfaces
 the state of an object can be manipulated only by invoking
methods
 this allows an interface to be placed on one machine while
the object itself resides on another machine; such an
organization is referred to as a distributed object
 the state of an object is not distributed, only the interfaces
are; such objects are also referred to as remote objects
 the client-side implementation of an object’s interface is called a proxy
(analogous to a client stub in RPC systems)
 it is loaded into the client’s address space when a client
binds to a distributed object
 tasks: a proxy marshals method invocation into messages
and unmarshals reply messages to return the result of the
method invocation to the client
 a server stub, called a skeleton, unmarshals messages and
marshals replies

common organization of a remote object with client-side proxy

The Stub and Skeleton
[figure: RMI Client -> Stub -> skeleton -> RMI Server; arrows labeled call and return]

 When a client invokes a remote method, the call is first
forwarded to the stub.
 The stub is responsible for sending the remote call
over to the server-side skeleton.
 The stub opens a socket to the remote server,
marshals the object parameters, and forwards the
data stream to the skeleton.
 A skeleton contains a method that receives the remote
calls, unmarshals the parameters, and invokes the
actual remote object implementation.
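The division of labor between stub (proxy) and skeleton can be sketched in a few lines. The names `Counter`, `Proxy`, and `Skeleton` are hypothetical, the socket is replaced by a direct function call, and `pickle` is used only for brevity (it must never be used to unpickle data from an untrusted peer). This is not how Java RMI is implemented; it only illustrates the roles described above.

```python
import pickle

class Counter:
    """Hypothetical remote object: state plus the methods that change it."""
    def __init__(self):
        self._value = 0

    def increment(self, amount):
        self._value += amount
        return self._value

class Skeleton:
    """Server side: unmarshals a call and invokes the real object."""
    def __init__(self, target):
        self._target = target

    def handle(self, request_bytes):
        method_name, args = pickle.loads(request_bytes)
        result = getattr(self._target, method_name)(*args)
        return pickle.dumps(result)

class Proxy:
    """Client side: offers the object's methods, but only marshals and forwards."""
    def __init__(self, send):
        self._send = send  # stands in for the socket to the server

    def __getattr__(self, method_name):
        def invoke(*args):
            reply = self._send(pickle.dumps((method_name, args)))
            return pickle.loads(reply)
        return invoke

skeleton = Skeleton(Counter())
counter = Proxy(skeleton.handle)   # the client holds only the proxy
first = counter.increment(5)
second = counter.increment(2)
```

The state lives only in the `Counter` held by the skeleton; the client sees nothing but the interface, which matches the notion of a remote object.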

the situation when passing an object by reference or by value

2.4 Message Oriented Communication

 RPCs and RMIs are not adequate for all distributed system
applications
 the provision of access transparency may be good but
they have semantics that is not adequate for all
applications
 example problems
 they assume that the receiving side is running at the
time of communication
 a client is blocked until its request has been processed

2.4.1 Persistence and Synchronicity in Communication
 assume the communication system is organized as a
computer network shown below

general organization of a communication system in which hosts are connected
through a network
 communication can be
 persistent or transient
 asynchronous or synchronous
 persistent: a message that has been submitted for
transmission is stored by the communication system as long
as it takes to deliver it to the receiver
 e.g., email delivery, snail mail delivery

persistent communication of letters back in the days of the Pony Express


 transient: a message that has been submitted for transmission
is stored by the communication system only as long as the
sending and receiving applications are executing
 asynchronous: a sender continues immediately after it has
submitted its message for transmission
 synchronous: the sender is blocked until its message is
stored in a local buffer at the receiving host or delivered to the
receiver
 the different types of communication can be combined
 persistent asynchronous: e.g., email
 transient asynchronous: e.g., UDP, one-way RPC
 in general there are six possibilities

               Persistent   Transient
Asynchronous                    
Synchronous                      (message-oriented; three forms)
persistent asynchronous communication

persistent synchronous communication

transient asynchronous communication

receipt-based transient synchronous communication
 weakest form; the sender is blocked until the message is
stored in a local buffer at the receiving host

delivery-based transient synchronous communication at message delivery
 the sender is blocked until the message is delivered to the
receiver for further processing; e.g., asynchronous RPC

response-based transient synchronous communication
 strongest form; the sender is blocked until it receives a reply
message from the receiver
2.4.2 Message-Oriented Transient Communication
 many applications are built on top of the simple message-
oriented model offered by the transport layer
 standardizing the interface of the transport layer by
providing a set of primitives lets programmers use
messaging protocols through a common interface
 a standard interface also makes it easier to port applications
to other machines

1. Berkeley Sockets
 an example is the socket interface as used in Berkeley
UNIX
 a socket is a communication endpoint to which an
application can write data that are to be sent over the
network, and from which incoming data can be read.

Primitive | Meaning | Executed by
Socket | Create a new communication endpoint; also reserve resources to send and receive messages | both
Bind | Attach a local address to a socket; e.g., IP address with a known port number | servers
Listen | Announce willingness to accept connections; for connection-oriented communication | servers
Accept | Block caller until a connection request arrives | servers
Connect | Actively attempt to establish a connection; the client is blocked until connection is set up | clients
Send | Send some data over the connection | both
Receive | Receive some data over the connection | both
Close | Release the connection | both
socket primitives for TCP/IP
connection-oriented communication pattern using sockets
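The connection-oriented pattern can be tried on a single machine with Python's `socket` module, whose calls map closely onto the primitives in the table. The sketch below makes several assumptions: both endpoints run in one process (the server in a thread), the server handles exactly one client, port 0 asks the operating system for any free port, and the "service" (upper-casing the request) is invented for illustration.

```python
import socket
import threading

def serve_once(server_sock):
    conn, _addr = server_sock.accept()     # Accept: block until a client connects
    with conn:
        data = conn.recv(1024)             # Receive (assumes a short request)
        conn.sendall(data.upper())         # Send
    # leaving the with-block closes the connection (Close)

server_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)   # Socket
server_sock.bind(("127.0.0.1", 0))         # Bind: local address, OS-chosen port
server_sock.listen(1)                      # Listen
port = server_sock.getsockname()[1]
worker = threading.Thread(target=serve_once, args=(server_sock,))
worker.start()

client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)        # Socket
client.connect(("127.0.0.1", port))        # Connect: blocks until set up
client.sendall(b"hello")                   # Send
chunks = []
while True:                                # Receive until the server closes
    chunk = client.recv(1024)
    if not chunk:
        break
    chunks.append(chunk)
reply = b"".join(chunks)
client.close()                             # Close
worker.join()
server_sock.close()
```

Only the server calls bind, listen, and accept, and only the client calls connect; the remaining primitives are used on both sides.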

2.4.3 Message-Oriented Persistent Communication
 there are message-oriented middleware services, called
message-queuing systems or Message-Oriented Middleware
(MOM)
 they support persistent asynchronous communication
 they have intermediate-term storage capacity for messages,
without requiring the sender or the receiver to be active
during message transmission
 unlike Berkeley sockets and MPI (the Message-Passing Interface),
message transfer may take minutes instead of seconds or milliseconds
Message-Queuing Model
 applications communicate by inserting messages in
specific queues
 it permits loosely-coupled communication
 the sender may or may not be running; similarly the
receiver may or may not be running, giving four possible
combinations
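A toy queue manager shows why the sender and the receiver need not run at the same time. The class `QueueManager` and its `put` and `get` methods are hypothetical names chosen for this sketch, and the queue lives in memory; a real message-queuing system would keep messages in durable storage and on a separate machine.

```python
from collections import deque

class QueueManager:
    """Holds named queues on behalf of applications."""
    def __init__(self):
        self._queues = {}

    def put(self, queue_name, message):
        # Append and return at once: the sender does not wait for a receiver.
        self._queues.setdefault(queue_name, deque()).append(message)

    def get(self, queue_name):
        # Non-blocking: return the oldest message, or None if the queue is empty.
        queue = self._queues.get(queue_name)
        return queue.popleft() if queue else None

manager = QueueManager()

def sender():
    manager.put("orders", "order 1")
    manager.put("orders", "order 2")

def receiver():
    received = []
    message = manager.get("orders")
    while message is not None:
        received.append(message)
        message = manager.get("orders")
    return received

sender()                  # the sender runs and finishes; no receiver exists yet
delivered = receiver()    # the receiver starts later and still gets everything
```

The only thing the two applications share is the queue name, which is what makes the coupling loose.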
 messages are managed by queue managers
 they generally interact with the application that sends and
receives messages
 some also serve as routers or relays, i.e., they forward
incoming messages to other queue managers
 however, each queue manager needs a copy of the queue-
to-location mapping, leading to network management
problems for large-scale queuing systems
 the solution is to use a few routers that know about the
network topology

the general organization of a message-queuing system with routers
2.5 Stream Oriented Communication
 until now, we focused on exchanging independent and
complete units of information
 time has no effect on correctness; a system can be slow or fast
 however, there are communications where time has a critical
role
 Multimedia
 media
 storage, transmission, interchange, presentation,
representation and perception of different data types:
 text, graphics, images, voice, audio, video, animation, ...
 movie: video + audio + …
 multimedia: handling of a variety of representation media
 end user pull
 information overload and starvation
 technology push
 emerging technology to integrate media
 Types of Media
 two types
 discrete media: text, executable code, graphics, images;
temporal relationships between data items are not
fundamental to correctly interpret the data
 continuous media: video, audio, animation; temporal
relationships between data items are fundamental to
correctly interpret the data
 a data stream is a sequence of data units and can be applied
to discrete as well as continuous media
 stream-oriented communication provides facilities for the
exchange of time-dependent information (continuous media)
such as audio and video streams

 timing in transmission modes
 asynchronous transmission mode: data items are
transmitted one after the other, but no timing constraints;
e.g. text transfer
 synchronous transmission mode: a maximum end-to-end
delay is defined for each data unit; data may be
transmitted faster than the maximum delay allows, but not slower
 isochronous transmission mode: maximum and minimum
end-to-end delay are defined; also called bounded delay
jitter; applicable for distributed multimedia systems
 a continuous data stream can be simple or complex
 simple stream: consists of a single sequence of data; e.g.,
mono audio, video only
 complex stream: consists of several related simple streams
that must be synchronized; e.g., stereo audio, or a movie
consisting of audio and video (may also contain subtitles,
translation to other languages, ...)
movie as a set of simple streams

 a stream can be considered as a virtual connection between a
source and a sink
 the source or the sink could be a process or a device

setting up a stream between two processes across a network

setting up a stream directly between two devices

 the data stream can also be multicast to several receivers
 if devices and the underlying networks have different
capabilities, the stream may be adapted to each receiver, for
example by filtering or transcoding

an example of multicasting a stream to several receivers

 Quality of Service (QoS)
 QoS requirements describe what is needed from the
underlying distributed system and network to ensure
acceptable delivery; e.g. viewing experience of a user
 for continuous data, the concerns are
 timeliness: data must be delivered in time
 volume: the required throughput must be met
 reliability: a given level of loss of data must not be
exceeded
 quality of perception; highly subjective

 QoS Dimensions
 timeliness dimensions
 latency (maximum delay between consecutive frames)
 start-up latency (maximum delay before starting a
presentation)
 jitter (delay variance)
 volume dimensions
 throughput in frames/sec or bits/sec or bytes/sec
 reliability dimensions
 MTBF (Mean Time Between Failure) of disks
 MTTR (Mean Time To Repair)
 error rates on the telecommunication lines

 QoS Requirements
 deterministic
 precise values or ranges
 e.g., latency must be between 45 and 55 ms
 probabilistic
 probability of the required QoS
 e.g., latency should be < 50 ms for 95% of the frames
 stochastic distributions
 e.g., frame arrival should follow normal distribution with
mean interval-time of 40 ms and 5 ms variance
 classes
 e.g., guaranteed and best effort
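The first two kinds of requirement can be checked mechanically against measured latencies. The sketch below reuses the numbers from the examples above (45 to 55 ms, and under 50 ms for 95% of the frames); the function names and the sample measurements are assumptions made for illustration.

```python
def meets_deterministic(latencies_ms, low=45.0, high=55.0):
    """Deterministic: every single frame must fall inside the range."""
    return all(low <= latency <= high for latency in latencies_ms)

def meets_probabilistic(latencies_ms, bound_ms=50.0, fraction=0.95):
    """Probabilistic: at least the given fraction of frames must beat the bound."""
    within = sum(1 for latency in latencies_ms if latency < bound_ms)
    return within >= fraction * len(latencies_ms)

# Hypothetical measurements: 19 frames at 46 ms and one at 54 ms.
sample = [46.0] * 19 + [54.0]
deterministic_ok = meets_deterministic(sample)
probabilistic_ok = meets_probabilistic(sample)
```

A single late frame violates a deterministic requirement, whereas a probabilistic one tolerates it as long as the stated fraction holds.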

 QoS Management
 can be static or dynamic
 Static QoS Management Functions
 specification
 e.g., deterministic range for timeliness, volume and
reliability categories
 negotiation
 the application may accept lower level of QoS for
lower cost
 admission control
 checks whether the system can support the requested QoS; if
this test is passed, the system has to guarantee the promised QoS
 resource reservation
 may be necessary to provide guaranteed QoS

 Dynamic QoS Management Functions
 monitoring
 notices deviation from QoS level
 at a certain level of granularity (e.g., every 100 ms)
 policing
 detect participants not keeping themselves to the contract
 e.g., source sends faster than negotiated (e.g., 25 fps)
 maintenance
 sustaining the negotiated QoS
 e.g., the system requires more resources
 renegotiation
 the client tries to adapt; it may be able to accept a lower QoS
 QoS requirements can be specified using flow specification
containing bandwidth requirements, transmission rates,
delays, ...
 e.g., the flow specification proposed by Partridge (1992)
 it uses the token bucket algorithm which specifies how the
stream will shape its network traffic (in fact the leaky
bucket, as used in networking)
 the idea is to shape bursty traffic into fixed-rate traffic by
averaging the data rate
 packets may be dropped if the bucket is full
 the input rate may vary, but the output rate remains
constant
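The shaping behavior described in these bullets (bursty input, constant output, drops when the bucket is full) is that of a leaky bucket, and it can be simulated in discrete time. The function name, the tick-based model, and the example numbers are assumptions; this sketch does not implement the token-based variant shown in the figure.

```python
def leaky_bucket(arrivals, capacity, rate):
    """Simulate a leaky bucket over discrete ticks.

    arrivals[t] is the number of packets arriving in tick t,
    capacity is the most packets the bucket can hold,
    rate is the most packets released per tick.
    Returns (packets sent in each tick, total packets dropped).
    """
    queued = 0
    dropped = 0
    sent_per_tick = []
    for count in arrivals:
        accepted = min(count, capacity - queued)   # the rest overflow
        dropped += count - accepted
        queued += accepted
        sent = min(rate, queued)                   # drain at the fixed rate
        queued -= sent
        sent_per_tick.append(sent)
    return sent_per_tick, dropped

# Two bursts, of 5 and 4 packets, into a bucket of 4 that leaks 1 per tick.
sent, dropped = leaky_bucket([5, 0, 0, 4, 0, 0], capacity=4, rate=1)
```

In this run the output stays at one packet per tick even though the input arrives in bursts, and the packets that do not fit in the bucket are lost.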

the principle of a token bucket algorithm

 problem in flow specification
 an application may not know its requirements
 how can a user (human) specify quality using the various
parameters? usually very difficult
 one option is to provide defaults for various streams, such as
high, medium, and low quality
 Setting up a Stream
 resources such as bandwidth, buffers, processing power
must be reserved once a flow specification is made
 one such protocol is RSVP (Resource reSerVation Protocol)
 it is a transport-level protocol for enabling resource
reservation in network routers

the basic organization of RSVP for resource reservation in a distributed system

 Stream Synchronization
 how to maintain temporal relations between streams, e.g., lip
synchronization
 two approaches
1. explicitly by operating on the data units of simple
streams; the responsibility of the application

the principle of explicit synchronization on the level of data units


2. through a multimedia middleware that offers a collection of
interfaces for controlling audio and video streams as well as
devices such as monitors, cameras, microphones, ...

the principle of synchronization as supported by high-level interfaces
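For the first approach, one simple possibility is for the application to tag every data unit with a presentation time and to interleave the units of the simple streams in time order. The sketch below assumes units of the form (timestamp in ms, payload), video at 25 frames per second (one frame every 40 ms), and audio units every 20 ms; all of these, and the function name `interleave`, are illustrative assumptions.

```python
import heapq

def interleave(video_units, audio_units):
    """Merge two time-ordered simple streams into one presentation schedule."""
    return list(heapq.merge(video_units, audio_units, key=lambda unit: unit[0]))

# Hypothetical data units: (presentation time in ms, payload).
video = [(0, "frame 0"), (40, "frame 1"), (80, "frame 2")]
audio = [(0, "audio 0"), (20, "audio 1"), (40, "audio 2"), (60, "audio 3")]

schedule = interleave(video, audio)
times = [timestamp for timestamp, _payload in schedule]
```

The application would then present each unit when its timestamp comes due, which is the work that a multimedia middleware takes over in the second approach.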
