Processes and Processors in Distributed Systems

Distributed File Systems
 The file service is the specification of what the file system offers to its clients: it describes the primitives available, what parameters they take, and what actions they perform.
 In other words, the file service specifies the file system's interface to the clients.

 A file server is a process that runs on some machine and helps implement the file service.
 Clients should not know the number, location, or function of the file servers. When a particular service is called, the results should be produced without the client even knowing that the system is distributed.
File service: concerned with operations on individual files, such as reading, writing, and appending.

Directory service: concerned with creating and managing directories, and with adding and deleting files from directories.
 The File Service Interface.
 A file is an uninterpreted sequence of bytes.

 A file can have attributes (owner, size, creation date, access permissions), which are pieces of information about the file but are not part of the file itself; the service provides primitives to read and write some of these attributes.

 In systems with immutable files, the only file operations are CREATE and READ. Once a file has been created, it cannot be changed, which makes it much easier to support file caching and replication.

 Protection in distributed systems uses the same two basic techniques: capabilities and access control lists.

 With capabilities, each user has a kind of ticket, called a capability, for each object to which it has access. The capability specifies which kinds of access are permitted.

 Access control list schemes associate with each file a list of the users who may access the file, and how.
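The two protection schemes can be contrasted in a small sketch. This is illustrative only; the file names, users, and helper functions are hypothetical, not part of any real system described above.

```python
# Sketch (not from the source): contrasting ACL and capability checks.
# All names here are illustrative, not a real API.

# Access control list: each file maps users to their permitted operations.
acl = {"report.txt": {"alice": {"read", "write"}, "bob": {"read"}}}

def acl_allows(user, filename, op):
    # The check is per-file: find the file's list, then the user's rights.
    return op in acl.get(filename, {}).get(user, set())

# Capabilities: each user holds tickets naming an object and its rights.
capabilities = {"alice": [("report.txt", {"read", "write"})],
                "bob":   [("report.txt", {"read"})]}

def capability_allows(user, filename, op):
    # The check is per-user: scan the user's tickets for a matching one.
    return any(obj == filename and op in rights
               for obj, rights in capabilities.get(user, []))
```

The difference in where the check starts (per-file list vs. per-user ticket) is the essential distinction between the two schemes.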
 File services come in two types:
 1) the upload/download model, and
 2) the remote access model.

 1) The upload/download model.

 In this model, the only operations are read file and write file.

 Read transfers an entire file from one of the file servers to the requesting client.

 Write transfers an entire file the other way, from client to server. The files can be stored in memory or on a local disk, as needed.

 Advantages
 conceptual simplicity.
 whole file transfer is highly efficient.
 Disadvantages
 Enough storage must be available on the client to store all the files required.
 If only a fraction of a file is needed, moving the entire file is wasteful.
 2) The remote access model.


 Here the file service provides a large number of operations for opening and closing files, reading and writing parts of files, moving around within files (LSEEK), examining and changing file attributes, and so on.
 In this model the file system runs on the servers, not on the clients.
 Advantages
 Does not require much space on the clients.
 Eliminates the need to pull in entire files when only small pieces are needed.
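The contrast between the two models can be sketched as follows. This is an illustrative sketch, not a real protocol; the file contents and function names are invented for the example.

```python
# Sketch of the two file-service models (illustrative, not a real protocol).

SERVER_FILES = {"notes.txt": b"hello distributed world"}

# 1) Upload/download model: only whole-file transfers.
def download(name):
    return SERVER_FILES[name]            # the entire file crosses the network

def upload(name, data):
    SERVER_FILES[name] = data            # the entire file is sent back

# 2) Remote access model: the server operates on parts of files in place.
def remote_read(name, offset, nbytes):
    return SERVER_FILES[name][offset:offset + nbytes]
```

In the upload/download model the client pays for the whole file even when it needs a few bytes; in the remote access model, `remote_read` ships only the requested range.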
 The Directory Server Interface
 It provides operations for
i. creating and deleting directories,
ii. naming and renaming files, and
iii. moving them from one directory to another.

 It also defines some alphabet and syntax for composing file (and directory) names.

 All distributed systems allow directories to contain subdirectories, to make it possible for users to group related files together,
leading to a tree of directories, often called a hierarchical file system.
A directory tree contained on one machine versus a directory graph on two machines:

 In a directory tree, a link to a directory can be removed only when the directory pointed to is empty.

 In a directory graph with links (pointers) to arbitrary directories, it is allowed to remove a link to a directory as long as at least one other link exists. This is tracked with the help of a reference count, shown in the upper right-hand corner of each directory.
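The reference-count rule for the directory-graph case can be sketched as follows (assumed semantics; the directory name and starting count are invented for the example):

```python
# Sketch (assumed semantics): reference-counted directory links, as in the
# directory-graph example. A link may be removed as long as the directory
# stays reachable via at least one other link; when the count hits zero,
# the directory has become unreachable.

ref_count = {"D": 2}   # directory D is pointed to by two links

def remove_link(directory):
    if ref_count[directory] == 0:
        raise ValueError("no links left to remove")
    ref_count[directory] -= 1          # one fewer link points here
    return ref_count[directory]        # remaining links
```

Removing one of D's two links is legal (one link remains); removing the second makes D unreachable, at which point its storage can be reclaimed.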
 Issues
 1) Whether all machines (and processes) should have exactly the same view of the directory hierarchy or not.

With two file servers, either all clients have the same view of the file system, or different clients may have different views of the file system.
 A system in which different clients may have different views of the file system is flexible and straightforward to implement, but it has the disadvantage of not making the entire system behave like a single old-fashioned timesharing system.

 In a timesharing system, the file system looks the same to any process; all clients have the same view of the file system.
 Naming Transparency.
 The principal problem with machine + path naming is that it is not fully transparent.

 1) Location transparency: the path name gives no hint as to where the file (or other object) is located.
 A path like /server1/dir1/dir2/x tells everyone that x is located on server 1, but it does not tell where that server is located, so the server is free to move anywhere it wants to in the network without the path name having to be changed.

 2) Location independence: files can be moved between servers without their names having to change. Location independence is not easy to achieve, but it is a desirable property to have in a distributed system.
 There are three common approaches to file and directory naming in a distributed system:
 1. Machine + path naming, such as /machine/path or machine:path.
 2. Mounting remote file systems onto the local file hierarchy.
 3. A single name space that looks the same on all machines.
 Two-Level Naming.
 Files (and other objects) have symbolic names such as prog.c for use by people, and also internal, binary names for use by the system itself. Directories provide the mapping between these two naming levels.

 A more general naming scheme is to have the binary name indicate both a server and a specific file on that server. Alternatively, a symbolic link can be used: a directory entry that maps onto a (server, file name) string, which can be looked up on the named server to find the binary name.
 Yet another way is to use capabilities as the binary names: looking up an ASCII name then yields a capability.
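Two-level naming boils down to a directory being a mapping from symbolic names to binary names. A minimal sketch, with an invented (server, i-node) layout for the binary names:

```python
# Sketch of two-level naming (hypothetical layout, not a real file system):
# directories map human-readable symbolic names onto internal binary names,
# here represented as a (server, i-node) pair.

directory = {
    "prog.c": ("server1", 4711),   # binary name = (server, i-node number)
    "prog":   ("server2", 1234),
}

def lookup(symbolic_name):
    """Resolve a symbolic name to the binary name the system uses."""
    return directory[symbolic_name]
```

The binary name carrying the server identity is what makes the scheme "more general": the same directory can name files scattered over several servers.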
 Semantics of File Sharing.
There are four ways of dealing with shared files in a distributed system: UNIX semantics (every operation on a file is instantly visible to all processes), session semantics (changes become visible only when the file is closed), immutable files (no updates are possible), and transactions (all changes occur atomically).
 File Usage.
 Satyanarayanan (1981) made a study of file usage patterns.
 The observed file system properties (for example, most files are small, reading is much more common than writing, and most files have short lifetimes) guide the design choices below.
 System Structure.

 In some systems there is no distinction between clients and servers. All machines run the same basic software, so any machine wanting to offer file service to the public is free to do so.

 Offering file service is just a matter of exporting the names of selected directories so that other machines can access them.
 A second implementation issue on which systems differ is how the file and directory service is structured: with iterative lookup of a/b/c, the client itself contacts each server in turn; with automatic lookup, each server passes the remaining path name on to the next server.
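The iterative variant can be sketched as follows. The server names and the routing table are invented for the example; the point is that the client drives every step of the lookup itself.

```python
# Sketch of iterative lookup of a/b/c (hypothetical data layout): the client
# contacts each server in turn, and every reply tells it where to go next.

# Each server maps a directory component to (next_server, payload).
servers = {
    "s1": {"a": ("s2", "b/c")},    # s1 holds a; its contents live on s2
    "s2": {"b": ("s3", "c")},      # s2 holds a/b; its contents live on s3
    "s3": {"c": (None, "FILE")},   # s3 holds the file itself
}

def iterative_lookup(start_server, path):
    server = start_server
    for component in path.split("/"):
        next_server, payload = servers[server][component]
        if next_server is None:
            return server, payload   # found the file on this server
        server = next_server         # the CLIENT issues the next request
    return server, payload
```

Under automatic lookup the loop would instead run inside the servers: s1 would forward the remaining name "b/c" to s2 itself, and the client would see only the final answer.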


 The final structural issue is whether or not file, directory, and other servers should maintain state information about clients.

Comparing stateless and stateful servers: a stateless server survives crashes with no lost open-file state, needs no OPEN/CLOSE calls, and wastes no server space on tables; a stateful server allows shorter request messages, better performance, read-ahead, and file locking.

 Caching.
 In the client-server model there are four places to store files or parts of files: the server's disk, the server's main memory, the client's main memory, and the client's disk.

 1) Server's disk: there is generally plenty of space there, the files are then accessible to all clients, and with only one copy no consistency problems arise.
 The problem with this approach is performance: every access requires a network transfer (and often a disk transfer), so efficient algorithms must be used.

 2) Server's main memory: caching here is easy to do and totally transparent to the clients. Since the server can keep its memory and disk copies synchronized, from the clients' point of view there is only one copy of each file, so no consistency problems arise.
 Disk transfers are avoided, but network accesses still exist.
 3) Client's main memory: there are various ways of doing caching in client memory, such as inside each process, in the kernel, or in a separate user-level cache manager.
 Cache Consistency.
 Client caching introduces inconsistency into the system.
 If two clients simultaneously read the same file and then both modify it, several problems occur: when the two files are written back to the server, the one written last will overwrite the other.

Four algorithms for managing a client file cache: write-through, delayed write, write-on-close, and centralized control.
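One of these algorithms, write-through, can be sketched briefly. This is a simplified single-client illustration with invented names, not a full cache implementation:

```python
# Sketch of write-through client caching (illustrative): reads may be served
# from the local cache, but every write also goes straight to the server,
# so the server's copy is never stale.

server = {"f": b"v1"}
cache = {}

def read(name):
    if name not in cache:
        cache[name] = server[name]   # fill the cache on first read
    return cache[name]               # later reads are local

def write(name, data):
    cache[name] = data
    server[name] = data              # write-through: update the server too
```

Write-through keeps the server current at the cost of a network message per write; delayed write and write-on-close trade that guarantee for fewer messages, and centralized control has the server coordinate all caching.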


 Conclusions
 server caching is easy to do and almost always worth the trouble, independent of whether client caching is present or
not.

 Server caching has no effect on the file system semantics seen by the clients.

 Client caching offers better performance at the price of increased complexity and possibly fuzzier semantics.
 Replication
 multiple copies of selected files are maintained, with each copy on a separate file server.

 Reasons

 1. To increase reliability by having independent backups of each file.

 2. To allow file access to occur even if one file server is down.

 3. To split the workload over multiple servers.


 Replication can be done in three ways:
 1) Explicit file replication.
 The programmer controls the entire process. When a process creates a file, it does so on one specific server. Then it can make additional copies on other servers, if desired.
 2) Lazy file replication.
 only one copy of each file is created, on some server. Later, the
server itself makes replicas on other servers automatically, without
the programmer's knowledge.
 3) File replication using a group.
 all write system calls are simultaneously transmitted to all the
servers, so extra copies are made at the same time the original is
made.
 Update Protocols
 Just sending an update message to each copy in sequence is not a good idea because if the process doing the update crashes
partway through, some copies will be changed and others not.

 primary copy replication algorithm


 one server is designated as the primary. All the others are secondaries.
 When a replicated file is to be updated, the change is sent to the primary server, which makes the change locally and then
sends commands to the secondaries, ordering them to change, too. Reads can be done from any copy, primary or secondary.
 Issue
 If the primary crashes before it has had a chance to instruct all the secondaries, some copies would be left out of date. To guard against this, the update should be written to stable storage prior to changing the primary copy.

 Then, when a server reboots after a crash, a check can be made to see if any updates were in progress at the time of the crash. If so, they can still be carried out, and sooner or later all the secondaries will be updated.
 Disadvantage
 If the primary is down, no updates can be performed.

 Solution
 Gifford proposed a voting algorithm.
 The basic idea is to require clients to request and acquire the permission of multiple servers before either reading or writing a replicated file.
 Voting algorithm.
 Suppose that a file is replicated on N servers.

 To update the file, a client must first contact at least half the servers plus one (a majority) and get them to agree to do the update.

 Once they have agreed, the file is changed and a new version number is associated with the new file. The version number identifies the version of the file and is the same on all the newly updated servers.
 Voting algorithm (contd).
 To read a replicated file, a client must also contact at least half the servers plus one and ask them to send the version numbers associated with the file.

 If all the version numbers agree, this must be the most recent version, because an attempt to update only the remaining servers would fail: there are not enough of them to form a majority.
 Gifford's scheme in general.

 To read a file of which N replicas exist, a client needs to assemble a read quorum, an arbitrary collection of any Nr servers or more.

 To modify a file, a write quorum of at least Nw servers is required. The values of Nr and Nw are subject to the constraint Nr + Nw > N.

 Only after the appropriate number of servers has agreed to participate can a file be read or written.
 For example, with N = 12 one can choose Nr = 3 and Nw = 10, or Nr = 7 and Nw = 6; the extreme Nr = 1, Nw = 12 corresponds to read-one, write-all.
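A minimal sketch of the quorum mechanics, using the N = 12, Nr = 3, Nw = 10 example (the per-server version list is invented; failures are not modeled):

```python
# Sketch of Gifford-style voting (simplified): because Nr + Nw > N, any
# read quorum must overlap any write quorum, so a read always sees at
# least one copy carrying the most recent version number.

N, NR, NW = 12, 3, 10
versions = [1] * N                     # version number held by each server

def write_file(quorum):
    assert len(quorum) >= NW, "write quorum too small"
    new_version = max(versions) + 1
    for s in quorum:
        versions[s] = new_version      # every quorum member gets the update

def read_file(quorum):
    assert len(quorum) >= NR, "read quorum too small"
    # the highest version seen in the quorum identifies the latest write
    return max(versions[s] for s in quorum)
```

After a write to servers 0-9, a read quorum such as {8, 10, 11} still intersects the write quorum (at server 8), so it observes the new version even though servers 10 and 11 are stale.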


 An Example: Sun's Network File System
 NFS Architecture.
 NFS allows an arbitrary collection of clients and servers to share a common file system.

 NFS allows every machine to be both a client and a server at the same time.

 Each NFS server exports one or more of its directories for access by remote clients; the list is kept in a configuration file, so these directories can be exported automatically whenever the server is booted.

 Thus the basic architectural characteristic of NFS is that servers export directories and clients mount them remotely.

 The shared files are just there in the directory hierarchy of multiple machines and can be read and written the usual way.
 NFS Protocols.
 1) The first protocol handles mounting.
 A client can send a path name to a server and request permission to mount that directory somewhere in its directory hierarchy.

 If the path name is legal and the directory specified has been exported, the server returns to the client a file handle containing fields that uniquely identify the file system type, the disk, the i-node number of the directory, and security information.

 Subsequent calls to read and write files in the mounted directory use the file handle.
 Sun's version of UNIX also supports automounting, in which remote directories are mounted not at boot time but when they are first used.


 Automounting has two principal advantages.
 First, mounting work is not done at boot time: if the user does not even need a given server at the moment, all that work would be wasted.
 Second, by allowing the client to try a set of servers in parallel, a degree of fault tolerance can be achieved (because only one of them needs to be up), and performance can be improved (by using whichever replies first).
 2) The second protocol is for directory and file access.

 Clients can send messages to servers to manipulate directories and to read and write files. In addition, they can also
access file attributes, such as file mode, size, and time of last modification.

 Most UNIX system calls are supported by NFS, with the exception of OPEN and CLOSE.
 To read a file, a client sends the server a message containing the file name, with a request to look it up and return a file handle, which is a structure that identifies the file.
 Unlike an OPEN call, this LOOKUP operation does not copy any information into internal system tables.
 The READ call contains the file handle of the file to read, the offset in the file at which to begin reading, and the number of bytes desired.
 Advantage
 The server does not have to remember anything about open files in between calls to it; such a server is said to be stateless.
 Thus if a server crashes and then recovers, no information about open files is lost, because there is none.
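The LOOKUP/READ pair can be sketched to show why the server is stateless. The handle value and file contents are invented; real NFS handles are opaque structures, not small integers:

```python
# Sketch of a stateless read path in the NFS style (hypothetical handles):
# every request carries the file handle, offset, and byte count, so the
# server keeps no per-client open-file state between calls.

FILES_BY_HANDLE = {42: b"abcdefghij"}    # handle -> file contents

def lookup(name):
    # Returns a handle but, unlike OPEN, records nothing in server tables.
    return 42 if name == "f" else None

def read(handle, offset, nbytes):
    # Everything needed to satisfy the request arrives in the request itself.
    return FILES_BY_HANDLE[handle][offset:offset + nbytes]
```

If the server crashed between `lookup` and `read`, the client could simply retry `read` after the reboot: the handle is still meaningful, and no open-file table was lost because none existed.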
 An Example: Sun's Network File System
 NFS Implementation
1) System call layer: handles calls like OPEN, READ, and CLOSE. After parsing the call and checking the parameters, it invokes the second layer.

2) Virtual file system (VFS) layer: maintains a table with one entry for each open file, analogous to the table of i-nodes for open files in UNIX. The VFS layer has an entry, called a v-node (virtual i-node), for every open file. V-nodes are used to tell whether the file is local or remote.

3) NFS client code: when a remote directory is mounted, the kernel constructs a v-node for the remote directory and asks the NFS client code to create an r-node (remote i-node) in its internal tables to hold the file handle. The v-node points to the r-node. Each v-node in the VFS layer will ultimately contain either a pointer to an r-node in the NFS client code, or a pointer to an i-node in the local operating system.
When a remote file is opened, at some point during the parsing of the path
name, the kernel hits the directory on which the remote file system is
mounted. It sees that this directory is remote and in the directory's v-node
finds the pointer to the r-node. It then asks the NFS client code to open the
file. The NFS client code looks up the remaining portion of the path name on
the remote server associated with the mounted directory and gets back a file
handle for it. It makes an r-node for the remote file in its tables and reports
back to the VFS layer, which puts in its tables a v-node for the file that
points to the r-node. Again here we see that every open file or directory has
a v-node that points to either an r-node or an i-node.

The caller is given a file descriptor for the remote file. This file descriptor is
mapped onto the v-node by tables in the VFS layer.

When a file handle is later sent to the server for file access, the server checks the handle and, if it is valid, uses it. Validation can include verifying an authentication key contained in the RPC headers, if security is enabled.
When the file descriptor is used in a subsequent system call, for example READ, the VFS layer locates the corresponding v-node, and from that determines whether the file is local or remote, and which i-node or r-node describes it.

Transfers between client and server are done in large chunks, normally
8192 bytes, even if fewer bytes are requested. After the client's VFS layer
has gotten the 8K chunk it needs, it automatically issues a request for the
next chunk, so it will have it should it be needed shortly. This feature, known
as read ahead, improves performance considerably. For writes an
analogous policy is followed.

If a WRITE system call supplies fewer than 8192 bytes of data, the data are
just accumulated locally. Only when the entire 8K chunk is full is it sent to
the server. However, when a file is closed, all of its data are sent to the
server immediately.
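The client-side write buffering described above can be sketched as follows. This is a simplification to a single file with invented names; real NFS clients buffer per file and per block:

```python
# Sketch of NFS-style client write buffering (simplified to one file): data
# is accumulated locally and pushed to the server only when a full 8K chunk
# is ready, or when the file is closed.

CHUNK = 8192
server_data = bytearray()    # what the server has received so far
pending = bytearray()        # locally accumulated, not yet sent

def write(data):
    pending.extend(data)
    while len(pending) >= CHUNK:          # full chunk: send it now
        server_data.extend(pending[:CHUNK])
        del pending[:CHUNK]

def close():
    server_data.extend(pending)           # flush the remainder on close
    pending.clear()
```

Two 5000-byte writes send nothing after the first (the buffer holds 5000 bytes) and one 8192-byte chunk after the second; `close` then flushes the remaining 1808 bytes, matching the "all data sent on close" rule.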
 Lessons Learned
 Satyanarayanan (1990b) has stated the following general principles that he believes distributed file system designers should follow:
 Workstations have cycles to burn.
 Cache whenever possible.
 Exploit the usage properties of files.
 Minimize system-wide knowledge and change.
 Trust the fewest possible entities.
 Batch work where possible.