Outline
- Layered Protocols
- Remote Procedure Call (RPC)
- Issues:
- Parameter Passing
- Binding
- Failure Handling
- Performance and implementation issues
Communication Protocols
- Protocols are agreements/rules on communication
- Protocols can be connection-oriented or connectionless
Layered Protocols
A message as it appears on the network may contain headers and trailers that encapsulate it as it traverses the different layers of the OSI model. Each layer of the OSI model is described below.
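The encapsulation described above can be sketched roughly as follows. The header layout (a 1-byte layer number plus a 2-byte length) and the function names are made up for illustration; real protocol headers carry far more information.

```python
import struct

HEADER = struct.Struct("!BH")  # hypothetical header: layer id (1 byte) + payload length (2 bytes)

def wrap(layer_id, payload):
    """Prepend this layer's header to the payload handed down from the layer above."""
    return HEADER.pack(layer_id, len(payload)) + payload

def unwrap(frame):
    """Strip one header and return (layer_id, payload)."""
    layer_id, length = HEADER.unpack(frame[:HEADER.size])
    return layer_id, frame[HEADER.size:HEADER.size + length]

def send_down(message, layers=(4, 3, 2)):
    """Sender side: transport (4), network (3), data link (2) each add a header."""
    for layer_id in layers:
        message = wrap(layer_id, message)
    return message

def receive_up(frame, layers=(2, 3, 4)):
    """Receiver side: headers are removed in the reverse order."""
    for expected in layers:
        layer_id, frame = unwrap(frame)
        if layer_id != expected:
            raise ValueError("unexpected header for layer %d" % expected)
    return frame
```

The outermost header on the wire belongs to the lowest layer, which is why the receiver peels headers in the opposite order from the sender.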
Physical Layer
- Goal: Raw bits over some communication channel
- Sample Issues:
- How to encode a 0 or 1
- What voltage should be used?
- How long does a bit need to be signaled?
- What kind of cable should be used?
- Example:
Data Link Layer
- Goal: Transmit error-free frames (really a vector of bits) over the physical link
- Sample Issues:
- How big is a frame? (framing)
- How can errors be detected in sending the frame?
- What marks the end of a frame?
- How to control access to a shared channel? (medium access control)
- Examples:
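As a rough illustration of the framing and error-detection issues listed above, the sketch below puts a length field in front of the payload and a CRC-32 checksum behind it. The frame layout is an assumption made for this example and does not correspond to any particular link-layer standard.

```python
import struct
import zlib

def make_frame(payload):
    """Frame = 2-byte length header + payload + 4-byte CRC-32 trailer (illustrative layout)."""
    header = struct.pack("!H", len(payload))
    trailer = struct.pack("!I", zlib.crc32(header + payload))
    return header + payload + trailer

def parse_frame(frame):
    """Return the payload, or raise ValueError if the checksum does not match."""
    (length,) = struct.unpack("!H", frame[:2])
    body = frame[:2 + length]
    (crc,) = struct.unpack("!I", frame[2 + length:2 + length + 4])
    if zlib.crc32(body) != crc:
        raise ValueError("frame corrupted in transit")
    return body[2:]
```

The length field answers "how big is the frame", and the trailer lets the receiver detect (but not correct) corruption.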
Network Layer
- Goal: Route packets from the source to destination
- Sample Issues:
- How to route packets that have to travel several hops?
- Congestion control algorithm: traffic shaping, flow specifications, and bandwidth reservation
- Accounting - charge for use of the network
- Fragment or combine packets depending on rules of link layer
- Examples:
Transport Layer
- Goal: Accurately transport session data in order
- Sample Issues:
- How to order messages and detect duplicates
- Error detection (corrupt packets) and retransmission
- Connectionless or connection-oriented
- Examples:
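The ordering and duplicate-detection issue can be sketched with sequence numbers. This is a simplified model (no retransmission, no window limits); the function name and the `(seq, data)` tuple format are assumptions for illustration.

```python
def deliver_in_order(arrivals):
    """Take (seq, data) pairs in arrival order; return data in sequence order, duplicates dropped."""
    expected = 0      # next sequence number the application is waiting for
    buffered = {}     # out-of-order segments held back until the gap is filled
    delivered = []
    for seq, data in arrivals:
        if seq < expected or seq in buffered:
            continue  # already delivered or already buffered: a duplicate
        buffered[seq] = data
        while expected in buffered:
            delivered.append(buffered.pop(expected))
            expected += 1
    return delivered
```

For example, segments arriving as 1, 0, 1, 3, 2 are handed to the application as 0, 1, 2, 3.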
Session and Presentation Layer
- Goal: Common services shared by several apps
- Sample Issues:
- Allows users on different machines to establish sessions between them
- Encodes data in a standard, agreed-upon way
Application Layer
- Goal: Common types of exchanges standardized
- Examples: FTP, SMTP, HTTP
Middleware Protocols
Middleware:
- An application that logically lives in the application layer
- Contains many general-purpose protocols that warrant their own layers
Remote Procedure Call (RPC)
- Client-Server provides a mechanism for services in distributed systems, BUT
- Requires explicit communication (send/recv)
- We don’t really want that; we want the communication to be transparent, as in a remote procedure call
- How do we make distributed computing look like traditional computing?
- Can we use procedure calls?
In distributed systems, the callee can be on one system and the caller on another.
- Remote Procedure Call (RPC)
- We do NOT want explicit message passing
Goal: Make RPC look like a local procedure call
- Allow remote services to be called as procedures
- Caller should not be aware that the callee is actually on a different machine (and vice versa)
Conventional Procedure Call
Parameters are pushed onto the stack, followed by the return address, and then the callee's local variables.
Observations
Parameters in C:
- call by reference OR call by value
- Value parameters include numbers, characters, etc.
- Reference parameters include pointers, or addresses in our address space.
- Many options are language dependent
Remote procedure calls can simulate this through:
- Stubs – proxies
- Marshalling
Stubs
- The client makes what looks like a local procedure call, but the callee is a client stub
- Server is written as a standard procedure
- The stubs take care of packaging arguments, and sending messages
- The packaging is called marshalling
- Stub compilers can automatically generate stubs from specs in an IDL (interface definition language)
Steps of a Remote Procedure Call
- Client procedure calls client stub normally
- Stub builds message, calls local OS
- Client’s OS sends message to remote OS
- Remote OS gives message to server stub
- Server stub unpacks parameters, calls server
- Server does work, returns result to the stub
- Server stub packs it into a message, calls local OS
- Server OS sends message to client OS
- Client OS gives message to client stub
- Client Stub unpacks result, and returns to client
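The steps above can be sketched inside a single process. In this sketch the "network" is just a function call, JSON stands in for the marshalling format, and all names (`client_add`, `server_stub`, `transport`) are hypothetical; a real RPC system would generate the stubs and send the message through the OS.

```python
import json

# ----- server side -----
def add(a, b):
    """The actual server procedure, written as an ordinary function."""
    return a + b

PROCEDURES = {"add": add}

def server_stub(request):
    """Unpack the parameters, call the server procedure, pack the result."""
    call = json.loads(request.decode("utf-8"))
    result = PROCEDURES[call["proc"]](*call["args"])
    return json.dumps({"result": result}).encode("utf-8")

# ----- stand-in for both operating systems and the network -----
def transport(request):
    """Carries the request bytes to the server stub and the reply bytes back."""
    return server_stub(request)

# ----- client side -----
def client_add(a, b):
    """Client stub: same signature as add(), but builds a message instead of computing."""
    request = json.dumps({"proc": "add", "args": [a, b]}).encode("utf-8")
    reply = transport(request)
    return json.loads(reply.decode("utf-8"))["result"]
```

The client simply writes `client_add(2, 3)`; nothing in the call site reveals that a message was built, sent, and unpacked.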
Marshalling: Value Parameters
Problem: Different machines can have different data formats (for example, byte order or character encoding).
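A small sketch of the byte-order problem, using Python's `struct` module: the same integer has different byte layouts on little-endian and big-endian machines, so both stubs must agree on one wire format. Using network (big-endian) byte order here is an assumption for the example; the helper names are hypothetical.

```python
import struct

value = 1
little = struct.pack("<i", value)   # layout on a little-endian machine
big = struct.pack(">i", value)      # layout on a big-endian machine

# A big-endian receiver that reads the little-endian bytes unconverted
# sees a completely different number.
misread = struct.unpack(">i", little)[0]

def marshal_int(n):
    """Always encode in network byte order, regardless of the host's native order."""
    return struct.pack("!i", n)

def unmarshal_int(data):
    """Decode from network byte order back to a native integer."""
    return struct.unpack("!i", data)[0]
```

With an agreed wire format, each side converts only between its own native format and the common one.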
Marshalling: Reference Parameters
Problem: How do we pass pointers?
- If it points to a well-defined data structure, pass a copy, and have the server stub pass a pointer to its local copy
- What about data structures containing pointers?
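One commonly described way to handle a pointer to a well-defined structure is copy/restore: the client stub ships a copy, the server works on its own copy, and the modified copy is shipped back and written over the original. The sketch below assumes a list of integers and JSON as the wire format; all names are hypothetical.

```python
import json

def server_increment_all(values):
    """Server procedure that modifies its argument in place, as if given a pointer."""
    for i in range(len(values)):
        values[i] += 1

def server_stub(request):
    local_copy = json.loads(request)      # the server only ever sees its own copy
    server_increment_all(local_copy)      # the "pointer" refers to the local copy
    return json.dumps(local_copy).encode("utf-8")

def client_increment_all(values):
    """Client stub: copy the data out, then restore the modified data into the caller's list."""
    reply = server_stub(json.dumps(values).encode("utf-8"))
    values[:] = json.loads(reply)
```

This works for flat structures; structures that themselves contain pointers would need to be traversed and flattened first, which is considerably harder in general.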
Implementation Issues
- Choice of protocol
- Copying costs are dominant overheads
Summary
- RPCs make distributed computations look like local computations
- Issues:
- Parameter passing
- Binding
- Failure Handling
- Performance and implementation issues
Case Study: Sun RPC
- One of the most widely used RPC systems
- Also known as Open Network Computing (ONC)
- Originally developed by Sun, now widely available
- Sun RPC package has an RPC compiler (rpcgen)
- Automatically generates the client and server stubs
- RPC package uses XDR (eXternal Data Representation)
- to represent data sent between client and server stubs
- Has built-in representation for basic types (int, float, char) & a declarative language for specifying complex data types
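To give a feel for what XDR puts on the wire, the sketch below hand-encodes two basic types following the rules in the XDR specification: integers are 4 bytes in big-endian order, and strings are a 4-byte length followed by the bytes, zero-padded to a multiple of 4. The function names are hypothetical, and this is not the Sun RPC library's own API (in C, `rpcgen` generates the corresponding filters).

```python
import struct

def xdr_encode_int(n):
    """XDR signed integer: 4 bytes, big-endian."""
    return struct.pack(">i", n)

def xdr_encode_string(s):
    """XDR string: 4-byte length, then the bytes, padded with zeros to a 4-byte boundary."""
    data = s.encode("ascii")
    padding = (4 - len(data) % 4) % 4
    return struct.pack(">I", len(data)) + data + b"\x00" * padding
```

Every encoded item occupies a multiple of 4 bytes, which keeps decoding simple on all architectures.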
Binder: Port Mapper
1. Server start-up:
- Creates port
- Server stub calls
svc_register
to register program # and version # with local port mapper
- Port mapper stores the prog #, version #, and port
2. Client start-up:
- Calls
clnt_create
to locate server port
- Upon return, client can start to call procedures at the server
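The binding steps above amount to a lookup table keyed by program and version number. The toy class below only models that table; it is not the real port mapper protocol, and the program number and port used in the demonstration are arbitrary values chosen for illustration.

```python
class PortMapper:
    """Toy model of the table a port mapper keeps: (program #, version #) -> port."""

    def __init__(self):
        self._table = {}

    def register(self, prog, vers, port):
        """What the server does at start-up (the role played by svc_register)."""
        self._table[(prog, vers)] = port

    def lookup(self, prog, vers):
        """What the client does at start-up (the role played by clnt_create)."""
        return self._table.get((prog, vers))

mapper = PortMapper()
mapper.register(0x20000001, 1, 40000)   # server start-up
port = mapper.lookup(0x20000001, 1)     # client start-up
```

Because the client only needs the program and version numbers, the server is free to pick any available port each time it starts.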
Sun RPC Programming
The RPC library is a collection of tools for automating the creation of RPC clients and servers
- RPC clients are processes that call remote procedures
- RPC servers are processes that include procedures which are called by clients
RPC Library
- XDR Routines – Data is translated into standardized XDR format
- RPC run time library
- call rpc service, e.g. high-level client lib
callrpc(..)
- register with portmapper, e.g. high-level server lib
registerrpc(..)
- dispatch incoming request to correct procedure, e.g. high-level server lib
svc_run(..)
- Program generator: RPCGEN
Example: RPC Programming
- Write RPC protocol specification file
foo.x
- Write server procedure
fooservices.c
- Write client procedure
foomain.c
Example: RPCGEN
- A tool for automating the creation of RPC clients and servers.
- The program
rpcgen
does most of the work for you.
- The input to
rpcgen
is a protocol definition in the form of a list of remote procedures and parameter types.
Using SunRPC
rpcgen -C foo.x
- foo_clnt.c (client stubs)
- foo_svc.c (server main)
- foo_xdr.c (xdr filters)
- foo.h (shared header file)
gcc -o fooclient foomain.c foo_clnt.c foo_xdr.c -lnsl
- foomain.c is the client main() (and possibly other functions) that call rpc services via the client stub functions in foo_clnt.c
- The client stubs use the xdr functions.
gcc -o fooserver fooservices.c foo_svc.c foo_xdr.c -lrpcsvc -lnsl
- fooservices.c contains the definitions of the actual remote procedures
- Copy the server
fooserver
to the remote machine and run it in the background
- Now you can call the remote procedure on a local machine
Lightweight RPCs
- Many RPCs occur between client and server on the same machine
- Use a lightweight RPC mechanism (LRPC)
- Server S exports interface to remote procedures
- Client C on same machine imports interface
- OS kernel creates data structures including an argument stack shared between S and C
Other RPC Models
- Asynchronous RPC
- Server can reply as soon as request is received and execute procedure later
- Deferred-synchronous RPC
- Use two asynchronous RPCs
- Client needs a reply but can’t wait for it; server sends reply via another asynchronous RPC
- One-way RPC
- Client does not even wait for an ACK from the server
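From the client's point of view, these models differ mainly in when (and whether) it waits. The sketch below imitates that with a thread pool standing in for the remote server; it illustrates the calling patterns only, not how any particular RPC system implements them. All function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

def remote_square(x):
    """Stands in for a procedure running on the server."""
    return x * x

executor = ThreadPoolExecutor(max_workers=1)   # stands in for the server

def call_synchronous(x):
    """Ordinary RPC: the client blocks until the result comes back."""
    return executor.submit(remote_square, x).result()

def call_deferred(x):
    """Deferred-synchronous style: the client gets a handle and collects the reply later."""
    return executor.submit(remote_square, x)

def call_one_way(x):
    """One-way style: the request is handed off and no reply is ever read."""
    executor.submit(remote_square, x)
```

With `call_deferred`, the client can do other work and later call `.result()` on the returned future when it actually needs the value.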
Remote Method Invocation (RMI)
- RPCs applied to objects
- Class: object-oriented abstraction; module with data and operations
- Separation between interface and implementation
- Interface resides on one machine, implementation on another
- RMIs support system-wide object references
- Parameters can be object references
Proxies and Skeletons
- Proxy: client stub
- Maintains server ID, endpoint, object ID
- Sets up and tears down connection with the server
- Does serialization of local object parameters
- In practice, can be downloaded/constructed on the fly
- Skeleton: server stub
- Does deserialization and passes parameters to the server and sends results to the proxy
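A minimal in-process sketch of the proxy/skeleton pair follows. `pickle` stands in for the serialization format, a direct method call stands in for the network connection, and the `Counter` class is a made-up example object.

```python
import pickle

class Counter:
    """The real object, living on the server."""
    def __init__(self):
        self.value = 0

    def add(self, n):
        self.value += n
        return self.value

class Skeleton:
    """Server stub: deserializes the request, invokes the object, serializes the result."""
    def __init__(self, obj):
        self._obj = obj

    def handle(self, request):
        method, args = pickle.loads(request)
        result = getattr(self._obj, method)(*args)
        return pickle.dumps(result)

class Proxy:
    """Client stub: offers the object's methods and forwards each call to the skeleton."""
    def __init__(self, skeleton):
        self._skeleton = skeleton   # stands in for server ID, endpoint, and object ID

    def __getattr__(self, method):
        def invoke(*args):
            reply = self._skeleton.handle(pickle.dumps((method, args)))
            return pickle.loads(reply)
        return invoke

counter = Proxy(Skeleton(Counter()))
```

The client calls `counter.add(2)` as if `counter` were a local `Counter`; the state actually lives in the object held by the skeleton.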
Static vs Dynamic RMI
Static invocation
- Use predefined interface definitions
- Require that interfaces of an object are known when client application is being developed
Dynamic invocation
- A method invocation is composed at runtime
- An application selects at runtime which method it will invoke for some remote object
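The difference can be shown with a local object: a static invocation names the method in the source code, while a dynamic invocation builds the call from a method name that is only known at runtime. The `Calculator` class and the `invoke` helper are hypothetical and stand in for a remote object and the runtime's invocation primitive.

```python
class Calculator:
    """Stand-in for a remote object."""
    def add(self, a, b):
        return a + b

    def mul(self, a, b):
        return a * b

def invoke(obj, method_name, *args):
    """Dynamic invocation: look the method up by name at runtime, then call it."""
    method = getattr(obj, method_name, None)
    if not callable(method):
        raise AttributeError("no such method: %s" % method_name)
    return method(*args)

calc = Calculator()
static_result = calc.add(2, 3)             # static: method fixed when the client is written
dynamic_result = invoke(calc, "mul", 3, 4) # dynamic: method chosen at runtime
```

Dynamic invocation trades compile-time checking for flexibility, for example in tools that browse objects whose interfaces they have never seen.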
Parameter Passing
- Less restrictive than RPCs
- Supports system-wide object references
- Local objects passed by value, remote objects passed by reference
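The value/reference distinction can be sketched as follows. `RemoteRef` is a hypothetical stand-in for a system-wide object reference, and `copy.deepcopy` stands in for serializing a local object and rebuilding it on the other side.

```python
import copy

class RemoteRef:
    """Hypothetical stand-in for a system-wide reference to a remote object."""
    def __init__(self, target):
        self.target = target

def marshal_param(param):
    """Remote objects travel as references; local objects travel as copies."""
    if isinstance(param, RemoteRef):
        return param
    return copy.deepcopy(param)

local_list = [1, 2]
shared = RemoteRef([1, 2])

received_copy = marshal_param(local_list)
received_copy.append(3)        # changes only the callee's copy

received_ref = marshal_param(shared)
received_ref.target.append(3)  # changes the one shared object
```

After these calls, `local_list` is unchanged, while the object behind `shared` has the new element.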
Persistence and Synchronicity combinations
Persistent: Receiver does not need to be running, message is persistent
Transient: Receiver needs to be running, otherwise message is discarded
Async: Sender continues immediately after it submits its message
Sync: Sender waits until the message is stored in a local buffer at the receiving host, or is actually delivered to the receiver.
Delivery-based: Waits for an ACK from receiver
Response-based: Waits for receiver to process request
RPC & RMI belong to transient synchronous communication.
Message-oriented Transient Communication
Many distributed systems are built on top of a simple message-oriented model
- Transport-level: Berkeley sockets
- Sockets are a logical communication endpoint (send/recv)
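A small sketch of the send/recv style, using a connected socket pair inside one process so it runs without any network setup. The helper names are hypothetical; a real client and server would create their sockets with `socket.socket()` and connect over TCP/IP.

```python
import socket

def recv_exact(sock, n):
    """Read exactly n bytes, since a single recv may return fewer than requested."""
    chunks = []
    while n > 0:
        chunk = sock.recv(n)
        if not chunk:
            raise ConnectionError("peer closed the connection early")
        chunks.append(chunk)
        n -= len(chunk)
    return b"".join(chunks)

def echo_upper(message):
    """One request/reply exchange: the 'server' end upper-cases what the 'client' end sends."""
    client, server = socket.socketpair()
    try:
        client.sendall(message)                      # client: send request
        request = recv_exact(server, len(message))   # server: receive request
        server.sendall(request.upper())              # server: send reply
        return recv_exact(client, len(message))      # client: receive reply
    finally:
        client.close()
        server.close()
```

Every step of the exchange is explicit here, which is exactly the bookkeeping that RPC stubs hide from the programmer.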
Message-Passing Interface (MPI)
- Sockets are designed for network communication (TCP/IP)
- Simple send/receive primitives
- Use general-purpose protocol stacks like TCP/IP
Sockets are not a suitable abstraction for other protocols in clusters or massively parallel systems
MPI
- Hardware independent
- Designed for parallel applications on clusters
- Transient (sync/async) communication
- Key idea: Communication between groups of processes
- An endpoint can be represented as: (groupID, processID)
- Tons of primitives to facilitate this
Message-oriented Persistent Communication
Message-queuing systems provide intermediate storage for messages while the sender or receiver is inactive.
The only guarantee to the sender is that its message will eventually be placed in the receiver's queue; there is no guarantee that the message will actually be read or processed.
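The queueing idea can be sketched as below. This in-memory class is only a stand-in: a genuinely persistent queue would store messages durably (for example on disk) so they survive while either side is down. The class and message contents are made up for illustration.

```python
from collections import deque

class MessageQueue:
    """Toy receiver-side queue: holds messages until the receiver asks for them."""

    def __init__(self):
        self._messages = deque()

    def put(self, message):
        """Sender side: returns as soon as the message is queued."""
        self._messages.append(message)

    def get(self):
        """Receiver side: oldest message first, or None if the queue is empty."""
        return self._messages.popleft() if self._messages else None

inbox = MessageQueue()
inbox.put("order #1")   # the receiver is not running yet
inbox.put("order #2")
first = inbox.get()     # later, the receiver starts and drains the queue
```

Note that `put` succeeding says nothing about whether the receiver ever calls `get`, which matches the weak guarantee described above.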
A data stream is a sequence of data units:
- Discrete stream
- Continuous stream
For continuous streams:
- Async Transmission mode: Data items are transmitted sequentially, with no timing constraints
- Sync Transmission mode: There is a maximum end-to-end delay for each data unit
- Isochronous Transmission mode: There is both a maximum and a minimum end-to-end delay for each data unit (bounded jitter)
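The timing constraints can be expressed as simple checks over measured per-unit delays. This sketch assumes that synchronous mode bounds the delay from above and that isochronous mode bounds it from both sides; the function names and the sample delays (in seconds) are made up.

```python
def satisfies_synchronous(delays, max_delay):
    """Synchronous mode: every data unit arrives within the maximum end-to-end delay."""
    return all(d <= max_delay for d in delays)

def satisfies_isochronous(delays, min_delay, max_delay):
    """Isochronous mode: every data unit arrives neither too early nor too late."""
    return all(min_delay <= d <= max_delay for d in delays)

delays = [0.02, 0.05, 0.03]   # hypothetical measured end-to-end delays
sync_ok = satisfies_synchronous(delays, max_delay=0.05)
iso_ok = satisfies_isochronous(delays, min_delay=0.03, max_delay=0.05)
```

Asynchronous mode needs no check at all, since it places no timing constraint on delivery.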