What do we know from looking at the socket interface for TCP?
- Connections are made between ports on hosts
- Applications see data as arriving in undelimited streams of bytes
We have ignored UDP, which rules the world of real-time entertainment!
The stream socket interface doesn’t really support messages. That’s why we need a good presentation layer. Also, flow-control has to be built into the application protocol.
The datagram socket interface accepts all of these limitations.
UDP socket interface
- Creating a UDP socket
udpsock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
- The server, the first one to receive a packet, must have an allocated port number
- There are no true connections: packets are simply sent and received
(data, sender_address) = udpsock.recvfrom(buffer_size)
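A minimal sketch of these calls, assuming both ends run on the loopback interface. The helper name `udp_round_trip` and the upper-casing reply are made up for illustration; only `socket`, `bind`, `sendto` and `recvfrom` come from the socket module.

```python
import socket


def udp_round_trip(payload):
    """Send one datagram to a local server socket; return (request, reply)."""
    # Server side: bind so the socket has an allocated port (0 = OS picks one).
    server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    server.bind(("127.0.0.1", 0))
    server.settimeout(2.0)
    server_address = server.getsockname()

    # Client side: no connection is made; the destination is given per datagram.
    client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    client.settimeout(2.0)
    try:
        client.sendto(payload, server_address)
        # recvfrom returns the data and the address it came from.
        data, sender_address = server.recvfrom(4096)
        # The only way to answer is to send to that address.
        server.sendto(data.upper(), sender_address)
        reply, _ = client.recvfrom(4096)
        return data, reply
    finally:
        client.close()
        server.close()
```

The timeouts are there only so that a lost datagram ends the sketch with an exception instead of blocking forever.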
Dealing with unreliability
Because UDP packets can be lost, duplicated and reordered, the application protocol often needs to tag datagrams with sequence numbers in order to match requests with responses. In ONC RPC this tag is called the xid, or transaction id. Each ONC RPC message also indicates whether it is a call (request) or a reply.
It gets worse. Usually the requester must set timers so that it can retransmit a lost request. This significantly complicates the programming of the UDP client.
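Here is a rough sketch of what the client ends up doing. The 4-byte xid header, the function names and the timeout values are all assumptions made for this example; this is not the ONC RPC wire format.

```python
import socket
import struct
import threading

HEADER = struct.Struct("!I")  # hypothetical header: 4-byte xid, network order


def request_with_retry(sock, server_address, xid, payload,
                       timeout=0.2, max_tries=5):
    """Send payload tagged with xid, retransmitting until a matching reply."""
    message = HEADER.pack(xid) + payload
    sock.settimeout(timeout)
    for _ in range(max_tries):
        sock.sendto(message, server_address)
        try:
            while True:
                data, _ = sock.recvfrom(4096)
                if len(data) >= HEADER.size and HEADER.unpack_from(data)[0] == xid:
                    return data[HEADER.size:]
                # Anything else is a stale or duplicate reply: ignore it.
        except socket.timeout:
            continue  # the request or the reply was lost: send it again
    raise TimeoutError(f"no reply to xid {xid} after {max_tries} tries")


def lossy_echo_server(sock, drop_first):
    """Demo server: silently drop the first few datagrams, then echo one."""
    for _ in range(drop_first):
        sock.recvfrom(4096)
    data, sender = sock.recvfrom(4096)
    sock.sendto(data, sender)


def demo():
    server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    server.bind(("127.0.0.1", 0))
    server.settimeout(5.0)
    worker = threading.Thread(target=lossy_echo_server, args=(server, 2),
                              daemon=True)
    worker.start()
    client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        return request_with_retry(client, server.getsockname(),
                                  xid=7, payload=b"ping")
    finally:
        worker.join()
        client.close()
        server.close()
```

Even this small version needs a timer, a retry limit and a check that the reply matches the outstanding request. None of that is needed with a stream socket.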
But it gets even worse. If the request isn’t idempotent, the server should not perform it more than once. Think about what this means for common file operations and for REST applications.
ONC RPC supports three call semantics related to guarantees of response to requests.
- at most once
- at least once
- maybe
In at least once, the requester keeps retransmitting until a reply is received. In at most once, the server must cache responses. In maybe, the request is sent without any guarantee, so it usually, but not always, happens.
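The response cache needed for at most once can be sketched as follows. The class name, the cache key (client address plus xid) and the counter handler are assumptions for illustration. A real server would also have to expire old entries, which this sketch does not do.

```python
class AtMostOnceServer:
    """Run each (client, xid) request once; replay the cached reply after that."""

    def __init__(self, handler):
        self.handler = handler
        self.replies = {}  # (client_address, xid) -> cached reply bytes

    def handle(self, client_address, xid, request):
        key = (client_address, xid)
        if key not in self.replies:
            # First time this request is seen: actually perform it.
            self.replies[key] = self.handler(request)
        # A retransmission gets the same reply without re-running the handler.
        return self.replies[key]


# A deliberately non-idempotent operation: add the request value to a counter.
counter = {"value": 0}


def add_to_counter(request):
    counter["value"] += int(request)
    return str(counter["value"]).encode("ascii")
```

With `server = AtMostOnceServer(add_to_counter)`, handling the same xid twice from the same client changes the counter only once. That is exactly what a retransmitted "append" or "POST" needs.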
Another look at “connections”
The BSD socket interface does allow calling connect on a datagram socket. This sets a default destination for sent datagrams and allows the use of send. It also allows the use of recv, which will only return packets sent from the “connected” address. The Python socket module also supports this silliness for a connection-less protocol.
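A small loopback sketch of a “connected” datagram socket. The function name and the third “stranger” socket are made up for the demonstration. The expectation that the kernel discards the stranger's datagram is how connected UDP sockets normally behave, but treat it as an assumption on your platform.

```python
import socket


def connected_udp_demo():
    """Return (what the server received, what the connected client received)."""
    sockets = [socket.socket(socket.AF_INET, socket.SOCK_DGRAM) for _ in range(3)]
    server, stranger, client = sockets
    for sock in sockets:
        sock.bind(("127.0.0.1", 0))
        sock.settimeout(2.0)
    try:
        # No packet is sent here; connect only records the default peer.
        client.connect(server.getsockname())
        client.send(b"hello")  # no destination address needed
        data, client_address = server.recvfrom(4096)

        # A datagram from some other address should not reach the client.
        stranger.sendto(b"noise", client_address)
        server.sendto(b"reply", client_address)
        return data, client.recv(4096)
    finally:
        for sock in sockets:
            sock.close()
```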
In some UDP connection-like protocols, the well-known port is used only for the initial message. The reply to the initial message is sent from a newly allocated UDP port number. Subsequent messages use the new UDP port.
Examples of UDP servers
This server numbers each line and echoes it back.
Notice how easy it is to maintain multiple clients.
Also, notice the absence of listen and accept.
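A sketch of such a server, not the original course code. The function names, the reply format and the `max_datagrams` stop condition (added so the demo terminates) are assumptions.

```python
import socket


def serve_numbered_echo(sock, max_datagrams=None):
    """Echo each received line to its sender, prefixed with a running count."""
    count = 0
    while max_datagrams is None or count < max_datagrams:
        data, sender_address = sock.recvfrom(4096)
        count += 1
        # Replying to whoever sent the datagram is all the "client handling"
        # there is: no listen, no accept, no per-client socket.
        sock.sendto(b"%d: %s" % (count, data), sender_address)


def numbered_echo_demo():
    server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    server.bind(("127.0.0.1", 0))
    first = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    second = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    for sock in (server, first, second):
        sock.settimeout(2.0)
    try:
        # Datagrams queue on the bound socket until the server reads them.
        first.sendto(b"first line", server.getsockname())
        second.sendto(b"second line", server.getsockname())
        serve_numbered_echo(server, max_datagrams=2)
        return first.recv(4096), second.recv(4096)
    finally:
        for sock in (server, first, second):
            sock.close()
```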
Here’s a connection-oriented server using a Python dictionary to keep up with information about each client. The server returns the line count of both the client and all clients.
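One way this might look, again as a sketch and not the original code. The `LineCounter` class, the reply text and the `max_datagrams` parameter are hypothetical. The dictionary is keyed by the sender address returned from `recvfrom`.

```python
import socket


class LineCounter:
    """Per-client state for a connection-like UDP server."""

    def __init__(self):
        self.per_client = {}  # sender address -> lines seen from that client
        self.total = 0        # lines seen from all clients

    def record(self, sender_address):
        """Count one line from sender_address; return (client count, total)."""
        self.per_client[sender_address] = self.per_client.get(sender_address, 0) + 1
        self.total += 1
        return self.per_client[sender_address], self.total


def serve_line_counts(sock, max_datagrams=None):
    """Reply to each datagram with the sender's line count and the overall count."""
    counter = LineCounter()
    while max_datagrams is None or counter.total < max_datagrams:
        _, sender_address = sock.recvfrom(4096)
        mine, everyone = counter.record(sender_address)
        reply = "you: %d, all clients: %d\n" % (mine, everyone)
        sock.sendto(reply.encode("ascii"), sender_address)
    return counter


def make_server_socket(port=0):
    """Bind a UDP socket on the loopback interface (port 0 = OS picks one)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("127.0.0.1", port))
    return sock
```

The “connection” exists only as an entry in the dictionary. Nothing tells the server when a client goes away, so a long-running version would need some way to forget idle clients.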
The UDP specification was published in 1980. Notice that the checksum is “the 16-bit one's complement of the one's complement sum of a pseudo header”. This is one place where assembly language is likely to be used.
This is done by the operating system. Take a look at 2500 lines of real operating system code.
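To see what that sentence means, here is a slow but readable Python sketch of the checksum for UDP over IPv4. The function names are made up; the pseudo header layout (source address, destination address, a zero byte, protocol 17, UDP length) follows the UDP specification.

```python
import socket
import struct


def ones_complement_sum16(data):
    """One's complement sum of the data taken as 16-bit big-endian words."""
    if len(data) % 2:
        data += b"\x00"  # pad with a zero byte to a whole number of words
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
        total = (total & 0xFFFF) + (total >> 16)  # end-around carry
    return total


def internet_checksum(data):
    """16-bit one's complement of the one's complement sum."""
    return ~ones_complement_sum16(data) & 0xFFFF


def pseudo_header(src_ip, dst_ip, udp_length):
    """IPv4 pseudo header that is summed but never transmitted."""
    return (socket.inet_aton(src_ip) + socket.inet_aton(dst_ip)
            + struct.pack("!BBH", 0, 17, udp_length))


def udp_checksum(src_ip, dst_ip, src_port, dst_port, payload):
    length = 8 + len(payload)  # UDP header is 8 bytes
    header = struct.pack("!HHHH", src_port, dst_port, length, 0)
    checksum = internet_checksum(pseudo_header(src_ip, dst_ip, length)
                                 + header + payload)
    # A computed value of zero is transmitted as all ones.
    return checksum or 0xFFFF
```

A receiver checks a datagram by summing the pseudo header and the whole segment, checksum included. The result should be 0xFFFF.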
TCP is designed to keep the whole network happy.
Going across the Internet is less predictable than going across the campus network.
Compare the sizes of the TCP and UDP modules in the Linux kernel to see how much more code is devoted to TCP. TCP Vegas is a congestion avoidance algorithm.
See the TCP specification.
The three-way handshake is very important. The sequence numbers need to be unpredictable to avoid ancient IP spoofing attacks.
Take a look at the TCP Connection State Diagram, Figure 6 of RFC 793, or better yet, consult this diagram:
Let’s review the “three things” mentioned on page 406 of the textbook.
- The client’s ACK is lost: Server is in SYN_RECEIVED
- The link from LISTEN to SYN_SENT allows the server to become a client (but no known application does this)
- There are unshown arcs for timeouts which make their way to CLOSED
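The connection-opening part of the state diagram can be written as a small transition table. This is a simplified sketch: the event names are informal labels chosen for this example, and the closing states and the timeout arcs are left out.

```python
# (current state, event) -> next state, for connection establishment only
TRANSITIONS = {
    ("CLOSED", "passive open"): "LISTEN",
    ("CLOSED", "active open"): "SYN_SENT",
    ("LISTEN", "receive SYN"): "SYN_RECEIVED",
    ("LISTEN", "send"): "SYN_SENT",  # the server becomes a client
    ("SYN_SENT", "receive SYN+ACK"): "ESTABLISHED",
    ("SYN_SENT", "receive SYN"): "SYN_RECEIVED",  # simultaneous open
    ("SYN_RECEIVED", "receive ACK"): "ESTABLISHED",
}


def run(state, events):
    """Follow the table from state through a list of events."""
    for event in events:
        state = TRANSITIONS[(state, event)]
    return state
```

For example, `run("CLOSED", ["passive open", "receive SYN"])` leaves the server in SYN_RECEIVED, which is where it sits if the client's ACK is lost.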
These diagrams explain it all. However, pay attention: Each end of a TCP connection is both a sender and receiver. The sender on one end transmits data to the receiver on the other. It’s easier to understand as two independent one-way connections. Also, the pointers out of the applications, LastByteWritten and LastByteRead, are often the responsibility of the operating system file interface.
Slides 26 & 27
We need to look at all of these constraints.
MSL is maximum segment lifetime.
We will see more solutions later.
We are talking about bandwidth between applications, aren’t we?
The usual MTU, maximum transmission unit, for our computers is 1500. The lower levels can fake it.
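A quick calculation shows what the MTU means for the bandwidth an application actually sees. It assumes 20-byte IP and TCP headers (no options) and standard Ethernet framing overhead of 38 bytes per frame. The function names are made up.

```python
IP_HEADER = 20           # bytes, IPv4 without options
TCP_HEADER = 20          # bytes, TCP without options
ETHERNET_OVERHEAD = 38   # 14 header + 4 FCS + 8 preamble/SFD + 12 interframe gap


def max_segment_size(mtu):
    """Largest TCP payload that fits in one IP packet of size mtu."""
    return mtu - IP_HEADER - TCP_HEADER


def application_fraction(mtu):
    """Fraction of the raw Ethernet bit rate left for application data."""
    return max_segment_size(mtu) / (mtu + ETHERNET_OVERHEAD)


if __name__ == "__main__":
    print(max_segment_size(1500))
    print(round(application_fraction(1500), 3))
```

With an MTU of 1500 the maximum segment size is 1460 bytes, so even a perfectly full pipe delivers only about 95% of the link rate to the application.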
The concern is that the silly window persists between the two ends, so that a segment never carries a full load.
Picking up on Thursday
- Times when the receiving application (operating system) may acknowledge
- Times when the sending application (operating system) may (re)transmit
Also, take a look at a sliding window demo.
Nagle’s algorithm is described in RFC 896.
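The decision Nagle's algorithm makes each time the application hands over data can be sketched as one function. This follows the usual textbook formulation, not any particular kernel; the function and parameter names are assumptions.

```python
def nagle_should_send(available_bytes, window_bytes, unacked_bytes, mss=1460):
    """Decide whether the sender should transmit now or keep buffering."""
    if available_bytes >= mss and window_bytes >= mss:
        return True   # a full segment fits: send it
    if unacked_bytes > 0:
        return False  # data is in flight: wait for its ACK, keep buffering
    return available_bytes > 0  # nothing in flight: send what there is
```

The arriving ACK acts as the timer, so small writes are batched roughly once per round trip. An application that cannot tolerate this delay can turn the algorithm off with `sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)`.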
The paper Improving Round-Trip Time Estimates in Reliable Transport Protocols presents the Karn algorithm.
The Karn & Partridge algorithm was an attempt to address the problem of congestion occurring within a growing internet. Chapter 6 presents more recent work.
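A sketch of the two Karn & Partridge rules applied to the original TCP estimator. The class name, the method names and the default constants (a weight of 0.875 and a timeout of twice the estimate) are assumptions chosen for illustration.

```python
class KarnPartridgeTimer:
    """Retransmission timer using the two Karn & Partridge rules."""

    def __init__(self, initial_rtt=1.0, alpha=0.875, beta=2.0):
        self.estimated_rtt = initial_rtt
        self.alpha = alpha  # weight given to the old estimate
        self.beta = beta    # timeout is beta times the estimate
        self.timeout = beta * initial_rtt

    def on_ack(self, sample_rtt, was_retransmitted):
        """Update the estimate from an ACK; return the current timeout."""
        if was_retransmitted:
            # Rule 1: the ACK might be for either transmission, so the
            # sample is ambiguous and is not used.
            return self.timeout
        self.estimated_rtt = (self.alpha * self.estimated_rtt
                              + (1 - self.alpha) * sample_rtt)
        self.timeout = self.beta * self.estimated_rtt
        return self.timeout

    def on_timeout(self):
        """Rule 2: double the timeout each time a segment is retransmitted."""
        self.timeout *= 2
        return self.timeout
```

The backed-off timeout stays in force until an ACK for a segment that was sent only once produces a clean sample.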
Interesting things in the book but not the slides
Some application programmers are using PSH and URG to indicate data boundaries.
The Java Socket class has a method, sendUrgentData, to send an urgent byte. Python allows programmers to send out-of-band data using C-like calls.
TCP has several options for extending performance. These include timestamps, window scaling factors and selective acknowledgment.
Use the source
TCP and UDP performance tuning in Linux
[…]$ sudo tcpdump host not connectinghost
[…]$ sudo tcpdump -n host not connectinghost
[…]$ sudo tcpdump -n tcp and host not connectinghost
[…]$ sudo tcpdump -n udp and host not connectinghost
[…]$ sudo tcpdump -n net not network