Networking Concepts

Networking systems are commonly described in terms of the OSI (Open Systems Interconnection) seven-layer model, published by the ISO (International Organization for Standardization), which divides the networking process into the following layers: physical, data link, network, transport, session, presentation, and application. However, keep in mind that, in practice, protocols span multiple layers, and you shouldn't worry if your application doesn't fit in this model. Most of today's networking stacks (including TCP/IP) use fewer layers, and those layers are not as cleanly separated as in the OSI model. Consequently, if you try to map a TCP/IP session onto the OSI model, you will get a bit confused because some layers are merged and others are missing.
Network connections can be of two types: connection-oriented or connectionless (packet-oriented). Let's talk about the TCP/IP pair, which is built on packet-oriented delivery. Nowadays, I can't imagine a single machine that doesn't support it. TCP/IP is the most widely used networking protocol, possibly because it is robust and not tied to any particular physical medium, and maybe also because the specifications are freely available.

TCP/IP was originally created by the United States Department of Defense, and soon this protocol combination became the network of choice for the U.S. government, the Internet, and the universities. This pair runs on virtually every operating system platform, which makes it a strong choice when internetworking between different LAN environments is required. Today, a great number of commercial and public networks are built on top of this implementation. Although the Internet grew out of the TCP/IP work done at universities and the U.S. Department of Defense, the network that became the Internet didn't adopt TCP/IP until part of the way through its development.

The network layer of the TCP/IP stack is provided by the Internet Protocol (commonly known as IP). This protocol provides the basic mechanism for routing packets in the Internet: it sends packets of data back and forth without building an end-to-end connection. IP doesn't understand the relationships between packets, and doesn't perform retransmission. (It is not a reliable communication protocol!) Therefore, reliability has to be provided by a higher-level protocol such as TCP; UDP, which is also layered on IP, remains unreliable. IP does verify, through a header checksum, that its own header is not corrupted.

TCP stands for Transmission Control Protocol, and it is the main form of communication over the Internet because it provides a reliable, connection-oriented, two-way service for the delivery of sequenced packets over a session.
Each packet of information exchanged over a session is given a sequence number through which it gets tracked and individually acknowledged. Duplicate packets are detected and discarded by the session services. Sequence numbers are not globally unique or even necessarily unique to the session, although within a small enough time window they are unique to the session.

TCP/IP doesn't provide an application interface layer; the application provides the application layer. However, sockets have emerged as TCP/IP's premier peer-to-peer API, providing a way of writing portable networking applications.

UDP, which stands for User Datagram Protocol, is another protocol that provides transport services. This protocol provides an unreliable but fast datagram service. Datagrams are unreliable in the sense that they are not acknowledged or tracked through a sequence number. After transmitting the datagram, you have to hope that it gets received. You don't know if the recipient is there, or even if it is expecting a datagram. Some statistics say that about 5% of the datagrams don't make it. That's depressing, isn't it?

Note
UDP is useful for streaming media, where a packet that is late is useless, so retransmission is not desirable.

UDP is a connectionless transport protocol that doesn't guarantee delivery or packet sequence. As an example, UDP carries most DNS lookups, where a query that gets lost can simply be sent again.

UDP is generally faster than TCP. The reason is that TCP spends more time exchanging control information between the machines in order to guarantee that the data gets transferred. That doesn't happen when using UDP. Another fact is that while transferring data packets, TCP waits until all the packets arrive, and organizes them in sequence for the client program. UDP doesn't do that. It leaves the client program to decide how the packets should be interpreted because packets aren't received in any specific order. The problem is that this kind of implementation is unreliable because there is no way to confirm whether the information has reached its destination.

If you need a stream-oriented protocol, TCP is about as fast as you will get. If it were such a bad protocol, it would have been replaced by now.

Protocols

The most commonly used application protocols are built on top of the TCP/IP infrastructure. Actually, they don't have to know any details about TCP or IP because a thin layer called sockets exists between TCP/IP and them. Python has modules that handle and support the access to all the following protocols. These protocols use the services provided by sockets in order to transport packets on the network and to make connections to other hosts.
Addresses

A socket address, on the TCP/IP internet structure, consists of two parts: an Internet address (commonly known as an IP address) and a port number.

The IP address defines the addressing and routing of information around the network, uniquely identifying a network interface. An IP address is a 32-bit number (a sequence of four bytes), usually represented by four decimal numbers ranging from 0 to 255, separated by dots. An IP address looks something like 128.85.15.53. Each IP number must be unique for each TCP/IP network interface card within an administered domain, which in most cases means that each machine connected to the Internet has a unique IP address. Actually, a networked machine can have more Internet addresses than network interfaces. This is quite common in virtual hosting situations.

A port is an entry point to an application/service that resides on a server. It is a number represented by a 16-bit integer. This number can range between 0 and 65535, but you can't freely use all of them inside your programs. Always choose a port number of 1024 or higher because the range 0–1023 is reserved by the operating system for some network protocols. Specific ports are shown in Table 10.1.

Note
Ports 0-1023 are called privileged ports and on most systems only the super user can run applications that use them. If you do not specify a port for one of the end points of your connection, one from the 1024-65535 range will be chosen.
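The two halves of a socket address can be inspected directly. The following is a minimal sketch, written in Python 3 syntax (unlike the Python 2 listings in this chapter). The helper names describe_address and pick_free_port are hypothetical, and binding to port 0 is used here as one common way of "not specifying a port" so that the system picks one:

```python
import socket
import struct


def describe_address(dotted_quad):
    """Return the 32-bit integer behind a dotted-quad IPv4 address."""
    packed = socket.inet_aton(dotted_quad)       # exactly four raw bytes
    (as_integer,) = struct.unpack("!I", packed)  # "!" = network byte order
    return as_integer


def pick_free_port():
    """Bind to port 0 so the operating system chooses an unused port."""
    probe = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        probe.bind(("127.0.0.1", 0))
        return probe.getsockname()[1]            # (host, port) -> port
    finally:
        probe.close()


address_value = describe_address("128.85.15.53")
chosen_port = pick_free_port()
print("128.85.15.53 as a 32-bit integer:", address_value)
print("port chosen by the system:", chosen_port)
```

The exact port printed varies from run to run and from system to system; the only expectation is that it falls outside the privileged range.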
A larger list of ports can be found in the /etc/services file on UNIX machines or c:\windows\services on Win95/Win98 machines.

Most of the time, you don't need to worry about knowing the IP addresses offhand. DNS services provide a translation between IP addresses and hostnames because it is much easier to remember a name than a sequence of numbers. You should know that extra mappings between IP addresses and hostnames can be added in the /etc/hosts or c:\windows\hosts file.

The conclusion is that if you need to connect your client program to an application running on a server, you just need to know the server's IP address or hostname, and the port number on which the application is listening. Together, TCP and IP provide the basic network services for the Internet.

Sockets

Sockets are objects that provide the current portable standard for network application providers on certain suites of network protocols (such as TCP/IP, ICMP/IP, UDP/IP, and so forth). They allow programs to accept and make connections, as well as to send and receive data. Each end of a network communication must have a socket object in order to establish the communication channel.

Sockets were first introduced in 1981 as the UNIX BSD 4.2 generic interface that would provide UNIX-to-UNIX communications over networks. Since then, sockets have become part of the BSD UNIX system kernel, and they have also been adopted on a lot of other UNIX-like operating systems, including Linux. Support for sockets is also provided, in the form of libraries, on a multiplicity of non-UNIX systems, including MS-DOS, Windows, OS/2, Mac OS, and most mainframe environments. The Windows socket API, known colloquially as WinSock, is a multivendor specification that has standardized the use of TCP/IP under Windows. This library is based on the Berkeley sockets interface as well.
Of course, WinSock is not as convenient as a real sockets interface because its socket descriptors are not ordinary file descriptors, so sockets and files can't be mixed in a single call to the select function. Sockets are available in so many environments because they are implemented using a standard C-level interface, which makes them easier to port to other operating systems.

Each socket has a type that determines which protocol carries its traffic. The type is specified at creation time. The three most popular socket types are stream, datagram, and raw. Stream sockets interface to the TCP protocol and datagram sockets to the UDP protocol, whereas raw sockets interface directly to the IP protocol. Note, however, that sockets are not limited to TCP/IP: it is stream over a PF_INET connection that gives TCP, and datagram over PF_INET that gives UDP.

The socket Module

The socket module is a very simple object-based interface that provides access to the low-level BSD socket-style networking layer. Both client and server sockets can be implemented using this module. This module provides an exception called error, which is raised every time a socket- or address-related error happens. Now we will look at the methods that are implemented by this module.
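Before turning to the individual methods, the socket types and the error exception described here can be illustrated with a short sketch. It assumes Python 3 syntax, where socket.error still exists as an alias of the built-in OSError. The failure is provoked by sending on a TCP socket that was never connected, which is a choice made purely for illustration:

```python
import socket

# Stream socket in the Internet family: this gives TCP.
tcp_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
# Datagram socket in the Internet family: this gives UDP.
udp_socket = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

# The type chosen at creation time can be read back from the object.
tcp_is_stream = tcp_socket.type == socket.SOCK_STREAM
udp_is_datagram = udp_socket.type == socket.SOCK_DGRAM

# Sending on a stream socket that has no connection is a socket-related
# error, so the module's error exception is raised.
try:
    tcp_socket.send(b"hello")
    send_failed = False
except socket.error as exc:
    send_failed = True
    print("socket error:", exc)

tcp_socket.close()
udp_socket.close()
```

The text of the error message depends on the operating system, so it is printed rather than compared against a fixed string.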
Each socket object has the following methods:
The next two functions are usually used for sending packets on a datagram oriented protocol such as UDP.
For those that already have Python 2.0 installed, you should know that as a result of some changes in the Python design, you are encouraged to use an extra pair of parentheses when passing tuples as arguments to some functions of the socket module. Note that some functions still accept the old interface, but you are encouraged to start using the new model right away, for example, socket.connect( ('hostname', 80) ). Among the functions that still accept the old interface, we have: socket.connect(), socket.connect_ex(), and socket.bind().

Starting with Python 2.0, OpenSSL support is available for the socket module. That means that from now on you can encrypt the data you send over a socket using this implementation of the Secure Socket Layer. In order to have it properly installed, you need to edit the Modules/Setup file to include SSL support before compiling Python. Doing so will add the socket.ssl() function to your socket module.

socket.ssl()

This function takes a socket object and returns an SSL socket.

basic syntax: socket.ssl(socket, keyfile, certfile)

Making Connections

Because we already know that sockets are mostly used for TCP and UDP connections, let's see how to implement those interfaces using Python. Initially, we will check the necessary steps to start a TCP connection. The server application needs to
After these steps are performed, the TCP client application just needs to
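A minimal sketch of the server and client steps, run inside a single script so that it can be tried without a second machine. It assumes Python 3 syntax (so the payload is a bytes object) and the loopback address 127.0.0.1. Binding to port 0 and the five-second timeouts are choices made here for illustration and are not part of the chapter's own listings:

```python
import socket

# Server side: create a socket, bind it to an address, start listening.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))      # port 0: let the system pick a port
server.listen(5)
server.settimeout(5)               # avoid blocking forever in accept()
host, port = server.getsockname()

# Client side: create a socket and connect to the server's address.
client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.settimeout(5)
client.connect((host, port))

# Server side again: accept() returns a new socket for this client,
# plus the client's address.
connection, client_address = server.accept()
connection.sendall(b"This string could be anything")
connection.close()

# Client side: read the reply, then clean up.
received = client.recv(512)
client.close()
server.close()
print("The data received is", received)
```

Running both ends in one process works here because the operating system completes the connection handshake as soon as the server is listening; a real server and client would normally be separate programs.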
When the server receives the client request to establish a connection, it processes the request and sends the response back to the client.

     1: # TCP server example
     2: import socket
     3: svrsocket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
     4: svrsocket.bind(("", 8888))
     5: svrsocket.listen(5)
     6: while 1:
     7:     data_to_send = "This string could be anything"
     8:     clisocket, address = svrsocket.accept()
     9:     print "I got a connection from ", address
    10:     clisocket.send(data_to_send)
    11:     clisocket.close()

The first argument in line 3 is the address family. Currently, Python supports only two values: AF_UNIX (for UNIX domain sockets) and AF_INET (for Internet sockets). If you are using a non-UNIX system, you must use the AF_INET family.

The second argument in line 3 defines the type of connection that must be opened. The common choices are SOCK_STREAM for stream-based connections (TCP) and SOCK_DGRAM for datagram-based connections (UDP). Depending on your system, you might also have other options: SOCK_SEQPACKET, SOCK_RAW, SOCK_RDM, SOCK_PACKET (obsolete).

After creating a server socket, you need to bind the socket to a port on the local machine (line 4). Note that bind() takes the address as a single tuple holding the hostname and the port. The socket will listen on this port and process all the requests that come to it. In this example, we are binding to port 8888. Remember that you should not use port numbers below 1024 because they are reserved for system services. The 20,000–30,000 range should also be avoided because it is reserved for the Remote Procedure Call (RPC) services. Of course, you should use these port numbers if you are implementing one of those services.

Tip
On UNIX systems, you need to have root privileges to implement services on ports lower than 1024. NT systems implement the same concept where ports lower than 1024 can only be used by system (or root) processes or by programs executed by privileged users.

The listen() method (line 5) tells the server to start "listening" on the port, waiting for connections. After a client connects to this server, the accept() method (line 8) returns a new socket. Note that two sockets are involved in the whole process: one to establish the connection, and the other one to manage all the transactions between the client and the server.

The following example implements the client version of our program:

    1: # TCP client example
    2: import socket
    3: clisocket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    4: clisocket.connect(("lessaworld.com", 8888))
    5: data = clisocket.recv(512)
    6: clisocket.close()
    7: print "The data received is ", data

The socket() function (line 3) creates a TCP socket, which then tries to connect to the server and port given as the argument of the connect() method (line 4). After the connection is set up, the recv() method (line 5) is used to read the data. In this example, we limit the read to a maximum of 512 bytes.

The next task is to implement the same client/server architecture using the UDP protocol. The steps necessary to start a UDP connection are as follows:
After these steps are performed, the UDP client application just needs to
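A minimal sketch of the UDP exchange, again with both ends in one script. It assumes Python 3 syntax and the loopback address 127.0.0.1. The port is chosen by the system, and the uppercase reply is a made-up behavior added only to show that the server can answer the address returned by recvfrom():

```python
import socket

# Server side: create a datagram socket and bind it; no listen/accept.
server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
server.bind(("127.0.0.1", 0))      # port 0: let the system pick a port
server.settimeout(5)               # a lost datagram must not hang us
server_address = server.getsockname()

# Client side: create a datagram socket and send straight away.
client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
client.settimeout(5)
client.sendto(b"hello over UDP", server_address)

# Server side: recvfrom() returns the data and the sender's address.
data, sender = server.recvfrom(256)
server.sendto(data.upper(), sender)

# Client side: read the reply, if it arrives.
reply, reply_source = client.recvfrom(256)

client.close()
server.close()
print(sender[0], "said :", data)
print("reply :", reply)
```

On the loopback interface datagrams practically always arrive, but the timeouts are there because nothing in UDP itself guarantees it.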
When the server receives a datagram from the client, it can send a response back to the sender. And that's it. As you know, there is no concept of connection here.

The following code demonstrates how to implement a UDP server.

    1: # UDP server example
    2: import socket
    3: svrsocket = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    4: svrsocket.bind(("", 8000))
    5: while 1:
    6:     data, address = svrsocket.recvfrom(256)
    7:     print address[0], "said : ", data

The recvfrom() method (line 6) is used to read datagrams that are sent to the port specified in line 4. The recvfrom() method returns two values: the actual data and the address of the host that has sent the data.

The following code demonstrates how to implement a UDP client.

     1: # UDP client example
     2: import socket
     3: clisocket = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
     4: while 1:
     5:     data = raw_input("Type something: ")
     6:     if data:
     7:         clisocket.sendto(data, ("lessaworld.com", 8000))
     8:     else:
     9:         break
    10: clisocket.close()

To send data to the server implementation, you need to use the sendto() method (line 7). The first argument is the data you want to send, and the second one is a tuple containing both the hostname and the port number of the server. The UDP implementation doesn't try to set up a connection before starting to send datagrams. When you transmit data using UDP, it's hard to know whether the other machine has received the datagram.

For more information about sockets, you should consider viewing Gordon McMillan's HOWTO on socket programming at http://www.python.org/doc/howto/sockets/

Darrell Gallion's Web site also has some examples that might help you get started with sockets: http://www.dorb.com/darrell/sockets

Asynchronous Sockets

The asyncore module provides the basic infrastructure for writing asynchronous socket service clients and servers, whose behavior is driven by a series of events dispatched by an event loop.
This module is used to check what is happening with sockets in the system, and it implements routines to handle each situation. The core of this module is the dispatcher class.

dispatcher([socket])

This is the constructor of the asyncore.dispatcher class. To use this class, you need to subclass it and override the methods for the events that you want to handle. This class is just a wrapper on top of a socket object. If the socket argument is omitted, you need to call the create_socket() method as shown in the following example:

    import asyncore
    import socket

    class Dispatcher(asyncore.dispatcher):
        # Wraps one accepted connection: send the data, then hang up.
        def handle_write(self):
            self.send("data")
            self.close()

    class DataServer(asyncore.dispatcher):
        def __init__(self, port=8888):
            # The base class must be initialized before its methods are used.
            asyncore.dispatcher.__init__(self)
            self.port = port
            self.create_socket(socket.AF_INET, socket.SOCK_STREAM)
            self.bind(("", port))
            self.listen(5)

        def handle_accept(self):
            link, address = self.accept()
            Dispatcher(link)

    dataserverobj = DataServer(8888)
    asyncore.loop()

This example overrides two methods from the dispatcher class: handle_write() and handle_accept(). The first one is called when the socket becomes writable, and the other one is called when the listening socket receives a connection request. The other methods available in this class are as follows:
The dispatcher class also provides methods whose implementation is similar to those available in the socket module. Here is the list: create_socket (equivalent to socket), connect, bind, listen, send, recv, accept, and close. This module also exposes two functions:
You can also check out the Asynchronous Sockets Library, by Sam Rushing, which is used for building asynchronous socket clients and servers: http://www.nightmare.com/software.html

With this library, a single program can simultaneously communicate with many other clients and servers, using and implementing multiple protocols running within a single address space on a single thread. Included in the library are sample clients, servers, and demonstrations for several Internet protocols, including HTTP, finger, DNS, POP3, and FTP.

The select Module

The select module is used to implement polling and to multiplex processing across multiple I/O streams without using threads or subprocesses. It provides access to the BSD select() function interface, available in most operating systems. On Windows, this function only works for sockets. On UNIX, it is used for pipes, sockets, files, or any other stream-compatible objects. Also note that the asyncore module is built on top of the select module.

The select function accepts socket lists as arguments. The following example implements a loop that keeps checking the sockets in order to identify the exact moment when they become readable, writable, or signal an error. (A socket signals an error when it tries to open a connection and the connection fails, although a few other conditions can flag a socket as well.) A listening socket becomes readable when a connection request arrives, and a connected socket becomes readable when it receives data. On the other hand, if a connection is set up after a non-blocking call to the connect method, the socket becomes writable.

    import select
    import socket

    App_Socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    App_Socket.bind(("", 8888))
    App_Socket.listen(5)
    while 1:
        readable_sockets = [App_Socket]
        writable_sockets = []
        # A timeout of 0 makes select() return immediately (polling).
        r, w, err = select.select(readable_sockets, writable_sockets, [], 0)
        if r:
            # The listening socket is readable: a connection is waiting.
            client, address = App_Socket.accept()
            client.send("data")
            client.close()
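The readable and writable conditions can be observed in a self-contained way. The following sketch assumes Python 3 syntax and the loopback address 127.0.0.1 with a system-chosen port. Unlike the polling loop above, it passes a five-second timeout to select() so that each call waits for the condition instead of spinning:

```python
import select
import socket

# A listening socket on the loopback interface.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))    # port 0: let the system pick a port
listener.listen(5)

# A client connects; the connection now waits in the listener's queue.
client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.connect(listener.getsockname())

# The listener is reported readable because a connection is pending.
readable, _, _ = select.select([listener], [], [], 5)
listener_ready = listener in readable

connection, address = listener.accept()
connection.sendall(b"data")

# The client is reported readable once the data has arrived.
readable, _, _ = select.select([client], [], [], 5)
client_readable = client in readable
payload = client.recv(512)

# A connected socket with room in its send buffer is writable.
_, writable, _ = select.select([], [client], [], 5)
client_writable = client in writable

for sock in (connection, client, listener):
    sock.close()
print("received", payload)
```

Waiting with a timeout rather than polling with 0 keeps the loop from consuming CPU while nothing is happening, at the cost of reacting only when select() returns.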
© 2002, O'Reilly & Associates, Inc.