HTTP

HTTP (Hypertext Transfer Protocol) is a simple text-based protocol used for World Wide Web applications. Both Web servers and Web browsers implement this protocol.

HTTP works by having a client open a connection and send a request header to a Web server. This request is a simple text-based form that contains the request method (GET, POST, PUT, …), the name of the file that should be opened, and so forth. The server interprets the request and returns a response to the client. This response contains the HTTP protocol version number, as well as a lot of information about the returned document, such as cookies, document type, and size.

For details about the HTTP specification, check:

http://www.w3.org/Protocols

Next, I list some Python projects that use HTTP techniques.

M2Crypto, by Ng Pheng Siong

Ng Pheng Siong's M2Crypto makes the following features available to the Python programmer: RSA, DH, DSA, HMACs, message digests, symmetric ciphers, SSL functionality to implement clients and servers, and S/MIME v2.

http://mars.post1.com/home/ngps/m2/

Note
With Python 2.0, the socket module can be compiled with support for the OpenSSL library, so it can handle SSL without trouble.

CTC (Cut The Crap), by Constantinos Kotsokalis

This is an HTTP proxy written in Python that cuts advertisement banners from your Web browser display.

http://softlab.ntua.gr/~ckotso/CTC/

Alfajor, by Andrew Cooke

Alfajor is an HTTP cookie filter, written in Python with an optional GUI. It acts as an HTTP proxy (you must configure your browser to use it) and can either contact sites directly or work with a second proxy (for example, a cache). Note that Alfajor does not fully conform to any HTTP version. However, in practice, it works with the vast majority of sites.

http://www.andrewcooke.free-online.co.uk/jara/alfajor/

Building Web Servers

In order to build Internet servers using Python, you can use the following modules, each covered in its own section: SocketServer, BaseHTTPServer, SimpleHTTPServer, and CGIHTTPServer.
The SocketServer Module

The SocketServer module exposes a framework that simplifies the task of writing network servers. Rather than having to implement servers using the low-level socket module, you can use the classes this module provides for the protocols used most often: the server classes TCPServer and UDPServer, and the request-handler classes StreamRequestHandler and DatagramRequestHandler.

The basic server classes process requests synchronously: each request must be completed before the next request can be started. This behavior is not appropriate if each request takes a long time to complete, either because it requires a lot of computation or because the client is slow to process the data. To handle each request in a separate thread or process, you can use the following classes: ThreadingTCPServer, ThreadingUDPServer, ForkingUDPServer, and ForkingTCPServer.

Both the StreamRequestHandler and DatagramRequestHandler classes provide two file attributes that can be used to read data from and write data to the client program. These attributes are self.rfile and self.wfile.

The following code demonstrates the usage of the StreamRequestHandler class, which is exposed by the SocketServer module.

    import SocketServer

    port = 8000

    class myRequestHandler(SocketServer.StreamRequestHandler):
        # handle() is called once for every incoming connection
        def handle(self):
            print "connection from ", self.client_address
            self.wfile.write("data")

    # An empty hostname binds the server to all local interfaces
    srvsocket = SocketServer.TCPServer(("", port), myRequestHandler)
    print "The socket is listening to port", port
    srvsocket.serve_forever()

Next, you have the classes provided by this module:
In all four classes, the request_handler must be a subclass of the BaseRequestHandler class, and usually, hostname is left blank.

Each one of these classes defines its own class variables. request_queue_size stores the size of the request queue that is passed to the socket's listen() method. socket_type holds the socket type used by the server. The possible values are socket.SOCK_STREAM and socket.SOCK_DGRAM.

The class instances implement the following methods and attributes:
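As a rough illustration of how these class variables and instance members fit together, here is a minimal sketch. It assumes Python 3, where the module is named socketserver rather than SocketServer. EchoHandler, EchoServer, and echo_once are hypothetical names chosen for this example, and port 0 simply asks the operating system for any free port.

```python
import socket
import socketserver
import threading


class EchoHandler(socketserver.StreamRequestHandler):
    """Hypothetical handler: reads one line and writes it back upper-cased."""

    def handle(self):
        line = self.rfile.readline()      # bytes sent by the client
        self.wfile.write(line.upper())    # reply through the file-like writer


class EchoServer(socketserver.ThreadingTCPServer):
    """Hypothetical server: each request is handled in its own thread."""

    request_queue_size = 10      # backlog passed to the socket's listen() call
    allow_reuse_address = True   # lets the address be reused after a restart
    daemon_threads = True        # handler threads do not block interpreter exit


def echo_once(message):
    """Start the server on a free port, send one line, and return the reply."""
    with EchoServer(("127.0.0.1", 0), EchoHandler) as server:
        host, port = server.server_address   # address actually bound
        worker = threading.Thread(target=server.serve_forever)
        worker.start()
        try:
            with socket.create_connection((host, port), timeout=5) as client:
                client.sendall(message + b"\n")
                with client.makefile("rb") as stream:
                    reply = stream.readline()
        finally:
            server.shutdown()                 # stops serve_forever()
            worker.join()
    return reply


if __name__ == "__main__":
    print(echo_once(b"hello"))
    print(EchoServer.socket_type == socket.SOCK_STREAM)
```

Swapping ThreadingTCPServer for TCPServer would give the synchronous behavior described earlier, where each request must finish before the next one starts.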
The BaseHTTPServer Module

The BaseHTTPServer module defines two base classes for implementing basic HTTP servers (also known as Web servers). This module is built on top of the SocketServer module. Note that this module is rarely used directly. Instead, you should consider using the modules CGIHTTPServer and SimpleHTTPServer.

The following code demonstrates the usage of the BaseHTTPRequestHandler class, which is exposed by the BaseHTTPServer module, to implement a simple HTTP server.

    import BaseHTTPServer

    htmlpage = """
    <html><head><title>Web Page</title></head>
    <body>Hello Python World</body>
    </html>"""

    notfound = "File not found"

    class WelcomeHandler(BaseHTTPServer.BaseHTTPRequestHandler):
        def do_GET(self):
            # Serve the page for the root path only; anything else is a 404
            if self.path == "/":
                self.send_response(200)
                self.send_header("Content-type", "text/html")
                self.end_headers()
                self.wfile.write(htmlpage)
            else:
                self.send_error(404, notfound)

    # Binding to port 80 usually requires administrator privileges
    httpserver = BaseHTTPServer.HTTPServer(("", 80), WelcomeHandler)
    httpserver.serve_forever()

The HTTPServer((hostname, port), request_handler_class) base class is derived from SocketServer.TCPServer; hence, it implements the same methods. This class creates an HTTPServer object that listens on the given hostname and port, and uses the request_handler_class to handle requests.

The second base class is called BaseHTTPRequestHandler(request, client_address, server). You need to create a subclass of this class in order to handle HTTP requests. If you need to handle GET requests, you must redefine the do_GET() method. On the other hand, if you need to handle POST requests, you must redefine the do_POST() method.

This class also implements some class variables:
The error_message_format string should contain the code for a complete Web page that is sent to the client whenever an error message must be displayed. Within the string, you can reference some error attributes because the string is formatted with a dictionary of error details (code, message, and explain).

    """<head><title></title></head><body>
    Error code = %(code)d<br>
    Error message = %(message)s<br>
    Error explanation = %(explain)s<br></body>"""

Each instance of the BaseHTTPRequestHandler class implements some methods and attributes:
The following object attributes are also exposed:
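To make the handler attributes and the error template more concrete, the following is a minimal sketch rather than a definitive implementation. It assumes Python 3, where BaseHTTPServer became part of http.server and httplib became http.client. InfoHandler and fetch are hypothetical names; the handler reads self.command, self.path, and self.client_address, and overrides error_message_format with a template in the style shown above.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class InfoHandler(BaseHTTPRequestHandler):
    """Hypothetical handler that reports a few of its own attributes."""

    # Template filled from a dictionary with the keys code, message, explain.
    error_message_format = (
        "<head><title></title></head><body>"
        "Error code = %(code)d<br>"
        "Error message = %(message)s<br>"
        "Error explanation = %(explain)s<br></body>"
    )

    def do_GET(self):
        if self.path != "/":
            self.send_error(404, "File not found")
            return
        text = "%s %s from %s" % (self.command, self.path, self.client_address[0])
        data = text.encode("ascii")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)

    def log_message(self, format, *args):
        pass  # keep the example quiet; the default logs to stderr


def fetch(path):
    """Serve a single request on a free local port; return (status, body)."""
    server = HTTPServer(("127.0.0.1", 0), InfoHandler)
    worker = threading.Thread(target=server.handle_request, daemon=True)
    worker.start()
    conn = http.client.HTTPConnection(
        "127.0.0.1", server.server_address[1], timeout=5)
    try:
        conn.request("GET", path)
        reply = conn.getresponse()
        return reply.status, reply.read().decode("ascii")
    finally:
        conn.close()
        worker.join(timeout=5)
        server.server_close()


if __name__ == "__main__":
    print(fetch("/"))
    print(fetch("/missing")[0])
```

Calling fetch with any path other than "/" goes through send_error(), so the returned body is the custom error page built from error_message_format.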
The SimpleHTTPServer Module

The SimpleHTTPServer module provides a simple HTTP server request-handler class. It has an interface compatible with the BaseHTTPServer module that enables it to serve files from a base directory. This module implements both standard GET and HEAD request handlers, as shown in this example:

    import SimpleHTTPServer
    import BaseHTTPServer

    ServerHandler = SimpleHTTPServer.SimpleHTTPRequestHandler
    # Binding to port 80 usually requires administrator privileges
    httpserver = BaseHTTPServer.HTTPServer(("", 80), ServerHandler)
    httpserver.serve_forever()

The directory from which the server is started is used as the relative reference for all files requested by the client.

This module implements the SimpleHTTPRequestHandler(request, (hostname, port), server) class. This class exposes the following two attributes:

The CGIHTTPServer Module

The CGIHTTPServer module defines another simple HTTP server request-handler class. This module has an interface compatible with BaseHTTPServer, which enables it to serve files from a base directory (the current directory and its subdirectories), and also allows clients to run CGI (Common Gateway Interface) scripts.

Requests are handled using the do_GET and do_POST methods. You can override them in order to meet your needs. Note that the CGI scripts are executed as the user nobody.

The next example demonstrates the implementation of a simple HTTP server that accepts CGI requests.

    import CGIHTTPServer
    import BaseHTTPServer

    class ServerHandler(CGIHTTPServer.CGIHTTPRequestHandler):
        # Only scripts stored under these directories are executed
        cgi_directories = ['/cgi-bin']

    httpserver = BaseHTTPServer.HTTPServer(("", 80), ServerHandler)
    httpserver.serve_forever()

The CGIHTTPRequestHandler(request, (hostname, port), server) class is provided by this module. This handler class supports both GET and POST requests. It also implements the CGIHTTPRequestHandler.cgi_directories attribute, which contains a list of directories that can store CGI scripts.
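The file-serving behavior of SimpleHTTPRequestHandler described earlier can be tried without touching port 80. The following sketch assumes Python 3.7 or later, where the class lives in http.server and accepts a directory argument; QuietHandler and serve_and_fetch are hypothetical names. It covers only file serving, not CGI scripts.

```python
import functools
import http.client
import os
import tempfile
import threading
from http.server import HTTPServer, SimpleHTTPRequestHandler


class QuietHandler(SimpleHTTPRequestHandler):
    """Hypothetical subclass that only silences the default request log."""

    def log_message(self, format, *args):
        pass


def serve_and_fetch(filename, content):
    """Write a file into a temporary base directory and fetch it over HTTP.

    Returns a (status, content_type, body) tuple.
    """
    with tempfile.TemporaryDirectory() as base_dir:
        with open(os.path.join(base_dir, filename), "wb") as handle:
            handle.write(content)

        # The directory argument replaces "the directory the server was
        # started from" as the base for all requested paths.
        handler = functools.partial(QuietHandler, directory=base_dir)
        server = HTTPServer(("127.0.0.1", 0), handler)
        worker = threading.Thread(target=server.handle_request, daemon=True)
        worker.start()

        conn = http.client.HTTPConnection(
            "127.0.0.1", server.server_address[1], timeout=5)
        try:
            conn.request("GET", "/" + filename)
            reply = conn.getresponse()
            return reply.status, reply.getheader("Content-Type"), reply.read()
        finally:
            conn.close()
            worker.join(timeout=5)
            server.server_close()


if __name__ == "__main__":
    print(serve_and_fetch("hello.txt", b"Hello Python World\n"))
```

The Content-Type header in the reply is derived from the file extension, which is why a .txt file is announced as plain text.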
Setting Up the Client Side of the HTTP Protocol

The httplib module implements the client side of HTTP (Hypertext Transfer Protocol), and is illustrated as follows:

    import httplib

    def get_page(url, urlpath):
        host = httplib.HTTP(url)
        host.putrequest("GET", urlpath)
        host.putheader("Accept", "text/html")
        host.endheaders()
        # getreply() blocks until the server answers
        errcode, errmsg, headers = host.getreply()
        if errcode != 200:
            raise RuntimeError
        htmlfile = host.getfile()
        htmlpage = htmlfile.read()
        htmlfile.close()
        return htmlpage

    htmlpage = get_page("www.lessaworld.com", "/default.html")

The previous example doesn't allow you to handle multiple requests in parallel because the getreply() method blocks the application while waiting for the server to respond. You should consider using the asyncore module for a more efficient and asynchronous solution.

This module exposes the HTTP class. The HTTP([hostname [,port]]) class creates and returns a connection object. If no port is given, port 80 is used; and if no arguments are given at all, you need to use the connect() method to make the connection yourself. This class exposes the following methods:
Note
The httplib module packaged with Python 2.0 has been rewritten by Greg Stein in order to provide new interfaces and support for HTTP/1.1 features, such as pipelining. Backward compatibility with the 1.5 version of httplib is provided, but you should consider taking a look at the documentation strings of the module for details. Python 2.0's version of the httplib module also supports "https://" URLs over SSL.
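For readers on Python 3, the rewritten interface mentioned in this note lives in the http.client module. The sketch below is one way to try it, under stated assumptions: it starts a throwaway local server instead of contacting a real site, PathHandler and fetch_paths are hypothetical names, and the server is switched to HTTP/1.1 so that several requests can reuse a single connection. The putrequest(), putheader(), and endheaders() calls mirror the ones used in the httplib example earlier.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class PathHandler(BaseHTTPRequestHandler):
    """Hypothetical handler that echoes the requested path."""

    protocol_version = "HTTP/1.1"   # keeps the connection open between requests

    def do_GET(self):
        data = ("you asked for " + self.path).encode("ascii")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(data)))  # needed for keep-alive
        self.end_headers()
        self.wfile.write(data)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def fetch_paths(paths):
    """Request every path over one connection; return (status, version, body)."""
    server = HTTPServer(("127.0.0.1", 0), PathHandler)
    worker = threading.Thread(target=server.handle_request, daemon=True)
    worker.start()
    conn = http.client.HTTPConnection(
        "127.0.0.1", server.server_address[1], timeout=5)
    results = []
    try:
        for path in paths:
            conn.putrequest("GET", path)
            conn.putheader("Accept", "text/plain")
            conn.endheaders()
            reply = conn.getresponse()
            # Read the whole body before sending the next request.
            body = reply.read().decode("ascii")
            results.append((reply.status, reply.version, body))
    finally:
        conn.close()             # the server side ends once the client hangs up
        worker.join(timeout=5)
        server.server_close()
    return results


if __name__ == "__main__":
    for status, version, body in fetch_paths(["/a", "/b"]):
        print(status, version, body)
```

The version field of each reply is an integer, with 11 standing for HTTP/1.1. For "https://" URLs, http.client offers HTTPSConnection with the same request methods.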
© 2002, O'Reilly & Associates, Inc.