
Python DB API

The quest for a standard way to interface with database systems drove a group of people to develop the Python Database API. The Python DB API is maintained by the Database Special Interest Group (DB-SIG). For more information, check out their Web site at http://www.python.org/sigs/db-sig/.

The following list shows all the database modules that currently implement the Python DB API specification proposed by the DB-SIG. This means that after you understand the API, you will be able to handle, in a similar way, all the databases that are manipulated by the following modules:

This is the information available at the time this book was written. For an updated list of modules, check out http://www.python.org/topics/database/modules.html.

DB-API Specification v2.0

The following specification is available online at http://www.python.org/topics/database/DatabaseAPI-2.0.html.

Comments and questions about this specification can be directed to the SIG for Database Interfacing with Python at the email address db-sig@python.org.

For more information on database interfacing with Python and available packages, see the Database Topics Guide at http://www.python.org.

This document describes the Python Database API Specification 2.0. The previous version 1.0 is still available online at the Python Web site as a reference. Package writers are encouraged to use this version of the specification as the basis for new interfaces.

This API has been defined to encourage similarity between the Python modules that are used to access databases. By doing this, we hope to achieve a consistency leading to more easily understood modules, code that is generally more portable across databases, and a broader reach of database connectivity from Python.

The interface specification consists of several sections:

  • Module Interface

  • Connection Objects

  • Cursor Objects

  • Type Objects and Constructors

  • Implementation Hints

  • Major Changes from 1.0 to 2.0

Module Interface

Access to the database is made available through connection objects. The module must provide the following constructor for these:

connect(parameters…)—This is a constructor for creating a connection to the database. It takes a number of database-dependent parameters and returns a Connection Object (see footnote 1).

These module globals must be defined:

apilevel—This string constant states the supported DB API level. Currently only the strings '1.0' and '2.0' are allowed.

If not given, a Database API 1.0 level interface should be assumed.

threadsafety—This integer constant states the level of thread safety that the interface supports. Possible values are

0—Threads cannot share the module.

1—Threads can share the module, but not connections.

2—Threads can share the module and connections.

3—Threads can share the module, connections, and cursors.

Sharing in the previous context means that two threads can use a resource without wrapping it using a mutex semaphore to implement resource locking. Note that you cannot always make external resources thread safe by managing access using a mutex: The resource might rely on global variables or other external sources that are beyond your control.

paramstyle—This string constant states the type of parameter marker formatting expected by the interface. Possible values are as follows (see footnote 2):

									
'qmark'   = Question mark style, e.g. '…WHERE name=?'
'numeric' = Numeric, positional style, e.g. '…WHERE name=:1'
'named'   = Named style, e.g. '…WHERE name=:name'
'format'  = ANSI C printf format codes, e.g. '…WHERE name=%s'
'pyformat'= Python extended format codes, e.g. '…WHERE name=%(name)s'
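
As a quick illustration, the three module globals can be inspected at run time. The sketch below assumes the standard-library sqlite3 module, which this text does not mention, as a stand-in for any DB API 2.0 module. The values reported depend on the module and on the Python version in use.

```python
import sqlite3  # assumption: stands in for any DB API 2.0 module

# The three module globals required by the specification.
print(sqlite3.apilevel)      # supported DB API level
print(sqlite3.threadsafety)  # integer from 0 to 3
print(sqlite3.paramstyle)    # one of the five marker styles listed above

# connect() takes database-dependent parameters. sqlite3 expects a file
# name, and ':memory:' requests a temporary in-memory database.
conn = sqlite3.connect(":memory:")
conn.close()
```

Code meant to work with several database modules could read paramstyle before building its SQL strings, because the marker syntax differs from module to module.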

								

The module should make all error information available through these exceptions or subclasses thereof:

Warning—  This exception is raised for important warnings such as data truncations while inserting, and so on. It must be a subclass of the Python StandardError (defined in the module exceptions).

Error—  This exception is the base class of all other error exceptions. You can use it to catch all errors with a single except statement. Warnings are not considered errors and thus should not use this class as a base. It must be a subclass of the Python StandardError (defined in the module exceptions).

InterfaceError—  This exception is raised for errors that are related to the database interface rather than the database itself. It must be a subclass of Error.

DatabaseError—  This exception is raised for errors that are related to the database. It must be a subclass of Error.

DataError—  This exception is raised for errors that are because of problems with the processed data such as division by zero, numeric value out of range, and so on. It must be a subclass of DatabaseError.

OperationalError—  This exception is raised for errors that are related to the database's operation and not necessarily under the control of the programmer; for example, an unexpected disconnect occurs, the data source name is not found, a transaction could not be processed, a memory allocation error occurred during processing, and so on. It must be a subclass of DatabaseError.

IntegrityError—  This exception is raised when the relational integrity of the database is affected; for example, a foreign key check fails. It must be a subclass of DatabaseError.

InternalError—  This exception is raised when the database encounters an internal error; for example, the cursor is not valid anymore, the transaction is out of sync, and so on. It must be a subclass of DatabaseError.

ProgrammingError—  This exception is raised for programming errors; for example, table not found or already exists, syntax error in the SQL statement, wrong number of parameters specified, and so on. It must be a subclass of DatabaseError.

NotSupportedError—  This exception is raised in case a method or database API was used that is not supported by the database; for example, requesting a .rollback() on a connection that does not support transaction or has transactions turned off. It must be a subclass of DatabaseError.

This is the exception inheritance layout:

							
StandardError
|__Warning
|__Error
   |__InterfaceError
   |__DatabaseError
      |__DataError
      |__OperationalError
      |__IntegrityError
      |__InternalError
      |__ProgrammingError
      |__NotSupportedError

						

Note

The values of these exceptions are not defined. They should, however, give the user a good idea of what went wrong.
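
The hierarchy lets a caller choose how broadly to catch errors. The following is a minimal sketch, again assuming the standard-library sqlite3 module as a stand-in for a DB API 2.0 module. The table name and the SQL statements are made up for the illustration.

```python
import sqlite3  # assumption: stands in for any DB API 2.0 module

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
caught = []

try:
    cur.execute("SELEC 1")  # deliberate SQL syntax error
except sqlite3.DatabaseError as exc:
    # Database-related problems arrive as DatabaseError subclasses.
    caught.append(type(exc).__name__)

cur.execute("CREATE TABLE person (name TEXT PRIMARY KEY)")
cur.execute("INSERT INTO person VALUES ('guido')")
try:
    cur.execute("INSERT INTO person VALUES ('guido')")  # duplicate key
except sqlite3.Error as exc:
    # Catching the Error base class handles every error category;
    # the concrete class can still be inspected afterwards.
    caught.append(isinstance(exc, sqlite3.IntegrityError))

conn.close()
```

Which subclass a given failure maps to is up to the module, so portable code usually catches the most general class it can handle sensibly.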



Connection Objects

Connection Objects should respond to the following methods:

close()—   It closes the connection now (rather than whenever __del__ is called). The connection will be unusable from this point forward; an Error (or subclass) exception will be raised if any operation is attempted with the connection. The same applies to all cursor objects trying to use the connection.

commit()—   It commits any pending transaction to the database. If the database supports an autocommit feature, this must be initially off. An interface method might be provided to turn it back on.

Database modules that do not support transactions should implement this method with void functionality.

rollback()—   This method is optional because not all databases provide transaction support (see footnote 3).

In case a database does provide transactions, this method causes the database to roll back to the start of any pending transaction. Closing a connection without committing the changes first will cause an implicit rollback to be performed.

cursor()—   It returns a new Cursor Object using the connection. If the database does not provide a direct cursor concept, the module will have to emulate cursors using other means to the extent needed by this specification (see footnote 4).
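
The connection methods can be tried out with a short session. This is a sketch under the assumption that the standard-library sqlite3 module (not covered in this text) behaves as a typical DB API 2.0 module. The account table is hypothetical.

```python
import sqlite3  # assumption: stands in for any DB API 2.0 module

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE account (owner TEXT, balance INTEGER)")
conn.commit()

cur.execute("INSERT INTO account VALUES ('guido', 100)")
conn.rollback()                      # discard the pending insert
cur.execute("SELECT COUNT(*) FROM account")
after_rollback = cur.fetchone()[0]

cur.execute("INSERT INTO account VALUES ('guido', 100)")
conn.commit()                        # make the insert permanent
cur.execute("SELECT COUNT(*) FROM account")
after_commit = cur.fetchone()[0]

conn.close()
try:
    conn.cursor()                    # the connection is unusable now
    closed_error = False
except sqlite3.Error:
    closed_error = True
```

After the rollback the table is empty again, and after close() any further use of the connection raises an Error subclass, as described above.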

Cursor Objects

These objects represent a database cursor, which is used to manage the context of a fetch operation. They should respond to the following methods and attributes:

description—   This read-only attribute is a sequence of seven-item sequences. Each of these sequences contains information describing one result column: (name, type_code, display_size, internal_size, precision, scale, null_ok). This attribute will be None for operations that do not return rows or if the cursor has not yet had an operation invoked via the executeXXX() method.

The type_code can be interpreted by comparing it to the Type Objects specified in the following section.

rowcount—   This read-only attribute specifies the number of rows that the last executeXXX() produced (for DQL statements such as select) or affected (for DML statements such as update or insert).

The attribute is -1 in case no executeXXX() has been performed on the cursor, or the rowcount of the last operation is not determinable by the interface (see footnote 7).
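
A brief sketch of both attributes follows. It assumes the standard-library sqlite3 module as a stand-in for a DB API 2.0 module. That module is known to fill in only the name item of each description entry, so other modules may report more detail.

```python
import sqlite3  # assumption: stands in for any DB API 2.0 module

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
before = cur.description          # nothing has been executed yet

cur.execute("CREATE TABLE person (name TEXT, age INTEGER)")
cur.execute("INSERT INTO person VALUES ('guido', 45)")
cur.execute("INSERT INTO person VALUES ('tim', 40)")
cur.execute("UPDATE person SET age = age + 1")
updated = cur.rowcount            # rows affected by the UPDATE

cur.execute("SELECT name, age FROM person")
# Each description entry is a seven-item sequence; item 0 is the name.
column_names = [column[0] for column in cur.description]
item_counts = [len(column) for column in cur.description]
conn.close()
```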

callproc(procname[,parameters])—   This method is optional because not all databases provide stored procedures.

It calls a stored database procedure with the given name. The sequence of parameters must contain one entry for each argument that the procedure expects. The result of the call is returned as a modified copy of the input sequence. Input parameters are left untouched, and output and input/output parameters are replaced with possibly new values.

The procedure can also provide a resultset as output. This must then be made available through the standard fetchXXX() methods.

close()—   It closes the cursor now (rather than whenever __del__ is called). The cursor will be unusable from this point forward; an Error (or subclass) exception will be raised if any operation is attempted with the cursor.

execute(operation[,parameters])—   It prepares and executes a database operation (query or command). Parameters can be provided as a sequence or mapping and will be bound to variables in the operation. Variables are specified in a database-specific notation (see the module's paramstyle attribute for details, and footnote 5).

A reference to the operation will be retained by the cursor. If the same operation object is passed in again, the cursor can optimize its behavior. This is most effective for algorithms in which the same operation is used, but different parameters are bound to it (many times).

For maximum efficiency when reusing an operation, it is best to use the setinputsizes() method to specify the parameter types and sizes ahead of time. It is legal for a parameter to not match the predefined information; the implementation should compensate, possibly with a loss of efficiency.

The parameters can also be specified as a list of tuples to insert multiple rows in a single operation, but this kind of usage is deprecated: executemany() should be used instead.

Return values are not defined.
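
To make parameter binding concrete, here is a small sketch. It assumes the standard-library sqlite3 module, whose paramstyle is 'qmark' and which also accepts the 'named' style. The table and values are hypothetical.

```python
import sqlite3  # assumption: stands in for any DB API 2.0 module

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE person (name TEXT, city TEXT)")

# The value contains a quote character, yet no escaping is needed
# because it is bound to the marker rather than pasted into the SQL.
cur.execute("INSERT INTO person VALUES (?, ?)", ("O'Brien", "Cork"))

# Parameters given as a mapping, using the 'named' marker style.
cur.execute("SELECT city FROM person WHERE name = :name",
            {"name": "O'Brien"})
city = cur.fetchone()[0]
conn.close()
```

Passing values as parameters instead of formatting them into the SQL string also keeps the operation text identical between calls, which is what allows a cursor to reuse a prepared operation.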

executemany(operation,seq_of_parameters)

It prepares a database operation (query or command) and then executes it against all parameter sequences or mappings found in the sequence seq_of_parameters.

Modules are free to implement this method using multiple calls to the execute() method or by using array operations to have the database process the sequence as a whole in one call.

The same comments for execute() also apply accordingly to this method.

Return values are not defined.
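
A minimal sketch of executemany(), assuming the standard-library sqlite3 module as the DB API 2.0 module and a hypothetical point table:

```python
import sqlite3  # assumption: stands in for any DB API 2.0 module

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE point (x INTEGER, y INTEGER)")

# One parameter sequence per row to insert.
rows = [(1, 2), (3, 4), (5, 6)]
cur.executemany("INSERT INTO point VALUES (?, ?)", rows)

cur.execute("SELECT SUM(x), SUM(y) FROM point")
totals = cur.fetchone()
conn.close()
```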

fetchone()

It fetches the next row of a query resultset, returning a single sequence, or None when no more data is available (see footnote 6).

An Error (or subclass) exception is raised if the previous call to executeXXX() did not produce any resultset or no call was issued yet.

fetchmany([size=cursor.arraysize])

It fetches the next set of rows of a query result, returning a sequence of sequences (for example, a list of tuples). An empty sequence is returned when no more rows are available.

The number of rows to fetch per call is specified by the size parameter. If it is not given, the cursor's arraysize determines the number of rows to be fetched. The method should try to fetch as many rows as indicated by the size parameter. If fewer rows than requested are available, fewer rows can be returned.

An Error (or subclass) exception is raised if the previous call to executeXXX() did not produce any resultset or no call was issued yet.

Performance considerations are involved with the size parameter. For optimal performance, it is usually best to use the arraysize attribute. If the size parameter is used, it is best for it to retain the same value from one fetchmany() call to the next.

fetchall()

It fetches all (remaining) rows of a query result, returning them as a sequence of sequences (for example, a list of tuples). Note that the cursor's arraysize attribute can affect the performance of this operation.

An Error (or subclass) exception is raised if the previous call to executeXXX() did not produce any resultset or no call was issued yet.
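
The three fetch methods can be combined on one result set, each continuing where the previous call stopped. The sketch below assumes the standard-library sqlite3 module as the DB API 2.0 module. The table n is hypothetical.

```python
import sqlite3  # assumption: stands in for any DB API 2.0 module

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE n (value INTEGER)")
cur.executemany("INSERT INTO n VALUES (?)", [(i,) for i in range(6)])

cur.execute("SELECT value FROM n ORDER BY value")
first = cur.fetchone()     # a single row
batch = cur.fetchmany(2)   # the next two rows
rest = cur.fetchall()      # everything that remains
done = cur.fetchone()      # no more data: None
empty = cur.fetchmany(2)   # no more data: an empty sequence
conn.close()
```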

nextset()

This method is optional because not all databases support multiple resultsets (see footnote 3).

This method will make the cursor skip to the next available set, discarding any remaining rows from the current set.

If there are no more sets, the method returns None. Otherwise, it returns a true value and subsequent calls to the fetch methods will return rows from the next resultset.

An Error (or subclass) exception is raised if the previous call to executeXXX() did not produce any resultset or no call was issued yet.
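
Because nextset() is optional, portable code may want to probe for it before calling it. The footnote on optional methods suggests hasattr() for this. The sketch below uses a hypothetical MinimalCursor class in place of a real cursor, purely to show the check.

```python
def supports(obj, method_name):
    """Return True if obj provides an attribute with the given name."""
    return hasattr(obj, method_name)


class MinimalCursor:
    """Hypothetical cursor that omits the optional nextset() method."""

    def fetchall(self):
        return []


cursor = MinimalCursor()
if supports(cursor, "nextset"):
    more = cursor.nextset()
else:
    more = None  # treated here as "no further result sets"
```

A module that defines the method but cannot honor it would raise NotSupportedError instead, so robust callers may need to handle both cases.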

arraysize

This read/write attribute specifies the number of rows to fetch at a time with fetchmany(). It defaults to 1, which means to fetch a single row at a time.

Implementations must observe this value with respect to the fetchmany() method, but are free to interact with the database a single row at a time. It can also be used in the implementation of executemany().
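
The effect of arraysize on fetchmany() can be seen in a short loop. This sketch assumes the standard-library sqlite3 module as the DB API 2.0 module. The table n and the batch size of 3 are arbitrary choices for the illustration.

```python
import sqlite3  # assumption: stands in for any DB API 2.0 module

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE n (value INTEGER)")
cur.executemany("INSERT INTO n VALUES (?)", [(i,) for i in range(7)])

default_size = cur.arraysize   # the documented default
cur.arraysize = 3              # fetchmany() now returns up to 3 rows

cur.execute("SELECT value FROM n ORDER BY value")
batch_sizes = []
while True:
    batch = cur.fetchmany()    # no size given: arraysize is used
    if not batch:
        break                  # empty sequence: no more rows
    batch_sizes.append(len(batch))
conn.close()
```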

setinputsizes(sizes)

This can be used before a call to executeXXX() to predefine memory areas for the operation's parameters.

sizes is specified as a sequence—one item for each input parameter. The item should be a Type Object that corresponds to the input that will be used, or it should be an integer specifying the maximum length of a string parameter. If the item is None, no predefined memory area will be reserved for that column. (This is useful to avoid predefined areas for large inputs.)

This method would be used before the executeXXX() method is invoked. Implementations are free to have this method do nothing, and users are free to not use it.

setoutputsize(size[,column])

It sets a column buffer size for fetches of large columns (for example, LONGs, BLOBs, and so on). The column is specified as an index into the result sequence. Not specifying the column will set the default size for all large columns in the cursor.

This method would be used before the executeXXX() method is invoked.

Implementations are free to have this method do nothing, and users are free to not use it.

Type Objects and Constructors

Many databases need to have the input in a particular format for binding to an operation's input parameters. For example, if an input is destined for a DATE column, it must be bound to the database in a particular string format. Similar problems exist for Row ID columns or large binary items (for example, BLOBs or RAW columns). This presents problems for Python because the parameters to the executeXXX() method are not typed. When the database module sees a Python string object, it doesn't know if it should be bound as a simple CHAR column, as a raw BINARY item, or as a DATE.

To overcome this problem, a module must provide the constructors defined later to create objects that can hold special values. When passed to the cursor methods, the module can then detect the proper type of the input parameter and bind it accordingly.

A Cursor Object's description attribute returns information about each of the result columns of a query. The type_code must be equal to one of the Type Objects defined below. Type Objects can be equal to more than one type code. (For example, DATETIME could be equal to the type codes for date, time, and timestamp columns; see "Implementation Hints" for details.)

The module exports the following constructors and singletons:

Date(year, month, day)—  This function constructs an object holding a date value.

Time(hour, minute, second)—  This function constructs an object holding a time value.

Timestamp(year, month, day, hour, minute, second)—  This function constructs an object holding a timestamp value.

DateFromTicks(ticks)—  This function constructs an object holding a date value from the given ticks value (number of seconds since the epoch; see the documentation of the standard Python time module for details).

TimeFromTicks(ticks)—  This function constructs an object holding a time value from the given ticks value (number of seconds since the epoch; see the documentation of the standard Python time module for details).

TimestampFromTicks(ticks)—  This function constructs an object holding a time stamp value from the given ticks value (number of seconds since the epoch; see the documentation of the standard Python time module for details).

Binary(string)—  This function constructs an object capable of holding a binary (long) string value.

STRING—  This type object is used to describe columns in a database that are string based (for example, CHAR).

BINARY—  This type object is used to describe (long) binary columns in a database (for example, LONG, RAW, BLOBs).

NUMBER—  This type object is used to describe numeric columns in a database.

DATETIME—  This type object is used to describe date/time columns in a database.

ROWID—  This type object is used to describe the Row ID column in a database.

SQL NULL values are represented by the Python None singleton on input and output.
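
The constructors and the None convention can be exercised as follows. This is a sketch that assumes the standard-library sqlite3 module, which exports the constructors named above. The object types they return are an implementation choice of that module, and the blobs table is hypothetical.

```python
import sqlite3  # assumption: stands in for any DB API 2.0 module

# Date/time constructors take the fields in the order given above.
d = sqlite3.Date(2002, 1, 30)
t = sqlite3.Time(13, 45, 0)
ts = sqlite3.Timestamp(2002, 1, 30, 13, 45, 0)

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE blobs (payload BLOB, note TEXT)")

# Binary() marks the value as binary data; None is stored as SQL NULL.
cur.execute("INSERT INTO blobs VALUES (?, ?)",
            (sqlite3.Binary(b"\x00\x01\xff"), None))
cur.execute("SELECT payload, note FROM blobs")
payload, note = cur.fetchone()
conn.close()
```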

Note

Usage of UNIX ticks for database interfacing can cause troubles because of the limited date range they cover.



Implementation Hints

The next list provides some suggestions about using this API.

  • The preferred object types for the date/time objects are those defined in the mxDateTime package (http://starship.python.net/~lemburg/mxDateTime.html). It provides all necessary constructors and methods both at Python and C level.

  • The preferred object types for Binary objects are the buffer types available in standard Python starting with version 1.5.2. See the Python documentation for details. For information about the C interface, take a look at Include/bufferobject.h and Objects/bufferobject.c in the Python source distribution.

  • Here is a sample implementation of the UNIX ticks based constructors for date/time delegating work to the generic constructors:

    									
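    import time
    # Date, Time, and Timestamp are the module's generic constructors.
    # time.localtime() returns (year, month, day, hour, minute, second, ...),
    # so each function passes on the slice its constructor expects.
    def DateFromTicks(ticks):
        return Date(*time.localtime(ticks)[:3])
    def TimeFromTicks(ticks):
        return Time(*time.localtime(ticks)[3:6])
    def TimestampFromTicks(ticks):
        return Timestamp(*time.localtime(ticks)[:6])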
    
    								
  • This Python class allows implementing the previous type objects even though the description type code field yields multiple values for one type object:

    									
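    class DBAPITypeObject:
        def __init__(self,*values):
            # Remember every type code this object stands for.
            self.values = values
        def __cmp__(self,other):
            # Compare equal (0) to any of the stored type codes.
            if other in self.values:
                return 0
            if other < self.values:
                return 1
            else:
                return -1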
    
    								

Note

The resulting type object compares equal to all values passed to the constructor.



  • Here is a snippet of Python code that implements the exception hierarchy defined previously:

    									
    import exceptions
    class Error(exceptions.StandardError):
        pass
    class Warning(exceptions.StandardError):
        pass
    class InterfaceError(Error):
        pass
    class DatabaseError(Error):
        pass
    class InternalError(DatabaseError):
        pass
    class OperationalError(DatabaseError):
        pass
    class ProgrammingError(DatabaseError):
        pass
    class IntegrityError(DatabaseError):
        pass
    class DataError(DatabaseError):
        pass
    class NotSupportedError(DatabaseError):
        pass
    							
    								

Note

In C you can use the PyErr_NewException(fullname, base, NULL) API to create the exception objects.
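
Returning to the DBAPITypeObject class in the hints above: it relies on __cmp__, which Python 3 no longer calls. The following sketch shows one possible variant based on __eq__ and __ne__, together with a usage example. The numeric type codes and the describe() helper are hypothetical and do not come from any real database module.

```python
class DBAPITypeObject:
    """Type object that compares equal to each of its type codes."""

    def __init__(self, *values):
        self.values = values

    def __eq__(self, other):
        return other in self.values

    def __ne__(self, other):
        return other not in self.values


# Hypothetical type codes, as a driver might report in description.
CHAR_CODE, DATE_CODE, TIME_CODE, TIMESTAMP_CODE = 1, 10, 11, 12

STRING = DBAPITypeObject(CHAR_CODE)
DATETIME = DBAPITypeObject(DATE_CODE, TIME_CODE, TIMESTAMP_CODE)


def describe(type_code):
    """Map a column's type_code to a coarse category name."""
    if type_code == DATETIME:
        return "date/time"
    if type_code == STRING:
        return "string"
    return "other"
```

With these definitions, describe(TIME_CODE) and describe(TIMESTAMP_CODE) both fall into the date/time category, which is the "equal to more than one type code" behavior the specification asks for.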



Major Changes from Version 1.0 to Version 2.0

The Python Database API 2.0 introduces a few major changes compared to the 1.0 version. Because some of these changes will cause existing DB API 1.0 based scripts to break, the major version number was adjusted to reflect this change.

These are the most important changes from 1.0 to 2.0:

  • The need for a separate dbi module was dropped and the functionality merged into the module interface itself.

  • New constructors and Type Objects were added for date/time values, and the RAW Type Object was renamed to BINARY. The resulting set should cover all basic data types commonly found in modern SQL databases.

  • New constants (apilevel, threadsafety, paramstyle) and methods (executemany, nextset) were added to provide better database bindings.

  • The semantics of .callproc() needed to call stored procedures are now clearly defined.

  • The definition of the .execute() return value changed. Previously, the return value was based on the SQL statement type (which was difficult to implement correctly)—it is undefined now; use the more flexible .rowcount attribute instead. Modules are free to return the old style return values, but these are no longer mandated by the specification and should be considered database interface dependent.

  • Class-based exceptions were incorporated into the specification. Module implementers are free to extend the exception layout defined in this specification by subclassing the defined exception classes.

Open Issues

Although the version 2.0 specification clarifies a lot of questions that were left open in the 1.0 version, there are still some remaining issues:

  • Define a useful return value for .nextset() for the case in which a new resultset is available.

  • Create a fixed point numeric type for use as loss-less monetary and decimal interchange format.

Footnotes
  1. As a guideline, the connection constructor parameters should be implemented as keyword parameters for more intuitive use and follow this order of parameters:

    									
    dsn       = Data source name as string
    user      = User name as string          (optional)
    password  = Password as string           (optional)
    host      = Hostname                     (optional)
    database  = Database name                (optional)
    
    								

    For example, a connect could look like this:

    									
    connect(dsn='myhost:MYDB',user='guido',password='234$')
    
    								
  2. Module implementers should prefer numeric, named, or pyformat over the other formats because these offer more clarity and flexibility.

  3. If the database does not support the functionality required by the method, the interface should throw an exception in case the method is used.

    The preferred approach is to not implement the method and thus have Python generate an AttributeError in case the method is requested. This allows the programmer to check for database capabilities using the standard hasattr() function.

    For some dynamically configured interfaces, it might not be appropriate to require that the method be made available dynamically. These interfaces should then raise a NotSupportedError to indicate the inability to perform the rollback when the method is invoked.

  4. A database interface can choose to support named cursors by allowing a string argument to the method. This feature is not part of the specification because it complicates semantics of the .fetchXXX() methods.

  5. The module will use the __getitem__ method of the parameters object to map either positions (integers) or names (strings) to parameter values. This allows for both sequences and mappings to be used as input.

    The term bound refers to the process of binding an input value to a database execution buffer. In practical terms, this means that the input value is directly used as a value in the operation. The client should not be required to "escape" the value so that it can be used—the value should be equal to the actual database value.

  6. The interface can implement row fetching using arrays and other optimizations. It is not guaranteed that a call to this method will only move the associated cursor forward by one row.

  7. The rowcount attribute might be coded in a way that updates its value dynamically. This can be useful for databases that return usable rowcount values only after the first call to a .fetchXXX() method.


Last updated on 1/30/2002
Python Developer's Handbook, © 2002 Sams Publishing


© 2002, O'Reilly & Associates, Inc.