
Flat Databases

The simplest way to store any kind of information in Python is using flat files. You just need to use the open function that we already studied in Chapter 2, "Language Review." Two options are available: You can either store the information as simple text or as binary data.

Text Data

The next example is a straightforward case of using flat files to store and retrieve information. First we try to read from the file. If the file cannot be opened (for example, because it doesn't exist yet), it is created, and the information provided by the user is saved in it.

						
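filename = "myflatfile.txt"
try:
    # Read and display the file if it already exists.
    file = open(filename, "r")
    data = file.read()
    file.close()
    print data
except IOError:
    # The file could not be opened, so create it from the user's input.
    data = raw_input("Enter data to save:")
    file = open(filename, "w")
    file.write(data)
    file.close()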
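
The same read-or-create pattern can be wrapped in a function. The following is only a minimal sketch, written for Python 3 (where print is a function and files are usually handled with the with statement). The function name load_or_create and its default_text parameter are illustrative assumptions, and a scratch directory is used so that no real file is touched.

```python
import os
import tempfile


def load_or_create(filename, default_text):
    """Return the text stored in filename, creating the file if needed."""
    try:
        # Try to read the existing flat file first.
        with open(filename, "r") as f:
            return f.read()
    except IOError:
        # The file could not be opened: create it and store the default text.
        with open(filename, "w") as f:
            f.write(default_text)
        return default_text


# Demonstration in a scratch directory so no real file is overwritten.
scratch_dir = tempfile.mkdtemp()
path = os.path.join(scratch_dir, "myflatfile.txt")
first = load_or_create(path, "hello")     # file is missing, so it is created
second = load_or_create(path, "ignored")  # file exists, so it is read back
print(first, second)
```

The second call returns the text saved by the first one, because the file already exists by then.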

					

Binary Data—The struct Module

The struct module is largely used to manipulate the contents of platform-independent binary files. It is a good choice for handling small files. For large files, you should consider using the array module.
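
For a long run of values that all share one type, the array module stores them compactly and can convert them to and from raw bytes in a single call. The following is a minimal sketch, assuming Python 3; the variable names are illustrative.

```python
from array import array

# One thousand signed short integers, all of the same C type ('h').
samples = array("h", range(1000))

# tobytes() returns the raw machine representation of the whole array.
raw = samples.tobytes()

# frombytes() rebuilds the values from that raw representation.
restored = array("h")
restored.frombytes(raw)

print(len(raw) == len(samples) * samples.itemsize)
print(restored == samples)
```

Note that the raw bytes use the byte order of the machine that produced them, so this form is best suited to data that stays on one platform.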

Keep in mind that binary data files are much less likely to be platform independent than text files. Also, it is easier to extend a text file format without breaking compatibility.
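
One reason binary files are less portable is byte order. A struct format string can begin with a prefix character that fixes the byte order and selects standard sizes without padding: "<" for little-endian and ">" for big-endian. The following short sketch assumes Python 3, where packed data is a bytes object.

```python
import struct

# The same integer packed with an explicit byte order.
little = struct.pack("<i", 1)   # least significant byte first
big = struct.pack(">i", 1)      # most significant byte first

print(little)   # b'\x01\x00\x00\x00'
print(big)      # b'\x00\x00\x00\x01'

# Unpacking with the matching prefix gives the same value on any platform.
print(struct.unpack("<i", little) == struct.unpack(">i", big))
```

Without a prefix, struct uses the native byte order, sizes, and alignment of the machine running the code.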

The struct module works by converting data between Python values and binary data structures laid out like C structs, the form normally exchanged with functions written in C.

The module is built around three functions: pack, unpack, and calcsize.

  • pack: Takes the list of values and returns a binary object based on the formatstring provided.

        binobject = pack(formatstring, value1, value2, value3, …)

  • unpack: Returns a Python tuple containing the original values. It uses the formatstring to translate the string.

        pythontuple = unpack(formatstring, string)

  • calcsize: Provides the size in bytes of the structure matching the format string.

        no_of_bytes = calcsize(formatstring)
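
The three functions can be tried together on one record. The following is a minimal sketch, assuming Python 3; the format string "<Hd4s" and the sample values are made up for illustration. The "<" prefix selects little-endian byte order with no padding, so the size is the same on every platform.

```python
import struct

# unsigned short (2 bytes), double (8 bytes), 4-character string (4 bytes)
fmt = "<Hd4s"

size = struct.calcsize(fmt)                       # bytes needed by the format
packed = struct.pack(fmt, 2002, 8.1, b"flat")     # values -> binary data
values = struct.unpack(fmt, packed)               # binary data -> tuple

print(size)                  # 14
print(len(packed) == size)   # True
print(values)                # (2002, 8.1, b'flat')
```

The data passed to unpack must be exactly calcsize(formatstring) bytes long; otherwise struct.error is raised.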
    
    							

The next example packs the values (1, 2, 3) into binary format based on the format string "ihb", and later converts them back to the original values.

						
>>> import struct
>>> packed = struct.pack("ihb", 1, 2, 3)
>>> print repr(packed)
'\x01\x00\x00\x00\x02\x00\x03'
>>> print struct.unpack("ihb", packed)
(1, 2, 3)

					

Note that the binary data is represented as a Python string.

The next example is based on a binary file that stores three different objects. The first one is the author's initial, the second one is the number of bytes used by an article written by the author, and the last object is the article itself.

						
>>> import struct
>>> data = open('mybinaryfile.dat').read()
>>> start, stop = 0, struct.calcsize('cl')
>>> author, num_bytes = struct.unpack('cl', data[start:stop])
>>> start, stop = stop, start + struct.calcsize('B'*num_bytes)
>>> bytes = struct.unpack('B'*num_bytes, data[start:stop])
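
To experiment with this layout without an existing data file, a file of the same shape can be written and then read back. The following is only a sketch, assuming Python 3. The helper names write_article and read_article are hypothetical, the header format "<cl" is an assumed little-endian variant of the 'cl' header above, and an in-memory io.BytesIO object stands in for a file opened in binary mode.

```python
import io
import struct

HEADER = "<cl"   # assumed layout: 1-byte initial, then a 4-byte length


def write_article(stream, initial, article):
    """Write the initial, the article length, and the article bytes."""
    stream.write(struct.pack(HEADER, initial, len(article)))
    stream.write(article)


def read_article(stream):
    """Read back one (initial, article) pair written by write_article."""
    header_size = struct.calcsize(HEADER)
    initial, num_bytes = struct.unpack(HEADER, stream.read(header_size))
    article = stream.read(num_bytes)
    return initial, article


# An in-memory stream stands in for a file opened in binary mode.
stream = io.BytesIO()
write_article(stream, b"A", b"Flat files are simple.")
stream.seek(0)
result = read_article(stream)
print(result)
```

Reading the article with a single read call, instead of unpacking it one 'B' unit at a time, returns it as one bytes object rather than a tuple of integers.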

					

The next table shows the list of formatting units that can be used by this module.

Table 8.1. Formatting Units Used by the struct Module

Format   C Type            Python Type
b        signed char       Integer
B        unsigned char     Integer
c        char              String of length 1
d        double            Float
f        float             Float
h        short             Integer
H        unsigned short    Integer
i        int               Integer
I        unsigned int      Integer
l        long              Integer
L        unsigned long     Integer
p        char[]            String
P        void *            Integer
s        char[]            String
x        pad byte          No value
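
A few of these units behave in ways that are easy to miss. The following short sketch, assuming Python 3 and the "<" prefix for a fixed little-endian layout, shows repeat counts, the pad byte, and the difference between the s and p string units. The variable names are illustrative.

```python
import struct

# A repeat count applies to the unit that follows it: "2h" means two shorts.
two_shorts = struct.pack("<2h", 1, 2)

# "x" inserts a pad byte and consumes no value.
padded = struct.pack("<bxh", 1, 2)

# For "s", the count is the length of the string, not a repeat count.
fixed_string = struct.pack("<5s", b"abc")

# "p" stores a Pascal string: a length byte followed by the characters.
pascal_string = struct.pack("<5p", b"abc")

# "d" and "f" both unpack to Python floats.
as_float = struct.unpack("<d", struct.pack("<d", 1.5))

print(two_shorts)      # b'\x01\x00\x02\x00'
print(padded)          # b'\x01\x00\x02\x00'
print(fixed_string)    # b'abc\x00\x00'
print(pascal_string)   # b'\x03abc\x00'
print(as_float)        # (1.5,)
```

Strings shorter than the declared length are padded with null bytes, which is why both string results are five bytes long.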

Are you looking for more information about handling binary data? Check out the file npstruct-980726.zip at the following address:

http://www.nightmare.com/software.html

Sam Rushing has created an extension module useful for parsing and unparsing binary data structures. It is similar to the standard struct module, but with a few extra features (bit-fields, user-function-fields, byte order specification, and so on), and a different API that is more convenient for streamed and context-sensitive formats such as network protocol packets, images, and sound files.


Last updated on 1/30/2002
Python Developer's Handbook, © 2002 Sams Publishing

© 2002, O'Reilly & Associates, Inc.