Object Serialization and Persistent Storage

These modules provide persistent storage of arbitrary Python objects. Whenever you need to save an object whose value is not a simple string (such as None, an integer, a long integer, a float, a complex number, a tuple, a list, a dictionary, a code object, and so on), you need to serialize the object before writing it to a file. Both the pickle and shelve modules save serializable objects to a file. By using persistent storage modules, Python objects can also be stored in relational database systems. Such modules abstract and hide the underlying database interfaces, such as the Sybase module and the Python Database API.

Included in the standard Python distribution, the pickle module converts Python objects to and from a string representation. The cPickle module is a faster implementation of the pickle module. The copy_reg module extends the capabilities of the pickle and cPickle modules by registering support functions.

The marshal module is an alternative way to serialize Python objects. It reads and writes data in a platform-independent binary format and converts data to and from character strings, but it supports only the simple built-in types. Python itself uses this module to serialize the compiled bytecode of Python modules. Use marshal for simple objects only, and use the pickle module to implement persistent objects in general.

"Persistent Storage of Python Objects in Relational Databases" is a paper by Joel Shprentz presented at the Sixth Python Conference. For more information, check out http://www.python.org/workshops/1997-10/proceedings/shprentz.html.

pickle Module

The pickle module serializes the contents of an object into a stream of bytes. Optionally, it can save the serialized object into a file object. It is slower than the marshal module.
    >>> import pickle
    >>> filename = "listobj.pck"      # hypothetical file name
    >>> listobj = [1,2,3,4]
    >>> filehandle = open(filename, 'w')
    >>> pickle.dump(listobj, filehandle)
    >>> filehandle.close()
    >>> filehandle = open(filename, 'r')
    >>> listobj = pickle.load(filehandle)
    >>> filehandle.close()

The pickle module implements the following functions:

pickle.dump(object, file [,bin])
    Serializes an object and writes it to an open file object. The bin argument specifies that the information must be saved as binary data. This function is the same as the following:

        p = pickle.Pickler(file, bin)
        p.dump(object)

    If an unsupported object type is serialized, a PicklingError exception is raised.

pickle.dumps(object [,bin])
    Behaves like dump, except that it returns the serialized object as a string instead of writing it to a file.

pickle.load(file)
    Restores a serialized object from an open file object. This function is the same as the following:

        object = pickle.Unpickler(file).load()

The next example serializes a value into a string with dumps and converts it back again with loads.

    >>> import pickle
    >>> value = ("parrot", (1,2,3))
    >>> data = pickle.dumps(value)
    >>> print pickle.loads(data)
    ('parrot', (1, 2, 3))

cPickle Module

This module implements the same functions that the pickle module does. The difference is that cPickle is much faster because it is written in C. The trade-off is that its Pickler and Unpickler objects cannot be subclassed.

The next example uses the fastest pickle module available on the system.

    try:
        import cPickle
        pickle = cPickle
    except ImportError:
        import pickle

copy_reg Module

This module registers new types to be used with the pickle module. It extends the capabilities of the pickle and cPickle modules by supporting the serialization of new object types defined in C extension modules. The next example works around the fact that the standard pickle implementation cannot handle Python code objects. It registers a code object handler by using two functions:
    import copy_reg, pickle, marshal, types

    def loaddata(data):
        # Rebuild the code object from its marshaled string
        return marshal.loads(data)

    def dumpdata(code):
        # Reduce the code object to a (constructor, arguments) pair
        return loaddata, (marshal.dumps(code),)

    copy_reg.pickle(types.CodeType, dumpdata, loaddata)

    script = """
    x = 1
    while x < 10:
        print x
        x = x + 1
    """

    code = compile(script, "<string>", "exec")
    codeobj = pickle.dumps(code)
    exec pickle.loads(codeobj)

Note
Note that starting with Python 2.0, the copy_reg module can no longer be used to register pickle support for classes. It can only be used to register pickle support for extension types. You will get a TypeError exception from the pickle() function whenever you try to pass a class to it.

marshal Module

This module is used only to serialize simple data objects, because class instances and recursive references in lists, tuples, and dictionaries are not supported. It works similarly to pickle and shelve. This module implements the following functions:

marshal.dump(value, file)
    Writes the value to the open file object.

marshal.load(file)
    Returns the next readable value from the file.

marshal.dumps(value)
    Returns the serialized value as a string instead of writing it to a file.

marshal.loads(string)
    Returns the next readable value from the string.

Errors in the value manipulation raise a ValueError exception.

    >>> import marshal
    >>> value = ("spam", [1,2,3,4])
    >>> data = marshal.dumps(value)
    >>> print repr(data)
    '(\002\000\000\000s\004\000\000\000spam[\004\000\000\000i\001\000\000\000i\002\000\000\000i\003\000\000\000i\004\000\000\000'
    >>> print marshal.loads(data)
    ('spam', [1, 2, 3, 4])

The next example handles code objects by storing precompiled Python code.

    import marshal

    script = """
    x = 1
    while x < 10:
        print x
        x = x + 1
    """

    code = compile(script, "<script>", "exec")
    codeobj = marshal.dumps(code)
    exec marshal.loads(codeobj)

shelve Module

The shelve module is also part of the standard Python distribution. Built on top of the pickle and anydbm modules, it behaves like a persistent dictionary whose values can be arbitrary Python objects. Keys must be strings. Values can be of any data type, as long as the pickle module can handle it.
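As a minimal sketch of the persistent-dictionary behavior just described, the following stores values of several types under string keys and reads them back after the shelf has been closed and reopened. It is written for Python 3, which is an assumption beyond the Python 2 listings in this section, and the file name and the stored data are hypothetical.

```python
import os
import shelve
import tempfile

# Hypothetical location for the demo shelf; a temporary directory keeps it isolated.
shelf_path = os.path.join(tempfile.mkdtemp(), "inventory")

# Keys are strings; values may be any object that pickle can serialize.
db = shelve.open(shelf_path, "c")
db["name"] = "spam"
db["weekdays"] = ["Mon", "Tue", "Wed", "Thu", "Fri"]
db["stock"] = {"eggs": 12, "prices": (1.5, 2.25)}
db.close()

# Reopen read-only and copy the contents into an ordinary dictionary.
db = shelve.open(shelf_path, "r")
restored = dict(db)
db.close()
```

After the second close, restored is a plain in-memory dictionary, so it no longer depends on the file on disk.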
The following script stores the key/value pairs typed by the user until a key that already exists is entered:

    import shelve

    key = raw_input("key: ")
    data = raw_input("value: ")

    # "c" opens the database for reading and writing, creating it if necessary
    dbhandle = shelve.open("DATABASE", "c")
    while not dbhandle.has_key(key):
        dbhandle[key] = data
        key = raw_input("key: ")
        data = raw_input("value: ")
    dbhandle.close()

The shelve module implements a shelf object, which supports persistent objects that must be serializable using the pickle module. In other words, a shelf is a dbm-like database file (such as dbm or gdbm) that stores pickled Python objects on disk. The file it produces is, consequently, a binary file whose format is specific to the database manager used in the process.

To open a shelve file, the following function is available:

shelve.open(filename)
    The file is created when the filename does not exist.

The following methods and operations are also supported:

    dbhandle[key] = value     # Set the value of a given key entry
    value = dbhandle[key]     # Get the value of a given key entry
    dbhandle.has_key(key)     # Test whether a key exists
    dbhandle.keys()           # Return a list of the current keys available
    del dbhandle[key]         # Delete a key
    dbhandle.close()          # Close the file

Next, I present a simple example of the shelve module. The order in which the keys come back is not guaranteed.

    >>> import shelve
    >>> dbhandle = shelve.open("datafile", "c")
    >>> dbhandle["animal"] = "parrot"
    >>> dbhandle["country"] = "Spain"
    >>> dbhandle["weekdays"] = 5
    >>> dbhandle.close()
    >>>
    >>> dbhandle = shelve.open("datafile", "r")
    >>> for key in dbhandle.keys():
    ...     print dbhandle[key]
    ...
    parrot
    Spain
    5
    >>> dbhandle.close()

Locking

Even though modules such as gdbm and bsddb perform locking, shelves don't implement locking facilities. This means that many users can read the files at the same time. However, only one user can update the file at a given moment. An easy way to handle the situation is by locking the file while writing to it.
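One way such a locking routine could look is sketched below. It is an illustration under stated assumptions and not code from the standard distribution: it targets Python 3 on a Unix system (the fcntl module is not available on Windows), it takes an advisory lock on a separate side file instead of on the database file itself, and the helper name locked_shelf and the ".lock" suffix are hypothetical.

```python
import contextlib
import fcntl  # Unix only (assumption: the code runs on a Unix-like system)
import os
import shelve
import tempfile


@contextlib.contextmanager
def locked_shelf(path, flag="c"):
    """Open a shelf while holding an exclusive advisory lock on a side file.

    Hypothetical helper: every cooperating process must use it, because
    advisory locks do not stop programs that ignore them.
    """
    lock_file = open(path + ".lock", "w")
    try:
        fcntl.flock(lock_file, fcntl.LOCK_EX)  # blocks until the lock is free
        shelf = shelve.open(path, flag)
        try:
            yield shelf
        finally:
            shelf.close()  # flush the data before the lock is released
    finally:
        fcntl.flock(lock_file, fcntl.LOCK_UN)
        lock_file.close()


# Hypothetical shelf location inside a temporary directory.
shelf_path = os.path.join(tempfile.mkdtemp(), "datafile")

# Writer: the update happens entirely while the lock is held.
with locked_shelf(shelf_path) as db:
    db["animal"] = "parrot"

# A later access goes through the same helper.
with locked_shelf(shelf_path) as db:
    stored_animal = db["animal"]
```

Locking a side file avoids depending on the names of the files that the underlying database manager creates, which differ from one manager to another.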
A routine like this must be implemented because it is not part of the standard distribution.

More Sources of Information

PyVersant

PyVersant is a simple Python wrapper for the Versant commercial OODBMS. By using PyVersant in the Python command prompt, you can interactively find objects, look at their values, change those values, and write the object back to the database, among other things. More information is provided at the following site:

http://starship.python.net/crew/jmenzel/

Details about Versant OODBMS are shown at the following site:

ZODB

The Zope Object Database is a persistent-object system that provides transparent transactional object persistence to Python applications. For more information, check out the following site:

http://www.zope.org/Members/michel/HowTos/ZODB-How-To

ZODB is a powerful object database system that can be used with or without Zope. As a database, it offers many features. Note that ZODB uses other database libraries for the actual storage. More information about Zope can be found in Chapter 11, "Web Development."
© 2002, O'Reilly & Associates, Inc. |