
Initialization, Finalization, and Threads

					
void Py_Initialize()

				

Initializes the Python interpreter. In an application embedding Python, this should be called before using any other Python/C API functions, with the exception of Py_SetProgramName(), PyEval_InitThreads(), PyEval_ReleaseLock(), and PyEval_AcquireLock(). This initializes the table of loaded modules (sys.modules) and creates the fundamental modules __builtin__, __main__, and sys. It also initializes the module search path (sys.path). It does not set sys.argv; use PySys_SetArgv() for that. This is a no-op when called a second time (without calling Py_Finalize() first). There is no return value; it is a fatal error if the initialization fails.

					
int Py_IsInitialized()

				

Returns true (nonzero) when the Python interpreter has been initialized, false (zero) if not. After Py_Finalize() is called, this returns false until Py_Initialize() is called again.

					
void Py_Finalize()

				

Undoes all initializations made by Py_Initialize() and subsequent uses of Python/C API functions, and destroys all sub-interpreters (see Py_NewInterpreter() below) that were created and not yet destroyed since the last call to Py_Initialize(). Ideally, this frees all memory allocated by the Python interpreter. This is a no-op when called a second time (without calling Py_Initialize() again first). There is no return value; errors during finalization are ignored.

This function is provided for a number of reasons. An embedding application might want to restart Python without having to restart the application itself. An application that has loaded the Python interpreter from a dynamic link library (or DLL) might want to free all memory allocated by Python before unloading the DLL. During a hunt for memory leaks in an application, a developer might want to free all memory allocated by Python before exiting from the application.

Bugs and caveats include the following. The destruction of modules and of objects in modules is done in random order; this can cause destructors (__del__() methods) to fail when they depend on other objects (even functions) or modules. Dynamically loaded extension modules loaded by Python are not unloaded. Small amounts of memory allocated by the Python interpreter might not be freed (if you find a leak, please report it to the development team). Memory tied up in circular references between objects is not freed. Some memory allocated by extension modules might not be freed. Some extensions might not work properly if their initialization routine is called more than once; this can happen if an application calls Py_Initialize() and Py_Finalize() more than once.

					
PyThreadState* Py_NewInterpreter()

				

Creates a new sub-interpreter. This is an (almost) totally separate environment for the execution of Python code. In particular, the new interpreter has separate, independent versions of all imported modules, including the fundamental modules __builtin__, __main__, and sys. The table of loaded modules (sys.modules) and the module search path (sys.path) are also separate. The new environment has no sys.argv variable. It has new standard I/O stream file objects sys.stdin, sys.stdout, and sys.stderr (however, these refer to the same underlying FILE structures in the C library).

The return value points to the first thread state created in the new sub-interpreter. This thread state is made the current thread state. Note that no actual thread is created; see the discussion of thread states later. If the creation of the new interpreter is unsuccessful, NULL is returned; no exception is set because the exception state is stored in the current thread state and there might not be a current thread state. (Like all other Python/C API functions, the global interpreter lock must be held before calling this function and is still held when it returns; however, unlike most other Python/C API functions, there needn't be a current thread state on entry.)

Extension modules are shared between (sub-)interpreters as follows: the first time a particular extension is imported, it is initialized normally, and a (shallow) copy of its module's dictionary is squirreled away. When the same extension is imported by another (sub-)interpreter, a new module is initialized and filled with the contents of this copy; the extension's init function is not called. Note that this is different from what happens when an extension is imported after the interpreter has been completely re-initialized by calling Py_Finalize() and Py_Initialize(); in that case, the extension's initmodule function is called again.

Bugs and caveats include: Because sub-interpreters (and the main interpreter) are part of the same process, the insulation between them isn't perfect—for example, using low-level file operations like os.close(), they can (accidentally or maliciously) affect each other's open files. Because of the way extensions are shared between (sub-)interpreters, some extensions might not work properly; this is especially likely when the extension makes use of (static) global variables, or when the extension manipulates its module's dictionary after its initialization. It is possible to insert objects created in one sub-interpreter into a namespace of another sub-interpreter; this should be done with great care to avoid sharing user-defined functions, methods, instances or classes between sub-interpreters because import operations executed by such objects might affect the wrong (sub-)interpreter's dictionary of loaded modules.

Note

This is a hard-to-fix bug that will be addressed in a future release.



					
void Py_EndInterpreter(PyThreadState *tstate)

				

Destroys the (sub-)interpreter represented by the given thread state. The given thread state must be the current thread state. See the discussion of thread states later. When the call returns, the current thread state is NULL. All thread states associated with this interpreter are destroyed. (The global interpreter lock must be held before calling this function and is still held when it returns.) Py_Finalize() will destroy all sub-interpreters that haven't been explicitly destroyed at that point.

					
void Py_SetProgramName(char *name)

				

This function should be called before Py_Initialize() is called for the first time, if it is called at all. It tells the interpreter the value of the argv[0] argument to the main() function of the program. This is used by Py_GetPath() and some of the functions described below to find the Python runtime libraries relative to the interpreter executable. The default value is "python". The argument should point to a zero-terminated character string in static storage whose contents will not change for the duration of the program's execution. No code in the Python interpreter will change the contents of this storage.

					
char* Py_GetProgramName()

				

Returns the program name set with Py_SetProgramName(), or the default. The returned string points into static storage; the caller should not modify its value.

					
char* Py_GetPrefix()

				

Returns the prefix for installed platform-independent files. This is derived through a number of complicated rules from the program name set with Py_SetProgramName() and some environment variables; for example, if the program name is "/usr/local/bin/python", the prefix is "/usr/local". The returned string points into static storage; the caller should not modify its value. This corresponds to the prefix variable in the top-level Makefile and the --prefix argument to the configure script at build time. The value is available to Python code as sys.prefix. It is only useful on UNIX. See also the next function.

					
char* Py_GetExecPrefix()

				

Returns the exec-prefix for installed platform-dependent files. This is derived through a number of complicated rules from the program name set with Py_SetProgramName() and some environment variables; for example, if the program name is "/usr/local/bin/python", the exec-prefix is "/usr/local". The returned string points into static storage; the caller should not modify its value. This corresponds to the exec_prefix variable in the top-level Makefile and the --exec-prefix argument to the configure script at build time. The value is available to Python code as sys.exec_prefix. It is only useful on UNIX.

Background: the exec-prefix differs from the prefix when platform-dependent files (such as executables and shared libraries) are installed in a different directory tree. In a typical installation, platform-dependent files can be installed in the "/usr/local/plat" subtree, whereas platform-independent files can be installed in "/usr/local".

Generally speaking, a platform is a combination of hardware and software families. For example, Sparc machines running the Solaris 2.x operating system are considered the same platform, but Intel machines running Solaris 2.x are another platform, and Intel machines running Linux are yet another platform. Different major revisions of the same operating system generally also form different platforms. Non-UNIX operating systems are a different story; the installation strategies on those systems are so different that the prefix and exec-prefix are meaningless, and are set to the empty string. Note that compiled Python bytecode files are platform independent (but not independent of the Python version by which they were compiled).

System administrators will know how to configure the mount or automount programs to share "/usr/local" between platforms while having "/usr/local/plat" be a different filesystem for each platform.

					
char* Py_GetProgramFullPath()

				

Returns the full program name of the Python executable; this is computed as a side-effect of deriving the default module search path from the program name (set by Py_SetProgramName() earlier). The returned string points into static storage; the caller should not modify its value. The value is available to Python code as sys.executable.

					
char* Py_GetPath()

				

Returns the default module search path; this is computed from the program name (set by Py_SetProgramName() earlier) and some environment variables. The returned string consists of a series of directory names separated by a platform-dependent delimiter character. The delimiter character is : on UNIX, ; on DOS/Windows, and \n (the ASCII newline character) on Macintosh. The returned string points into static storage; the caller should not modify its value. The value is available to Python code as the list sys.path, which can be modified to change the future search path for loaded modules.
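Because the returned string must not be modified, a caller that wants the individual directories has to copy them out rather than split the string in place. The following is a minimal sketch of that idea in plain C. The helper name copy_path_entry() is hypothetical and is not part of the Python/C API, and the caller is assumed to pass the delimiter appropriate for its platform.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of the Python/C API.
 * Copies entry number `index` (0-based) of a delimiter-separated search
 * path into `out` without modifying `path`.
 * Returns 1 on success, 0 if there is no such entry or it does not fit. */
int copy_path_entry(const char *path, char delim, size_t index,
                    char *out, size_t outsize)
{
    const char *start = path;

    assert(path != NULL && out != NULL);
    assert(delim != '\0');          /* the delimiter must be a real character */

    for (;;) {
        const char *end = strchr(start, delim);
        size_t len = end ? (size_t)(end - start) : strlen(start);

        if (index == 0) {
            if (len >= outsize)
                return 0;           /* no room for the entry plus the NUL */
            memcpy(out, start, len);
            out[len] = '\0';
            return 1;
        }
        if (end == NULL)
            return 0;               /* ran out of entries */
        start = end + 1;
        index--;
    }
}
```

With the UNIX delimiter, a call such as copy_path_entry(path, ':', 0, buf, sizeof buf) would copy the first directory of the path into buf.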

					
const char* Py_GetVersion()

				

Returns the version of this Python interpreter. This is a string that looks something like

					
"1.5 (#67, Dec 31 1997, 22:34:28) [GCC 2.7.2.2]"

				

The first word (up to the first space character) is the current Python version; the first three characters are the major and minor version separated by a period. The returned string points into static storage; the caller should not modify its value. The value is available to Python code as the string sys.version.
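An embedding application that needs the major and minor numbers can read them from the front of this string. The sketch below is one way to do that in plain C; the helper name parse_major_minor() is hypothetical and not part of the Python/C API. It parses the two numbers with sscanf() rather than relying on their being exactly three characters wide.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper, not part of the Python/C API.
 * Extracts the major and minor version numbers from a string laid out
 * like the one returned by Py_GetVersion().
 * Returns 1 if both numbers were found, 0 otherwise. */
int parse_major_minor(const char *version, int *major, int *minor)
{
    assert(version != NULL && major != NULL && minor != NULL);

    /* The string starts with "<major>.<minor>"; anything after is ignored. */
    return sscanf(version, "%d.%d", major, minor) == 2;
}
```

For the sample string shown above, this would store 1 in *major and 5 in *minor.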

					
const char* Py_GetPlatform()

				

Returns the platform identifier for the current platform. On UNIX, this is formed from the official name of the operating system, converted to lowercase, followed by the major revision number; for example, for Solaris 2.x, which is also known as SunOS 5.x, the value is sunos5. On Macintosh, it is mac. On Windows, it is win. The returned string points into static storage; the caller should not modify its value. The value is available to Python code as sys.platform.
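The UNIX naming rule described above (lowercased operating system name followed by the major revision number) can be illustrated with a few lines of plain C. This is only a sketch of the rule as stated, not the way the interpreter computes the value. The helper name make_platform_id() is hypothetical, and its inputs are assumed to look like the system name and release strings that uname reports.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of the Python/C API.
 * Builds "<lowercased sysname><major revision>" in `out`, for example
 * "SunOS" and "5.6" give "sunos5".
 * Returns 1 on success, 0 if the result does not fit in `outsize` bytes. */
int make_platform_id(const char *sysname, const char *release,
                     char *out, size_t outsize)
{
    size_t len, digits, i;

    assert(sysname != NULL && release != NULL && out != NULL);

    len = strlen(sysname);
    digits = strspn(release, "0123456789");   /* leading digits = major rev */
    if (len + digits + 1 > outsize)
        return 0;

    for (i = 0; i < len; i++)
        out[i] = (char)tolower((unsigned char)sysname[i]);
    memcpy(out + len, release, digits);
    out[len + digits] = '\0';
    return 1;
}
```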

					
const char* Py_GetCopyright()

				

Returns the official copyright string for the current Python version; for example

					
"Copyright 1991-1995 Stichting Mathematisch Centrum, Amsterdam"

				

The returned string points into static storage; the caller should not modify its value. The value is available to Python code as the string sys.copyright.

					
const char* Py_GetCompiler()

				

Returns an indication of the compiler used to build the current Python version, in square brackets; for example

					
"[GCC 2.7.2.2]"

				

The returned string points into static storage; the caller should not modify its value. The value is available to Python code as part of the variable sys.version.

					
const char* Py_GetBuildInfo()

				

Returns information about the sequence number and build date and time of the current Python interpreter instance; for example

					
"#67, Aug 1 1997, 22:34:28"

				

The returned string points into static storage; the caller should not modify its value. The value is available to Python code as part of the variable sys.version.

					
void PySys_SetArgv(int argc, char **argv)

				

Sets sys.argv based on argc and argv. These parameters are similar to those passed to the program's main() function with the difference that the first entry should refer to the script file to be executed rather than the executable hosting the Python interpreter. If there isn't a script that will be run, the first entry in argv can be an empty string. If this function fails to initialize sys.argv, a fatal condition is signaled using Py_FatalError().

Thread State and the Global Interpreter Lock

The Python interpreter is not fully thread safe. In order to support multithreaded Python programs, a global lock must be held by the current thread before it can safely access Python objects. Without the lock, even the simplest operations could cause problems in a multithreaded program: for example, when two threads simultaneously increment the reference count of the same object, the reference count could end up being incremented only once instead of twice.

Therefore, the rule exists that only the thread that has acquired the global interpreter lock can operate on Python objects or call Python/C API functions. In order to support multithreaded Python programs, the interpreter regularly releases and reacquires the lock—by default, every ten bytecode instructions (this can be changed with sys.setcheckinterval()). The lock is also released and reacquired around potentially blocking I/O operations such as reading or writing a file, so other threads can run while the thread that requests the I/O is waiting for the I/O operation to complete.

The Python interpreter needs to keep some bookkeeping information separate per thread—for this it uses a data structure called PyThreadState. This is new in Python 1.5; in earlier versions, such a state was stored in global variables, and switching threads could cause problems. In particular, exception handling is now thread safe when the application uses sys.exc_info() to access the exception last raised in the current thread.

There's one global variable left, however: the pointer to the current PyThreadState structure. Although most thread packages have a way to store per-thread global data, Python's internal platform independent thread abstraction doesn't support this yet. Therefore, the current thread state must be manipulated explicitly.

This is easy enough in most cases. Most code manipulating the global interpreter lock has the following simple structure:

						
Save the thread state in a local variable.
Release the interpreter lock.
...Do some blocking I/O operation...
Reacquire the interpreter lock.
Restore the thread state from the local variable.

					

This is so common that a pair of macros exists to simplify it:

						
Py_BEGIN_ALLOW_THREADS
...Do some blocking I/O operation...
Py_END_ALLOW_THREADS

					

The Py_BEGIN_ALLOW_THREADS macro opens a new block and declares a hidden local variable; the Py_END_ALLOW_THREADS macro closes the block. Another advantage of using these two macros is that when Python is compiled without thread support, they are defined empty, which avoids the thread state and lock manipulations altogether.

When thread support is enabled, the previous block expands to the following code:

						
PyThreadState *_save;
_save = PyEval_SaveThread();
...Do some blocking I/O operation...
PyEval_RestoreThread(_save);

					

Using even lower level primitives, we can get roughly the same effect as follows:

						
PyThreadState *_save;
_save = PyThreadState_Swap(NULL);
PyEval_ReleaseLock();
...Do some blocking I/O operation...
PyEval_AcquireLock();
PyThreadState_Swap(_save);

					

There are some subtle differences; in particular, PyEval_RestoreThread() saves and restores the value of the global variable errno because the lock manipulation does not guarantee that errno is left alone. Also, when thread support is disabled, PyEval_SaveThread() and PyEval_RestoreThread() don't manipulate the lock; in this case, PyEval_ReleaseLock() and PyEval_AcquireLock() are not available. This is done so that dynamically loaded extensions compiled with thread support enabled can be loaded by an interpreter that was compiled with disabled thread support.

The global interpreter lock is used to protect the pointer to the current thread state. When releasing the lock and saving the thread state, the current thread state pointer must be retrieved before the lock is released because another thread could immediately acquire the lock and store its own thread state in the global variable. Conversely, when acquiring the lock and restoring the thread state, the lock must be acquired before storing the thread state pointer.
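The ordering rule can be made concrete with a toy model in plain C. Everything here is a stand-in: the toy_ names are hypothetical, a POSIX mutex plays the part of the global interpreter lock, and none of this is the interpreter's own code. The sketch only shows that the pointer is read and cleared while the lock is still held, that it is stored again only after the lock has been reacquired, and that errno is preserved across the lock call as described for PyEval_RestoreThread().

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Toy stand-ins with hypothetical names; not the interpreter's real data. */
typedef struct { int id; } toy_thread_state;

static pthread_mutex_t toy_lock = PTHREAD_MUTEX_INITIALIZER; /* plays the lock */
static toy_thread_state *toy_current = NULL;       /* protected by toy_lock */

/* Caller must hold toy_lock.  The pointer is fetched and cleared BEFORE
 * the lock is released, so no other thread can overwrite it first. */
toy_thread_state *toy_save_thread(void)
{
    toy_thread_state *saved = toy_current;

    assert(saved != NULL);
    toy_current = NULL;
    pthread_mutex_unlock(&toy_lock);
    return saved;
}

/* Caller must NOT hold toy_lock.  The lock is acquired BEFORE the pointer
 * is stored; errno is saved and restored around the lock call. */
void toy_restore_thread(toy_thread_state *saved)
{
    int err = errno;

    assert(saved != NULL);
    pthread_mutex_lock(&toy_lock);
    toy_current = saved;
    errno = err;
}
```

A thread would call toy_save_thread() before a blocking operation and toy_restore_thread() afterward, mirroring the save and restore pattern shown earlier.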

Why so much detail about this? Because when threads are created from C, they don't have the global interpreter lock, nor is there a thread state data structure for them. Such threads must bootstrap themselves into existence, by first creating a thread state data structure, acquiring the lock, and finally storing their thread state pointer, before they can start using the Python/C API. When they are done, they should reset the thread state pointer, release the lock, and finally free their thread state data structure.

When creating a thread data structure, you need to provide an interpreter state data structure. The interpreter state data structure holds global data that is shared by all threads in an interpreter, for example the module administration (sys.modules). Depending on your needs, you can either create a new interpreter state data structure, or share the interpreter state data structure used by the Python main thread (to access the latter, you must obtain the thread state and access its interp member; this must be done by a thread that is created by Python or by the main thread after Python is initialized).

						
PyInterpreterState

					

This data structure represents the state shared by a number of cooperating threads. Threads belonging to the same interpreter share their module administration and a few other internal items. There are no public members in this structure.

Threads belonging to different interpreters initially share nothing, except process state like available memory, open file descriptors and such. The global interpreter lock is also shared by all threads, regardless of to which interpreter they belong.

						
PyThreadState

					

This data structure represents the state of a single thread. The only public data member is PyInterpreterState *interp, which points to this thread's interpreter state.

						
void PyEval_InitThreads()

					

Initialize and acquire the global interpreter lock. It should be called in the main thread before creating a second thread or engaging in any other thread operations such as PyEval_ReleaseLock() or PyEval_ReleaseThread(tstate). It is not needed before calling PyEval_SaveThread() or PyEval_RestoreThread().

This is a no-op when called for a second time. It is safe to call this function before calling Py_Initialize().

When only the main thread exists, no lock operations are needed. This is a common situation (most Python programs do not use threads), and the lock operations slow the interpreter down a bit. Therefore, the lock is not created initially. This situation is equivalent to having acquired the lock: When there is only a single thread, all object accesses are safe. Therefore, when this function initializes the lock, it also acquires it. Before the Python thread module creates a new thread, knowing that either it has the lock or the lock hasn't been created yet, it calls PyEval_InitThreads(). When this call returns, it is guaranteed that the lock has been created and that it has acquired it.

It is not safe to call this function when it is unknown which thread (if any) currently has the global interpreter lock.

This function is not available when thread support is disabled at compile time.

						
void PyEval_AcquireLock()

					

Acquires the global interpreter lock. The lock must have been created earlier. If this thread already has the lock, a deadlock ensues. This function is not available when thread support is disabled at compile time.

						
void PyEval_ReleaseLock()

					

Releases the global interpreter lock. The lock must have been created earlier. This function is not available when thread support is disabled at compile time.

						
void PyEval_AcquireThread(PyThreadState *tstate)

					

Acquires the global interpreter lock and then sets the current thread state to tstate, which should not be NULL. The lock must have been created earlier. If this thread already has the lock, deadlock ensues. This function is not available when thread support is disabled at compile time.

						
void PyEval_ReleaseThread(PyThreadState *tstate)

					

Resets the current thread state to NULL and releases the global interpreter lock. The lock must have been created earlier and must be held by the current thread. The tstate argument, which must not be NULL, is only used to check that it represents the current thread state—if it isn't, a fatal error is reported. This function is not available when thread support is disabled at compile time.

						
PyThreadState* PyEval_SaveThread()

					

Releases the interpreter lock (if it has been created and thread support is enabled) and resets the thread state to NULL, returning the previous thread state (which is not NULL). If the lock has been created, the current thread must have acquired it. (This function is available even when thread support is disabled at compile time.)

						
void PyEval_RestoreThread(PyThreadState *tstate)

					

Acquires the interpreter lock (if it has been created and thread support is enabled) and sets the thread state to tstate, which must not be NULL. If the lock has been created, the current thread must not have acquired it, otherwise deadlock ensues. (This function is available even when thread support is disabled at compile time.)

The following macros are normally used without a trailing semicolon; look for example usage in the Python source distribution.

						
Py_BEGIN_ALLOW_THREADS

					

This macro expands to "{ PyThreadState *_save; _save = PyEval_SaveThread();". Note that it contains an opening brace; it must be matched with the following Py_END_ALLOW_THREADS macro. It is a no-op when thread support is disabled at compile time.

						
Py_END_ALLOW_THREADS

					

This macro expands to "PyEval_RestoreThread(_save); }". Note that it contains a closing brace; it must be matched with an earlier Py_BEGIN_ALLOW_THREADS macro. See the earlier discussion of this macro for details. It is a no-op when thread support is disabled at compile time.

						
Py_BLOCK_THREADS

					

This macro expands to "PyEval_RestoreThread(_save);"; that is, it is equivalent to Py_END_ALLOW_THREADS without the closing brace. It is a no-op when thread support is disabled at compile time.

						
Py_UNBLOCK_THREADS

					

This macro expands to "_save = PyEval_SaveThread();"; that is, it is equivalent to Py_BEGIN_ALLOW_THREADS without the opening brace and variable declaration. It is a no-op when thread support is disabled at compile time.

All the following functions are only available when thread support is enabled at compile time, and must be called only when the interpreter lock has been created.

						
PyInterpreterState* PyInterpreterState_New()

					

Creates a new interpreter state object. The interpreter lock need not be held, but can be held if it is necessary to serialize calls to this function.

void PyInterpreterState_Clear(PyInterpreterState *interp)

Resets all information in an interpreter state object. The interpreter lock must be held.

void PyInterpreterState_Delete(PyInterpreterState *interp)

Destroys an interpreter state object. The interpreter lock need not be held. The interpreter state must have been reset with a previous call to PyInterpreterState_Clear().

PyThreadState* PyThreadState_New(PyInterpreterState *interp)

Creates a new thread state object belonging to the given interpreter object. The interpreter lock need not be held, but can be held if it is necessary to serialize calls to this function.

void PyThreadState_Clear(PyThreadState *tstate)

Resets all information in a thread state object. The interpreter lock must be held.

void PyThreadState_Delete(PyThreadState *tstate)

Destroys a thread state object. The interpreter lock need not be held. The thread state must have been reset with a previous call to PyThreadState_Clear().

PyThreadState* PyThreadState_Get()

Returns the current thread state. The interpreter lock must be held. When the current thread state is NULL, this issues a fatal error (so that the caller needn't check for NULL).

PyThreadState* PyThreadState_Swap(PyThreadState *tstate)

Swaps the current thread state with the thread state given by the argument tstate, which can be NULL. The interpreter lock must be held.


Last updated on 1/30/2002
Python Developer's Handbook, © 2002 Sams Publishing


© 2002, O'Reilly & Associates, Inc.