
Threads

Let's start by quickly defining a thread, because many people are still confused about the difference between threads and processes.

When you run a program on your computer, the operating system creates a process for it. This process is defined as the group of elements that make up a single running program: the memory area reserved for the program, a program counter, a list of files opened by the program, and a call stack where the variables are stored. A program with a single call stack and program counter is a single-threaded program.

Now, suppose your program contains different tasks that you need to execute simultaneously. What do you do? Maybe you are thinking about running the whole program several times. Wrong answer! Think about all the resources you would be consuming without actually using them!

The multithreaded solution is to put the code that must run concurrently into a function, and then create a thread that executes only that function.

A thread is an execution unit that carries out time-consuming actions as parallel tasks in the background of your main application process. Threads are sometimes difficult to debug because the circumstances in which bugs occur are hard to reproduce.

Python Threads

Python threads can be implemented on every operating system that supports the POSIX threads library, but Python's threading support doesn't always use POSIX threads. In the python-2.0 source tree, there are beos, cthread, lwp, nt, os2, pth, pthread, sgi, solaris, and wince thread implementations. In environments that support multithreading, Python allows the interpreter to run many threads at once.

Python has two threading interfaces: the thread module and the threading module. Using these native built-in modules keeps your code portable across all platforms that support Python.

The thread module supports lightweight process threads. It offers a low-level interface for working with multiple threads.

On the other hand, the threading module provides high-level threading interfaces on top of the thread module.

Besides these two modules, Python also implements the Queue module, which provides a synchronized queue class used in thread programming to move Python objects between multiple threads in a safe way.

Threads have limitations on some platforms, and performance varies: Linux thread switching, for instance, is quite fast, sometimes faster than NT thread switching.

Systems such as Tkinter, CORBA, and ILU that rely on a main loop to dispatch events can complicate the design of threaded programs; they definitely do not get along well with threads. Graphical user interfaces usually use a main loop to keep the main thread from exiting.

MacPython is currently not built with thread support, because no POSIX-compatible thread implementation was available, which made Python integration hard. However, this has changed with GUSI2 (a POSIX I/O emulation library), and the upcoming MacPython 1.6a1 is planned to have threads.

The Windows operating system adds many features to Python's implementation of threads. The win32 package extends Python's thread support with the following:

  • The win32process module—An interface to the win32 Process and Thread API's.

  • The win32event module—A module that provides an interface to the win32 event/wait API.

The threading model provided by the COM technology allows objects not designed to work as threads to be used by other objects that are thread-aware.

Python's interpreter cannot run more than one thread at the same time. The global interpreter lock (GIL) is the internal mechanism that guarantees that the interpreter executes only one thread at a time. Although this is not a problem for single-threaded programs, or for programs on single-processor machines, it can become a problem for performance-critical applications that run on multiprocessor computers. If your threads are doing I/O work, however, other threads can execute during reads and writes, because blocking calls release the lock while they wait.

Check out Appendix A, "Python/C API," for information about handling threads using the Python/C API. You can also see the latest documentation about it at

http://www.python.org/doc/current/api/threads.html

You might also want to look at the thread and threading modules in the library reference, which are documented at

http://www.python.org/doc/current/lib/module-thread.html

and

http://www.python.org/doc/current/lib/module-threading.html

Anton Ertl has a Web page with very interesting material about the various threaded-code techniques:

http://www.complang.tuwien.ac.at/forth/threaded-code.html

Python Thread Modules

Python includes two threading modules, assuming that your Python was configured for threads when it was built. One provides the primitives, and the other provides higher-level access. When built with thread support, Python relies on the operating system's native threads. This should offer adequate performance for all but the most demanding applications.

Thread Module

The following four functions are available in this module:

  • thread.allocate_lock()—   Creates and returns a lock object. This object has the following three methods:

    lckobj.acquire([flag])—   Acquires the lock. If flag is omitted, the method blocks until the lock is acquired and returns None. If flag is 0, the lock is acquired only when it can be acquired immediately, without waiting. Any other value blocks until the lock is released; this wait cannot be interrupted. When flag is given, the method returns 1 if the lock was acquired, and 0 if not.

    lckobj.release()—   Releases the lock.

    lckobj.locked()—   Returns 1 if the lock is currently held. Otherwise, it returns 0.

  • thread.exit()—   Raises a SystemExit exception that ends the thread. It is equivalent to the sys.exit() function.

  • thread.get_ident()—   Gets the identifier of the current thread.

  • thread.start_new_thread(func, args [,kwargs])—   Starts a new thread. Internally, it uses the apply function to call func using the provided arguments. This method requires the second argument (args) to be a tuple.

Because there is no main loop in the next program, the time.sleep function (line 30) keeps the main thread from exiting and thus keeps the child threads from being killed. If this function weren't there, the other threads would be killed immediately when the main thread exited. You can test this by commenting out the last line.

							
 1: import thread, time
 2: class VCR:
 3:     def __init__(self):
 4:         self._channel = { }
 5:         self._channel['1'] = self.channel_KDSF
 6:         self._channel['2'] = self.channel_FOKS
 7:         self._channel['3'] = self.channel_CBA
 8:         self._channel['4'] = self.channel_ESTN
 9:     def channel(self, selection, seconds):
10:         self._channel[selection] (seconds)
11:     def channel_KDSF(self, seconds_arg):
12:         thread.start_new_thread(self.record, (seconds_arg,'1. KDSF'))
13:     def channel_FOKS(self, seconds_arg):
14:         thread.start_new_thread(self.record, (seconds_arg,'2. FOKS'))
15:     def channel_CBA(self, seconds_arg):
16:         thread.start_new_thread(self.record, (seconds_arg,'3. CBA'))
17:     def channel_ESTN(self, seconds_arg):
18:         thread.start_new_thread(self.record, (seconds_arg,'4. ESTN'))
19:     def record(self, seconds, channel):
20:         for i in range(seconds):
21:             time.sleep(0.0001)
22:         print "%s is recorded" % (channel)
23:
24: myVCR = VCR()
25:
26: myVCR.channel('1', 700)
27: myVCR.channel('2', 700)
28: myVCR.channel('3', 500)
29: myVCR.channel('4', 300)
30: time.sleep(5.0)

						

The time.sleep() function in line 21 is necessary to allow other threads to run. Without it, the loop would leave no timing gap in which the other threads could execute.

Threading Module

Besides exposing all the functions from the thread module, this module also provides the following additional functions:

threading.activeCount()—   This function returns the number of active thread objects.

threading.currentThread()—   This function returns the currently executing thread object.

threading.enumerate()—   This function returns a list of all active thread objects.

Each threading.Thread object implements many methods, including

threadobj.start()—   This method starts the thread, which invokes the run method.

threadobj.run()—   This method is called by the start method. You can redefine it in a subclass.

threadobj.join([timeout])—   This one waits for the thread to terminate. The optional timeout argument is given in seconds.

threadobj.isAlive()—   Returns 1 while the thread is active, that is, after its run method has started and before it has concluded. Otherwise, it returns 0.

In the next example, you subclass the Thread class and define a new run method for the subclass. To activate the thread, you need to call the start() method, not the run() method. The start method creates the new thread that executes the run method.

							
import threading
import time, random

class NewThread(threading.Thread):
    def run(self):
        init = 0
        max = random.randint(1,10)
        while init < max:
            init = init + 1
            time.sleep(0.0001)
        print max

threads = []
for i in range(20):
    threadobj = NewThread()
    threadobj.start()
    threads.append(threadobj)

for thread in threads:
    thread.join()

print "---- THE END ----"

						

Just as a suggestion, try commenting out the for loop near the end of the program. The reason for using it is to guarantee that all the threads complete before the main program exits.

As final notes about this topic, I would like to highlight that

  • The processing time of a thread in a multithreaded program is, roughly, the CPU time of the program divided by the number of threads that have been created. It is only an estimate, because some threads might take a lot more CPU time than others.

  • Multithreaded programs share their data among all the threads, which can cause race conditions (an inconsistent state in the program). You have to be very careful when updating data used by multiple threads. Usually, the solution for this kind of problem is to acquire a lock before changing the data in order to keep all the threads synchronized.

For more information about threading, check out Python and Indirect Threading, by Vladimir Marangozov:

http://starship.python.net/crew/vlad/archive/threaded_code/

Microthreads

If you are really thinking about diving into multitasking applications, another option that you should consider is called microthreads. This approach implements threading by tweaking the execution order of Python's virtual machine, rather than by relying on the processor's context switching. The microthread approach is much newer and much less thoroughly tested, but it might be more straightforward for your application.

Simulations and high-volume, mission-critical applications typically prefer large numbers of lightweight threads. The Stackless Python implementation provides lightweight microthreads (see http://www.stackless.com for more information).

With microthreads, all your simulation threads run within a single operating system thread. They are useful when you want to program many behaviors happening simultaneously. Simulations and games often want to model the simultaneous and independent behavior of many people, many businesses, many monsters, many physical objects, many spaceships, and so forth. With microthreads, you can code these behaviors as Python functions. Additionally, the microthread library includes a rich set of objects for interthread communication, synchronization, and execution control.

Tip

Keep in mind that you need to have the Stackless Python in order to use the microthread library.



Microthreads switch faster and use much less memory than OS threads. Their restrictions (not shared by OS threads) are that they provide context switching only within Python code, not within C or Fortran extensions, and that they cannot take advantage of multiple processors, because all microthreads run within a single OS thread.

You can run thousands of microthreads at the same time. However, microthreads can hang on some blocking I/O operations; they are so new that there isn't yet much practical experience with which input or output operations are troublesome.

For details, check out Python Microthreads, by Christian Tismer and Will Ware:

http://world.std.com/~wware/uthread.html


Last updated on 1/30/2002
Python Developer's Handbook, © 2002 Sams Publishing


© 2002, O'Reilly & Associates, Inc.