Python multiprocessing
Sometimes I have a good time writing small, easy to understand test programs to research specific behaviours in one language or another. Sometimes the programs grow a bit unwieldy and not so easy to understand after all, but all's well that ends well. Over the last year or so, I've grown more and more interested in Python programming, and I find it very enjoyable. There are some strange constructs that can be hard to wrap your head around, and sometimes I run into some very weird problems, but nothing I'd call a big deal (imho).
My latest forays have been into multiprocessing and how it behaves in Python. It is one of the features that took me a while to wrap my head around, since there is some syntactic weirdness that could use addressing, or at least takes a bit of time to get used to. In short:
- Multiprocessing requires all objects to be pickled and sent over to the running process through a pipe. This means all objects must be picklable, including instance methods (see the sketch after this list).
- The default pickle implementation in Python can't handle instance methods, so some modifications need to be made.
- The correct parameters must be passed to the functions and callbacks handed to apply_async. Failing to do so causes very strange errors to be reported.
- Correct behaviour might be hard to predict, since values are calculated at different points in time. This is especially true if your code has side effects.
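As a quick illustration of the first two points, here is a minimal sketch (assuming Python 2, which this post is written for) of what happens when you hand a bound instance method to the default pickler without the workaround used further down. The class C and its method are just placeholders:

#!/usr/bin/python
import pickle

class C(object):
    def method(self):
        return 42

if __name__ == '__main__':
    c = C()
    try:
        # The default pickler has no recipe for bound instance methods.
        pickle.dumps(c.method)
    except (pickle.PicklingError, TypeError) as e:
        print "can't pickle the bound method: " + str(e)

(The pure-Python pickle module raises PicklingError, while cPickle raises TypeError, hence catching both.)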
The small but rather interesting test code below explores and shows some of the aspects mentioned above. Of special interest, imho, are the timing differences; they clearly show what you are getting yourself into when doing multiprocessing.
#!/usr/bin/python
import multiprocessing
import copy_reg
import types

def _pickle_method(method):
    # Reduce a bound method to something picklable: its name, the
    # instance it is bound to, and the class it was defined on.
    func_name = method.im_func.__name__
    obj = method.im_self
    cls = method.im_class
    return _unpickle_method, (func_name, obj, cls)

def _unpickle_method(func_name, obj, cls):
    # Walk the MRO to find the function, then rebind it to the instance.
    for cls in cls.mro():
        try:
            func = cls.__dict__[func_name]
        except KeyError:
            pass
        else:
            break
    return func.__get__(obj, cls)

# Teach pickle how to handle instance methods.
copy_reg.pickle(types.MethodType, _pickle_method, _unpickle_method)

class A:
    def __init__(self):
        print "A::__init__()"
        self.weird = "weird"

class B(object):
    def __init__(self, myA):
        print "B::__init__()"
        self.a = myA

    def doAsync(self, lala):
        print "B::doAsync()"
        return lala**lala

    def callBack(self, result):
        print "B::callBack()"
        self.a.weird = "wherio"
        print result

def callback(result):
    print "callback result: " + str(result)

def func(x):
    print "func"
    return x**x

if __name__ == '__main__':
    pool = multiprocessing.Pool(2)
    a = A()
    b = B(a)
    print a.weird
    print "Starting"
    result1 = pool.apply_async(func, [4], callback=callback)
    result2 = pool.apply_async(b.doAsync, [8], callback=b.callBack)
    print a.weird
    print "result1: " + str(result1.get())
    print "result2: " + str(result2.get())
    print a.weird
    print "End"
The above code resulted in the following two runs, and if you look closely, the timing problems show up rather clearly: things simply don't happen in the order you might always expect when parallelizing an application.
oan@laptop4:~$ ./multiprocessingtest.py
A::__init__()
B::__init__()
weird
Starting
weird
func
B::doAsync()
callback result: 256
B::callBack()
16777216result1: 256
result2: 16777216
wherio
End

oan@laptop4:~$ ./multiprocessingtest.py
A::__init__()
B::__init__()
weird
Starting
weird
func
B::doAsync()
callback result: 256
B::callBack()
16777216
result1: 256
result2: 16777216
wherio
End
One more warning is in order. A job that leaves via the multiprocessing.Pool and later triggers a callback has one effect that could take some getting used to: the job itself runs in a worker process on a pickled copy of your objects, while the callback runs back in the parent process. Any change the job makes to an object has therefore not taken place in the context of the callback.
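To make that concrete, here is a minimal sketch (hypothetical names, again assuming Python 2) where the job mutates a module-level dict. The worker only ever touches its own copy, so the callback, running in the parent process, still sees the original value:

#!/usr/bin/python
import multiprocessing

state = {"flag": "unchanged"}

def job(x):
    # Runs in the worker process, on that process's copy of state.
    state["flag"] = "changed in worker"
    return x * 2

def done(result):
    # Runs in the parent process; the parent's state was never touched.
    print "result: " + str(result)
    print "flag in parent: " + state["flag"]   # still prints "unchanged"

if __name__ == '__main__':
    pool = multiprocessing.Pool(1)
    result = pool.apply_async(job, [21], callback=done)
    result.get()
    pool.close()
    pool.join()

The same applies to self in an instance method job like doAsync above: the worker gets a pickled copy, so any mutation it makes never travels back to the parent.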