Archive for July, 2007

Juggernaut is Not a Good Idea

July 31, 2007

This article has been moved: http://www.orbited.org/blog/2007/08/juggernaut-is-a-bad-idea/

Pyevent + GIL Woes

July 31, 2007

I have finally run into my first problem with CPython’s GIL. I’ve heard many people complain about it, but I always figured, “What’s the big deal?” After all, you can just create a multi-process application if you want to take advantage of multiple processors. In fact, the GIL is all that saves my Core Duo laptop when I play Galcon, which uses 100% of whatever processor it’s given. When I play on my Pentium M the computer becomes unusable. On the Core Duo I’ve still got a spare processor lying around that Python can’t get at.

But as for the problem at hand: I wrote a simple SCGI daemon using pyevent. In case you haven’t heard of pyevent, it’s a Python wrapper for the libevent network I/O library. Libevent is fast and scales very well. The daemon worked great, but the app I was writing needed database access, so I figured I’d create a thread pool and dispatch jobs via a worker queue. I coded it up and gave it a try.
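For readers who haven't built one, here is a minimal sketch of the thread pool and worker queue idea, using only the standard library in modern Python 3. It is not the daemon's actual code: `run_query`, `worker`, `start_pool`, and `stop_pool` are hypothetical names, and `run_query` is a stand-in for a real blocking database call.

```python
import queue
import threading

def run_query(sql):
    # Stand-in for a blocking database call (hypothetical helper).
    return "result of " + sql

def worker(jobs, results):
    # Pull jobs off the shared queue until the None sentinel arrives.
    while True:
        job = jobs.get()
        if job is None:
            break
        results.put((job, run_query(job)))

def start_pool(jobs, results, size=4):
    # Start `size` worker threads that all read from the same job queue.
    threads = [threading.Thread(target=worker, args=(jobs, results))
               for _ in range(size)]
    for t in threads:
        t.start()
    return threads

def stop_pool(jobs, threads):
    # Queue one sentinel per worker, then wait for every worker to exit.
    for _ in threads:
        jobs.put(None)
    for t in threads:
        t.join()
```

The I/O side would call `jobs.put(...)` from its request handler and pick finished work up from `results`. The sentinel approach is just one simple way to shut the pool down cleanly.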

Instead of the great async I/O, threaded-dispatch application that I expected, only the main thread actually ran. This seemed strange to me because the other threads all started up, but they just never got any CPU time. More perplexing was that after I shut down the main thread, the other threads suddenly started running, dispatching jobs from the work queue. The whole application shut down when they were done.

After a couple of discouraging hours, I found that I could get approximately the behaviour I wanted by setting a timeout event in pyevent so that time.sleep is called from the pyevent code. Sleeping releases the GIL, which gives the other threads a chance to run.

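import time
import event  # pyevent

def idle():
    time.sleep(0.01)  # pause the I/O thread so the worker threads get a turn
    return True  # a non-None return should make pyevent re-arm the timeout

event.timeout(0.02, idle)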

This works, but it isn’t ideal. Every 0.02 seconds the I/O thread gives way to the DB threads for 0.01 seconds, which means that I’ve manually set a priority between the two. I have no idea which will end up needing more CPU, so it will be very hard to choose a balance that works.

I plan on updating the code to use a second process for the database access. It’s probably a better idea anyway, because then I’ll be taking advantage of two processors.
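A rough sketch of what that second process could look like, written against the standard `multiprocessing` module in modern Python 3 (which did not exist when this was posted). Everything here is an assumption for illustration: `run_query`, `db_worker`, and `main` are hypothetical names, and `run_query` stands in for real database access.

```python
import multiprocessing

def run_query(sql):
    # Stand-in for a blocking database call (hypothetical helper).
    return "result of " + sql

def db_worker(requests, responses):
    # Runs in the child process: answer queries until the None sentinel arrives.
    while True:
        sql = requests.get()
        if sql is None:
            break
        responses.put((sql, run_query(sql)))

def main():
    # Parent side: one queue for requests, one for responses.
    requests = multiprocessing.Queue()
    responses = multiprocessing.Queue()
    proc = multiprocessing.Process(target=db_worker, args=(requests, responses))
    proc.start()
    requests.put("SELECT 1")
    print(responses.get())  # blocks until the child process answers
    requests.put(None)      # ask the child process to exit
    proc.join()
```

To try it, call `main()` from under an `if __name__ == "__main__":` guard in a script. Because the child has its own interpreter and its own GIL, the database work no longer competes with the I/O loop for it. In a real daemon the parent would not block on `responses.get()` but would poll for results from the event loop.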

Now I understand at least one way in which the GIL can be annoying. I started looking into updating the pyevent code to release the GIL, but I don’t know enough about the inner workings of libevent or Python at this point to make the appropriate modifications. Perhaps someone else has some insight.

Blog Started

July 18, 2007

I’ve decided that I encounter enough blog-worthy material each week that I should blog about it.