A Pool of Shotguns

A transparent connection pool for the Shotgun API in heavily threaded environments.

I have been working very heavily with Shotgun for the last several months, creating much deeper integrations between it and Western X's pipeline.

One of the things that bit me pretty early on is that the official Python API for Shotgun cannot make parallel requests.

Under most conditions this wasn't a big problem; the underlying connection would just serialize my threads' access to the Shotgun server, adding some tolerable latency. What was very irritating, however, was that a particular version of Python on OS X 10.6 would occasionally segfault during parallel requests. It took quite a few days of debugging Python in GDB (not a particularly easy prospect, especially since the problem was hard to reproduce) to isolate the problem to a bug in the ssl module's use of zlib to compress the request before sending it to the server.

(As an aside, I highly recommend turning max_rpc_attempts down from its default of 3 to 1. The automatic retries silently swallowed many of the exceptions that eventually made debugging this problem much easier. In general, I like to fail as early as possible.)

My first attempt at fixing this problem was to fork the API and create primitive threading isolation via threading.local; each thread has its own connection (see the diff on GitHub). This immediately stopped the segfaults, and the overall throughput was approximately proportional to the number of concurrent requests I made.
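To illustrate the per-thread isolation described above, here is a minimal sketch of the threading.local pattern. It is not the actual patch to the API: FakeConnection, PerThreadClient, and demo_owners are hypothetical names invented for this example, and no real Shotgun server is contacted.

```python
import threading


class FakeConnection(object):
    """Hypothetical stand-in for the connection the real API would hold."""

    def __init__(self):
        # Remember which thread created this connection.
        self.owner = threading.current_thread().name


class PerThreadClient(object):
    """Toy client that keeps one connection per thread via threading.local."""

    def __init__(self):
        self._local = threading.local()

    def _connection(self):
        # Each thread sees its own attributes on self._local, so a
        # connection is created the first time a given thread asks for one.
        conn = getattr(self._local, 'connection', None)
        if conn is None:
            conn = self._local.connection = FakeConnection()
        return conn

    def find(self, entity_type):
        # Pretend to issue a request; report which connection served it.
        return (entity_type, self._connection().owner)


def demo_owners(num_threads=3):
    """Run find() in several threads and return the connection owners."""
    client = PerThreadClient()
    owners = []

    def worker():
        owners.append(client.find('Task')[1])

    threads = [
        threading.Thread(target=worker, name='worker-%d' % i)
        for i in range(num_threads)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return sorted(owners)
```

Every worker ends up served by a connection it created itself, so no two threads ever share one.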

Unfortunately, this was grafted onto an API with other design considerations, and pre-existing conventions no longer made a lot of sense (e.g. does the close method close the current connection, or all of them?). Ultimately, Shotgun Software decided that this was not the way they wanted to fix the problem.

Since it wasn't maintainable for us to keep using an unsupported fork of an official API, I implemented a ThreadLocalShotgun (see it on GitHub), which mimicked the Shotgun interface, proxying attributes and methods to real Shotgun instances created on demand for each thread that used it.
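The proxying idea can be sketched roughly as follows. This is an illustration under assumptions, not the ThreadLocalShotgun source: FakeShotgun stands in for the real Shotgun class, ThreadLocalProxy is a hypothetical name, and the example URL is a placeholder.

```python
import threading


class FakeShotgun(object):
    """Hypothetical stand-in for the real Shotgun class."""

    def __init__(self, base_url):
        self.base_url = base_url
        self.created_in = threading.current_thread().name

    def find(self, entity_type, filters):
        # Report which instance (by creating thread) handled the call.
        return [{'type': entity_type, 'via': self.created_in}]


class ThreadLocalProxy(object):
    """Forward attribute access to a per-thread instance made on demand."""

    def __init__(self, factory):
        self._factory = factory
        self._local = threading.local()

    def _instance(self):
        try:
            return self._local.instance
        except AttributeError:
            # First use in this thread: build a fresh instance.
            self._local.instance = self._factory()
            return self._local.instance

    def __getattr__(self, name):
        # Only called when normal lookup on the proxy fails. Private names
        # are never forwarded, which also avoids infinite recursion if the
        # proxy's own attributes are missing.
        if name.startswith('_'):
            raise AttributeError(name)
        return getattr(self._instance(), name)
```

Because __getattr__ forwards both methods and plain attributes, calling code can treat the proxy as if it were a single Shotgun instance.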

Now I could use the new features of the API that have recently been rolled out, without having to reapply my patch with every update. There was still one more improvement to be made, however, since lots of little threads issuing only a single request would still incur significant overhead, as each one opened its own connection.

Ergo, I just completed my ShotgunPool, which still creates Shotgun instances on demand, but recycles them, leaving the connections to the server open for the next thread to use. Our most Shotgun-heavy UIs now run noticeably faster because they no longer reopen the connection for every request. Usage is pretty simple:

>>> # Construct and wrap a Shotgun instance.
>>> shotgun = Shotgun(...)
>>> shotgun = ShotgunPool(shotgun)
>>> 
>>> # Use it like normal, except in parallel.
>>> shotgun.find('Task', ...)

It is also much simpler than other thread/connection pools that I have made previously, requiring no locks, and only a minor amount of effort to make sure it does the right thing; even if it fumbles a Shotgun instance and does not "release" it, it will get garbage collected and another one created on demand.
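One way such a lock-free pool could look is sketched below. This is an assumption about the general approach, not the actual ShotgunPool code: PoolSketch and FakeShotgun are hypothetical names, every public attribute is treated as a method for simplicity, and the sketch relies on list.append and list.pop being atomic in CPython.

```python
class FakeShotgun(object):
    """Hypothetical stand-in for the real Shotgun class."""

    instances_created = 0

    def __init__(self):
        FakeShotgun.instances_created += 1

    def find(self, entity_type, filters):
        return [{'type': entity_type}]


class PoolSketch(object):
    """Recycle instances through a plain list, with no explicit locks."""

    def __init__(self, factory):
        self._factory = factory
        self._free = []  # idle instances waiting to be reused

    def _acquire(self):
        try:
            # list.pop is atomic in CPython, so two threads can never
            # receive the same idle instance.
            return self._free.pop()
        except IndexError:
            # Nothing idle: create a new instance on demand.
            return self._factory()

    def _release(self, instance):
        # If this is never reached for some instance, that instance is
        # simply garbage collected and a new one is created later.
        self._free.append(instance)

    def __getattr__(self, name):
        if name.startswith('_'):
            raise AttributeError(name)

        def call(*args, **kwargs):
            instance = self._acquire()
            try:
                return getattr(instance, name)(*args, **kwargs)
            finally:
                self._release(instance)

        return call
```

Sequential calls reuse a single instance, while concurrent calls each get their own; the pool grows only as large as the peak number of simultaneous requests.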

See the implementation on GitHub, read the docs, and please let me know if it helps you out at all.

Posted on February 20, 2013. Categories: Python / Shotgun