One simple user error that keeps cropping up is accidentally having multiple greenthreads read from the same socket at the same time. It’s an easy mistake to make: just create a shared resource that contains a socket and spawn at least two greenthreads that use it:
import eventlet
httplib2 = eventlet.import_patched('httplib2')
shared_resource = httplib2.Http()
def get_url():
    resp, content = shared_resource.request("http://eventlet.net")
    return content
p = eventlet.GreenPile()
p.spawn(get_url)
p.spawn(get_url)
results = list(p)
assert results[0] == results[1]
Running this with Eventlet 0.9.7 raises an httplib.IncompleteRead exception. That’s because both calls to get_url are divvying up the data from the socket between them, so neither gets the complete response. The IncompleteRead error is pretty hard to debug: nothing in it points at the shared socket, so you’ll have no idea why it’s happening, and you’ll be frustrated.
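To make the "divvying up" concrete, here is a rough sketch that uses only the standard library. It does not use Eventlet or a real socket: an in-memory stream stands in for the socket, and a round-robin loop stands in for the hub switching between two greenthreads. The names reader_a and reader_b and the 4-byte chunk size are made up for illustration.

```python
import io

# Stand-in for the bytes of one HTTP response arriving on a single socket.
body = b"0123456789ABCDEF"
shared_stream = io.BytesIO(body)

# Two hypothetical readers, each trying to collect the whole response.
received = {"reader_a": b"", "reader_b": b""}

# Alternate between the readers, roughly as cooperative greenthreads
# would take turns reading from the same file descriptor.
stream_exhausted = False
while not stream_exhausted:
    for name in ("reader_a", "reader_b"):
        chunk = shared_stream.read(4)
        if not chunk:
            stream_exhausted = True
            break
        received[name] += chunk

print(received["reader_a"])  # b'012389AB'
print(received["reader_b"])  # b'4567CDEF'
```

Each reader ends up with half of the bytes, and neither has a valid response. With a real HTTP client, that shortfall is the kind of thing that surfaces as an incomplete read.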
What’s new in the tip of Eventlet’s trunk is that Eventlet itself will warn you with a clear error message when you try to do this. If you run the above code with development Eventlet (see sidebar for instructions on how to get it) you now get this error instead:
RuntimeError: Second simultaneous read on fileno 3 detected. Unless you really know what you're doing, make sure that only one greenthread can read any particular socket. Consider using a pools.Pool. If you do know what you're doing and want to disable this error, call eventlet.debug.hub_multiple_reader_prevention(False)
Cool, huh? A little clearer about what exactly is going wrong here. And if you really want to do multiple readers or multiple writers on the same socket simultaneously, there’s a way to disable the protection.
Of course, the fix for this particular toy example is to give every greenthread its own instance of Http():
import eventlet
httplib2 = eventlet.import_patched('httplib2')
def get_url():
    resp, content = httplib2.Http().request("http://eventlet.net")
    return content
p = eventlet.GreenPile()
p.spawn(get_url)
p.spawn(get_url)
results = list(p)
assert results[0] == results[1]
But you probably created that shared_resource because you wanted to reuse Http() instances between requests. So you need some other way to share connections. This is what pools.Pool objects are for! Use them like this:
from __future__ import with_statement
import eventlet
from eventlet import pools
httplib2 = eventlet.import_patched('httplib2')
httppool = pools.Pool()
httppool.create = httplib2.Http
def get_url():
    with httppool.item() as http:
        resp, content = http.request("http://eventlet.net")
        return content
p = eventlet.GreenPile()
p.spawn(get_url)
p.spawn(get_url)
results = list(p)
assert results[0] == results[1]
The Pool class will guarantee that the Http instances are reused if possible, and that only one greenthread can access each at a time. If you’re looking for somewhat more advanced usage of this design pattern, take a look at the source code to Heroshi, a concurrent web crawler written on top of Eventlet.
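If it helps to see the idea behind the pool stripped down, here is a toy sketch in plain Python. It is not Eventlet's implementation: ToyPool and FakeConnection are hypothetical names invented for this example, and the sketch leaves out things a real pool needs, such as a size limit and making callers wait when every item is checked out.

```python
import contextlib


class ToyPool:
    """Toy illustration only; this is NOT how eventlet.pools.Pool is written."""

    def __init__(self, create):
        self._create = create  # factory called when no free item exists
        self._free = []        # items that are not checked out right now
        self.created = 0       # how many items the factory has produced

    @contextlib.contextmanager
    def item(self):
        # Reuse a free item if there is one, otherwise make a new one.
        if self._free:
            obj = self._free.pop()
        else:
            obj = self._create()
            self.created += 1
        try:
            yield obj
        finally:
            # Hand the item back so that a later caller can reuse it.
            self._free.append(obj)


class FakeConnection:
    """Hypothetical stand-in for something like an Http() instance."""


pool = ToyPool(FakeConnection)

# Two overlapping users never share an item...
with pool.item() as first:
    with pool.item() as second:
        distinct_while_busy = first is not second

# ...but once an item is returned, the next user gets it back.
with pool.item() as third:
    reused_after_return = third is first or third is second

print(distinct_while_busy, reused_after_return, pool.created)  # True True 2
```

The two properties to notice are the same ones described above: an item is never handed to two users at once, and returned items are reused rather than recreated.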
