
Gunicorn

Contents

  • Performance
    • Constant Response
    • I/O Bound
    • CPU Bound
  • Alternatives
  • Notes
  • See also
  • Sources

|                     | sync | Async Workers (gevent, eventlet) | tornado | AsyncIO Workers (gaiohttp, gthread) |
|---------------------|------|----------------------------------|---------|-------------------------------------|
| core characteristic | one process per request; blocks other requests until it is finished | based on the Greenlet library; program-level threads | provides an async, non-blocking I/O design model; ideal for handling long requests | gthread: one thread per request |
| workload            | no long I/O, no heavy CPU, no requests to external pages | | | |
| URL                 |      | http://www.gevent.org/, https://eventlet.net/ | | https://aiohttp.readthedocs.io/en/stable/ |
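
For orientation, the worker type is chosen when Gunicorn is started. A minimal sketch of a config file covering the settings varied in the benchmarks below (the file name gunicorn_conf.py is my choice; it would be passed as gunicorn -c gunicorn_conf.py app:app):

# gunicorn_conf.py - a sketch, not an exact reproduction of my setup
worker_class = "gevent"    # "sync", "gevent", "eventlet", "tornado" or "gthread"
workers = 4                # number of worker processes
worker_connections = 1000  # simultaneous clients per worker (async workers only)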

Performance

Performance can be measured in at least three ways:

  • throughput (requests per second),
  • latency (minimum),
  • latency (average)

To get some numbers, I use Locust with a hatch rate chosen so that all users are hatched within 10s. I use the following locustfile.py:

from locust import HttpLocust, TaskSet, task


class WebsiteTasks(TaskSet):
    @task
    def index(self):
        self.client.get("/")


class WebsiteUser(HttpLocust):
    task_set = WebsiteTasks
    min_wait = 5000  # each simulated user waits 5-15s between tasks
    max_wait = 15000  # (values in milliseconds)
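
Locust is then started with something like locust -f locustfile.py --host=http://localhost:8000 --no-web -c 10000 -r 1000 (Locust 0.x flags; -r 1000 hatches the 10,000 users within 10s; the exact numbers vary per run).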

I had to execute ulimit -S -n 10240 in order to test with 10,000 users. Otherwise I got:

[2019-05-26 12:25:03,499] pc07/ERROR/stderr: OSError: [Errno 16] Device or resource busy
[2019-05-26 12:25:03,499] pc07/ERROR/stderr:
[2019-05-26 12:25:03,499] pc07/ERROR/stderr: 2019-05-26T10:25:03Z
[2019-05-26 12:25:03,499] pc07/ERROR/stderr:
[2019-05-26 12:25:03,499] pc07/ERROR/stderr: (<ThreadPool at 0x7fc2e275d4a8 0/10/10 hub=<Hub at 0x7fc2e7c84780 thread_ident=0x140475181328192>>, <built-in function getaddrinfo>) failed with OSError
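
The error comes from the per-process limit on open file descriptors. As an alternative to the shell command, the soft limit can also be raised from the test script itself; a minimal sketch using the standard-library resource module (Unix only; an equivalent approach, not what I actually ran):

import resource

# Raise the soft file-descriptor limit to 10240, the equivalent of
# `ulimit -S -n 10240`; raises ValueError if 10240 exceeds the hard limit.
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
resource.setrlimit(resource.RLIMIT_NOFILE, (10240, hard))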

Constant Response

The following app.py is probably the simplest one possible.

from flask import Flask

app = Flask(__name__)


@app.route("/")
def hello():
    return "hello world!"

Let's first have a look at the median response time by the number of users:

| Users  | Development Server | gevent (1 worker, 1000 connections) | gevent (4 workers, 1000 connections) | gevent (12 workers, 1000 connections) | sync (4 workers) | gthread (4 workers) |
|--------|--------------------|-------------------------------------|--------------------------------------|---------------------------------------|------------------|---------------------|
| 10     | 8ms-9ms; 1 Req/s   | 8ms-9ms; 1 Req/s   | 8ms-11ms; 1 Req/s   | 6ms-10ms; 1 Req/s   | 7ms-9ms; 1 Req/s   | 6ms-12ms; 1 Req/s   |
| 100    | 10ms; 10 Req/s     | 8ms-9ms; 10 Req/s  | 8ms-9ms; 10 Req/s   | 8ms-9ms; 10 Req/s   | 7ms-9ms; 10 Req/s  | 8ms-12ms; 10 Req/s  |
| 1000   | 12ms-30ms; 100 Req/s | 8ms-12ms; 100 Req/s | 7ms-10ms; 100 Req/s | 5ms-12ms; 100 Req/s | 6ms-8ms; 100 Req/s | 8ms-11ms; 100 Req/s |
| 10,000 | 150ms-34,000ms; 170-270 Req/s | 8100ms-41,000ms; 220-440 Req/s | 340ms-11,000ms; 200-625 Req/s | 150ms-96,000ms; 130-410 Req/s | 81ms-101,000ms; 220-750 Req/s | 350ms-93,000ms; 270-533 Req/s |

You can clearly see that there is no real difference up to 100 users. At 1000 users, the development server gets slow; gevent does not care. At 10,000 users, gevent gets slow as well. But the real difference was the 95th-percentile response time. It varies over time. A lot.

I/O Bound

Let's say our workload is I/O bound. Maybe we need to call another API before we can answer. To simulate this, I let each request sleep for 1s before it answers:

import time

from flask import Flask

app = Flask(__name__)


@app.route("/")
def hello():
    time.sleep(1)
    return "hello world!"

| Users  | Development Server | gevent (1 worker, 1000 connections) | gevent (4 workers, 1000 connections) | gevent (12 workers, 1000 connections) | sync (4 workers) | gthread (4 workers) |
|--------|--------------------|-------------------------------------|--------------------------------------|---------------------------------------|------------------|---------------------|
| 1000   | 1000ms; 1000-1200ms | 1000ms; 1000-1100ms | 1000ms; 1000ms      | 1000ms; 1s-1.1s     | 32s-37s; 35s-97s   | Fails |
| 10,000 | 3.2s-8.4s; 5.2s-76s | 2.4s-8.5s; 28s-132s | 2.8s-9.7s; 16s-290s | 1.8s-11s; 6.7s-150s | 29s-131s; 45s-183s | Fails |

CPU Bound

Let's say our workload is CPU bound. Note that time.sleep would not simulate this, since sleeping does not occupy the CPU; instead, the handler busy-waits for about one second:

import time

from flask import Flask

app = Flask(__name__)


@app.route("/")
def hello():
    # Busy-wait instead of sleeping to keep the CPU occupied for ~1s
    end = time.time() + 1
    while time.time() < end:
        pass
    return "hello world!"

Alternatives

  • uWSGI (docs) is both a protocol and an application server; the application server can serve the uWSGI, FastCGI, and HTTP protocols. It seems to be commonly used together with gevent.
  • Tornado: a scalable, non-blocking web server and web application framework

While I searched, I also found:

  • Twisted Web
  • CherryPy
  • Waitress
  • Chaussette
  • meinheld: a high performance asynchronous WSGI Web Server (based on picoev)

Notes

  • gevent_pywsgi and gevent_wsgi are the same. [1]
  • gevent uses the gunicorn event parser, while gevent_wsgi uses the event parser within itself. [1]
  • If you need to handle a high volume of concurrent requests and your application performs a lot of waiting on I/O (database, streaming responses, upstream requests, etc.), then gevent can be a good choice. [1]
  • I didn't include Sanic, because it is less tested than Flask. [2]

See also

  • 2018-01-30: Brief introduction about the types of worker in gunicorn and respective suitable scenario
  • 2015-01-20: Gunicorn Worker Types
  • 2017-01-17: Reddit-comment by desmoulinmichel
  • 2012-09-12: Benchmark uWSGI vs gunicorn for async workers by Ivan Dyedov

Sources


  1. Randall Leeds: GitHub, 2019-03-04

Published

May 24, 2019
by Martin Thoma

Category

Code

Tags

  • Flask
  • gunicorn
