Methodology
To try to make this as realistic as possible, I took an actual page from a
website and made it portable across both MySQL and Postgres. This basically
meant replacing all mysql_query() calls with pg_exec(). This page involves
a lot of selects and joins, as probably most pages on a typical website do.
website and made it portable across both MySQL and Postgres. This basically
meant replacing all mysql_query() calls with pg_exec(). This page involves
a lot of selects and joins, as probably most pages on a typical website do.
Once the test page was up and debugged, I then ran “ab”, the “Apache Benchmarking”
utility, from my workstation across my 100-mbit LAN to the quad-xeon machine. To
get an idea of scalability under load, I varied the “concurrent connections”
on ab from 10-120, while leaving the number of page views steady at 1000.
utility, from my workstation across my 100-mbit LAN to the quad-xeon machine. To
get an idea of scalability under load, I varied the “concurrent connections”
on ab from 10-120, while leaving the number of page views steady at 1000.
To more closely simulate real-world use, I set up a random-number generator in
the script that inserts a row into the database on 10% of the page views. My
own numbers on PHPBuilder show that about 10% of all pages in the discussion
forums are for posting new messages.
the script that inserts a row into the database on 10% of the page views. My
own numbers on PHPBuilder show that about 10% of all pages in the discussion
forums are for posting new messages.
Further, as mentioned above, I used real data from a production database. You
can’t get a whole lot more realistic than this scenario.
can’t get a whole lot more realistic than this scenario.
The Numbers
The most interesting thing about my test results was to see how much of a load
Postgres could withstand before giving any errors. In fact, Postgres seemed to
scale 3 times higher than MySQL before giving any errors at all. MySQL
begins collapsing at about 40-50 concurrent connections, whereas Postgres
handily scaled to 120 before balking. My guess is, that Postgres could have gone
far past 120 connections with enough memory and CPU.
Postgres could withstand before giving any errors. In fact, Postgres seemed to
scale 3 times higher than MySQL before giving any errors at all. MySQL
begins collapsing at about 40-50 concurrent connections, whereas Postgres
handily scaled to 120 before balking. My guess is, that Postgres could have gone
far past 120 connections with enough memory and CPU.
On the surface, this can appear to be a huge win for Postgres, but if you
look at the results in more detail, you’ll see that Postgres took up to 2-3
times longer to generate each page, so it needs to scale 2-3 times higher
just to break even with MySQL. So in terms of max numbers of pages generated
concurrently without giving errors, it’s pretty much a dead heat between
the two databases. In terms of generating one page at a time, MySQL does it
up to 2-3 times faster.
look at the results in more detail, you’ll see that Postgres took up to 2-3
times longer to generate each page, so it needs to scale 2-3 times higher
just to break even with MySQL. So in terms of max numbers of pages generated
concurrently without giving errors, it’s pretty much a dead heat between
the two databases. In terms of generating one page at a time, MySQL does it
up to 2-3 times faster.
Another interesting point was that MySQL crumbles faster in the “10%
insert” test described above. Research reveals that MySQL locks the entire
table when an insert occurs, while Postgres has a pretty nifty “better than
row-level locking” feature. This difference quickly causes MySQL to pile up concurrent
connections and thus collapse. The same is true if you are doing a large select
out of a database while another process is inserting into that table. Postgres
is completely unfazed, while MySQL piles up connections until it falls apart
like a house of cards.
insert” test described above. Research reveals that MySQL locks the entire
table when an insert occurs, while Postgres has a pretty nifty “better than
row-level locking” feature. This difference quickly causes MySQL to pile up concurrent
connections and thus collapse. The same is true if you are doing a large select
out of a database while another process is inserting into that table. Postgres
is completely unfazed, while MySQL piles up connections until it falls apart
like a house of cards.
For those of you wondering about persistent connections in PHP, they don’t
appear to benefit MySQL that much, whereas they are a clear boon for Postgres.
In fact, Postgres benchmarked as much as 30% faster just by using persistent
connections. That tells me that Postgres has a tremendous amount of overhead in
its connection-opening and authentication process. Some of this may be the
fault of Linux and its relatively lame process scheduler. Still, MySQL on the
same box beat it handily no matter how you look at it.
appear to benefit MySQL that much, whereas they are a clear boon for Postgres.
In fact, Postgres benchmarked as much as 30% faster just by using persistent
connections. That tells me that Postgres has a tremendous amount of overhead in
its connection-opening and authentication process. Some of this may be the
fault of Linux and its relatively lame process scheduler. Still, MySQL on the
same box beat it handily no matter how you look at it.