Fiam has posted a slightly controversial piece titled The Limits of Django. It is a great read, I totally enjoyed it. However, he has come up with some very shocking stats of Django ORM approach vs. C caching daemon (93 seconds vs. 0.8 second to query a database of 500,000 votes of 1,000 users on 5,000 stories. A few have suggested that his database/data model maybe to blame, I think there’s a great degree of truth to that. To try it out, I have created a sample database called django_test and populated it with a bunch of data like so:
-- users
create table users (id integer not null primary key);
-- stories
create table stories (id integer not null primary key);
-- votes
CREATE TABLE votes (
story_id integer not null
constraint story_id_fk references stories,
user_id integer not null
constraint user_id_fk references users,
score integer default 0);
create index user_story_idx on votes (story_id, user_id);
Populating with some random data:
#!/bin/bash
PSQL="/opt/local/bin/psql8 -q"
for ((i=1;i<=50000;i+=1))
do
�$PSQL -c "insert into stories (id) values ($i);" django_test
done
for ((i=9922;i<=10000;i+=1))
do
$PSQL -c "insert into stories (id) values ($i);" django_test
done
for ((i=1;i<=500000;i+=1))
do
score=$RANDOM
userid=$RANDOM
storyid=$RANDOM
let "score %= 100"
let "userid %= 1000"
let "storyid %= 10000"
$PSQL -c "insert into votes (user_id,story_id,score) values ($userid,$storyid,$score)" django_test
done
(Yes, this is very-very crude, I know)
Here are my results on PostgreSQL 8.1.3, running on a G4 PowerBook with 2GB of RAM under Mac OS X 10.5.2:
django_test=# select count(*) from users; select count (*) from stories; select count(*) from votes; count ------- 1000 (1 row) count ------- 10000 (1 row) count ------- 34625 (1 row)
I gave up waiting for population of 500K rows into votes table and run a few selects like the following one as it were (inserts were still running on background and I have not done VACUUM ANALYSE yet):
django_test=# timing
Timing is on.
django_test=# select distinct(story_id), score from votes where votes.user_id=123 order by votes.score desc;
story_id | score
----------+-------
3391 | 95
812 | 94
9215 | 93
9599 | 92
8510 | 85
220 | 84
8611 | 81
740 | 78
9193 | 74
2406 | 70
2201 | 61
2740 | 59
4500 | 52
5993 | 52
5692 | 51
2912 | 49
6566 | 49
2950 | 48
3149 | 47
36 | 45
1193 | 43
629 | 38
5720 | 36
9429 | 35
455 | 33
6628 | 32
2598 | 30
2623 | 28
2224 | 24
6794 | 24
2254 | 20
934 | 8
6267 | 8
3999 | 7
5879 | 3
(35 rows)
Time: 27.376 ms
That's less than 0.03 seconds. Not bad, eh?