is this for real? (import speed)

PostgreSQL is the database server used to store information. Do you have a question or are you having problem with PostgreSQL? If so, post them here.

Moderator: Moderators

is this for real? (import speed)

Postby almaplayera » Thu Nov 19, 2009 10:41 am

Image

92 hours and counting. this is on a server box running only PT3 & postgres with 2GB RAM and 15k rpm disk.

5 million hands is not a lot of hands! i want to import 50 million hands...like...regularly... i need to be able to deal with databases this size without so much waiting around.

i've seen the performance tuning stuff and i'll try to tweak some of the settings, but i imagine they have a minor impact. is anyone able to import millions of hands in a matter of hours rather than a matter of days/weeks? if so, you must be doing something fundamentally different than me, and i'd love to know what it is.
almaplayera
 
Posts: 13
Joined: Sat Nov 14, 2009 4:28 am

Re: is this for real? (import speed)

Postby kraada » Thu Nov 19, 2009 10:44 am

The last time I ran a manual import of 2M hands clustering took between 5 and 6 hours. I don't think my computer's specs are all that impressive, so there must be something else slowing things down for you. What sort of anti-virus/anti-spyware/firewall software are you using?
kraada
Moderator
 
Posts: 54431
Joined: Wed Mar 05, 2008 2:32 am
Location: NY

Re: is this for real? (import speed)

Postby almaplayera » Thu Nov 19, 2009 10:46 am

no firewall, a/v, or any other software at all is installed. windows firewall is enabled... PT3 and postgresql are the only apps installed.
almaplayera
 
Posts: 13
Joined: Sat Nov 14, 2009 4:28 am

Re: is this for real? (import speed)

Postby kraada » Thu Nov 19, 2009 11:24 am

What version of PostgreSQL do you have installed? Also, do you still see files being created/edited/changed in the C:\Program Files\PostgreSQL\8.3\data\base directory?
kraada
Moderator
 
Posts: 54431
Joined: Wed Mar 05, 2008 2:32 am
Location: NY

Re: is this for real? (import speed)

Postby almaplayera » Thu Nov 19, 2009 11:37 am

postgresql v8.4.1, and yes there is still plenty of write activity to that directory
almaplayera
 
Posts: 13
Joined: Sat Nov 14, 2009 4:28 am

Re: is this for real? (import speed)

Postby kraada » Thu Nov 19, 2009 12:01 pm

I'll look to see if I can find anything different about the 8.4.x cluster mechanism; my tests were using 8.3.x.

If there's still write activity, it's still really clustering so definitely let it finish. Oh, how much hard drive space do you have free?
kraada
Moderator
 
Posts: 54431
Joined: Wed Mar 05, 2008 2:32 am
Location: NY

Re: is this for real? (import speed)

Postby almaplayera » Thu Nov 19, 2009 12:58 pm

the size of the db is about 40GB (btw, is that the expected size for a db w/ 7-8 million hands?). until a couple hours ago, there was only about 20gb of free disk space. i moved some files and now there is 44GB of free space.

i've read that not having enough space to make a complete copy of the db can slow clustering. but the import itself took 30+ hours for 4 million hands (+ 1 million duplicates). and even if the cluster was twice as fast with more free disk space, that means the whole process would be 60+ hours. and it still has to do vacumn, analyze, and cache update. supposedly the cache update can take as long as the cluster. i mean, this is ridiculous. i'm looking at 100+ hours to import 5 million hands? i need to be able to import 50 million in 5 hours.
almaplayera
 
Posts: 13
Joined: Sat Nov 14, 2009 4:28 am

Re: is this for real? (import speed)

Postby kraada » Thu Nov 19, 2009 1:39 pm

That wouldn't have been enough free space and certainly slowed things down for you.

Updating the cache should not take as long as cluster; it takes about the same amount of time on smaller databases, as databases get larger updating the cache is still the second longest function but it shouldn't take nearly 30 hours (but expect it to take a few hours).

We are still investigating what changes were made in the 8.4 revision of PostgreSQL to see if we can streamline anything on our end; last I heard there was nothing particularly useful there.

One thought for you, in terms of performance: If you make the move to SSDs, you won't need to run cluster at all. That would save you a ton of time, though SSDs are not cheap. To do back-of-the-envelope calculations, if your hands are full ring, 1M hands = 10G on disk. so 50M hands would be 500G on disk - that would be quite a lot of space in SSDs, though prices are getting cheaper all of the time.
kraada
Moderator
 
Posts: 54431
Joined: Wed Mar 05, 2008 2:32 am
Location: NY

Re: is this for real? (import speed)

Postby almaplayera » Thu Nov 19, 2009 2:00 pm

thing is, the 15k disk i'm running on now performs way better than a 7200RPM disk. it's not as fast as SSD, but it's not that far away.

forget 50 million, just looking at 5 million: you said clustering 2M took 6 hours for you on an average PC. so i assume the import was about the same, and the cache update was maybe 2-3 hours? so a total of 15 hours for 2 million. so on an average PC, users should expect importing & housekeeping for 5 million hands to take upwards of 40 hours?
almaplayera
 
Posts: 13
Joined: Sat Nov 14, 2009 4:28 am

Re: is this for real? (import speed)

Postby kraada » Thu Nov 19, 2009 2:14 pm

No, I certainly wouldn't expect that.

But you already admitted that there was not enough free space on the disk, which distinctly slows things down by a large margin. Now that you've freed the space up things should go more quickly. Also, once the database has been clustered the benefit from the cluster will remain for future clusterings, so the next one wouldn't take nearly as long. The other normal reasons for slower results don't seem to apply (e.g. having an antivirus read every file that you open), and we certainly haven't changed anything; cluster is a PostgreSQL process, not a PT3 process.
kraada
Moderator
 
Posts: 54431
Joined: Wed Mar 05, 2008 2:32 am
Location: NY

Next

Return to PostgreSQL [Read Only]

Who is online

Users browsing this forum: No registered users and 87 guests

cron
highfalutin