is this for real? (import speed)

PostgreSQL is the database server used to store information. Do you have a question or are you having problem with PostgreSQL? If so, post them here.

Moderator: Moderators

Re: is this for real? (import speed)

Postby almaplayera » Thu Nov 19, 2009 3:33 pm

i'm referring to the case of a user with a computer similar to yours, with plenty of disk space and no other problems. you said clustering 2M took you 6 hours. i presume the import itself took about the same. and the cache update / vacumn / analyze probably 2-3 hours. so for 2M hands the total import & housekeeping took about 15 hours, right? so for 5M hands on that same PC, you would expect about 2.5x that time = 40+ hours, no?
almaplayera
 
Posts: 13
Joined: Sat Nov 14, 2009 4:28 am

Re: is this for real? (import speed)

Postby kraada » Thu Nov 19, 2009 3:39 pm

Let me quote so we have the numbers actually correct:

For the large import, a sample of 1,322,364 hands was used split across 8,946 files. All hands were taken from .25/.50 no limit full ring tables played at Full Tilt Poker. PT3 took 4 hours and 21 minutes to complete the import, for an import time of 84.4 hands per second. The full housekeeping took an additional 1 hour and 31 minutes. The total import and housekeeping time for PT3 was 5 hours and 52 minutes, or 62.6 hands per second. At the end of import and housekeeping the PT3 database was 11G.


I didn't write it down but of that housekeeping I believe something like 50 minutes was cluster and 35 minutes was updating the cache.

Assuming things progressed linearly, 1.3M hands = ~6hrs, so 5M hands is roughly 24 hours. But I did not do enough testing to be certain that the time increase is strictly linear. Thus 40 hours at the outside would not be outside the realm of possibility.
kraada
Moderator
 
Posts: 54431
Joined: Wed Mar 05, 2008 2:32 am
Location: NY

Re: is this for real? (import speed)

Postby tpb221 » Fri Nov 20, 2009 2:28 am

Don't know if this will help or not. I've been trying to get 8.4 to work on win7 x64-still having trouble with the config file,but that's a different story. Anyways, I did house cleaning on a small DB before I left for work and when I came home 10 hrs later it was still running. Way to long for the size of the DB. When looking around to see what was wrong I opened the resource monitor and saw one of the postgres exe's status was 'terminated'. After I restarted, everything worked fine-db was done under a hr. Might want to check there. Haven't had the problem since.
tpb221
 
Posts: 14
Joined: Thu May 15, 2008 9:46 pm

Re: is this for real? (import speed)

Postby Debith » Tue Nov 24, 2009 12:13 pm

Hi!

I've got similar problem here.

My setup is:
server based on OpenSolaris and PostgreSQL 8.2 installed in it. There are 2GB of memory in that machine and 6 hard drives serialized for space.
I've made few optimizations for server (increased memory usage).
Connection is made through 100Mb switch, which limits the speed a bit, but...

Computer 1:
I've had clustering running now for 75 hours and it is still going. The hand throughput over the network during the import connection was ~40h/s, which is pretty low. Processor time and network bandwidth are still barely used at all. Database size currently is 100GB.

Computer 2:
Not able to import hands to db while computer 1 is importing hands (not clustering).

My questions:
How long I have to wait?
Is this clustering even needed, when dedicated server keeps the database?
How can I control, how much processor time and network bandwidth Computer 1 should use for clustering and hand import. (Only few % of both are used and practically they could be on maximum, where network bandwidth would be the limiting factor).
What would be optimal settings (for maximum client performance) in dedicated Postgres server for both kernel and postgres service?
Debith
 
Posts: 24
Joined: Tue Mar 04, 2008 2:31 pm

Re: is this for real? (import speed)

Postby kraada » Tue Nov 24, 2009 1:20 pm

Clustering is a PostgreSQL database process. For more detail on clustering, see here. Clustering your database will improve seek time.

Is there a reason you went with PostgreSQL 8.2 rather than 8.3?
kraada
Moderator
 
Posts: 54431
Joined: Wed Mar 05, 2008 2:32 am
Location: NY

Re: is this for real? (import speed)

Postby Debith » Tue Nov 24, 2009 2:51 pm

kraada wrote:Clustering is a PostgreSQL database process. For more detail on clustering, see here. Clustering your database will improve seek time.

Is there a reason you went with PostgreSQL 8.2 rather than 8.3?

Yes, accident. :)
Debith
 
Posts: 24
Joined: Tue Mar 04, 2008 2:31 pm

Re: is this for real? (import speed)

Postby kraada » Tue Nov 24, 2009 3:13 pm

8.3 is going to be faster than 8.2, which is why I ask. From what I recall when 8.3 was newish, the speed boost was roughly 15%. We haven't done any testing on 8.4 yet though so I can't say for certain whether 8.4 offers any improvements.
kraada
Moderator
 
Posts: 54431
Joined: Wed Mar 05, 2008 2:32 am
Location: NY

Re: is this for real? (import speed)

Postby Debith » Tue Nov 24, 2009 4:51 pm

kraada wrote:8.3 is going to be faster than 8.2, which is why I ask. From what I recall when 8.3 was newish, the speed boost was roughly 15%. We haven't done any testing on 8.4 yet though so I can't say for certain whether 8.4 offers any improvements.

Maybe, but still 15% would be far from enough in terms of improvement.
By making those optimizations, to which are also links in your forums, I gained something like 10x speed improvement. I'm just wondering, would it be possible to get the speed over the web to something like 200+h/s. Certainly, if even one core would be fully used in computer 1, the speed would be good. But the CPU is barely utilized at the moment not to mention the throughput from computer 1 to server.
Debith
 
Posts: 24
Joined: Tue Mar 04, 2008 2:31 pm

Re: is this for real? (import speed)

Postby kraada » Tue Nov 24, 2009 5:05 pm

The real bottleneck is the access time from the hard drive; people with RAID arrays have been known to get speeds in excess of 300h/sec.
kraada
Moderator
 
Posts: 54431
Joined: Wed Mar 05, 2008 2:32 am
Location: NY

Re: is this for real? (import speed)

Postby Debith » Tue Nov 24, 2009 5:17 pm

kraada wrote:The real bottleneck is the access time from the hard drive; people with RAID arrays have been known to get speeds in excess of 300h/sec.

You lost me. :)

Server machine consists array 5 hard-drives, which is equivalent to RAID (but much better). Is it the server who does the clustering work, or the client? In my case, computer 2 has issued the clustering job to server. Of course it makes sense, that server would do all the work, but postgres server is using mere 1% of cpu power on server computer (which is dual-core). If this is the case, that server should be my worker, how can I tell it to start using those resources it has? :)

Thanks!
Debith
 
Posts: 24
Joined: Tue Mar 04, 2008 2:31 pm

PreviousNext

Return to PostgreSQL [Read Only]

Who is online

Users browsing this forum: No registered users and 26 guests

cron