converting multiple databases

Experiencing technical difficulties? Think you've found a problem with PokerTracker 3? Report it here.

Moderator: Moderators

Re: converting multiple databases

Postby Hercules » Thu Feb 21, 2008 10:59 am

[img7hj]http://img220.imageshack.us/img220/8821/pt3b4wx4.jpg[/img7hj]

1 more database to go. total hands should be >28,5M when it's done.

Clustering took over 7 hours when the database was 125GB (23M hands).
The database is 150GB now, so next clustering will probably 9 hours or something like that.
I'll make a backup when it's all done and post my observations.
Admins can get 24/7 access to the database by that time as well.
Hercules
 
Posts: 78
Joined: Fri Jan 25, 2008 2:38 pm

Re: converting multiple databases

Postby Josh » Thu Feb 21, 2008 12:23 pm

That's averaged out to about 5.5 GB per million hands. That's not too bad. If you don't need the observed hand histories (for replaying), you can delete them and it would probably drop the usage down to 4.5 GB or less per million hands.

It will be interesting to see PT3's retrieval speed when operating over the internet.
Josh
Site Admin
 
Posts: 6222
Joined: Sat Dec 08, 2007 6:03 pm

Re: converting multiple databases

Postby Josh » Thu Feb 21, 2008 12:31 pm

Speaking of database size, one of our competitors mentioned in a thread on a different forum that their database is much smaller than ours (1/3 in fact). This competitor also mentioned that he used "completely novel ideas to make it smaller and more efficient", but that "made it more complex." So basically, it looks like 3rd party developers are going to have to jump through hoops to add support for that product. The PT3 database is designed for fast imports, faster retrieves, and ease of use. 3rd party developers will have no problems using our well documented, easy to use database structure. Unfortunately, that means the database will be a little bit larger. Though I think the number he was quoting from came from earlier thoughts that PT3 used 10 GB per million hands.

Hercules, are you using PostgreSQL 8.2.5 or 8.3?
Josh
Site Admin
 
Posts: 6222
Joined: Sat Dec 08, 2007 6:03 pm

Re: converting multiple databases

Postby Hercules » Thu Feb 21, 2008 2:04 pm

Last week, when I accessed PT3 over the internet, I had the impression that it was almost at the same speed as over the local LAN here. But it had only 6 or 7 million hands at that time.
Unfortunately, the guy where I tried it isn't around this weekend, so I already decided to go find an unprotected WLAN and try it in a few hours. :)

I just noticed something funny from the 1st database I converted... It converted 63,524 hands there.
According to PT2, there are only 62,185 hands in that database, and I do remember purging 1300-ish hands from that PT2-database a while ago.

BTW: I am using PostgreSQL 8.3.
Hercules
 
Posts: 78
Joined: Fri Jan 25, 2008 2:38 pm

Re: converting multiple databases

Postby Josh » Thu Feb 21, 2008 2:23 pm

PT2 might not have removed the hand_histories records when it purged the hands.

Since the database server does most of the work for most of the queries, I can see how the speed would be similar. For queries that return larger result sets (like say, viewing thousands upon thousands of hands in the "Final Hands" details report), the speed may be reduced significantly, depending on your connection speed.
Josh
Site Admin
 
Posts: 6222
Joined: Sat Dec 08, 2007 6:03 pm

Re: converting multiple databases

Postby Hercules » Fri Feb 22, 2008 4:02 pm

Wheeee. Converting is finished.

28,910,224 hands are in the database, with 163,373 players.
The PT3-database is 155.34 GB, the old PT2-databases were 102 GB.

Converting started at 175 hands/second on this machine. During the last database it still managed to do 100 hands/second. To compare: PT2 did 25 hands/second here in an empty database, and 15 hands/second in a 1 million hand-database. Overall, PT3 seems to have a 600% speed increase. :)


Sometimes I ran a query before and after converting a database. I noticed a few times that exactly 100 hands were missing in the new database.

The error I reported in this thread last week happened several times. According to the hand-counts in the database, the hands were normally imported, while the error said the hands were not imported.

There are a lot of nicks which appear a few time in the playerlist. Some of them even 10-12 times. As far as I can tell, this only happens to nicks with at least 1 non-alphabetical character in it.

Like I said yesterday, PT3 finds hands that were purged in PT2. This is a cool feature. :)

When you try to look at all the tabs of a player with a LOT of hands (like 300,000), the "vs Player"-tab is sorta worthless. It can take a lot of time before all the data is found in the database.


Is there a different approach to clustering now, compared to the 1st time I attempted to build a huge kickass database?
I tried it with PT3beta3 and PostgreSQL 8.2.x the 1st time, and clustering temporarily blew up the database to twice it's size. Now, with PT3beta4 and PostgreSQL 8.3, it appears that clustering is done in chunks. This means PostgreSQL doesn't need all the diskspace it needed before. I don't think my database was ever over 200 GB while clustering.

The database is big, but looking up players is still quite fast. For players with >200K hands things get slower, but it's still acceptable. Just don't try to get info from the "vs Player"-tab. That's the only tab which is very slow.

Last time I clustered the database, it needed 9 hours. This is a lot.
I had PT3 convert a database a few days ago before I went to bed, and I had "cluster on completion" checked. When I woke up, it was all finished. After that, I had it convert a smaller database which would be finished before I had to leave for work. Unfortunately I forgot to uncheck the cluster checkbox. Clustering in progress clustering can't/shouldn't be stopped, so this meant that the database was basically going to be idle untill I got home from work. Such a waste of time :) maybe something needs to be done with the "Cluster on completion" checkbox.

Today I tried creating a backup in PostgreSQL. This failed several times. Within 10 minutes the backup was aborted:
[code4vk]pg_dump: reading indexes for table "settings"
pg_dump: reading constraints
pg_dump: reading triggers
pg_dump: reading dependency data
pg_dump: saving encoding = SQL_ASCII
pg_dump: saving standard_conforming_strings = off
pg_dump: saving database definition
pg_dump: dumping contents of table holdem_hand_histories
pg_dump: Dumping the contents of table "holdem_hand_histories" failed: PQgetCopyData() failed.
pg_dump: Error message from server: could not receive data from server: No buffer space available (0x00002747/10055)
pg_dump: The command was: COPY public.holdem_hand_histories (id_hand, history) TO stdout;
pg_dump: *** aborted because of error[/code4vk]

So there's no way for me to backup the database... Right now, I don't care about the database, but it would be nice if I could backup such a monster in the future :)


I'll PM the server/port/ip/password info to both Josh and APerfect 10... I won't do a lot to the database for a few days, so go ahead and do evil things in it. :)
The database will be there untill I have to rebuild it for beta5. I'll add some more hands on a daily basis.

I didn't have a lot of time yet, so I couldn't try the database (after it got huge) over an internet connection.
Hercules
 
Posts: 78
Joined: Fri Jan 25, 2008 2:38 pm

Re: converting multiple databases

Postby Josh » Fri Feb 22, 2008 4:36 pm

Wow, nice job. That's an impressive database you have there. :)

I'm glad to see the final size is less than 5.5 GB per million hands, and that clustering is done more efficiently in 8.3. I'm not sure what's going on with the duplicate names, but you may be onto something with the names with non-alphanumeric characters in them.

The vs Player tab does a complex query which will hit nearly every record in the _statistics table, so it's no surprise that with 28 million rows (over 200 million records in the _statistics table) the query takes a while.

As far as clustering goes, we plan on adding an auto-cluster feature, which will only cluster after X hands have been imported. So you can set it to whatever you want.

I imagine backing up a 155 GB database may be a huge problem. You may want to google this or ask the PostgreSQL developers/support if they have any recommendations.

I just connected to your database. It took less than two and a half minutes to load the player list: 163,373 players. Not bad! :) I'll save your database the headache of performing my "Fish List" custom report. At least for now! That might take a while...
Josh
Site Admin
 
Posts: 6222
Joined: Sat Dec 08, 2007 6:03 pm

Previous

Return to Technical Support / Bug Reports [Read Only]

Who is online

Users browsing this forum: No registered users and 58 guests

cron
highfalutin