Page 1 of 3

Suggested performance optimization (57x improvement for me)

PostPosted: Mon Nov 19, 2012 11:20 pm
by _dave_
As some others have pointed out, report performance in PT4 goes "off the cliff" for an unknown reason, when the postgresql query planner makes a poor estimate. As documented in this thread: https://www.pokertracker.com/forums/vie ... t=nestloop this can be fixed by forcing the planer to not used nestloop, which is amazing - but unfortunately causes unwanted side effects, namely a significant slowdown in hud performance.

Over recent weeks I have had to help many players suffering this problem, some with quite small databases. While other players I know do not have the issue with significantly larger databases. I do not know if it affects cash players (I suspect it does not, otherwise I imagine there would be far more noise).

I recently wanted to play some poker, HU Hyper on PokerStars, so I did. all was well. I then imported roughly 3k tournaments, and I myself am now hit by this devastating performance issue. For any reasonable player, this is a very small number of games! But it at least lets me examine the situation a little in detail. Also useful is pokertracker4's logging.

What happens is this - when opening Pokertracker, or refreshing T -> View Stats -> Results.

Graph appears almost instantly. It then takes approximately forever (5 minutes) for the report section to load. at around the same time, the overview table to the left of the graph fills in. Looking in the log, I can see which are the extremely slow queries, so I pared down one of the reports configuration to make it smaller (in the hope one of the stats would cause a speedup and I'd have a solution for my players - no), ran again, copied it out to an editor, formatted it and got to work in pgAdmin trying to find the problem.

And great success!

here is the before:
Code: Select all
SELECT
  amt_buyin,
  amt_bounty_per,
  tourney_currency,
  amt_fee,
  amt_buyin_fee,
  sum(amt_fee_curr_conv),
  id_gametype,
  count(distinct cnt_tourneys),
  sum(cnt_hands),
  sum(amt_won_curr_conv),
  sum(amt_buyin_ttl_curr_conv)
FROM
 ( SELECT
    (tourney_summary.amt_buyin) as "amt_buyin",
    (tourney_summary.amt_bounty) as "amt_bounty_per",
    (tourney_summary.currency) as "tourney_currency",
    (tourney_summary.amt_fee) as "amt_fee",
    ((tourney_summary.amt_buyin) + (tourney_summary.amt_fee)) as "amt_buyin_fee",
    (( (case when(tourney_summary.val_curr_conv!=0) then  tourney_summary.val_curr_conv * tourney_summary.amt_fee else  0.0 end) )) as "amt_fee_curr_conv",
    (tourney_hand_player_statistics.id_gametype) as "id_gametype",
    ((tourney_summary.id_tourney)) as "cnt_tourneys",
    (sum((case when(tourney_hand_player_statistics.id_hand > 0) then  1 else  0 end))) as "cnt_hands",
    (((case when(tourney_summary.val_curr_conv != 0) then  tourney_summary.val_curr_conv * (tourney_results.amt_won + tourney_results.cnt_bounty * tourney_summary.amt_bounty) else  0.0 end))) as "amt_won_curr_conv",
    (((case when(tourney_summary.val_curr_conv != 0) then  tourney_summary.val_curr_conv * (tourney_summary.amt_buyin + tourney_summary.amt_fee + tourney_summary.amt_rebuy * tourney_results.cnt_rebuy + tourney_summary.amt_addon * tourney_results.cnt_addon + tourney_summary.amt_bounty) else  0.0 end))) as "amt_buyin_ttl_curr_conv"
   FROM
     tourney_hand_summary,
     tourney_hand_player_statistics ,
     tourney_results,
     tourney_summary
   WHERE 
     tourney_results.id_tourney = tourney_hand_player_statistics.id_tourney
     AND tourney_results.id_player = tourney_hand_player_statistics.id_player 
     AND (tourney_summary.id_tourney = tourney_hand_player_statistics.id_tourney) 
     AND (tourney_results.id_tourney = tourney_summary.id_tourney)   
     AND (tourney_hand_player_statistics.id_player = (SELECT id_player FROM player WHERE player_name_search='dave_72o'  AND id_site='100'))   
     AND (tourney_hand_summary.id_hand = tourney_hand_player_statistics.id_hand)         
     AND ((tourney_hand_summary.id_gametype = 1))
   GROUP BY
     tourney_results.cnt_addon,
     tourney_results.cnt_rebuy,
     tourney_results.cnt_bounty,
     tourney_results.amt_won,
     tourney_summary.amt_addon,
     tourney_summary.amt_rebuy,
     tourney_summary.id_tourney,
     tourney_summary.val_curr_conv,
     tourney_summary.amt_fee,
     tourney_summary.currency,
     tourney_summary.amt_bounty,
     tourney_summary.amt_buyin,
     (tourney_summary.amt_buyin),
     (tourney_summary.amt_bounty),
     (tourney_summary.currency),
     (tourney_summary.amt_fee),
     ((tourney_summary.amt_buyin) + (tourney_summary.amt_fee)),
     (tourney_hand_player_statistics.id_gametype)
  ) AS tnySumGroup
GROUP BY
  amt_buyin,
  amt_bounty_per,
  tourney_currency,
  amt_fee,
  amt_buyin_fee,
  id_gametype



execution time: 75547ms, or 1 minute 15 seconds, on second+ run so cached if possible. As I run my postgresql externally, I observed the problem here - CPU usage is maxed out on postgres. Imo this explains the weird randomness between my players, some have more tourneys but a fast CPU so it's less noticeable. some have few tournies but slow CPU and a PT4 usage nightmare.

In EXPLAIN it shows it will use a lot of nested loop.

Here is the fixed version:
Code: Select all
SELECT
  amt_buyin,
  amt_bounty_per,
  tourney_currency,
  amt_fee,
  amt_buyin_fee,
  sum(amt_fee_curr_conv),
  id_gametype,
  count(distinct cnt_tourneys),
  sum(cnt_hands),
  sum(amt_won_curr_conv),
  sum(amt_buyin_ttl_curr_conv)
FROM
 ( SELECT
    (tourney_summary.amt_buyin) as "amt_buyin",
    (tourney_summary.amt_bounty) as "amt_bounty_per",
    (tourney_summary.currency) as "tourney_currency",
    (tourney_summary.amt_fee) as "amt_fee",
    ((tourney_summary.amt_buyin) + (tourney_summary.amt_fee)) as "amt_buyin_fee",
    (( (case when(tourney_summary.val_curr_conv!=0) then  tourney_summary.val_curr_conv * tourney_summary.amt_fee else  0.0 end) )) as "amt_fee_curr_conv",
    (tourney_hand_player_statistics.id_gametype) as "id_gametype",
    ((tourney_summary.id_tourney)) as "cnt_tourneys",
    (sum((case when(tourney_hand_player_statistics.id_hand > 0) then  1 else  0 end))) as "cnt_hands",
    (((case when(tourney_summary.val_curr_conv != 0) then  tourney_summary.val_curr_conv * (tourney_results.amt_won + tourney_results.cnt_bounty * tourney_summary.amt_bounty) else  0.0 end))) as "amt_won_curr_conv",
    (((case when(tourney_summary.val_curr_conv != 0) then  tourney_summary.val_curr_conv * (tourney_summary.amt_buyin + tourney_summary.amt_fee + tourney_summary.amt_rebuy * tourney_results.cnt_rebuy + tourney_summary.amt_addon * tourney_results.cnt_addon + tourney_summary.amt_bounty) else  0.0 end))) as "amt_buyin_ttl_curr_conv"
   FROM
     tourney_hand_summary,
     tourney_hand_player_statistics ,
     tourney_results,
     tourney_summary
   WHERE 
     tourney_results.id_tourney = tourney_hand_player_statistics.id_tourney
     AND tourney_results.id_player = tourney_hand_player_statistics.id_player 
     AND (tourney_summary.id_tourney = tourney_hand_player_statistics.id_tourney) 
     AND (tourney_results.id_tourney = tourney_summary.id_tourney)   
     AND (tourney_hand_player_statistics.id_player = 2)   
     AND (tourney_hand_summary.id_hand = tourney_hand_player_statistics.id_hand)         
     AND ((tourney_hand_summary.id_gametype = 1))
   GROUP BY
     tourney_results.cnt_addon,
     tourney_results.cnt_rebuy,
     tourney_results.cnt_bounty,
     tourney_results.amt_won,
     tourney_summary.amt_addon,
     tourney_summary.amt_rebuy,
     tourney_summary.id_tourney,
     tourney_summary.val_curr_conv,
     tourney_summary.amt_fee,
     tourney_summary.currency,
     tourney_summary.amt_bounty,
     tourney_summary.amt_buyin,
     (tourney_summary.amt_buyin),
     (tourney_summary.amt_bounty),
     (tourney_summary.currency),
     (tourney_summary.amt_fee),
     ((tourney_summary.amt_buyin) + (tourney_summary.amt_fee)),
     (tourney_hand_player_statistics.id_gametype)
  ) AS tnySumGroup
GROUP BY
  amt_buyin,
  amt_bounty_per,
  tourney_currency,
  amt_fee,
  amt_buyin_fee,
  id_gametype

This completes in an amazing 1312ms, or 1.3 seconds. a 57x speedup! Explain for this one shows only hash joins, much better!

In case it is not obvious what has changed in these queries - I removed the subquery:
Code: Select all
tourney_hand_player_statistics.id_player = (SELECT id_player FROM player WHERE player_name_search='dave_72o'  AND id_site='100')

and replaced with my known id_player:
Code: Select all
tourney_hand_player_statistics.id_player = (2)


So, I would like to suggest you please implement this technique in PokerTracker 4 :) It should not be too problematic to always query first the id_player, then use the now-known ID in place of the subquery when generating these SQL statements, I guess. I cannot promise this will actually work, when combined with filters etc (my tests are run with zero filters enabled), but it may make a world of difference to people having this issue. which cannot be a small number, I think it just gets extreme with a slow CPU. I imagine more likely a lot of people it takes 10 seconds for these queries to run, which may come down to nothing :)

Hope this is useful.

Re: Suggested performance optimization (57x improvement for

PostPosted: Tue Nov 20, 2012 4:23 am
by WhiteRider
Thanks, _dave_. I will, of course, make sure that the development team see this.

Re: Suggested performance optimization (57x improvement for

PostPosted: Tue Nov 20, 2012 9:24 am
by APerfect10
_dave_,

It sounds like your join_collapse_limit is set too low. Modify your postgresql.conf file and uncomment the "from_collapse_limit" and "join_collapse_limit" and set them both to 12 so that it looks like:

Code: Select all
from_collapse_limit = 12
join_collapse_limit = 12      # 1 disables collapsing of explicit

Then restart PostgreSQL and re-run the original query. If that doesn't fix the query plan then please send me your EXPLAIN ANALYZE results from the query, via the link below, so that I can look at it.

Create Support Ticket

Best regards,

Derek

Re: Suggested performance optimization (57x improvement for

PostPosted: Tue Nov 20, 2012 11:31 am
by Ypsi
dave has solved you problem and you can not see it,
You can help postgres, you query is not very nice, because you says to find player_id for every hand:

Code: Select all
SELECT
  amt_buyin,
  amt_bounty_per,
  tourney_currency,
  amt_fee,
  amt_buyin_fee,
  sum(amt_fee_curr_conv),
  id_gametype,
  count(distinct cnt_tourneys),
  sum(cnt_hands),
  sum(amt_won_curr_conv),
  sum(amt_buyin_ttl_curr_conv)
FROM
( SELECT
    tourney_hand_player_statistics.id_player as "id_player",
    (tourney_summary.amt_buyin) as "amt_buyin",
    (tourney_summary.amt_bounty) as "amt_bounty_per",
    (tourney_summary.currency) as "tourney_currency",
    (tourney_summary.amt_fee) as "amt_fee",
    ((tourney_summary.amt_buyin) + (tourney_summary.amt_fee)) as "amt_buyin_fee",
    (( (case when(tourney_summary.val_curr_conv!=0) then  tourney_summary.val_curr_conv * tourney_summary.amt_fee else  0.0 end) )) as "amt_fee_curr_conv",
    (tourney_hand_player_statistics.id_gametype) as "id_gametype",
    ((tourney_summary.id_tourney)) as "cnt_tourneys",
    (sum((case when(tourney_hand_player_statistics.id_hand > 0) then  1 else  0 end))) as "cnt_hands",
    (((case when(tourney_summary.val_curr_conv != 0) then  tourney_summary.val_curr_conv * (tourney_results.amt_won + tourney_results.cnt_bounty * tourney_summary.amt_bounty) else  0.0 end))) as "amt_won_curr_conv",
    (((case when(tourney_summary.val_curr_conv != 0) then  tourney_summary.val_curr_conv * (tourney_summary.amt_buyin + tourney_summary.amt_fee + tourney_summary.amt_rebuy * tourney_results.cnt_rebuy + tourney_summary.amt_addon * tourney_results.cnt_addon + tourney_summary.amt_bounty) else  0.0 end))) as "amt_buyin_ttl_curr_conv"
   FROM
     tourney_hand_summary,
     tourney_hand_player_statistics ,
     tourney_results,
     tourney_summary
   WHERE 
     tourney_results.id_tourney = tourney_hand_player_statistics.id_tourney
     AND tourney_results.id_player =tourney_hand_player_statistics.id_player 
     AND (tourney_summary.id_tourney = tourney_hand_player_statistics.id_tourney) 
     AND (tourney_results.id_tourney = tourney_summary.id_tourney)   
     AND (tourney_hand_summary.id_hand = tourney_hand_player_statistics.id_hand)       
     AND ((tourney_hand_summary.id_gametype = 1))
   GROUP BY
     tourney_hand_player_statistics.id_player,
     tourney_results.cnt_addon,
     tourney_results.cnt_rebuy,
     tourney_results.cnt_bounty,
     tourney_results.amt_won,
     tourney_summary.amt_addon,
     tourney_summary.amt_rebuy,
     tourney_summary.id_tourney,
     tourney_summary.val_curr_conv,
     tourney_summary.amt_fee,
     tourney_summary.currency,
     tourney_summary.amt_bounty,
     tourney_summary.amt_buyin,
     (tourney_summary.amt_buyin),
     (tourney_summary.amt_bounty),
     (tourney_summary.currency),
     (tourney_summary.amt_fee),
     ((tourney_summary.amt_buyin) + (tourney_summary.amt_fee)),
     (tourney_hand_player_statistics.id_gametype)
  ) AS tnySumGroup

  where tnySumGroup.id_player IN (SELECT id_player FROM player WHERE player_name_search='xxxx'  AND id_site='100')
GROUP BY
  amt_buyin,
  amt_bounty_per,
  tourney_currency,
  amt_fee,
  amt_buyin_fee,
  id_gametype



next time i will write invice to you for 0.5 h hour

Re: Suggested performance optimization (57x improvement for

PostPosted: Tue Nov 20, 2012 11:55 am
by APerfect10
Ypsi wrote:dave has solved you problem and you can not see it,
You can help postgres, you query is not very nice, because you says to find player_id for every hand:

next time i will write invice to you for 0.5 h hour


The amount of time it takes to obtain the player id is milliseconds and is not the problem.

-Derek

Re: Suggested performance optimization (57x improvement for

PostPosted: Tue Nov 20, 2012 11:59 am
by Ypsi
APerfect10 wrote:
Ypsi wrote:dave has solved you problem and you can not see it,
You can help postgres, you query is not very nice, because you says to find player_id for every hand:

next time i will write invice to you for 0.5 h hour


The amount of time it takes to obtain the player id is milliseconds and is not the problem.

-Derek



i have posted my queries that executed only 1 sec, you query need 5 min. If you cannot see it, i cant help you

Re: Suggested performance optimization (57x improvement for

PostPosted: Tue Nov 20, 2012 12:07 pm
by APerfect10
Ypsi wrote:i have posted my queries that executed only 1 sec, you query need 5 min. If you cannot see it, i cant help you


If only the world was so black and white...

We are only going to fix issues -- not provide hacks that break or degrade other functionality. We want to fix the issue correctly.

-Derek

Re: Suggested performance optimization (57x improvement for

PostPosted: Tue Nov 20, 2012 12:10 pm
by Ypsi
APerfect10 wrote:
Ypsi wrote:i have posted my queries that executed only 1 sec, you query need 5 min. If you cannot see it, i cant help you


If only the world was so black and white...

We are only going to fix issues -- not provide hacks that break or degrade other functionality. We want to fix the issue correctly.

-Derek


sorry, but how my query will break you functionality ? i dont speak about nested loops

Re: Suggested performance optimization (57x improvement for

PostPosted: Tue Nov 20, 2012 12:24 pm
by Ypsi
APerfect10 wrote:
The amount of time it takes to obtain the player id is milliseconds and is not the problem.

-Derek


if its right
0.001 sec for calculating player_id for each(!) hand, i have 100.000 hands, its 100sec

APerfect10 wrote:
is the problem.

-Derek

fyp

Re: Suggested performance optimization (57x improvement for

PostPosted: Tue Nov 20, 2012 12:31 pm
by APerfect10
Ypsi wrote:sorry, but how my query will break you functionality ? i dont speak about nested loops


The only suggestion that you have made (that I can remember or find) is to disable nested loops. Nested loops are not bad and in fact are faster in many scenarios.

No database engineer would ever consider disabling nested loops. It should only be disabled for testing to find the root cause of the problem which is what we doing.

Ypsi wrote:if its right
0.001 sec for calculating player_id for each(!) hand, i have 100.000 hands, its 100sec


Completely incorrect. It's milliseconds per query -- not per hand. The amount of overhead needed to join it to 1 hand versus 1 billion hands is not linear.

-Derek