Suggested performance optimization (57x improvement for me)
Posted: Mon Nov 19, 2012 11:20 pm
As some others have pointed out, report performance in PT4 goes "off the cliff" for an unknown reason, when the postgresql query planner makes a poor estimate. As documented in this thread: https://www.pokertracker.com/forums/vie ... t=nestloop this can be fixed by forcing the planer to not used nestloop, which is amazing - but unfortunately causes unwanted side effects, namely a significant slowdown in hud performance.
Over recent weeks I have had to help many players suffering this problem, some with quite small databases. While other players I know do not have the issue with significantly larger databases. I do not know if it affects cash players (I suspect it does not, otherwise I imagine there would be far more noise).
I recently wanted to play some poker, HU Hyper on PokerStars, so I did. all was well. I then imported roughly 3k tournaments, and I myself am now hit by this devastating performance issue. For any reasonable player, this is a very small number of games! But it at least lets me examine the situation a little in detail. Also useful is pokertracker4's logging.
What happens is this - when opening Pokertracker, or refreshing T -> View Stats -> Results.
Graph appears almost instantly. It then takes approximately forever (5 minutes) for the report section to load. at around the same time, the overview table to the left of the graph fills in. Looking in the log, I can see which are the extremely slow queries, so I pared down one of the reports configuration to make it smaller (in the hope one of the stats would cause a speedup and I'd have a solution for my players - no), ran again, copied it out to an editor, formatted it and got to work in pgAdmin trying to find the problem.
And great success!
here is the before:
execution time: 75547ms, or 1 minute 15 seconds, on second+ run so cached if possible. As I run my postgresql externally, I observed the problem here - CPU usage is maxed out on postgres. Imo this explains the weird randomness between my players, some have more tourneys but a fast CPU so it's less noticeable. some have few tournies but slow CPU and a PT4 usage nightmare.
In EXPLAIN it shows it will use a lot of nested loop.
Here is the fixed version:
This completes in an amazing 1312ms, or 1.3 seconds. a 57x speedup! Explain for this one shows only hash joins, much better!
In case it is not obvious what has changed in these queries - I removed the subquery:
and replaced with my known id_player:
So, I would like to suggest you please implement this technique in PokerTracker 4 It should not be too problematic to always query first the id_player, then use the now-known ID in place of the subquery when generating these SQL statements, I guess. I cannot promise this will actually work, when combined with filters etc (my tests are run with zero filters enabled), but it may make a world of difference to people having this issue. which cannot be a small number, I think it just gets extreme with a slow CPU. I imagine more likely a lot of people it takes 10 seconds for these queries to run, which may come down to nothing
Hope this is useful.
Over recent weeks I have had to help many players suffering this problem, some with quite small databases. While other players I know do not have the issue with significantly larger databases. I do not know if it affects cash players (I suspect it does not, otherwise I imagine there would be far more noise).
I recently wanted to play some poker, HU Hyper on PokerStars, so I did. all was well. I then imported roughly 3k tournaments, and I myself am now hit by this devastating performance issue. For any reasonable player, this is a very small number of games! But it at least lets me examine the situation a little in detail. Also useful is pokertracker4's logging.
What happens is this - when opening Pokertracker, or refreshing T -> View Stats -> Results.
Graph appears almost instantly. It then takes approximately forever (5 minutes) for the report section to load. at around the same time, the overview table to the left of the graph fills in. Looking in the log, I can see which are the extremely slow queries, so I pared down one of the reports configuration to make it smaller (in the hope one of the stats would cause a speedup and I'd have a solution for my players - no), ran again, copied it out to an editor, formatted it and got to work in pgAdmin trying to find the problem.
And great success!
here is the before:
- Code: Select all
SELECT
amt_buyin,
amt_bounty_per,
tourney_currency,
amt_fee,
amt_buyin_fee,
sum(amt_fee_curr_conv),
id_gametype,
count(distinct cnt_tourneys),
sum(cnt_hands),
sum(amt_won_curr_conv),
sum(amt_buyin_ttl_curr_conv)
FROM
( SELECT
(tourney_summary.amt_buyin) as "amt_buyin",
(tourney_summary.amt_bounty) as "amt_bounty_per",
(tourney_summary.currency) as "tourney_currency",
(tourney_summary.amt_fee) as "amt_fee",
((tourney_summary.amt_buyin) + (tourney_summary.amt_fee)) as "amt_buyin_fee",
(( (case when(tourney_summary.val_curr_conv!=0) then tourney_summary.val_curr_conv * tourney_summary.amt_fee else 0.0 end) )) as "amt_fee_curr_conv",
(tourney_hand_player_statistics.id_gametype) as "id_gametype",
((tourney_summary.id_tourney)) as "cnt_tourneys",
(sum((case when(tourney_hand_player_statistics.id_hand > 0) then 1 else 0 end))) as "cnt_hands",
(((case when(tourney_summary.val_curr_conv != 0) then tourney_summary.val_curr_conv * (tourney_results.amt_won + tourney_results.cnt_bounty * tourney_summary.amt_bounty) else 0.0 end))) as "amt_won_curr_conv",
(((case when(tourney_summary.val_curr_conv != 0) then tourney_summary.val_curr_conv * (tourney_summary.amt_buyin + tourney_summary.amt_fee + tourney_summary.amt_rebuy * tourney_results.cnt_rebuy + tourney_summary.amt_addon * tourney_results.cnt_addon + tourney_summary.amt_bounty) else 0.0 end))) as "amt_buyin_ttl_curr_conv"
FROM
tourney_hand_summary,
tourney_hand_player_statistics ,
tourney_results,
tourney_summary
WHERE
tourney_results.id_tourney = tourney_hand_player_statistics.id_tourney
AND tourney_results.id_player = tourney_hand_player_statistics.id_player
AND (tourney_summary.id_tourney = tourney_hand_player_statistics.id_tourney)
AND (tourney_results.id_tourney = tourney_summary.id_tourney)
AND (tourney_hand_player_statistics.id_player = (SELECT id_player FROM player WHERE player_name_search='dave_72o' AND id_site='100'))
AND (tourney_hand_summary.id_hand = tourney_hand_player_statistics.id_hand)
AND ((tourney_hand_summary.id_gametype = 1))
GROUP BY
tourney_results.cnt_addon,
tourney_results.cnt_rebuy,
tourney_results.cnt_bounty,
tourney_results.amt_won,
tourney_summary.amt_addon,
tourney_summary.amt_rebuy,
tourney_summary.id_tourney,
tourney_summary.val_curr_conv,
tourney_summary.amt_fee,
tourney_summary.currency,
tourney_summary.amt_bounty,
tourney_summary.amt_buyin,
(tourney_summary.amt_buyin),
(tourney_summary.amt_bounty),
(tourney_summary.currency),
(tourney_summary.amt_fee),
((tourney_summary.amt_buyin) + (tourney_summary.amt_fee)),
(tourney_hand_player_statistics.id_gametype)
) AS tnySumGroup
GROUP BY
amt_buyin,
amt_bounty_per,
tourney_currency,
amt_fee,
amt_buyin_fee,
id_gametype
execution time: 75547ms, or 1 minute 15 seconds, on second+ run so cached if possible. As I run my postgresql externally, I observed the problem here - CPU usage is maxed out on postgres. Imo this explains the weird randomness between my players, some have more tourneys but a fast CPU so it's less noticeable. some have few tournies but slow CPU and a PT4 usage nightmare.
In EXPLAIN it shows it will use a lot of nested loop.
Here is the fixed version:
- Code: Select all
SELECT
amt_buyin,
amt_bounty_per,
tourney_currency,
amt_fee,
amt_buyin_fee,
sum(amt_fee_curr_conv),
id_gametype,
count(distinct cnt_tourneys),
sum(cnt_hands),
sum(amt_won_curr_conv),
sum(amt_buyin_ttl_curr_conv)
FROM
( SELECT
(tourney_summary.amt_buyin) as "amt_buyin",
(tourney_summary.amt_bounty) as "amt_bounty_per",
(tourney_summary.currency) as "tourney_currency",
(tourney_summary.amt_fee) as "amt_fee",
((tourney_summary.amt_buyin) + (tourney_summary.amt_fee)) as "amt_buyin_fee",
(( (case when(tourney_summary.val_curr_conv!=0) then tourney_summary.val_curr_conv * tourney_summary.amt_fee else 0.0 end) )) as "amt_fee_curr_conv",
(tourney_hand_player_statistics.id_gametype) as "id_gametype",
((tourney_summary.id_tourney)) as "cnt_tourneys",
(sum((case when(tourney_hand_player_statistics.id_hand > 0) then 1 else 0 end))) as "cnt_hands",
(((case when(tourney_summary.val_curr_conv != 0) then tourney_summary.val_curr_conv * (tourney_results.amt_won + tourney_results.cnt_bounty * tourney_summary.amt_bounty) else 0.0 end))) as "amt_won_curr_conv",
(((case when(tourney_summary.val_curr_conv != 0) then tourney_summary.val_curr_conv * (tourney_summary.amt_buyin + tourney_summary.amt_fee + tourney_summary.amt_rebuy * tourney_results.cnt_rebuy + tourney_summary.amt_addon * tourney_results.cnt_addon + tourney_summary.amt_bounty) else 0.0 end))) as "amt_buyin_ttl_curr_conv"
FROM
tourney_hand_summary,
tourney_hand_player_statistics ,
tourney_results,
tourney_summary
WHERE
tourney_results.id_tourney = tourney_hand_player_statistics.id_tourney
AND tourney_results.id_player = tourney_hand_player_statistics.id_player
AND (tourney_summary.id_tourney = tourney_hand_player_statistics.id_tourney)
AND (tourney_results.id_tourney = tourney_summary.id_tourney)
AND (tourney_hand_player_statistics.id_player = 2)
AND (tourney_hand_summary.id_hand = tourney_hand_player_statistics.id_hand)
AND ((tourney_hand_summary.id_gametype = 1))
GROUP BY
tourney_results.cnt_addon,
tourney_results.cnt_rebuy,
tourney_results.cnt_bounty,
tourney_results.amt_won,
tourney_summary.amt_addon,
tourney_summary.amt_rebuy,
tourney_summary.id_tourney,
tourney_summary.val_curr_conv,
tourney_summary.amt_fee,
tourney_summary.currency,
tourney_summary.amt_bounty,
tourney_summary.amt_buyin,
(tourney_summary.amt_buyin),
(tourney_summary.amt_bounty),
(tourney_summary.currency),
(tourney_summary.amt_fee),
((tourney_summary.amt_buyin) + (tourney_summary.amt_fee)),
(tourney_hand_player_statistics.id_gametype)
) AS tnySumGroup
GROUP BY
amt_buyin,
amt_bounty_per,
tourney_currency,
amt_fee,
amt_buyin_fee,
id_gametype
This completes in an amazing 1312ms, or 1.3 seconds. a 57x speedup! Explain for this one shows only hash joins, much better!
In case it is not obvious what has changed in these queries - I removed the subquery:
- Code: Select all
tourney_hand_player_statistics.id_player = (SELECT id_player FROM player WHERE player_name_search='dave_72o' AND id_site='100')
and replaced with my known id_player:
- Code: Select all
tourney_hand_player_statistics.id_player = (2)
So, I would like to suggest you please implement this technique in PokerTracker 4 It should not be too problematic to always query first the id_player, then use the now-known ID in place of the subquery when generating these SQL statements, I guess. I cannot promise this will actually work, when combined with filters etc (my tests are run with zero filters enabled), but it may make a world of difference to people having this issue. which cannot be a small number, I think it just gets extreme with a slow CPU. I imagine more likely a lot of people it takes 10 seconds for these queries to run, which may come down to nothing
Hope this is useful.