performance tuning - splitting data and logs

PostgreSQL is the database server used to store information. Do you have a question or are you having problem with PostgreSQL? If so, post them here.

Moderator: Moderators

performance tuning - splitting data and logs

Postby MiamiVIce80 » Fri Jan 02, 2009 9:56 pm

I checked your advices on using 2 HDDs for more performance. Actually i used the default install routine and had everything on a single hdd.

In "performance tuning 2 HDDs" u suggested to use "tablespace" to split postgres/logs like(pg_xog or pg_xlog) and the actual data.
I really like the idea of splitting, but my system is stable and running and i got a lot pf data that i dont want to move (using alter table or what ever). So i was thinking of an vice versa approach.

Why dont move the log files to a different drive just using symbolic links ? I run Vista 64bit and with NTFS u can build directory junctions what seems to be the same like symbolic links to directories in Unix.

I moved /8.3/data/pg_log and /8.3/data/pg_xlog to a different spindle and it already increased my performance.

I think of a another optimization, because i have a 3 hdd installed and i would like to try to split data, index and logs for maximum performance.

Unfortunately i´m not really used to postgres, maybe u can give me some advise. My postgres installation looks like this:

/Postgres/8.3/data/

subdirectories from /data/

- base -contains the actual data HDD1
- global -no idea what it is for - any hint? HDD1
- pc_clog - looks like empty and never used HDD1
- pc_log - log of databaseaction HDD2
- pg_multixact - looks like empty and never used HDD1
- pg_subtrans - looks like empty and never used HDD1
- pg_ tblspc - looks like empty and never used HDD1
- pc_twophase - looks like empty and never used HDD1
- pg_xlog - log of databaseaction HDD2

Any suggestions how to split these directories between 3 HDDs ?

Thx
MiamiVIce80
 
Posts: 24
Joined: Thu May 08, 2008 8:08 am

Re: performance tuning - splitting data and logs

Postby APerfect10 » Sat Jan 03, 2009 10:49 am

That is a very good observation and that should work.

I just found this which confirms your suggestion.

From http://momjian.us/main/writings/pgsql/h ... index.html
Multiple Disk Spindles

The disk head moves around quite a bit during database activity. If too many read/write requests are made, the drive can become saturated, causing poor performance. (Vmstat and sar can provide information on the amount of activity on each disk drive.)

One solution to disk saturation is to move some of the POSTGRESQL data files to other disks. Remember, moving the files to other filesystems on the same disk drive does not help. All filesystems on a drive use the same disk heads.

Database access can be spread across disk drives in several ways:

Moving Databases
initlocation allows you to create databases on different drives.
Moving Tables
Symbolic links allow you to move tables and indexes to different drives. Movement should only be done while POSTGRESQL is shut down. Also, POSTGRESQL doesn't know about the symbolic links, so if you delete the table and recreate it, it will be created in the default location for that database. In 7.1, pg_database.oid and pg_class.relfilenode map database, table, and index names to their numeric file names.
Moving Indexes
Symbolic links allow moving indexes to different drives from their heap tables. This allows an index scan to be performed on one disk while a second disk performs heap lookups.
Moving Joins
Symbolic links allow the movement of joined tables to separate disks. If tables A and B are joined, lookups of table A can be performed on one drive while lookups of table B can be done on a second drive.
Moving Write-Ahead Log
Symbolic links can be used to move the pg_xlog directory to a different disk drive. (Pg_xlog exists in POSTGRESQL releases 7.1 and later.) Unlike other writes, POSTGRESQL log writes must be flushed to disk before completing a transaction. The cache cannot be used to delay these writes. Having a separate disk for log writes allows the disk head to stay on the current log cylinder so writes can be performed without head movement delay. (You can use the postgres -F parameter to prevent log writes from being flushed to disk, but an operating system crash may require a restore from backup.)

Other options include the use of RAID disks to spread a single filesystem across several drives. Mirroring can slow down database writes, but will speed database reads because data can be retrieved from either drive. Many sites use RAID 1+0 or RAID 0+1 because it offers the advantage of striping for performance, and mirroring for reliability. RAID 5 is faster for six or more drives. Ideally, RAID 5 will have battery-backed cache so writes can be flushed to disk in an efficient manner rather than slowing down the application while writes hit the platters.


MiamiVIce80 wrote:subdirectories from /data/

- base -contains the actual data HDD1
- global -no idea what it is for - any hint? HDD1
- pc_clog - looks like empty and never used HDD1
- pc_log - log of databaseaction HDD2
- pg_multixact - looks like empty and never used HDD1
- pg_subtrans - looks like empty and never used HDD1
- pg_ tblspc - looks like empty and never used HDD1
- pc_twophase - looks like empty and never used HDD1
- pg_xlog - log of databaseaction HDD2


  • data/base = local database files
  • data/global = global database files containing cluster-wide tables such as pg_database
  • data/pg_clog = Transaction commit status data
  • data/pg_log = PostgreSQL server log files
  • data/pg_multixact = Multi-transaction status data used for row locks
  • data/pg_subtrans = sub-transaction status data
  • data/pg_tblspc = symbolic links to tablespaces
  • data/pg_twophase = state files for prepared transactions
  • data/pg_xlog = Write Ahead Log files
The tables that you state appear "empty and never used" are only used when the PostgreSQL service is running; when not running these directories will be empty.

Best regards,

Derek
APerfect10
Site Admin
 
Posts: 4463
Joined: Sat Dec 08, 2007 6:03 pm

Re: performance tuning - splitting data and logs

Postby LVgamb00ler » Mon Jan 05, 2009 11:39 am

I like MiamiVIce80's idea of changing the location of the pg_xlog. I looked in the postgreSQL documentation but could not find how to change the location in any of the configuration files. So, I researched on how to create the symbolic links.

I run WinXP Pro and I am pretty comfortable using the command line to make changes, so that is the route I took.

Here's what I did:
- found a simple tool to create the symbolic link (called a junction). Here's the URL to download if from Msoft.
http://technet.microsoft.com/en-us/sysinternals/bb896768.aspx
- downloaded the junction.exe, put it into the directory c:\program files\utilities
- shut down postgreSQL
- created a pg_xlog directory on the desired drive, in my case f:\pg_xlog (you can call it anything you want)
- copied the files and sub-directory from c:\program files\postgreSQL\data\pg_xlog to f:\pg_xlog then deleted them from their original location
- opened a command window (Start->Run enter cmd and hit return)
- typed the command "c:\program files\utilities\junction" "c:\program files\postgreSQL\data\pg_xlog" f:\pg_xlog (include the")
- typed exit to close the command window
- restarted postgreSQL

If this doesn't work
- if it is running, shutdown postgreSQL
- start another command window
- type cd "\program files\postgreSQL\data" (include ") and hit enter
- type "c:\program files\utilities\junction" pg_xlog
- if it says "No reparse points found." then for some reason the junction was not created, so recheck your work
- if it says "c:\postgreSQL\pg_xlog: JUNCTION", that means the junction was correctly created but something else is wrong. So now type "c:\program files\utilities\junction" -d pg_xlog. This will destroy the junction. Now, restore the files and sub-directories to the original pg_xlog location. You should be able to restart postgreSQL now.

If the above recovery instructions do not get you back up and running, don't panic, your data is still safe. You can remove and re-install postgreSQL. But, remember to skip the step that initializes the data storage (I think the installation calls it Initialzing the Cluster ?).


You may need to modify the above commands to reflect the EXACT locations of the junction.exe, the postgreSQL installation and new location for pg_xlog data.

G'luck all,
LVgamb00ler
LVgamb00ler
 
Posts: 32
Joined: Thu Apr 03, 2008 7:57 pm
Location: Las Vegas

Re: performance tuning - splitting data and logs

Postby kraada » Mon Jan 05, 2009 12:17 pm

Thanks for the detailed walkthrough; I'm going to sticky this thread so that it doesn't get lost.
kraada
Moderator
 
Posts: 54431
Joined: Wed Mar 05, 2008 2:32 am
Location: NY

Re: performance tuning - splitting data and logs

Postby MiamiVIce80 » Thu Jan 08, 2009 5:35 pm

I did what LVgamb00ler said.

Running Vista Buisness i used mklink. Not sure if it is available in every Vista Version.

http://en.wikipedia.org/wiki/NTFS_symbolic_link
MiamiVIce80
 
Posts: 24
Joined: Thu May 08, 2008 8:08 am

Re: performance tuning - splitting data and logs

Postby sonicman » Tue Jan 20, 2009 1:40 pm

how about a walkthrough using mlink? :D
sonicman
 
Posts: 70
Joined: Wed May 07, 2008 12:38 pm

Re: performance tuning - splitting data and logs

Postby MiamiVIce80 » Wed Jan 21, 2009 9:06 pm

sure, i can give it a try - but always be careful

1. stop postgres completly, check task-manager (postgres.exe and pg_ctl.exe should not run)
2. copy the complete folder X:\...\PostgreSQL\8.3\data\pg_xlog to the new spindle Z:\pg_xlog
3. rename X:\...\PostgreSQL\8.3\data\pg_xlog to X:\...\PostgreSQL\8.3\data\pg_xlog2
4. start cmd and go to X:\...\PostgreSQL\8.3\data\
5. run mklink: X:\...\PostgreSQL\8.3\data\mklink.exe /J X:\...\PostgreSQL\8.3\data\pg_xlog Z:\pg_xlog
6. typ 'dir' and check if a directory is created pg_xlog[Z:\pg_xlog]
7. restart Postgres and check if everything is running

If anything goes wrong:
1. stop postgres completly, check task-manager (postgres.exe and pg_ctl.exe should not run)
2. delete everything you created in "Z:\" and remove the junctions with "rd". Rename X:\...\PostgreSQL\8.3\data\pg_xlog2 to X:\...\PostgreSQL\8.3\data\pg_xlog and everything should be like before.
MiamiVIce80
 
Posts: 24
Joined: Thu May 08, 2008 8:08 am

Re: performance tuning - splitting data and logs

Postby Gendo » Wed Mar 25, 2009 8:38 pm

Nice thread.
The quote of APerfect10 is confusing me.
I just set up 3 HDDs. 1 with my system (fastest HDD) and 2 new.
I want to split everything as good as possible.
Any suggestions what should be moved to HDD2 and HDD3? What should stay on HDD1?

I have these files/folders:

base = local database files
global = global database files containing cluster-wide tables such as pg_database
pg_clog = Transaction commit status data
pg_multixact = Multi-transaction status data used for row locks
pg_subtrans = sub-transaction status data
pg_tblspc = symbolic links to tablespaces
pg_twophase = state files for prepared transactions
PG_VERSION
pg_xlog = Write Ahead Log files
postmaster.opts
postmaster.pid
root.crt
server.crt
server.key

base is of course the biggest part. :)
Gendo
 
Posts: 11
Joined: Mon Jan 28, 2008 8:41 am

Re: performance tuning - splitting data and logs

Postby WhiteRider » Thu Mar 26, 2009 5:36 am

You can't just move the files/folders, you need to follow the above directions.
The most important thing will be to put just your postgres data on a single drive on its own.
I'm sure someone who knows more about this stuff will give you more details..
WhiteRider
Moderator
 
Posts: 54017
Joined: Sat Jan 19, 2008 7:06 pm
Location: UK

Re: performance tuning - splitting data and logs

Postby Gendo » Thu Mar 26, 2009 5:41 am

Yes sorry if I was unclear. I will move the files and create symlinks as decribed above. Just want to know what's the most effective distrubution. :)
I already moved the "base" to a separate disk. pg_xlog seems to be the next biggest file/folder with 130MB for me. All other are just a few kb big (postgres running but not working).
Gendo
 
Posts: 11
Joined: Mon Jan 28, 2008 8:41 am

Next

Return to PostgreSQL [Read Only]

Who is online

Users browsing this forum: No registered users and 11 guests

cron