Version 20 (modified by Serge CHASTEL, 15 years ago)

Notes on the IPP Database Backup scheme

(Last update: 2011-04-07)

The nebulous database is backed up on its replication slave twice a day, at 0:30 and 12:30 (see the local crontab). Dumping the whole database takes about 3 hours.

Every day at 03:05 and 15:05, the latest nebulous dump is copied to ipp001 by the neb_copy.sh script, which runs on ipp001 (documentation is in the script). neb_copy.sh verifies the copy and sends an e-mail if the verification fails (while in dev mode, only to me, i.e. serge).

  • gpc1 on ippdb01 is replicated on ippc02.

The gpc1 database on the replication slave is backed up every 4 hours (at 0:05, 4:05, 8:05, 12:05, 16:05, and 20:05). Dumping the database takes about one hour (TODO: possibly less; when I measured it, ippc02 was running another backup that I did not want to stop). Unlike the nebulous scheme above, the script responsible for the dump (gpc1_dump) also copies the dump onto ipp001.

Every 4 hours (at 1:30, 5:30, 9:30, 13:30, 17:30, and 21:30), gpc1 is verified, alternately "installed" on gpc1_0 or gpc1_1 (once a day, between 0:00 and 4:00), and distributed with its MD5 checksum on the rsync server.

  • ippRequestServer, isp, and ippadmin are backed up by /home/panstarrs/ipp/mysql-dump/ops_dump.csh (Manoa cluster).
    • They are backed up every four hours (at 0:00, 4:00, 8:00, 12:00, 16:00, and 20:00).
    • Dumps can be found in /export/ipp001.0/ipp/mysql-dumps.
    • The backup command is started from the ipp@ipp001 crontab.
  • Thanks to efforts by Cindy & Gavin, regular dumps of the mysql databases being used by the processing system on Maui are available via the rsync server on ipp0012.ifa.hawaii.edu. The three databases distributed are the gpc1 processing database, the isp processing database, and the ippadmin database describing the database schema. These are available from the rsync location: rsync://ipp0012.ifa.hawaii.edu/ippdb

If you want all three, use a command like:

rsync -auv rsync://ipp0012.ifa.hawaii.edu/ippdb/ ippdb/

The dump files have names of the form ippdb01-DBNAME.dump.bz; e.g., the GPC1 processing information is in the file called ippdb01-gpc1.dump.bz. They are dumped every 4 hours, and each new dump replaces the previous one under the same name. If people are unable to retrieve these within the 4-hour period, we can adjust the naming to keep more than one old version around.
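Because each dump is overwritten every 4 hours, it can be useful to check how old a local copy is before relying on it. The following is a minimal sketch, not part of the IPP tooling: the function names are hypothetical, and it assumes the local file's modification time reflects the dump time (true when the file was fetched with rsync -a, which preserves timestamps).

```python
import os
import time

# Dumps are replaced every 4 hours (figure taken from the text above).
DUMP_PERIOD_SECONDS = 4 * 3600


def dump_age_seconds(path, now=None):
    """Return the age of a local dump file, in seconds, from its mtime."""
    now = time.time() if now is None else now
    return now - os.path.getmtime(path)


def is_stale(path, now=None, period=DUMP_PERIOD_SECONDS):
    """True when the local copy is older than one dump period."""
    return dump_age_seconds(path, now) > period
```

A stale copy only means that a newer dump has probably been published; it says nothing about whether the local file is corrupt.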

If you want just one of the above databases, use a command like:

rsync rsync://ipp0012.ifa.hawaii.edu/ippdb/ippdb01-gpc1.dump.bz .

Note: The integrity of the ippdb01-gpc1.dump.bz file can be checked against its MD5 checksum, which is stored in ippdb01-gpc1.md5 and distributed in the same place (rsync://ipp0012.ifa.hawaii.edu/ippdb/ippdb01-gpc1.md5).
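The checksum comparison can be sketched in Python as follows. This is an illustration under stated assumptions rather than the project's own procedure: the function names are hypothetical, and the layout of the .md5 file is assumed to follow md5sum output (hex digest as the first whitespace-separated token), which this page does not confirm.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def expected_md5(md5_path):
    """Read the expected digest from a checksum file.

    Assumption: the first whitespace-separated token is the hex digest,
    as in md5sum output; the real ippdb01-gpc1.md5 layout may differ.
    """
    return Path(md5_path).read_text().split()[0].lower()


def dump_is_valid(dump_path, md5_path):
    """True when the computed MD5 of the dump matches the distributed one."""
    return md5_of_file(dump_path) == expected_md5(md5_path)
```

For example, dump_is_valid("ippdb01-gpc1.dump.bz", "ippdb01-gpc1.md5") would be called after both files have been retrieved. Since the dump is replaced every 4 hours, a mismatch may simply mean the two files come from different dump cycles.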

Other

MySQL Servers Naming Convention

Replication requires every MySQL server to have a distinct server-id.

The default value of 1 is replaced by YYYYMMDD<RUNNING_ID_FOR_DAY_ON_TWO_CHARACTERS>. For instance, 2011050602 identifies the second MySQL server configured on May 6th, 2011 (which means that a server with id 2011050601 exists somewhere). People configuring MySQL servers are expected to coordinate with each other so that no id is used twice.
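The naming convention can be illustrated with a small helper pair. This is a sketch only: the function names are hypothetical, and the upper-bound check relies on MySQL's server_id being a 32-bit unsigned integer.

```python
from datetime import date

# MySQL stores server_id as a 32-bit unsigned integer.
MAX_SERVER_ID = 2**32 - 1


def make_server_id(config_date, running_id):
    """Build a server-id of the form YYYYMMDDNN (NN = running id for that day)."""
    if not 1 <= running_id <= 99:
        raise ValueError("running id must fit in two digits (1-99)")
    server_id = int(f"{config_date:%Y%m%d}{running_id:02d}")
    if server_id > MAX_SERVER_ID:
        raise ValueError("server-id exceeds the 32-bit unsigned range")
    return server_id


def parse_server_id(server_id):
    """Split a YYYYMMDDNN server-id into (configuration date, running id)."""
    text = str(server_id)
    if len(text) != 10 or not text.isdigit():
        raise ValueError("expected a 10-digit server-id")
    config_date = date(int(text[:4]), int(text[4:6]), int(text[6:8]))
    return config_date, int(text[8:])
```

With the example from the text, make_server_id(date(2011, 5, 6), 2) gives 2011050602, and parse_server_id(2011050602) recovers the date and the running id 2. The helpers cannot detect a collision between two people picking the same running id on the same day; that still requires coordination.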

For reference

All of the various ipp mysql databases are now being dumped to /data/ipp001.0/ipp/mysql-dumps, and the dumps listed above are linked into the 'distribution' subdirectory, which is made visible via the rsync server. We are moving to a system that keeps a sample of the old dumps on the following timescales:

  • all backups for the past 5 days (~20 copies)
  • one backup per day for the previous ~10 days (~10 copies)
  • one backup every 10 days for the previous 100 days (~10 copies)
  • one backup every 100 days for the lifetime of the project (~15 copies)

(currently, we are keeping all of the old copies with a linear spacing...)
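One way to express the planned thinning schedule is sketched below. This is an interpretation, not the implemented policy: the tier boundaries are read cumulatively (5, 15, and 115 days of age), the oldest dump in each time bucket is the one kept, and buckets are aligned to a fixed epoch so that the selection stays stable from one run to the next. All names are hypothetical.

```python
from datetime import datetime

# (upper age bound in days, bucket width in days); a width of None keeps all.
# Boundaries are an assumed cumulative reading of the list above.
TIERS = [
    (5, None),            # past 5 days: keep every backup
    (15, 1),              # previous ~10 days: one per day
    (115, 10),            # previous 100 days: one per 10 days
    (float("inf"), 100),  # older: one per 100 days
]
EPOCH = datetime(1970, 1, 1)  # fixed origin for bucket alignment


def select_backups_to_keep(timestamps, now):
    """Return the set of backup timestamps that the schedule retains."""
    keep = set()
    seen_buckets = set()
    for stamp in sorted(timestamps):  # oldest first, so the oldest wins a bucket
        age_days = (now - stamp).total_seconds() / 86400
        # Find the first tier whose upper bound exceeds the backup's age.
        width = next(w for bound, w in TIERS if age_days < bound)
        if width is None:
            keep.add(stamp)
            continue
        bucket = (width, (stamp - EPOCH).days // width)
        if bucket not in seen_buckets:
            seen_buckets.add(bucket)
            keep.add(stamp)
    return keep
```

Anything not in the returned set would be a candidate for deletion. Keeping the oldest dump per bucket means a dump retained at the daily tier is also the one that survives when its bucket widens to 10 days.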

Backup Monitoring

Log files (it is best to check the relevant crontab, though):

  • gpc1:
    ippc02:/export/ippc02.0/mysql-dumps/gpc1_dump.log
    ipp001:/home/panstarrs/ipp/mysql-dump/gpc1_install.log
    
  • nebulous:
    ippdb02:/export/ippdb02.0/mysql-dumps/neb_dump.log
    ipp001:/home/panstarrs/ipp/mysql-dump/neb_copy.log
    

pbzip2

Parallel bzip2 is used for compressing the dumps.

Its web site is here: http://www.compression.ca/pbzip2/

It can be found (on the production cluster) in ~ipp/softwares and it is installed in ~ipp/local.
