wiki:Moped_accident

Purpose

This page records the procedure in case there is a problem and I can't do things myself, for example after a moped accident. Moreover, searching for "moped" should be a good way to find this wiki page in a couple of years, when we will have to do this again...

Related wikipages

Back to the related czar log wikipage (2012-05-30)

Back to the related czar log wikipage (2012-06-04)

Ingestion statistics

Preparation

  • DONE change apache configuration files on ippc01-ippc10: I simply replaced nebulous.ipp.ifa.hawaii.edu with ippdb02.ipp.ifa.hawaii.edu in /etc/apache2/modules.d/apache2-mod_perl-startup.pl. Note that it would also have been possible to change the value in the DNS (but Gene didn't want that). A copy of each original file is kept as /etc/apache2/modules.d/apache2-mod_perl-startup.pl.20120529_ippdb00
  • DONE (scripts are finished): check if jt scripts are still running on ippc19; if so, put them to sleep
  • DONE make sure ippdb02 is synchronized with ippdb00, stop slave on ippdb02
  • DONE make sure apache is connected to ippdb02 (check process table): restart apache servers on ippc01-ippc10
  • N/A wake up jt scripts: if they crashed, tell jt
  • DONE kill nebdiskd on ippdb00
  • DONE start nebulous mysqldump on ippdb00 (use pbzip2)
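
The configuration change in the first step could be scripted. The following is a minimal sketch, not the command that was actually used: the function name switch_nebulous_host and its arguments are hypothetical, and it assumes the host name appears literally in the file.

```shell
# Hypothetical helper: point a config file at another database host,
# keeping a copy of the original next to it.
# Usage: switch_nebulous_host FILE OLD_HOST NEW_HOST BACKUP_SUFFIX
switch_nebulous_host() {
    file=$1; old=$2; new=$3; suffix=$4
    # Keep the original so it can be restored once the work is over.
    cp -p "$file" "$file.$suffix" || return 1
    # Escape the dots so that sed matches the host name literally.
    old_re=$(printf '%s' "$old" | sed 's/\./\\./g')
    # Rewrite the live file from the backup copy.
    sed "s/$old_re/$new/g" "$file.$suffix" > "$file"
}

# Example with the names from this page (to be run on each of ippc01-ippc10):
# switch_nebulous_host /etc/apache2/modules.d/apache2-mod_perl-startup.pl \
#     nebulous.ipp.ifa.hawaii.edu ippdb02.ipp.ifa.hawaii.edu 20120529_ippdb00
```

Restoring the configuration later is then a matter of copying the backup file back over the live one.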

2012-05-29 19:00 The dump is running in a screen session named nebulous_dump (i.e. reconnect on ippdb00 with screen -r nebulous_dump). The dump file is /export/ippdb00.0/nebulous_dumps/nebulous-20120529_fulldump.sql.bz2 and should have a size of about 26G (according to the last successful dump of ippdb02, started this morning).
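
For reference, a dump of this kind could be started along the following lines. This is a sketch, not the command that was actually run: the mysqldump options and the helper names dump_file_for and dump_nebulous are assumptions; only the output directory, the file name pattern and the use of pbzip2 come from this page.

```shell
# Hypothetical helper: build the dump file name used on this page
# from a date stamp (YYYYMMDD).
dump_file_for() {
    printf '/export/ippdb00.0/nebulous_dumps/nebulous-%s_fulldump.sql.bz2\n' "$1"
}

# Hypothetical helper: dump the nebulous database and compress it in parallel.
# --single-transaction takes a consistent InnoDB snapshot without locking,
# --quick streams rows instead of buffering whole tables in memory.
dump_nebulous() {
    mysqldump -u root -p --single-transaction --quick nebulous \
        | pbzip2 -c > "$(dump_file_for "$1")"
}

# Example (inside the screen session): dump_nebulous "$(date +%Y%m%d)"
```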

2012-05-29 19:50 The dump should be finished around 0:30 (6 GB in 54 minutes / 27 GB for the complete dump).

2012-05-29 21:50 The dumping rate has dropped down to about 10 GB in 3 hours... Should be finished around 3am?
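
The 3am estimate is a linear extrapolation from the amount written so far. A minimal sketch of that arithmetic, assuming the rate stays constant (the function name remaining_minutes is hypothetical):

```shell
# Hypothetical helper: minutes left for a dump, assuming a constant rate.
# Usage: remaining_minutes DONE_GB ELAPSED_MINUTES TOTAL_GB
remaining_minutes() {
    awk -v d="$1" -v e="$2" -v t="$3" \
        'BEGIN { printf "%.0f\n", (t - d) * e / d }'
}

# 10 GB written in 3 hours, 27 GB expected in total:
remaining_minutes 10 180 27
```

With these figures the result is 306 minutes, i.e. a little over 5 hours after 21:50.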

2012-05-30 06:20 Dump finished. For info:

# stat nebulous-20120529_fulldump.sql.bz2
  File: `nebulous-20120529_fulldump.sql.bz2'
  Size: 27041967453     Blocks: 52816600   IO Block: 4096   regular file
Device: 804h/2052d      Inode: 1257        Links: 1
Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2012-05-29 18:54:50.019218105 -1000
Modify: 2012-05-30 04:27:42.479312043 -1000
Change: 2012-05-30 04:27:42.479312043 -1000

Once neb dump is finished

  • DONE copy reference dump to ippdb02 (and from ippdb02 to ipp001)
   /export/ippdb02.0/mysql-dumps/backups/20120529/nebulous-20120529_fulldump.sql.bz2
     Took 08:02 at 53.5MB/s (scp output)
   /export/ipp001.0/ipp/mysql-dumps/Archives/2012/nebulous-20120529_fulldump.sql.bz2
     Took 12:05 at 35.7MB/s (scp output)
  • DONE (2012-05-30 06:34) stop the mysql server
  • DONE delete nebulous database on ippdb00

For info, contents of /var/lib/mysql/nebulous before deletion:

 # ls -la
total 1143656632
drwx------ 2 mysql mysql         4096 Apr 25 13:42 .
drwxr-x--- 6 mysql root         28672 May 29 08:15 ..
-rw-rw---- 1 mysql mysql         8632 Sep 16  2011 cabinet.frm
-rw-rw---- 1 mysql mysql       131072 Nov 16  2011 cabinet.ibd
-rw-rw---- 1 mysql mysql           65 Nov 23  2009 db.opt
-rw-rw---- 1 mysql mysql         8632 Sep 16  2011 deleted.frm
-rw-rw---- 1 mysql mysql    348127232 May 24 14:43 deleted.ibd
-rw-rw---- 1 mysql mysql         8640 Sep 16  2011 directory.frm
-rw-rw---- 1 mysql mysql  22993174528 May 29 17:19 directory.ibd
-rw-rw---- 1 mysql mysql         8722 Mar  2  2011 instance.frm
-rw-rw---- 1 mysql mysql 558735818752 May 29 18:00 instance.ibd
-rw-rw---- 1 mysql mysql         8637 Sep 16  2011 lock_record.frm
-rw-rw---- 1 mysql mysql       114688 Sep 16  2011 lock_record.ibd
-rw-rw---- 1 mysql mysql         8704 Sep 16  2011 log.frm
-rw-rw---- 1 mysql mysql        98304 Sep 16  2011 log.ibd
-rw-rw---- 1 mysql mysql         8748 Apr 25 13:42 lost_instances.frm
-rw-rw---- 1 mysql mysql       688128 Apr 25 20:33 lost_instances.ibd
-rw-rw---- 1 mysql mysql         8898 Sep 16  2011 mountedvol.frm
-rw-rw---- 1 mysql mysql       229376 May 29 18:43 mountedvol.ibd
-rw-rw---- 1 mysql mysql         8723 Mar  3  2011 storage_object.frm
-rw-rw---- 1 mysql mysql 493929627648 May 29 18:00 storage_object.ibd
-rw-rw---- 1 mysql mysql         8716 Sep 16  2011 storage_object_attr.frm
-rw-rw---- 1 mysql mysql  50675580928 May 29 18:00 storage_object_attr.ibd
-rw-rw---- 1 mysql mysql         8624 Sep 16  2011 storage_object_xattr.frm
-rw-rw---- 1 mysql mysql  44417679360 May 29 17:30 storage_object_xattr.ibd
-rw-rw---- 1 mysql mysql         8870 Sep 16  2011 volume.frm
-rw-rw---- 1 mysql mysql       212992 May 24 14:42 volume.ibd
# du
1143656584      .
  • DONE clean mysql on ippdb00 (remove old binlogs)
  • DONE start the mysql server
  • RUNNING ingest neb dump on ippdb00
    Running in screen session 'nebulous_dump'
    Started: Wed May 30 06:45:36 HST 2012
    Finished: Crashed

Since ippdb00 is not finished

  • DONE We will use ippdb02 as a backup. I flushed the logs on ippdb02. The binlogs are now: mysqld-bin.000680 (and mysqld-relay-bin.003160)
  • Ingestion crashed (see czar log)
  • DONE Ingestion restarted with the slow query log disabled
        Running in screen session 'ingest' (root account)
        Started: May 31 15:14:39
        Finished: Jun 05 00:02:42
    
  • Data for ingestion statistics:
    • Directory contents
      ippdb00 ~ # ls -l /var/lib/mysql/nebulous/
      total 932014828
      -rw-rw---- 1 mysql mysql         8632 May 31 15:14 cabinet.frm
      -rw-rw---- 1 mysql mysql       131072 May 31 15:14 cabinet.ibd
      -rw-rw---- 1 mysql mysql         8632 May 31 15:14 deleted.frm
      -rw-rw---- 1 mysql mysql    289406976 May 31 15:15 deleted.ibd
      -rw-rw---- 1 mysql mysql         8640 May 31 15:14 directory.frm
      -rw-rw---- 1 mysql mysql  21583888384 May 31 15:32 directory.ibd
      -rw-rw---- 1 mysql mysql         8722 May 31 15:30 instance.frm
      -rw-rw---- 1 mysql mysql 452494098432 Jun  5 03:10 instance.ibd
      -rw-rw---- 1 mysql mysql         8637 Jun  4 05:26 lock_record.frm
      -rw-rw---- 1 mysql mysql       114688 Jun  4 05:32 lock_record.ibd
      -rw-rw---- 1 mysql mysql         8704 Jun  4 05:26 log.frm
      -rw-rw---- 1 mysql mysql        98304 Jun  4 05:32 log.ibd
      -rw-rw---- 1 mysql mysql         8748 Jun  4 05:26 lost_instances.frm
      -rw-rw---- 1 mysql mysql       688128 Jun  4 05:32 lost_instances.ibd
      -rw-rw---- 1 mysql mysql         8898 Jun  4 05:26 mountedvol.frm
      -rw-rw---- 1 mysql mysql       229376 Jun  4 05:32 mountedvol.ibd
      -rw-rw---- 1 mysql mysql         8723 Jun  4 05:26 storage_object.frm
      -rw-rw---- 1 mysql mysql 412044230656 Jun  5 03:08 storage_object.ibd
      -rw-rw---- 1 mysql mysql         8716 Jun  4 19:18 storage_object_attr.frm
      -rw-rw---- 1 mysql mysql  37694210048 Jun  4 22:33 storage_object_attr.ibd
      -rw-rw---- 1 mysql mysql         8624 Jun  4 22:32 storage_object_xattr.frm
      -rw-rw---- 1 mysql mysql  30274486272 Jun  5 00:02 storage_object_xattr.ibd
      -rw-rw---- 1 mysql mysql         8870 Jun  4 23:59 volume.frm
      -rw-rw---- 1 mysql mysql       212992 Jun  5 00:02 volume.ibd
      
    • Table contents:
      directory: 16895977 entries in 911 seconds (from "2012-05-31 15:14:56" to "2012-05-31 15:30:07")
      instance: 1216215431 entries in 309374 seconds (from "2012-05-31 15:30:07" to "2012-06-04 05:26:21")
      storage_object: 944739798 entries in 49940 seconds (from "2012-06-04 05:26:22" to "2012-06-04 19:18:42")
      storage_object_attr: 944739798 entries in 11623 seconds (from "2012-06-04 19:18:42" to "2012-06-04 22:32:25")
      storage_object_xattr: 274638705 entries in 5250 seconds (from "2012-06-04 22:32:25" to "2012-06-04 23:59:55")
      
    • The full ingestion created 450 binlogs (see /export/ippdb00.0/mysql_recovery/ippdb00_after_recovery_20120605 for the history).
    • /var/lib/mysql contents:
      ippdb00 mysql # du
      2210620	./log
      1688	./test
      784	./mysql
      932014832	./nebulous
      1416110816	.
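
To compare tables, the entry counts and durations listed under the table contents can be turned into ingestion rates. A minimal sketch (the function name rows_per_second is hypothetical; the figures are copied from the list above):

```shell
# Hypothetical helper: rows ingested per second for one table.
# Usage: rows_per_second ENTRIES SECONDS
rows_per_second() {
    awk -v n="$1" -v s="$2" 'BEGIN { printf "%.0f\n", n / s }'
}

# Table name, number of entries, ingestion time in seconds.
for line in \
    "directory 16895977 911" \
    "instance 1216215431 309374" \
    "storage_object 944739798 49940" \
    "storage_object_attr 944739798 11623" \
    "storage_object_xattr 274638705 5250"
do
    set -- $line   # split the line into $1 (table), $2 (entries), $3 (seconds)
    printf '%-22s %s rows/s\n' "$1" "$(rows_per_second "$2" "$3")"
done
```

The instance table stands out: roughly 3,900 rows per second, against roughly 18,500 for directory, which is why it accounts for most of the ingestion time.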
      

Once ippdb00 ingestion is complete

  • DONE stop pantasks (DONE), the pstamp apache server (DONE) and the apache servers (DONE)
  • DONE flush the logs on ippdb02 after making sure no one is connected to the ippdb02 mysql server
  • DONE copy binlogs from ippdb02 to ippdb00 (copied to /export/ippdb00.0/mysql_recovery/ippdb02_binlogs_20120605)
  • DONE enable mysql slow logs
  • DONE restart the mysql server on ippdb00.
  • DONE play binlogs taken from ippdb02 on ippdb00 (Started at 2012-06-06 10:43; Complete at 2012-06-07 10:00)
    • In order not to block processing, I asked Gavin to leave the DNS configuration as it is, so that nebulous.ipp.ifa.hawaii.edu points to ippdb02
    • For the first binlog I executed: "mysqlbinlog <binlog> | mysql -u root -p nebulous", which took 3 hours: it was slowed down by COMMIT statements happening every other line or so.
    • For the second binlog I executed: "mysqlbinlog <binlog> | grep -v 'COMMIT/*!*/' | mysql -u root -p nebulous". That took 75 minutes: it was slowed down by BEGIN statements.
    • Further ingestions were executed with: "mysqlbinlog <binlog> | grep -v 'COMMIT/*!*/' | grep -v BEGIN | mysql -u root -p nebulous".
    • Note: There are between 40 and 45 million lines in each binlog when shown as text. There are roughly: SET statements: 7-8M; INSERT: 1.5-2.5M; DELETE: 0.2-0.5M; UPDATE: 1M; REPLACE: 0.1-0.3M. The rest is either COMMIT, BEGIN or comments.
    • Note: There were offending statements in binlog 695: "CREATE USER" (I restarted nebdiskd and had to use that statement) as well as "use nebulous" (I don't see why it is offending, but it crashed the replay). It seems best to filter out those two statements too (as well as COMMIT and BEGIN).
  • DONE compress all possible logs on ippdb00: I compressed the mysql slowlog. I didn't find the nebdiskd log though...
  • DONE stop pantasks, apache on ippc17, apache servers
  • DONE play the last binlog
  • DONE write down new master coordinates
    mysql> SHOW MASTER STATUS;
    +-------------------+-----------+--------------+------------------+
    | File              | Position  | Binlog_Do_DB | Binlog_Ignore_DB |
    +-------------------+-----------+--------------+------------------+
    | mysqld-bin.000471 | 248051284 |              |                  | 
    +-------------------+-----------+--------------+------------------+
    1 row in set (0.00 sec)
    
    
  • DONE ask Gavin to change the DNS so that nebulous.ipp.ifa.hawaii.edu points to ippdb00 again
  • DONE restart nebdiskd
  • DONE restore apache configuration files on ippc01-ippc10 (backup files are called /etc/apache2/modules.d/apache2-mod_perl-startup.pl.20120529_ippdb00)
  • DONE restart apache servers
  • DONE restart ipp pantasks (Heather takes care of ippdvo stuff)
  • DONE delete neb dump and binlogs from ippdb00 and ippdb02
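
The binlog replay notes above (filter out COMMIT, BEGIN, "CREATE USER" and "use nebulous") can be gathered into a single filter. This is a sketch under assumptions, not the exact pipeline that was run: the function name filter_binlog_sql is hypothetical, and unlike the grep commands quoted above the patterns are anchored at the start of the line, which assumes that mysqlbinlog prints each of these statements at the beginning of a line.

```shell
# Hypothetical helper: drop the statements that slowed down or broke the replay.
# The patterns are anchored at the start of the line so that row data which
# merely contains one of these words is left alone.
filter_binlog_sql() {
    grep -v \
        -e '^COMMIT' \
        -e '^BEGIN' \
        -e '^CREATE USER' \
        -e '^use nebulous'
}

# Replay of one binlog, as in the notes above:
# mysqlbinlog <binlog> | filter_binlog_sql | mysql -u root -p nebulous
```

Dropping "use nebulous" is harmless here only because the database name is already given on the mysql command line.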

Then on ippdb02

  • DONE set master coordinates
  • DONE start slave
  • DONE activate neb dump in root crontab
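
Setting the master coordinates and starting the slave comes down to two SQL statements. A minimal sketch, assuming the slave already knows its master host and replication user from its earlier configuration (the function name change_master_sql is hypothetical; the coordinates are the ones written down from SHOW MASTER STATUS on ippdb00):

```shell
# Hypothetical helper: print the SQL that points a slave at the given
# binlog coordinates of its master and restarts replication.
# Usage: change_master_sql BINLOG_FILE BINLOG_POSITION
change_master_sql() {
    printf "CHANGE MASTER TO MASTER_LOG_FILE='%s', MASTER_LOG_POS=%s;\n" "$1" "$2"
    printf 'START SLAVE;\n'
}

# Coordinates reported by SHOW MASTER STATUS on ippdb00:
change_master_sql mysqld-bin.000471 248051284

# To apply on the slave: change_master_sql <file> <position> | mysql -u root -p
```

Printing the statements first, rather than piping them straight into mysql, leaves room to check the coordinates before replication is started.
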
Last modified on Jun 7, 2012, 12:04:43 PM