PS1 IPP Czar Logs for the week 2010.12.13 - 2010.12.19
Monday : 2010.12.13
A lot of update processing triggered by the postage stamp server has been running for the past several days. Except for some requests for chips with reduction STDSCIENCE_V0, things have been progressing smoothly.
- 12:40 (bills) Added label MD02.2010.rerun to stdscience (562 exposures). About half an hour later pantasks died. Restarted it at 13:31.
- 17:08 (heather) Under my own stdscience, processed darktest%.20101213 to investigate edge NaNs. Queued darktest.201012013, consisting of 100 i-band images, for the static masking (and the autogeneration of the static masks).
Tuesday : 2010.12.14
Update storm seems to have passed during the night.
Many faults from old chipRuns with reduction STDSCIENCE_V0 (chip_id ~= 20000). Many (if not all) of the config dump files for this reduction refer to files that no longer exist. I set the state to goto_purged but I have not yet pulled the trigger and set label to goto_cleaned.
- 08:00 Queued MD02 nightly stacks. Label MD02.2010.rerun. Data group is MD02.V2.$date to distinguish from the MD02 stacks run previously with data group MD02.$date
- 10:40 (Serge) Many diffs have faults. Started revert for diff (through czartool). I don't see anything else.
- 11:00 (Serge) I ran ~heather/sshToNodes.py ipp /usr/local/sbin/nfscheck as ipp, and all nodes report that they are OK.
- 11:20 (Serge) Repeated entry in logs of failed jobs:
I/O error code: 102
 -> pmFPAfileWrite (pmFPAfileIO.c:340): Known programming error Error: file->mode != PM_FPA_MODE_INTERNAL is not true.
 -> pmFPAfileIOChecks (pmFPAfileIO.c:90): I/O error failed WRITE in FPA_AFTER block for PSPHOT.BACKMDL.STDEV
 -> main (ppSub.c:89): I/O error Unable to close files.
Unable to perform ppSub: 2 at /home/panstarrs/ipp/psconfig//ipp-20101206.lin64/bin/diff_skycell.pl line 400.
Gene speaks: "[...] hitting the same failure which [...] i mentioned earlier this morning. let's let them fail [and fix later]"
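When the same failure repeats across many job logs, it can help to reduce each error chain to a short signature and count occurrences. The following is a minimal sketch, not part of the IPP tools: it assumes the chain sits on a single line with frames separated by ` -> `, as in the entry quoted above, and the names `parse_frames`, `failure_signature`, and `tally` are hypothetical.

```python
import re
from collections import Counter

# One frame looks like: "pmFPAfileWrite (pmFPAfileIO.c:340): Known programming error ..."
FRAME_RE = re.compile(r"(?P<func>\w+) \((?P<file>[\w.]+):(?P<line>\d+)\): (?P<msg>.*)")

# The error chain from the 11:20 entry, on one line (assumed log layout).
SAMPLE = (
    "I/O error code: 102 -> pmFPAfileWrite (pmFPAfileIO.c:340): Known programming error "
    "Error: file->mode != PM_FPA_MODE_INTERNAL is not true. -> pmFPAfileIOChecks "
    "(pmFPAfileIO.c:90): I/O error failed WRITE in FPA_AFTER block for PSPHOT.BACKMDL.STDEV "
    "-> main (ppSub.c:89): I/O error Unable to close files."
)


def parse_frames(log_line):
    """Split a one-line error chain on ' -> ' and return (function, file, line) tuples."""
    frames = []
    for part in log_line.split(" -> "):
        match = FRAME_RE.match(part.strip())
        if match:  # the leading "I/O error code: 102" is not a frame and is skipped
            frames.append((match.group("func"), match.group("file"), int(match.group("line"))))
    return frames


def failure_signature(log_line):
    """Use the first frame of the chain as the signature, or None if no frame is found."""
    frames = parse_frames(log_line)
    return frames[0] if frames else None


def tally(log_lines):
    """Count how often each signature occurs in an iterable of error lines."""
    return Counter(failure_signature(line) for line in log_lines)
```

Calling `tally` on the error lines collected from the faulted diff jobs would show whether they all share the `pmFPAfileWrite` signature or whether a second failure mode is hiding among them.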
- 13:14 stopping processing in order to update the build. Need to wait a few minutes to let the running stacks finish.
- 13:41 Rebuild complete; processing restarted. Set label STS.20101202 to inactive temporarily until the fix is confirmed. Reverted the diff faults.
- 14:45 set label STS.20101202 to active. Added MD02.2010.rerun to survey.dist and added label to distribution pantasks.
- 15:48 (Serge) Added ippMonitor as mysql user on ippdb:
CREATE USER 'ippMonitor'@'ipp004.ifa.hawaii.edu' IDENTIFIED BY 'ippMonitor';
GRANT REPLICATION CLIENT ON *.* TO 'ippMonitor'@'ipp004.ifa.hawaii.edu';
FLUSH PRIVILEGES;
Modified ippMonitor/raw/site.php so that czar tool can check replication status on ippdb02 (SVN 30034).
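The REPLICATION CLIENT privilege is what lets a MySQL account run SHOW SLAVE STATUS. The actual check added to site.php is not shown here; the sketch below only illustrates, in Python, how one row of that output might be judged. The field names Slave_IO_Running, Slave_SQL_Running, and Seconds_Behind_Master come from MySQL's output, while the function name `replication_healthy` and the 300-second lag threshold are assumptions made for illustration.

```python
def replication_healthy(status, max_lag_seconds=300):
    """Judge one SHOW SLAVE STATUS row, given as a dict (or None if the query returned no row).

    Returns (ok, reason). The lag threshold is an illustrative default, not a site setting.
    """
    if status is None:
        return False, "replication not configured"
    if status.get("Slave_IO_Running") != "Yes":
        return False, "I/O thread not running"
    if status.get("Slave_SQL_Running") != "Yes":
        return False, "SQL thread not running"
    lag = status.get("Seconds_Behind_Master")
    if lag is None:  # MySQL reports NULL when the lag cannot be determined
        return False, "lag unknown"
    if int(lag) > max_lag_seconds:
        return False, "replica is %d s behind" % int(lag)
    return True, "OK"
```

Taking the row as a plain dict keeps the decision logic separate from the database connection, so it can be exercised without access to ippdb02.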
