wiki:Processing_Log

Version 34 (modified by bills, 16 years ago)

--

This page contains a running log of the "special activities" of the IPP Processing Czar.

2010-06-09 Bill

Bad weather last night; the telescope never opened. Took the opportunity to evaluate the existing faults and unfinished processing.

  • chip processing was generating lots of errors due to missing detrends. It turns out that some engineering exposures got made with obs_mode = '3PI'. Set these chipRuns to drop.
  • camRun 85769 repeatedly failed (exp_name o5354g0523o). The jpeg images for exposures around this time have lots of missing chips. Observing log says 'g0490o - g0523o Most of these exposures were taken in twilight and need to be redone'. Set camRun to drop with a note to this effect.
  • 4 warpSkyfiles were repeatedly failing because one of the input images (chip_id 100112, class_id XY17) was corrupted. Re-ran chip_imfile.pl by hand, reverted warp, and it completed. The command line was:
     chip_imfile.pl --no-update --verbose --exp_id 176615 --chip_id 100112 --chip_imfile_id 5832194 --class_id XY17 --uri neb://ipp026.0/gpc1/20100603/o5350g0442o/o5350g0442o.ota17.fits --camera GPC1 --run-state new --deburned 0 --outroot neb://ipp026.0/gpc1/ThreePi.nt/2010/06/03/o5350g0442o.176615/o5350g0442o.176615.ch.100112 --redirect-output --dbname gpc1
    
    
  • What!? camRun 85769 got changed from drop to new! Looks like camtool -revertprocessedexp is obsolete. Paul investigated and fixed in trunk.
  • 4 of the twilight exposures from MJD 55354 failed at warp stage. Set the corresponding runs' state to 'drop'.
  • Eric Morganson reports that some postage stamp requests he submitted last night (HST) didn't complete. Turns out each one referenced some warpRuns in an unexpected state: state = 'full', magicked = -1. Fixed by setting the corresponding magicDSRuns to state new. This shouldn't have happened. Magicked gets set to -1 when cleaned, and at update time, since the chips are magicked, the warps are magicked. These were the only runs in this state (at any stage) so I let this go for now.
  • Investigated incomplete runs
    • warp - 38 runs were descendants of camRuns with poor quality. This is a bug in camtool: it should not advance a run to fake if the quality is poor. Set to drop. Checked fix to camtool into the trunk.
    • stackRun 116184 gets 'no fake sources suitable for PSF fitting'
    • distRun 177188 component X15 from chipRun 94549 is corrupt (streaksremove output)
    • 117 magicDSRuns were in state new but the corresponding warpRuns had been purged. Set to goto_cleaned.
    • magicDSRun 52543 for camera stage cannot proceed because the chipRun has been cleaned. Set to goto_cleaned.
    • magicDSRun 162057 for chipRun 99733 cannot proceed because the camera mask file for chip XY17 is corrupt (camRun 83891)
    • magicDSRun 152644 for diffRun 56728 repeatedly fails on skycell.2266.020 with fault 4. The program segvs due to a bug in the function censorSources. It looks like the input cmf file may be corrupt. Fix to streaksremove committed to the trunk.
    • 4 diff stage distRuns were in state new but the corresponding diffRuns had been cleaned. Set to goto_cleaned.
    • 49 diffRuns in state new but the corresponding warpRuns have been cleaned.
    • Several magicRuns were in state new with diffRuns that have been cleaned. Set to drop. Changed magictool to allow state = 'drop'.
    • 23 magicRuns were in state new whose corresponding diffRuns were in state 'new'. This is quite bizarre. The data_group was ThreePi.20100XXX where XXX is one of 221, 222, 223, 227, 228, 301. Set magicRun.state to drop.
    • diffRun 58718 and stackRun 116037 are blocked because one of their inputs, warpSkyfile 70847 skycell.075, is corrupt. Reprocessed with the command:
      warp_skycell.pl --no-update --verbose --warp_id 70847 --warp_skyfile_id 6667056 --skycell_id skycell.075 --tess_dir MD06 --outroot neb://ipp036.0/gpc1/MD06.nt/2010/06/06/o5353g0124o.178114/o5353g0124o.178114.wrp.70847.skycell.075 --run-state new --camera GPC1
      

2010-06-10 Bill

High winds on the summit again. No science exposures.

  • 3 postage stamp requests got stuck. There were a couple of problems with destreaking that needed some manual intervention.
    • a couple of runs failed at the end due to a database deadlock updating the magic status of a chipProcessedImfile. After reverting these completed.
    • some others failed due to an inconsistency in the database. The diffRuns that were the input to the magic analysis were in state 'cleaned' but the data_state of the skyfiles was 'full'. Because of this, magic_destreak.pl assumed that the skyfiles were available to be used for the 'diffed pixel' calculations, but the files didn't exist. About 4000 diffRuns were in this state. I updated the database by hand and ran revert on destreak. Upon re-running, the script found the data_state as 'cleaned', so it created temporary skeleton skycells.
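The run-state versus data_state mismatch described above can be detected with a simple join. Below is a minimal sketch against an in-memory SQLite database, not the real gpc1 MySQL schema. The skyfile table name `diffSkyfile` and its columns are assumptions made for illustration; only the `state` and `data_state` values come from the log entry.

```python
import sqlite3


def find_state_mismatches(conn):
    """Return diff_ids whose run is 'cleaned' while a skyfile still claims 'full'."""
    rows = conn.execute(
        """
        SELECT DISTINCT diffRun.diff_id
        FROM diffRun
        JOIN diffSkyfile ON diffSkyfile.diff_id = diffRun.diff_id
        WHERE diffRun.state = 'cleaned'
          AND diffSkyfile.data_state = 'full'
        ORDER BY diffRun.diff_id
        """
    ).fetchall()
    return [row[0] for row in rows]


def make_demo_db():
    """Build a toy database; the schema and rows are hypothetical."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE diffRun (diff_id INTEGER PRIMARY KEY, state TEXT);
        CREATE TABLE diffSkyfile (diff_id INTEGER, skycell_id TEXT, data_state TEXT);
        INSERT INTO diffRun VALUES (1, 'cleaned'), (2, 'cleaned'), (3, 'full');
        INSERT INTO diffSkyfile VALUES
            (1, 'skycell.001', 'full'),
            (1, 'skycell.002', 'cleaned'),
            (2, 'skycell.001', 'cleaned'),
            (3, 'skycell.001', 'full');
        """
    )
    return conn


if __name__ == "__main__":
    # Only run 1 is inconsistent: the run is cleaned but one skyfile says 'full'.
    print(find_state_mismatches(make_demo_db()))
```

Running a check like this before reverting destreak faults would show how many runs need their skyfile data_state corrected by hand.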


2010-06-14 15:05 Chris

  • Shut down cleanup pantasks as requested via phone.

2010-06-15 Paul

  • Camera run 85769 failing because only 3 chips have good 'quality' flag: mysql> update camRun set state = 'drop' where cam_id = 85769;
  • Burntooling old data for MD04 reference stack:
    pantasks: ns.add.date 2009-12-21
    pantasks: ns.add.date 2009-11-28
    pantasks: ns.add.date 2009-11-26
    pantasks: ns.add.date 2009-12-02
    

2010-06-25 Paul

  • Manually re-running corrupted files:
    chip_imfile.pl --exp_id 182679 --chip_id 105326 --chip_imfile_id 6145034 --class_id XY17 --uri neb://ipp026.0/gpc1/20100617/o5364g0161o/o5364g0161o.ota17.fits --camera GPC1 --outroot neb://any/gpc1/ThreePi.nt/2010/06/17/o5364g0161o.182679/o5364g0161o.182679.ch.105326 --run-state new --no-update --verbose --redirect-output --dbname gpc1
    chip_imfile.pl --exp_id 184898 --chip_id 106408 --chip_imfile_id 6209945 --class_id XY05 --uri neb://ipp023.0/gpc1/20100620/o5367g0467o/o5367g0467o.ota05.fits --camera GPC1 --outroot neb://any/gpc1/STS.nt/2010/06/20/o5367g0467o.184898/o5367g0467o.184898.ch.106408 --run-state new --no-update --verbose --redirect-output --dbname gpc1
    camera_exp.pl --exp_tag o5371g0244o.186524 --cam_id 91658 --camera GPC1 --outroot neb://any/gpc1/ThreePi.nt/2010/06/24/o5371g0244o.186524/o5371g0244o.186524.cm.91658 --run-state new --no-update --verbose --redirect-output --dbname gpc1
    camera_exp.pl --exp_tag o5369g0208o.185868 --cam_id 91339 --camera GPC1 --outroot neb://any/gpc1/ThreePi.nt/2010/06/22/o5369g0208o.185868/o5369g0208o.185868.cm.91339 --run-state new --no-update --verbose --redirect-output --dbname gpc1
    camera_exp.pl --exp_tag o5366g0349o.184150 --cam_id 90117 --camera GPC1 --outroot neb://any/gpc1/ThreePi.nt/2010/06/19/o5366g0349o.184150/o5366g0349o.184150.cm.90117 --run-state new --no-update --verbose --redirect-output --dbname gpc1
    camera_exp.pl --exp_tag o5363g0305o.182212 --cam_id 88947 --camera GPC1 --outroot neb://any/gpc1/ThreePi.nt/2010/06/16/o5363g0305o.182212/o5363g0305o.182212.cm.88947 --run-state new --no-update --verbose --redirect-output --dbname gpc1
    warp_skycell.pl --warp_id 76965 --warp_skyfile_id 7257921 --skycell_id skycell.2332.089 --tess_dir RINGS.V0 --camera GPC1 --outroot neb://any/gpc1/ThreePi.nt/2010/06/22/o5369g0092o.185751/o5369g0092o.185751.wrp.76965.skycell.2332.089 --run-state new --no-update --threads 8 --redirect-output --dbname gpc1
    warp_skycell.pl --warp_id 77486 --warp_skyfile_id 7309832 --skycell_id skycell.1404.127 --tess_dir RINGS.V0 --camera GPC1 --outroot neb://any/gpc1/ThreePi.nt/2010/06/24/o5371g0408o.186686/o5371g0408o.186686.wrp.77486.skycell.1404.127 --redirect-output --dbname gpc1 --verbose --threads 8 --run-state new --no-update
    diff_skycell.pl --diff_id 61705 --skycell_id skycell.1404.127 --outroot neb://any/gpc1/ThreePi.nightlyscience/2010/06/24/RINGS.V0/skycell.1404.127/RINGS.V0.skycell.1404.127.dif.61705 --run-state new --diff_skyfile_id 3316579 --no-update --redirect-output --verbose --threads 8 --dbname gpc1
    camera_exp.pl --exp_tag o5348g0312o.175971 --cam_id 83891 --camera GPC1 --outroot neb://any/gpc1/ThreePi.nt/2010/06/01/o5348g0312o.175971/o5348g0312o.175971.cm.83891 --dbname gpc1 --redirect-output --verbose --no-update
    
  • Dropped runs that cannot complete because the inputs are not available:
    stacktool -updaterun -set_state drop -stack_id 116184 -dbname gpc1
    magictool -updaterun -state drop -magic_id 20031 -dbname gpc1
    magictool -updaterun -state drop -magic_id 20032 -dbname gpc1
    
    mysql> update diffRun join diffInputSkyfile using(diff_id) join warpRun on warp1 = warp_id set diffRun.state = 'drop' where diffRun.label = 'ThreePi.nightlyscience' and diffRun.state = 'new' and warpRun.state = 'cleaned';
    Query OK, 48 rows affected (0.06 sec)
    Rows matched: 48  Changed: 48  Warnings: 0
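Before issuing a bulk update like the one above, it can help to list the affected runs first. The sketch below mirrors the join from the mysql statement (diffRun, diffInputSkyfile, warpRun, with warp1 = warp_id) but runs against a toy in-memory SQLite database. The helper names and demo rows are hypothetical, and SQLite lacks MySQL's UPDATE ... JOIN, so the update is done per diff_id.

```python
import sqlite3

ORPHAN_QUERY = """
    SELECT DISTINCT diffRun.diff_id
    FROM diffRun
    JOIN diffInputSkyfile ON diffInputSkyfile.diff_id = diffRun.diff_id
    JOIN warpRun ON diffInputSkyfile.warp1 = warpRun.warp_id
    WHERE diffRun.label = ?
      AND diffRun.state = 'new'
      AND warpRun.state = 'cleaned'
    ORDER BY diffRun.diff_id
"""


def orphaned_diff_runs(conn, label):
    """List 'new' diffRuns for a label whose input warpRun is already cleaned."""
    return [row[0] for row in conn.execute(ORPHAN_QUERY, (label,))]


def drop_diff_runs(conn, diff_ids):
    """Set the given diffRuns to 'drop' in one transaction; return rows changed."""
    changed = 0
    with conn:  # commits on success, rolls back if an exception is raised
        for diff_id in diff_ids:
            cur = conn.execute(
                "UPDATE diffRun SET state = 'drop' WHERE diff_id = ? AND state = 'new'",
                (diff_id,),
            )
            changed += cur.rowcount
    return changed


def make_demo_db():
    """Toy data only; ids and rows are made up for the example."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE diffRun (diff_id INTEGER PRIMARY KEY, state TEXT, label TEXT);
        CREATE TABLE diffInputSkyfile (diff_id INTEGER, warp1 INTEGER);
        CREATE TABLE warpRun (warp_id INTEGER PRIMARY KEY, state TEXT);
        INSERT INTO warpRun VALUES (100, 'cleaned'), (101, 'full');
        INSERT INTO diffRun VALUES
            (10, 'new', 'ThreePi.nightlyscience'),
            (11, 'new', 'ThreePi.nightlyscience'),
            (12, 'new', 'MD06.nightlyscience'),
            (13, 'full', 'ThreePi.nightlyscience');
        INSERT INTO diffInputSkyfile VALUES
            (10, 100), (10, 100), (11, 101), (12, 100), (13, 100);
        """
    )
    return conn


if __name__ == "__main__":
    conn = make_demo_db()
    orphans = orphaned_diff_runs(conn, "ThreePi.nightlyscience")
    print("would drop:", orphans)
    print("dropped:", drop_diff_runs(conn, orphans))
```

Comparing the length of the listed ids with the "Rows matched" count afterwards gives a quick sanity check that the update touched only the intended runs.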
    

2010-06-30 Paul

  • Manually re-running corrupted files:
    warp_skycell.pl --warp_id 79174 --warp_skyfile_id 7478005 --skycell_id skycell.2008.028 --tess_dir RINGS.V0 --camera GPC1 --outroot neb://any/gpc1/ThreePi.nt/2010/06/28/o5375g0534o.188432/o5375g0534o.188432.wrp.79174.skycell.2008.028 --run-state new --no-update --threads 4 --dbname gpc1 --redirect-output --verbose
    warp_skycell.pl --warp_id 79400 --warp_skyfile_id 7500333 --skycell_id skycell.1477.006 --tess_dir RINGS.V0 --camera GPC1 --outroot neb://any/gpc1/ThreePi.nt/2010/06/29/o5376g0268o.188733/o5376g0268o.188733.wrp.79400.skycell.1477.006 --run-state new --no-update --threads 4 --dbname gpc1 --redirect-output --verbose
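The exposure names and dated output paths in these command lines appear to follow a pattern: the four digits after the leading "o" match the last four digits of the MJD (for example o5354g0523o and MJD 55354, or o5350g0442o filed under 20100603). The sketch below encodes that inference; the 50000 offset and the helper names are assumptions drawn from the examples in this log, not from IPP documentation.

```python
import re
from datetime import date, timedelta

MJD_EPOCH = date(1858, 11, 17)  # calendar date of MJD 0

# Assumed layout: 'o' + 4-digit (MJD - 50000) + 'g' + 4-digit sequence + 'o'
_EXP_NAME = re.compile(r"^o(\d{4})g(\d{4})o$")


def exp_name_to_mjd(exp_name):
    """Infer the integer MJD from an exposure name such as 'o5354g0523o'."""
    match = _EXP_NAME.match(exp_name)
    if match is None:
        raise ValueError(f"unrecognised exposure name: {exp_name!r}")
    return 50000 + int(match.group(1))


def mjd_to_date(mjd):
    """Convert an integer MJD to a calendar date."""
    return MJD_EPOCH + timedelta(days=mjd)


if __name__ == "__main__":
    for name in ("o5350g0442o", "o5375g0534o", "o5376g0268o"):
        mjd = exp_name_to_mjd(name)
        print(name, mjd, mjd_to_date(mjd).strftime("%Y/%m/%d"))
```

The resulting dates line up with the date directories in the outroot paths above, which makes it easy to spot a mistyped exposure name when re-running a file by hand.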
    

2010-07-01 Bill

  • we're out of space. Gene and Chris determined that part of the problem is that the non-destreaked files are being kept around. This effectively doubles the space used. Gene shut off processing. Bill issued some mysql queries to queue existing data for cleanup.
  • 10:00 shut off destreak/distribution and increased the number of nodes doing cleanup
  • MOPS's postage stamp requests were failing. Inverse images were missing. Turned out to be a problem with diff updates. Queued diffRuns for cleanup, killed off the jobs and asked Jan to resubmit.
  • shut everything down to allow /data/ipp005.0 to be mounted from ipp037. Restart went smoothly. Since we have some space now, turned on summit copy and registration. Distribution/destreak and stdscience are still off for the time being.
  • Jan re-submitted 2 requests with 100 jobs each. They required some updates to process. These might be blocked waiting for the previous cleanup to finish.
  • 15:14 turned off destreak cleanup to give diff.cleanup a chance to finish the 498 diff runs pending cleanup. There are 14556 destreak runs left to go.

---

2010-07-28 Bill

  • They reportedly had a good night on the summit. 754 exposures were taken. Some of those were tests.
  • As of 7:45 burntool is running. The MD08.redo warps (V2 tessellation) are running as well. Ganglia is glowing red and nebulous space used is up to 89%. Need to check on cleanup.