wiki:diff_fixits

Version 2 (modified by bills, 16 years ago)

--

We have two diff faults today, September 7, 2010. Let's fix them. The techniques described here can
also be used for other stages.

***************************************

Case 1.

diff_id 76390 skycell.1194.140 fails because it can't read its input warp file on ipp045.
(This node had some troublesome days last month.)

Confirm that the file is corrupt with funpack:

    funpack -S /data/ipp045.0/nebulous/2d/e8/391435564.gpc1:SweetSpot.nt:2010:08:08:o5416g0096o.204249:SR_o5416g0096o.204249.wrp.90720.skycell.1194.140.wt.fits > broken.fits

    FITSIO status = 108: error reading from FITS file
    Error reading data buffer from file:
    /data/ipp045.0/nebulous/2d/e8/391435564.gpc1:SweetSpot.nt:2010:08:08:o5416g0096o
    .204249:SR_o5416g0096o.204249.wrp.90720.skycell.1194.140.wt.fits
    Error reading elements 1 thru 2147418112 from column 1 (ffgclb).
    error reading compressed byte stream from binary table

Yep, it's garbage.
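
The same check can be wrapped in a small shell helper when several files on a node are suspect. This is only a sketch: it assumes funpack exits with a non-zero status when it hits a read error like the one above, and the names fits_is_readable, report_fits and the FUNPACK override variable are hypothetical, not part of the IPP tools.

```shell
# Hypothetical helpers, not part of the IPP tools.
# Assumption: funpack returns a non-zero exit status on a read error.
# FUNPACK may be set to point at a different funpack binary.
fits_is_readable() {
    # -S sends the uncompressed image to stdout; discard it and keep
    # only the exit status.
    "${FUNPACK:-funpack}" -S "$1" > /dev/null 2>&1
}

# Print one status line per file so the results are easy to grep.
report_fits() {
    if fits_is_readable "$1"; then
        echo "OK      $1"
    else
        echo "CORRUPT $1"
    fi
}
```

Calling report_fits on each suspect path gives a list that can be filtered for CORRUPT.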

To regenerate the file, run the new script that I am going to finish and add to the tools directory
(or perhaps I will build the feature into warp_skycell.pl):

    perl runwarpskycell.pl --warp_id 90720 --skycell_id skycell.1194.140 --no-update

We use --no-update because we don't want to update the database: the warpRun is full and the
warpSkyfile is in the database. We just want to fix the bits.

Unfortunately, runwarpskycell.pl fails because the input chips are not found.

This is SweetSpot data from early August being diffed against the stacks.

To find out why the data can't be found, use

    chiptool -processedimfile -chip_id 121251 -class_id xy52

to list the state of one of the required inputs.

We find that the chipRun has been cleaned. (The data was previously processed as warp-warp diffs.
We've been keeping the warps around to run against the stacks, which were made at the end of the month.)

We can get the chip images back with

    chiptool -setimfiletoupdate -chip_id 121251 -set_label update

For extra credit we could use

    warptool -scmap -warp_id 90720 -skycell_id skycell.1194.140

to find the subset of chips to process, but I didn't. We'd then queue just those chips for update,
one at a time, by adding -class_id xy?? to the command:

    chiptool -setimfiletoupdate -chip_id 121251 -set_label update -class_id xy52
    chiptool -setimfiletoupdate -chip_id 121251 -set_label update -class_id xy53
    etc.

Since we'll clean the data soon, why bother?
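
If the subset of chips were known, the per-chip commands could be generated with a loop instead of being typed one by one. The sketch below only prints the commands (a dry run). The function name queue_chip_updates is hypothetical, and the class_id list would have to come from the warptool -scmap output, whose format is not shown here.

```shell
# Hypothetical helper: print one chiptool command per class_id.
# Nothing is executed; pipe the output to sh once it looks right.
queue_chip_updates() {
    chip_id=$1
    shift
    for class_id in "$@"; do
        echo "chiptool -setimfiletoupdate -chip_id $chip_id -set_label update -class_id $class_id"
    done
}

# Example with the two class_ids named above.
queue_chip_updates 121251 xy52 xy53
```

Printing first and executing second keeps a record of exactly which chips were queued.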

Wait for the updates to complete, which takes 5 minutes or so. Check with

    chiptool -listrun -chip_id 121251

to see that the state is full.
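
Rather than rerunning the check by hand, the wait can be scripted. This is a sketch under one loud assumption: that chiptool -listrun prints the state as a "name TYPE value" row, the way the difftool output further down does. The function name wait_for_full and its arguments are hypothetical.

```shell
# Hypothetical helper: rerun a command until its output contains a "state"
# row whose last field is "full", or give up after max_tries attempts.
# Assumption: rows look like "state  STR  full".
# Usage: wait_for_full MAX_TRIES DELAY_SECONDS command [args...]
wait_for_full() {
    max_tries=$1
    delay=$2
    shift 2
    try=0
    while [ "$try" -lt "$max_tries" ]; do
        # awk exits 0 only if a matching "state" row was seen.
        if "$@" | awk '$1 == "state" && $NF == "full" { found = 1 } END { exit !found }'; then
            return 0
        fi
        try=$((try + 1))
        sleep "$delay"
    done
    return 1
}
```

For example, wait_for_full 10 60 chiptool -listrun -chip_id 121251 would check once a minute for up to ten minutes.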

Then rerun the regeneration script (runwarpskycell.pl, as above). Success.

Now revert the diff failure:

    difftool -revertdiffskyfile -diff_id 76390

A few minutes later the diff is complete.

Now we can go clean up the chips:

    chiptool -updaterun -set_state goto_cleaned -set_label goto_cleaned -chip_id 121251

*****************************************
Case 2.

    Assertion failed in function psThreadLauncher at psThread.c:244. Error stack:
    Unable to perform ppSub: 4 at /home/panstarrs/ipp/psconfig//ipp-20100823.lin64/bin/diff_skycell.pl line 400.
    Running [/home/panstarrs/ipp/psconfig/ipp-20100823.lin64/bin/difftool -diff_id 76373 -skycell_id skycell.1460.009 -fault 4 -adddiffskyfile -dtime_script 62.9999801516533 -hostname ipp053 -path_base neb://ipp053.0/gpc1/SweetSpot.nightlyscience/2010/09/07/RINGS.V0/skycell.1460.009/RINGS.V0.skycell.1460.009.dif.76373 -dbname gpc1]...

This is a recurring bug at the diff stage. Ticket #1422 covers it.

I have had success fixing these by running the command by hand without threads.

They also sometimes succeed after reverting. Let's try that:

    difftool -revertdiffskyfile -diff_id 76373

Wait a few minutes, then check the state:

    difftool -diffskyfile -diff_id 76373 -skycell_id skycell.1460.009 | grep state
    data_state       STR       full
    state            STR       full

It worked. It would be nice to fix this bug.
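
Note that grep state matches both the data_state and state rows, and both need to read full. A small filter can make that check explicit. This is a sketch assuming the "name TYPE value" row layout shown above; the function name skyfile_is_full is hypothetical.

```shell
# Hypothetical filter: read "name TYPE value" rows on stdin and succeed
# only if both data_state and state are present and equal to "full".
skyfile_is_full() {
    awk '
        $1 == "data_state" { data_state = $NF }
        $1 == "state"      { state = $NF }
        END { exit !(data_state == "full" && state == "full") }
    '
}
```

It would be used as difftool -diffskyfile -diff_id 76373 -skycell_id skycell.1460.009 | skyfile_is_full && echo done.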