
Changes between Version 41 and Version 42 of Processing


Timestamp: Apr 13, 2010, 4:30:12 PM
Author: rhenders
Comment: --

  • Processing

    v41 v42  
    277277= Finding and dealing with errors =
    278278
    279 As shown above, the {{{pantasks: status}}} command shows failures for a particular processing stage. A handy script also exists for monitoring all stages for a particular label (basically a shortcut for using the 'Night summary' page of the ippmonitor (see above)). It is found in {{{/tools}}} under the build directory. Example usage is shown below.
     279As shown above, the {{{pantasks: status}}} command will display failures for a particular processing stage. A handy script also exists for monitoring all stages for a particular label (basically a shortcut for using the 'Night summary' page of the ippmonitor (see above)). It is found in {{{/tools}}} under the build directory. Example usage is shown below.
    280280
    281281{{{
     
    316316}}}
    317317
    318 The output highlights a problem at the diff stage, and in the {{{faults and log files}}} section, the relevant log file is listed, which can be views using {{{neb-tail}}} as below.
     318The output highlights a problem at the diff stage, and in the {{{faults and log files}}} section, the relevant log file is listed, which can be viewed using {{{neb-tail}}} as below.
    319319
    320320{{{
     
    322322}}}
    323323
    324 == Common issues ==
    325 
    326 This section attempts to outline common issues encountered during processing and how to work through them.
    327 
    328 === stdscience ===
    329 
    330 '''Chip failures''', for example
    331 
    332 {{{
    333   AV Name                     Nrun   Njobs   Ngood Nfail Ntime Command               
    334   ++ chip.imfile.run             0   23536   17755  5781     0 chip_imfile.pl       
    335 }}}
    336 
    337 To investigate the failures, go to
    338 
    339 [http://ipp004.ifa.hawaii.edu/ippMonitor/Login.php ippmonitor]->Science steps->Chip Failed Imfiles
    340 
    341 where you can view the logs by clicking within the 'State' column.
    342 
    343 '''Warp failures'''
    344 
    345 To investigate the failures, go to
    346 
    347 [http://ipp004.ifa.hawaii.edu/ippMonitor/Login.php ippmonitor]->Science steps->Warp Failed Skyfiles
    348 
    349 Filter results by using 'new' in the state column. For the results, check that the values in the 'Fault' column are 2, which denotes an NFS error, in which case we can 'revert' using
    350 
    351 {{{
    352 pantasks: warp.revert.on
    353 }}}
    354 
    355 Remember to switch off again afterwards with
    356 
    357 {{{
    358 pantasks: warp.revert.off
    359 }}}
    360 
    361 
    362 = Rebuilding the IPP code =
    363 
    364 The IPP in use presently is located at
    365 
    366 {{{
    367 ~ipp/ipp-20100211
    368 }}}
    369 
    370 If the code needs an update and rebuild, then:
    371 
    372  * stop pantasks (as above)
    373  * {{{cd ~ipp/ipp-20100211}}}
    374  * {{{svn update}}}
    375  * {{{psbuild -dev -optimize}}}
    376  * restart pantasks (as above)
    377 
    378 = Who to contact =
    379 
    380 Any problems or concerns should be reported to the ipp development mailing list:
    381 
    382 {{{
    383 ps-ipp-dev@ifa.hawaii.edu
    384 }}}
    385 
    386 Different members of the IPP team are responsible for different parts of the code, and the relevant person will hopefully address the issue.
    387 
    388 = Fault states =
    389 
     324Another script in the {{{/tools}}} directory, {{{errors.pl}}}, can be used to probe errors more thoroughly. It currently supports the chip, camera, warp, stack, diff, magic and destreak stages, and accepts multiple labels (all are matched with LIKE, so wildcards work). You can also search for a specific fault code, or limit the query. The script categorises errors for entries at the specified stage with the provided label. For each entry the output gives the appropriate id and component, the machine it ran on, and the particular problem (e.g., a file that could not be found, otherwise the resolved name of the processing log), all grouped into categories. With such a list it is easy to identify patterns, e.g., a few warps failing because of a single corrupt camera mask file, or machine X being unable to read files on machine Y. For example, using the same label as above:
     325
     326{{{
     327./errors.pl --dbhost ippdb01 --dbuser ipp --dbpass ipp --dbname gpc1 --label 'SweetSpot.20100409' --stage diff
     328}}}
     329
     330This will produce
     331
     332{{{
     333Total: 1
     334
     335Assertion failures: 1
     33650316.skycell.1500.091(ippc08): /data/ipp034.0/nebulous/61/ef/244153882.gpc1:SweetSpot.20100409:2010:04:12:RINGS.V0:skycell.1500.091:RINGS.V0.skycell.1500.091.dif.50316.log
     337}}}
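
To survey every supported stage for one label, the invocation above can be wrapped in a loop. The following is only a sketch: the function name {{{errors_all_stages}}} and the {{{DRY_RUN}}} switch are hypothetical, and it assumes that {{{--stage}}} accepts each stage name exactly as listed in the text (only {{{diff}}} is confirmed by the example above).

```shell
#!/bin/sh
# errors_all_stages LABEL (hypothetical helper, not part of the IPP tools)
# Runs errors.pl once per stage for the given label.
# With DRY_RUN=1 the commands are printed instead of executed.
errors_all_stages() {
    label=$1
    if [ -z "$label" ]; then
        echo "usage: errors_all_stages <label>" >&2
        return 2
    fi
    # Stage names taken from the text; their spelling as --stage values
    # is an assumption except for 'diff'.
    for stage in chip camera warp stack diff magic destreak; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "./errors.pl --dbhost ippdb01 --dbuser ipp --dbpass ipp --dbname gpc1 --label '$label' --stage $stage"
        else
            ./errors.pl --dbhost ippdb01 --dbuser ipp --dbpass ipp \
                --dbname gpc1 --label "$label" --stage "$stage"
        fi
    done
}
```

Run it from the {{{/tools}}} directory, for example as {{{errors_all_stages 'SweetSpot.20100409'}}}; setting {{{DRY_RUN=1}}} first shows the commands without touching the database.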
     338
     339The 'fault codes' mentioned above are as follows.
     340
     341|| '''Code''' || '''Description''' ||
    390342|| 1 || Error of unknown nature ||
    391343|| 2 || Error with a system call (often an NFS error) ||
     
    395347|| 6 || Error due to timeout ||
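
When scanning the 'Fault' column it can help to have the table available on the command line. The sketch below uses a hypothetical function name, {{{describe_fault}}}, and covers only the codes visible in this comparison (1, 2 and 6); any other code falls through to a generic message rather than a guessed description.

```shell
#!/bin/sh
# describe_fault CODE (hypothetical helper)
# Prints the description of a fault code. Only codes 1, 2 and 6 are
# shown in this comparison, so all others get a generic message.
describe_fault() {
    case "$1" in
        1) echo "Error of unknown nature" ;;
        2) echo "Error with a system call (often an NFS error)" ;;
        6) echo "Error due to timeout" ;;
        *) echo "Code $1: see the fault code table" ;;
    esac
}
```

For example, {{{describe_fault 2}}} prints the system call description, which is the case where a revert is worth trying.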
    396348
    397 
    398 
     349== Common issues ==
     350
     351This section attempts to outline common issues encountered during processing and how to work through them.
     352
     353=== stdscience ===
     354
     355'''Chip failures''', for example
     356
     357{{{
     358  AV Name                     Nrun   Njobs   Ngood Nfail Ntime Command               
     359  ++ chip.imfile.run             0   23536   17755  5781     0 chip_imfile.pl       
     360}}}
     361
     362To investigate the failures, go to
     363
     364[http://ipp004.ifa.hawaii.edu/ippMonitor/Login.php ippmonitor]->Science steps->Chip Failed Imfiles
     365
     366where you can view the logs by clicking within the 'State' column.
     367
     368'''Warp failures'''
     369
     370To investigate the failures, go to
     371
     372[http://ipp004.ifa.hawaii.edu/ippMonitor/Login.php ippmonitor]->Science steps->Warp Failed Skyfiles
     373
     374Filter results by using 'new' in the state column. For the results, check that the values in the 'Fault' column are 2, which denotes an NFS error, in which case we can 'revert' using
     375
     376{{{
     377pantasks: warp.revert.on
     378}}}
     379
     380Remember to switch off again afterwards with
     381
     382{{{
     383pantasks: warp.revert.off
     384}}}
     385
     386
     387= Rebuilding the IPP code =
     388
     389The IPP in use presently is located at
     390
     391{{{
     392~ipp/ipp-20100211
     393}}}
     394
     395If the code needs an update and rebuild, then:
     396
     397 * stop pantasks (as above)
     398 * {{{cd ~ipp/ipp-20100211}}}
     399 * {{{svn update}}}
     400 * {{{psbuild -dev -optimize}}}
     401 * restart pantasks (as above)
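
The three command steps in the list above can be collected into one function. This is a minimal sketch: the function name {{{rebuild_ipp}}} and the {{{DRY_RUN}}} switch are hypothetical, and stopping and restarting pantasks is deliberately left as a manual step.

```shell
#!/bin/sh
# rebuild_ipp DIR (hypothetical helper)
# Updates and rebuilds the IPP checkout in DIR.
# pantasks must be stopped beforehand and restarted by hand afterwards.
# With DRY_RUN=1 the commands are printed instead of executed.
rebuild_ipp() {
    ipp_dir=$1
    if [ -z "$ipp_dir" ]; then
        echo "usage: rebuild_ipp <build directory>" >&2
        return 2
    fi
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "cd $ipp_dir" "svn update" "psbuild -dev -optimize"
        return 0
    fi
    # Subshell so the caller's working directory is left untouched;
    # the chain stops at the first command that fails.
    ( cd "$ipp_dir" && svn update && psbuild -dev -optimize )
}
```

A typical call would be {{{rebuild_ipp ~ipp/ipp-20100211}}}. A non-zero return status means one of the steps failed, in which case pantasks should not be restarted until the build is fixed.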
     402
     403= Who to contact =
     404
     405Any problems or concerns should be reported to the ipp development mailing list:
     406
     407{{{
     408ps-ipp-dev@ifa.hawaii.edu
     409}}}
     410
     411Different members of the IPP team are responsible for different parts of the code, and the relevant person will hopefully address the issue.
    399412
    400413{{{magic}}} has fault states greater than 5.