== Common issues ==

This section outlines common issues encountered during processing and how to work through them.

=== stdscience ===

'''Chip failures''' appear as a non-zero {{{Nfail}}} count in the status listing, for example:

{{{
AV Name             Nrun  Njobs  Ngood  Nfail  Ntime  Command
++ chip.imfile.run     0  23536  17755   5781      0  chip_imfile.pl
}}}

To investigate the failures, go to

[http://ipp004.ifa.hawaii.edu/ippMonitor/Login.php ippmonitor]->Science steps->Chip Failed Imfiles

where you can view the logs by clicking within the 'State' column.

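To judge how serious a batch of failures is, it can help to turn a status row like the one above into a failure fraction. The following is a minimal sketch only: it assumes the listing is whitespace-separated with one value per column, and the helper names {{{parse_status_line}}} and {{{failure_fraction}}} are hypothetical, not part of the IPP tools.

```python
def parse_status_line(header: str, row: str) -> dict:
    """Pair the column names from the header with the values in one task row."""
    names = header.split()
    values = row.split()
    if len(names) != len(values):
        raise ValueError("header and row have different column counts")
    record = dict(zip(names, values))
    # Convert the counter columns to integers; the other columns stay as strings.
    for key in ("Nrun", "Njobs", "Ngood", "Nfail", "Ntime"):
        record[key] = int(record[key])
    return record


def failure_fraction(record: dict) -> float:
    """Fraction of jobs that failed (0.0 when no jobs are listed)."""
    return record["Nfail"] / record["Njobs"] if record["Njobs"] else 0.0


# Header and row copied from the example listing above.
header = "AV Name Nrun Njobs Ngood Nfail Ntime Command"
row = "++ chip.imfile.run 0 23536 17755 5781 0 chip_imfile.pl"

record = parse_status_line(header, row)
print(f"{record['Name']}: {record['Nfail']} of {record['Njobs']} jobs failed "
      f"({failure_fraction(record):.1%})")
```

In the example row, {{{Ngood}}} plus {{{Nfail}}} equals {{{Njobs}}}, so roughly a quarter of the chip jobs failed, which is worth investigating through ippmonitor as described above.
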
'''Warp failures'''

To investigate the failures, go to

[http://ipp004.ifa.hawaii.edu/ippMonitor/Login.php ippmonitor]->Science steps->Warp Failed Skyfiles

Filter the results by entering 'new' in the 'State' column. If the values in the 'Fault' column are 2, which denotes an NFS error, the failures can be reverted using

{{{
pantasks: warp.revert.on
}}}

Remember to switch reverting off again afterwards with

{{{
pantasks: warp.revert.off
}}}


= Rebuilding the IPP code =

The IPP installation currently in use is located at

{{{
~ipp/ipp-20100211
}}}

If the code needs to be updated and rebuilt:

 * stop pantasks (as above)
 * {{{cd ~ipp/ipp-20100211}}}
 * {{{svn update}}}
 * {{{psbuild -dev -optimize}}}
 * restart pantasks (as above)

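The update and build steps in the list above could be wrapped in a small script so that a failed {{{svn update}}} stops the rebuild before {{{psbuild}}} runs. This is a sketch under assumptions: the function names are hypothetical, stopping and restarting pantasks are left as manual steps, and the script defaults to a dry run that only prints the commands.

```python
import os
import subprocess

IPP_ROOT = "~ipp/ipp-20100211"  # installation path quoted in the text


def rebuild_commands() -> list:
    """Commands from the list above, in order, as argument lists."""
    return [["svn", "update"], ["psbuild", "-dev", "-optimize"]]


def rebuild(root: str = IPP_ROOT, dry_run: bool = True) -> None:
    """Run (or, in a dry run, just print) each command inside the IPP tree."""
    workdir = os.path.expanduser(root)
    for command in rebuild_commands():
        if dry_run:
            print(f"(in {workdir}) {' '.join(command)}")
        else:
            # check=True raises CalledProcessError if a step fails,
            # so a broken update never reaches the build step.
            subprocess.run(command, cwd=workdir, check=True)


# Stop pantasks by hand first; restart it by hand afterwards.
rebuild()  # dry run: prints the two commands without executing them
```

Calling {{{rebuild(dry_run=False)}}} would execute the commands for real, so it should only be done once pantasks has been stopped.
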
= Who to contact =

Any problems or concerns should be reported to the IPP development mailing list:

{{{
ps-ipp-dev@ifa.hawaii.edu
}}}

Different members of the IPP team are responsible for different parts of the code, so the relevant person should pick up the issue from the list.

= Fault states =

Another script in the {{{/tools}}} directory, {{{errors.pl}}}, can be used to probe errors more thoroughly. It currently supports the chip, camera, warp, stack, diff, magic and destreak stages, and accepts multiple labels (labels are matched with LIKE, so wildcards work). You can also search for a specific fault code, or limit the query. The script categorises the errors for entries at the specified stage with the given label. For each entry it reports the appropriate id and component, the machine the job ran on, and the particular problem (e.g., a file that could not be found; otherwise the resolved name of the processing log), with all entries grouped into categories. Such a list makes it easy to identify patterns, e.g., a few warps failing because of a single corrupt camera mask file, or machine X being unable to read files on machine Y. For example, using the same label as above:

{{{
./errors.pl --dbhost ippdb01 --dbuser ipp --dbpass ipp --dbname gpc1 --label 'SweetSpot.20100409' --stage diff
}}}

This produces

{{{
Total: 1

Assertion failures: 1
50316.skycell.1500.091(ippc08): /data/ipp034.0/nebulous/61/ef/244153882.gpc1:SweetSpot.20100409:2010:04:12:RINGS.V0:skycell.1500.091:RINGS.V0.skycell.1500.091.dif.50316.log
}}}

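When the report is long, the patterns mentioned above (one bad machine, one recurring category) are easier to spot after tallying the entries. The sketch below parses a saved report and counts entries per category and per machine. It assumes the layout seen in the sample output, namely a {{{Category: count}}} line followed by {{{id.component(host): detail}}} entries; the function name {{{group_errors}}} is hypothetical and not part of {{{errors.pl}}}.

```python
import re
from collections import Counter, defaultdict

# "id.component(host): detail", as in the sample entry above (assumed layout).
ENTRY = re.compile(r"^(?P<item>\S+?)\((?P<host>[^)]+)\):\s*(?P<detail>.+)$")
# "Category name: count", as in "Assertion failures: 1" (assumed layout).
CATEGORY = re.compile(r"^(?P<name>[^:]+):\s*(?P<count>\d+)$")


def group_errors(report: str) -> dict:
    """Map each category name to a list of (item, host, detail) tuples."""
    groups = defaultdict(list)
    current = None
    for line in report.splitlines():
        line = line.strip()
        if not line:
            continue
        entry = ENTRY.match(line)
        if entry:
            if current is None:
                raise ValueError(f"entry before any category: {line}")
            groups[current].append((entry["item"], entry["host"], entry["detail"]))
            continue
        category = CATEGORY.match(line)
        # The leading "Total: N" line is a summary, not a category.
        if category and category["name"] != "Total":
            current = category["name"]
    return dict(groups)


# Sample output copied from the text above.
report = """Total: 1

Assertion failures: 1
50316.skycell.1500.091(ippc08): /data/ipp034.0/nebulous/61/ef/244153882.gpc1:SweetSpot.20100409:2010:04:12:RINGS.V0:skycell.1500.091:RINGS.V0.skycell.1500.091.dif.50316.log
"""

groups = group_errors(report)
hosts = Counter(host for entries in groups.values() for _, host, _ in entries)
for name, entries in groups.items():
    print(f"{name}: {len(entries)}")
print("failures per machine:", dict(hosts))
```

A machine that dominates the per-machine tally is a hint that the problem lies with that host (for example an NFS mount) rather than with the data.
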
The 'fault codes' mentioned above are as follows.

|| '''Code''' || '''Description''' ||