
Version 1 (modified by bills, 16 years ago)


Shepherding the postage stamp server and the update process.

Overview


Postage stamp request files are FITS tables that contain one or more request specifications. Request files are submitted to the system in one of three ways:

  • By posting to a fileset on a data store that the server is monitoring
  • By uploading a request file to the upload page http://pstamp.ipp.ifa.hawaii.edu/upload.php
  • By entering a "valid" set of parameters on the prototype web interface and hitting submit

Each file is represented in the database by a row in the table pstampRequest. The request starts life in state 'new'.

Request files are parsed into jobs by the task pstamp.job.run.

Each row in the request file's table contains a request specification, and each row causes one or more pstampJobs to be entered into the database. Jobs can be entered with state = 'stop' and fault >= 10. This happens when a request wants images that haven't been destreaked or have been purged, when a bycoord request matches no images, or when the request itself contains errors.

Once the file has been parsed successfully, its state is set to 'run'. Once the request is in state 'run', its jobs are eligible to run and will be output by pstamptool -pendingjob.

If a pstampJob's input images have been cleaned, a pstampDependent object is entered into the database with state 'new'. The pstampJob will not be run until its dependent is set to 'full'.
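
The eligibility rule described so far can be summarized in a small sketch. This is illustrative Python, not IPP code: the Job class and its field names are hypothetical, and only the state names ('run', 'stop', 'new', 'full') come from the text above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Job:
    """Hypothetical view of a pstampJob joined with its request and dependent."""
    request_state: str               # state of the parent pstampRequest
    job_state: str                   # state of the job itself
    dependent_state: Optional[str]   # None when the job has no pstampDependent


def is_eligible(job: Job) -> bool:
    """Return True if the job may be run, following the rules in the text."""
    if job.request_state != "run":
        return False                 # request has not finished parsing
    if job.job_state != "run":
        return False                 # job already stopped (finished or faulted)
    # A job with a dependent waits until the dependent reaches 'full'.
    return job.dependent_state is None or job.dependent_state == "full"
```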

The task pstamp.dependent.run runs the script pstamp_checkdependent.pl. This script queries the gpc1 database for the state of the dependent component, then queues and monitors update processing for that component. CAREFUL: THERE BE SPIDERS.

Once all jobs have completed (changed from 'run' to 'stop' state), the request is queued for "finishing". The task pstamp.finish.run builds the results FITS table and packages up all of the data into a fileset on the request's outgoing data store.
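
As a rough summary of the request lifecycle above, the following sketch maps a request's state and its jobs' states to a stage. The function name and the returned labels are hypothetical; the real bookkeeping lives in pstamptool and the pantasks tasks, and the treatment of a request with no jobs is an assumption.

```python
from typing import Sequence


def request_stage(request_state: str, job_states: Sequence[str]) -> str:
    """Describe where a request is in its lifecycle (labels are illustrative)."""
    if request_state == "new":
        return "parsing"             # waiting for pstamp.job.run to parse the file
    if request_state == "run":
        # Assumption: a request with no jobs is not treated as ready to finish.
        if job_states and all(state == "stop" for state in job_states):
            return "ready to finish"  # pstamp.finish.run can package the results
        return "jobs running"
    return "other"                   # states not described in this page
```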

Monitoring


Since the postage stamp / data store database is on ippdb02 and not ippdb01, some care must be taken when running commands.

I define the environment variable PSDBSERVER=ippdb02 and the alias

   alias pst 'pstamptool -dbname ippRequestServer -dbserver $PSDBSERVER'

I always type pst instead of pstamptool. Since the postage stamp tables in the gpc1 database have been dropped, you get errors if you try to run pstamptool without -dbserver:

    (ipp032:~) bills% pstamptool -pendingreq
     p_psDBRunQuery (psDB.c:812) : Failed to execute SQL query.  Error: Table 'gpc1.pstampRequest' doesn't exist
         pendingreqMode (pstamptool.c:320) : database error


     -> p_psDBRunQuery (psDB.c:812): Database error generated by the server
         Failed to execute SQL query.  Error: Table 'gpc1.pstampRequest' doesn't exist
     -> pendingreqMode (pstamptool.c:320): unknown psLib error
         database error

This is a feature. (I plan on removing the postage stamp tables from the set created by pxadmin -create and adding a new mode to create them, but I haven't gotten around to it yet.)
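
In a script, where the shell alias is not available, the same defaults can be applied by building the argument list explicitly. The sketch below is a hypothetical helper, not part of IPP; it uses only the options shown in the alias above (-dbname, -dbserver) and falls back to ippdb02 when PSDBSERVER is unset.

```python
import os


def pst_argv(*args, dbserver=None):
    """Build a pstamptool command line that always names the database server."""
    server = dbserver or os.environ.get("PSDBSERVER", "ippdb02")
    return ["pstamptool", "-dbname", "ippRequestServer",
            "-dbserver", server, *args]
```

The resulting list can be handed to subprocess.run, for example subprocess.run(pst_argv("-pendingreq")).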

Common Failures


Requests can fault at two points: at the parse stage or when finishing the request. Request faults rarely happen and are usually due to a software bug. pst -revertreq doesn't work for requests that haven't finished parsing. (The script sets the state to 'run' when it shouldn't. I don't want to fix this problem until I'm sure that it won't break anything.)

Jobs can fault due to typical NFS errors. There is a revert task for jobs. It only resets jobs with fault > 0 and < 10. Faults >= 10 are faults for the users. Jobs faulted in this way should have had their state set to 'stop', so they are finished.
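
The fault ranges in the paragraph above can be written down as a small classifier. The function name and the returned labels are hypothetical; only the numeric boundaries come from the text.

```python
def classify_job_fault(fault: int) -> str:
    """Classify a pstampJob fault code using the ranges described above."""
    if fault < 0:
        raise ValueError(f"unexpected negative fault code: {fault}")
    if fault == 0:
        return "ok"            # no fault recorded
    if fault < 10:
        return "revertable"    # transient (e.g. NFS) error; the revert task resets it
    return "user"              # reported to the user; job should already be in 'stop'
```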

The vast majority of the problems these days are related to the update processing. There are a large number of error conditions that can gum up the works. When a dependent run faults (that is, when an update of a chipProcessedImfile, warpSkyfile, etc. faults), the dependent is faulted as well.

Here is an example:

    (ipp032:~) bills% pst -pendingdependent -limit 1
    pstampDependent MULTI

    pstampDependent METADATA
        dep_id       S64   55261
        state        STR   new
        stage        STR   chip
        stage_id     S64   77563
        component    STR   XY31
        imagedb      STR   gpc1
        outdir       STR   /data/ippdb02.0/pstamp/work/20100527/5977
        rlabel       STR   ps_ud_WEB.UP
        need_magic   BOOL  T
        fault        S16   0
        priority     S64   8
    END
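
When monitoring from a script it can help to turn a record like the one above into a dictionary. The sketch below assumes the output is a whitespace-separated sequence of name / type / value triples between METADATA and END, as in this example; that layout, the type conversions (including reading BOOL 'T' as true), and the function name are assumptions based on this one sample, and values containing spaces would not survive.

```python
SAMPLE = """pstampDependent MULTI

pstampDependent METADATA
dep_id S64 55261 state STR new stage STR chip stage_id S64 77563
component STR XY31 imagedb STR gpc1
outdir STR /data/ippdb02.0/pstamp/work/20100527/5977
rlabel STR ps_ud_WEB.UP need_magic BOOL T fault S16 0 priority S64 8
END
"""

# Assumed conversions for the type tags seen in the sample output.
CONVERTERS = {"S16": int, "S64": int, "STR": str, "BOOL": lambda v: v == "T"}


def parse_metadata(text: str) -> dict:
    """Parse the first METADATA ... END record into a dict of typed values."""
    tokens = text.split()
    start = tokens.index("METADATA") + 1
    end = tokens.index("END", start)
    fields = tokens[start:end]
    if len(fields) % 3 != 0:
        raise ValueError("expected name/type/value triples")
    record = {}
    for name, type_tag, value in zip(fields[0::3], fields[1::3], fields[2::3]):
        record[name] = CONVERTERS.get(type_tag, str)(value)
    return record


record = parse_metadata(SAMPLE)
print(record["stage"], record["stage_id"], record["component"])
```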

pstamp_checkdependent.pl sets chipRun 77563, class_id XY31 to 'update' state and watches for it to go to 'full' state. Once it goes to 'full', the script sets the magicDSRun to 'new' state. Once the component is destreaked (chipProcessedImfile.magicked > 0), it's done and the pstampDependent is set to 'full'.
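
The sequence just described can be sketched as a decision function. This is only an illustration of the ordering, not the logic of pstamp_checkdependent.pl: the function name and returned labels are hypothetical, and skipping the destreak wait when need_magic is false is an assumption the text does not confirm.

```python
def dependent_progress(component_state: str, magicked: int,
                       need_magic: bool = True) -> str:
    """Report which step a chip-stage dependent is waiting on."""
    if component_state != "full":
        return "waiting for update"     # component still being reprocessed
    if need_magic and magicked <= 0:
        return "waiting for destreak"   # magicDSRun queued but not yet done
    return "full"                       # pstampDependent can be set to 'full'
```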

The update processing is managed by the 'update' pantasks.
