
Changes between Version 3 and Version 4 of Processing


Timestamp:
Apr 9, 2010, 9:23:05 AM (16 years ago)
Author:
rhenders
Comment:

--

  • Processing

    v3 v4  
    33This page outlines the procedures and responsibilities for the person currently acting as 'IPP Processing Czar'. In a nutshell, these include:
    44
    5  * monitoring the various pantasks servers running on the production cluster
     5 * monitoring the various pantasks servers running on the production cluster using {{{pantasks_client}}}
    66 * alerting the IPP group to any notable errors or failures
    77 * keeping an eye on production cluster load using 'ganglia'
     
    1111= Setup =
    1212
    13 You will need to have ipp user access on the production cluster. For convenience, have someone with access (anyone on the IPP team) add your ssh public key to ~ipp/.ssh/authorized_keys.
     13You will need to have ipp user access on the production cluster. For convenience, have someone who already has access (anyone on the IPP team) add your ssh public key to {{{~ipp/.ssh/authorized_keys}}}.
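
The key-adding step above could be sketched as follows. This is only a minimal sketch: the helper name {{{add_key}}} is hypothetical and not part of the IPP tools, and the location of your public key file is an assumption. It appends the key only if an identical line is not already present, so running it twice does no harm.

```shell
# Hypothetical helper (not part of the IPP tools): append a public key to an
# authorized_keys file unless an identical line is already there.
add_key() {
    key_file=$1    # path to the public key to add (assumed to hold one line)
    auth_file=$2   # path to the authorized_keys file
    # grep flags: -q quiet, -x whole-line match, -F fixed strings,
    # -f read the pattern(s) from key_file
    if ! grep -qxFf "$key_file" "$auth_file" 2>/dev/null; then
        cat "$key_file" >> "$auth_file"
    fi
}

# Example call, to be run by someone who already has ipp access
# (the public key path is an assumption):
#   add_key /path/to/your_key.pub ~ipp/.ssh/authorized_keys
```

The existence check avoids duplicate entries building up in a file shared by the whole team.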
    1414
    1515= Resources =
     
    1919 * [http://ganglia.pan-starrs.ifa.hawaii.edu/?r=hour&s=descending&c=IPP%2520Production Ganglia] - For monitoring load on production cluster machines
    2020 * [http://ipp004.ifa.hawaii.edu/ippMonitor/Login.php ippmonitor] - a window onto the gpc1 database, particularly the [http://ipp004.ifa.hawaii.edu/ippMonitor/nightSummary.php NightSummary] page
     21
     22= Getting started and checking processing status =
     23
     24Log in as {{{ipp}}} user on any production cluster machine and run
     25
     26{{{
     27./check_system.sh
     28}}}
     29
     30This lists the various {{{pantasks}}} servers currently running on the cluster, eg
     31
     32{{{
     33pantasks server addstar is running (host: ipp004)
     34pantasks server cleanup is running (host: ippc07)
     35pantasks server detrend is NOT running (host: ippc06)
     36pantasks server distribution is running (host: ippc15)
     37pantasks server pstamp is running (host: ippdb02)
     38pantasks server publishing is running (host: ippc08)
     39pantasks server registration is running (host: ippc02)
     40pantasks server replication is running (host: ippdb00)
     41pantasks server stdscience is running (host: ippc16)
     42pantasks server summitcopy is running (host: ippc01)
     43}}}
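
When the list is long, the stopped servers can be picked out mechanically. The following is a minimal sketch, assuming the report keeps exactly the line format shown above; the function name {{{stopped_servers}}} is hypothetical and not part of the IPP tools.

```shell
# Hypothetical helper: print "<server> <host>" for every pantasks server
# reported as NOT running. Reads the report on standard input.
stopped_servers() {
    # Line layout assumed from the sample output above:
    #   pantasks server <name> is NOT running (host: <host>)
    awk '/^pantasks server .* is NOT running/ {
        host = $NF
        sub(/\)$/, "", host)   # strip the trailing ")"
        print $3, host
    }'
}

# Demonstration on three lines taken from the sample output above.
sample_report='pantasks server cleanup is running (host: ippc07)
pantasks server detrend is NOT running (host: ippc06)
pantasks server distribution is running (host: ippc15)'

printf '%s\n' "$sample_report" | stopped_servers
```

If {{{check_system.sh}}} writes its report to standard output (an assumption), the same filter could be applied as {{{./check_system.sh | stopped_servers}}}.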
     44
     45Move to the directory corresponding to the server of interest in the list above, eg {{{stdscience/}}}, then run
     46
     47{{{
     48pantasks_client
     49}}}
     50
     51Within {{{pantasks}}}, to check processing status, do
     52
     53{{{
     54pantasks: status
     55}}}
     56
     57To check the current labels being processed:
     58
     59{{{
     60pantasks: show.labels
     61}}}
     62
     63= Stopping and starting the servers =
     64
     65== Stopping ==
     66
     67To shut down all {{{pantasks_server}}} instances, use
     68
     69{{{
     70check_system.sh stop
     71check_system.sh shutdown
     72}}}
     73
     74== Starting ==
     75
     76Each {{{pantasks_server}}} uses the {{{input}}} file located in the directory where it is instantiated. It also uses the local {{{ptolemy.rc}}} file, which specifies the machine where the server is to run.
     77
     78To restart all the {{{pantasks_server}}} instances, you need to {{{ssh}}} to each relevant machine; the hosts are listed by {{{check_system.sh}}}. For each server, do the following:
     79
     80{{{
     81ssh ipp@ippXXX
     82cd <serverName>
     83pantasks_server &
     84pantasks_client
     85pantasks: server input input
     86pantasks: setup
     87}}}
     88
     89For example, for {{{stdscience}}}:
     90
     91{{{
     92ssh ipp@ippc16
     93cd stdscience
     94pantasks_server &
     95pantasks_client
     96pantasks: server input input
     97pantasks: setup
     98}}}
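
Because the same sequence is repeated for every server, it can help to print it as a checklist before starting. The following is a minimal sketch that follows the generic template above; the function name {{{restart_steps}}} is hypothetical, and it only prints text. Nothing is executed on the cluster, and the {{{pantasks:}}} lines still have to be typed at the client prompt by hand.

```shell
# Hypothetical helper: print the restart steps for one pantasks server.
# Arguments: <serverName> <host>. Purely a dry run; it only echoes text.
restart_steps() {
    server=$1
    host=$2
    printf 'ssh ipp@%s\n' "$host"
    printf 'cd %s\n' "$server"
    printf 'pantasks_server &\n'
    printf 'pantasks_client\n'
    printf 'pantasks: server input input\n'
    printf 'pantasks: setup\n'
}

# Server name and host taken from the check_system.sh listing earlier on
# this page.
restart_steps stdscience ippc16
```

Calling it once per line of the {{{check_system.sh}}} listing gives a complete checklist for a full restart.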
     99
     100Each server then needs to be handled differently for setup.
     101
     102== stdscience ==
     103
     104Add surveys
     105
     106{{{
     107pantasks: add.surveys
     108}}}
     109
     110This adds the surveys defined in the 'input' file. Now show labels with
     111
     112{{{
     113pantasks: show.labels
     114}}}
     115
     116Working from this list, add and remove labels with {{{del.label}}} and {{{add.label}}}, eg
     117
     118{{{
     119pantasks: del.label M31.nightlyscience
     120pantasks: add.label ThreePi.DM.20100401
     121}}}
     122
     123Now add some hosts
     124
     125{{{
     126pantasks: hosts add wave1
     127pantasks: hosts add wave2
     128pantasks: hosts add wave2
     129pantasks: hosts add wave3
     130pantasks: hosts add wave3
     131pantasks: hosts add compute
     132}}}
     133
     134Now we are ready to run the server
     135
     136{{{
     137pantasks: run
     138}}}
     139
     140== summitcopy, registration, replication ==
     141
     142These are the easy ones; just
     143
     144{{{
     145pantasks: run
     146}}}
     147
     148== publishing ==
     149
     150This is for MOPS
     151
     152add labels? TODO
     153
     154== pstamp ==
     155
     156{{{
     157pantasks: add.hosts
     158pantasks: run
     159}}}
     160
     161== distribution ==
     162
     163{{{
     164pantasks: add.labels
     165}}}
     166
     167same labels as stdscience? TODO
     168
     169Add hosts
     170
     171{{{
     172pantasks: hosts add wave1
     173pantasks: hosts add wave2
     174pantasks: hosts add wave3
     175pantasks: hosts add compute
     176}}}
     177
     178Check that processing is running smoothly in {{{stdscience}}} using {{{pantasks: status}}}. If all is okay, then
     179
     180{{{
     181pantasks: run
     182}}}
     183
     184== cleanup ==
     185
     186{{{
     187pantasks: add.labels
     188pantasks: hosts add wave2
     189pantasks: hosts add wave3
     190pantasks: hosts add compute
     191pantasks: run
     192}}}
     193
     194== detrend, addstar ==
     195
     196TODO
    21197
    22198= Rebuilding the IPP code =
     
    37213 * restart pantasks (as above)
    38214
     215= Who to contact =
     216
     217Any problems should be reported to the ipp development mailing list: ps-ipp-dev@ifa.hawaii.edu