| 4 | | In this note we outline our plans for distributing PS1 IPP data |
| 5 | | to remote sites and describe some of the key features of the software |
| 6 | | that we are building to facilitate this process. |
| 7 | | |
| 8 | | The system is based on the existence of one or more mirror IPP sites. |
| 9 | | |
| 10 | | A mirror site will have a copy of the magic de-streaked raw images and |
| 11 | | the database tables for the various processing stages. It is possible for |
| 12 | | a site to be populated with subsets of the data. At this point only the |
| 13 | | MPG cluster in Garching is known to be ready to accept the |
| 14 | | full raw data volume. |
| 15 | | |
| 16 | | As the data is processed on Maui, distribution bundles will be |
| 17 | | created and posted on the MHPCC IPP Data Store. Each bundle will contain the |
| 18 | | results for a 'run' for a particular stage and a file containing |
| 19 | | the IPP database information for the run. |
| 20 | | |
| 21 | | The remote site will have an IPP installation that includes software |
| 22 | | that uses the Data Store protocol to manage the transfer of data bundles |
| 23 | | to that site, and tools to manage the database mirror. |
| 24 | | |
| 25 | | Due to the vast size of the PS1 data, not all of the images will |
| 26 | | be transferred to full-scale mirror sites. Instead, the IPP's |
| 27 | | 'clean and update' system will be used to allow processed images to be |
| 28 | | remade by re-running portions of processing steps at the mirror site. |
| 29 | | |
| 30 | | Distribution bundles are built in either 'full' or 'clean' state. In the full state |
| 31 | | all of the associated data products are included. In the clean state |
| 32 | | the larger images are omitted. Once the dependent products are available |
| 33 | | at the mirror site, a run can be set to the 'update' state, which causes the images to be |
| 34 | | re-created. |
| 35 | | |
| 36 | | To ensure that all of the database information remains consistent, the mirror database |
| 37 | | shall not be used for queuing new IPP processing runs (for example, new chipRuns |
| 38 | | shall not be queued for an exposure). If such processing is desired, the dependent |
| 39 | | data must be inserted into a different IPP database and processed from there. |
| 40 | | |
| 41 | | The data transfer software is being designed in such a way that a mirror |
| 42 | | site can serve as a source for other mirror sites. |
| 43 | | |
| 44 | | This 'bucket brigade' feature reduces the load on the UH IfA network and computers, and on |
| 45 | | the intercontinental network, below the level that would be required to serve several |
| 46 | | clients directly. |
| 47 | | |
| 48 | | The 'default' set of bundles that is to be packaged has not been determined at this time. |
| 49 | | |
| 50 | | We expect that bundles in 'full' state will be produced for a subset of the data. |
| 51 | | |
| 52 | | The IPP Postage Stamp Server will be integrated with this system and will enable sites to request |
| 53 | | specially produced bundles using the postage stamp request system. |
| 54 | | |
| 55 | | The remote site will notify the IPP that a bundle has been received successfully by posting data |
| 56 | | on a Data Store at the site. The IPP will query this site and, once all receivers have given the all |
| 57 | | clear, the distribution bundles will be purged from the Data Store when space is needed. |
| | 3 | * [wiki:GPC1_DataDistribution_Overview Overview Document by Bill Sweeney] |
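
The acknowledgement and purge rule described in the note (a bundle becomes eligible for purging only after every receiver has given the all clear, and is actually purged only when space is needed) can be sketched as follows. This is a minimal illustration, not the IPP implementation: the `Bundle` class, the `bundles_to_purge` function, and the site and bundle names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Bundle:
    """Hypothetical record for one distribution bundle on the source Data Store."""
    name: str
    expected_receivers: set
    acknowledged: set = field(default_factory=set)

    def acknowledge(self, site):
        # Only sites expected to receive this bundle may acknowledge it.
        if site not in self.expected_receivers:
            raise ValueError(f"{site} is not a receiver of {self.name}")
        self.acknowledged.add(site)

    def all_clear(self):
        # True once every expected receiver has reported successful receipt.
        return self.expected_receivers <= self.acknowledged


def bundles_to_purge(bundles, space_needed):
    """Return names of bundles that may be purged right now.

    Assumption: nothing is purged unless space is needed, even if
    every receiver has already acknowledged the bundle.
    """
    if not space_needed:
        return []
    return [b.name for b in bundles if b.all_clear()]
```

For example, with two receivers registered for one bundle and only one acknowledgement posted, that bundle is held back even when space is needed; it becomes purgeable once the second site acknowledges.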