| 5 | | |
| 6 | | |
| 7 | | |
| | 5 | Cron jobs on all nebulous storage nodes scan the nebulous disks and compile statistics about their contents. These jobs are staggered across the cluster throughout the week so that the hosts do not all run the disk scan simultaneously (which would presumably reduce processing throughput). The usage statistics are placed in nebulous in files named `neb://ipp_diskspace/YYYY-MM-DD/ippXXX.Z.neb_usage.dat`, with the date string matching the Sunday of the week in which the scan was performed. The script parses each filename in the nebulous directory and attempts to classify both the stage it comes from and the type of product from that stage. It then accumulates the total count of these files and their total usage (in bytes) for each stage/product pair; these totals are what is stored in the statistics files. An excerpt of one of these files is listed below, showing the CHIP stage file statistics from one host. |
| | 6 | {{{ |
| | 7 | CHIP B1FITS 774142 320650755840 |
| | 8 | CHIP B1JPG 71 18789330 |
| | 9 | CHIP B2FITS 775168 34038596160 |
| | 10 | CHIP B2JPG 70 902953 |
| | 11 | CHIP BUNDLE 6649 190039346914 |
| | 12 | CHIP CATALOG 462015 443818370880 |
| | 13 | CHIP FITS 98121 1030044401280 |
| | 14 | CHIP KERNEL 14 6658560 |
| | 15 | CHIP LOG 764083 25942316933 |
| | 16 | CHIP MASK 44888 93167104320 |
| | 17 | CHIP MDC 774856 82952703287 |
| | 18 | CHIP MDL 832113 34001887680 |
| | 19 | CHIP PNG 9838 160628007 |
| | 20 | CHIP PSF 774958 77240390400 |
| | 21 | CHIP PTN 50884 60478741440 |
| | 22 | CHIP SKYCELL 30 86400 |
| | 23 | CHIP STATS 637305 2358654287 |
| | 24 | CHIP TRACE 422839 29600227 |
| | 25 | CHIP WEIGHT 44740 608909785920 |
| | 26 | }}} |
| | 27 | |
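As an illustration of the file layout above, the following is a minimal Python sketch that reads usage lines into a dictionary. It assumes each line holds exactly four whitespace-separated fields (stage, product, file count, total bytes), as in the excerpt; the function name `parse_usage` is hypothetical and is not part of the existing tools.

```python
from typing import Dict, Tuple


def parse_usage(text: str) -> Dict[Tuple[str, str], Tuple[int, int]]:
    """Map (stage, product) -> (file_count, total_bytes).

    Assumes the four-column layout shown in the excerpt above.
    """
    totals = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 4:
            continue  # skip blank or unexpected lines
        stage, product, count, nbytes = fields
        totals[(stage, product)] = (int(count), int(nbytes))
    return totals


# Three rows copied from the excerpt above.
excerpt = """\
CHIP B1FITS 774142 320650755840
CHIP B1JPG 71 18789330
CHIP KERNEL 14 6658560
"""

usage = parse_usage(excerpt)
count, nbytes = usage[("CHIP", "B1FITS")]
print("CHIP B1FITS mean bytes per file:", nbytes / count)
```

Keying on the (stage, product) pair keeps the per-host data in the same shape that the later summary steps combine.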
| | 28 | At the end of the week, the gpc1 database is polled to count the components in each data_state for each stage, and the results are stored in `neb://ipp_diskspace/YYYY-MM-DD/run_im_counts.dat`. Each line gives the stage, the data_state, and the number of components with that data_state: |
| | 29 | {{{ |
| | 30 | CHIP cleaned 16945531 |
| | 31 | CHIP error_cleaned 25155 |
| | 32 | CHIP error_scrubbed 5851 |
| | 33 | CHIP full 631356 |
| | 34 | CHIP purged 286850 |
| | 35 | CHIP scrubbed 5993 |
| | 36 | CHIP update 3304 |
| | 37 | }}} |
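To show how these counts might be inspected, here is a small Python sketch that reads the three-column layout above and reports the fraction of a stage's components in each data_state. The helper names `parse_counts` and `state_fractions` are hypothetical; only the column layout is taken from the excerpt.

```python
def parse_counts(text):
    """Map (stage, data_state) -> component count."""
    counts = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue  # skip blank or unexpected lines
        stage, data_state, n = fields
        counts[(stage, data_state)] = int(n)
    return counts


def state_fractions(counts, stage):
    """Fraction of a stage's components found in each data_state."""
    total = sum(n for (s, _), n in counts.items() if s == stage)
    return {state: n / total for (s, state), n in counts.items() if s == stage}


# Rows copied from the excerpt above.
excerpt = """\
CHIP cleaned 16945531
CHIP error_cleaned 25155
CHIP error_scrubbed 5851
CHIP full 631356
CHIP purged 286850
CHIP scrubbed 5993
CHIP update 3304
"""

counts = parse_counts(excerpt)
fractions = state_fractions(counts, "CHIP")
for state, frac in sorted(fractions.items()):
    print(f"{state:15s} {frac:.4%}")
```

For the CHIP rows shown, the `cleaned` state dominates the component count, even though (as the size estimates further down indicate) it does not dominate the disk usage.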
| | 38 | At the end of the week, a summary file is created (`neb://ipp_diskspace/YYYY-MM-DD/summary.dat`) that gives the usage in TB of each stage/product combination: |
| | 39 | {{{ |
| | 40 | CHIP_B1FITS 4876.40501260757 |
| | 41 | CHIP_B1JPG 5.90300299786031 |
| | 42 | CHIP_B2FITS 506.811673343182 |
| | 43 | CHIP_B2JPG 0.397686927579343 |
| | 44 | CHIP_BTTABLE 0 |
| | 45 | CHIP_BUNDLE 2889.25785760116 |
| | 46 | CHIP_CATALOG 7536.26382619143 |
| | 47 | CHIP_FITS 14993.0452584894 |
| | 48 | CHIP_KERNEL 3.89136224985123 |
| | 49 | CHIP_LOG 557.324770034291 |
| | 50 | CHIP_MASK 1228.97341176867 |
| | 51 | CHIP_MDC 1149.67314596102 |
| | 52 | CHIP_MDL 501.610046625137 |
| | 53 | CHIP_PNG 10.054713034071 |
| | 54 | CHIP_PSF 1185.04951536655 |
| | 55 | CHIP_PTN 1154.56684827805 |
| | 56 | CHIP_SKYCELL 0.0493767857551575 |
| | 57 | CHIP_STATS 68.9840127192438 |
| | 58 | CHIP_TRACE 5.00015713181347 |
| | 59 | CHIP_UNKNOWN 0.0171183617785573 |
| | 60 | CHIP_WEIGHT 9193.91913002729 |
| | 61 | }}} |
| | 62 | |
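The aggregation behind a summary file like this can be sketched as follows. This is not the actual summary script: the function name `summarize` is hypothetical, the second host's row is made up for the example, and the bytes-to-unit divisor (`1e12` here) is an assumption, since the conversion convention actually used is not stated on this page.

```python
from collections import defaultdict


def summarize(host_files, bytes_per_unit=1e12):
    """Sum bytes per STAGE_PRODUCT key across hosts and rescale.

    host_files: iterable of usage.dat contents, one string per host.
    bytes_per_unit: assumed divisor for the output unit (not confirmed).
    """
    totals = defaultdict(int)
    for text in host_files:
        for line in text.splitlines():
            fields = line.split()
            if len(fields) != 4:
                continue  # skip blank or unexpected lines
            stage, product, _count, nbytes = fields
            totals[f"{stage}_{product}"] += int(nbytes)
    return {key: totals[key] / bytes_per_unit for key in sorted(totals)}


# First host: rows from the usage excerpt; second host: made-up row.
host_one = "CHIP B1JPG 71 18789330\nCHIP KERNEL 14 6658560\n"
host_two = "CHIP B1JPG 10 1210670\n"

summary = summarize([host_one, host_two])
for key, value in summary.items():
    print(key, value)
```

Passing the divisor as a parameter keeps the unit choice explicit, so the same routine can be rerun once the intended convention is confirmed.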
| | 63 | A final summary, still largely untested, is intended to combine the usage.dat files, the run_im_counts.dat file, and the mapping file (hardcoded as `neb://ipp_diskspace/mappings_im.dat`), matching the sizes for each stage/product against the counts for each stage/data_state. The goal is to establish how much disk space is being used by permanent products (raw imfiles), transient products (chip stage images), and final output products (stacks). This work has been delayed by other concerns and by an issue with the perl installation on ippbXX. The final summary could then be used to recalculate the table in `trunk/tools/diskspace/sizes_from_counts.pl`. That script takes a list of counts (any of the run_im_counts.dat files), applies the average size for each stage/data_state, and reports which stages are using the most disk space. Running this script on the count file from 2011-10-09 yields (in part): |
| | 64 | {{{ |
| | 65 | CHIP cleaned PRODUCT 31179.7770 0.001840 16945531 |
| | 66 | CHIP error_cleaned PRODUCT 2652.1420 0.105432 25155 |
| | 67 | CHIP error_scrubbed PRODUCT 232.8581 0.039798 5851 |
| | 68 | CHIP full TRANSIENT 75617.5081 0.119770 631356 |
| | 69 | CHIP purged PRODUCT 527.8040 0.001840 286850 |
| | 70 | CHIP scrubbed PRODUCT 11.0271 0.001840 5993 |
| | 71 | CHIP update TRANSIENT 348.3473 0.105432 3304 |
| | 72 | }}} |
| | 73 | This shows that, of the space used by chip (on the disks scanned), roughly 30% is consumed by retained output products (class PRODUCT) from previous processing, with the remainder held by TRANSIENT data. There is also a summary printed: |
| | 74 | {{{ |
| | 75 | PRODUCT 272237.058713 |
| | 76 | TRANSIENT 160852.076341 |
| | 77 | PERMANENT 805663.824597 |
| | 78 | SUM 1238752.959651 |
| | 79 | }}} |
| | 80 | which displays the calculated sizes used by each of the three classes of products, along with the sum. This sum appears to be accurate only to about 10%, and would certainly be improved by fixing the ippbXX perl issue and fully recalculating the size/count table used. |
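The size-from-count estimate can be reproduced with a short Python sketch. It is not a port of `sizes_from_counts.pl`. The column meanings (class, average size per component, count) are inferred from the CHIP output above, where the size column matches average size times count to within rounding; the function name `class_totals` is hypothetical, and the size unit is not stated on this page.

```python
# (stage, data_state, class, average size per component, count),
# copied from the CHIP rows of the script output above.
rows = [
    ("CHIP", "cleaned", "PRODUCT", 0.001840, 16945531),
    ("CHIP", "error_cleaned", "PRODUCT", 0.105432, 25155),
    ("CHIP", "error_scrubbed", "PRODUCT", 0.039798, 5851),
    ("CHIP", "full", "TRANSIENT", 0.119770, 631356),
    ("CHIP", "purged", "PRODUCT", 0.001840, 286850),
    ("CHIP", "scrubbed", "PRODUCT", 0.001840, 5993),
    ("CHIP", "update", "TRANSIENT", 0.105432, 3304),
]


def class_totals(rows):
    """Estimate the size per class as sum(average size * count)."""
    totals = {}
    for _stage, _state, cls, avg_size, count in rows:
        totals[cls] = totals.get(cls, 0.0) + avg_size * count
    return totals


totals = class_totals(rows)
grand_total = sum(totals.values())
product_share = totals["PRODUCT"] / grand_total

for cls, size in sorted(totals.items()):
    print(f"{cls:10s} {size:12.4f}")
print(f"PRODUCT share of CHIP usage: {product_share:.1%}")
```

With the CHIP rows alone, the PRODUCT share comes out close to the 30% figure quoted above. Because the estimate multiplies counts by tabulated averages, its accuracy depends directly on how current the size/count table is.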