Changeset 5397
- Timestamp:
- Oct 20, 2005, 8:21:35 AM (21 years ago)
- Location:
- trunk/Ohana/src/addstar/doc
- Files:
-
- 2 edited
-
Changes.txt (modified) (1 diff)
-
notes.txt (modified) (2 diffs)
-
trunk/Ohana/src/addstar/doc/Changes.txt
r5386 r5397

- 2005.10.19

- I have added the client / server mode, and tested it out to a
- fair degree. It handles all of the available addstar modes,
- including -ref and -cat. A handful of things still need to be
- done, including pushing some of the logic to enforce sorting
- into the dvo load_catalog functions. Also, I need to correct
- the idiosyncratic problem with sky vs Myyyy for the skyprobe
- data. There are a handful of other improvements that are
- needed before addstar / DVO is ready for IPP release.
- However, at this point, it is ready for internal use by the
- grad students, but not yet ready for CFHT use with Elixir.
+ 2005.10.20 : addstar-1.3

- 2005.10.07
+ This release incorporates many substantial improvements
+ needed to handle the panstarrs data problems:
+ - internal data representation now high precision
+ - multiple data storage formats (LONEOS, ELIXIR, PANSTARRS)
+ - multiple data storage modes (RAW, MEF, SPLIT)
+ - alternative matching scheme (-closest)
+ - sorted output tables now optional
+ - incremental updates possible for inserts
+ - client / server set now defined

- I have finished the basic implementation of the update mode.
- I have been able to demonstrate substantial improvements in
- speed when the number of existing measurements dominates the
- total number of measurements and the number of averages is
- typically small compared to the number of measurements (ie,
- most objects are real, detected in most images, and each new
- image supplies many new measurements of objects which exist
- and not many of objects which don't exist already). The speed
- gain is significant in this context because the average table
- is small compared to the measure table; since both update and
- full-load methods require the complete average table, there is
- no difference in the load time for the average table.
+ 2005.08.15 : addstar-1.2

- I was having some memory collision problems, and attempting to
- use the ohana_allocate functions reminded me that the libFITS
- functions were not supported under ohana_allocate. This was
- unhelpful. I bit the bullet and split libohana into libohana
- (base functions only, including ohana_allocate) and libdvo
- (functions based on the libautocode structures). Doing this
- allowed me to make libFITS depend on libohana (including
- ohana_allocate). BUT, this forced me to change all LDFLAGS
- entries in ohana to swap -lohana -lFITS for -lFITS -lohana,
- and to add include <fitsio.h> in some cases.
+ This is a snapshot release before I begin serious work on the
+ code to handle alternate formats and so forth needed for the
+ panstarrs support. Minor updates since v1.1, mostly to fix
+ 2MASS issues and to stay in sync with the libs.

- 2005.10.06
- split / nosort / update
- I have added a few new concepts to addstar recently: split
- catalog files, nosort for the measurement table, and
- update-only.
-
- split mode
-
- The split mode is quite straightforward. In this mode, each
- catalog is represented by a set of four files: *.cpt, *.cpm,
- *.cpn, *.cps. Each file contains only one FITS table of the
- data, along with basic header and empty matrix. Having
- individual tables for each component of the database lets me
- add entries without re-writing the entire table. This should
- save on I/O operations in the long run.
-
- The first file contains the table of averages, and is the file
- normally identified in the table lookup functions. The header
- of this file contains the names of the other table files
- (paths relative to the directory containing the cpt file).
- The names and extensions are specified in 'mkcatalog.c'; all
- other functions use the defined filename references, rather
- than expecting a naming convention.
-
- The additional files contain the measures (cpm), missings
- (cpn), and secfilt (cps) elements of the catalog tables.
-
- To facilitate the handling of the additional filenames, file
- pointers, and headers, Catalog was extended to include
- pointers to the measure, missing, and secfilt files as
- additional catalogs. When the data are loaded into memory,
- these catalogs are locked (as usual), and file information is
- stored in the individual Catalog entries; the data segments
- are all loaded into the main catalog pointers (eg, measures
- are loaded into catalog[0].measure, rather than
- catalog[0].measure_catalog[0].measure).
-
- The function 'load_catalog' auto-recognizes the SPLIT format
- by looking for the header keyword MEASURE, identifying the
- file containing the measures. The identification of the RAW
- format and the SPLIT format are not cross-checked: if the
- NAXIS keyword is set to 2, the file is assumed to be RAW, even
- if the MEASURE keyword is present. Careful with this (though
- there is no reason the main matrix should be used in a basic
- database table).
-
- nosort
-
- The nosort option by itself provides a minor processing
- speed-up by deferring the re-sorting of the measurement table
- until after multiple addstar processes are run. addstar
- should not require the measurements to be sorted, so this step
- can be safely deferred if only addstars are being performed.
- The other DVO operations require the sorted table, so the sort
- must be performed before they are run (either as part of the
- catalog load, not implemented yet, or with a call to addstar
- without the -nosort option set). The real goal of the nosort
- option is to enable the -update concept in addstar, in which
- only the new rows are written out; this will only work if
- addstar can handle unsorted measures.
-
- The nosort option required the addition of a 'sorted' element
- in the Catalog structure to track if the data are sorted or
- not. On load, this flag is set based on the value of the
- header keyword SORTED; if the data is sorted during addstar,
- the flag is appropriately set, otherwise it is set FALSE by
- default.
-
- The nosort option requires a function which can generate the
- 'next_meas' link sequence based on the measure table. There
- is now a function called 'build_meas_link' which generates a
- correct link list; there is also the pair of functions
- 'init_meas_links' and 'init_miss_links' to generate the links
- in the event that the table is sorted (should be much
- quicker).
-
- The 'missing' table is problematic: the LONEOS and ELIXIR
- formats do not carry an averef entry, thus they do not have
- enough information to define the links based only on the
- missing table. This means we are forced to write out a sorted
- missing table; the nosort option is invalid for the missing
- table. One future upgrade path is to add the averef entry to
- the PANSTARRS format and then only require the missing table
- to be sorted if the format is old and does not support
- -nosort. (Note also that, for the moment, the missing table
- has only a single valid format).
-
- In the process of defining the nosort option, I also cleaned
- up the find_matches functions a bit to use clearer functions
- for the links.
-
- update
-
- The 'update' process in principle allows addstar to
- substantially reduce the amount of I/O it needs to perform by
- only requiring addstar to write out new measures and new
- average/secfilt entries.
-
- The 'missing' table is problematic: since the format does not
- support the 'nosort' option, it is not possible to use update
- with the missing table. This means we are forced to write out
- a complete, sorted missing table. This is currently
- implemented in update_catalog_split by simply writing out the
- complete missing table. In fact, this choice is still flawed
- because the average table, since it is not written out in full
- each time, is inconsistent with the missing table: the Nn
- entries for each average, which identify the number of
- missing entries, are not updated. In practice, this means
- that the -update option forces the use of the -missed option,
- though at the moment, this is not forced or checked in any
- way.
-
- Note that the 'missed' table contains duplicate information
- and can, in principle, be completely regenerated at any time.
- This should be an addstar option: to re-construct the missing
- table, potentially with constraints on the images which are
- searched for matches.
-
- 2005.10.04
- - moved measure/missing list manipulation to separate functions
- - added concept of sorted / unsorted measure catalog
- - defined build_meas_links and reorder_measure,missing
- - some cleanup of both find_matches.c and find_matches_closest.c
-
- 2005.10.03:
- - dropping GSCRegion *region entry from find_matches (unused!)
- - adding function find_matches_closest (alternate matches)
-
- 2005.08.19:
- changed load_photcode to handle CATMODE and CATFORMAT variations
- - addstar.h: added CATMODE and CATFORMAT globals
- - ConfigInit: read CATMODE and CATFORMAT from config
- - gcatalog: set catalog.catmode from CATMODE
- - mkcatalog: set CATFORMAT and CATMODE for new catalog
- - wcatalog: set CATFORMAT for new catalog
-
- using full photometry conversions in find_matches
- added SetZeroPoint to gstars to enable phot conversions
-
- 2005.08.15:
- cleanup of the minor Wall,Werror messages
-
- 2005.07.06 : current release is addstar-1.1
+ 2005.07.06 : addstar-1.1

  I have made a variety of fairly substantial changes since
-
trunk/Ohana/src/addstar/doc/notes.txt
r5347 r5397
  with sorted measure tables; add this as a feature
  of the load_catalog API?
+
+ 2005.10.19
+
+ I have added the client / server mode, and tested it out to a
+ fair degree. It handles all of the available addstar modes,
+ including -ref and -cat. A handful of things still need to be
+ done, including pushing some of the logic to enforce sorting
+ into the dvo load_catalog functions. Also, I need to correct
+ the idiosyncratic problem with sky vs Myyyy for the skyprobe
+ data. There are a handful of other improvements that are
+ needed before addstar / DVO is ready for IPP release.
+ However, at this point, it is ready for internal use by the
+ grad students, but not yet ready for CFHT use with Elixir.

  2005.10.14
… …
  from libohana. That is probably not a bad plan in any case...

+ 2005.10.07
+
+ I have finished the basic implementation of the update mode.
+ I have been able to demonstrate substantial improvements in
+ speed when the number of existing measurements dominates the
+ total number of measurements and the number of averages is
+ typically small compared to the number of measurements (ie,
+ most objects are real, detected in most images, and each new
+ image supplies many new measurements of objects which exist
+ and not many of objects which don't exist already). The speed
+ gain is significant in this context because the average table
+ is small compared to the measure table; since both update and
+ full-load methods require the complete average table, there is
+ no difference in the load time for the average table.
+
+ I was having some memory collision problems, and attempting to
+ use the ohana_allocate functions reminded me that the libFITS
+ functions were not supported under ohana_allocate. This was
+ unhelpful. I bit the bullet and split libohana into libohana
+ (base functions only, including ohana_allocate) and libdvo
+ (functions based on the libautocode structures). Doing this
+ allowed me to make libFITS depend on libohana (including
+ ohana_allocate). BUT, this forced me to change all LDFLAGS
+ entries in ohana to swap -lohana -lFITS for -lFITS -lohana,
+ and to add include <fitsio.h> in some cases.
+
+ 2005.10.06
+ split / nosort / update
+ I have added a few new concepts to addstar recently: split
+ catalog files, nosort for the measurement table, and
+ update-only.
+
+ split mode
+
+ The split mode is quite straightforward. In this mode, each
+ catalog is represented by a set of four files: *.cpt, *.cpm,
+ *.cpn, *.cps. Each file contains only one FITS table of the
+ data, along with basic header and empty matrix. Having
+ individual tables for each component of the database lets me
+ add entries without re-writing the entire table. This should
+ save on I/O operations in the long run.
+
+ The first file contains the table of averages, and is the file
+ normally identified in the table lookup functions. The header
+ of this file contains the names of the other table files
+ (paths relative to the directory containing the cpt file).
+ The names and extensions are specified in 'mkcatalog.c'; all
+ other functions use the defined filename references, rather
+ than expecting a naming convention.
+
+ The additional files contain the measures (cpm), missings
+ (cpn), and secfilt (cps) elements of the catalog tables.
+
+ To facilitate the handling of the additional filenames, file
+ pointers, and headers, Catalog was extended to include
+ pointers to the measure, missing, and secfilt files as
+ additional catalogs. When the data are loaded into memory,
+ these catalogs are locked (as usual), and file information is
+ stored in the individual Catalog entries; the data segments
+ are all loaded into the main catalog pointers (eg, measures
+ are loaded into catalog[0].measure, rather than
+ catalog[0].measure_catalog[0].measure).
+
+ The function 'load_catalog' auto-recognizes the SPLIT format
+ by looking for the header keyword MEASURE, identifying the
+ file containing the measures. The identification of the RAW
+ format and the SPLIT format are not cross-checked: if the
+ NAXIS keyword is set to 2, the file is assumed to be RAW, even
+ if the MEASURE keyword is present. Careful with this (though
+ there is no reason the main matrix should be used in a basic
+ database table).
+
+ nosort
+
+ The nosort option by itself provides a minor processing
+ speed-up by deferring the re-sorting of the measurement table
+ until after multiple addstar processes are run. addstar
+ should not require the measurements to be sorted, so this step
+ can be safely deferred if only addstars are being performed.
+ The other DVO operations require the sorted table, so the sort
+ must be performed before they are run (either as part of the
+ catalog load, not implemented yet, or with a call to addstar
+ without the -nosort option set). The real goal of the nosort
+ option is to enable the -update concept in addstar, in which
+ only the new rows are written out; this will only work if
+ addstar can handle unsorted measures.
+
+ The nosort option required the addition of a 'sorted' element
+ in the Catalog structure to track if the data are sorted or
+ not. On load, this flag is set based on the value of the
+ header keyword SORTED; if the data is sorted during addstar,
+ the flag is appropriately set, otherwise it is set FALSE by
+ default.
+
+ The nosort option requires a function which can generate the
+ 'next_meas' link sequence based on the measure table. There
+ is now a function called 'build_meas_link' which generates a
+ correct link list; there is also the pair of functions
+ 'init_meas_links' and 'init_miss_links' to generate the links
+ in the event that the table is sorted (should be much
+ quicker).
+
+ The 'missing' table is problematic: the LONEOS and ELIXIR
+ formats do not carry an averef entry, thus they do not have
+ enough information to define the links based only on the
+ missing table. This means we are forced to write out a sorted
+ missing table; the nosort option is invalid for the missing
+ table. One future upgrade path is to add the averef entry to
+ the PANSTARRS format and then only require the missing table
+ to be sorted if the format is old and does not support
+ -nosort. (Note also that, for the moment, the missing table
+ has only a single valid format).
+
+ In the process of defining the nosort option, I also cleaned
+ up the find_matches functions a bit to use clearer functions
+ for the links.
+
+ update
+
+ The 'update' process in principle allows addstar to
+ substantially reduce the amount of I/O it needs to perform by
+ only requiring addstar to write out new measures and new
+ average/secfilt entries.
+
+ The 'missing' table is problematic: since the format does not
+ support the 'nosort' option, it is not possible to use update
+ with the missing table. This means we are forced to write out
+ a complete, sorted missing table. This is currently
+ implemented in update_catalog_split by simply writing out the
+ complete missing table. In fact, this choice is still flawed
+ because the average table, since it is not written out in full
+ each time, is inconsistent with the missing table: the Nn
+ entries for each average, which identify the number of
+ missing entries, are not updated. In practice, this means
+ that the -update option forces the use of the -missed option,
+ though at the moment, this is not forced or checked in any
+ way.
+
+ Note that the 'missed' table contains duplicate information
+ and can, in principle, be completely regenerated at any time.
+ This should be an addstar option: to re-construct the missing
+ table, potentially with constraints on the images which are
+ searched for matches.
+
+ 2005.10.04
+ - moved measure/missing list manipulation to separate functions
+ - added concept of sorted / unsorted measure catalog
+ - defined build_meas_links and reorder_measure,missing
+ - some cleanup of both find_matches.c and find_matches_closest.c
+
+ 2005.10.03:
+ - dropping GSCRegion *region entry from find_matches (unused!)
+ - adding function find_matches_closest (alternate matches)
+
+ 2005.08.19:
+ changed load_photcode to handle CATMODE and CATFORMAT variations
+ - addstar.h: added CATMODE and CATFORMAT globals
+ - ConfigInit: read CATMODE and CATFORMAT from config
+ - gcatalog: set catalog.catmode from CATMODE
+ - mkcatalog: set CATFORMAT and CATMODE for new catalog
+ - wcatalog: set CATFORMAT for new catalog
+
+ using full photometry conversions in find_matches
+ added SetZeroPoint to gstars to enable phot conversions
+
+ 2005.08.15:
+ cleanup of the minor Wall,Werror messages
+
  2005.03.07 : notes related to new version of addstar

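The nosort notes in this changeset describe generating the 'next_meas' link sequence from a measure table that is not sorted by object. The following is only a minimal sketch of that idea, not the code committed here: the function name build_links_sketch, the NO_LINK marker, and the plain integer arrays are all hypothetical, and it assumes that every measure row carries an averef index pointing at its average entry (the notes only state that the older missing-table formats lack one).

```c
#define NO_LINK (-1)  /* hypothetical end-of-chain marker, not from addstar */

/* Build per-object measurement chains from an UNSORTED measure table.
 *
 *   averef[i] : index of the average (object) that measure i belongs to
 *   first[a]  : output, index of the first measure of average a, or NO_LINK
 *   next[i]   : output, index of the next measure of the same average,
 *               or NO_LINK at the end of the chain
 *
 * Returns 0 on success, -1 if any averef value is out of range.
 */
int build_links_sketch(const int *averef, int n_meas, int n_ave,
                       int *first, int *next)
{
    /* Start with every average owning an empty chain. */
    for (int a = 0; a < n_ave; a++) {
        first[a] = NO_LINK;
    }

    /* Walk the table backwards and prepend each measure to its chain.
     * Prepending in reverse order leaves every chain in increasing
     * table order without needing a separate "tail" array. */
    for (int i = n_meas - 1; i >= 0; i--) {
        int a = averef[i];
        if (a < 0 || a >= n_ave) {
            return -1;  /* dangling reference: refuse to build links */
        }
        next[i] = first[a];
        first[a] = i;
    }
    return 0;
}
```

The pass is linear in the number of measures and needs no sort, which is the property the -update mode relies on. A table without an averef column cannot be linked this way, which matches the reason the notes give for still writing the missing table out sorted.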
