
Opened 21 years ago

Closed 21 years ago

Last modified 21 years ago

#239 closed defect (fixed)

psVectorHistogram()

Reported by: gusciora@…
Owned by: Paul Price
Priority: high
Milestone:
Component: PSLib SDRS
Version: unspecified
Severity: normal
Keywords:
Cc:

Description

How should the errors psVector be used?

Change History (8)

comment:1 by Paul Price, 21 years ago

Resolution: fixed
Status: new → closed

On p. 4 of ADD-07, there is a paragraph about histograms when using errors.
It is in the "robust statistics" section, but applies equally well to histograms
in general. Added a short section to the ADD:

"When calculating histograms in the presence of known errors in the
input values, the approach described above for the robust statistics
is used (i.e., the histograms become probability density functions)."

comment:2 by Paul Price, 21 years ago

Keywords: VERIFIED added

Closing subsequent to release of SDRS-08, ADD-07.

comment:3 by Paul Price, 21 years ago

Keywords: VERIFIED removed

comment:4 by gusciora@…, 21 years ago

Resolution: fixed
Status: closed → reopened

There is no section about histograms in the current ADD (version 07).

The section on robust statistics mentions histograms, but it is
unclear to me. As I interpret it, each input value will contribute to multiple
bins in the histogram. The contributions to each bin will be equal. What will
that contribution be?

The width is defined as 2.35 times the error. I assume "width" refers to the
boxcar "width". Is the width defined in number of histogram bins? If not, how
do we determine which histogram bins must be modified? What do we do about
fractional bins? For example, if we are to update 1.5 bins surrounding the data
point, what should be done?

What does the phrase "full width at half maximum" mean?

comment:5 by Paul Price, 21 years ago

Resolution: fixed
Status: reopened → closed

The section on histograms is in ADD-08, which hasn't been formally released yet
(bug must have gotten closed in the big closure). It reads:

If the errors in the input values are known, then the same approach is
used, except that the histograms become probability density functions
(PDFs). In this case, the input values are spread out, so that they
do not simply contribute a single unit to the histogram, but rather
contribute a fraction of a value, equivalent to the weight. In the
interests of speed, a boxcar PDF may be used to represent each input
value (as opposed to a Gaussian), where the boxcar width is equal to
$2 \sqrt{2 \ln 2}$ times the error and each input value contributes
constant area. Then the mean, median, mode, standard deviation and
quartiles are estimated in the same manner as above.

\paragraph{Histograms}

When calculating histograms in the presence of known errors in the
input values, the approach described above for the robust statistics
is used (i.e., the histograms become probability density functions).

=====================================================================

So, yes: each point will contribute to multiple bins. You might think of it
as a probability density function, integrated over the bins. As specified, the
contribution width for each value is approx 2.35 (= 2 sqrt(2 ln 2)) times the
error in the value.
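
For reference, the factor comes from the "full width at half maximum" (FWHM)
of a Gaussian: the distance between the two points where the curve falls to
half of its peak value. A short derivation, assuming Gaussian errors with
standard deviation $\sigma$ (the symbol $x$ for the offset from the mean is
introduced here for illustration):

```latex
\begin{align}
\exp\!\left(-\frac{x^2}{2\sigma^2}\right) &= \frac{1}{2} \\
x &= \pm\,\sigma\sqrt{2\ln 2} \\
\mathrm{FWHM} &= 2\sqrt{2\ln 2}\,\sigma \approx 2.355\,\sigma
\end{align}
```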

An example may help. Say we have our histogram bounds being 0, 1, 2, 3, 4, 5;
and our value is 2.5 +/- 0.5.
Then, the width of the contribution is 0.5*2.35 = 1.175. Half the width is
0.5875, so we will treat this value as a boxcar from 2.5 - 0.5875 = 1.9125 to
2.5 + 0.5875 = 3.0875.
Consequently, the bins 0 to 1 and 4 to 5 get no value, because none of the
boxcar overlaps. The bin 1 to 2 overlaps the boxcar over a length of 0.0875
(from 1.9125 to 2), so it gets 0.0875/1.175 ~ 0.0745, the fraction of the
boxcar that overlaps with it; same thing with the bin 3 to 4. The bin 2 to 3
lies entirely inside the boxcar, so it gets 1/1.175 ~ 0.8511.
So the single value 2.5 +/- 0.5 makes the following histogram:
0-1 0
1-2 0.0745
2-3 0.8511
3-4 0.0745
4-5 0

Note that the total adds to one (up to rounding), the number of values added.
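
The overlap calculation in this example can be sketched in C as follows. This
is a minimal illustration, not the PSLib implementation: the function name
boxcarAccumulate and its signature are hypothetical, plain float arrays stand
in for psVectors, and the unrounded factor 2 sqrt(2 ln 2) is used instead of
2.35, so results differ slightly from hand-rounded numbers. Each bin receives
its overlap length divided by the boxcar width, which is what makes the
contributions of one value sum to one.

```c
#include <assert.h>
#include <stddef.h>

/* 2*sqrt(2*ln 2): converts a Gaussian sigma to its FWHM. */
#define FWHM_PER_SIGMA 2.3548200450309493

/*
 * Hypothetical helper (not the PSLib API): spread one value with a known
 * error over the histogram as a unit-area boxcar.
 * bounds has nBins + 1 ascending entries; nums has nBins entries.
 */
static void boxcarAccumulate(const float *bounds, float *nums, size_t nBins,
                             double value, double error)
{
    assert(nBins > 0 && error > 0.0);

    const double width = FWHM_PER_SIGMA * error;
    const double lo = value - 0.5 * width;
    const double hi = value + 0.5 * width;

    for (size_t i = 0; i < nBins; i++) {
        /* Overlap of [lo, hi] with bin [bounds[i], bounds[i+1]]. */
        const double a = bounds[i] > lo ? bounds[i] : lo;
        const double b = bounds[i + 1] < hi ? bounds[i + 1] : hi;
        if (b > a) {
            /* Fraction of the boxcar area that falls in this bin. */
            nums[i] += (float)((b - a) / width);
        }
    }
}
```

Called with bounds 0, 1, 2, 3, 4, 5 and the value 2.5 +/- 0.5, only the bins
1-2, 2-3 and 3-4 are modified. In this sketch, any part of a boxcar that
extends beyond the first or last bound is simply dropped; how that case should
be handled is not covered in this ticket.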

In the interests of clarity, I'm adding this example to the ADD as well. I hope
that's cleared up any remaining confusion. Please let me know if not.

comment:6 by gusciora@…, 21 years ago

Much better! Thanks.

This can be closed.

comment:7 by gusciora@…, 21 years ago

Wait. Don't close it yet. The bins for the histograms are defined as unsigned
32-bit integers. It won't be so easy to add fractional values.

comment:8 by Paul Price, 21 years ago

OK, updated the SDRS:

the vector \code{bounds} specifies the boundaries
of the histogram bins, and must be of type \code{psF32}, while
\code{nums} specifies the number of entries in the bin, and must be of
type \code{psF32} in order to accommodate errors.
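
The reason for the type change can be seen in a small C sketch. It assumes
that psF32 is a typedef for a 32-bit float (the typedef below is written out
for illustration, not copied from PSLib), and the helper addWeight is
hypothetical. Fractional contributions added to an unsigned 32-bit bin are
truncated to zero, while a floating-point bin keeps them.

```c
#include <stdint.h>

/* Assumption: psF32 is a 32-bit float; defined here only for illustration. */
typedef float psF32;

/* Hypothetical helper: add the same weight to an integer and a float bin. */
static void addWeight(uint32_t *intBin, psF32 *floatBin, double weight)
{
    *intBin += (uint32_t)weight;  /* fractional part is discarded */
    *floatBin += (psF32)weight;   /* fractional part is kept */
}
```

With weights smaller than one, the integer bin never changes, whereas the
float bin accumulates the expected total.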
