[Cuis-dev] testing Float results

Andres Valloud ten at smallinteger.com
Tue Jun 4 13:12:55 PDT 2024


I think checking for a relative error on the order of 10^-4, when double 
precision floating point numbers can represent relative errors on the 
order of 10^-16, is not good.  It means the accepted error can be roughly 
10^12 times larger than the precision actually available, while the 
result still counts as "correct".
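
For scale, a quick workspace check (this uses Float>>ulp, mentioned 
later in this thread; the printed values are what I would expect for 
IEEE doubles):

	1.0 ulp. "2.220446049250313e-16 -- relative spacing near 1.0"
	0.0001 / 1.0 ulp. "about 4.5e11, i.e. roughly 10^12"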

Moreover, the implementation is incorrect: when the numbers being 
compared are denormals, multiplying by 0.0001 underflows to zero, so 
the comparison in the code degenerates to things like

	nonNegative < 0.0

which always fails.
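
(If the comparison scales its 0.0001 tolerance by the operands, as 
described above, a workspace expression shows the underflow directly:

	Float fminDenormalized * 10 * 0.0001. "0.0 -- underflows"

and then neither a nonzero nor a zero difference can pass the < test.)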

For example,

good := 10 raisedTo: 16.
bad := good + (10 raisedTo: 12).
[
	TestCase new assert: good asFloat isCloseTo: bad asFloat.
	#allGood
] on: TestFailure do: [:ex | ex return: #allWrong]

returns #allGood (but the relative error is huge!).  Meanwhile,

good := 10 raisedTo: 16.
bad := good + (10 raisedTo: 13).
[
	TestCase new assert: good asFloat isCloseTo: bad asFloat.
	#allGood
] on: TestFailure do: [:ex | ex return: #allWrong]

returns #allWrong, because apparently the results become unacceptable 
only after losing more than ~75% of the precision available.  Finally,

good := Float fminDenormalized * 10.
bad := good.
[
	TestCase new assert: good asFloat isCloseTo: bad asFloat.
	#allGood
] on: TestFailure do: [:ex | ex return: #allWrong]

returns #allWrong even though good = bad.

This kind of relative error checking makes sense, of course, but only 
when you do substantial numerical analysis homework to find out what 
the tolerable threshold actually is for the application at hand.  In 
practice this is not done, and arbitrary statements like "0.0001 
relative error is good enough for everyone" effectively suppress 
curiosity about how things actually work.  On top of that, this 
threshold is so lax that it sweeps all but the most egregious errors 
under the rug.

Intel got into a lot of trouble over its Pentium FDIV bug for errors 
much smaller than this.  The x87 FPU had trigonometric transcendental 
instructions that were grossly imprecise in certain cases (e.g. only 
about 15 bits of the 53-bit mantissa made sense and the rest were 
rubbish) because of poor argument reduction.  Modern x86 vector units 
have no hardware transcendental support because software implementations 
on SSE achieve excellent precision with far less effort than hardware, 
and they are very fast as well.  These are the kinds of problems you 
ignore when you say things like "it's within 10^-4", and obviously that 
is contrary to understanding, personal mastery, and so on.

So instead of that, I would ensure that the actual result is within a 
certain, well understood number of ulps of the expected value.  The ulp 
tolerance depends on the numerical analysis for the calculations being 
done.  If you do not do this, then you constantly run the risk of living 
in the world of the Indiana pi bill.

https://en.wikipedia.org/wiki/Indiana_pi_bill
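
As a sketch of what such a test could look like, using the 
Float>>isWithin:floatsFrom: message mentioned further down in this 
thread (the 2 ulp tolerance is just a placeholder for whatever the 
analysis justifies, and resultUnderTest stands for the value produced 
by the code being tested):

	| expected actual |
	expected := 2 sqrt.
	actual := resultUnderTest. "hypothetical value under test"
	TestCase new assert: (actual isWithin: 2 floatsFrom: expected)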

There is no royal road to floating point arithmetic.

On 6/4/24 12:06 PM, Mark Volkmann via Cuis-dev wrote:
> How do you feel about using "approximately equal" tests for things like 
> unit tests for the calculation of areas of 2D shapes such as circles?
> 
> On Mon, Jun 3, 2024 at 10:47 PM Martin McClure via Cuis-dev 
> <cuis-dev at lists.cuis.st> wrote:
> 
>     Hi Mark,
> 
>     Welcome to the list -- good to see you diving into Cuis!
> 
>     As Andres says, there are messages to do this kind of test on Floats.
> 
>     I find myself compelled, however, to warn about using such
>     "approximately equal" tests when inappropriate. I recently discovered
>     this kind of usage in an ancient test framework, and it was allowing
>     tests to pass that should not have. Those tests are being replaced with
>     equality tests.
> 
>     As I commented to a co-worker just this morning, accepting as correct a
>     Float result that is one ULP different from the correct Float is really
>     no different than accepting 5 as a correct answer to 2 + 2.
> 
>     There are, of course, times when figuring out the exact rounding
>     expected in a sequence of floating-point operations is impractical, and
>     accepting a certain amount of cumulative error is OK. Floats are often
>     used in applications where accuracy is only required to some specific
>     precision, but it's also good to keep in mind that each Float precisely
>     represents one value, and each operation on a Float has only one
>     correct
>     answer.
> 
>     Regards,
>     -Martin
> 
>     On 6/3/24 19:49, Andres Valloud via Cuis-dev wrote:
>      > Look at Float>>isWithin:floatsFrom:, and see also Float>>ulp.
>      >
>      > On 6/3/24 4:36 PM, Mark Volkmann via Cuis-dev wrote:
>      >> Is there a function that tests whether two Float values are "close"
>      >> (within some delta)?
>      >> I can write it, but I thought that might be provided.
>      >> I looked at all the methods in the Float class, but didn't find one
>      >> like that.
>      >>
>      >> --
>      >> R. Mark Volkmann
>      >> Object Computing, Inc.
>      >>
> 
>     -- 
>     Cuis-dev mailing list
>     Cuis-dev at lists.cuis.st
>     https://lists.cuis.st/mailman/listinfo/cuis-dev
> 
> 
> 
> -- 
> R. Mark Volkmann
> Object Computing, Inc.
> 

