[Cuis-dev] Problems in class Number

Andres Valloud ten at smallinteger.com
Mon Oct 7 22:20:03 PDT 2019


Agustin, nice to see someone looking into these kinds of things :).

>   * In Integer, the *sumTo: anInteger* method
>     computes "self+(self+1)+...+anInteger". The problem here is that if
>     self > anInteger this operation does not make sense. For this case,
>     the implementation returns 0 instead of raising an error. This kind
>     of makes sense, because not adding anything could be thought of as
>     0; nevertheless, this is questionable.

Well, it really depends on the perspective.  For instance, if one adopts 
a point of view where definitions must be very strict, these seemingly 
"unlawful" operations appear to be a problem.  In practice, however, it's 
very useful to let empty sums add to zero.  Note that zero is not just 
any number: it is the additive identity of the integers.

Here are a couple of situations that come up in Smalltalk code where 
pragmatics give context and meaning to seemingly ill defined operations.

	-1 timesRepeat: [self doSomethingUseful]

Obviously -1 timesRepeat: doesn't make any sense if taken at face value. 
What is a negative number of times, anyhow?  However, it is useful that 
such an expression does nothing, because that -1 could be a variable, or 
the result of some other message send that answers -1 (or -5, or some 
other negative number) to indicate that there are no useful cases to 
consider.  Callers can then rely on the fact that a negative 
timesRepeat: does nothing.

Similarly,

	1 to: 0 do: [:x | x halt]

does nothing, as it should.

Beyond that, since sumTo: is closely related to finite calculus, I'd 
point out that in that context the usual definition allows sums over 
sets and conditions, not just index ranges.  When one can show the set 
of potential terms is empty, the sum is defined to be zero.  See for 
example Concrete Mathematics, pg. 24 (second edition).
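
The empty-range convention is easy to exercise directly in the image; a 
small illustration (the expected values follow from the identities above 
and the sumTo: behavior described in this thread):

	(3 to: 2) inject: 0 into: [:sum :each | sum + each].	"0: the empty sum"
	5 sumTo: 3.	"0, by the same convention"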

By the way, the entire Concrete Mathematics book is awesome, you should 
take a look into it if you haven't already.  Plus, if you find a way to 
improve any book by Knuth, and you send that bug report to Knuth and he 
accepts the improvement, then you get a check (ok, these days you get an 
account in the fictional Bank of San Serriffe rather than an actual 
paper check, but still!).  See here:

https://www-cs-faculty.stanford.edu/~knuth/news08.html

I got a paper check at one point by paying attention to his books and 
working with quadratic residues in Smalltalk.  I don't see why you 
shouldn't be able to get one that way, too.

If you find something to send him, take your time to methodically make 
sure the report is concise and correct.  For neat ways to report bugs in 
particular, and ask questions in general, you can see some suggestions 
by going through Eric Raymond's article here:

http://catb.org/~esr/faqs/smart-questions.html

>   * Analogously, the *productTo: anInteger* method computes
>     self*(self+1)*...*anInteger. Similarly, when self > anInteger, it
>     returns 1 instead of an error. In this case, I don't think there
>     is a direct reason to define the product of no numbers as 1.

Similarly to the above, the empty product (as in $\prod_{k=1}^0 n_k$) is 
defined to be the multiplicative identity, i.e.: 1.  See Concrete 
Mathematics, pg. 106 (second edition).
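
The same convention can be seen with inject:into: (again a small 
illustration; the productTo: value is the behavior described in this 
thread):

	(1 to: 0) inject: 1 into: [:product :each | product * each].	"1: the empty product"
	5 productTo: 3.	"1, by the same convention"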

>   * A detail: if we now make this change in the method *productTo:
>     anInteger*, there will be a small problem in the method *factorial*.
>     This is because what factorial does is /"^1 productTo: self"/ (after
>     checking that self is not negative). So, with this change, /0
>     factorial/ would raise an error, because we would be computing
>     /"^1 productTo: 0"/, and this would not be defined. Anyway, this
>     can be easily fixed by checking this border case.

Actually, 0! = 1 by definition, see Concrete Mathematics, pg. 47 and on 
(second edition).  Note that n! is the product of n integers, either 
starting at 1 and ending at n, or starting at n and ending at 1.  So 0! 
is an empty product, and by the previous case it's 1.
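
To make the connection concrete, here is a minimal sketch of how 
factorial can lean on the empty product.  This mirrors the definition 
quoted above; it is an illustration, not the exact Cuis source:

	factorial
		"Answer the factorial of the receiver.  0 factorial is the
		empty product, so productTo: correctly answers 1 for it."
		self < 0 ifTrue: [^self error: 'factorial is not defined for negative integers'].
		^1 productTo: self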

>   * The *raisedToInteger: exp modulo: m* method in Integer has a very
>     big problem. If we compute, for example, /"5 raisedTo: 0 modulo:
>     0"/, this returns 1. This means that, according to Smalltalk, the
>     remainder of dividing 1 (= 5^0) by 0 is equal to 1 (yes, division
>     by zero!!). I think you can see the problem. This is due to the
>     first line of the method, /"(exp = 0) ifTrue: [^ 1]"/, which does
>     not check anything else. This problem can be easily fixed by
>     checking whether m = 0 just before.

I agree, the current code appears to be wrong.  The initials on the code 
belong to Juan Vuletich and Nicolas Cellier.  Guys, is there a reason 
why e.g. 5 raisedTo: 0 modulo: 0 should answer 1 rather than fail?  I 
don't see any, but...

Assuming the code is broken and needs to be fixed, alternatively one 
could also write the initial guard clause like this:

	exp = 0 ifTrue: [^1 \\ m].

because then the case m = 0 will fail with a division by zero, as it 
should.
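
For context, a minimal square-and-multiply sketch with such a guard in 
place might look like this (an illustration only, not the actual Cuis 
method):

	raisedToInteger: exp modulo: m
		"Answer self raised to exp, modulo m, by repeated squaring."
		| result base e |
		exp = 0 ifTrue: [^1 \\ m].	"m = 0 fails here with a ZeroDivide, as it should"
		result := 1.
		base := self \\ m.
		e := exp.
		[e > 0] whileTrue:
			[e odd ifTrue: [result := result * base \\ m].
			base := base * base \\ m.
			e := e bitShift: -1].
		^result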

>   * Another detail: in Number these methods are defined: *mod:
>     divisor*, *rem: divisor* and *\\ divisor*. All of these are
>     variations on the remainder of a division. In particular, the
>     method *mod: divisor* maps to the modulo operation of modular
>     arithmetic. Smalltalk defines this operation in general for every
>     number, but mathematically this operation should only be defined
>     for the natural numbers, so some check for this could be added.
>     "For a positive integer n, two numbers a and b are said to be
>     congruent modulo n ...[]"
>     <https://en.wikipedia.org/wiki/Modular_arithmetic>

I see no contradiction here.  I agree that for positive integers it's 
possible to define the mod congruence in terms of Euclidean division. 
That, however, does not forbid extending the definition to all integers, 
consistently using the same Euclidean division algorithm.  The key here 
is that the remainder r of dividing by d always satisfies 0 <= r < |d|.

In fact, when I proposed using the mod: selector for this purpose in the 
first place, I went through every sender of \\ and rem: to verify they 
were being used as intended.  Surprise: there was a sender in Color 
that, because \\ and rem: do not guarantee non-negative remainders, 
ended up calculating negative hues which would subsequently cause 
errors.  That was an ideal place to use mod:, because in that particular 
instance what was required was the Euclidean remainder.
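
The difference between the three selectors shows up with negative 
arguments.  The \\ and rem: values below follow the usual Smalltalk 
floored and truncated division conventions; the mod: values assume the 
Euclidean definition above:

	-7 \\ 3.	"2: takes the sign of the divisor"
	-7 rem: 3.	"-1: takes the sign of the dividend"
	7 \\ -3.	"-2"
	7 rem: -3.	"1"
	-7 mod: 3.	"2: Euclidean remainder, always 0 <= r < |d|"
	7 mod: -3.	"1"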

>   * A last detail: if we compute /"0 raisedToInteger: 0"/, this returns
>     1. I am not a math expert, but I don't think this operation should
>     be defined if there is not even a consensus answer for it.

What's the use case?  Here, since the receiver is an integer, one might 
think the use case is algebra or combinatorics, so one might think the 
usual argument of k^0 = 1 applies, like this:

https://en.wikipedia.org/wiki/Zero_to_the_power_of_zero

This seems especially relevant because, in the example you mention, the 
exponent is also an integer, so the vast majority of the article above 
applies.  Given the many applications covered there (discrete 
exponents, algebra, combinatorics, set theory, polynomials, power 
series, limits), the stated consensus that 0^0 = 1, plus Knuth's own 
endorsement, it seems 0^0 = 1 is the approach most likely to be useful, 
by a very large margin of safety.
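
One concrete reason this matters in code: evaluating a polynomial term 
by term at x = 0 relies on 0^0 = 1 so that the constant term survives. 
A small illustration:

	"p(x) = 3 + 2x + x^2, evaluated naively at x = 0"
	| coefficients x |
	coefficients := #(3 2 1).
	x := 0.
	(0 to: coefficients size - 1) inject: 0 into:
		[:sum :k | sum + ((coefficients at: k + 1) * (x raisedToInteger: k))].
	"Answers 3 only because 0 raisedToInteger: 0 answers 1"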

>   * The *isPrime* method in Integer makes some optimizations in order
>     to run the algorithm in O(sqrt(self)) instead of the naive way in
>     O(self). This is very clever, but the constant factor of this
>     method can still be improved significantly. I share with you my
>     implementation of *isPrimeFast* with a small explanation. This
>     implementation generally runs more than 3 times faster than the
>     current one. I leave you a test that checks its correctness as
>     well, and some other tests that check the complexity I mentioned.

I see what you did there, but I do not know how to reproduce the timing 
tests you mention.  I built a sample of integers between 1 and 2^32 (I 
didn't go up to 2^64 because each test would then require O(2^32) 
operations, and I want the run to finish in reasonable time), and I get 
something like a 2x performance improvement rather than 3x.  This seems 
to make sense because the approach you propose halves the \\ operations 
(8 remain out of the 16 the current code does, for every batch of 30 
potential divisors).
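
For reference, the mod 30 wheel idea can be sketched like this.  This is 
a hypothetical illustration, not Agustin's code: after 2, 3 and 5 are 
handled, only divisors congruent to one of 8 residues mod 30 need 
testing.

	isPrimeWheel30
		"Trial division testing only the 8 residues coprime to 30."
		| d |
		self < 2 ifTrue: [^false].
		#(2 3 5) do: [:p |
			self = p ifTrue: [^true].
			self \\ p = 0 ifTrue: [^false]].
		d := 0.
		[d * d <= self] whileTrue:
			[#(1 7 11 13 17 19 23 29) do: [:offset | | f |
				f := d + offset.
				(f > 1 and: [f * f <= self and: [self \\ f = 0]]) ifTrue: [^false]].
			d := d + 30].
		^true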

	slicer := 1024.
	thickness := 255.
	maxK := 1 bitShift: 32.
	integers := 1 to: maxK by: maxK // slicer
		:: inject: OrderedCollection new
		into: [:t :x |
			t add: x.
			thickness timesRepeat: [t add: t last + 1].
			t yourself]
		:: asArray.
	Time millisecondsToRun:
		[1 to: integers size do:
			[:x | (integers at: x) isPrime]]

Using the above code (which I could not format more nicely in this 
email), I get about 4.8s for isPrime, and about 2.4s for isPrimeFast.

Generally, isPrime shouldn't send isProbablyPrime because isPrime is 
meant to be deterministic, and one shouldn't assume that the 
probabilistic algorithm today will happen to provide the correct 
deterministic answer tomorrow.

Why is the (each+31) case first in the trial division loop?

Andres.

