[Cuis-dev] Problems in class Number
Andres Valloud
ten at smallinteger.com
Mon Oct 7 22:20:03 PDT 2019
Agustin, nice to see someone looking into these kinds of things :).
> * In Integer, the *sumTo: anInteger* method
> computes "self+(self+1)+...+anInteger". The problem here is that if
> self > anInteger this operation does not make sense. For this case,
> the implementation returns 0 instead of raising an error. This kind
> of makes sense, because not adding anything could be thought of as
> 0; nevertheless this is questionable.
Well, it really depends on the perspective. For instance, if one adopts
a point of view where definitions must be very strict, these seemingly
"unlawful" operations appear as a problem. In practice, however, it's
very useful to let empty sums add to zero. Note that zero is not just
any number: it is the additive identity in the integers.
Here are a couple of situations that come up in Smalltalk code where
pragmatics give context and meaning to seemingly ill-defined operations.
-1 timesRepeat: [self doSomethingUseful]
Obviously -1 timesRepeat: doesn't make any sense if taken head on. What
is a negative number of times, anyhow? However, it is useful that such
an expression does nothing, because that -1 could be a variable or the
result of some other message send that answers -1 (or -5, or some other
negative number) to indicate that there are no useful cases to consider,
and the sender can then rely on the fact that a negative timesRepeat:
does nothing.
Similarly,
1 to: 0 do: [:x | x halt]
does nothing, as it should.
Beyond that, since sumTo: is closely related to finite calculus, I'd
point out that in that context the usual definition of sums is such that
the terms can be defined in terms of sets and conditionals, not just
index ranges. When one can show the set of potential terms is empty,
then the sum is defined to be zero. See for example Concrete
Mathematics, pg. 24 (second edition).
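Roughly in that book's notation, one would write, using Iverson brackets,
$\sum_{a \le k \le b} f(k) = \sum_{k} f(k)\,[a \le k \le b]$,
and when a > b no k satisfies the condition, so every bracket is zero
and the sum collapses to 0.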
By the way, the entire Concrete Mathematics book is awesome, you should
take a look into it if you haven't already. Plus, if you find a way to
improve any book by Knuth, and you send that bug report to Knuth and he
accepts the improvement, then you get a check (ok, these days you get an
account in the fictional Bank of San Serriffe rather than an actual
paper check, but still!). See here:
https://www-cs-faculty.stanford.edu/~knuth/news08.html
I got a paper check at one point by paying attention to his books and
working with quadratic residues in Smalltalk. I don't see why you
shouldn't be able to get one that way, too.
If you find something to send him, take your time to methodically make
sure the report is concise and correct. For neat ways to report bugs in
particular, and ask questions in general, you can see some suggestions
by going through Eric Raymond's article here:
http://catb.org/~esr/faqs/smart-questions.html
> * Analogously, the *productTo: anInteger* method computes
> self*(self+1)*...*anInteger. Similarly, in this case when self >
> anInteger, it returns 1 instead of an error. In this case, I don't
> think there is a direct reason to say that not multiplying
> any numbers should be equal to 1.
Similarly to the above, the empty product (as in $\prod_{k=1}^{0} n_k$) is
defined to be the multiplicative identity, i.e. 1.
See Concrete Mathematics, pg. 106 (second edition).
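One way to see why that convention is the useful one: with
$\prod_{k \in \emptyset} a_k = 1$, the identity
$\prod_{k \in K \cup \{j\}} a_k = a_j \cdot \prod_{k \in K} a_k$
(for $j \notin K$) keeps holding even when K starts out empty, just as
empty sums equal to 0 keep the analogous identity for addition working.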
> * A detail, if we now make this change in the method *productTo:
> anInteger*, there will be a small problem in the method *factorial*.
> This is because what factorial does is /"^1 productTo: self"/ (after
> checking that self is not negative). So, with this new change, if we
> evaluate /0 factorial/ it will raise an error, because we would be
> computing /"^1 productTo: 0"/, and this would not be
> defined. Anyway, this can be easily fixed by checking this border case.
Actually, 0! = 1 by definition, see Concrete Mathematics, pg. 47 and on
(second edition). Note that n! is the product of n integers, either
starting at 1 and ending at n, or starting at n and ending at 1. So 0!
is an empty product, and by the previous case it's 1.
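In other words, $0! = \prod_{k=1}^{0} k = 1$, and the recurrence
$n! = n \cdot (n-1)!$ then already holds at n = 1. That is also why the
existing ^1 productTo: self implementation answers 0 factorial correctly
with no extra border case, as long as productTo: keeps answering the
empty product.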
> * The *raisedToInteger: exp modulo: m* method in Integer has a very
> big problem. If we compute, for example, /"5 raisedTo: 0 modulo:
> 0"/, this returns 1. This means that, according to Smalltalk, the
> remainder of dividing 1 (= 5^0) by 0 is equal to 1 (yes, division by
> zero!!). I think you can see the problem. This is due to the first line
> of the method, which says /"(exp = 0) ifTrue: [^ 1]."/ and does
> not check anything else. This problem can be easily fixed by
> checking whether m = 0 just before.
I agree, the current code appears to be wrong. The initials on the code
belong to Juan Vuletich and Nicolas Cellier. Guys, is there a reason why
e.g. 5 raisedTo: 0 modulo: 0 should answer 1 rather than fail? I don't
see any, but...
Assuming the code is broken and needs to be fixed, one could
alternatively write the initial guard clause like this:
exp = 0 ifTrue: [^1 \\ m].
because then the m = 0 case will fail.
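A quick way to see what that guard buys, with plain workspace
arithmetic (nothing here depends on the rest of the method):
1 \\ 3.  "answers 1, the expected residue of x raisedTo: 0 for m > 1"
1 \\ 1.  "answers 0, which also fixes the m = 1 case: every residue modulo 1 is 0"
1 \\ 0.  "signals ZeroDivide, so 5 raisedTo: 0 modulo: 0 fails as it should"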
> * Another detail, in Number these methods are defined: *mod:
> divisor*, *rem: divisor* and *\\ divisor*. All of these are thought
> of as some kind of remainder of a division. Particularly, the method
> *mod: divisor* maps to the mathematical modulo operation of
> modular arithmetic. Smalltalk defines this operation in general for
> every number, but mathematically this operation should
> only be defined for the natural numbers, thus some check
> for this could be added. "For a positive integer n, two numbers a and
> b are said to be congruent modulo n ...[]"
> <https://en.wikipedia.org/wiki/Modular_arithmetic>
I see no contradiction here. I agree that for positive integers it's
possible to define the mod congruence in terms of Euclidean division.
That, however, does not forbid extending the definition to all integers,
consistently using the same Euclidean division algorithm. The key here
is that the remainder r of dividing by d always satisfies 0 <= r < |d|.
In fact, when I proposed using the mod: selector for this purpose in the
first place, I went through every sender of \\ and rem: to verify they
were being used as intended. Surprise: there was a sender in Color
that, because \\ and rem: do not guarantee non-negative remainders,
ended up calculating negative hues which would subsequently cause
errors. That was an ideal place to use mod:, because in that particular
instance what was required was the Euclidean remainder.
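For a concrete comparison, assuming mod: answers the Euclidean
remainder described above (\\ floors, rem: truncates toward zero):
-7 \\ 2.  "1: the remainder takes the sign of the divisor"
7 \\ -2.  "-1"
-7 rem: 2.  "-1: the remainder takes the sign of the dividend"
7 rem: -2.  "1"
-7 mod: 2.  "1: always 0 <= r < |d|"
7 mod: -2.  "1"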
> * Last detail, if we compute /"0 raisedToInteger: 0"/ this returns 1.
> I am not a math expert, but I don't think this operation should be
> defined if there is not even a consensus on the answer.
What's the use case? Here, since the receiver is an integer, one might
think the use case is algebra or combinatorics, so the usual argument
for k^0 = 1 would apply, like this:
https://en.wikipedia.org/wiki/Zero_to_the_power_of_zero
This seems especially relevant because, in the example you mention, the
exponent is also an integer, so the vast majority of the article above
applies. That material covers so many applications (discrete
exponents, algebra, combinatorics, set theory, polynomials, power
series, limits), states that the consensus is that 0^0 = 1, and adds
Knuth's own endorsement, so it seems 0^0 = 1 is the approach that is
most likely to be useful by a very large margin of safety.
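One concrete illustration of why the discrete convention matters: the
binomial theorem
$(x + y)^n = \sum_{k=0}^{n} \binom{n}{k} x^k y^{n-k}$
only holds at x = 0, y = 1 if 0^0 = 1, because the left hand side is 1
and the only surviving term on the right is $\binom{n}{0}\, 0^0$.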
> * The *isPrime* method in Integer makes some optimizations in order to
> run the algorithm in O(sqrt(self)) instead of the naive way in
> O(self). This is very clever, but the constant factor of this
> method can still be improved significantly. I share with you my
> implementation of *isPrimeFast* with a small explanation. This
> implementation generally runs more than 3 times faster than the
> current one. I also leave you a test that checks its correctness, and
> some other tests that check the complexity I mentioned.
I see what you did there, but I do not know how to reproduce the time
tests you mention. I built a sample of integers between 1 and 2^32 (I
didn't go up to 2^64 because that would require O(2^32) operations each,
and I want that to finish in reasonable time), and I get something like
a 2x performance improvement rather than 3x. This seems to make sense
because the approach you propose halves the \\ operations (8 remain out
of the 16 the current code is doing, for every batch of 30 potential
divisors).
slicer := 1024.
thickness := 255.
maxK := 1 bitShift: 32.
integers := 1 to: maxK by: maxK // slicer
:: inject: OrderedCollection new
into: [:t :x |
t add: x.
thickness timesRepeat: [t add: t last + 1].
t yourself]
:: asArray.
Time millisecondsToRun:
[1 to: integers size do:
[:x | (integers at: x) isPrime]]
Using the above code (which I could not format more nicely in this
email), I get about 4.8s for isPrime, and about 2.4s for isPrimeFast.
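As an aside on the batches of 30: I assume the 8 surviving candidate
divisors per block are the residues coprime to 30, which a quick
workspace check confirms there are exactly 8 of:
(1 to: 30) select: [:r | (r gcd: 30) = 1]  "#(1 7 11 13 17 19 23 29)"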
Generally, isPrime shouldn't send isProbablyPrime because isPrime is
meant to be deterministic, and one shouldn't assume that the
probabilistic algorithm today will happen to provide the correct
deterministic answer tomorrow.
Why is the (each+31) case first in the trial division loop?
Andres.