<div dir="ltr">I added a comment with a warning about reproducibility to Collection>>#atRandom:, Set>>#atRandom: and Bag>>#atRandom:. I also moved the empty check to the top of Collection>>#atRandom: and Bag>>#atRandom:. The idea is that '0 atRandom: aGenerator' produces a different error, and doing 'self emptyCheck' first is better because if we ever change Collection>>#errorEmptyCollection to signal a new exception (say, EmptyCollectionException), we won't need to change these methods. Finally, the last line in those two methods should never be executed unless the collection is broken (i.e. size returns n, but do: iterates over fewer than n elements), so I just do "self error: 'collection invariants broken'".<div><br></div><div>Let me know if you have any more suggestions; I really want the kernel classes to be absolutely perfect.</div><div><br></div><div>Thanks!!</div><div>Luciano</div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Fri, May 13, 2022 at 11:01 PM Juan Vuletich <<a href="mailto:JuanVuletich@zoho.com">JuanVuletich@zoho.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div bgcolor="#ffffff">
I think the more important issue is reproducibility of results.
Again, this is something that anyone using hashed collections
should be aware of, but a comment wouldn't hurt.<br>
<br>
On 5/13/2022 7:48 PM, Luciano Notarfrancesco via Cuis-dev wrote:
<blockquote type="cite">
<div>Hm, yes, it could fail to be reproducible if the Collection
is iterated in a different order in two successive calls to do:.
Good point. This might happen if the collection is rehashed
between calls, as you say. The same problem could happen in Set with
the implementation that we have been using for some years now. I
use this extensively in tests in my math project and I didn’t
run into problems, but I think it’s a good idea to add a
comment.</div>
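<div>To make the reproducibility concern concrete, here is a minimal Python sketch (an analogue, not the Cuis code). The helper name <code>pick</code> and the two hard-coded orders are assumptions for illustration: they stand in for the same hashed collection iterated before and after a rehash. With the same seed the generator draws the same index, but the element found at that index differs.</div>

```python
import random

def pick(elements_in_iteration_order, rng):
    """Draw an index, then answer the element at that position in iteration order.

    Hypothetical helper: it mimics a pick that relies only on size and iteration.
    """
    items = list(elements_in_iteration_order)
    k = rng.randrange(len(items))
    return items[k]

# Same elements, same seed, two different iteration orders.
order_before = ["a", "b", "c", "d"]
order_after = list(reversed(order_before))

first = pick(order_before, random.Random(42))
second = pick(order_after, random.Random(42))

# Both generators drew the same index, yet the answers differ.
print(first, second)
```

<div>Seeding the generator is therefore not enough on its own; the iteration order would have to be pinned down as well, for example by sorting into a sequenceable collection first.</div>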
<div><br>
</div>
<div>However, I don’t think it could affect the uniformity of the
distribution. If you take random samples from a collection and
reorder it between each sample, it should still be uniform,
assuming the generator is truly random. Since our generators
are not truly random, someone who knows the internal state of the
generator and the algorithm could play a trick and
reorder the collection each time a sample is taken in such a way that it would
always return the same element… but I don’t think it’s a real
problem; I think if the collection is rehashed it would still
look uniform. I’ll think more about it tomorrow when I’m more
awake, though.</div>
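<div>The uniformity argument can be checked empirically with a small Python simulation (a sketch under stated assumptions, not the Cuis implementation). The function name <code>sample_with_reordering</code>, the seeds and the number of draws are all hypothetical. The key assumption is that the reordering comes from a source independent of the generator used for the picks; an adversarial reordering that tracks the generator's state is exactly the case this does not cover.</div>

```python
import random
from collections import Counter

def sample_with_reordering(elements, draws, pick_seed, shuffle_seed):
    """Pick by index repeatedly, reshuffling the elements before every pick."""
    picker = random.Random(pick_seed)       # generator used for the picks
    shuffler = random.Random(shuffle_seed)  # independent source of reordering
    items = list(elements)
    counts = Counter()
    for _ in range(draws):
        shuffler.shuffle(items)             # stand-in for a rehash between picks
        counts[items[picker.randrange(len(items))]] += 1
    return counts

counts = sample_with_reordering("abcdef", 60000, pick_seed=1, shuffle_seed=2)
for element in sorted(counts):
    print(element, counts[element])
```

<div>Each of the six elements should come out close to 10000 picks, which is what a uniform distribution predicts.</div>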
<div><br>
</div>
<div>The implementation of Collection>>#atRandom: is mostly
for completeness; in practice we reimplement it in subclasses
more efficiently, exploiting the structure of each type of
collection.</div>
<div><br>
</div>
<div><br>
<div>
<div>On Sat, 14 May 2022 at 5:26 AM Juan Vuletich <<a href="mailto:JuanVuletich@zoho.com" target="_blank">JuanVuletich@zoho.com</a>>
wrote:<br>
</div>
<blockquote>
<div> Well, given that there is no actual guarantee of the
iteration order, there is no guarantee of the distribution
of the picks.<br>
<br>
I know it is not likely, but we can't prove it is
impossible for this strategy to always answer the same
element. All we need is extremely bad luck, so that the
collections are rehashed following the RNG! In practice,
what is possible is that the distribution is not exactly
uniform.<br>
<br>
Another problem is with reproducibility of the results. In
tests, or any other situation where you need to guarantee
the same results, it is usually enough to use the same
seeds for the RNGs. In these cases, results would not be
reproducible, because there is no guarantee of sequencing
between successive runs, especially if the image is
restarted or the data is recreated in a different image.<br>
<br>
What I'd assume is that the user knows what they are
doing. Maybe a warning in a comment in those methods is in
order.<br>
<br>
Thanks,</div>
<div><br>
<br>
On 5/13/2022 7:11 PM, Luciano Notarfrancesco via Cuis-dev
wrote:
<blockquote type="cite">
<div>What do you mean by good random properties? It
should be uniformly distributed. Do you see any
problem in Collection>>#atRandom: or
Bag>>#atRandom:?</div>
<div><br>
</div>
<div>Thanks!</div>
<div>Luciano</div>
<div><br>
<div>
<div>On Sat, 14 May 2022 at 4:57 AM Juan Vuletich
<<a href="mailto:JuanVuletich@zoho.com" target="_blank">JuanVuletich@zoho.com</a>>
wrote:<br>
</div>
<blockquote>
<div> Anyone using #atRandom: on a
non-sequenceable collection should be aware that
no good random properties can be guaranteed,
right?<br>
<br>
Anyway, just pushed to GitHub.<br>
<br>
Thanks,</div>
<div><br>
<br>
On 5/9/2022 7:56 AM, Luciano Notarfrancesco via
Cuis-dev wrote:
<blockquote type="cite">
<div>Juan, please don't forget to look at the
changeset in my previous mail when you have
time.
<div><br>
</div>
<div>Here are some additional tweaks. I
reimplemented
Collection>>#identityIncludes: using
#allSatisfy: instead of #do:; this way
it is fast for Bags too, and it mirrors
the implementation of
Collection>>#includes:.</div>
<div><br>
</div>
<div>I also implemented a fast
Bag>>#atRandom:, and implemented a
general Collection>>#atRandom:.
Originally #atRandom: was not implemented
in Collection, so I implemented a generic
version that only assumes the collection
understands #size and #do:.</div>
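<div>As a rough illustration of the two approaches, here is a Python analogue (not the Cuis source). <code>at_random</code> assumes only <code>len()</code> and iteration, standing in for #size and #do:. <code>bag_at_random</code> assumes the bag is stored as a mapping from element to occurrence count and walks the counts instead of the individual occurrences; that storage layout, the function names and the error messages are assumptions made for this sketch.</div>

```python
import random

def at_random(collection, rng):
    """Generic pick that assumes only len() and iteration."""
    n = len(collection)
    if n == 0:
        raise ValueError("collection is empty")
    target = rng.randrange(n)  # 0-based position of the element to answer
    for i, element in enumerate(collection):
        if i == target:
            return element
    # Reached only if len() promised more elements than iteration delivered.
    raise RuntimeError("collection invariants broken")

def bag_at_random(counts, rng):
    """Pick from a bag stored as {element: occurrences} without expanding it."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("collection is empty")
    target = rng.randrange(total)
    for element, occurrences in counts.items():
        if occurrences > target:
            return element
        target -= occurrences  # skip all occurrences of this element at once
    raise RuntimeError("collection invariants broken")

rng = random.Random(7)
picked = at_random({"x", "y", "z"}, rng)
bag_pick = bag_at_random({"a": 3, "b": 0, "c": 1}, rng)
print(picked, bag_pick)
```

<div>The bag version does work proportional to the number of distinct elements rather than the total number of occurrences, and each element is chosen with probability proportional to its count.</div>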
<div><br>
</div>
<div>Thanks,</div>
<div>Luciano</div>
</div>
<br>
<div>
<div>On Tue, May 3, 2022 at 7:54 AM Luciano
Notarfrancesco <<a href="mailto:luchiano@gmail.com" target="_blank">luchiano@gmail.com</a>>
wrote:<br>
</div>
<blockquote>
<div>Here are some more methods that take
advantage of the structure of a Bag
(#allSatisfy:, #anySatisfy:, #max:,
#min:, #sum:, etc).
<div><br>
</div>
<div>Also made some tweaks to some
methods in Collection to call existing
methods instead of reimplementing, in
order to simplify the changes in Bag
(otherwise, for example, I'd have to
implement #sum, #sum: and
#sum:ifEmpty: in Bag instead of only
implementing #sum:ifEmpty:). And I
changed Collection>>#product to
produce an error when the collection
is empty instead of returning 1 (to be
consistent with
Collection>>#sum:).</div>
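<div>The delegation idea can be sketched in Python as follows (an analogue with hypothetical class and method names, not the Cuis code). The convenience method <code>sum_with</code> calls <code>sum_if_empty</code>, so a subclass that knows its own structure only has to override that one method. The count-based storage in <code>Bag</code> is an assumption for the example.</div>

```python
from collections import Counter

class Collection:
    """Toy stand-in for a collection class."""

    def __init__(self, items):
        self._items = list(items)

    def __iter__(self):
        return iter(self._items)

    def sum_if_empty(self, fn, if_empty):
        total, seen = None, False
        for x in self:
            total = fn(x) if not seen else total + fn(x)
            seen = True
        return total if seen else if_empty()

    def sum_with(self, fn):
        # Delegates, so subclasses only need to override sum_if_empty.
        return self.sum_if_empty(fn, self._error_empty)

    def _error_empty(self):
        raise ValueError("collection is empty")

class Bag(Collection):
    """Stores each distinct element once, together with its count."""

    def __init__(self, items):
        self._counts = Counter(items)

    def __iter__(self):
        return iter(self._counts.elements())

    def sum_if_empty(self, fn, if_empty):
        if not self._counts:
            return if_empty()
        # One call to fn per distinct element, weighted by its count.
        return sum(fn(x) * n for x, n in self._counts.items())

print(Collection([1, 2, 3]).sum_with(lambda x: x))
print(Bag([1, 1, 2]).sum_with(lambda x: x * 10))
```

<div>Because <code>sum_with</code> is inherited unchanged, the empty-collection error stays in one place while <code>Bag</code> still gets the faster count-based computation.</div>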
<div><br>
</div>
<div>All base image tests pass, but
please review.
<div><br>
</div>
<div>Also, while running tests I got a
walkback on BitBltCanvasEngine, see
the attached log.</div>
</div>
</div>
</blockquote>
</div>
</blockquote>
<br>
<br>
<pre>--
Juan Vuletich
<a href="http://www.cuis-smalltalk.org" target="_blank">www.cuis-smalltalk.org</a>
<a href="https://github.com/Cuis-Smalltalk/Cuis-Smalltalk-Dev" target="_blank">https://github.com/Cuis-Smalltalk/Cuis-Smalltalk-Dev</a>
<a href="https://github.com/jvuletich" target="_blank">https://github.com/jvuletich</a>
<a href="https://www.linkedin.com/in/juan-vuletich-75611b3" target="_blank">https://www.linkedin.com/in/juan-vuletich-75611b3</a>
<a href="https://independent.academia.edu/JuanVuletich" target="_blank">https://independent.academia.edu/JuanVuletich</a>
<a href="https://www.researchgate.net/profile/Juan-Vuletich" target="_blank">https://www.researchgate.net/profile/Juan-Vuletich</a>
<a href="https://patents.justia.com/inventor/juan-manuel-vuletich" target="_blank">https://patents.justia.com/inventor/juan-manuel-vuletich</a>
<a href="https://twitter.com/JuanVuletich" target="_blank">https://twitter.com/JuanVuletich</a></pre>
</div>
</blockquote>
</div>
</div>
</blockquote>
</div>
</blockquote>
</div>
</div>
</blockquote>
</div>
</blockquote></div>