[Cuis-dev] Some more Bag tweaks

Juan Vuletich JuanVuletich at zoho.com
Fri May 13 16:01:27 PDT 2022


I think the more important issue is reproducibility of results. Again,
anyone using hashed collections should be aware of this, but a comment
wouldn't hurt.
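
To make the reproducibility concern concrete, here is a minimal sketch in Python rather than Smalltalk. It is not the actual Cuis code. The helper at_random is hypothetical: it assumes only a size and a way to iterate, in the spirit of the generic Collection>>#atRandom: discussed in this thread. The two lists hold the same elements in different orders, standing in for a hashed collection before and after a rehash.

```python
import random

def at_random(collection, rng):
    # Hypothetical helper: uses only len() and iteration,
    # mirroring a collection that understands just #size and #do:.
    target = rng.randrange(len(collection))  # raises ValueError if empty
    for index, element in enumerate(collection):
        if index == target:
            return element

# Same elements, different iteration order (e.g. after a rehash).
before_rehash = ["a", "b", "c", "d"]
after_rehash = list(reversed(before_rehash))

# Same seed in both runs, so the generator yields the same position.
pick_before = at_random(before_rehash, random.Random(42))
pick_after = at_random(after_rehash, random.Random(42))

# The position matches, but the element found there does not.
same_position = before_rehash.index(pick_before) == after_rehash.index(pick_after)
same_element = pick_before == pick_after
print(same_position, same_element)
```

With the same seed the generator chooses the same position in both runs, yet the element at that position differs because the iteration order changed. Fixing the seed alone is therefore not enough to reproduce a pick from a collection whose order is not guaranteed.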

On 5/13/2022 7:48 PM, Luciano Notarfrancesco via Cuis-dev wrote:
> Hm, yes, it could fail to be reproducible if the Collection is 
> iterated in a different order in two successive calls to do:. Good 
> point. This might happen if the collection is rehashed between calls, 
> as you say. Same problem could happen in Set with the implementation 
> that we have been using for some years now. I use this extensively in 
> tests in my math project and I didn’t run into problems, but I think 
> it’s a good idea to add a comment.
>
> However I don’t think it could affect the uniformity of the 
> distribution. If you take random samples from a collection and reorder 
> it between each sample, it should still be uniform, assuming the 
> generator is really random. Since our generators are not truly random, 
> and knowing the internal state of the generator and the algorithm, it 
> is possible to do a trick and reorder it each time you take a sample 
> in such a way that would always return the same element… but I don’t 
> think it’s a real problem, I think if the collection is rehashed it 
> would still look uniform. I’ll think more about it tomorrow when I’m 
> more awake, tho.
>
> The implementation of Collection>>#atRandom: is mostly for 
> completeness, in practice we reimplement it in subclasses more 
> efficiently, exploiting the structure of each type of collection.
>
>
> On Sat, 14 May 2022 at 5:26 AM Juan Vuletich <JuanVuletich at zoho.com> wrote:
>
>     Well, given that there is no actual guarantee of the iteration
>     order, there is no guarantee of the distribution of the picks.
>
>     I know it is not likely, but we can't prove it is impossible that
>     this strategy always answers the same element. All we need is
>     extremely bad luck so the collections are rehashed following the
>     RNG! In practice, what is possible, is that the distribution is
>     not exactly uniform.
>
>     Another problem is with reproducibility of the results. In tests,
>     or any other situation where you need to guarantee the same
>     results, it is usually enough to use the same seeds for the RNGs.
>     In these cases, results would not be reproducible, because there
>     is no guarantee of sequencing between successive runs. Especially
>     if the image is restarted, or the data is recreated, in a
>     different image.
>
>     What I'd assume is that the user knows what they are doing. Maybe
>     a warning in a comment in those methods is in order.
>
>     Thanks,
>
>
>     On 5/13/2022 7:11 PM, Luciano Notarfrancesco via Cuis-dev wrote:
>>     What do you mean by good random properties? It should be
>>     uniformly distributed. Do you see any problem in
>>     Collection>>#atRandom: or Bag>>#atRandom:?
>>
>>     Thanks!
>>     Luciano
>>
>>     On Sat, 14 May 2022 at 4:57 AM Juan Vuletich
>>     <JuanVuletich at zoho.com> wrote:
>>
>>         Anyone using #atRandom: on a non-sequenceable collection
>>         should be aware that no good random properties can be
>>         guaranteed, right?
>>
>>         Anyway, just pushed to GitHub.
>>
>>         Thanks,
>>
>>
>>         On 5/9/2022 7:56 AM, Luciano Notarfrancesco via Cuis-dev wrote:
>>>         Juan, please don't forget to look at the changeset in my
>>>         previous mail when you have time.
>>>
>>>         Here are some additional tweaks. I reimplemented
>>>         Collection>>#identityIncludes: using #allSatisfy: instead of
>>>         #do:, in this way it is fast for Bags too, and it mirrors
>>>         the implementation of Collection>>#includes:.
>>>
>>>         I also implemented a fast Bag>>#atRandom:, and implemented a
>>>         general Collection>>#atRandom:. Originally #atRandom: was
>>>         not implemented in Collection, so I implemented a generic
>>>         version that only assumes the collection understands #size
>>>         and #do:.
>>>
>>>         Thanks,
>>>         Luciano
>>>
>>>         On Tue, May 3, 2022 at 7:54 AM Luciano Notarfrancesco
>>>         <luchiano at gmail.com> wrote:
>>>
>>>             Here are some more methods that take advantage of the
>>>             structure of a Bag (#allSatisfy:, #anySatisfy:, #max:,
>>>             #min:, #sum:, etc).
>>>
>>>             Also made some tweaks to some methods in Collection to
>>>             call existing methods instead of reimplementing, in
>>>             order to simplify the changes in Bag (otherwise, for
>>>             example, I'd have to implement #sum, #sum: and
>>>             #sum:ifEmpty: in Bag instead of only implementing
>>>             #sum:ifEmpty:). And I changed Collection>>#product to
>>>             produce an error when the collection is empty instead of
>>>             returning 1 (to be consistent with Collection>>#sum:).
>>>
>>>             All base image tests pass, but please review.
>>>
>>>             Also, while running tests I got a walkback on
>>>             BitBltCanvasEngine, see the attached log.
>>>
>>
>>
>>         -- 
>>         Juan Vuletich
>>         www.cuis-smalltalk.org
>>         https://github.com/Cuis-Smalltalk/Cuis-Smalltalk-Dev
>>         https://github.com/jvuletich
>>         https://www.linkedin.com/in/juan-vuletich-75611b3
>>         https://independent.academia.edu/JuanVuletich
>>         https://www.researchgate.net/profile/Juan-Vuletich
>>         https://patents.justia.com/inventor/juan-manuel-vuletich
>>         https://twitter.com/JuanVuletich
>>
>
>
>


