[Cuis-dev] Some more Bag tweaks
Juan Vuletich
JuanVuletich at zoho.com
Fri May 13 16:01:27 PDT 2022
What I think is the more important issue is that of reproducibility of
results. Again, something that someone using hashed collections should
be aware of, but a comment wouldn't hurt.
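
To make the reproducibility concern concrete, here is a minimal Python sketch (not Cuis code). `at_random` is a hypothetical stand-in for a generic pick that relies only on size and iteration, and the two lists stand for the same elements being iterated in two different orders, as could happen after a rehash. The same seed is used throughout.

```python
import random

def at_random(items_in_iteration_order, rng):
    """Return the k-th element met during iteration, with k drawn from rng."""
    k = rng.randrange(len(items_in_iteration_order))
    for i, each in enumerate(items_in_iteration_order):
        if i == k:
            return each

# Same elements, two iteration orders (assumed, to simulate a rehash).
order_run_1 = ["a", "b", "c", "d"]
order_run_2 = ["d", "c", "b", "a"]

pick_1 = at_random(order_run_1, random.Random(42))
pick_1_again = at_random(order_run_1, random.Random(42))
pick_2 = at_random(order_run_2, random.Random(42))

print(pick_1, pick_1_again, pick_2)
```

With a fixed iteration order the same seed gives the same pick; once the order changes, the same seed selects the same position but a different element.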
On 5/13/2022 7:48 PM, Luciano Notarfrancesco via Cuis-dev wrote:
> Hm, yes, it could fail to be reproducible if the Collection is
> iterated in different order in two successive calls to do:. Good
> point. This might happen if the collection is rehashed between calls,
> as you say. Same problem could happen in Set with the implementation
> that we have been using for some years now. I use this extensively in
> tests in my math project and I didn’t run into problems, but I think
> it’s a good idea to add a comment.
>
> However I don’t think it could affect the uniformity of the
> distribution. If you take random samples from a collection and reorder
> it between each sample, it should still be uniform, assuming the
> generator is truly random. Since our generators are not truly random,
> and knowing the internal state of the generator and the algorithm, it
> is possible to do a trick and reorder it each time you take a sample
> in such a way that would always return the same element… but I don’t
> think it’s a real problem, I think if the collection is rehashed it
> would still look uniform. I’ll think more about it tomorrow when I’m
> more awake, tho.
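
The uniformity argument above can be checked empirically with a small Python sketch (not Cuis code). It assumes the reordering is driven by a source independent of the generator used for picking; both generators, the seeds, and the draw count are illustrative choices.

```python
import random
from collections import Counter

pick_rng = random.Random(1)      # generator handed to the sampler
shuffle_rng = random.Random(2)   # independent source that reorders the items

items = ["a", "b", "c"]
counts = Counter()
draws = 30000

for _ in range(draws):
    shuffle_rng.shuffle(items)                 # simulated rehash between samples
    k = pick_rng.randrange(len(items))         # position chosen by the sampler
    counts[items[k]] += 1

print(counts)
```

Each element should come out close to one third of the time, which is what one expects when the reordering does not depend on the picking generator's state.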
>
> The implementation of Collection>>#atRandom: is mostly for
> completeness, in practice we reimplement it in subclasses more
> efficiently, exploiting the structure of each type of collection.
>
>
> On Sat, 14 May 2022 at 5:26 AM Juan Vuletich
> <JuanVuletich at zoho.com> wrote:
>
> Well, given that there is no actual guarantee of the iteration
> order, there is no guarantee of the distribution of the picks.
>
> I know it is not likely, but we can't prove it is impossible that
> this strategy always answers the same element. All we need is
> extremely bad luck so the collections are rehashed following the
> RNG! In practice, what is possible, is that the distribution is
> not exactly uniform.
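
The worst case described above, a reordering that "follows the RNG", can be sketched in Python (not Cuis code). `adversarial_reorder` is a hypothetical helper that copies the generator's state, predicts the next index, and moves a chosen element to that position. It is a deliberately contrived construction, not something a real rehash would do.

```python
import random

def at_random(items, rng):
    """Pick by position, as a generic sampler over an iteration order would."""
    return items[rng.randrange(len(items))]

def adversarial_reorder(items, rng, target):
    """Reorder items so that the next draw from rng lands on target."""
    probe = random.Random()
    probe.setstate(rng.getstate())        # copy the state; rng is not advanced
    k = probe.randrange(len(items))       # index the real rng will produce next
    rest = [x for x in items if x != target]
    return rest[:k] + [target] + rest[k:]

rng = random.Random(2022)
items = ["a", "b", "c", "d"]
picks = []
for _ in range(100):
    items = adversarial_reorder(items, rng, "a")
    picks.append(at_random(items, rng))

print(set(picks))
```

Every pick is the same element even though the generator itself behaves normally, which shows why no distribution can be guaranteed without a guarantee about iteration order.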
>
> Another problem is with reproducibility of the results. In tests,
> or any other situation where you need to guarantee the same
> results, it is usually enough to use the same seeds for the RNGs.
> In these cases, results would not be reproducible, because there
> is no guarantee of sequencing between successive runs. Especially
> if the image is restarted, or the data is recreated, in a
> different image.
>
> What I'd assume is that the user knows what they are doing. Maybe
> a warning in a comment in those methods is in order.
>
> Thanks,
>
>
> On 5/13/2022 7:11 PM, Luciano Notarfrancesco via Cuis-dev wrote:
>> What do you mean by good random properties? It should be
>> uniformly distributed. Do you see any problem in
>> Collection>>#atRandom: or Bag>>#atRandom:?
>>
>> Thanks!
>> Luciano
>>
>> On Sat, 14 May 2022 at 4:57 AM Juan Vuletich
>> <JuanVuletich at zoho.com> wrote:
>>
>> Anyone using #atRandom: on a non-sequenceable collection
>> should be aware that no good random properties can be
>> guaranteed, right?
>>
>> Anyway, just pushed to GitHub.
>>
>> Thanks,
>>
>>
>> On 5/9/2022 7:56 AM, Luciano Notarfrancesco via Cuis-dev wrote:
>>> Juan, please don't forget to look at the changeset in my
>>> previous mail when you have time.
>>>
>>> Here are some additional tweaks. I reimplemented
>>> Collection>>#identityIncludes: using #allSatisfy: instead of
>>> #do:, in this way it is fast for Bags too, and it mirrors
>>> the implementation of Collection>>#includes:.
>>>
>>> I also implemented a fast Bag>>#atRandom:, and implemented a
>>> general Collection>>#atRandom:. Originally #atRandom: was
>>> not implemented in Collection, so I implemented a generic
>>> version that only assumes the collection understands #size
>>> and #do:.
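
As an illustration of the two approaches described above, here is a Python sketch (not the actual Cuis methods, whose source is not shown in this thread). `at_random_generic` assumes only a size and an iteration, mirroring #size and #do:. `at_random_bag` assumes the bag is stored as a mapping from element to occurrence count and walks only the distinct elements.

```python
import random
from collections import Counter

def at_random_generic(collection, rng):
    """Uses only len() and iteration; cost grows with the total size."""
    if len(collection) == 0:
        raise ValueError("cannot pick from an empty collection")
    k = rng.randrange(len(collection))
    for i, each in enumerate(collection):
        if i == k:
            return each

def at_random_bag(counts, rng):
    """Walks distinct elements only, weighting each by its occurrence count."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("cannot pick from an empty bag")
    k = rng.randrange(total)
    for element, occurrences in counts.items():
        if k < occurrences:
            return element
        k -= occurrences

bag = Counter({"x": 3, "y": 1})          # three occurrences of x, one of y
expanded = list(bag.elements())          # the same bag, one entry per occurrence

generic_pick = at_random_generic(expanded, random.Random(7))
bag_pick = at_random_bag(bag, random.Random(7))
print(generic_pick, bag_pick)
```

Both functions draw one index in the range of the total size, so with the same seed and the same iteration order they agree; the bag version simply skips over repeated occurrences in one step.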
>>>
>>> Thanks,
>>> Luciano
>>>
>>> On Tue, May 3, 2022 at 7:54 AM Luciano Notarfrancesco
>>> <luchiano at gmail.com> wrote:
>>>
>>> Here are some more methods that take advantage of the
>>> structure of a Bag (#allSatisfy:, #anySatisfy:, #max:,
>>> #min:, #sum:, etc).
>>>
>>> Also made some tweaks to some methods in Collection to
>>> call existing methods instead of reimplementing, in
>>> order to simplify the changes in Bag (otherwise, for
>>> example, I'd have to implement #sum, #sum: and
>>> #sum:ifEmpty: in Bag instead of only implementing
>>> #sum:ifEmpty:). And I changed Collection>>#product to
>>> produce an error when the collection is empty instead of
>>> returning 1 (to be consistent with Collection>>#sum:).
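
The delegation idea described above can be sketched in Python (not the Cuis changeset itself). The class and method names (`sum_if_empty`, `sum_with`, `product`) are hypothetical analogues of #sum:ifEmpty:, #sum:, #sum and #product, and the count-based Bag storage is an assumption. The point is that the subclass overrides a single method and the others follow.

```python
import operator
from collections import Counter
from functools import reduce

class Collection:
    def __init__(self, items):
        self._items = list(items)

    def __iter__(self):
        return iter(self._items)

    def _error_empty(self):
        raise ValueError("collection is empty")

    def sum_if_empty(self, fn, if_empty):
        """The single method a subclass needs to override."""
        iterator = iter(self)
        try:
            total = fn(next(iterator))
        except StopIteration:
            return if_empty()
        for each in iterator:
            total = total + fn(each)
        return total

    def sum_with(self, fn):
        return self.sum_if_empty(fn, self._error_empty)   # delegates

    def sum(self):
        return self.sum_with(lambda x: x)                 # delegates

    def product(self):
        items = list(self)
        if not items:
            self._error_empty()       # error on empty, like the sums above
        return reduce(operator.mul, items)

class Bag(Collection):
    def __init__(self, items):
        self._counts = Counter(items)

    def __iter__(self):
        return iter(self._counts.elements())

    def sum_if_empty(self, fn, if_empty):
        if not self._counts:
            return if_empty()
        # One evaluation of fn per distinct element, scaled by its count.
        return sum(fn(e) * n for e, n in self._counts.items())

print(Bag([2, 2, 3]).sum(), Collection([2, 3, 4]).product())
```

Because `sum` and `sum_with` route through `sum_if_empty`, the Bag version speeds up all three with one override.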
>>>
>>> All base image tests pass, but please review.
>>>
>>> Also, while running tests I got a walkback on
>>> BitBltCanvasEngine, see the attached log.
>>>
>>
>>
>> --
>> Juan Vuletich
>> www.cuis-smalltalk.org
>> https://github.com/Cuis-Smalltalk/Cuis-Smalltalk-Dev
>> https://github.com/jvuletich
>> https://www.linkedin.com/in/juan-vuletich-75611b3
>> https://independent.academia.edu/JuanVuletich
>> https://www.researchgate.net/profile/Juan-Vuletich
>> https://patents.justia.com/inventor/juan-manuel-vuletich
>> https://twitter.com/JuanVuletich
>>