[Cuis-dev] Inconsistent #= and #hash
Juan Vuletich
juan at jvuletich.org
Wed Jun 12 04:09:08 PDT 2019
On 6/12/2019 1:19 AM, Luciano Notarfrancesco wrote:
> Awesome, thanks! (That's a google-suggested automatic reply, first
> time I try it, but really thanks).
:)
BTW, be sure to update to #3798. With this and the previous updates, the
following runs in 20 minutes on my PC (without halting):
instances := Object allSubInstances.
{ 'Total: '. instances size} print.
Time now print.
1 to: instances size do: [:i|
    i \\ 1000 = 0 ifTrue: [ i print ].
    a := instances at: i.
    aHash := a hash.
    i+1 to: instances size do: [:j|
        b := instances at: j.
        (a = b and: [(aHash = b hash and: [b = a]) not]) ifTrue: [self halt]]].
Time now print.
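To spot-check a single suspicious pair without scanning every instance, the same condition can be sketched as a two-argument block. This is only a workspace sketch: `check` is a hypothetical variable name, not something in the image, and it prints a message instead of halting.

```smalltalk
"Hypothetical helper: report any violation of the #= / #hash contract for one pair."
check := [:a :b |
    a = b ifTrue: [
        "Equality should be symmetric."
        b = a ifFalse: [ 'a = b but not b = a' print ].
        "Equal objects should answer equal hashes."
        a hash = b hash ifFalse: [ 'a = b but hashes differ' print ]]].

"Pairs taken from the cases reported in this thread."
check value: #() value: Semaphore new.
check value: Set new value: Dictionary new.
```

With the fixes applied, neither call should print anything.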
BTW, I found the StackSizeWatcher in ProcessBrowser (by Hernán
Wilkinson) extremely useful for dealing with the bugs this snippet made
visible.
>
> So to make it clear, the current implementation of Collection>>hash
> implies that if two Collections are equal, then they must at least 1)
> be the same species, and 2) have equal elements. New Collections can
> have more requirements in order to be equal, but those two are
> necessary otherwise the hashes won't match. This is the essence of a
> Collection's identity.
Agreed.
> BTW, do you think 'aCollection includes: anObject' should work without
> errors for arbitrary anObject instance of any class?
Yes. Any error is a bug or at least a smell, and we should discuss it...
Well, no. Adding this as I come to the end of your message. See comments
there.
> I had a similar problem with algebraic structures (what I call
> Domains), and I ended up implementing #includes: to work for arbitrary
> objects, and a new message #contains: (not the best name, might think of
> a better one in the future) that is faster and simpler and assumes the
> objects are of a certain type.
I guess I'd call it something like #containsDomain:, including in the
selector the assumption about the type. A matter of style, of course.
> I don't think we need two messages like that for Collections, but I
> still don't know what's the correct way of thinking about
> Collection>>includes:, and if we think that it should work for
> arbitrary objects we can try a similar script to test 'aCollection
> includes: Object new' in all instances of Collections... without
> trying that, I know that Interval fails for anything that is not a
> Number, not sure if it should be fixed or not. What do you think?
Many collections can hold any kind of object. Those should not fail when
evaluating `aCollection includes: anObject`, for any object. Others are
specific to some kind of content, like String, ByteArray, FloatArray,
etc. It is ok for those to raise an error when evaluating `aCollection
includes: anObject`, if trying to add that anObject would also fail. I
see no problem there. Still, I wouldn't object if people prefer to just
make #includes: answer false in those cases.
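A minimal sketch of the kind of script suggested above, to list which collections raise an error on `includes:` with an arbitrary object. The names `probe` and `each` are just workspace variables chosen for illustration. It only reports the failures; whether each one is a bug or acceptable for a content-specific collection is still a case-by-case call, and it may be slow on a large image.

```smalltalk
"Probe every Collection instance with an object it is unlikely to hold."
probe := Object new.
Collection allSubInstances do: [:each |
    [ each includes: probe ]
        on: Error
        do: [:ex |
            "Print the class of the failing collection and the error text."
            { each class. ex messageText } print ]].
```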
> On Tue, Jun 11, 2019 at 5:25 PM Juan Vuletich <juan at jvuletich.org> wrote:
>
> On 6/11/2019 8:49 AM, Luciano Notarfrancesco via Cuis-dev wrote:
> > I did this to fish for inconsistencies:
> >
> > instances _ OrderedCollection new.
> > Object withAllSubclassesDo: [:each| instances addAll: each allInstances].
> > instances _ instances asArray.
> >
> > 1 to: instances size do: [:i|
> >     a _ instances at: i.
> >     aHash _ a hash.
> >     i+1 to: instances size do: [:j|
> >         b _ instances at: j.
> >         (a = b and: [(aHash = b hash and: [b = a]) not]) ifTrue: [self halt]]]
> >
> > (I typed it manually here, I hope I copied it right, but you get the
> > idea.)
> >
> > Found out this:
> > - FeatureRequest forgets to implement hash;
> > - #() = Semaphore new, but not the other way around, and hashes also
> >   differ;
> > - similarly, LinkedList new = Semaphore new;
> > - RunArray new = Object new fails because it assumes the argument is a
> >   Collection and sends isSequenceable;
> > - RunArray new = Text new, but hashes differ;
> > - Set new = Dictionary new but hashes differ;
> >
> > And there are more, I stopped before finishing.
> >
> > I don't know how to fix some of those, and the implications of
> > changing the behavior of #= or #hash are not obvious in some cases.
> > But I think we should change Collection>>hash to set the initial value
> > to 0 instead of 'self species hash', and that would fix two of the
> > issues above. What do you think? Other ideas?
>
> I just pushed fixes to most of them. For Semaphore and RunArray it is a
> matter of implementing and honoring #species. For Set, it was making
> aDictionary is: #Set to answer false. For the others it was adding new
> #hash or #= methods. I don't expect much breakage.
>
> I'll run now a tweaked (hopefully faster) version of your script.
>
> Cheers,
>
> --
> Juan Vuletich
> www.cuis-smalltalk.org
> https://github.com/Cuis-Smalltalk/Cuis-Smalltalk-Dev
> https://github.com/jvuletich
> https://www.linkedin.com/in/juan-vuletich-75611b3
> @JuanVuletich
>
Cheers,
--
Juan Vuletich
www.cuis-smalltalk.org
https://github.com/Cuis-Smalltalk/Cuis-Smalltalk-Dev
https://github.com/jvuletich
https://www.linkedin.com/in/juan-vuletich-75611b3
@JuanVuletich