[Cuis-dev] performance of OrderedCollection #new vs. #new:
Juan Vuletich
juan at cuis.st
Mon Mar 4 09:53:23 PST 2024
Hi Folks,
Interesting questions, and an even better answer!
It looks like any allocation in old space will trigger a GC. Is this
right? Is it needed?
It also looks like a two-level design, with leaves of size 2^16-1 or
so, is in order...
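
A minimal workspace sketch of what such a two-level layout could look like, assuming (per the quoted message) that arrays below 2^16 slots are still allocated in new space. This is not code from Cuis; the names root, readAt and writeAt and the 200,000 element size are hypothetical, chosen only for illustration.

```smalltalk
"Hypothetical two-level layout: a root Array of leaf Arrays,
 each leaf kept below 2^16 slots."
| leafSize totalSize leafCount root readAt writeAt |
leafSize := 65535.	"2^16 - 1, assumed to stay below the old-space threshold"
totalSize := 200000.	"illustrative logical size"
leafCount := (totalSize + leafSize - 1) // leafSize.	"round up"
root := (1 to: leafCount) collect: [:i | Array new: leafSize].

"Map a 1-based logical index to a leaf and an offset inside that leaf."
readAt := [:index |
	(root at: (index - 1) // leafSize + 1)
		at: (index - 1) \\ leafSize + 1].
writeAt := [:index :value |
	(root at: (index - 1) // leafSize + 1)
		at: (index - 1) \\ leafSize + 1
		put: value].

writeAt value: 65536 value: #x.	"lands in the first slot of the second leaf"
readAt value: 65536
```

The price is one extra indirection per access, and the last leaf is over-allocated in this sketch; whether that beats a single large array would have to be measured.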
Cheers,
On 3/3/2024 8:31 PM, Nicolas Cellier via Cuis-dev wrote:
> Hi Christian,
> concerning the thresholds, the OpenSmalltalk VM stops allocating in
> eden for 2^16 slots and above.
> It allocates in oldSpace instead:
>
> numSlots > self maxSlotsForNewSpaceAlloc
>     ifTrue:
>         [numSlots > self maxSlotsForAlloc ifTrue:
>             [coInterpreter primitiveFailFor: PrimErrUnsupported.
>              ^nil].
>          newObj := self allocateSlotsInOldSpace: numSlots format: instSpec classIndex: classIndex]
>     ifFalse:
>         [newObj := self allocateSlots: numSlots format: instSpec classIndex: classIndex].
>
> Hence the 66,000 threshold you observe with new: (it's 65536).
>
> For 41000, we start with size 10 and double the size at each growth; then,
> at the 12th growth we get a size of 2^12 * 10 = 4096*10 = 40,960.
> At the next growth (adding the 40,961st element), the array size will get
> over the 65535 threshold.
> Hence it's about the same threshold that we observe.
>
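
The growth arithmetic above can be checked in a workspace. This is only a sketch: it assumes the policy described in the message (initial capacity 10, doubling at each growth) and the 2^16 slot limit for new-space allocation, and it does not inspect the actual OrderedCollection implementation.

```smalltalk
"Find the last capacity below the 2^16 slot threshold and the first one at or above it,
 assuming a start of 10 and doubling at each growth."
| capacity previous threshold |
threshold := 65536.	"2^16 slots"
capacity := 10.
previous := capacity.
[capacity < threshold] whileTrue: [
	previous := capacity.
	capacity := capacity * 2].
{previous. capacity}	"last new-space size, first old-space size"
```

Under these assumptions the last backing array below the threshold has 40,960 slots and the next one has 81,920, which is why the slowdown would show up a little above 41,000 elements.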
> For 82000, this gets more interesting. This time, the size allocated
> is 163,840 slots.
> With 8 bytes per slot (assuming a 64-bit VM), we're getting just over
> 1 MiB (1.25 MiB).
> No time to dig more in VM source code, but it might be related to
> object memory growth...
>
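
The size estimate above can be reproduced as follows. It is a rough sketch that assumes 8 bytes per slot (64-bit VM) and ignores the object header.

```smalltalk
"Size of the array allocated when growing past 81,920 slots."
| slots bytesPerSlot bytes mib |
slots := 40960 * 2 * 2.	"two more doublings after 40,960"
bytesPerSlot := 8.	"assumes a 64-bit VM"
bytes := slots * bytesPerSlot.
mib := bytes / (1024 * 1024).	"exact Fraction"
{slots. bytes. mib}
```

This gives 163,840 slots, 1,310,720 bytes and 5/4 MiB.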
> Now let's observe some interesting figures in Squeak:
>
> [Array new: 65535] bench.
> '1,650 per second. 605 microseconds per run. 60.35841 % GC time.'
>
> [Array new: 65536] bench.
> '226 per second. 4.42 milliseconds per run. 91.64405 % GC time.'
>
> Notice the high percentage of time spent in GC once we're in old space:
> the cost is likely to be dominated by GC.
>
> And in Cuis:
>
> [Array new: 65535] bench.
> '2.02 k runs per second' .
> [Array new: 65536] bench.
> '1.31 k runs per second' .
>
> Ah ah! The Cuis image is much smaller, hence full GC is much cheaper!
>
> I guess that Pharo images are much larger, hence the cost...
>
> So,
> - the cost is dominated by GC in OpenSmalltalk images
> - starting at 2^16 slots and above, the object is allocated in oldSpace,
> and that ends up with a full GC
> - the larger the image, the less efficient the allocation
>
> The VW memory policy is more optimized than OpenSmalltalk's, at least for
> this benchmark.
> Probably a virtue of large space (segments reserved for large objects).
>
> Nicolas
>
--
Juan Vuletich
cuis.st
github.com/jvuletich
researchgate.net/profile/Juan-Vuletich
independent.academia.edu/JuanVuletich
patents.justia.com/inventor/juan-manuel-vuletich
linkedin.com/in/juan-vuletich-75611b3
twitter.com/JuanVuletich