<div dir="ltr"><div>Hi Christian,</div><div>Concerning the thresholds, the OpenSmalltalk VM stops allocating in eden for objects of 2^16 slots and above.<br></div><div>It allocates in oldSpace instead:<br><br> numSlots > self maxSlotsForNewSpaceAlloc<br> ifTrue:<br> [numSlots > self maxSlotsForAlloc ifTrue:<br> [coInterpreter primitiveFailFor: PrimErrUnsupported.<br> ^nil].<br> newObj := self allocateSlotsInOldSpace: numSlots format: instSpec classIndex: classIndex]<br> ifFalse:<br> [newObj := self allocateSlots: numSlots format: instSpec classIndex: classIndex].<br></div><br><div>Hence the threshold around 66,000 that you observe with new: (it is actually 65,536).</div><div><br></div><div>For 41,000, we start with size 10 and double the size at each growth, so at the 12th growth we get a size of 2^12 * 10 = 4096 * 10 = 40,960.<br></div><div>At the next growth (when adding element 40,961), the array size becomes 81,920, which is over the 65,535 threshold.<br></div><div>Hence it's about the same threshold that we observe.<br><br></div><div>For 82,000, this gets more interesting. This time, the size allocated is 163,840 slots.<br></div><div>With 8 bytes per slot (assuming a 64-bit VM), that is 1,310,720 bytes, just over 1 MiB.<br></div><div>I have no time to dig further into the VM source code, but it might be related to object memory growth...<br></div><div><br></div><div>Now let's observe some interesting figures in Squeak:<br><br></div><div><div>[Array new: 65535] bench.<br> '1,650 per second. 605 microseconds per run. 60.35841 % GC time.' <br></div><div><br>[Array new: 65536] bench.<br> '226 per second. 4.42 milliseconds per run. 91.64405 % GC time.' </div><div><br></div><div>Notice the high percentage of time spent in GC once we are in old space: the cost is likely to be dominated by GC.</div><div><br></div><div>And in Cuis:</div><div><br>[Array new: 65535] bench.<br> '2.02 k runs per second' .<br>[Array new: 65536] bench.<br> '1.31 k runs per second' .<br><br></div><div>Aha! 
The Cuis image is much smaller, hence a full GC is much cheaper!<br><br><div>I guess that Pharo images are much larger, hence the cost...</div><div><br></div><div>So,<br></div><div>- the cost is dominated by GC in OpenSmalltalk images<br></div><div>- starting at 2^16 slots and above, objects are allocated in oldSpace, and that ends up with a full GC<br></div><div>- the larger the image, the less efficient the allocation<br><br></div><div>The VW memory policy is better optimized than OpenSmalltalk's, at least for this benchmark.<br></div><div>That is probably a virtue of its large space (segments reserved for large objects).<br><br></div><div>Nicolas<br></div></div></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Sun, Mar 3, 2024 at 19:36, Christian Haider via Cuis-dev <<a href="mailto:cuis-dev@lists.cuis.st">cuis-dev@lists.cuis.st</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div class="msg-8657701997952365470"><div lang="DE" style="overflow-wrap: break-word;"><div class="m_-8657701997952365470WordSection1"><p class="MsoNormal">Hi,<u></u><u></u></p><p class="MsoNormal"><u></u> <u></u></p><p class="MsoNormal"><span lang="EN-US">I was going to complain that #new: on OrderedCollection was removed.<u></u><u></u></span></p><p class="MsoNormal"><span lang="EN-US">I thought that it is common wisdom that #new: is a very important optimization when allocating OrderedCollections, especially big ones.<u></u><u></u></span></p><p class="MsoNormal"><span lang="EN-US"><u></u> <u></u></span></p><p class="MsoNormal"><span lang="EN-US">Therefore, I measured the differences with the following script:<u></u><u></u></span></p><p class="MsoNormal" style="margin-left:35.4pt"><span lang="EN-US" style="font-family:"Courier New"">| time time1 time2 |<u></u><u></u></span></p><p class="MsoNormal" style="margin-left:35.4pt"><span lang="EN-US" style="font-family:"Courier New"">(0 to: 100000 by: 1000) collect: 
[:size |<u></u><u></u></span></p><p class="MsoNormal" style="margin-left:35.4pt"><span lang="EN-US" style="font-family:"Courier New""> time := time1 := time2 := 0.<u></u><u></u></span></p><p class="MsoNormal" style="margin-left:35.4pt"><span lang="EN-US" style="font-family:"Courier New""> 1000 timesRepeat: [<u></u><u></u></span></p><p class="MsoNormal" style="margin-left:35.4pt"><span lang="EN-US" style="font-family:"Courier New""> time := time + (Time microsecondsToRun: [<u></u><u></u></span></p><p class="MsoNormal" style="margin-left:35.4pt"><span lang="EN-US" style="font-family:"Courier New""> | list |<u></u><u></u></span></p><p class="MsoNormal" style="margin-left:35.4pt"><span lang="EN-US" style="font-family:"Courier New""> list := OrderedCollection new.<u></u><u></u></span></p><p class="MsoNormal" style="margin-left:35.4pt"><span lang="EN-US" style="font-family:"Courier New""> 1 to: size do: [:i | list add: i]])].<u></u><u></u></span></p><p class="MsoNormal" style="margin-left:35.4pt"><span lang="EN-US" style="font-family:"Courier New""> time1 := time / 1000.<u></u><u></u></span></p><p class="MsoNormal" style="margin-left:35.4pt"><span lang="EN-US" style="font-family:"Courier New""> time := 0.<u></u><u></u></span></p><p class="MsoNormal" style="margin-left:35.4pt"><span lang="EN-US" style="font-family:"Courier New""> 1000 timesRepeat: [<u></u><u></u></span></p><p class="MsoNormal" style="margin-left:35.4pt"><span lang="EN-US" style="font-family:"Courier New""> time := time + (Time microsecondsToRun: [<u></u><u></u></span></p><p class="MsoNormal" style="margin-left:35.4pt"><span lang="EN-US" style="font-family:"Courier New""> | list |<u></u><u></u></span></p><p class="MsoNormal" style="margin-left:35.4pt"><span lang="EN-US" style="font-family:"Courier New""> list := OrderedCollection new: size.<u></u><u></u></span></p><p class="MsoNormal" style="margin-left:35.4pt"><span lang="EN-US" style="font-family:"Courier New""> 1 to: size do: [:i | list add: 
i]])].<u></u><u></u></span></p><p class="MsoNormal" style="margin-left:35.4pt"><span lang="EN-US" style="font-family:"Courier New""> time2 := time / 1000.<u></u><u></u></span></p><p class="MsoNormal" style="margin-left:35.4pt"><span lang="EN-US" style="font-family:"Courier New""> Array with: size with: time1 with: time2]</span><span lang="EN-US"><u></u><u></u></span></p><p class="MsoNormal"><span lang="EN-US"><u></u> <u></u></span></p><p class="MsoNormal"><span lang="EN-US">I ran the script with the current Cuis 6.3, Pharo 10, Squeak 6.0 and VW 9.3.1.<u></u><u></u></span></p><p class="MsoNormal"><span lang="EN-US">The results are in the attached file, from which I created the charts below with Excel.<u></u><u></u></span></p><p class="MsoNormal"><span lang="EN-US">The sizes of the created collections are on the x-axis, while the y-axis shows the time in microseconds.<u></u><u></u></span></p><p class="MsoNormal"><span lang="EN-US"><u></u> <u></u></span></p><p class="MsoNormal"><span lang="EN-US">The results are interesting!<u></u><u></u></span></p><ol style="margin-top:0cm" start="1" type="1"><li class="m_-8657701997952365470MsoListParagraph" style="margin-left:0cm"><span lang="EN-US">As expected, VW #new: is the fastest and grows linearly with the collection size.<u></u><u></u></span></li><li class="m_-8657701997952365470MsoListParagraph" style="margin-left:0cm"><span lang="EN-US">As expected, #new is slower than #new: (except for Cuis – more below)<u></u><u></u></span></li><li class="m_-8657701997952365470MsoListParagraph" style="margin-left:0cm"><span lang="EN-US">There are certain threshold sizes above which the creation is consistently slower (look at 40,000 and 82,000 for #new:)<u></u><u></u></span></li><li class="m_-8657701997952365470MsoListParagraph" style="margin-left:0cm"><span lang="EN-US">The slowness of Pharo is surprising(?)<u></u><u></u></span></li></ol><p class="MsoNormal"><u></u> <u></u></p><p class="MsoNormal"><img width="1264" height="737" style="width: 
13.1666in; height: 7.677in;" id="m_-8657701997952365470Diagramm_x0020_1" src="cid:ii_18e065fd7c55b16b21"><u></u><u></u></p><p class="MsoNormal"><u></u> <u></u></p><p class="MsoNormal"><span lang="EN-US">Cuis is extremely surprising. The next chart shows just Cuis and Squeak to spread the scale a bit.<u></u><u></u></span></p><p class="MsoNormal"><span lang="EN-US">In Cuis, both methods have about the same performance and grow linearly with the size!!!<u></u><u></u></span></p><p class="MsoNormal"><span lang="EN-US"><u></u> <u></u></span></p><p class="MsoNormal"><span lang="EN-US">How is this possible? What is the magic? Or is the measurement wrong? <u></u><u></u></span></p><p class="MsoNormal"><span lang="EN-US">I am quite impressed.<u></u><u></u></span></p><p class="MsoNormal"><span lang="EN-US">And I am not going to complain about the missing #new: </span><span lang="EN-US">:-)</span><span lang="EN-US">.<u></u><u></u></span></p><p class="MsoNormal"><span lang="EN-US"><u></u> <u></u></span></p><p class="MsoNormal"><span lang="EN-US">Happy hacking,<u></u><u></u></span></p><p class="MsoNormal"><span lang="EN-US">Christian<u></u><u></u></span></p><p class="MsoNormal"><span lang="EN-US"><u></u> <u></u></span></p><p class="MsoNormal"><span lang="EN-US"><u></u> <u></u></span></p><p class="MsoNormal"><img width="1264" height="737" style="width: 13.1666in; height: 7.677in;" id="m_-8657701997952365470Diagramm_x0020_2" src="cid:ii_18e065fd7c99374b62"><u></u><u></u></p><p class="MsoNormal"><u></u> <u></u></p><p class="MsoNormal"><u></u> <u></u></p></div></div>-- <br>
</div></blockquote></div>
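<div><br></div><div>The growth arithmetic in the reply above can be checked with a small workspace sketch. It assumes (taken from the explanation above, not from reading any particular image's source) that the backing array starts at 10 slots and doubles whenever it is full, and that arrays of 2^16 slots or more are allocated in old space. The name capacityFor is a hypothetical helper introduced only for this illustration.</div>

<pre>
```smalltalk
"Sketch under stated assumptions: initial capacity 10, doubling growth,
 old-space allocation for arrays of 65536 slots and above."
| capacityFor |
capacityFor := [:elementCount | | capacity |
    capacity := 10.
    "Double the capacity until it can hold elementCount elements."
    [capacity >= elementCount] whileFalse: [capacity := capacity * 2].
    capacity].
"For each element count, answer: the count, the final capacity,
 and whether that capacity reaches the assumed old-space limit."
#(40960 40961 81920 81921) collect: [:n |
    Array
        with: n
        with: (capacityFor value: n)
        with: (capacityFor value: n) >= 65536]
```
</pre>

<div>Under these assumptions, 40,960 elements still fit in a 40,960-slot array, 40,961 elements force a growth to 81,920 slots, and 81,921 elements force a growth to 163,840 slots. The last two capacities are past the 65,536-slot limit.</div>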