[Cuis-dev] usefulness of a faster #timesRepeat?
Andres Valloud
ten at smallinteger.com
Thu Nov 7 14:54:03 PST 2019
No, I had not looked at #bench. I did some experiments here, and it
looks like it could be improved too: for really fast blocks like [2+3],
the measured speed difference is on the order of 30%.
Moreover, I think it would be nicer if bench did a GC before starting
the measurement. In some other worlds, I also saw that such measurement
methods would go to the trouble of ensuring as far as possible that the
code under consideration was already jitted before the measurement
begins. I suppose that helps with very large blocks of code, or with code
that perhaps does not run many iterations. See attached and let me know
what you think.
The benchmarks I was running were trying to evaluate variations on code
that uses smallintegers a lot. So either I would have to increase the
repetitions of the code under consideration inside the timesRepeat: to
minimize the overhead (that makes workspaces rather verbose and
repetitive), or I would have to consider the overhead explicitly and
start the subtraction game. Part of what motivated the improvement in
timesRepeat: was to lower the relative overhead so repeating code in
workspaces or manually accounting for the timesRepeat: overhead would
become less necessary for the sake of measurement precision.
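To make the "subtraction game" concrete, here is a minimal workspace sketch. It assumes Time millisecondsToRun: is available in the image; the iteration count and variable names are only illustrative, and the empty-block loop is just an approximation of the timesRepeat: overhead.

```smalltalk
| n overhead total |
n := 10000000.  "illustrative repetition count, tune to taste"
"Cost of the loop machinery alone, with an empty block"
overhead := Time millisecondsToRun: [ n timesRepeat: [] ].
"Cost of the loop machinery plus the code under consideration"
total := Time millisecondsToRun: [ n timesRepeat: [ 2 + 3 ] ].
"Estimated milliseconds per evaluation once the overhead is subtracted"
(total - overhead) / n
```

For blocks as fast as [2 + 3] the two timings can be close enough that the difference is mostly noise, which is why lowering the overhead itself is attractive.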
I was a bit surprised I could get it to run 50% faster; I wasn't
expecting that. For contrast, in other worlds, timesRepeat: is also
optimized by the Smalltalk compiler. This introduces some kinks.
First, especially in 32 bit systems, it's very important never to send
timesRepeat: to a large integer --- this is why the large integer method
splits the process in rounds of timesRepeat: sent to small integers.
Second, timesRepeat: is only optimized when it's followed by an
explicit, literal block, e.g.:
10 timesRepeat: [self blah]
rather than
10 timesRepeat: aBlock
So, in those contexts, sometimes it's necessary to write code like this
instead to enable the compiler optimization:
10 timesRepeat: [aBlock value]
If the overhead of timesRepeat: is minimized, perhaps the Smalltalk
compiler optimization is unnecessary.
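As a rough illustration of the "rounds" idea for large integers, here is a workspace sketch. It is not the actual method from any dialect: roundSize stands in for SmallInteger maxVal and total stands in for a large integer, both kept small so the snippet finishes quickly.

```smalltalk
| total roundSize remaining round count |
total := 2500.       "stands in for a large integer receiver"
roundSize := 1000.   "stands in for SmallInteger maxVal"
remaining := total.
count := 0.
[ remaining > 0 ] whileTrue: [
    "Each round sends timesRepeat: to a value no larger than roundSize"
    round := remaining min: roundSize.
    round timesRepeat: [ count := count + 1 ].
    remaining := remaining - round ].
count  "2500: the block ran once per unit of total"
```

Here the loop runs in rounds of 1000, 1000 and 500, so timesRepeat: is never sent to anything larger than roundSize.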
Andres.
On 11/7/19 14:08, Juan Vuletich via Cuis-dev wrote:
> Hi, Andrés, Folks,
>
> I was thinking about the need you see for a faster implementation of
> #timesRepeat:. I guess you are using it for benchmarking, right? I also
> guess you are benchmarking some code that takes very little time to run,
> to make the overhead of #timesRepeat: noticeable.
>
> If that is the case, have you looked at #bench? It was brought from
> Squeak, and it is the 'standard' way of benchmarking small pieces of code.
>
> Cheers,
>
-------------- next part --------------
'From Cuis 5.0 [latest update: #3944] on 7 November 2019 at 2:53:38 pm'!
!BlockClosure methodsFor: 'evaluating' stamp: 'sqr 11/7/2019 14:07:05'!
bench
    "See how many times I can value in 5 seconds. I'll answer a meaningful description.
    [ Float pi printString ] bench print.
    [ 80000 factorial printString ] bench print.
    "
    | startTime endTime count run |
    count _ 0.
    run _ true.
    "Same initial conditions"
    Smalltalk garbageCollect.
    "So the code doing the measurement starts jitted"
    Time localMillisecondClock.
    self value.
    [ (Delay forSeconds: 5) wait. run _ false ] forkAt: Processor timingPriority - 1.
    startTime _ Time localMillisecondClock.
    [ run ] whileTrue: [
        self value. self value. self value. self value. self value. self value. self value. self value.
        self value. self value. self value. self value. self value. self value. self value. self value.
        self value. self value. self value. self value. self value. self value. self value. self value.
        self value. self value. self value. self value. self value. self value. self value. self value.
        count _ count + 32
    ].
    endTime _ Time localMillisecondClock.
    count = 1
        ifTrue: [
            (endTime - startTime) / 1000 withDecimalUnitPrefixAndValue: [ :value :unitPrefixSymbol :unitPrefixName |
                ^String streamContents: [ :strm |
                    value printOn: strm fractionDigits: 2.
                    strm
                        space;
                        nextPutAll: unitPrefixSymbol;
                        nextPutAll: ' seconds per run']]
        ]
        ifFalse: [
            (count * 1000) / (endTime - startTime) withDecimalUnitPrefixAndValue: [ :value :unitPrefixSymbol :unitPrefixName |
                ^String streamContents: [ :strm |
                    value printOn: strm fractionDigits: 2.
                    strm
                        space;
                        nextPutAll: unitPrefixSymbol;
                        nextPutAll: ' runs per second' ]]
        ]! !