[Cuis-dev] usefulness of a faster #timesRepeat?

Andres Valloud ten at smallinteger.com
Thu Nov 7 14:54:03 PST 2019


No, I had not looked at #bench.  I did some experiments here, and it 
looks like it could be improved too: for really fast blocks like [2+3] 
the measured speed difference is on the order of 30%.

Moreover, I think it would be nicer if bench did a GC before starting 
the measurement.  In some other worlds, I also saw that such measurement 
methods would go to the trouble of ensuring as far as possible that the 
code under consideration was already jitted before the measurement 
begins.  I suppose that helps with very large blocks of code, or with 
code that perhaps does not run many iterations.  See attached and let 
me know what you think.

The benchmarks I was running were trying to evaluate variations on code 
that uses smallintegers a lot.  So either I would have to increase the 
repetitions of the code under consideration inside the timesRepeat: to 
minimize the overhead (that makes workspaces rather verbose and 
repetitive), or I would have to consider the overhead explicitly and 
start the subtraction game.  Part of what motivated the improvement in 
timesRepeat: was to lower the relative overhead so repeating code in 
workspaces or manually accounting for the timesRepeat: overhead would 
become less necessary for the sake of measurement precision.
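
As a rough workspace sketch of the subtraction game, assuming 
Time millisecondsToRun: is available and that an empty block is a fair 
stand-in for the loop overhead (the names n, loopOnly, withWork and net 
are only illustrative):

	| n loopOnly withWork net |
	n _ 10000000.
	"Time the bare loop, then the loop plus the code under test"
	loopOnly _ Time millisecondsToRun: [n timesRepeat: []].
	withWork _ Time millisecondsToRun: [n timesRepeat: [2 + 3]].
	"What is left is an estimate of the cost of the work itself"
	net _ withWork - loopOnly.
	net print

With work as cheap as 2 + 3 the difference can be small, or even 
negative from timing noise, which is why a lower timesRepeat: overhead 
helps.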

I was a bit surprised I could get it to run 50% faster; I wasn't 
expecting that.  For contrast, in other worlds, timesRepeat: is also 
optimized by the Smalltalk compiler.  This introduces some kinks.

First, especially on 32-bit systems, it's very important never to send 
timesRepeat: to a large integer --- this is why the large integer method 
splits the process into rounds of timesRepeat: sent to small integers. 
Second, timesRepeat: is only optimized when it's followed by an 
explicit, literal block, e.g.:

	10 timesRepeat: [self blah]

rather than

	10 timesRepeat: aBlock

So, in those contexts, sometimes it's necessary to write code like this 
instead to enable the compiler optimization:

	10 timesRepeat: [aBlock value]

If the overhead of timesRepeat: is minimized, perhaps the Smalltalk 
compiler optimization is unnecessary.
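
To illustrate the difference between the two forms, here is a small 
workspace sketch; aBlock and count are made-up names.  Both sends 
evaluate the block ten times, and the second one simply hands the 
compiler a literal block that in turn calls aBlock:

	| aBlock count |
	count _ 0.
	aBlock _ [count _ count + 1].
	"A block held in a variable, so there is no literal block to inline"
	10 timesRepeat: aBlock.
	"A literal block wrapping the same block"
	10 timesRepeat: [aBlock value].
	count print

After both loops count is 20.  Whether the second form is actually 
inlined depends on the compiler in use.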

Andres.

On 11/7/19 14:08, Juan Vuletich via Cuis-dev wrote:
> Hi, Andrés, Folks,
> 
> I was thinking about the need you see for a faster implementation of 
> #timesRepeat:. I guess you are using it for benchmarking, right? I also 
> guess you are benchmarking some code that takes very little time to run, 
> to make the overhead of #timesRepeat: noticeable.
> 
> If that is the case, have you looked at #bench? It was brought from 
> Squeak, and it is the 'standard' way of benchmarking small pieces of code.
> 
> Cheers,
> 
-------------- next part --------------
'From Cuis 5.0 [latest update: #3944] on 7 November 2019 at 2:53:38 pm'!

!BlockClosure methodsFor: 'evaluating' stamp: 'sqr 11/7/2019 14:07:05'!
bench
	"See how many times I can value in 5 seconds.  I'll answer a meaningful description.
	[ Float pi printString ] bench print.
	[ 80000 factorial printString ] bench print.
	"

	| startTime endTime count run |
	count _ 0.
	run _ true.
	"Same initial conditions"
	Smalltalk garbageCollect.
	"So the code doing the measurement starts jitted"
	Time localMillisecondClock.
	self value.
	[ (Delay forSeconds: 5) wait. run _ false ] forkAt: Processor timingPriority - 1.
	startTime _ Time localMillisecondClock.
	[ run ] whileTrue: [
		self value.  self value.  self value.  self value.  self value.  self value.  self value.  self value.
		self value.  self value.  self value.  self value.  self value.  self value.  self value.  self value.
		self value.  self value.  self value.  self value.  self value.  self value.  self value.  self value.
		self value.  self value.  self value.  self value.  self value.  self value.  self value.  self value.
		count _ count + 32
	].
	endTime _ Time localMillisecondClock.
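	"Each pass through the loop runs the block 32 times, so a single pass leaves count at 32"
	count = 32
		ifTrue: [
			(endTime - startTime) / (1000 * count) withDecimalUnitPrefixAndValue: [ :value  :unitPrefixSymbol :unitPrefixName |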
				^String streamContents: [ :strm |
					value printOn: strm fractionDigits: 2.
					strm
						space;
						nextPutAll: unitPrefixSymbol;
						nextPutAll: ' seconds per run']]
			]
		ifFalse: [
			(count * 1000) / (endTime - startTime) withDecimalUnitPrefixAndValue: [ :value  :unitPrefixSymbol :unitPrefixName |
				^String streamContents: [ :strm |
					value printOn: strm fractionDigits: 2.
					strm
						space;
						nextPutAll: unitPrefixSymbol;
						nextPutAll: ' runs per second' ]]
			]! !

