[Cuis-dev] Unwind mechanism during termination is broken and inconsistent

Juan Vuletich juan at jvuletich.org
Fri May 21 12:08:33 PDT 2021


Hi Jaromir,

On 5/16/2021 5:26 PM, Jaromir Matas via Cuis-dev wrote:
>
> Hi Juan,
>
> I'm enclosing an updated changeset (please disregard the previous 
> one); I made a stupid assumption about #runUntilErrorOrReturnFrom:. 
> The new changeset leaves it untouched and adds its replica called 
> #runUnwindUntilErrorOrReturnFrom: with the required functionality.
>

I think you are right in the previous email, and we'd integrate your 
latest change.I have already checked that none of the tests we recently 
added breaks because of it. If you could turn some of the arguments you 
provide into tests, it would be great.

> I'd like to point out a little issue with the proposed semantics of 
> nested errors during termination:
>
> Below’s a simple example. If you abandon the first debugger it means 
> you want to terminate the execution and unwind the terminated process 
> but an error happens during the unwind. You get another debugger 
> reporting the error and asking you what to do next - to debug the 
> error or proceed. But proceeding a termination really means to proceed 
> with the **unwind business only** and not to proceed with the original 
> computation. Here's my point: Hitting Proceed will resume the original 
> computation beyond the unwind blocks but Abandon will only finish the 
> unwind procedure which is exactly what we want. In Squeak I simply 
> disabled the Proceed button but in Cuis I couldn't see such a simple 
> solution so I left it unresolved.
>
> [self error: 'x1'] ensure: [
>
>     [self error: 'x2'] ensure: [
>
>         Transcript show: '2'].
>
>     Transcript show: '1'].
>
> Transcript show: '0'
>
> ---> you want to see '21' but proceeding the debugger will give you 
> '210' where '0' is printed outside of unwind blocks.
>
> So just that you know there's still a little inconsistency in the 
> termination semantics.
>

I see. The issue is how to disable a button so it doesn't act? The 
simplest way is by saying `aButton lock: true`. We could add #enable and 
#disable to PluggableButtonMorph (calling self lock: ). Or perhaps call 
it #isEnabled: like in MenuItemMorph, although it is an ugly name. Can 
you prepare a change set like this? That would be great. I mean, I don't 
know where those calls should be made!

> One more question: do you have an idea why BlockCannotReturn error is 
> considered resumable? I'd say it's unresumable by definition…? Thanks :)
>

Well, that method is pretty old, and in the mean time, exception 
handling has gotten a lot better, thanks to several batches of 
improvements, latter ones by you. So, please provide a patch to change 
that, so it properly carries your initials.

> best regards,
>
> Jaromir
>

Thanks again Jaromir for this amazing work.

Cheers,

> *From: *Jaromir Matas via Cuis-dev <mailto:cuis-dev at lists.cuis.st>
> *Sent: *Wednesday, May 12, 2021 17:02
> *To: *Discussion of Cuis Smalltalk <mailto:cuis-dev at lists.cuis.st>
> *Cc: *Jaromir Matas <mailto:mail at jaromir.net>; Juan Vuletich 
> <mailto:juan at jvuletich.org>; Hernan Wilkinson 
> <mailto:hernan.wilkinson at 10pines.com>
> *Subject: *Re: [Cuis-dev] Unwind mechanism during termination is 
> broken and inconsistent
>
> Hi Juan,
>
> I apologize for responding so late to your great comments. I felt I 
> had to educate myself in error handling first :)
>
> > m1
>
> >    | a |
>
> >    a := Array new: 3.
>
> >    self m2: a.
>
> >    a at: 3 put: true.
>
> >    a print.
>
> >
>
> >  m2: a   "A"
>
> >    [1 + 2] ensure: [
>
> >      [ 3 + 4 ] ensure: [    "*1"
>
> >        true ifTrue: [
>
> >           ^a at: 1 put: true ]]. "*2"
>
> >      a at: 2 put: true ]
>
> >
>
> >  In this example, the *1ensure is there only to guarantee that *2is 
> ran, even if [3+4] happens to fail. If [3+4] it runs without problems, 
> the result should be exactly the same as:
>
> >
>
> >  m2: a   "B"
>
> >    [1 + 2] ensure: [
>
> >      3 + 4.
>
> >      true ifTrue: [
>
> >        ^a at: 1 put: true ].
>
> >      a at: 2 put: true ]
>
> >
>
> >  Applying the same argument, the result should be the same as:
>
> >
>
> >  m2: a   "C"
>
> >    1 + 2.
>
> >    3 + 4.
>
> >    true ifTrue: [
>
> >      ^a at: 1 put: true ].
>
> >    a at: 2 put: true
>
> >
>
> >  In implementation C it is clear that a second isNil. So, the same 
> should be the case for B and A.
>
> This is a beautiful example! Thanks a lot, very enlightening. I fully 
> agree, my suggestion was incorrect. The logic you are showing says to 
> leave the block where the non-local return executes without unwinding 
> but to continue unwinding after that block. Very interesting!
>
> > x := nil.
>
> > [self error: 'x1'] ensure: [
>
> >   [self error: 'x2'] ensure: [
>
> > ​    [self error: 'x3'] ensure: [
>
> > ​      x:=3].
>
> > ​    x:=2].
>
> >   x:=1].
>
> > x
>
> >
>
> > I think this case is very similar to the one above.. For example, if 
> we proceed the first debugger (the x1 error), the x2 debugger opens. 
> If we abandon it, we are abandoning the execution of the first ensured 
> block (that includes the x := 1 assignment at the end). So, no 
> assignment is done. I think that the behavior of our fix is correct in 
> this case. No need to simulate "resuming".
>
> In this case I'd disagree. Abandoning is currently implemented and 
> interpreted as terminating so I chose to follow this logic and made 
> abandoning equivalent to sending #terminate, meaning the unwind will 
> continue and execute the second error, open a second debugger etc. 
> Please try out the enclosed changeset; it also follows the non-local 
> logic you suggested above. We've also discussed the issue with 
> Christoph Thiede here: 
> http://forum.world.st/Solving-multiple-termination-bugs-summary-proposal-tp5128285p5129113.html
>
> The main idea behind the extended version of #terminate is using the 
> return value from #runUntilErrorOrReturnFrom: to continue the unwind 
> based on the exception found.
>
> I've also fixed a few minor bug in the current implementation:
>
> 1) using suspendedContext variable during unwind is an incorrect 
> approach that may lead to leaving unterminated context chains behind 
> which the garbage collector wouldn't recycle; I'm using a local temp 
> 'top' instead.
>
> 2) I removed the part looking for an 'inner' context halfway through 
> execution - it's unnecessary and theoretically even incorrect - and 
> use outerMost as the unwind context that needs to be completed.
>
> 3) the condition in the middle (`ifNil: [^self]`) became unnecessary 
> after your fix.
>
> 4) there was an issue with infinite recursion in case of #cannotReturn 
> and #doesNotUnderstand - this has been fixed by explicitly excluding 
> the two exceptions from returning.
>
> 5) #cannotReturn generally causes a VM crash if it attempts to return 
> - this is fixed by looping #cannotReturn back to itself and the way 
> out leads via Abandoning the debugger window (even accidental pressing 
> Proceed crashes your image so I guess it's worth fixing in every case)
>
> 6) #runUntilErrorOrReturnFrom: required a slight modification - I 
> suggest either modify it or we can create a copy with this modified 
> behavior; I chose to modify it because no current method would be 
> affected by the change.
>
> As another example illustrating the changes you can try:
>
>     a := Array new: 4 withAll: false.
>
>     p := [
>
>              [
>
>             [] ensure: [
>
>                            [ ^2 ] ensure: [
>
>                                                       a at: 1 put: true].
>
>                            a at: 2 put: true]
>
>                            ] ensure: [a at: 3 put: true].
>
>                            a at: 4 put: true
>
>         ] newProcess.
>
>     p resume.
>
> It crashes the image without amending #cannotReturn but works well 
> with the enclosed changeset and returns
>
> correctly a equal to #(true false true false).
>
> Also this example is catastrophic but dealt with safely with the new 
> changeset:
>
>    [^2] newProcess resume
>
> I haven't built any testcases yet but I definitely plan to if you find 
> this interesting.
>
> Many thanks for discussing this issue; any comments will be greatly 
> appreciated :)
>
> best,
>
> Jaromir
>
> *From: *Juan Vuletich via Cuis-dev <mailto:cuis-dev at lists.cuis.st>
> *Sent: *Friday, April 30, 2021 19:32
> *To: *Discussion of Cuis Smalltalk <mailto:cuis-dev at lists.cuis.st>
> *Cc: *Juan Vuletich <mailto:juan at jvuletich.org>; Jaromir Matas 
> <mailto:mail at jaromir.net>; Hernan Wilkinson 
> <mailto:hernan.wilkinson at 10pines.com>
> *Subject: *Re: [Cuis-dev] Unwind mechanism during termination is 
> broken and inconsistent
>
> Hi Jaromir,
>
> (inline)
> On 4/30/2021 7:01 AM, Jaromir Matas via Cuis-dev wrote:
>
>     Hi Juan,
>
>     Awesome! I remember the idea of using the result of
>     #runUntilErrorOrReturnFrom crossed my mind for a fleeting moment…
>     and then I lost it ;) With your permission I’d like to use your
>     fix in my Squeak version as well.
>
>
> I'm glad you like it. Yes, of course. In these low level details, it 
> is a good thing to have as much common code as possible!
>
> Besides, in general, all the code we publish for Cuis is MIT license, 
> and free for any use.
>
>     I’ve tried to rewrite your #test1ATerminate without the method
>     calls – and indeed it passes… and that’s why I missed that – it
>     was too simple; when using sends is where your fix comes to the
>     rescue – THANKS!
>
>         | p a |
>
>         a := Array new: 4 withAll: false.
>
>         p := [
>
>                  [
>
>                                   [ ] ensure: [
>
>                                                  [Processor
>     activeProcess suspend] ensure: [
>
>                                                                ^a at:
>     1 put: true].    "line L1"
>
>                                                  a at: 2 put:
>     true]                     "line L2"
>
>                                   ] ensure: [a at: 3 put: true].
>
>                                   a at: 4 put: true
>
>             ] newProcess.
>
>         p resume.
>
>         Processor yield.
>
>         "make sure p is suspended and none of the unwind blocks has
>     finished yet"
>
>         self assert: p isSuspended.
>
>         a noneSatisfy: [ :b | b ].
>
>         "now terminate the process and make sure all unwind blocks
>     have finished"
>
>         p terminate.
>
>         self assert: p isTerminated.
>
>         self assert: a first & a third.
>
>         self assert: (a second | a fourth) not.
>
>
> Yes it does. I just thought that a more real-life like test of non 
> local returns should also include actual method calls!
>
>     I’d like to raise a question here: I feel the second item, on line
>     L2 should ideally execute too because it’s inside an unwind block
>     halfway through it’s termination. The problem is though the
>     non-local return at line L1 invokes it’s own unwind algorithm in
>     #resume:through: which ignores halfway through unwind blocks – the
>     reason for that is #resume:through: operates on the active
>     process’s stack which makes it extremely difficult to unwind
>     halfway through blocks. I tried to apply a similar tactics like in
>     termination (control the unwind from another stack) and it works
>     well but it’s very intrusive… I may open a separate discussion on
>     that later to share the results. Do you think it may be worth
>     exploring or it’s just not worth the bother?
>
>
> Well. This is not just in the case of process #terminate, right? To 
> play with this without involving process handling, but including 
> actual method calls I just tried this:
>
> m1
>     | a |
>     a := Array new: 3.
>     self m2: a.
>     a at: 3 put: true.
>     a print.
>
> m2: a      "A"
>     [1 + 2] ensure: [
>         [ 3 + 4 ] ensure: [       "*1"
>             true ifTrue: [
>                 ^a at: 1 put: true ]]. "*2"
>         a at: 2 put: true ]
>
> In this example, the *1ensure is there only to guarantee that *2is 
> ran, even if [3+4] happens to fail. If [3+4] it runs without problems, 
> the result should be exactly the same as:
>
> m2: a     "B"
>     [1 + 2] ensure: [
>         3 + 4.
>         true ifTrue: [
>             ^a at: 1 put: true ].
>         a at: 2 put: true ]
>
> Applying the same argument, the result should be the same as:
>
> m2: a     "C"
>     1 + 2.
>     3 + 4.
>     true ifTrue: [
>         ^a at: 1 put: true ].
>     a at: 2 put: true
>
> In implementation C it is clear that a second isNil. So, the same 
> should be the case for B and A.
>
> I think that an ensured block should be guaranteed to run without 
> external interference. But if it decides on its own to exit before 
> running all its statements, it is it's own decision.
>
>     Another issue: I considered using the error part of the result of
>     #runUntilErrorOrReturnFrom: to deal with situations like this
>     (careful – crashes the Cuis image without the terminate fix; with
>     the fix it works “ok”):
>
>     x := nil.
>
>     [self error: 'x1'] ensure: [
>
>         [self error: 'x2'] ensure: [
>
>             [self error: 'x3'] ensure: [
>
>                 x:=3].
>
>             x:=2].
>
>         x:=1].
>
>     x
>
>     Here you have nested errors and the question is: If we abandon the
>     Debugger window, what do we want to see as a result? Without the
>     fix the image crashes badly with unwind errors, with the fix
>     however the Debugger closes without unwinding – it’s a consequence
>     of # runUntilErrorOrReturnFrom: behavior – it returns errors
>     rather than opens a debugger (and leaves the decision with the
>     user). So what do we want to see as a result – keep opening
>     debugger windows and abandoning them manually or ignoring the
>     errors and executing the assignments? That sounds like “resuming”
>     rather than abandoning to me so at the moment I don’t know and
>     will have to think about it. I just didn’t want to complicate the
>     #terminate prematurely :)
>
>
> I think this case is very similar to the one above.. For example, if 
> we proceed the first debugger (the x1 error), the x2 debugger opens. 
> If we abandon it, we are abandoning the execution of the first ensured 
> block (that includes the x := 1 assignment at the end). So, no 
> assignment is done. I think that the behavior of our fix is correct in 
> this case. No need to simulate "resuming".
>
>     Juan, many thanks again, I’ll study your tests and learn from them.
>
>
> I'm happy to be of help. There's not nothing in those tests that could 
> be new to you. All I did was to add nested method calls, and add the 
> #resume cases, that already did work with the fix. Let me thank you. 
> You did a great analysis of the issues at hand, and your fix is a 
> great contribution.
>
> I'll integrate it right now. If any further analysis provides 
> additional changes, we'll integrate them too.
>
>     Best regards,
>
>     jaromir
>
>
> Cheers,
>
> -- 
> Juan Vuletich
> www.cuis-smalltalk.org  <http://www.cuis-smalltalk.org>
> https://github.com/Cuis-Smalltalk/Cuis-Smalltalk-Dev
> https://github.com/jvuletich
> https://www.linkedin.com/in/juan-vuletich-75611b3
> @JuanVuletich
>


-- 
Juan Vuletich
www.cuis-smalltalk.org
https://github.com/Cuis-Smalltalk/Cuis-Smalltalk-Dev
https://github.com/jvuletich
https://www.linkedin.com/in/juan-vuletich-75611b3
@JuanVuletich

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.cuis.st/mailman/archives/cuis-dev/attachments/20210521/1706f346/attachment-0001.htm>


More information about the Cuis-dev mailing list