<div dir="ltr"><div dir="ltr">Andres,</div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Tue, Sep 17, 2019 at 3:28 AM Andres Valloud via Cuis-dev <<a href="mailto:cuis-dev@lists.cuis.st">cuis-dev@lists.cuis.st</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Hey Phil, this is neat :). Let's play the VM development game a bit, I <br>
think it's helpful to at least give an idea of what it's like. The same <br>
principles can be applied to any program, and IME the results are good.<br>
<br></blockquote><div><br></div><div>Works for me!</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
On 9/16/19 23:10, Phil B wrote:<br>> Taking a half step back, I ask the question: what's it supposed to be <br>
> doing? According to <br>
> <a href="https://github.com/Geal/Squeak-VM/blob/master/platforms/Mac%20OS/vm/Documentation/3.2.2%20Release%20Notes.rtf" rel="noreferrer" target="_blank">https://github.com/Geal/Squeak-VM/blob/master/platforms/Mac%20OS/vm/Documentation/3.2.2%20Release%20Notes.rtf</a> <br>
> the file flush primitive was added in (classic) VM 3.0.5 and 'now <br>
> actually flushes the file via an OS call' as of 3.0.6.<br>
<br>
Interesting: how do the dates of the primFlush: method (circa 2001) and <br>
the 3.0.6 VM correlate? Is this a case of the Smalltalk image hacks <br>
going stale while the VM changes away? Interesting: both primFlush: and <br>
that VM are essentially contemporaneous, because the VM is from about 2002.<br></blockquote><div><br></div><div>Yes, but also keep in mind that VM and image changes haven't necessarily been in sync since at least whenever the separate VM team started up (not sure when that occurred). For example, there has been discussion on vm-dev about Sista VM development, but primary Squeak development isn't doing anything with it yet. Image support often lags a bit, since the VM changes need to exist before the image can support them, though it's entirely likely that on some changes the image has led.</div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<br>
Why would primFlush: retain the comment about xyzOS not doing flush when <br>
the VM release notes insist that flushing now flushes?</blockquote><div><br></div><div>This is one of the downsides to having separate development teams: left hand vs. right hand and all that.</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"> Note also the <br>
reference to CodeWarrior 5.3, but according to this:<br>
<br>
<a href="https://en.wikipedia.org/wiki/CodeWarrior" rel="noreferrer" target="_blank">https://en.wikipedia.org/wiki/CodeWarrior</a><br>
<br>
that only ran on Windows and Mac. Also, surely CodeWarrior usage is <br>
super obsolete by now. What's going on in here?...<br></blockquote><div><br></div><div>While CodeWarrior is obsolete in this century for new development, it was pretty much *the* go-to development platform for Macs from the arrival of PPC support until OS X (it was the first, and for a long time the only, mainstream compiler supporting PPC). And as is typical, many devs clung to it until the bitter end, when supporting OS 9 and prior was no longer viable.</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<br>
Side comment: you know, back then there was a POSIX / Single Unix <br>
Specification, so all that was necessary was to write the VM to POSIX <br>
(mostly, on Windows you have to do a bit of work for that). However, <br>
maybe it can be excused because POSIX / SUS was rather new at the time.<br>
<br>
<a href="https://en.wikipedia.org/wiki/POSIX" rel="noreferrer" target="_blank">https://en.wikipedia.org/wiki/POSIX</a><br>
<br>
Maybe at the time there wasn't a decent SDK on Mac... no idea. In any <br>
case, that's not the case today.<br></blockquote><div><br></div><div>Just because there was a specification doesn't mean that it was fully, or even correctly, supported throughout most of the '90s, even by those who were supposedly POSIX compliant. A big part of open source development work was dealing with some rather significant deviations between platforms... and that was just on Unix. Also, GNU (and therefore Linux) was notably incomplete, as well as incompatible by design on one or two things, IIRC. That said, Mac and Windows were not POSIX compliant in any meaningful way back then. (Windows had a POSIX subsystem for NT, but I don't recall anyone I worked with ever using it... I think it was more a marketing bullet point than anything that saw serious use back then. I don't think Mac had anything to say on the POSIX front until OS X.)</div><div><br></div><div>It's similar to today with HTML standards: everyone is compliant... kinda, sorta... to varying degrees... with inadvertent and deliberate deviations.</div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">> [1] "fflush() can fail for the same reasons write() can so errors<br>
> must be checked but sqFileFlush() must support being called on<br>
> readonly files for historical reasons so EBADF is ignored"... so<br>
> there's one example of how it could fail but for this particular<br>
> failure case it is ignored in the C code<br>
That's interesting, I'd verify whether a 5 line C program shows fflush() <br>
fails with EBADF when given a file open for reading only. POSIX says <br>
EBADF is returned when the file handle isn't valid, but what if you do <br>
pass in a valid file handle? Shouldn't fflush() be a no-op then?<br></blockquote><div><br></div><div>I would expect that it depends on the underlying object the handle references: local file (what kind of filesystem), network file (what kind) etc. and that's where the 'other errors' would come into play.</div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
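</blockquote><div><br></div><div>A minimal sketch of the small test suggested above, written as a helper so it can be dropped into any scratch program. The name flush_read_only and its parameters are made up for illustration and do not come from the VM sources. One caveat worth stating up front: ISO C leaves fflush() on an input stream undefined, while POSIX defines it, so whatever this reports is specific to the platform and libc it runs on. No particular result is claimed here.</div>
<pre>
```c
#include <errno.h>
#include <stdio.h>

/* Hypothetical helper for illustration; not taken from the VM sources.
 * Opens `path` read-only, calls fflush() on the stream, and stores the
 * return value and the resulting errno for the caller to inspect.
 * Returns 0 if the file was opened, -1 if fopen() failed. */
int flush_read_only(const char *path, int *flush_rc, int *flush_errno)
{
    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1;

    errno = 0;                 /* so a stale errno is not mistaken for a failure */
    *flush_rc = fflush(f);     /* 0 on success, EOF on failure */
    *flush_errno = errno;      /* only meaningful when *flush_rc == EOF */

    fclose(f);
    return 0;
}
```
</pre>
<div>Calling it on any existing file and printing flush_rc together with strerror(flush_errno) shows what a given platform actually does with a read-only stream, which is the evidence needed before deciding whether ignoring EBADF is still justified.</div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">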
<br>
> platforms/win32/plugins/FilePlugin/sqWin32FilePrims.c (which <br>
> calls FlushFileBuffers(FILE_HANDLE(f)) and ignores the return code)<br>
<br>
Ignoring return codes is not good at all. In particular, you can get <br>
into all sorts of problems by doing that in MSDN land.<br></blockquote><div><br></div><div>That could be due either to the reported errors being different enough that meaningfully supporting/reporting them would have been a chore, or to simple laziness. And of course everyone knows fflush never fails in the real world (see your famous last words comment ;-)</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<br>
> It looks like it will never fail on Windows (regardless of the fact that <br>
> the call might have)<br>
<br>
In MSDN, any time you see "call GetLastError() to see what happened", <br>
effectively that means any of these circumstances can happen:<br>
<br>
<a href="https://docs.microsoft.com/en-us/windows/win32/debug/system-error-codes" rel="noreferrer" target="_blank">https://docs.microsoft.com/en-us/windows/win32/debug/system-error-codes</a><br>
<br>
Note the numbers go up to 15999. Ok fine they are not all used, but <br>
still. And look at this text:<br>
<br>
"System Error Codes are very broad: each one can occur in one of many <br>
hundreds of locations in the system. Consequently, the descriptions of <br>
these codes cannot be very specific. Use of these codes requires some <br>
amount of investigation and analysis. You need to note both the <br>
programmatic and the runtime context in which these errors occur. <br>
Because these codes are defined in WinError.h for anyone to use, <br>
sometimes the codes are returned by non-system software. And sometimes <br>
the code is returned by a function deep in the stack and far removed <br>
from your code that is handling the error."<br>
<br>
In practice, this means "no MSDN function documentation page will list <br>
what errors can occur when calling it", which means "anything goes". <br>
This is very much unlike POSIX, i.e. very much unhelpful.<br></blockquote><div><br></div><div>See my comment about POSIX compliance above ;-)</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<br>
For instance, why is it that I need to care that ReadFile() and <br>
WriteFile() may fail with ERROR_NO_SYSTEM_RESOURCES when attempting an <br>
I/O operation at least 64 MB - 32 KB + 16 bytes in size (this figure is <br>
undocumented), but only when that I/O occurs on mapped drives, and even <br>
if the mapped drive is local to the machine?<br>
<br>
Because e.g.: loading or saving the image fails.<br>
<br>
Ah, right. So now that *ONE* error condition needs special handling. <br>
Great. Only 15998 possible values to go.<br></blockquote><div><br></div><div>One thing that can be said in defense of Windows VM development: prior to OS X, Mac VM support was probably just about as distinct. Mac moved to Unix and Windows stayed put, so now Windows is the odd OS out on this front.</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
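</blockquote><div><br></div><div>Going back to the ERROR_NO_SYSTEM_RESOURCES case quoted above: one shape that special handling could take is to split large transfers into smaller pieces. The sketch below is only an assumption about how that might look, not what any VM actually does. It uses portable stdio as a stand-in for WriteFile(), the names write_chunked and max_chunk are made up, and the caller has to pick the cap, since the real threshold is undocumented. Because stdio may buffer or coalesce writes, a real Windows port would put the same loop around WriteFile() itself.</div>
<pre>
```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper for illustration: write `len` bytes from `buf` to `f`
 * in pieces of at most `max_chunk` bytes each.
 * Returns the number of bytes written; anything less than `len` means a
 * write failed (or `max_chunk` was 0) and the caller should check ferror(). */
size_t write_chunked(FILE *f, const void *buf, size_t len, size_t max_chunk)
{
    const unsigned char *p = buf;
    size_t done = 0;

    if (max_chunk == 0)
        return 0;              /* a zero cap could never make progress */

    while (done < len) {
        size_t n = len - done;
        if (n > max_chunk)
            n = max_chunk;     /* never hand the OS more than the cap */

        size_t w = fwrite(p + done, 1, n, f);
        done += w;
        if (w < n)
            break;             /* short write: stop and report progress so far */
    }
    return done;
}
```
</pre>
<div>The return value is deliberately not collapsed into a success flag: reporting how far the write got keeps the error visible to the caller instead of ignoring it.</div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">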
<br>
> but can fail everywhere else depending on the rules <br>
> of the particular OS. On Linux, over a dozen possible error codes are <br>
> given (one of them is the ignored EBADF case) as well as a note that a <br>
> variety of additional errors can occur depending on the particular <br>
> object the file descriptor represents. So reasons: many.<br>
<br>
Yes, however, there are O(10) possible errors, not O(10^4). It's a huge <br>
difference with the MSDN world. I'd rather write against the Unix <br>
subsystem / standard C library on Windows.<br>
<br>
> I believe the #forceChangesToDisk hack had a different objective. The <br>
> other hack(s) are dealing with flush failure, #forceChangesToDisk <br>
> appears to predate flush support and/or to deal with the reality at the <br>
> time that flush alone often wasn't a complete solution.<br>
<br>
Ok, so if that's true, then we're dealing with bit rot.<br>
<br>
This shows that it is incredibly important to be completely thorough, <br>
because it is at that time that a good understanding of the entire <br>
problem is in anyone's head. If you are not thorough today, someone <br>
else will have to recreate your state of mind tomorrow. Overall, <br>
everybody goes slower.<br></blockquote><div><br></div><div>Yep, that's why I'm always complaining about documentation. What seems obvious today often won't be six months or more from now. The frustrating thing is that even though we're being thorough, we could still easily be wrong, since we're trying to piece together intent from incomplete information. It's better than guessing, but worse than if someone had just written... a... few... more... words of documentation ;-)</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<br>
> I'm as baffled by that cryptic comment as you.<br>
<br>
Might as well delete it, then. It serves no good purpose if it can't be <br>
tied to anything concrete.<br></blockquote><div><br></div><div>I'm OK with that.</div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<br>
> I did a little more general search trying to find something, anything <br>
> that might point in a direction that leads to clarity... nothing so far.<br>
<br>
The reference in POSIX says fflush() flushes, and the MSDN reference <br>
says FlushFileBuffers() flushes. If that covers all platforms, the <br>
comment needs to go because misbehavior means it's not your problem and <br>
you can file a bug against the spec. Provided, of course, that you are <br>
as certain as you can be that the relevant API is being used correctly, and <br>
that you can recreate the problem in a small, standalone C program, <br>
which you will attach to the bug report :).<br>
<br>
Andres.<br>
</blockquote></div></div>