[Cuis-dev] fileout. proposed 2 new methods for strict file chunks reading

Juan Vuletich juan at jvuletich.org
Sun Oct 24 09:14:05 PDT 2021


Hi Nicola,

On 10/23/2021 6:56 PM, Nicola Mingotti via Cuis-dev wrote:
>
> Hi Juan,
>
> At the best of my current undestanding I can provide this:
>
> 1. Fileout for tests in BaseImageTests

Much better!

> 2. A few fileout for new methods and method names

I still think the focus should be on terminator vs. separator, 
especially on method and argument names. See 
https://en.wikipedia.org/wiki/Newline :

"Interpretation
Two ways to view newlines, both of which are self-consistent, are that 
newlines either separate lines or that they terminate lines. If a 
newline is considered a separator, there will be no newline after the 
last line of a file. Some programs have problems processing the last 
line of a file if it is not terminated by a newline. On the other hand, 
programs that expect newline to be used as a separator will interpret a 
final newline as starting a new (empty) line. Conversely, if a newline 
is considered a terminator, all text lines including the last are 
expected to be terminated by a newline. If the final character sequence 
in a text file is not a newline, the final line of the file may be 
considered to be an improper or incomplete text line, or the file may be 
considered to be improperly truncated. "

> 3. I could not replicate the bug you say, I did not understand well 
> maybe. if you could send me
> a fail example it would be helpful.
>

Sure. The following test fails even after fixing the obvious bug:

testUpToStrict3
     | path fs read |
     path := 'test-{1}.txt' format: {(Float pi * 10e10) floor. } .
     path asFileEntry fileContents: ((1 to: 100) inject: '' into: [ 
:prev :each | prev, 'A lot of stuff, needs over 2000 chars! ']).
     fs := path asFileEntry readStream.
     read := fs upTo: $X strict: true.
     self assert: (read =  nil).
     fs close.

>
> bye
> Nicola
>

Cheers,

>
> On 10/23/21 02:26, Nicola Mingotti wrote:
>>
>> Hi Juan, let me a bit of time to read your references, I thought what 
>> I sent were test methods,
>> clearly i miss part of the story.
>>
>> There shouldn't be any concatenation of nil and for God sake NO 
>> partial records.
>> This is what I wanted to avoid, apologies.
>>
>> Tomorrow i will probably be out for the Linux day, i will update when 
>> possible.
>>
>>
>> bye
>> Nicola
>>
>>
>>
>>
>> On 10/23/21 01:20, Juan Vuletich wrote:
>>> Hi Folks,
>>>
>>> The main point here is not "strict vs. legacy", "logically correct 
>>> vs incorrect" or anything like that at all.
>>>
>>> The point is "separator vs. terminator", and how using a terminator 
>>> instead of a separator allows processing files while they are still 
>>> being written to. (And this has really no relation with running on a 
>>> server or any other kind of machine.)
>>>
>>> Besides, Nicola, your code has a bug when recurring on terminator: 
>>> it will answer the previous partial last record concatenated with nil.
>>>
>>> Finally, please take a look at TestCase,SUnit and 
>>> BaseImageTests.pck.st to see what we mean by a "test".
>>>
>>> Thanks,
>>>
>>> On 10/22/2021 12:18 PM, Nicola Mingotti via Cuis-dev wrote:
>>>>
>>>> Hi Hernan,
>>>>
>>>> We will have opportunity to work together on larger problems, this 
>>>> is too small.
>>>> It would take more time to talk than to do things ;)
>>>>
>>>> I have a proposed version. I rewrote the methods. wrote the test. I 
>>>> kept a good part
>>>> of the original code which may have evolved for efficiency over time.
>>>>
>>>> upToLegacy method can of course be eliminated. it is there only for 
>>>> reference.
>>>>
>>>> upTo: XXX --- now calls --->  upTo: XXX strict: false
>>>>
>>>> upTo: XXX strict: XXX ------ is recursive, it needs an extra helper 
>>>> method to remember a parameter (Scheme recursion style)  ----> 
>>>> upTo: XXX strict: XXX posMemo: xxxx
>>>>
>>>> See attached fileout
>>>>
>>>>
>>>> bye
>>>> Nicola
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On 10/21/21 19:49, Hernan Wilkinson wrote:
>>>>> ok, let me know. I wish we could do it together but my agenda (and 
>>>>> I guess yours) is almost always full...
>>>>>
>>>>> On Thu, Oct 21, 2021 at 2:32 PM Nicola Mingotti 
>>>>> <nmingotti at gmail.com> wrote:
>>>>>
>>>>>
>>>>>     Hi Hernan,
>>>>>
>>>>>     ok, let me try, it is too many days i am talking about it.
>>>>>
>>>>>     I will let you know soon
>>>>>
>>>>>     bye
>>>>>     Nicola
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>     On 10/21/21 19:02, Hernan Wilkinson wrote:
>>>>>>     Hi Nicolas,
>>>>>>      if you could refactor upTo: to use the same code as
>>>>>>     strictUpTo: and write the tests to check that everything
>>>>>>     works as expected, that would be great!
>>>>>>      I would not use the names of the Linux stdlib for those
>>>>>>     messages nor the C functions, it is not necessary...
>>>>>>      If you do not have the time to do it, I can give it a try if
>>>>>>     you wish.
>>>>>>
>>>>>>     Cheers!
>>>>>>     Hernan.
>>>>>>
>>>>>>     On Thu, Oct 21, 2021 at 12:47 PM Nicola Mingotti
>>>>>>     <nmingotti at gmail.com> wrote:
>>>>>>
>>>>>>
>>>>>>         Hi Hernan,
>>>>>>
>>>>>>         . forget the code and test. I can rewrite it from scratch
>>>>>>         with test. I actually changed
>>>>>>         existing code for "politeness" ;)
>>>>>>
>>>>>>         . for me it is very important to have this matter fixed,
>>>>>>         well and for the future.
>>>>>>         It is not good to have standard lib functionality
>>>>>>         disseminated in my application packages.
>>>>>>
>>>>>>         . since I found Linux stdlib has a function to do well
>>>>>>         what i want i will use that name(s)
>>>>>>         to avoid confusion and recycle already existing function
>>>>>>         names. "getline" and "getdelim".
>>>>>>
>>>>>>         . if you really dislike this functions I can put them in
>>>>>>         OSProcess and maybe
>>>>>>         just link the C version only for Linux/BSD. So much I
>>>>>>         think they are valuable in the server environment.
>>>>>>
>>>>>>         . to fix this i need maybe 1-2 days. If i need to link
>>>>>>         the C functions I don't know, since I never tried.
>>>>>>
>>>>>>         So, let me know, if you are not against these functions I
>>>>>>         am open to implement them well.
>>>>>>
>>>>>>
>>>>>>         ===== Extra considerations whose reading is secondary
>>>>>>         ==================
>>>>>>
>>>>>>         . your fix was one step in the right direction but not
>>>>>>         enough, you also need to
>>>>>>         bring back the stream pointer to the last existant $A.
>>>>>>         This is to say: too complex.
>>>>>>         A good method must do all its chore, not leave us back
>>>>>>         the dirty business and special conditions.
>>>>>>
>>>>>>         . I understand the concision, small core etc. On the
>>>>>>         other side, i
>>>>>>         run Cuis on the servers.  the most important thing there
>>>>>>         is on servers are files and
>>>>>>         sockets. You must read from there all of the time. It
>>>>>>         must be easy and idiot proof,
>>>>>>         rock solid and resistant to concurrent processing as far
>>>>>>         as possible.
>>>>>>
>>>>>>         . I see that Python and Ruby standard library do it
>>>>>>         wrong, at bit better than Cuis 'upTo' does.
>>>>>>         but still bad. They leave you the '\n' at the end, but,
>>>>>>         if any process goes on writing
>>>>>>         'f1.txt' Ruby and Python lost the half backed record !
>>>>>>         -------- Linux
>>>>>>         $> printf 'line-1\nline-2\nline-TRAP' > f1.txt
>>>>>>         # python
>>>>>>         $> python3.9 -c "f=open('f1.txt','r'); print(f.readlines())"
>>>>>>         => ['line-1\n', 'line-2\n', 'line-TRAP']
>>>>>>         # ruby
>>>>>>         $> ruby -e "f=open('f1.txt','r'); puts
>>>>>>         f.readlines().to_s;  "
>>>>>>         => ["line-1\n", "line-2\n", "line-TRAP"]
>>>>>>         # both Python and Ruby ate the half backed record ! bad !
>>>>>>         ---------------------------------------------------------
>>>>>>
>>>>>>         . C and CommonLisp standard libraries have a way to do it
>>>>>>         right:
>>>>>>         -) CL read-line.
>>>>>>         http://www.lispworks.com/documentation/HyperSpec/Body/f_rd_lin.htm#read-line
>>>>>>
>>>>>>         -) C getline.
>>>>>>         https://man7.org/linux/man-pages/man3/getline.3.html
>>>>>>
>>>>>>         . I understand I am probably the only one running Cuis in
>>>>>>         the server so I am the first
>>>>>>         to step into a few particular problems.
>>>>>>
>>>>>>         . In my opinion Cuis in the Server can be a good match,
>>>>>>         up to now i have 2 small
>>>>>>         company services working and a big one project in
>>>>>>         continuous development.
>>>>>>         Time will tell. Sturdiness, undertandability and ease of
>>>>>>         modification were my top priority.
>>>>>>         Up to now things are at least working.
>>>>>>
>>>>>>         ======================================================
>>>>>>
>>>>>>         bye
>>>>>>         Nicola
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>         On 10/21/21 14:53, Hernan Wilkinson wrote:
>>>>>>>         Hi Nicola,
>>>>>>>          I see your point regarding the functionality of upTo:,
>>>>>>>         but you can easily overcome that using #peekBack. Using
>>>>>>>         you example:
>>>>>>>         -----
>>>>>>>         s _ 'hello-1Ahello-2Ahel'.
>>>>>>>         '/tmp/test.txt' asFileEntry fileContents: s.
>>>>>>>
>>>>>>>         st1 _ '/tmp/test.txt' asFileEntry readStream .
>>>>>>>
>>>>>>>         st1 upTo: $A. " 'hello-1' "
>>>>>>>         st1 upTo: $A. " 'hello-2' "
>>>>>>>         st1 upTo: $A. " 'hel' "
>>>>>>>         (st1 atEnd and: [ st1 peekBack ~= $A ]) ifTrue: [ self
>>>>>>>         error: 'End of file without delimiter ].
>>>>>>>         ------
>>>>>>>          Regarding my concern of adding this functionality to
>>>>>>>         Cuis, we are trying to have a compact set of classes and
>>>>>>>         methods to reduce complexity (or at least not increase
>>>>>>>         it) and help newcomers to understand it and oldies to
>>>>>>>         remember it :-) . We are also trying to add more and
>>>>>>>         more tests because it is the only way to keep a system
>>>>>>>         from becoming a legacy one and to reduce the fear it
>>>>>>>         produces to change something.
>>>>>>>          The strictUpTo:startPos: you are sending is almost a
>>>>>>>         copy of the upTo: method, with a few lines changed.
>>>>>>>         Even though the functionality makes sense (although
>>>>>>>         right now you are the only one needing it and as I said,
>>>>>>>         you can use peekBack to overcome it), adding that method
>>>>>>>         adds repeated code which in the long term makes it more
>>>>>>>         difficult to understand and maintain, even more because
>>>>>>>         it does not have tests.
>>>>>>>          So I hope you understand that as maintainers of Cuis,
>>>>>>>         we want to be loyal to the goals I mentioned before and
>>>>>>>         keep Cuis as clean and simple as possible. If you can
>>>>>>>         refactor what you sent to avoid having repeated code
>>>>>>>         with #upTo: and add tests that verify the functionality
>>>>>>>         of both methods (strictUpTo: and upTo:), that will make
>>>>>>>         our task easier and meet the goals we have. If you think
>>>>>>>         this does not make sense to you, or you do not have the
>>>>>>>         time to do it, it is completely understandable and in
>>>>>>>         that case I suggest for you to have it as an extension
>>>>>>>         of the StandardFileStream class or just use the peekBack
>>>>>>>         message as I showed.
>>>>>>>          I hope you understand my concern and agree with me. If
>>>>>>>         not, please let me know.
>>>>>>>
>>>>>>>         Cheers!
>>>>>>>         Hernan.
>>>>>>>
>>>>>>>
>>>>>>>         On Tue, Oct 19, 2021 at 10:32 AM Nicola Mingotti
>>>>>>>         <nmingotti at gmail.com> wrote:
>>>>>>>
>>>>>>>
>>>>>>>             Hi Hernan,
>>>>>>>
>>>>>>>             In all frankness, in I would wipe out the old 'upTo'
>>>>>>>             because its behavior is a bit "wild".
>>>>>>>
>>>>>>>             On the other side, I understand it may create
>>>>>>>             problems in retro-compatibility, that is why for
>>>>>>>             the moment i propose to add a new method which
>>>>>>>             behaves a bit better.
>>>>>>>
>>>>>>>             I hope this example explains the problem:
>>>>>>>             -------------------------------------------------------
>>>>>>>             s _ 'hello-1Ahello-2Ahel'.
>>>>>>>             '/tmp/test.txt' asFileEntry fileContents: s.
>>>>>>>
>>>>>>>             st1 _ '/tmp/test.txt' asFileEntry readStream .
>>>>>>>
>>>>>>>             st1 upTo: $A. " 'hello-1' "
>>>>>>>             st1 upTo: $A. " 'hello-2' "
>>>>>>>             st1 upTo: $A. " 'hel' "         "(*)"
>>>>>>>             ------------------------------------------------------
>>>>>>>             (*) You can't establish in any way if you actually
>>>>>>>             found an "A" terminated block or just hit the end of
>>>>>>>             file
>>>>>>>             (*) If you hit the end of file you eat an incomplete
>>>>>>>             record, this is another problem, maybe another process
>>>>>>>             was going to end writing that record but you will
>>>>>>>             never know.
>>>>>>>
>>>>>>>             Maybe there is another method around that performs
>>>>>>>             similarly to 'strictUpTp', if there is I did not
>>>>>>>             find it, sorry.
>>>>>>>
>>>>>>>             IMHO, In a scale of importance from 0 to 10, this
>>>>>>>             method, for a programmer, >= 8.
>>>>>>>             I would definitely not put it into an external
>>>>>>>             package, too much fundamental.
>>>>>>>
>>>>>>>             bye
>>>>>>>             Nicola
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>             On 10/19/21 14:44, Hernan Wilkinson wrote:
>>>>>>>>             Hi Nicola!
>>>>>>>>              I was wondering, why are you suggesting adding
>>>>>>>>             them to the base? Is it not enough to implement
>>>>>>>>             them as an extension in your package?
>>>>>>>>              Also, I think that any new functionality should
>>>>>>>>             come with its corresponding tests to help the
>>>>>>>>             maintenance and understanding of the functionality.
>>>>>>>>
>>>>>>>>             Cheers!
>>>>>>>>             Hernan.
>>>>>>>>
>>>>>>>>
>>>>>>>>             On Tue, Oct 19, 2021 at 7:04 AM Nicola Mingotti via
>>>>>>>>             Cuis-dev <cuis-dev at lists.cuis.st> wrote:
>>>>>>>>
>>>>>>>>                 Hi Juan, guys,
>>>>>>>>
>>>>>>>>                 I would like to add to Cuis the 2 methods i
>>>>>>>>                 attach here. One is a helper method.
>>>>>>>>
>>>>>>>>                 -----------
>>>>>>>>                 StandardFileStream strictUpTo: delim.
>>>>>>>>                 -----------
>>>>>>>>
>>>>>>>>                 Differently from 'upTo: delim' this method:
>>>>>>>>                 1. Does not return stuff if it does not find
>>>>>>>>                 'delim'.
>>>>>>>>                 2. Does not upgrade the position on the stream
>>>>>>>>                 if does not find 'delim'.
>>>>>>>>                 3. If it finds 'delim' returns a chunk that
>>>>>>>>                 includes it.
>>>>>>>>
>>>>>>>>                 I am parsing log files at the moment, this is
>>>>>>>>                 very much useful.
>>>>>>>>
>>>>>>>>                 NOTE. Up to now I tested only on small files.
>>>>>>>>
>>>>>>>>                 bye
>>>>>>>>                 Nicola
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>                 -- 
>>>>>>>>                 Cuis-dev mailing list
>>>>>>>>                 Cuis-dev at lists.cuis.st
>>>>>>>>                 https://lists.cuis.st/mailman/listinfo/cuis-dev
>>>>>>>>
>>>>>>>>
>>>
>>> -- 
>>> Juan Vuletich
>>> www.cuis-smalltalk.org
>>> https://github.com/Cuis-Smalltalk/Cuis-Smalltalk-Dev
>>> https://github.com/jvuletich
>>> https://www.linkedin.com/in/juan-vuletich-75611b3
>>> @JuanVuletich
>>
>


-- 
Juan Vuletich
www.cuis-smalltalk.org
https://github.com/Cuis-Smalltalk/Cuis-Smalltalk-Dev
https://github.com/jvuletich
https://www.linkedin.com/in/juan-vuletich-75611b3
@JuanVuletich

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.cuis.st/mailman/archives/cuis-dev/attachments/20211024/0005011d/attachment-0001.htm>


More information about the Cuis-dev mailing list