[Cuis-dev] fileout. proposed 2 new methods for strict file chunks reading

Nicola Mingotti nmingotti at gmail.com
Mon Oct 25 09:09:38 PDT 2021


Hi guys,

Juan, I saw your changes; I think I understood them.

My only doubt is about the parameter name 'delimiterIsTerminator'. I guess
that if it is meant to read as a question, it should be
'isDelimiterTerminator' or 'isDelimiterATerminator'.

I am not a native English speaker, so my opinion on this weighs about as much as a feather.

bye
Nicola




On 10/25/21 16:39, Juan Vuletich wrote:
> Hi Nicola, Hernán,
>
> This is my take. I tried to be explicit and clear with 'terminator' 
> vs. 'separator', and also added the other two implementors of #upTo: 
> in the Stream hierarchy. Slightly tweaked your tests, and used them 
> almost verbatim for the other two implementors.
>
> Please review.
>
> Thanks,
>
> On 10/25/2021 7:35 AM, Nicola Mingotti via Cuis-dev wrote:
>>
>> Hi Juan,
>>
>> 1. I corrected the bug you found, added more test cases, and made
>> them symmetric between 'upTo' and 'upToStrict'. There are 2 files
>> attached: one for the tests, one collecting the changes to System-Files.
>>
>> 2. About the names 'terminator' and 'separator': I see your point, and
>> I am open to any naming scheme. My motivation for asking for this
>> enhancement of 'upTo' is entirely log parsing, so it would not be
>> inappropriate to name the boolean parameter something like
>> "logReaderMode". It would be long, but easy to recognize for people
>> involved in this kind of business. To be honest, I don't dislike
>> "strict" either.
>>
>>
>> bye
>> Nicola
>>
>>
>>
>> On 10/24/21 18:14, Juan Vuletich wrote:
>>> Hi Nicola,
>>>
>>> On 10/23/2021 6:56 PM, Nicola Mingotti via Cuis-dev wrote:
>>>>
>>>> Hi Juan,
>>>>
>>>> To the best of my current understanding, I can provide this:
>>>>
>>>> 1. Fileout for tests in BaseImageTests
>>>
>>> Much better!
>>>
>>>> 2. A few fileouts for the new methods and method names
>>>
>>> I still think the focus should be on terminator vs. separator, 
>>> especially on method and argument names. See 
>>> https://en.wikipedia.org/wiki/Newline :
>>>
>>> "Interpretation
>>> Two ways to view newlines, both of which are self-consistent, are 
>>> that newlines either separate lines or that they terminate lines. If 
>>> a newline is considered a separator, there will be no newline after 
>>> the last line of a file. Some programs have problems processing the 
>>> last line of a file if it is not terminated by a newline. On the 
>>> other hand, programs that expect newline to be used as a separator 
>>> will interpret a final newline as starting a new (empty) line. 
>>> Conversely, if a newline is considered a terminator, all text lines 
>>> including the last are expected to be terminated by a newline. If 
>>> the final character sequence in a text file is not a newline, the 
>>> final line of the file may be considered to be an improper or 
>>> incomplete text line, or the file may be considered to be improperly 
>>> truncated. "
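
The two readings quoted above can be made concrete with a small Python sketch. This is an illustration only, not Cuis code: `split_as_separator` and `split_as_terminator` are hypothetical names, and the sample text is the one used elsewhere in this thread.

```python
def split_as_separator(text, delim="\n"):
    """Delimiter separates records: whatever follows the last delimiter is a record too."""
    return text.split(delim)


def split_as_terminator(text, delim="\n"):
    """Delimiter terminates records: only delimiter-ended pieces are complete.

    Returns (complete_records, unterminated_tail).
    """
    *complete, tail = text.split(delim)
    return complete, tail


data = "line-1\nline-2\nline-TRAP"
print(split_as_separator(data))   # ['line-1', 'line-2', 'line-TRAP']
print(split_as_terminator(data))  # (['line-1', 'line-2'], 'line-TRAP')
```

With the terminator reading, the unterminated tail stays visibly separate from the complete records, so a caller can decide to wait for more data instead of treating it as a record.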
>>>
>>>> 3. I could not reproduce the bug you mention; maybe I did not
>>>> understand it well. If you could send me a failing example, it
>>>> would be helpful.
>>>>
>>>
>>> Sure. The following test fails even after fixing the obvious bug:
>>>
>>> testUpToStrict3
>>>     | path fs read |
>>>     path := 'test-{1}.txt' format: {(Float pi * 10e10) floor. } .
>>>     path asFileEntry fileContents: ((1 to: 100)
>>>         inject: '' into: [ :prev :each | prev, 'A lot of stuff, needs over 2000 chars! ']).
>>>     fs := path asFileEntry readStream.
>>>     read := fs upTo: $X strict: true.
>>>     self assert: (read =  nil).
>>>     fs close.
>>>
>>>>
>>>> bye
>>>> Nicola
>>>>
>>>
>>> Cheers,
>>>
>>>>
>>>> On 10/23/21 02:26, Nicola Mingotti wrote:
>>>>>
>>>>> Hi Juan, give me a bit of time to read your references. I thought
>>>>> what I sent were test methods; clearly I am missing part of the
>>>>> story.
>>>>>
>>>>> There shouldn't be any concatenation with nil and, for God's sake,
>>>>> NO partial records.
>>>>> That is exactly what I wanted to avoid; apologies.
>>>>>
>>>>> Tomorrow I will probably be out for Linux Day; I will send an
>>>>> update when possible.
>>>>>
>>>>>
>>>>> bye
>>>>> Nicola
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On 10/23/21 01:20, Juan Vuletich wrote:
>>>>>> Hi Folks,
>>>>>>
>>>>>> The main point here is not "strict vs. legacy", "logically 
>>>>>> correct vs incorrect" or anything like that at all.
>>>>>>
>>>>>> The point is "separator vs. terminator", and how using a 
>>>>>> terminator instead of a separator allows processing files while 
>>>>>> they are still being written to. (And this has really no relation 
>>>>>> with running on a server or any other kind of machine.)
>>>>>>
>>>>>> Besides, Nicola, your code has a bug when recurring on 
>>>>>> terminator: it will answer the previous partial last record 
>>>>>> concatenated with nil.
>>>>>>
>>>>>> Finally, please take a look at TestCase, SUnit and
>>>>>> BaseImageTests.pck.st to see what we mean by a "test".
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> On 10/22/2021 12:18 PM, Nicola Mingotti via Cuis-dev wrote:
>>>>>>>
>>>>>>> Hi Hernan,
>>>>>>>
>>>>>>> We will have opportunities to work together on larger problems;
>>>>>>> this one is too small.
>>>>>>> It would take more time to talk about it than to do it ;)
>>>>>>>
>>>>>>> I have a proposed version: I rewrote the methods and wrote the
>>>>>>> test. I kept a good part of the original code, which may have
>>>>>>> evolved for efficiency over time.
>>>>>>>
>>>>>>> The upToLegacy method can of course be eliminated; it is there
>>>>>>> only for reference.
>>>>>>>
>>>>>>> upTo: XXX --- now calls --->  upTo: XXX strict: false
>>>>>>>
>>>>>>> upTo: XXX strict: XXX ------ is recursive, it needs an extra 
>>>>>>> helper method to remember a parameter (Scheme recursion style)  
>>>>>>> ----> upTo: XXX strict: XXX posMemo: xxxx
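
The call chain above can be sketched in Python to show its shape. This is not the attached fileout: the function names, the tiny `CHUNK` size, the single-byte delimiter, and the choice to drop the delimiter from the result and answer `None` in strict mode are all assumptions made for the illustration.

```python
import io

CHUNK = 4  # deliberately tiny so the example has to recurse


def up_to(stream, delim):
    # Plain upTo: now just the non-strict case of the general method.
    return up_to_strict(stream, delim, strict=False)


def up_to_strict(stream, delim, strict):
    # Remember where this read started, then hand over to the helper.
    return _up_to(stream, delim, strict, pos_memo=stream.tell())


def _up_to(stream, delim, strict, pos_memo):
    chunk_start = stream.tell()
    buf = stream.read(CHUNK)
    if not buf:
        # End of stream reached without seeing the delimiter.
        if strict:
            stream.seek(pos_memo)  # undo the whole read
            return None
        return b""
    idx = buf.find(delim)
    if idx >= 0:
        stream.seek(chunk_start + idx + 1)  # just past the one-byte delimiter
        return buf[:idx]
    rest = _up_to(stream, delim, strict, pos_memo)
    # Pass None through untouched: never glue a partial record onto it.
    return None if rest is None else buf + rest


s = io.BytesIO(b"hello-1Ahello-2Ahel")
print(up_to_strict(s, b"A", strict=True))  # b'hello-1'
print(up_to_strict(s, b"A", strict=True))  # b'hello-2'
print(up_to_strict(s, b"A", strict=True))  # None
print(s.tell())                            # 16
print(up_to(s, b"A"))                      # b'hel'
```

The memo parameter is what lets the innermost call rewind to the start of the whole read, no matter how many chunks were consumed on the way.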
>>>>>>>
>>>>>>> See attached fileout
>>>>>>>
>>>>>>>
>>>>>>> bye
>>>>>>> Nicola
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On 10/21/21 19:49, Hernan Wilkinson wrote:
>>>>>>>> ok, let me know. I wish we could do it together but my agenda 
>>>>>>>> (and I guess yours) is almost always full...
>>>>>>>>
>>>>>>>> On Thu, Oct 21, 2021 at 2:32 PM Nicola Mingotti 
>>>>>>>> <nmingotti at gmail.com> wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>>     Hi Hernan,
>>>>>>>>
>>>>>>>>     OK, let me try; I have been talking about it for too many days.
>>>>>>>>
>>>>>>>>     I will let you know soon
>>>>>>>>
>>>>>>>>     bye
>>>>>>>>     Nicola
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>     On 10/21/21 19:02, Hernan Wilkinson wrote:
>>>>>>>>>     Hi Nicola,
>>>>>>>>>      if you could refactor upTo: to use the same code as
>>>>>>>>>     strictUpTo: and write the tests to check that everything
>>>>>>>>>     works as expected, that would be great!
>>>>>>>>>      I would not use the names of the Linux stdlib for those
>>>>>>>>>     messages nor the C functions, it is not necessary...
>>>>>>>>>      If you do not have the time to do it, I can give it a try
>>>>>>>>>     if you wish.
>>>>>>>>>
>>>>>>>>>     Cheers!
>>>>>>>>>     Hernan.
>>>>>>>>>
>>>>>>>>>     On Thu, Oct 21, 2021 at 12:47 PM Nicola Mingotti
>>>>>>>>>     <nmingotti at gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>         Hi Hernan,
>>>>>>>>>
>>>>>>>>>         . Forget the code and test. I can rewrite it from
>>>>>>>>>         scratch, with tests. I actually changed the
>>>>>>>>>         existing code out of "politeness" ;)
>>>>>>>>>
>>>>>>>>>         . For me it is very important to have this matter
>>>>>>>>>         fixed, properly and for the future.
>>>>>>>>>         It is not good to have standard library functionality
>>>>>>>>>         scattered across my application packages.
>>>>>>>>>
>>>>>>>>>         . Since I found that the Linux stdlib has functions
>>>>>>>>>         that do well what I want, I will use their names,
>>>>>>>>>         "getline" and "getdelim", to avoid confusion and to
>>>>>>>>>         reuse already existing function names.
>>>>>>>>>
>>>>>>>>>         . If you really dislike these functions I can put them
>>>>>>>>>         in OSProcess and maybe
>>>>>>>>>         just link the C version, only for Linux/BSD. That is how
>>>>>>>>>         valuable I think they are in the server environment.
>>>>>>>>>
>>>>>>>>>         . To fix this I need maybe 1-2 days. If I need to link
>>>>>>>>>         the C functions, I don't know, since I have never tried.
>>>>>>>>>
>>>>>>>>>         So, let me know: if you are not against these
>>>>>>>>>         functions, I am open to implementing them well.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>         ===== Extra considerations whose reading is secondary
>>>>>>>>>         ==================
>>>>>>>>>
>>>>>>>>>         . Your fix was a step in the right direction but not
>>>>>>>>>         enough: you also need to
>>>>>>>>>         bring the stream pointer back to the last existing $A.
>>>>>>>>>         That is to say: too complex.
>>>>>>>>>         A good method must do the whole chore, not leave us
>>>>>>>>>         the dirty work and the special cases.
>>>>>>>>>
>>>>>>>>>         . I understand the goals of concision, a small core,
>>>>>>>>>         etc. On the other hand, I
>>>>>>>>>         run Cuis on servers. The most important things
>>>>>>>>>         on servers are files and
>>>>>>>>>         sockets. You must read from them all of the time. It
>>>>>>>>>         must be easy and idiot-proof,
>>>>>>>>>         rock solid and as resistant to concurrent processing
>>>>>>>>>         as possible.
>>>>>>>>>
>>>>>>>>>         . I see that the Python and Ruby standard libraries do
>>>>>>>>>         it wrong: a bit better than Cuis 'upTo' does,
>>>>>>>>>         but still badly. They leave you the '\n' at the end,
>>>>>>>>>         but if another process is still writing
>>>>>>>>>         'f1.txt', Ruby and Python swallow the half-baked record!
>>>>>>>>>         -------- Linux
>>>>>>>>>         $> printf 'line-1\nline-2\nline-TRAP' > f1.txt
>>>>>>>>>         # python
>>>>>>>>>         $> python3.9 -c "f=open('f1.txt','r');
>>>>>>>>>         print(f.readlines())"
>>>>>>>>>         => ['line-1\n', 'line-2\n', 'line-TRAP']
>>>>>>>>>         # ruby
>>>>>>>>>         $> ruby -e "f=open('f1.txt','r'); puts
>>>>>>>>>         f.readlines().to_s;  "
>>>>>>>>>         => ["line-1\n", "line-2\n", "line-TRAP"]
>>>>>>>>>         # both Python and Ruby ate the half-baked record! Bad!
>>>>>>>>>         ---------------------------------------------------------
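
For comparison, a reader that refuses the half-written record can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a standard library feature: `read_complete_lines` is a made-up name, the file is opened in binary mode so that `tell`/`seek` are plain byte offsets, and the "other process" is simulated by a second file handle that appends to the same file.

```python
import os
import tempfile


def read_complete_lines(f):
    """Read lines from binary file f, stopping before an unterminated last line."""
    lines = []
    while True:
        start = f.tell()
        line = f.readline()
        if not line:
            break                  # clean end of file
        if not line.endswith(b"\n"):
            f.seek(start)          # leave the half-written record unread
            break
        lines.append(line)
    return lines


path = os.path.join(tempfile.mkdtemp(), "f1.txt")
with open(path, "wb") as w:
    w.write(b"line-1\nline-2\nline-TRAP")

with open(path, "rb") as f:
    first = read_complete_lines(f)    # [b'line-1\n', b'line-2\n']
    with open(path, "ab") as w:       # the writer finishes its record
        w.write(b" done\n")
    second = read_complete_lines(f)   # [b'line-TRAP done\n']
    print(first, second)
```

The partial record is neither returned nor skipped; it is picked up whole on the next call, once its terminator has been written.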
>>>>>>>>>
>>>>>>>>>         . The C and Common Lisp standard libraries have a way
>>>>>>>>>         to do it right:
>>>>>>>>>         -) CL read-line.
>>>>>>>>>         http://www.lispworks.com/documentation/HyperSpec/Body/f_rd_lin.htm#read-line
>>>>>>>>>
>>>>>>>>>         -) C getline.
>>>>>>>>>         https://man7.org/linux/man-pages/man3/getline.3.html
>>>>>>>>>
>>>>>>>>>         . I understand I am probably the only one running Cuis
>>>>>>>>>         on servers, so I am the first
>>>>>>>>>         to run into a few particular problems.
>>>>>>>>>
>>>>>>>>>         . In my opinion Cuis on the server can be a good
>>>>>>>>>         match; up to now I have 2 small
>>>>>>>>>         company services working and one big project in
>>>>>>>>>         continuous development.
>>>>>>>>>         Time will tell. Sturdiness, understandability and ease
>>>>>>>>>         of modification were my top priorities.
>>>>>>>>>         Up to now things are at least working.
>>>>>>>>>
>>>>>>>>>         ======================================================
>>>>>>>>>
>>>>>>>>>         bye
>>>>>>>>>         Nicola
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>         On 10/21/21 14:53, Hernan Wilkinson wrote:
>>>>>>>>>>         Hi Nicola,
>>>>>>>>>>          I see your point regarding the functionality of
>>>>>>>>>>         upTo:, but you can easily overcome that using
>>>>>>>>>>         #peekBack. Using your example:
>>>>>>>>>>         -----
>>>>>>>>>>         s _ 'hello-1Ahello-2Ahel'.
>>>>>>>>>>         '/tmp/test.txt' asFileEntry fileContents: s.
>>>>>>>>>>
>>>>>>>>>>         st1 _ '/tmp/test.txt' asFileEntry readStream .
>>>>>>>>>>
>>>>>>>>>>         st1 upTo: $A. " 'hello-1' "
>>>>>>>>>>         st1 upTo: $A. " 'hello-2' "
>>>>>>>>>>         st1 upTo: $A. " 'hel' "
>>>>>>>>>>         (st1 atEnd and: [ st1 peekBack ~= $A ]) ifTrue: [
>>>>>>>>>>         self error: 'End of file without delimiter' ].
>>>>>>>>>>         ------
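
The same after-the-fact check can be sketched in Python for readers who do not have a Cuis image at hand. Python streams have no `peekBack`, so `peek_back` and `at_end` below are hypothetical helpers written for this illustration, and `up_to` only mimics the non-strict behaviour described in this thread.

```python
import io


def up_to(stream, delim):
    """Non-strict read: everything up to delim, or up to the end of the stream."""
    out = bytearray()
    while (ch := stream.read(1)) and ch != delim:
        out += ch
    return bytes(out)


def peek_back(stream):
    """Return the byte just before the current position, without moving."""
    pos = stream.tell()
    if pos == 0:
        return b""
    stream.seek(pos - 1)
    return stream.read(1)


def at_end(stream):
    """True when no more bytes can be read; the position is left unchanged."""
    pos = stream.tell()
    ch = stream.read(1)
    stream.seek(pos)
    return ch == b""


s = io.BytesIO(b"hello-1Ahello-2Ahel")
records = [up_to(s, b"A") for _ in range(3)]
truncated = at_end(s) and peek_back(s) != b"A"
print(records, truncated)  # [b'hello-1', b'hello-2', b'hel'] True
```

The check tells the caller that the last record was not terminated, but only after that record has already been consumed.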
>>>>>>>>>>          Regarding my concern of adding this functionality to
>>>>>>>>>>         Cuis, we are trying to have a compact set of classes
>>>>>>>>>>         and methods to reduce complexity (or at least not
>>>>>>>>>>         increase it) and help newcomers to understand it and
>>>>>>>>>>         oldies to remember it :-) . We are also trying to add
>>>>>>>>>>         more and more tests because it is the only way to
>>>>>>>>>>         keep a system from becoming a legacy one and to
>>>>>>>>>>         reduce the fear it produces to change something.
>>>>>>>>>>          The strictUpTo:startPos: you are sending is almost a
>>>>>>>>>>         copy of the upTo: method, with a few lines changed.
>>>>>>>>>>         Even though the functionality makes sense (although
>>>>>>>>>>         right now you are the only one needing it and as I
>>>>>>>>>>         said, you can use peekBack to overcome it),
>>>>>>>>>>         adding that method adds repeated code which in the
>>>>>>>>>>         long term makes it more difficult to understand and
>>>>>>>>>>         maintain, even more because it does not have tests.
>>>>>>>>>>          So I hope you understand that as maintainers of
>>>>>>>>>>         Cuis, we want to be loyal to the goals I mentioned
>>>>>>>>>>         before and keep Cuis as clean and simple as possible.
>>>>>>>>>>         If you can refactor what you sent to avoid having
>>>>>>>>>>         repeated code with #upTo: and add tests that verify
>>>>>>>>>>         the functionality of both methods (strictUpTo: and
>>>>>>>>>>         upTo:), that will make our task easier and meet the
>>>>>>>>>>         goals we have. If you think this does not make sense
>>>>>>>>>>         to you, or you do not have the time to do it, it is
>>>>>>>>>>         completely understandable and in that case I suggest
>>>>>>>>>>         for you to have it as an extension of the
>>>>>>>>>>         StandardFileStream class or just use the peekBack
>>>>>>>>>>         message as I showed.
>>>>>>>>>>          I hope you understand my concern and agree with me.
>>>>>>>>>>         If not, please let me know.
>>>>>>>>>>
>>>>>>>>>>         Cheers!
>>>>>>>>>>         Hernan.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>         On Tue, Oct 19, 2021 at 10:32 AM Nicola Mingotti
>>>>>>>>>>         <nmingotti at gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>             Hi Hernan,
>>>>>>>>>>
>>>>>>>>>>             In all frankness, I would wipe out the old
>>>>>>>>>>             'upTo' because its behavior is a bit "wild".
>>>>>>>>>>
>>>>>>>>>>             On the other hand, I understand that doing so may
>>>>>>>>>>             break backward compatibility, which is why for
>>>>>>>>>>             the moment I propose to add a new method which
>>>>>>>>>>             behaves a bit better.
>>>>>>>>>>
>>>>>>>>>>             I hope this example explains the problem:
>>>>>>>>>>             -------------------------------------------------------
>>>>>>>>>>             s _ 'hello-1Ahello-2Ahel'.
>>>>>>>>>>             '/tmp/test.txt' asFileEntry fileContents: s.
>>>>>>>>>>
>>>>>>>>>>             st1 _ '/tmp/test.txt' asFileEntry readStream .
>>>>>>>>>>
>>>>>>>>>>             st1 upTo: $A. " 'hello-1' "
>>>>>>>>>>             st1 upTo: $A. " 'hello-2' "
>>>>>>>>>>             st1 upTo: $A. " 'hel' "         "(*)"
>>>>>>>>>>             ------------------------------------------------------
>>>>>>>>>>             (*) You cannot tell in any way whether you
>>>>>>>>>>             actually found an "A"-terminated block or just
>>>>>>>>>>             hit the end of file.
>>>>>>>>>>             (*) If you hit the end of file you eat an
>>>>>>>>>>             incomplete record. This is another problem: maybe
>>>>>>>>>>             another process
>>>>>>>>>>             was about to finish writing that record, but you
>>>>>>>>>>             will never know.
>>>>>>>>>>
>>>>>>>>>>             Maybe there is another method around that
>>>>>>>>>>             performs similarly to 'strictUpTo'; if there is, I
>>>>>>>>>>             did not find it, sorry.
>>>>>>>>>>
>>>>>>>>>>             IMHO, on a scale of importance from 0 to 10, this
>>>>>>>>>>             method, for a programmer, is >= 8.
>>>>>>>>>>             I would definitely not put it into an external
>>>>>>>>>>             package; it is too fundamental.
>>>>>>>>>>
>>>>>>>>>>             bye
>>>>>>>>>>             Nicola
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>             On 10/19/21 14:44, Hernan Wilkinson wrote:
>>>>>>>>>>>             Hi Nicola!
>>>>>>>>>>>              I was wondering, why are you suggesting adding
>>>>>>>>>>>             them to the base? Is it not enough to implement
>>>>>>>>>>>             them as an extension in your package?
>>>>>>>>>>>              Also, I think that any new functionality should
>>>>>>>>>>>             come with its corresponding tests to help the
>>>>>>>>>>>             maintenance and understanding of the functionality.
>>>>>>>>>>>
>>>>>>>>>>>             Cheers!
>>>>>>>>>>>             Hernan.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>             On Tue, Oct 19, 2021 at 7:04 AM Nicola Mingotti
>>>>>>>>>>>             via Cuis-dev <cuis-dev at lists.cuis.st> wrote:
>>>>>>>>>>>
>>>>>>>>>>>                 Hi Juan, guys,
>>>>>>>>>>>
>>>>>>>>>>>                 I would like to add to Cuis the 2 methods I
>>>>>>>>>>>                 attach here. One is a helper method.
>>>>>>>>>>>
>>>>>>>>>>>                 -----------
>>>>>>>>>>>                 StandardFileStream strictUpTo: delim.
>>>>>>>>>>>                 -----------
>>>>>>>>>>>
>>>>>>>>>>>                 Unlike 'upTo: delim', this method:
>>>>>>>>>>>                 1. Returns nothing if it does not find
>>>>>>>>>>>                 'delim'.
>>>>>>>>>>>                 2. Does not advance the position on the
>>>>>>>>>>>                 stream if it does not find 'delim'.
>>>>>>>>>>>                 3. If it finds 'delim', returns a chunk that
>>>>>>>>>>>                 includes it.
>>>>>>>>>>>
>>>>>>>>>>>                 I am parsing log files at the moment, and
>>>>>>>>>>>                 this is very useful for that.
>>>>>>>>>>>
>>>>>>>>>>>                 NOTE. Up to now I tested only on small files.
>>>>>>>>>>>
>>>>>>>>>>>                 bye
>>>>>>>>>>>                 Nicola
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>                 -- 
>>>>>>>>>>>                 Cuis-dev mailing list
>>>>>>>>>>>                 Cuis-dev at lists.cuis.st
>>>>>>>>>>>                 https://lists.cuis.st/mailman/listinfo/cuis-dev
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>
>>>>>> -- 
>>>>>> Juan Vuletich
>>>>>> www.cuis-smalltalk.org
>>>>>> https://github.com/Cuis-Smalltalk/Cuis-Smalltalk-Dev
>>>>>> https://github.com/jvuletich
>>>>>> https://www.linkedin.com/in/juan-vuletich-75611b3
>>>>>> @JuanVuletich
>>>>>
>>>>
>>>
>>>
>>
>
>