[Cuis-dev] fileout. proposed 2 new methods for strict file chunks reading
Nicola Mingotti
nmingotti at gmail.com
Mon Oct 25 03:35:35 PDT 2021
Hi Juan,
1. I fixed the bug you found, added more test cases, and made them
symmetric between 'upTo' and 'upToStrict'. There are 2 files attached:
one with the tests and one collecting the changes to System-Files.
2. About the names 'terminator' and 'separator': I see your point, and I
am open to any naming scheme. My motivation for asking for this
enhancement of 'upTo' comes entirely from log parsing, so it would not
be inappropriate to name the boolean parameter something like
"logReaderMode". It is long, but easy to recognize for people involved
in this kind of work. To be honest, I don't dislike "strict" either.
bye
Nicola
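
For readers following the thread without the attachments, the behaviour
under discussion can be sketched roughly as follows. This is only an
illustration in Python, not the attached Smalltalk code: the function
name `up_to`, the `chunk_size` parameter, and the choice to include the
delimiter in strict mode (taken from the three rules of the original
'strictUpTo:' proposal quoted further down) are all assumptions.

```python
import io


def up_to(stream, delim, strict=False, chunk_size=2048):
    """Read from a seekable binary stream up to a single-byte `delim`.

    strict=False: answer the bytes before `delim` (the delimiter is
    consumed but not returned); at end of stream answer whatever is left.
    strict=True: answer the record including `delim`, or None when no
    `delim` is found, leaving the stream position unchanged.
    (Hypothetical sketch; names and defaults are assumptions.)
    """
    start = stream.tell()
    parts = []
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:                      # end of stream, no delimiter seen
            if strict:
                stream.seek(start)         # do not eat the partial record
                return None
            return b"".join(parts)
        idx = chunk.find(delim)
        if idx >= 0:
            parts.append(chunk[:idx])
            record = b"".join(parts)
            # Reposition just past the delimiter, whatever was over-read.
            stream.seek(start + len(record) + 1)
            return record + delim if strict else record
        parts.append(chunk)                # keep scanning the next chunk


stream = io.BytesIO(b"hello-1Ahello-2Ahel")
print(up_to(stream, b"A", strict=True))   # b'hello-1A'
print(up_to(stream, b"A", strict=True))   # b'hello-2A'
print(up_to(stream, b"A", strict=True))   # None; position stays at 16
```

Reading in fixed-size chunks is deliberate here: the case where the data
is longer than one chunk and contains no delimiter is the one where the
position must be restored to where the read started.
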
On 10/24/21 18:14, Juan Vuletich wrote:
> Hi Nicola,
>
> On 10/23/2021 6:56 PM, Nicola Mingotti via Cuis-dev wrote:
>>
>> Hi Juan,
>>
>> To the best of my current understanding, I can provide this:
>>
>> 1. Fileout for tests in BaseImageTests
>
> Much better!
>
>> 2. A few fileouts for new methods and method names
>
> I still think the focus should be on terminator vs. separator,
> especially on method and argument names. See
> https://en.wikipedia.org/wiki/Newline :
>
> "Interpretation
> Two ways to view newlines, both of which are self-consistent, are that
> newlines either separate lines or that they terminate lines. If a
> newline is considered a separator, there will be no newline after the
> last line of a file. Some programs have problems processing the last
> line of a file if it is not terminated by a newline. On the other
> hand, programs that expect newline to be used as a separator will
> interpret a final newline as starting a new (empty) line. Conversely,
> if a newline is considered a terminator, all text lines including the
> last are expected to be terminated by a newline. If the final
> character sequence in a text file is not a newline, the final line of
> the file may be considered to be an improper or incomplete text line,
> or the file may be considered to be improperly truncated. "
>
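
The two interpretations in the passage above can be sketched in a few
lines of Python. This is only an illustration; the function names are
hypothetical and are not part of Cuis or of the attached fileouts.

```python
def lines_as_separated(text, newline="\n"):
    """Newline separates lines: a trailing newline starts a new, empty line."""
    return text.split(newline)


def lines_as_terminated(text, newline="\n"):
    """Newline terminates lines: answer (complete_lines, incomplete_tail).

    The tail is '' when the text ends with a newline; otherwise it is the
    unterminated (possibly still being written) last line.
    """
    *complete, tail = text.split(newline)
    return complete, tail


text = "line-1\nline-2\nline-TRAP"
print(lines_as_separated(text))    # ['line-1', 'line-2', 'line-TRAP']
print(lines_as_terminated(text))   # (['line-1', 'line-2'], 'line-TRAP')
```

Under the terminator view the unfinished tail is kept apart from the
complete records, so a reader can leave it alone until its newline
arrives.
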
>> 3. I could not reproduce the bug you mention; maybe I did not
>> understand it well. If you could send me a failing example, it
>> would be helpful.
>>
>
> Sure. The following test fails even after fixing the obvious bug:
>
> testUpToStrict3
>     "The file holds well over 2000 characters and no $X,
>     so a strict read must answer nil."
>     | path fs read |
>     path := 'test-{1}.txt' format: {(Float pi * 10e10) floor}.
>     path asFileEntry fileContents: ((1 to: 100) inject: '' into: [
>         :prev :each | prev, 'A lot of stuff, needs over 2000 chars! ']).
>     fs := path asFileEntry readStream.
>     read := fs upTo: $X strict: true.
>     self assert: (read = nil).
>     fs close.
>
>>
>> bye
>> Nicola
>>
>
> Cheers,
>
>>
>> On 10/23/21 02:26, Nicola Mingotti wrote:
>>>
>>> Hi Juan, give me a bit of time to read your references. I thought
>>> what I sent were test methods;
>>> clearly I am missing part of the story.
>>>
>>> There shouldn't be any concatenation with nil and, for God's sake, NO
>>> partial records.
>>> This is exactly what I wanted to avoid; apologies.
>>>
>>> Tomorrow I will probably be out for Linux Day; I will send an update
>>> when possible.
>>>
>>>
>>> bye
>>> Nicola
>>>
>>>
>>>
>>>
>>> On 10/23/21 01:20, Juan Vuletich wrote:
>>>> Hi Folks,
>>>>
>>>> The main point here is not "strict vs. legacy", "logically correct
>>>> vs incorrect" or anything like that at all.
>>>>
>>>> The point is "separator vs. terminator", and how using a terminator
>>>> instead of a separator allows processing files while they are still
>>>> being written to. (And this has really no relation with running on
>>>> a server or any other kind of machine.)
>>>>
>>>> Besides, Nicola, your code has a bug when recurring on terminator:
>>>> it will answer the previous partial last record concatenated with nil.
>>>>
>>>> Finally, please take a look at TestCase, SUnit and
>>>> BaseImageTests.pck.st to see what we mean by a "test".
>>>>
>>>> Thanks,
>>>>
>>>> On 10/22/2021 12:18 PM, Nicola Mingotti via Cuis-dev wrote:
>>>>>
>>>>> Hi Hernan,
>>>>>
>>>>> We will have the opportunity to work together on larger problems;
>>>>> this one is too small.
>>>>> It would take more time to talk about it than to do it ;)
>>>>>
>>>>> I have a proposed version: I rewrote the methods and wrote the tests.
>>>>> I kept a good part
>>>>> of the original code, which may have evolved for efficiency over time.
>>>>>
>>>>> The upToLegacy method can of course be eliminated; it is there only
>>>>> for reference.
>>>>>
>>>>> upTo: XXX --- now calls ---> upTo: XXX strict: false
>>>>>
>>>>> upTo: XXX strict: XXX ------ is recursive, it needs an extra
>>>>> helper method to remember a parameter (Scheme recursion style)
>>>>> ----> upTo: XXX strict: XXX posMemo: xxxx
>>>>>
>>>>> See attached fileout
>>>>>
>>>>>
>>>>> bye
>>>>> Nicola
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On 10/21/21 19:49, Hernan Wilkinson wrote:
>>>>>> ok, let me know. I wish we could do it together but my agenda
>>>>>> (and I guess yours) is almost always full...
>>>>>>
>>>>>> On Thu, Oct 21, 2021 at 2:32 PM Nicola Mingotti
>>>>>> <nmingotti at gmail.com> wrote:
>>>>>>
>>>>>>
>>>>>> Hi Hernan,
>>>>>>
>>>>>> OK, let me try; I have been talking about it for too many days.
>>>>>>
>>>>>> I will let you know soon.
>>>>>>
>>>>>> bye
>>>>>> Nicola
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 10/21/21 19:02, Hernan Wilkinson wrote:
>>>>>>> Hi Nicolas,
>>>>>>> if you could refactor upTo: to use the same code as
>>>>>>> strictUpTo: and write the tests to check that everything
>>>>>>> works as expected, that would be great!
>>>>>>> I would not use the names of the Linux stdlib for those
>>>>>>> messages nor the C functions, it is not necessary...
>>>>>>> If you do not have the time to do it, I can give it a try
>>>>>>> if you wish.
>>>>>>>
>>>>>>> Cheers!
>>>>>>> Hernan.
>>>>>>>
>>>>>>> On Thu, Oct 21, 2021 at 12:47 PM Nicola Mingotti
>>>>>>> <nmingotti at gmail.com> wrote:
>>>>>>>
>>>>>>>
>>>>>>> Hi Hernan,
>>>>>>>
>>>>>>> . Forget the code and the test. I can rewrite it from
>>>>>>> scratch, with tests. I actually modified the
>>>>>>> existing code out of "politeness" ;)
>>>>>>>
>>>>>>> . For me it is very important to have this matter fixed
>>>>>>> properly and for good.
>>>>>>> It is not good to have standard library functionality
>>>>>>> scattered across my application packages.
>>>>>>>
>>>>>>> . Since I found that the Linux stdlib has functions that do
>>>>>>> well what I want, I will use their names, "getline" and
>>>>>>> "getdelim", to avoid confusion and reuse existing function
>>>>>>> names.
>>>>>>>
>>>>>>> . If you really dislike these functions, I can put them in
>>>>>>> OSProcess and maybe
>>>>>>> just link the C versions, only for Linux/BSD. That is how
>>>>>>> valuable I think they are in the server environment.
>>>>>>>
>>>>>>> . To fix this I need maybe 1-2 days. If I need to link
>>>>>>> the C functions, I don't know how long it will take, since
>>>>>>> I have never tried.
>>>>>>>
>>>>>>> So, let me know. If you are not against these functions,
>>>>>>> I am open to implementing them properly.
>>>>>>>
>>>>>>>
>>>>>>> ===== Extra considerations whose reading is secondary
>>>>>>> ==================
>>>>>>>
>>>>>>> . Your fix was a step in the right direction, but not
>>>>>>> enough: you also need to
>>>>>>> move the stream pointer back to the last existing $A.
>>>>>>> That is to say: too complex.
>>>>>>> A good method must do the whole job, not leave the
>>>>>>> dirty work and special cases to us.
>>>>>>>
>>>>>>> . I understand the goals of concision, a small core, etc.
>>>>>>> On the other hand, I
>>>>>>> run Cuis on servers. The most important things
>>>>>>> on servers are files and
>>>>>>> sockets, and you read from them all the time. Reading
>>>>>>> must be easy and idiot-proof,
>>>>>>> rock solid, and as resistant to concurrent processing
>>>>>>> as possible.
>>>>>>>
>>>>>>> . I see that the Python and Ruby standard libraries do it
>>>>>>> wrong too: a bit better than Cuis 'upTo',
>>>>>>> but still bad. They leave you the '\n' at the end, but
>>>>>>> if another process is still writing
>>>>>>> 'f1.txt', Ruby and Python swallow the half-baked record!
>>>>>>> -------- Linux
>>>>>>> $> printf 'line-1\nline-2\nline-TRAP' > f1.txt
>>>>>>> # python
>>>>>>> $> python3.9 -c "f=open('f1.txt','r'); print(f.readlines())"
>>>>>>> => ['line-1\n', 'line-2\n', 'line-TRAP']
>>>>>>> # ruby
>>>>>>> $> ruby -e "f=open('f1.txt','r'); puts f.readlines().to_s; "
>>>>>>> => ["line-1\n", "line-2\n", "line-TRAP"]
>>>>>>> # both Python and Ruby ate the half-baked record! Bad!
>>>>>>> ---------------------------------------------------------
>>>>>>>
>>>>>>> . The C and Common Lisp standard libraries have a way to do
>>>>>>> it right:
>>>>>>> -) CL read-line.
>>>>>>> http://www.lispworks.com/documentation/HyperSpec/Body/f_rd_lin.htm#read-line
>>>>>>>
>>>>>>> -) C getline.
>>>>>>> https://man7.org/linux/man-pages/man3/getline.3.html
>>>>>>>
>>>>>>> . I understand I am probably the only one running Cuis
>>>>>>> on a server, so I am the first
>>>>>>> to run into a few particular problems.
>>>>>>>
>>>>>>> . In my opinion Cuis on the server can be a good match.
>>>>>>> Up to now I have 2 small
>>>>>>> company services working and one big project in
>>>>>>> continuous development.
>>>>>>> Time will tell. Sturdiness, understandability and ease of
>>>>>>> modification were my top priorities.
>>>>>>> Up to now things are at least working.
>>>>>>>
>>>>>>> ======================================================
>>>>>>>
>>>>>>> bye
>>>>>>> Nicola
>>>>>>>
>>>>>>>
>>>>>>> On 10/21/21 14:53, Hernan Wilkinson wrote:
>>>>>>>> Hi Nicola,
>>>>>>>> I see your point regarding the functionality of upTo:,
>>>>>>>> but you can easily overcome that using #peekBack. Using
>>>>>>>> your example:
>>>>>>>> -----
>>>>>>>> s _ 'hello-1Ahello-2Ahel'.
>>>>>>>> '/tmp/test.txt' asFileEntry fileContents: s.
>>>>>>>>
>>>>>>>> st1 _ '/tmp/test.txt' asFileEntry readStream .
>>>>>>>>
>>>>>>>> st1 upTo: $A. " 'hello-1' "
>>>>>>>> st1 upTo: $A. " 'hello-2' "
>>>>>>>> st1 upTo: $A. " 'hel' "
>>>>>>>> (st1 atEnd and: [ st1 peekBack ~= $A ]) ifTrue: [ self
>>>>>>>> error: 'End of file without delimiter' ].
>>>>>>>> ------
>>>>>>>> Regarding my concern of adding this functionality to
>>>>>>>> Cuis, we are trying to have a compact set of classes
>>>>>>>> and methods to reduce complexity (or at least not
>>>>>>>> increase it) and help newcomers to understand it and
>>>>>>>> oldies to remember it :-) . We are also trying to add
>>>>>>>> more and more tests because it is the only way to keep
>>>>>>>> a system from becoming a legacy one and to reduce the
>>>>>>>> fear it produces to change something.
>>>>>>>> The strictUpTo:startPos: you are sending is almost a
>>>>>>>> copy of the upTo: method, with a few lines changed.
>>>>>>>> Even though the functionality makes sense (although
>>>>>>>> right now you are the only one needing it and as I
>>>>>>>> said, you can use peekBack to overcome it), adding that
>>>>>>>> method adds repeated code which in the long term makes
>>>>>>>> it more difficult to understand and maintain, even more
>>>>>>>> because it does not have tests.
>>>>>>>> So I hope you understand that as maintainers of Cuis,
>>>>>>>> we want to be loyal to the goals I mentioned before and
>>>>>>>> keep Cuis as clean and simple as possible. If you can
>>>>>>>> refactor what you sent to avoid having repeated code
>>>>>>>> with #upTo: and add tests that verify the functionality
>>>>>>>> of both methods (strictUpTo: and upTo:), that will make
>>>>>>>> our task easier and meet the goals we have. If you
>>>>>>>> think this does not make sense to you, or you do not
>>>>>>>> have the time to do it, it is completely understandable
>>>>>>>> and in that case I suggest for you to have it as an
>>>>>>>> extension of the StandardFileStream class or just use
>>>>>>>> the peekBack message as I showed.
>>>>>>>> I hope you understand my concern and agree with me. If
>>>>>>>> not, please let me know.
>>>>>>>>
>>>>>>>> Cheers!
>>>>>>>> Hernan.
>>>>>>>>
>>>>>>>>
>>>>>>>> On Tue, Oct 19, 2021 at 10:32 AM Nicola Mingotti
>>>>>>>> <nmingotti at gmail.com> wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>> Hi Hernan,
>>>>>>>>
>>>>>>>> In all frankness, I would wipe out the old
>>>>>>>> 'upTo' because its behavior is a bit "wild".
>>>>>>>>
>>>>>>>> On the other hand, I understand that it may create
>>>>>>>> backward-compatibility problems; that is why, for
>>>>>>>> the moment, I propose adding a new method that
>>>>>>>> behaves a bit better.
>>>>>>>>
>>>>>>>> I hope this example explains the problem:
>>>>>>>> -------------------------------------------------------
>>>>>>>> s _ 'hello-1Ahello-2Ahel'.
>>>>>>>> '/tmp/test.txt' asFileEntry fileContents: s.
>>>>>>>>
>>>>>>>> st1 _ '/tmp/test.txt' asFileEntry readStream .
>>>>>>>>
>>>>>>>> st1 upTo: $A. " 'hello-1' "
>>>>>>>> st1 upTo: $A. " 'hello-2' "
>>>>>>>> st1 upTo: $A. " 'hel' " "(*)"
>>>>>>>> ------------------------------------------------------
>>>>>>>> (*) You cannot tell in any way whether you actually
>>>>>>>> found an "A"-terminated block or just hit the end
>>>>>>>> of file.
>>>>>>>> (*) If you hit the end of file, you consume an
>>>>>>>> incomplete record. This is another problem: maybe
>>>>>>>> another process was about to finish writing that
>>>>>>>> record, but you will never know.
>>>>>>>>
>>>>>>>> Maybe there is another method around that performs
>>>>>>>> similarly to 'strictUpTo'; if there is, I did not
>>>>>>>> find it, sorry.
>>>>>>>>
>>>>>>>> IMHO, on a scale of importance from 0 to 10, this
>>>>>>>> method rates, for a programmer, >= 8.
>>>>>>>> I would definitely not put it into an external
>>>>>>>> package; it is too fundamental.
>>>>>>>>
>>>>>>>> bye
>>>>>>>> Nicola
>>>>>>>>
>>>>>>>>
>>>>>>>> On 10/19/21 14:44, Hernan Wilkinson wrote:
>>>>>>>>> Hi Nicola!
>>>>>>>>> I was wondering, why are you suggesting adding
>>>>>>>>> them to the base? Is it not enough to implement
>>>>>>>>> them as an extension in your package?
>>>>>>>>> Also, I think that any new functionality should
>>>>>>>>> come with its corresponding tests to help the
>>>>>>>>> maintenance and understanding of the functionality.
>>>>>>>>>
>>>>>>>>> Cheers!
>>>>>>>>> Hernan.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Tue, Oct 19, 2021 at 7:04 AM Nicola Mingotti
>>>>>>>>> via Cuis-dev <cuis-dev at lists.cuis.st> wrote:
>>>>>>>>>
>>>>>>>>> Hi Juan, guys,
>>>>>>>>>
>>>>>>>>> I would like to add to Cuis the 2 methods I
>>>>>>>>> attach here. One is a helper method.
>>>>>>>>>
>>>>>>>>> -----------
>>>>>>>>> StandardFileStream strictUpTo: delim.
>>>>>>>>> -----------
>>>>>>>>>
>>>>>>>>> Unlike 'upTo: delim', this method:
>>>>>>>>> 1. Does not return anything if it does not find
>>>>>>>>> 'delim'.
>>>>>>>>> 2. Does not advance the position on the stream
>>>>>>>>> if it does not find 'delim'.
>>>>>>>>> 3. If it finds 'delim', returns a chunk that
>>>>>>>>> includes it.
>>>>>>>>>
>>>>>>>>> I am parsing log files at the moment, and this is
>>>>>>>>> very useful.
>>>>>>>>>
>>>>>>>>> NOTE. Up to now I tested only on small files.
>>>>>>>>>
>>>>>>>>> bye
>>>>>>>>> Nicola
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Cuis-dev mailing list
>>>>>>>>> Cuis-dev at lists.cuis.st
>>>>>>>>> https://lists.cuis.st/mailman/listinfo/cuis-dev
>>>>>>>>>
>>>>>>>>>
>>>>
>>>> --
>>>> Juan Vuletich
>>>> www.cuis-smalltalk.org
>>>> https://github.com/Cuis-Smalltalk/Cuis-Smalltalk-Dev
>>>> https://github.com/jvuletich
>>>> https://www.linkedin.com/in/juan-vuletich-75611b3
>>>> @JuanVuletich
>>>
>>
>
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: StandardFileStreamTest.st
Type: application/vnd.sailingtracker.track
Size: 11015 bytes
Desc: not available
URL: <http://lists.cuis.st/mailman/archives/cuis-dev/attachments/20211025/8a697562/attachment-0002.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: NModifUpTo.pck.st
Type: application/vnd.sailingtracker.track
Size: 2971 bytes
Desc: not available
URL: <http://lists.cuis.st/mailman/archives/cuis-dev/attachments/20211025/8a697562/attachment-0003.bin>