[Cuis-dev] fileout. proposed 2 new methods for strict file chunks reading
Juan Vuletich
juan at jvuletich.org
Sun Oct 24 09:14:05 PDT 2021
Hi Nicola,
On 10/23/2021 6:56 PM, Nicola Mingotti via Cuis-dev wrote:
>
> Hi Juan,
>
> To the best of my current understanding, I can provide this:
>
> 1. Fileout for tests in BaseImageTests
Much better!
> 2. A few fileout for new methods and method names
I still think the focus should be on terminator vs. separator,
especially on method and argument names. See
https://en.wikipedia.org/wiki/Newline :
"Interpretation
Two ways to view newlines, both of which are self-consistent, are that
newlines either separate lines or that they terminate lines. If a
newline is considered a separator, there will be no newline after the
last line of a file. Some programs have problems processing the last
line of a file if it is not terminated by a newline. On the other hand,
programs that expect newline to be used as a separator will interpret a
final newline as starting a new (empty) line. Conversely, if a newline
is considered a terminator, all text lines including the last are
expected to be terminated by a newline. If the final character sequence
in a text file is not a newline, the final line of the file may be
considered to be an improper or incomplete text line, or the file may be
considered to be improperly truncated. "
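To make the distinction concrete, here is a minimal sketch. It is written in Python rather than Smalltalk purely for illustration, and the function names are hypothetical; nothing here is part of Cuis. It shows how the same data yields different records depending on whether the delimiter is read as a separator or as a terminator.

```python
def records_as_separator(data, delim):
    """Separator view: every piece between delimiters counts as a record,
    including whatever follows the last delimiter (possibly empty)."""
    return data.split(delim)


def records_as_terminator(data, delim):
    """Terminator view: only pieces that end with the delimiter are records.
    Returns (complete_records, unterminated_tail); the tail is '' when the
    data ends exactly on a delimiter."""
    *complete, tail = data.split(delim)
    return complete, tail


data = 'line-1\nline-2\nline-TRAP'
print(records_as_separator(data, '\n'))
print(records_as_terminator(data, '\n'))
```

Under the terminator view the trailing 'line-TRAP' is kept apart as an incomplete tail, so a reader can leave it alone until the writer finishes it.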
> 3. I could not replicate the bug you mention; maybe I did not
> understand it well. If you could send me
> a failing example it would be helpful.
>
Sure. The following test fails even after fixing the obvious bug:
testUpToStrict3
    | path fs read |
    path := 'test-{1}.txt' format: {(Float pi * 10e10) floor. } .
    path asFileEntry fileContents: ((1 to: 100) inject: '' into: [
        :prev :each | prev, 'A lot of stuff, needs over 2000 chars! ']).
    fs := path asFileEntry readStream.
    read := fs upTo: $X strict: true.
    self assert: (read = nil).
    fs close.
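For comparison, here is a rough sketch of the behaviour this test expects, again in Python for illustration only. The name up_to_strict, the chunk_size parameter, and the choice to answer the record without its delimiter are all assumptions of the sketch, not the proposed Cuis implementation. It reads in fixed-size chunks so that data longer than one chunk is exercised, and it restores the stream position when no terminator is found.

```python
import io


def up_to_strict(stream, delim, chunk_size=2048):
    """Answer the bytes before the next single-byte delim and consume the
    delim. If no delim is found before end of data, answer None and leave
    the stream position where it was (no partial record is consumed)."""
    start = stream.tell()
    pieces = []
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            # End of data without a terminator: undo everything we read.
            stream.seek(start)
            return None
        index = chunk.find(delim)
        if index >= 0:
            pieces.append(chunk[:index])
            record = b''.join(pieces)
            # Reposition just past the delimiter, dropping any read-ahead.
            stream.seek(start + len(record) + len(delim))
            return record
        pieces.append(chunk)


# Same shape as the test above: over 2000 bytes and no $X anywhere.
fs = io.BytesIO(b'A lot of stuff, needs over 2000 chars! ' * 100)
print(up_to_strict(fs, b'X'), fs.tell())
```

The point of the sketch is that the "not found" case must be decided only after all chunks have been scanned, and must not leave any partially read data behind.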
>
> bye
> Nicola
>
Cheers,
>
> On 10/23/21 02:26, Nicola Mingotti wrote:
>>
>> Hi Juan, give me a bit of time to read your references. I thought what
>> I sent were test methods;
>> clearly I am missing part of the story.
>>
>> There shouldn't be any concatenation of nil and, for God's sake, NO
>> partial records.
>> This is what I wanted to avoid, apologies.
>>
>> Tomorrow I will probably be out for the Linux Day; I will update when
>> possible.
>>
>>
>> bye
>> Nicola
>>
>>
>>
>>
>> On 10/23/21 01:20, Juan Vuletich wrote:
>>> Hi Folks,
>>>
>>> The main point here is not "strict vs. legacy", "logically correct
>>> vs incorrect" or anything like that at all.
>>>
>>> The point is "separator vs. terminator", and how using a terminator
>>> instead of a separator allows processing files while they are still
>>> being written to. (And this has really no relation with running on a
>>> server or any other kind of machine.)
>>>
>>> Besides, Nicola, your code has a bug when recurring on terminator:
>>> it will answer the previous partial last record concatenated with nil.
>>>
>>> Finally, please take a look at TestCase, SUnit and
>>> BaseImageTests.pck.st to see what we mean by a "test".
>>>
>>> Thanks,
>>>
>>> On 10/22/2021 12:18 PM, Nicola Mingotti via Cuis-dev wrote:
>>>>
>>>> Hi Hernan,
>>>>
>>>> We will have opportunity to work together on larger problems, this
>>>> is too small.
>>>> It would take more time to talk than to do things ;)
>>>>
>>>> I have a proposed version. I rewrote the methods and wrote the test. I
>>>> kept a good part
>>>> of the original code, which may have evolved for efficiency over time.
>>>>
>>>> The upToLegacy method can of course be eliminated. It is there only for
>>>> reference.
>>>>
>>>> upTo: XXX --- now calls ---> upTo: XXX strict: false
>>>>
>>>> upTo: XXX strict: XXX ------ is recursive, it needs an extra helper
>>>> method to remember a parameter (Scheme recursion style) ---->
>>>> upTo: XXX strict: XXX posMemo: xxxx
>>>>
>>>> See attached fileout
>>>>
>>>>
>>>> bye
>>>> Nicola
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On 10/21/21 19:49, Hernan Wilkinson wrote:
>>>>> ok, let me know. I wish we could do it together but my agenda (and
>>>>> I guess yours) is almost always full...
>>>>>
>>>>> On Thu, Oct 21, 2021 at 2:32 PM Nicola Mingotti
>>>>> <nmingotti at gmail.com> wrote:
>>>>>
>>>>>
>>>>> Hi Hernan,
>>>>>
>>>>> OK, let me try; I have been talking about it for too many days.
>>>>>
>>>>> I will let you know soon
>>>>>
>>>>> bye
>>>>> Nicola
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On 10/21/21 19:02, Hernan Wilkinson wrote:
>>>>>> Hi Nicola,
>>>>>> if you could refactor upTo: to use the same code as
>>>>>> strictUpTo: and write the tests to check that everything
>>>>>> works as expected, that would be great!
>>>>>> I would not use the names of the Linux stdlib for those
>>>>>> messages nor the C functions, it is not necessary...
>>>>>> If you do not have the time to do it, I can give it a try if
>>>>>> you wish.
>>>>>>
>>>>>> Cheers!
>>>>>> Hernan.
>>>>>>
>>>>>> On Thu, Oct 21, 2021 at 12:47 PM Nicola Mingotti
>>>>>> <nmingotti at gmail.com> wrote:
>>>>>>
>>>>>>
>>>>>> Hi Hernan,
>>>>>>
>>>>>> . forget the code and test. I can rewrite it from scratch
>>>>>> with tests. I actually changed
>>>>>> existing code for "politeness" ;)
>>>>>>
>>>>>> . for me it is very important to have this matter fixed
>>>>>> well, and for the future.
>>>>>> It is not good to have standard lib functionality
>>>>>> scattered across my application packages.
>>>>>>
>>>>>> . since I found that the Linux stdlib has functions that do
>>>>>> what I want well, I will use those names
>>>>>> to avoid confusion and reuse already existing function
>>>>>> names: "getline" and "getdelim".
>>>>>>
>>>>>> . if you really dislike these functions I can put them in
>>>>>> OSProcess and maybe
>>>>>> just link the C version only for Linux/BSD. That is how
>>>>>> valuable I think they are in the server environment.
>>>>>>
>>>>>> . to fix this I need maybe 1-2 days. If I need to link
>>>>>> the C functions, I don't know, since I have never tried.
>>>>>>
>>>>>> So, let me know; if you are not against these functions I
>>>>>> am open to implementing them well.
>>>>>>
>>>>>>
>>>>>> ===== Extra considerations whose reading is secondary
>>>>>> ==================
>>>>>>
>>>>>> . your fix was one step in the right direction but not
>>>>>> enough; you also need to
>>>>>> bring the stream pointer back to the last existing $A.
>>>>>> That is to say: too complex.
>>>>>> A good method must do its whole job, not leave us
>>>>>> the dirty work and the special cases.
>>>>>>
>>>>>> . I understand the concision, small core etc. On the
>>>>>> other hand, I
>>>>>> run Cuis on servers. The most important things
>>>>>> on servers are files and
>>>>>> sockets. You must read from them all the time. It
>>>>>> must be easy and idiot-proof,
>>>>>> rock solid and resistant to concurrent processing as far
>>>>>> as possible.
>>>>>>
>>>>>> . I see that the Python and Ruby standard libraries do it
>>>>>> wrong, a bit better than Cuis 'upTo' does,
>>>>>> but still badly. They leave you the '\n' at the end, but
>>>>>> if another process is still writing
>>>>>> 'f1.txt', Ruby and Python consume the half-baked record!
>>>>>> -------- Linux
>>>>>> $> printf 'line-1\nline-2\nline-TRAP' > f1.txt
>>>>>> # python
>>>>>> $> python3.9 -c "f=open('f1.txt','r'); print(f.readlines())"
>>>>>> => ['line-1\n', 'line-2\n', 'line-TRAP']
>>>>>> # ruby
>>>>>> $> ruby -e "f=open('f1.txt','r'); puts f.readlines().to_s; "
>>>>>> => ["line-1\n", "line-2\n", "line-TRAP"]
>>>>>> # both Python and Ruby ate the half-baked record! bad!
>>>>>> ---------------------------------------------------------
>>>>>>
>>>>>> . C and CommonLisp standard libraries have a way to do it
>>>>>> right:
>>>>>> -) CL read-line.
>>>>>> http://www.lispworks.com/documentation/HyperSpec/Body/f_rd_lin.htm#read-line
>>>>>>
>>>>>> -) C getline.
>>>>>> https://man7.org/linux/man-pages/man3/getline.3.html
>>>>>>
>>>>>> . I understand I am probably the only one running Cuis on
>>>>>> a server, so I am the first
>>>>>> to step into a few particular problems.
>>>>>>
>>>>>> . In my opinion Cuis on the server can be a good match;
>>>>>> up to now I have 2 small
>>>>>> company services working and one big project in
>>>>>> continuous development.
>>>>>> Time will tell. Sturdiness, understandability and ease of
>>>>>> modification were my top priorities.
>>>>>> Up to now things are at least working.
>>>>>>
>>>>>> ======================================================
>>>>>>
>>>>>> bye
>>>>>> Nicola
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 10/21/21 14:53, Hernan Wilkinson wrote:
>>>>>>> Hi Nicola,
>>>>>>> I see your point regarding the functionality of upTo:,
>>>>>>> but you can easily overcome that using #peekBack. Using
>>>>>>> your example:
>>>>>>> -----
>>>>>>> s _ 'hello-1Ahello-2Ahel'.
>>>>>>> '/tmp/test.txt' asFileEntry fileContents: s.
>>>>>>>
>>>>>>> st1 _ '/tmp/test.txt' asFileEntry readStream .
>>>>>>>
>>>>>>> st1 upTo: $A. " 'hello-1' "
>>>>>>> st1 upTo: $A. " 'hello-2' "
>>>>>>> st1 upTo: $A. " 'hel' "
>>>>>>> (st1 atEnd and: [ st1 peekBack ~= $A ]) ifTrue: [ self
>>>>>>> error: 'End of file without delimiter' ].
>>>>>>> ------
>>>>>>> Regarding my concern of adding this functionality to
>>>>>>> Cuis, we are trying to have a compact set of classes and
>>>>>>> methods to reduce complexity (or at least not increase
>>>>>>> it) and help newcomers to understand it and oldies to
>>>>>>> remember it :-) . We are also trying to add more and
>>>>>>> more tests because it is the only way to keep a system
>>>>>>> from becoming a legacy one and to reduce the fear it
>>>>>>> produces to change something.
>>>>>>> The strictUpTo:startPos: you are sending is almost a
>>>>>>> copy of the upTo: method, with a few lines changed.
>>>>>>> Even though the functionality makes sense (although
>>>>>>> right now you are the only one needing it and as I said,
>>>>>>> you can use peekBack to overcome it), adding that method
>>>>>>> adds repeated code which in the long term makes it more
>>>>>>> difficult to understand and maintain, even more because
>>>>>>> it does not have tests.
>>>>>>> So I hope you understand that as maintainers of Cuis,
>>>>>>> we want to be loyal to the goals I mentioned before and
>>>>>>> keep Cuis as clean and simple as possible. If you can
>>>>>>> refactor what you sent to avoid having repeated code
>>>>>>> with #upTo: and add tests that verify the functionality
>>>>>>> of both methods (strictUpTo: and upTo:), that will make
>>>>>>> our task easier and meet the goals we have. If you think
>>>>>>> this does not make sense to you, or you do not have the
>>>>>>> time to do it, it is completely understandable and in
>>>>>>> that case I suggest for you to have it as an extension
>>>>>>> of the StandardFileStream class or just use the peekBack
>>>>>>> message as I showed.
>>>>>>> I hope you understand my concern and agree with me. If
>>>>>>> not, please let me know.
>>>>>>>
>>>>>>> Cheers!
>>>>>>> Hernan.
>>>>>>>
>>>>>>>
>>>>>>> On Tue, Oct 19, 2021 at 10:32 AM Nicola Mingotti
>>>>>>> <nmingotti at gmail.com> wrote:
>>>>>>>
>>>>>>>
>>>>>>> Hi Hernan,
>>>>>>>
>>>>>>> In all frankness, I would wipe out the old 'upTo'
>>>>>>> because its behavior is a bit "wild".
>>>>>>>
>>>>>>> On the other hand, I understand that this may create
>>>>>>> backward-compatibility problems; that is why for
>>>>>>> the moment I propose to add a new method which
>>>>>>> behaves a bit better.
>>>>>>>
>>>>>>> I hope this example explains the problem:
>>>>>>> -------------------------------------------------------
>>>>>>> s _ 'hello-1Ahello-2Ahel'.
>>>>>>> '/tmp/test.txt' asFileEntry fileContents: s.
>>>>>>>
>>>>>>> st1 _ '/tmp/test.txt' asFileEntry readStream .
>>>>>>>
>>>>>>> st1 upTo: $A. " 'hello-1' "
>>>>>>> st1 upTo: $A. " 'hello-2' "
>>>>>>> st1 upTo: $A. " 'hel' " "(*)"
>>>>>>> ------------------------------------------------------
>>>>>>> (*) You can't establish in any way whether you actually
>>>>>>> found an "A"-terminated block or just hit the end of
>>>>>>> file.
>>>>>>> (*) If you hit the end of file you eat an incomplete
>>>>>>> record. This is another problem: maybe another process
>>>>>>> was about to finish writing that record, but you will
>>>>>>> never know.
>>>>>>>
>>>>>>> Maybe there is another method around that performs
>>>>>>> similarly to 'strictUpTo'; if there is, I did not
>>>>>>> find it, sorry.
>>>>>>>
>>>>>>> IMHO, on a scale of importance from 0 to 10, this
>>>>>>> method, for a programmer, is >= 8.
>>>>>>> I would definitely not put it into an external
>>>>>>> package; it is too fundamental.
>>>>>>>
>>>>>>> bye
>>>>>>> Nicola
>>>>>>>
>>>>>>> On 10/19/21 14:44, Hernan Wilkinson wrote:
>>>>>>>> Hi Nicola!
>>>>>>>> I was wondering, why are you suggesting adding
>>>>>>>> them to the base? Is it not enough to implement
>>>>>>>> them as an extension in your package?
>>>>>>>> Also, I think that any new functionality should
>>>>>>>> come with its corresponding tests to help the
>>>>>>>> maintenance and understanding of the functionality.
>>>>>>>>
>>>>>>>> Cheers!
>>>>>>>> Hernan.
>>>>>>>>
>>>>>>>>
>>>>>>>> On Tue, Oct 19, 2021 at 7:04 AM Nicola Mingotti via
>>>>>>>> Cuis-dev <cuis-dev at lists.cuis.st> wrote:
>>>>>>>>
>>>>>>>> Hi Juan, guys,
>>>>>>>>
>>>>>>>> I would like to add to Cuis the 2 methods I
>>>>>>>> attach here. One is a helper method.
>>>>>>>>
>>>>>>>> -----------
>>>>>>>> StandardFileStream strictUpTo: delim.
>>>>>>>> -----------
>>>>>>>>
>>>>>>>> Unlike 'upTo: delim', this method:
>>>>>>>> 1. Does not return anything if it does not find
>>>>>>>> 'delim'.
>>>>>>>> 2. Does not advance the position on the stream
>>>>>>>> if it does not find 'delim'.
>>>>>>>> 3. If it finds 'delim', returns a chunk that
>>>>>>>> includes it.
>>>>>>>>
>>>>>>>> I am parsing log files at the moment; this is
>>>>>>>> very useful.
>>>>>>>>
>>>>>>>> NOTE. Up to now I tested only on small files.
>>>>>>>>
>>>>>>>> bye
>>>>>>>> Nicola
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> Cuis-dev mailing list
>>>>>>>> Cuis-dev at lists.cuis.st
>>>>>>>> https://lists.cuis.st/mailman/listinfo/cuis-dev
>>>>>>>>
>>>>>>>>
>>>
>>> --
>>> Juan Vuletich
>>> www.cuis-smalltalk.org
>>> https://github.com/Cuis-Smalltalk/Cuis-Smalltalk-Dev
>>> https://github.com/jvuletich
>>> https://www.linkedin.com/in/juan-vuletich-75611b3
>>> @JuanVuletich
>>
>
--
Juan Vuletich
www.cuis-smalltalk.org
https://github.com/Cuis-Smalltalk/Cuis-Smalltalk-Dev
https://github.com/jvuletich
https://www.linkedin.com/in/juan-vuletich-75611b3
@JuanVuletich