[Cuis-dev] fileout. proposed 2 new methods for strict file chunks reading
Nicola Mingotti
nmingotti at gmail.com
Fri Oct 22 17:26:22 PDT 2021
Hi Juan, give me a bit of time to read your references. I thought what I
sent were test methods;
clearly I am missing part of the story.
There shouldn't be any concatenation of nil and, for God's sake, NO partial
records.
This is what I wanted to avoid, apologies.
Tomorrow I will probably be out for Linux Day; I will update when
possible.
bye
Nicola
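
To make the intended behavior concrete, here is a minimal Python sketch (not Cuis code) of a terminator-based read as discussed in the thread quoted below: it answers a chunk only when the terminator is found, includes the terminator in the chunk, and otherwise answers nothing and leaves the stream position untouched. The name `strict_up_to` and the use of `io.StringIO` in place of a real file are assumptions made for illustration only.

```python
import io


def strict_up_to(stream, delim):
    """Hypothetical helper: read up to and including `delim`.

    Returns the chunk (terminator included) if `delim` is found.
    Returns None and restores the stream position if end of data is
    reached first, so a partial record is never consumed.
    """
    start = stream.tell()          # remember where this record begins
    chunk = []
    while True:
        ch = stream.read(1)
        if ch == "":               # end of data before the terminator
            stream.seek(start)     # put the partial record back
            return None
        chunk.append(ch)
        if ch == delim:
            return "".join(chunk)


# Same data as the example in the thread: two complete records, one partial.
s = io.StringIO("hello-1Ahello-2Ahel")
first = strict_up_to(s, "A")       # 'hello-1A'
second = strict_up_to(s, "A")      # 'hello-2A'
third = strict_up_to(s, "A")       # None, nothing consumed
pos_after_miss = s.tell()          # still 16, the start of 'hel'

# Simulate a writer finishing the record, then retry from the same position.
s.seek(0, 2)
s.write("lo-3A")
s.seek(pos_after_miss)
fourth = strict_up_to(s, "A")      # 'hello-3A'
```

Because an unterminated tail is never consumed, the same call can simply be retried once another process has finished writing the record.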
On 10/23/21 01:20, Juan Vuletich wrote:
> Hi Folks,
>
> The main point here is not "strict vs. legacy", "logically correct vs
> incorrect" or anything like that at all.
>
> The point is "separator vs. terminator", and how using a terminator
> instead of a separator allows processing files while they are still
> being written to. (And this has really no relation with running on a
> server or any other kind of machine.)
>
> Besides, Nicola, your code has a bug when recurring on terminator: it
> will answer the previous partial last record concatenated with nil.
>
> Finally, please take a look at TestCase, SUnit and
> BaseImageTests.pck.st to see what we mean by a "test".
>
> Thanks,
>
> On 10/22/2021 12:18 PM, Nicola Mingotti via Cuis-dev wrote:
>>
>> Hi Hernan,
>>
>> We will have opportunity to work together on larger problems, this is
>> too small.
>> It would take more time to talk than to do things ;)
>>
>> I have a proposed version. I rewrote the methods and wrote the test. I
>> kept a good part
>> of the original code, which may have evolved for efficiency over time.
>>
>> The upToLegacy method can of course be eliminated; it is there only for
>> reference.
>>
>> upTo: XXX --- now calls ---> upTo: XXX strict: false
>>
>> upTo: XXX strict: XXX ------ is recursive, it needs an extra helper
>> method to remember a parameter (Scheme recursion style) ----> upTo:
>> XXX strict: XXX posMemo: xxxx
>>
>> See attached fileout
>>
>>
>> bye
>> Nicola
>>
>>
>>
>>
>>
>> On 10/21/21 19:49, Hernan Wilkinson wrote:
>>> ok, let me know. I wish we could do it together but my agenda (and I
>>> guess yours) is almost always full...
>>>
>>> On Thu, Oct 21, 2021 at 2:32 PM Nicola Mingotti
>>> <nmingotti at gmail.com> wrote:
>>>
>>>
>>> Hi Hernan,
>>>
>>> ok, let me try, I have been talking about it for too many days.
>>>
>>> I will let you know soon
>>>
>>> bye
>>> Nicola
>>>
>>>
>>>
>>>
>>> On 10/21/21 19:02, Hernan Wilkinson wrote:
>>>> Hi Nicola,
>>>> if you could refactor upTo: to use the same code as
>>>> strictUpTo: and write the tests to check that everything works
>>>> as expected, that would be great!
>>>> I would not use the names of the Linux stdlib for those
>>>> messages nor the C functions, it is not necessary...
>>>> If you do not have the time to do it, I can give it a try if
>>>> you wish.
>>>>
>>>> Cheers!
>>>> Hernan.
>>>>
>>>> On Thu, Oct 21, 2021 at 12:47 PM Nicola Mingotti
>>>> <nmingotti at gmail.com> wrote:
>>>>
>>>>
>>>> Hi Hernan,
>>>>
>>>> . forget the code and test. I can rewrite it from scratch
>>>> with tests. I actually changed
>>>> existing code for "politeness" ;)
>>>>
>>>> . for me it is very important to have this matter fixed,
>>>> well and for the future.
>>>> It is not good to have standard lib functionality
>>>> scattered across my application packages.
>>>>
>>>> . since I found that the Linux stdlib has functions that do well what
>>>> I want, I will use those names
>>>> to avoid confusion and reuse already existing function
>>>> names: "getline" and "getdelim".
>>>>
>>>> . if you really dislike these functions I can put them in
>>>> OSProcess and maybe
>>>> just link the C versions, only for Linux/BSD. That is how valuable I think
>>>> they are in the server environment.
>>>>
>>>> . to fix this I need maybe 1-2 days. If I need to link the
>>>> C functions I don't know how long it will take, since I have never tried.
>>>>
>>>> So, let me know, if you are not against these functions I
>>>> am open to implement them well.
>>>>
>>>>
>>>> ===== Extra considerations whose reading is secondary
>>>> ==================
>>>>
>>>> . your fix was one step in the right direction but not
>>>> enough; you also need to
>>>> bring the stream pointer back to the last existing $A. That
>>>> is to say: too complex.
>>>> A good method must do all its chores, not leave us with the
>>>> dirty work and special conditions.
>>>>
>>>> . I understand the concision, small core etc. On the other
>>>> side, I
>>>> run Cuis on servers. The most important things
>>>> on servers are files and
>>>> sockets. You must read from them all of the time. It must
>>>> be easy and idiot proof,
>>>> rock solid and resistant to concurrent processing as far as
>>>> possible.
>>>>
>>>> . I see that the Python and Ruby standard libraries do it wrong,
>>>> a bit better than Cuis 'upTo' does,
>>>> but still badly. They leave you the '\n' at the end, but, if
>>>> any process goes on writing
>>>> 'f1.txt', Ruby and Python lose the half-baked record!
>>>> -------- Linux
>>>> $> printf 'line-1\nline-2\nline-TRAP' > f1.txt
>>>> # python
>>>> $> python3.9 -c "f=open('f1.txt','r'); print(f.readlines())"
>>>> => ['line-1\n', 'line-2\n', 'line-TRAP']
>>>> # ruby
>>>> $> ruby -e "f=open('f1.txt','r'); puts f.readlines().to_s; "
>>>> => ["line-1\n", "line-2\n", "line-TRAP"]
>>>> # both Python and Ruby ate the half-baked record! bad!
>>>> ---------------------------------------------------------
>>>>
>>>> . C and CommonLisp standard libraries have a way to do it
>>>> right:
>>>> -) CL read-line.
>>>> http://www.lispworks.com/documentation/HyperSpec/Body/f_rd_lin.htm#read-line
>>>>
>>>> -) C getline.
>>>> https://man7.org/linux/man-pages/man3/getline.3.html
>>>>
>>>> . I understand I am probably the only one running Cuis on
>>>> a server, so I am the first
>>>> to step into a few particular problems.
>>>>
>>>> . In my opinion Cuis on the server can be a good match; up
>>>> to now I have 2 small
>>>> company services working and one big project in
>>>> continuous development.
>>>> Time will tell. Sturdiness, understandability and ease of
>>>> modification were my top priorities.
>>>> Up to now things are at least working.
>>>>
>>>> ======================================================
>>>>
>>>> bye
>>>> Nicola
>>>>
>>>> On 10/21/21 14:53, Hernan Wilkinson wrote:
>>>>> Hi Nicola,
>>>>> I see your point regarding the functionality of upTo:,
>>>>> but you can easily overcome that using #peekBack. Using
>>>>> your example:
>>>>> -----
>>>>> s _ 'hello-1Ahello-2Ahel'.
>>>>> '/tmp/test.txt' asFileEntry fileContents: s.
>>>>>
>>>>> st1 _ '/tmp/test.txt' asFileEntry readStream .
>>>>>
>>>>> st1 upTo: $A. " 'hello-1' "
>>>>> st1 upTo: $A. " 'hello-2' "
>>>>> st1 upTo: $A. " 'hel' "
>>>>> (st1 atEnd and: [ st1 peekBack ~= $A ]) ifTrue: [ self
>>>>> error: 'End of file without delimiter' ].
>>>>> ------
>>>>> Regarding my concern of adding this functionality to
>>>>> Cuis, we are trying to have a compact set of classes and
>>>>> methods to reduce complexity (or at least not increase it)
>>>>> and help newcomers to understand it and oldies to remember
>>>>> it :-) . We are also trying to add more and more tests
>>>>> because it is the only way to keep a system from becoming
>>>>> a legacy one and to reduce the fear it produces to change
>>>>> something.
>>>>> The strictUpTo:startPos: you are sending is almost a copy
>>>>> of the upTo: method, with a few lines changed. Even though
>>>>> the functionality makes sense (although right now you are
>>>>> the only one needing it and as I said, you can
>>>>> use peekBack to overcome it), adding that method adds
>>>>> repeated code which in the long term makes it more
>>>>> difficult to understand and maintain, even more because it
>>>>> does not have tests.
>>>>> So I hope you understand that as maintainers of Cuis, we
>>>>> want to be loyal to the goals I mentioned before and keep
>>>>> Cuis as clean and simple as possible. If you can refactor
>>>>> what you sent to avoid having repeated code with #upTo:
>>>>> and add tests that verify the functionality of both
>>>>> methods (strictUpTo: and upTo:), that will make our task
>>>>> easier and meet the goals we have. If you think this does
>>>>> not make sense to you, or you do not have the time to do
>>>>> it, it is completely understandable and in that case I
>>>>> suggest for you to have it as an extension of the
>>>>> StandardFileStream class or just use the peekBack message
>>>>> as I showed.
>>>>> I hope you understand my concern and agree with me. If
>>>>> not, please let me know.
>>>>>
>>>>> Cheers!
>>>>> Hernan.
>>>>>
>>>>>
>>>>> On Tue, Oct 19, 2021 at 10:32 AM Nicola Mingotti
>>>>> <nmingotti at gmail.com> wrote:
>>>>>
>>>>>
>>>>> Hi Hernan,
>>>>>
>>>>> In all frankness, I would wipe out the old 'upTo'
>>>>> because its behavior is a bit "wild".
>>>>>
>>>>> On the other side, I understand it may create
>>>>> backward-compatibility problems; that is why for
>>>>> the moment I propose to add a new method which behaves
>>>>> a bit better.
>>>>>
>>>>> I hope this example explains the problem:
>>>>> -------------------------------------------------------
>>>>> s _ 'hello-1Ahello-2Ahel'.
>>>>> '/tmp/test.txt' asFileEntry fileContents: s.
>>>>>
>>>>> st1 _ '/tmp/test.txt' asFileEntry readStream .
>>>>>
>>>>> st1 upTo: $A. " 'hello-1' "
>>>>> st1 upTo: $A. " 'hello-2' "
>>>>> st1 upTo: $A. " 'hel' " "(*)"
>>>>> ------------------------------------------------------
>>>>> (*) You can't establish in any way if you actually
>>>>> found an "A" terminated block or just hit the end of file
>>>>> (*) If you hit the end of file you eat an incomplete
>>>>> record. This is another problem: maybe another process
>>>>> was going to finish writing that record, but you will
>>>>> never know.
>>>>>
>>>>> Maybe there is another method around that performs
>>>>> similarly to 'strictUpTo'; if there is, I did not find
>>>>> it, sorry.
>>>>>
>>>>> IMHO, on a scale of importance from 0 to 10, this
>>>>> method, for a programmer, rates >= 8.
>>>>> I would definitely not put it into an external
>>>>> package; it is too fundamental.
>>>>>
>>>>> bye
>>>>> Nicola
>>>>>
>>>>> On 10/19/21 14:44, Hernan Wilkinson wrote:
>>>>>> Hi Nicola!
>>>>>> I was wondering, why are you suggesting adding them
>>>>>> to the base? Is it not enough to implement them as an
>>>>>> extension in your package?
>>>>>> Also, I think that any new functionality should come
>>>>>> with its corresponding tests to help the maintenance
>>>>>> and understanding of the functionality.
>>>>>>
>>>>>> Cheers!
>>>>>> Hernan.
>>>>>>
>>>>>>
>>>>>> On Tue, Oct 19, 2021 at 7:04 AM Nicola Mingotti via
>>>>>> Cuis-dev <cuis-dev at lists.cuis.st> wrote:
>>>>>>
>>>>>> Hi Juan, guys,
>>>>>>
>>>>>> I would like to add to Cuis the 2 methods I
>>>>>> attach here. One is a helper method.
>>>>>>
>>>>>> -----------
>>>>>> StandardFileStream strictUpTo: delim.
>>>>>> -----------
>>>>>>
>>>>>> Differently from 'upTo: delim' this method:
>>>>>> 1. Does not return anything if it does not find 'delim'.
>>>>>> 2. Does not advance the position on the stream if
>>>>>> it does not find 'delim'.
>>>>>> 3. If it finds 'delim', returns a chunk that
>>>>>> includes it.
>>>>>>
>>>>>> I am parsing log files at the moment; this is
>>>>>> very useful.
>>>>>>
>>>>>> NOTE. Up to now I tested only on small files.
>>>>>>
>>>>>> bye
>>>>>> Nicola
>>>>>>
>
> --
> Juan Vuletich
> www.cuis-smalltalk.org
> https://github.com/Cuis-Smalltalk/Cuis-Smalltalk-Dev
> https://github.com/jvuletich
> https://www.linkedin.com/in/juan-vuletich-75611b3
> @JuanVuletich