[Cuis-dev] UTF-8 Unicode editors

Juan Vuletich JuanVuletich at zoho.com
Mon May 2 06:10:29 PDT 2022


On 5/1/2022 1:59 AM, Douglas Brebner via Cuis-dev wrote:
> On 30/04/2022 02:55, Juan Vuletich via Cuis-dev wrote:
>> Class StringUtf8 is a bytes object (like ByteArray). It implements 
>> #byteSize, #byteAt:, #byteAt:put:. #at: is rather expensive. It 
>> doesn't support #at:put:. It has several iteration messages in 
>> CodePoints. Performance seems OK for reasonably small stuff.
>>
>> In any case, it is always possible to change the data structure, as 
>> long as it is polymorphic.
>>
>> I haven't used it for Smalltalk source code yet, only for stand alone 
>> editors. Parser would see instances of UnicodeCodePoint, maybe mixed 
>> with Characters (for code points that fit in a Cuis Character).
>>
>> So far this is experimental, so we can adapt it while we learn the 
>> details of what we need.
>
> What about using Ken's Ropes?

I think this is independent of Ropes. Ropes are great for very large 
text. Rope fragments should be sequences of UTF-8 bytes. So Ropes should 
be a third representation, and should be completely compatible with 
StringUtf8. Applications should be able to pick their preferred 
representation.
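
As a rough illustration of that idea, here is a minimal sketch in Python 
rather than Cuis Smalltalk. It is not code from the Cuis image. The names 
Utf8Flat, Utf8Rope, byte_size, byte_at and code_points are hypothetical 
stand-ins for a flat UTF-8 string, a rope, and the #byteSize / #byteAt: / 
code point iteration protocol mentioned above. It assumes that rope 
fragments are always split on code point boundaries, and it uses 0-based 
indices where Smalltalk would use 1-based ones.

```python
class Utf8Flat:
    """Flat representation: one contiguous sequence of UTF-8 bytes."""

    def __init__(self, data: bytes):
        self.data = data

    def byte_size(self) -> int:
        return len(self.data)

    def byte_at(self, index: int) -> int:
        # 0-based here; the Smalltalk protocol would be 1-based.
        return self.data[index]

    def code_points(self):
        # Decode lazily into one-character strings, one per code point.
        yield from self.data.decode("utf-8")


class Utf8Rope:
    """Rope node: the concatenation of two children.

    Each child is either a Utf8Flat fragment or another Utf8Rope.
    Assumption: fragments never split a code point in the middle.
    """

    def __init__(self, left, right):
        self.left = left
        self.right = right
        # Cache the total size so byte_size() does not walk the tree.
        self._size = left.byte_size() + right.byte_size()

    def byte_size(self) -> int:
        return self._size

    def byte_at(self, index: int) -> int:
        # Descend into whichever child holds the requested byte.
        left_size = self.left.byte_size()
        if index < left_size:
            return self.left.byte_at(index)
        return self.right.byte_at(index - left_size)

    def code_points(self):
        yield from self.left.code_points()
        yield from self.right.code_points()


def same_text(a, b) -> bool:
    """Compare two texts using only the shared protocol."""
    return (a.byte_size() == b.byte_size()
            and list(a.code_points()) == list(b.code_points()))
```

Client code such as same_text only sends the shared messages, so it 
should not need to know which representation it was handed.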

-- 
Juan Vuletich
www.cuis-smalltalk.org
https://github.com/Cuis-Smalltalk/Cuis-Smalltalk-Dev
https://github.com/jvuletich
https://www.linkedin.com/in/juan-vuletich-75611b3
https://independent.academia.edu/JuanVuletich
https://www.researchgate.net/profile/Juan-Vuletich
https://patents.justia.com/inventor/juan-manuel-vuletich
https://twitter.com/JuanVuletich


