[Cuis-dev] UTF-8 Unicode editors

Luciano Notarfrancesco luchiano at gmail.com
Sun May 1 11:27:01 PDT 2022


Yes, the selectors would have to be normalized, but I think that would
happen automatically if the Unicode characters are inserted within Cuis
with the same mechanism that we use now to insert special characters like
\oplus. And there would be a list of code points that are considered
alphabetical (e.g. Greek letters) and allowed in keywords and variable
names, and others that are allowed in binary messages, etc.
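
A minimal Python sketch of the two ideas above, not Cuis code: normalize to NFC so that a single precomposed code point is used wherever Unicode defines one, then classify each code point. The function names are hypothetical, and using Unicode general categories (letters for keywords and variable names, math symbols for binary selectors) is an assumption standing in for the hand-maintained list of code points; the Cuis input mechanism itself is not modeled.

```python
import unicodedata

def normalize_selector(text: str) -> str:
    """Return the NFC form: composed code points where Unicode defines them."""
    return unicodedata.normalize("NFC", text)

def is_normalized(text: str) -> bool:
    """True if the text is already in NFC."""
    return text == normalize_selector(text)

def classify(ch: str) -> str:
    """Hypothetical rule: the general category stands in for an explicit list."""
    category = unicodedata.category(ch)
    if category.startswith("L"):   # Lu, Ll, Lo, ...: letters such as Greek alpha
        return "alphabetic"
    if category == "Sm":           # math symbols such as U+2295 CIRCLED PLUS
        return "binary"
    return "other"
```

For instance, `normalize_selector("e\u0301")` collapses the letter plus combining acute accent into the single code point U+00E9, and `classify("\u2295")` puts the \oplus character in the binary group. A real list would probably be narrower than whole categories.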

I don’t need selectors in Chinese or Thai tho, supporting something like
that would be harder. Actually Chinese is probably easy, but in Thai the
vowels and tone modifiers are like “decorations” around, over or under the
consonants, and I’m guessing the encoding could be more ambiguous.
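
To make the Thai guess checkable, here is a small Python sketch (helper names are hypothetical) that lists the code points of a string with their general category and canonical combining class, and tests whether two spellings become identical under NFC. It only reports what the Unicode database says; it does not decide whether Thai selectors would be practical.

```python
import unicodedata

def describe(text: str) -> list[tuple[str, str, str, int]]:
    """List (code point, name, general category, combining class) per character."""
    return [
        (
            f"U+{ord(ch):04X}",
            unicodedata.name(ch, "?"),
            unicodedata.category(ch),
            unicodedata.combining(ch),
        )
        for ch in text
    ]

def same_after_nfc(a: str, b: str) -> bool:
    """True if both spellings normalize to the same NFC string."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)
```

Calling `describe("\u0e01\u0e34\u0e48")` shows a base consonant followed by a vowel sign (U+0E34) and a tone mark (U+0E48), both reported as nonspacing marks (category Mn). Comparing the two possible mark orders with `same_after_nfc` shows whether normalization alone removes the ambiguity for a given pair; the answer depends on the combining classes that `describe` prints.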

On Sun, 1 May 2022 at 11:48 PM Andres Valloud via Cuis-dev <
cuis-dev at lists.cuis.st> wrote:

> You will have to enforce that selectors are normalized (i.e. no emitting
> multiple code points to compose a character that could be represented
> with just one code point).
>
> Also, FYI there is a software project somewhere that has a file with a
> name that looks like "ctalin", but unfortunately that 'c' is a Cyrillic
> 's' (U+0441) that looks indistinguishable from a Latin 'c'.  Most
> annoying.  But look at how the general inability of fonts to display
> humanly distinguishable glyphs for the well over 100,000 code points
> already assigned leads to this kind of problem.
>
> On 4/30/22 6:34 AM, Luciano Notarfrancesco via Cuis-dev wrote:
> > Hi Philip,
> > Thanks for the link and advice, I didn’t know this website.
> > I think it might be possible to implement unicode selectors without
> > introducing wide strings or other complications. That would be perfect.
> > We’ll see…
> >
> > On Sat, 30 Apr 2022 at 7:31 PM Philip Bernhart via Cuis-dev
> > <cuis-dev at lists.cuis.st> wrote:
> >
> >     Hi,
> >
> >     Luciano Notarfrancesco via Cuis-dev <cuis-dev at lists.cuis.st>
> >     writes:
> >
> >      > This is super cool. I’d like to have Unicode symbols at some
> >      > point, not sure if we’ll need WideStrings in UTF-32, or how big
> >      > the impact on memory use will be if we make all strings wide
> >      > (not only memory use, but also speed, because we have primitives
> >      > for String). Anyway we’ll see once we start experimenting. Thank
> >      > you for doing this!
> >
> >     I don't see why Cuis should support anything besides bytes, UTF-8,
> >     and conversion between UTF-8 and any of the other broken
> >     character encodings in the world.
> >
> >     Participants of this thread should check against the great UTF-8
> >     Everywhere site: https://utf8everywhere.org/#myths
> >
> >     Juan did when pondering about Cuis string support.
> >
> >
> >     My rambling 0.2 EUR,
> >     Philip
> >     --
> >     Cuis-dev mailing list
> >     Cuis-dev at lists.cuis.st
> >     https://lists.cuis.st/mailman/listinfo/cuis-dev
> >
> >