[Cuis-dev] isSeparator
Ezequiel Birman
ebirman77 at gmail.com
Wed May 8 08:31:12 PDT 2024
And of course I forgot there are a lot more visible separators, like the
middle dots in ancient Roman texts, Phoenician and Aegean scripts...
Currently `isSeparator` is used during parsing, case conversion, trimming,
etc., sometimes meaning blank (i.e. non-drawable) and sometimes meaning any
word separator, whether drawable or not.
I'll add an `isBlank` or `isDrawable` for my use case, but let me know what
you think about adding Unicode space-like separators to `isSeparator`.
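
As a rough cross-check outside the image, here is a small Python sketch that
looks up the Unicode general category of each codepoint in the list from my
earlier message (quoted below). It assumes Python's bundled `unicodedata`
module is an acceptable reference; the helper names (`category`,
`group_by_category`, `space_separators_not_in`) are made up for illustration,
and nothing here reflects how Cuis implements `isSeparator`.

```python
import sys
import unicodedata

# Codepoints from the list in the quoted message (decimal, as written there).
LISTED = [32, 9, 10, 13, 12, 160, 8192, 8193, 8194, 8195, 8196, 8197, 8198,
          8199, 8200, 8201, 8202, 8203, 8239, 8287, 12288]

# The drawable separator mentioned in the quoted message.
OGHAM_SPACE_MARK = 0x1680


def category(code_point):
    """Return the Unicode general category, e.g. 'Zs' (space separator) or 'Cc' (control)."""
    return unicodedata.category(chr(code_point))


def group_by_category(code_points):
    """Group codepoints by general category, preserving input order within each group."""
    groups = {}
    for cp in code_points:
        groups.setdefault(category(cp), []).append(cp)
    return groups


def space_separators_not_in(code_points):
    """List every 'Zs' codepoint known to this Python's Unicode data that is absent from code_points."""
    listed = set(code_points)
    return [cp for cp in range(sys.maxunicode + 1)
            if cp not in listed and category(cp) == "Zs"]


if __name__ == "__main__":
    for cat, cps in sorted(group_by_category(LISTED).items()):
        print(cat, [f"U+{cp:04X}" for cp in cps])
    print("Zs but not listed:",
          [f"U+{cp:04X}" for cp in space_separators_not_in(LISTED)])
```

With a reasonably recent Unicode database, 8203 (zero width space) lands
under Cf (format) rather than Zs, and the Ogham space mark (U+1680) shows up
among the Zs codepoints missing from the list. That is one more hint that
"blank" and "separator" are different sets.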
--
Eze
On Wed, 8 May 2024 at 15:37, Ezequiel Birman <ebirman77 at gmail.com> wrote:
> Lately I've started tinkering with text morphs and I was wondering about
> UnicodeCodePoint >> #isSeparator. I needed to (in)validate non-drawable
> codepoints including control sequences, but the current implementation
> doesn't include the codepoints for thin space, hair space, em space, etc.
> Is it on purpose? For what it's worth, I gathered all the non-drawable
> codepoints (maybe some are still missing):
>
> ^ `#(32 9 10 13 12 160 8192 8193 8194 8195 8196 8197 8198 8199 8200 8201
> 8202 8203 8239 8287 12288)` statePointsTo: value
>
> Also, I learned that there is one separator that *is* drawable: the Ogham
> space mark. Probably, it should be included too, unless I am
> misunderstanding the semantics of isSeparator.
>
> I should have added comments describing each codepoint; will do ASAP.
>
> --
> Eze
>