[Cuis-dev] [ODBC] Using accented characters in resultset
H. Hirzel
hannes.hirzel at gmail.com
Sun Jan 4 18:50:56 PST 2026
Hi Olivier, hi Juan
I think it is noteworthy that UTF-8 and ASCII are identical for the first
128 code points (0 to 127).
https://www.unicode.org/charts/PDF/U0000.pdf
This explains why the ToDo example by Mark Volkmann worked fine:
presumably it only used ASCII text.
https://mvolkmann.github.io/blog/topics/#/blog/smalltalk/50-databases/?v=1.1.1
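A small illustration of this overlap, written in Python only because it is quick to run (it is not Cuis code, and the sample string is made up): pure ASCII text yields the same bytes in both encodings, while 'é' needs two bytes in UTF-8.

```python
# Pure ASCII text: the ASCII and UTF-8 encodings produce identical bytes.
ascii_text = "ToDo item"  # hypothetical sample string
ascii_bytes = ascii_text.encode("ascii")
utf8_bytes = ascii_text.encode("utf-8")
same_for_ascii = ascii_bytes == utf8_bytes

# 'é' (U+00E9) is outside ASCII, so UTF-8 encodes it as two bytes.
e_acute_utf8 = list("é".encode("utf-8"))

print(same_for_ascii)  # True
print(e_acute_utf8)    # [195, 169]
```

So a driver that mishandles UTF-8 can still look correct as long as the data is plain ASCII.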
My guess is, as Juan writes, that the package
https://github.com/Cuis-Smalltalk/DatabaseSupport currently has no
maintainer, and that it has probably never been tested with non-ASCII
characters or with Unicode in general. What you, Olivier, are now actually
doing is in effect testing it with characters from the second Unicode code
block (Latin-1 Supplement):
https://www.unicode.org/charts/PDF/U0080.pdf
https://en.wikipedia.org/wiki/Latin-1_Supplement#Compact_table
In my earlier mail I wrote about ANSI (as imprecise shorthand for
https://en.wikipedia.org/wiki/Windows-1252). That was not quite correct, as
Windows-1252 is a superset of the character inventory of these two code
blocks (most of the control characters in the range 0x80 to 0x9F are
swapped for printable characters).
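To make the difference concrete, here is a rough Python sketch (again not Cuis code) comparing Windows-1252 with Latin-1, whose byte values map directly onto the first two Unicode code blocks. The codec names `cp1252` and `latin-1` are Python's; the choice of sample bytes is mine.

```python
# Bytes 0xA0-0xFF mean the same thing in both encodings, e.g. 0xE9 is 'é'.
same_e_acute = bytes([0xE9]).decode("cp1252") == bytes([0xE9]).decode("latin-1")

# Byte 0x80 differs: a printable character in Windows-1252,
# a C1 control character (U+0080) in Latin-1.
in_cp1252 = bytes([0x80]).decode("cp1252")
in_latin1 = bytes([0x80]).decode("latin-1")

print(same_e_acute)          # True
print(hex(ord(in_cp1252)))   # 0x20ac, the euro sign
print(hex(ord(in_latin1)))   # 0x80
```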
To conclude: the fix Juan proposes,
    UnicodeString fromUtf8Bytes: (buffer copyFrom: 1 to: len)
should do the job. For example, evaluating
    (UnicodeString fromUtf8Bytes: #[195 169])
gives 'é'.
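The same idea, sketched in Python rather than Cuis so the bytes are easy to inspect: decoding the UTF-8 bytes with a single-byte encoding reproduces the garbled string Olivier reported, while decoding only the first `length` bytes of the buffer as UTF-8 gives the expected text. The names `buffer` and `length`, and the zero padding, are assumptions standing in for whatever the ODBC package actually passes around.

```python
stored = "invité"
utf8_bytes = stored.encode("utf-8")   # what the driver appears to return

# Wrong: treat each byte as one Windows-1252 character.
wrong = utf8_bytes.decode("cp1252")

# Right: take only the valid part of the buffer and decode it as UTF-8.
buffer = utf8_bytes + bytes(4)        # hypothetical buffer with unused trailing bytes
length = len(utf8_bytes)              # number of valid bytes reported for the column
right = buffer[:length].decode("utf-8")

print(wrong)   # invitÃ©
print(right)   # invité
```

Slicing to `length` before decoding plays the role of `copyFrom: 1 to: len` in the Smalltalk expression.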
Looking forward to seeing an update on this, because I would also like to
use database connectivity this year ....
Kind regards
Hannes
On 02/01/2026 9:20 pm, Juan Vuletich wrote:
> Hi Olivier,
>
> It looks like you're getting UTF-8 bytes for Unicode, and the ODBC
> package is turning them into a (Byte)String. Take a look at the
> attachment. This is my take on how a UnicodeString should be created instead.
>
> I'm not aware of anyone using or maintaining that package right now. I
> can not actually test it, as I don't have any ODBC data source on my
> machine. But the protocols in Cuis for creating UnicodeString from
> UTF-8 bytes, and extracting the UTF-8 bytes from a
> String/UnicodeString (to save to the database) are pretty
> straightforward, as shown in the attachment.
>
> If you can fix the problem, and test that the fix works, we can
> integrate your contribution to the package, so it will help others in
> the future.
>
> Thanks!
>
> On 2025-12-30 7:17 PM, H. Hirzel via Cuis-dev wrote:
>> Hi Olivier
>>
>> A contribution towards an answer: yes it is an encoding issue.
>>
>> If I use Notepad++
>>
>> type
>>
>> invité
>>
>> (encoding is UTF8 there by default) and then choose ANSI in the
>> 'encoding' menu I get what you mention:
>>
>> 'invité'.
>>
>> However, I have not yet managed to do this in Cuis. I guess
>> there is a way to tell the database driver that you want UTF-8. That
>> should actually be the default if things are fine.
>>
>> --Hannes
>>
>> On 29/12/2025 10:40 am, olivier auverlot via Cuis-dev wrote:
>>> Hi,
>>>
>>> I am trying to use the ODBC package for Cuis Smalltalk. I used the
>>> documentation by Mark Volkmann
>>> (https://mvolkmann.github.io/blog/topics/#/blog/smalltalk/50-databases/?v=1.1.1).
>>> Thank you, Mark, for your tutorial!
>>>
>>> I'm connecting to a PostgreSQL database but I'm having an issue with
>>> French accented characters. For example, instead of 'Invité', I get
>>> the string 'invité'.
>>>
>>> I suspect an encoding issue, although I'm using the psqlodbcw.so
>>> library which supports Unicode.
>>>
>>> Does anyone have experience with this ODBC package and can help me
>>> resolve this issue?
>>>
>>> Best regards
>>> Olivier
>>>