[Cuis-dev] [ODBC] Using accented characters in resultset

Sun Jan 4 18:50:56 PST 2026

Hi Olivier, hi Juan

I think it is noteworthy that UTF-8 and ASCII are identical for the first
128 code points (0 to 127).

https://www.unicode.org/charts/PDF/U0000.pdf
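
To make this concrete, here is a small sketch in Python (not Cuis; chosen only because the encoding behaviour is easy to check there). It shows that the 128 ASCII code points encode to the same single bytes in UTF-8, while 'é' (U+00E9), from the next block, becomes two bytes. All variable names are made up for the illustration.

```python
# Every ASCII code point (0..127) is one byte in UTF-8, with the same value.
ascii_bytes = bytes(range(128))
ascii_text = ascii_bytes.decode("ascii")
same_for_ascii = ascii_text.encode("utf-8") == ascii_bytes

# 'é' is U+00E9, outside ASCII, so UTF-8 needs two bytes for it.
e_acute = "\u00e9"
e_acute_utf8 = list(e_acute.encode("utf-8"))

print(same_for_ascii)   # True
print(e_acute_utf8)     # [195, 169]
```

So a driver that treats UTF-8 bytes as one-byte characters goes unnoticed as long as the data is pure ASCII.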

This explains why the ToDo example by Mark Volkmann worked fine.

https://mvolkmann.github.io/blog/topics/#/blog/smalltalk/50-databases/?v=1.1.1

My guess is, as Juan writes, that the package
https://github.com/Cuis-Smalltalk/DatabaseSupport currently has no
maintainer and that it has probably never been tested with non-ASCII
characters, or with Unicode in general. What you, Olivier, are actually
doing now is using it with characters from the second Unicode block:

https://www.unicode.org/charts/PDF/U0080.pdf  (second Unicode code block)

https://en.wikipedia.org/wiki/Latin-1_Supplement#Compact_table

In my earlier mail I wrote about ANSI (as an imprecise shorthand for
https://en.wikipedia.org/wiki/Windows-1252). That was not correct, as
that would be a superset of the character inventory of these two code
blocks (some control characters swapped for other characters).
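
The difference between Windows-1252 and the first two Unicode blocks (which correspond to ISO 8859-1, "Latin-1") can be checked with a short Python sketch. It relies on Python's built-in `cp1252` and `latin-1` codecs; the helper name `decode_byte` is made up for this illustration, and undefined Windows-1252 positions are mapped to U+FFFD via `errors="replace"`.

```python
def decode_byte(value, encoding):
    """Decode a single byte, substituting U+FFFD where the encoding has no mapping."""
    return bytes([value]).decode(encoding, errors="replace")

# Collect every byte value that the two encodings interpret differently.
differing = [b for b in range(256)
             if decode_byte(b, "cp1252") != decode_byte(b, "latin-1")]

# Example: byte 0x80 is the euro sign in Windows-1252, a control code in Latin-1.
euro = decode_byte(0x80, "cp1252")

print(hex(differing[0]), hex(differing[-1]), len(differing))  # 0x80 0x9f 32
```

The two only differ in the range 0x80 to 0x9F. The bytes 195 and 169 (0xC3, 0xA9) lie outside that range, so either interpretation turns a UTF-8 'é' into 'Ã©'.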

To conclude: the fix Juan proposes

UnicodeString fromUtf8Bytes: (buffer copyFrom: 1 to: len)

should do the job.

(UnicodeString fromUtf8Bytes: #[195 169]) "print it: 'é'"
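
The same effect, and the idea behind the fix, can be reproduced outside Cuis. The following Python sketch is only an analogy: `bytes.decode("utf-8")` stands in for `UnicodeString fromUtf8Bytes:`, and `buffer` and `length` are invented stand-ins for the driver's result buffer and reported byte count, not the actual ODBC package code.

```python
# The UTF-8 bytes a Unicode-aware driver would deliver for 'invité'.
raw = "invit\u00e9".encode("utf-8")       # b'invit\xc3\xa9'

# Hypothetical fixed-size result buffer: payload followed by unused bytes.
buffer = raw + b"\x00" * 8
length = len(raw)

# Wrong: treat every byte as one character (what a byte string does).
garbled = buffer[:length].decode("latin-1")

# Right: decode only the reported bytes as UTF-8.
text = buffer[:length].decode("utf-8")

print(garbled)   # invitÃ©
print(text)      # invité
```

Note that only the first `length` bytes are decoded, which mirrors the `copyFrom: 1 to: len` part of the proposed fix.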

Looking forward to seeing an update on this, because I would also like to
use database connectivity this year ....

Kind regards

Hannes

On 02/01/2026 9:20 pm, Juan Vuletich wrote:
> Hi Olivier,
>
> It looks like you're getting UTF-8 bytes for Unicode, and the ODBC
> package is turning them into a (Byte)String. Take a look at the
> attachment. This is my take on how a UnicodeString should be created
> instead.
>
> I'm not aware of anyone using or maintaining that package right now. I
> cannot actually test it, as I don't have any ODBC data source on my
> machine. But the protocols in Cuis for creating a UnicodeString from
> UTF-8 bytes, and extracting the UTF-8 bytes from a
> String/UnicodeString (to save to the database) are pretty
> straightforward, as shown in the attachment.
>
> If you can fix the problem, and test that the fix works, we can 
> integrate your contribution to the package, so it will help others in 
> the future.
>
> Thanks!
>
> On 2025-12-30 7:17 PM, H. Hirzel via Cuis-dev wrote:
>> Hi Olivier
>>
>> A contribution towards an answer: yes it is an encoding issue.
>>
>> If I use Notepad++
>>
>> type
>>
>> invité
>>
>> (encoding is UTF8 there by default) and then choose ANSI in the 
>> 'encoding' menu I get what you mention:
>>
>> 'invitÃ©'.
>>
>> However, I have not yet managed to do this in Cuis. I guess
>> there is a way to tell the database driver that you want UTF-8. That
>> should actually be the default if things are fine.
>>
>> --Hannes
>>
>> On 29/12/2025 10:40 am, olivier auverlot via Cuis-dev wrote:
>>> Hi,
>>>
>>> I am trying to use the ODBC package for Cuis Smalltalk. I used the 
>>> documentation by Mark Volkmann 
>>> (https://mvolkmann.github.io/blog/topics/#/blog/smalltalk/50-databases/?v=1.1.1). 
>>> Thank you, Mark, for your tutorial!
>>>
>>> I'm connecting to a PostgreSQL database but I'm having an issue with 
>>> French accented characters. For example, instead of 'Invité', I get 
>>> the string 'invitÃ©'.
>>>
>>> I suspect an encoding issue, although I'm using the psqlodbcw.so 
>>> library which supports Unicode.
>>>
>>> Does anyone have experience with this ODBC package and can help me 
>>> resolve this issue?
>>>
>>> Best regards
>>> Olivier
>>>