I’m on Support Bundle for Toad for Oracle 22.214.171.124
While testing an application involving XML messages, I’ve found that a stored UTF-8 character ä (x’c3a4’) is shown as two garbage characters (UTF-16 x’00c300a4’).
To check where the issue arises, I’ve isolated one sample attribute containing the error and displayed it as xml and as varchar2. See attachment 1, shown very small below this line (I tried to enlarge but failed, sorry)
As you can see (hopefully), the right-hand result column containing the varchar2 data correctly shows Hochhäuser (please mind the a-umlaut character ä).
In the left-hand column the xml is shown “as is”, but the ä is getting mangled into two garbage characters.
To figure out what happened I checked the hex contents of both fields to compare. Sure enough in the left-hand column we find a code sequence of x’00c300a4’ (corrected for endian-ness) at offset x’6a’:
while the right-hand column shows X’00E4’ at offset x’6a’
To further corroborate I also dumped the input field that was used to generate the XML data:
It takes close scrutiny, but you’ll find that each character of the leading ‘Hochh’ takes just 1 byte, as does each character of the trailing ‘user’, while the intervening ä is encoded as 195, 164, which is x’c3a4’, and that is indeed the correct UTF-8 encoding of the ä character.
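For anyone who wants to check the byte values without the database, here is a minimal Python sketch. It is not part of my original test, just a way to reproduce the arithmetic; the variable names are made up for illustration.

```python
# "\u00e4" is the a-umlaut (ä); the escape avoids any source-encoding issues.
word = "Hochh\u00e4user"

# Encode the string as UTF-8 and list the byte values in decimal,
# comparable to the decimal values in the dump above.
utf8_bytes = word.encode("utf-8")
byte_values = list(utf8_bytes)
print(byte_values)

# The ä alone: two bytes, 195 and 164, i.e. hex c3 a4.
a_umlaut_hex = "\u00e4".encode("utf-8").hex()
print(a_umlaut_hex)
```

Ten characters give eleven bytes, because only the ä needs two.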
The difference between the two versions of the xml field now is partly explicable:
the correct version shows X’e4’, translated from the original x’c3a4’, whereas the incorrect version shows x’00c300a4’, where apparently each byte has been prefixed with x’00’ (in an attempt to translate into UTF-16?)
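The two hex patterns can be reproduced with a short Python sketch. This is only a guess at the mechanism, not a confirmed diagnosis: it assumes the two UTF-8 bytes are read as if they were two single-byte characters (ISO-8859-1 is my assumption here, chosen purely for illustration) and are then converted to UTF-16.

```python
# The stored data: ä encoded as UTF-8, i.e. the bytes c3 a4.
stored = "\u00e4".encode("utf-8")

# Correct path: decode the bytes as UTF-8, then encode as big-endian UTF-16.
correct_hex = stored.decode("utf-8").encode("utf-16-be").hex()

# Suspected faulty path (assumption): treat each byte as one
# ISO-8859-1 character, then encode the result as big-endian UTF-16.
# Every byte ends up prefixed with 00.
mangled_hex = stored.decode("latin-1").encode("utf-16-be").hex()

print(correct_hex)  # 00e4
print(mangled_hex)  # 00c300a4
```

The second value matches what the left-hand column shows, which is consistent with the xml being decoded with a single-byte character set somewhere along the way. It does not tell us which component is doing that.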