Running update statement through Sql Editor loses Unicode Characters

Quite simple, we run Toad v10.0 against our Oracle databases. We’re in the process of updating our applications to support Unicode characters (Welsh) and while this is going well we’ve noticed that when running an Insert or Update statement in the Toad Sql window (editor) the unicode characters, while appearing in the window, are not saved to the fields and instead are converted to nearest ascii character. (In this case â ê î ô û ŷ ŵ becomes â ê î ô û y w)

However if I double click on the grid row in schema browser and manually update the field and commit then the Unicode data is saved correctly.

We’ve checked out NLS_LANG and local regionalisation settings and have conformed to the suggestions made in this video http://toadfororacle.com/entry.jspa?categoryID=406&externalID=3410 but still no joy.

Can anyone suggest a reason why running a statement through the TOAD sql editor might lose the Unicode characters on being run.

Cheers for any help or suggestions

Morning israfel_e,

...

However if I double click on the grid row in schema browser
and manually update the field and commit then the Unicode
data is saved correctly.
I tried the following in Toad 10.6 GA:

CREATE TABLE norm(A nvarchar2(10));
INSERT INTO norm(A) VALUES ('â ê î ô û');
COMMIT;
SELECT * FROM norm;

The select worked fine - all characters were accented, as above (unless Outlook has destroyed the accents of course!) but basically I copied and pasted from your original email.

Just out of interest, what are the following:

  • The database character set;
  • The NLS_CHARACTER_SET;
  • The definition of the column giving grief.

I'm assuming from the above that when you change data using the editor, the data are incorrect when selected back again but if you edit the data directly in the grid, they are correct when subsequently selected.

...

Can anyone suggest a reason why running a statement through
the TOAD sql editor might lose the Unicode characters on being run.

I suspect you have the results grid font set to one that cannot display Unicode. (Which assumes that the results are not displayed correctly at ALL times, regardless of how you edited the data. If this is not the case, scroll down ...)

View->Toad Options->Data Grids->Visual.
Click "Edit Styles..." button.

The style you currently use should be selected on the left side.

Click the "Edit..." button.

Select ContentEven, make sure Font is checked and click the '...' button. Check and set the font to the same as your editor window - mine is either "Courier New" or "Microsoft San's Serif" - both work with your example data. Click OK.

Repeat for ContentOdd, Header and Preview.

OK your way out and try the query again.

Now, assuming that the results grid does always show the accented characters regardless of how you edit them, then it would seem that the font you use in the editor is non-Unicode. So ...

View->Toad Options->Editor->Display.
Click the "Editor..." button.
Select a font - mine is, once again, Courier New. Click OK.

Repeat for the "Line #..." button - otherwise the line numbers don't look correct when in a different font. Feel free to adjust the colours though!

OK your way out again and try again.

If this still fails, then I admit to being stumped!

HTH

Cheers,
Norm. [TeamT]

Information in this message may be confidential and may be legally privileged. If you have received this message by mistake, please notify the sender immediately, delete it and do not copy it to anyone else.

We have checked this email and its attachments for viruses. But you should still check any attachment before opening it.
We may have to make this message and any reply to it public if asked to under the Freedom of Information Act, Data Protection Act or for litigation. Email messages and attachments sent to or from any Environment Agency address may also be accessed by someone other than the sender or recipient, for business purposes.

If we have sent you information and you wish to use it please read our terms and conditions which you can get by calling us on 08708 506 506. Find out more about the Environment Agency at www.environment-agency.gov.uk

Norm,

Thanks for your replay.

The results grid is displaying the unicode characters properly when they are entered through a stored proc or directly through the Datagrid. It is when you call an insert or update statement through the SQL editor window that the data is converted to it’s closest ascii characters. The characters you have below I think are actually extended ascii characters as they don’t get changed. It is W,Y with an accent on top (welsh characters which yahoo groups wouldn’t accept in my opening post, characters as seen on this page http://www.200words-a-day.com/typing-welsh-characters.html) get converted to ascii W,Y which causes a data issue as they’re not the same characters.

We’re still stumped by this one, I assume in the update statement through SQL Editor where are string is being entered it’s automatically converted to varchar2 then losing the characters.

Our NLS settings are

NLS_LANGUAGE AMERICAN
NLS_TERRITORY AMERICA
NLS_ISO_CURRENCY AMERICA
NLS_NUMERIC_CHARACTERS .,
NLS_CHARACTERSET WE8MSWIN1252
NLS_DATE_LANGUAGE AMERICAN
NLS_SORT BINARY
NLS_TIME_FORMAT HH.MI.SSXFF AM
NLS_TIMESTAMP_FORMAT DD-MON-RR HH.MI.SSXFF AM
NLS_TIME_TZ_FORMAT HH.MI.SSXFF AM TZR
NLS_TIMESTAMP_TZ_FORMAT DD-MON-RR HH.MI.SSXFF AM TZR
NLS_COMP BINARY
NLS_LENGTH_SEMANTICS BYTE
NLS_NCHAR_CONV_EXCP FALSE
NLS_NCHAR_CHARACTERSET AL16UTF16
NLS_RDBMS_VERSION 11.1.0.7.0

Cheers

Morning israfel_e,

The results grid is displaying the Unicode characters
properly when they are entered through a stored proc or
directly through the Data grid.
Ok, this means that whatever font you are using in the grid is happy to display Unicode.

It is when you call an insert
or update statement through the SQL editor window that the
data is converted to it's closest ASCII characters.
This makes me think that your editor font is incorrect. IN the past, I've been able to display the Euro character in the editor, but when the SQL was executed, the Euro was converted to something else. Changing the editor font worked for me.

The characters you have below I think are actually extended
ASCII characters as they don't get changed. It is W,Y with
an accent on top (welsh characters which yahoo groups
wouldn't accept in my opening post, characters as seen on
this page
Typing Welsh Characters)
get converted to ASCII W,Y which causes a data issue as
they're not the same characters.
I checked that page, the instructions on getting a circumflex (^) on top of Y or W (upper or lower) don't work for me in Toad, Notepad, Context, Anything in fact, I get the un-accented characters.

However, CHARMAP works!

I got rid of my old table, and started again:

CREATE TABLE norm(A nvarchar2(10));
INSERT INTO norm(A) VALUES ('ŴŵŶŷ'); -- Welsh here
INSERT INTO norm(A) VALUES ('WwYy'); -- Not Welsh here.
COMMIT;

SELECT A, dump(A,16) AS hex FROM norm;

A HEX
WwYy Typ=1 Len=8: 0,57,0,77,0,59,0,79
WwYy Typ=1 Len=8: 0,57,0,77,0,59,0,79

So, straight away, the Welsh characters are lost.

Next, from an F4 on norm->Data tab, I added one row using the Welsh characters again. And committed.

SELECT A, dump(A,16) AS hex FROM norm;

A HEX
ŴŵŶŷ Typ=1 Len=8: 1,74,1,75,1,76,1,77
WwYy Typ=1 Len=8: 0,57,0,77,0,59,0,79
WwYy Typ=1 Len=8: 0,57,0,77,0,59,0,79

So, the first row (above) clearly shows that the Welsh characters are being inserted via the grid.

I did of course run the SQL Monitor tool while this was going on. The output from the Editor insert is this:

PS. This works in the editor:

INSERT INTO norm(A) VALUES (:Welsh);
COMMIT;

When executed you get a prompt for the welsh characters, set the type to
NVARCHAR2 and paste the characters, without quotes, into the value
field.

It’s a PITA but does allow the editor to be used.

Cheers,
Norm. [TeamT]

Norman Dunbar
Contract Senior Oracle DBA
Capgemini Database Build Team
Internal : 7 28 2051
External : 0113 231 2051

Information in this message may be confidential and may be legally privileged. If you have received this message by mistake, please notify the sender immediately, delete it and do not copy it to anyone else.

We have checked this email and its attachments for viruses. But you should still check any attachment before opening it.
We may have to make this message and any reply to it public if asked to under the Freedom of Information Act, Data Protection Act or for litigation. Email messages and attachments sent to or from any Environment Agency address may also be accessed by someone other than the sender or recipient, for business purposes.

If we have sent you information and you wish to use it please read our terms and conditions which you can get by calling us on 08708 506 506. Find out more about the Environment Agency at www.environment-agency.gov.uk

Hi Norm-

Use the Toad Advisor to see the "real" NLS_LANG in use, because the search order for NLS_LANG is 1) user environment variable, 2) system environment variable and 3) Registry for the active Home.

I tried setting NLS_LANG in the registry to .AL16UTF16

As you know (but many others do not), NLS_LANG must match the Windows character set, not the Oracle database. This by far is the most common misconception - people make it match the Oracle character set all the time and eventually get conversion errors. The other common mistake is people think they're taking a shortcut by just setting NLS_LANG to a Unicode value. Wrong again; it has to match the Windows client character set.

The best way to be safe is open the Toad Advisor and look for "Invalid NLS_LANG value, expecting X" and set NLS_LANG to what Toad suggests. I build a pretty extensive lookup table in to match the latest Oracle recommendations for Language_Territory.CharacterSet.

Almost every time when a user says "yes, yes, NLS_LANG is correct," they mean they made it match the database character set.

Mark

Hi Mark,

Use the Toad Advisor to see the "real" NLS_LANG in use,
because the search order for NLS_LANG is 1) user environment
variable, 2) system environment variable and 3) Registry for
the active Home.
Well, I know I only have what is in the registry - I've checked. Toad
Advisor agrees with me, "Verifying value of NLS_LANG - DONE" it says.

I tried setting NLS_LANG in the registry to .AL16UTF16

As you know (but many others do not), NLS_LANG must match
the Windows character set, not the Oracle database.
True, I was attempting the impossible - again! :wink: SQL*Plus didn't like
it anyway.

...

Cheers,
Norm. [TeamT]

Norman Dunbar
Contract Senior Oracle DBA
Capgemini Database Build Team
Internal : 7 28 2051
External : 0113 231 2051

Information in this message may be confidential and may be legally privileged. If you have received this message by mistake, please notify the sender immediately, delete it and do not copy it to anyone else.

We have checked this email and its attachments for viruses. But you should still check any attachment before opening it.
We may have to make this message and any reply to it public if asked to under the Freedom of Information Act, Data Protection Act or for litigation. Email messages and attachments sent to or from any Environment Agency address may also be accessed by someone other than the sender or recipient, for business purposes.

If we have sent you information and you wish to use it please read our terms and conditions which you can get by calling us on 08708 506 506. Find out more about the Environment Agency at www.environment-agency.gov.uk

PPS. This also works - from 9i onwards:

INSERT INTO norm(A) VALUES (unistr(‘w=\0175, W=\0174’));
INSERT INTO norm(A) VALUES (unistr(‘y=\0177, Y=\0176’));
COMMIT;

That gives the correct w with ^ and y with ^ according to the Unicode
character codes supplied in the string.

Cheers,
Norm. [TeamT]

Norman Dunbar
Contract Senior Oracle DBA
Capgemini Database Build Team
Internal : 7 28 2051
External : 0113 231 2051

Information in this message may be confidential and may be legally privileged. If you have received this message by mistake, please notify the sender immediately, delete it and do not copy it to anyone else.

We have checked this email and its attachments for viruses. But you should still check any attachment before opening it.
We may have to make this message and any reply to it public if asked to under the Freedom of Information Act, Data Protection Act or for litigation. Email messages and attachments sent to or from any Environment Agency address may also be accessed by someone other than the sender or recipient, for business purposes.

If we have sent you information and you wish to use it please read our terms and conditions which you can get by calling us on 08708 506 506. Find out more about the Environment Agency at www.environment-agency.gov.uk