Unicode confuses command detection in the editor

It was logged back then and is still Open.

The short answer is... other than your posts, it hasn't been reported before, to my knowledge. The fix is complicated, requires many code changes across the product, and is highly susceptible to introducing new defects or a recurrence of this issue.

The longer answer is... The parser uses UTF-8 encoding. Toad is written in Delphi and uses UTF-16 strings. Translation between UTF-8 and UTF-16 is seamless and works as expected. Where we get into a pickle is when we're using size and position data from the parser. The parser reports positions in whole characters. Delphi's string functions count UTF-16 code units, not characters. For every character in the Basic Multilingual Plane, one code unit is one character, so the two are synonymous. This covers nearly all everyday language characters. Most emojis sit outside that plane and are stored as surrogate pairs, 2 code units each, so functions like Length(), SubString(), etc. do not align with the data received from the parser. For example, Length('🤔') returns 2 within Toad.
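To make the mismatch concrete, here is a minimal sketch in Python (not Delphi, and not Toad's actual code). It assumes the parser hands back positions counted in whole characters, as described above. The helper names utf16_units and codepoint_to_utf16_offset are made up for this illustration.

```python
def utf16_units(text: str) -> int:
    """Count the UTF-16 code units needed to store text.

    This is the number a UTF-16 Length() style function reports.
    """
    return len(text.encode("utf-16-le")) // 2


def codepoint_to_utf16_offset(text: str, cp_offset: int) -> int:
    """Convert a 0-based whole-character offset (hypothetical parser output)
    into a 0-based UTF-16 code unit offset."""
    return utf16_units(text[:cp_offset])


# "\U0001F60A" is the smiling face emoji, outside the Basic Multilingual Plane.
sql = "SELECT '\U0001F60A' FROM DUAL;"

chars = len(sql)                      # 21 whole characters
units = utf16_units(sql)              # 22 UTF-16 code units
from_pos = sql.index("FROM")          # 11 when counting characters
from_pos_utf16 = codepoint_to_utf16_offset(sql, from_pos)  # 12 in code units
```

Everything after the emoji is off by one between the two counting schemes, and each additional emoji widens the gap by one more. That is why every place that consumes parser positions would need this kind of conversion.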

As a workaround, you can use UNISTR() in your queries. For example: SELECT UNISTR('\D83D\DE0A') FROM DUAL; Here \D83D\DE0A is the UTF-16 surrogate pair for the emoji U+1F60A.
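If you need to build those escapes for other characters, a small script can do it. This is a rough sketch in Python; to_unistr is a hypothetical helper, not something that ships with Toad or Oracle. It keeps printable ASCII as-is and writes everything else, including quotes and backslashes, as \XXXX UTF-16 code unit escapes.

```python
def to_unistr(text: str) -> str:
    """Build an Oracle UNISTR() call whose literal contains only ASCII."""
    data = text.encode("utf-16-be")
    parts = []
    # Walk the string one UTF-16 code unit (2 bytes) at a time.
    for i in range(0, len(data), 2):
        unit = int.from_bytes(data[i:i + 2], "big")
        ch = chr(unit)
        if 0x20 <= unit < 0x7F and ch not in ("'", "\\"):
            parts.append(ch)                 # safe printable ASCII
        else:
            parts.append("\\%04X" % unit)    # escape as \XXXX
    return "UNISTR('" + "".join(parts) + "')"


query = "SELECT " + to_unistr("\U0001F60A") + " FROM DUAL;"
# query is: SELECT UNISTR('\D83D\DE0A') FROM DUAL;
```

Because the resulting query text is pure ASCII, character counts and UTF-16 code unit counts agree, so the editor has nothing to misalign.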

I will reach out to the parser team to make sure I’m not missing something. If they have UTF-16 support that reports positions in UTF-16 code units, matching what Delphi's string functions count, then that would be excellent and safe.