Text file import ends with "No mapping for the Unicode character exists in the target multi-byte code page."

Hey all,

In beta 13.0.0.61 on Win7 64-bit, I have this table:

CREATE TABLE GEO_US
(
  GEONAMEID         NUMBER NOT NULL,
  GEONAME           NVARCHAR2(200) NOT NULL,
  ASCIINAME         VARCHAR2(200 BYTE),
  ALTERNATENAMES    NVARCHAR2(2000),
  LATITUDE          NUMBER(8,5) NOT NULL,
  LONGITUDE         NUMBER(8,5) NOT NULL,
  FEATURECLASS      CHAR(1 BYTE),
  FEATURECODE       VARCHAR2(10 BYTE),
  COUNTRYCODE       CHAR(2 BYTE),
  CC2               VARCHAR2(200 BYTE),
  ADMIN1CODE        VARCHAR2(20 BYTE),
  ADMIN2CODE        VARCHAR2(80 BYTE),
  ADMIN3CODE        VARCHAR2(20 BYTE),
  ADMIN4CODE        VARCHAR2(20 BYTE),
  POPULATION        NUMBER,
  ELEVATION         INTEGER,
  DEM               NUMBER,
  TIMEZONE          VARCHAR2(40 BYTE),
  MODIFICATIONDATE  DATE
);

…created in an 11.2.0.3 database with:

NLS_CHARACTERSET AL32UTF8
NLS_NCHAR_CHARACTERSET AL16UTF16

I’m trying to use “Import Table Data” to load the 300 MB of http://download.geonames.org/export/dump/US.zip into this empty table, but after selecting the file, I get an error popup:

No mapping for the Unicode character exists in the target multi-byte code page.

I’m guessing that there’s an issue with some data in one of the NVARCHAR2 columns, but I’m not sure how to find it. I’ve successfully imported the (much smaller!) “GB.zip” and “GD.zip” files from the same download with no issues.
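(A scan along these lines, assuming the file is meant to be UTF-8, would at least point at the first byte sequence that won’t decode. This is just a Python sketch; the file name is an example, and it reads the whole file into memory at once:)

# Report the first byte sequence in a supposedly UTF-8 file that fails
# to decode, along with its byte offset. "US.txt" is only an example.
with open("US.txt", "rb") as f:
    data = f.read()  # ~300 MB, fine on a machine with plenty of RAM
try:
    data.decode("utf-8")
    print("File decodes cleanly as UTF-8.")
except UnicodeDecodeError as exc:
    print(f"Bad bytes at offset {exc.start}: {data[exc.start:exc.end]!r}")
    print(f"Reason: {exc.reason}")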

I changed these defaults for the import:

Delimiter: Tab
Text Qualifier: None
Date Order: YMD
Date Delimiter: -

It’s at this point, when I click “Next”, that the error happens. It’s not a huge deal for me, as this is just something fun to have, but I thought it might have an impact on something in the future.
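For reference, those settings amount to reading the file roughly like this Python sketch. The file name, the column positions (taken from the geonames readme), and the printed fields are just for illustration:

import csv
from datetime import datetime

# Read the geonames dump the way the import is configured above:
# tab-delimited, no text qualifier, YMD dates with "-" as the delimiter.
with open("US.txt", encoding="utf-8", newline="") as f:
    reader = csv.reader(f, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        geonameid, geoname = row[0], row[1]
        # The last of the 19 columns is the modification date, e.g. 2012-01-30
        moddate = datetime.strptime(row[18], "%Y-%m-%d").date()
        print(geonameid, geoname, moddate)
        break  # just peek at the first row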

Thoughts?
TIA!

Rich

Hi Rich,

I get that too. It doesn’t have anything to do with the database; Toad is just trying to open the file. I’ll investigate.

-John

Interesting!

For giggles, I opened the file in Toad’s Editor. A tad on the slow side (though maybe not slow for a 300 MB text file!), but it opened fine.

Yes, 16 GB of RAM is nice to have. :)

Rich

The data import engine pulls only 1 MB at a time from the file to minimize memory usage. I think what’s happening is that a multi-byte Unicode character is getting chopped in half at a chunk boundary as I read the data. I should be able to fix it…
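The usual fix is to let an incremental decoder hold on to the incomplete tail of one chunk and finish it with the bytes of the next. A minimal Python sketch of the idea (not Toad’s actual code; the function name and chunk size are illustrative):

import codecs

def read_utf8_chunks(path, chunk_size=1024 * 1024):
    """Yield ~1 MB pieces of text; the incremental decoder buffers any
    multi-byte character that straddles a chunk boundary instead of
    raising an error for the half it has seen."""
    decoder = codecs.getincrementaldecoder("utf-8")()
    with open(path, "rb") as f:
        while True:
            raw = f.read(chunk_size)
            if not raw:
                # Flush; errors only if the file itself ends mid-character.
                yield decoder.decode(b"", final=True)
                break
            yield decoder.decode(raw)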

Fixed for the next beta. That’s a good test file. A little unwieldy, perhaps, but it’s got lots of Unicode sprinkled in with non-Unicode.

Thanks John! I’ll check it out.

I assume “next beta” means .63 (or higher), right?

Rich

Monday’s, whatever that is.

Just squeezed in a little time to check this out and it works perfectly! Wish I had more time to play with that data – it seems like it could be put to good use somehow…

Thanks again John!
Rich

You’re welcome, Rich.