Arabic Charactersets

One of the major decisions you have to make before creating a database is which character-set you will use.

When choosing the database character-set, your selection should meet 2 goals:

  1. Eliminate or minimize character-set conversion between Oracle clients and the Oracle server.
  2. Minimize the size of the database (why use 2 bytes per Arabic character when you can use 1?).
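
To see why goal 1 matters, here is a small sketch outside the database. It assumes Python's cp1256 and iso8859_6 codecs are reasonable stand-ins for AR8MSWIN1256 and AR8ISO8859P6 (they are not Oracle's own implementation), and the sample text is just an illustration. The same Arabic text gets different byte values in the two character-sets, so a conversion is needed whenever client and server disagree.

```python
# Sample text chosen for illustration; any Arabic string works.
text = "كيف حالك"

# Assumption: these Python codecs stand in for the Oracle character-sets.
win_bytes = text.encode("cp1256")     # ~ AR8MSWIN1256
iso_bytes = text.encode("iso8859_6")  # ~ AR8ISO8859P6

# Same text, same number of bytes, but different byte values.
print(win_bytes.hex(" "))
print(iso_bytes.hex(" "))

# Reading Windows-1256 bytes as if they were ISO 8859-6 does not give
# back the original text. This is the kind of garbage you see when a
# needed conversion is skipped or the client setting is wrong.
misread = win_bytes.decode("iso8859_6", errors="replace")
print(misread == text)
```

If the client and the database use the same character-set, the bytes pass through unchanged and no conversion is needed.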

You have to choose between the 2 most popular Oracle Arabic character-sets: AR8MSWIN1256 or AR8ISO8859P6. There are also some cases where you have to use UTF8.

Here are the scenarios:

  • Use AR8MSWIN1256 if all or most of your clients are MS Windows.
  • Use AR8ISO8859P6 if all or most of your clients are UNIX.
  • Use UTF8 if you store Hindi numbers (the digits ٠١٢٣…) in your database, or if you plan to store other languages (French, Japanese, etc.) frequently in addition to Arabic & English. Keep in mind that UTF8 needs 2 bytes per Arabic character, while AR8MSWIN1256 and AR8ISO8859P6 need only 1.
  • If you store languages other than Arabic & English only infrequently, you can still use AR8MSWIN1256 or AR8ISO8859P6 instead of UTF8 and give the columns that hold multiple languages a national type such as nvarchar2. These data types use the national character-set instead of the database character-set.
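
A rough way to check which scenario applies to your data is to test whether sample strings can be represented in a single-byte Arabic character-set at all. The sketch below assumes Python's cp1256 codec as a stand-in for AR8MSWIN1256; the helper name `fits` and the sample strings are made up for illustration.

```python
def fits(text: str, codec: str) -> bool:
    """Return True if every character of text can be encoded in codec."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


# Hypothetical samples of what an application might store.
samples = {
    "Arabic letters": "كيف حالك",
    "Hindi digits": "٠١٢٣",
    "Japanese": "日本語",
}

for label, text in samples.items():
    for codec in ("cp1256", "utf-8"):
        print(label, codec, fits(text, codec))
```

Anything that does not fit in the single-byte character-set has to go either into a Unicode database or into a national character-set column.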

Note: You can create a database with AR8ISO8859P6 on an MS Windows platform, and vice versa you can create a database with AR8MSWIN1256 on a UNIX platform. As a general rule, the database character-set doesn’t have to be supported by the operating system where the database instance is created.

Here is an example to demonstrate the difference between UTF8 and a single-byte character-set like AR8MSWIN1256.

Inserting data into an AR8MSWIN1256 database:

SQL> create table test_length
2 (
3 col1 varchar2(10)
4 );

Table created.

SQL>
SQL> insert into test_length values ('كيف حالك');

1 row created.

Now inserting the same data into a UTF8 database:

SQL> create table test_length
2 (
3 col1 varchar2(10)
4 );

Table created.

SQL>
SQL> insert into test_length values ('كيف حالك');
insert into test_length values ('كيف حالك')
*
ERROR at line 1:
ORA-12899: value too large for column "TEST"."TEST_LENGTH"."COL1" (actual: 15, maximum: 10)

The error message reports an actual length of 15 bytes: 7 Arabic characters at 2 bytes each (14 bytes) plus 1 space at 1 byte.

Note: With the default byte length semantics (NLS_LENGTH_SEMANTICS = BYTE), varchar2(10) means 10 bytes, not 10 characters, while length(column_name) returns the number of characters.
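
The same arithmetic can be checked outside the database. This is only a sketch: it assumes Python's cp1256 codec as a stand-in for AR8MSWIN1256 and plain UTF-8 for the UTF8 database, and the variable names are made up for illustration.

```python
text = "كيف حالك"  # 7 Arabic letters + 1 space

char_count = len(text)                       # what length() counts
single_byte_len = len(text.encode("cp1256"))  # bytes in a single-byte charset
utf8_len = len(text.encode("utf-8"))          # bytes in UTF-8

print(char_count)       # 8
print(single_byte_len)  # 8, fits in varchar2(10)
print(utf8_len)         # 15, too large for varchar2(10)
```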

 

Hazem Ameen
Senior Oracle DBA


16 thoughts on “Arabic Charactersets”

  1. Greetings,
    Can you please advise on the following:

    I would like to migrate from 8.0.5 (AR8ISO8859P6) to 10g (AR8MSWIN1256).
    What is the best way to avoid character conversion? Both are on Solaris.

    (1)
    Exp from 8.0.5(ar8iso8859p6) and IMP in 9i (AR8MSWIN1256)
    then EXP 9i (AR8MSWIN1256) & IMP in 10g (AR8MSWIN1256)

    OR

    (2)at 8i db
    SVRMGR> STARTUP MOUNT;
    SVRMGR> ALTER SYSTEM ENABLE RESTRICTED SESSION;
    SVRMGR> ALTER SYSTEM SET JOB_QUEUE_PROCESSES=0;
    SVRMGR> ALTER SYSTEM SET AQ_TM_PROCESSES=0;
    SVRMGR> ALTER DATABASE OPEN;
    SVRMGR> ALTER DATABASE NATIONAL CHARACTER SET ar8mswin1256;
    SVRMGR> SHUTDOWN IMMEDIATE;
    SVRMGR> STARTUP;

    then exp & imp in 9i and then to 10g.

    OR

    (3)

    ?

    Thanks

    Ahmed, Kuwait.

  2. I am using Oracle 11g and I want to store Arabic data. I selected AR8MSWIN1256 in the OUI during installation, but the characters appear unreadable, as something other than Arabic. My environment is also set to Arabic. Please help me with this issue.
    Thanks…

  3. I am using character set AR8ISO8859P6 on a Solaris system with Oracle Forms 6i, with no problem at all. Now I have migrated to Oracle Forms 10g, and Arabic data is displayed as garbage. The Arabic data is shown properly when I manually convert it to character set AR8MSWIN1256 using the CONVERT function in SQL*Plus.

    Oracle is not automatically converting the Arabic data in Oracle Forms 10g.

    1. Thank you so much Mr. Nasir for your reply, it is very helpful.
      If you use cmd to access your data in the database, do the Arabic characters look readable to you even though you are using character set AR8ISO8859P6?

  4. Salam Alikoum Hazem,

    I have an Oracle 11 Windows server and I want to change the character set from WE8MSWIN1252 to AR8ISO8859P6. Could you advise me how to do it?

    Allah Ahfiz,
    Moustafa

      1. My initial database was on Oracle 8 and it was Arabic enabled (AR8ISO8859P6); the data contained a lot of Arabic characters. Then, 3 months ago, I upgraded the server to Oracle 11 and kept the character set WE8MSWIN1252. All the Arabic letters are shown as ??????!!
        My question: do I need to change to AR8ISO8859P6? What about the old data, will the old Arabic data be displayed correctly?
        Is there anything I need to do to rectify the situation?

  5. Hi sir,
    I want to save Arabic data in my Oracle table, but after inserting, it shows me (??????). Is there anything I am missing somewhere?
    Please help,
    thanks in advance…

      1. sir,
        I ran your query; it shows me the records below:

        PARAMETER                  VALUE
        NLS_LANGUAGE               AMERICAN
        NLS_TERRITORY              AMERICA
        NLS_CURRENCY               $
        NLS_ISO_CURRENCY           AMERICA
        NLS_NUMERIC_CHARACTERS     .,
        NLS_CHARACTERSET           WE8MSWIN1252
        NLS_CALENDAR               GREGORIAN
        NLS_DATE_FORMAT            DD-MON-RR
        NLS_DATE_LANGUAGE          AMERICAN
        NLS_SORT                   BINARY
        NLS_TIME_FORMAT            HH.MI.SSXFF AM
        NLS_TIMESTAMP_FORMAT       DD-MON-RR HH.MI.SSXFF AM
        NLS_TIME_TZ_FORMAT         HH.MI.SSXFF AM TZR
        NLS_TIMESTAMP_TZ_FORMAT    DD-MON-RR HH.MI.SSXFF AM TZR
        NLS_DUAL_CURRENCY          $
        NLS_COMP                   BINARY
        NLS_LENGTH_SEMANTICS       BYTE
        NLS_NCHAR_CONV_EXCP        FALSE
        NLS_NCHAR_CHARACTERSET     AL16UTF16
        NLS_RDBMS_VERSION          10.2.0.1.0

      2. **** Make the Oracle database read Arabic and French:
        ------------------------------------------------------

        First step (before Oracle installation):
        -------------
        -open Control Panel
        -open Regional and Language
        -open Format tab -> Format: ENGLISH (United States)
        -open Location tab -> Current location: LEBANON
        -open Administrative tab
        -open Change system locale..
        -choose: Current system locale: ARABIC (LEBANON)

        Second step:
        -------------
        -install Oracle as administrator
        -choose Oracle languages: ARABIC, ENGLISH, FRENCH
        -then set the database character set to: ARABIC_LEBANON.AR8MSWIN1256
        -continue installation…

      3. Sir,
        Thanks for the information, I will do the same as well. But Oracle is already installed on the server. Can I do something for the language without reinstalling Oracle?

        Thanks in advance.
