Wednesday, November 30, 2011

Tip - How to write to utf-8 encoding format using CommaTextIo Class

A few months back, I wrote a basic post on how to export CSV data from AX using the CommaTextIo class.

This time we had a requirement to output the file in UTF-8 encoding. Credit for this post goes to Super Mario (read on to find out who Super Mario is ;-) ..) for his tip, which I'm sharing with you all.

Tip/Solution:

UTF-8 encoding is achieved by passing an additional parameter to the constructor of the CommaTextIo class. The third (optional) parameter is the code page, given as an integer value:
commaTextIo = new CommaTextIo(fileitemdata,#io_write, 65001);

65001 is the code page identifier for UTF-8.
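
To put the one-liner in context, here is a minimal sketch of a complete job that writes a small UTF-8 CSV file. The job name, the file path and the sample rows are made up for illustration; only the constructor call with 65001 comes from the tip above. Treat it as a starting point rather than production code.

```xpp
static void writeUtf8CsvSketch(Args _args)
{
    #File
    CommaTextIo commaTextIo;
    Filename    fileName = @'C:\Temp\items_utf8.csv'; // hypothetical path, adjust as needed
    ;
    // Assert write permission (required when the code runs on the server tier).
    new FileIOPermission(fileName, #io_write).assert();

    // The third argument is the code page; 65001 selects UTF-8.
    commaTextIo = new CommaTextIo(fileName, #io_write, 65001);

    if (!commaTextIo || commaTextIo.status() != IO_Status::Ok)
    {
        throw error('Could not open the file for writing.');
    }

    // Sample rows (illustrative data only); each call writes one CSV line.
    commaTextIo.write('ItemId', 'Name');
    commaTextIo.write('1000', 'Crème brûlée');

    // Release the object so the file is closed, then revert the assert.
    commaTextIo = null;
    CodeAccessPermission::revertAssert();
}
```

The accented characters in the second row are a quick way to check the result: open the file in an editor that shows the encoding and confirm they survive the round trip.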

What's interesting in Dynamics AX 2012 for UTF-8:

I was curious to know how UTF-8 is handled in Dynamics AX 2012, so I searched the AOT and found that there is now a placeholder for the UTF-8 format in AOT > Macros > File.
I compared the File macro in AX 2009 and AX 2012: AX 2009 has no macro for UTF-8, but AX 2012 has a declaration for it in Macros > File:
/*
UTF-8 Format
*/
#define.utf8Format (65001)

Surprisingly, though, it's not used anywhere. For example, the EditorScripts class in Dynamics AX 2012 still uses the literal value 65001 rather than the macro defined above:
public void sendTo_file(Editor  e)
{
    Filename filename;
    TextIo io;
    int i = strFind(e.path(), '\\', strLen(e.path()), -strLen(e.path()));
    str defaultName = subStr(e.path(), i+1, strLen(e.path()));
    ;
    filename = WinAPI::getSaveFileName(0, ['Text','*.txt'], '', "@SYS56237", 'txt', defaultName );
    if (filename)
    {
        // BP deviation documented
        io = new TextIo(filename, 'W', 65001); // Write the file in UTF8
        io.write(EditorScripts::getSelectedText(e));
    }
}

I extended our current File macro by adding the following encoding declarations under Macros > File for future reference.

/*
UTF Encoding Format
*/
#define.utf7Format (65000)
#define.utf8Format (65001)
#define.utf16Format (1200)
#define.utf32format (12000)
#define.usascii (20127)
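
With a declaration like this in place, the literal can be replaced by the macro. The sketch below assumes #utf8Format exists in the File macro (as in AX 2012, or after adding it yourself as above); the job name, file path and text are hypothetical.

```xpp
static void writeUtf8WithMacroSketch(Args _args)
{
    #File
    TextIo   io;
    Filename fileName = @'C:\Temp\notes_utf8.txt'; // hypothetical path
    ;
    new FileIOPermission(fileName, #io_write).assert();

    // #utf8Format expands to (65001), so this is the same call as with the literal.
    io = new TextIo(fileName, #io_write, #utf8Format);

    if (io && io.status() == IO_Status::Ok)
    {
        io.write('Written using the utf8Format macro');
    }

    io = null; // release the object so the file is closed
    CodeAccessPermission::revertAssert();
}
```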

MSDN's list of code page identifiers has the definitions of all the code page values, if you are interested.

Ok Ok... I know you have read this far and are still wondering who Super Mario is .. :-) He's my boss, who is a techie himself. :-)

1 comment:

  1. This is great. UTF-8 is the reason when a user double clicks on a CSV it will either open correctly in Excel or it will just be lines that need imported. You should clarify that in your post.