Synopsis

char *safp_STEPToSJis(char *instring)

Purpose

Read a string token and convert it from STEP encoding to the Kanji Shift-JIS multibyte character set (MBCS).

Description

Given an input string, return the string after Japanese-language conversion to the Shift-JIS multibyte character set. This function uses the unl subsystem; unl_Initialize() must be called before this function is used.

This routine handles STRINGS according to section 7.3.3 of Part 21 of STEP. Some of that information is repeated here. Part 21 defines four encodings for STRINGS in an exchange file:

1 - Standard ASCII encoding (basic alphabet) (section 7.3.3) - The characters with decimal codes 32 through 126 inclusive, the printable range shared by ASCII and ISO 8859-1. Control characters (newline, tab) cannot be represented with this method.

2 - Full alphabet of ISO 8859 (section 7.3.3.1) - This encoding uses "\S\" to indicate that the next character is to be interpreted with the 8th bit on (add 128 to its decimal code). This encoding is not supported yet.
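The add-128 rule of method 2 can be sketched in C as follows. This is only an illustration of the encoding, not code from this library (which does not support the method yet). The helper name decode_s_directive is hypothetical, the directive is assumed to take the Part 21 form "\S\" followed by one basic-alphabet character, and other escapes such as doubled backslashes are ignored.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of the library.
   Copies src to dst, replacing each "\S\c" (c in 32..126) with the single
   byte c + 128. dst must hold at least strlen(src) + 1 bytes.
   Returns the number of bytes written, excluding the terminating NUL. */
size_t decode_s_directive(const char *src, char *dst)
{
    size_t n = 0;

    while (*src != '\0') {
        /* Only look at src[3] when the three announcer characters match. */
        unsigned char next =
            (strncmp(src, "\\S\\", 3) == 0) ? (unsigned char)src[3] : 0;

        if (next >= 32 && next <= 126) {
            dst[n++] = (char)(next + 128);  /* turn the 8th bit on */
            src += 4;                       /* skip "\S\" and the character */
        } else {
            dst[n++] = *src++;              /* ordinary character, copy as is */
        }
    }
    dst[n] = '\0';
    return n;
}
```

For example, the input a\S\Db yields three bytes: 'a', the byte 196 ('D' is 68, plus 128), and 'b'.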

3 - UNICODE or ISO 10646 (section 7.3.3.2) - This is how STEP represents KANJI. Once the row and column of a character have been found in the UNICODE tables, the character is encoded as four hexadecimal digits, preceded by an ANNOUNCER that marks this as a special encoding. The ANNOUNCER is the character sequence "\X2\". This works for all 16-bit characters. UNICODE currently has mappings only for 16-bit characters; it allows for extension to 32 bits, but no characters have been assigned in those planes yet.

So for a character at row 0/column 0 we would get the following mapping: "\X2\0000"

For a character at row 255/column 255 we would get: "\X2\FFFF"

For a character at row 0/column 255 we would get: "\X2\00FF"
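The row/column mapping can be sketched in C as a pair of helpers. Both names (encode_x2, decode_x2) are hypothetical and are not part of this library. The sketch handles a single character and assumes uppercase hexadecimal digits. In a complete Part 21 file the run of four-digit groups is closed by "\X0\", which is left out here.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: write the announcer plus four hex digits for the
   character at (row, col). out must hold at least 9 bytes. */
void encode_x2(unsigned row, unsigned col, char out[9])
{
    snprintf(out, 9, "\\X2\\%02X%02X", row & 0xFFu, col & 0xFFu);
}

/* Value of one uppercase hex digit, or -1 if c is not a hex digit. */
static int hex_val(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Hypothetical helper: return the 16-bit value encoded at the start of s,
   or -1 if s does not begin with "\X2\" followed by four hex digits. */
long decode_x2(const char *s)
{
    long value = 0;

    if (strncmp(s, "\\X2\\", 4) != 0)
        return -1;
    for (int i = 4; i < 8; i++) {
        int d = hex_val(s[i]);
        if (d < 0)              /* also stops at an early end of string */
            return -1;
        value = value * 16 + d;
    }
    return value;
}
```

With these helpers, encode_x2(0, 255, buf) leaves \X2\00FF in buf, and decode_x2 applied to \X2\65E5 returns 0x65E5 (row 0x65, column 0xE5).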

4 - ARBITRARY HEX (section 7.3.3.3) - This allows a hexadecimal representation for any character, which is how control characters (carriage returns, tabs) can be sent in a string. NL (newline), for example, has the integer value 10 in the ASCII table, so it is encoded as hexadecimal "0A". The ANNOUNCER "\X\" indicates that the next two characters use the ARBITRARY HEX encoding, giving "\X\0A".
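Decoding one ARBITRARY HEX directive can be sketched in C as below. The helper name decode_x_arbitrary is hypothetical and not part of this library. The sketch assumes the backslash form of the announcer, "\X\", as used by Part 21, and uppercase hexadecimal digits.

```c
#include <string.h>

/* Value of one uppercase hex digit, or -1 if c is not a hex digit. */
static int hex_digit_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Hypothetical helper: return the byte (0..255) encoded at the start of s,
   or -1 if s does not begin with "\X\" followed by two hex digits. */
int decode_x_arbitrary(const char *s)
{
    int hi, lo;

    if (strncmp(s, "\\X\\", 3) != 0)
        return -1;
    hi = hex_digit_value(s[3]);
    if (hi < 0)                 /* also stops at an early end of string */
        return -1;
    lo = hex_digit_value(s[4]);
    if (lo < 0)
        return -1;
    return hi * 16 + lo;
}
```

Applied to \X\0A the helper returns 10 (newline), and applied to \X\09 it returns 9 (tab).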

Input

instring

the input string to decode

Return

the decoded output string. If this string is not at the same address as the input string, strcpy is called to copy it into instring, so instring also holds the decoded result.