
Detect and remove the byte order mark


Sometimes you download text from the web and see hidden or strange characters at the very beginning. That is often the byte order mark (BOM). It tells you which text encoding is used, e.g. the byte sequence EF BB BF (written in hex) for UTF-8. If you decode the text with the wrong encoding, those bytes look like this: Ôªø (MacRoman) or ï»¿ (Windows ANSI). So when you see those characters, you have text that should be decoded as UTF-8 and not with the encoding you used!

So in FileMaker, you can read text as UTF-8, detect the BOM character and remove it for further processing:

# Read text as UTF-8
Set Variable [$text; Value: MBS("CURL.GetResultAsText"; $curl; "UTF-8")]

# get first character
Set Variable [$c; Value:Code(Left($text; 1))]
# is it the UTF-8 BOM?
If [$c = 65279]
# remove it
Set Variable [$text; Value:Middle($text; 2; Length($text)-1)]
End If

# now use the text
Set Field [CURL Test::Text; $text]
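
If you prefer, the check and the removal can be folded into a single calculation. This is only a sketch of the same idea as the script above, assuming $text already holds the text decoded as UTF-8. If there is no BOM, or the text is empty, $text stays unchanged:

# remove a leading BOM in one step, if present
Set Variable [$text; Value: If(Code(Left($text; 1)) = 65279; Right($text; Length($text) - 1); $text)]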


At any time you can create the BOM character with Char(65279) and put it in front of the text before sending it via plugin functions to a file, a socket, a server or a serial port.
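
As a small sketch using FileMaker's Char function, putting the BOM in front could look like this. The variable name $output is only an example, and the BOM ends up as the bytes EF BB BF only if the text is then written as UTF-8:

# put the BOM in front of the text before writing it out
Set Variable [$output; Value: Char(65279) & $text]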
12 11 14 - 17:28