Convert weird characters to their HTML entity equivalent
on 22-Nov-2011 | Tags: PHP
Recently I had to import some HTML from another site that used an encoding other than UTF-8. Strangely, I could not find a good article on converting the odd-looking characters to their HTML equivalents, and on top of that none of the PHP encoding functions worked for me.
After a long search I found this article that put me on the right track.
Building on that article, I started writing my own code to map the weird characters that broke the site:
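function cleanImportedText($body) {
    // Raw single-byte values that showed up in the imported text
    $replace = array("\x95", "\x99");
    // HTML entities to substitute, in the same order as $replace
    $replaceWith = array("&bull;", "&trade;");
    $body = str_replace($replace, $replaceWith, $body);
    return $body;
}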
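The example above handles the bytes \x95 and \x99 (written as hexadecimal escapes), which are the bullet (•) and the trademark sign (™) in Windows-1252, and replaces them with their HTML equivalents. You can use it as a starting point for your own mapping of "strange" characters.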
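If you need to cover more characters, the same idea can be sketched with a single lookup table and strtr(). This is only an illustration, not the code I ran on the site: the function name cleanImportedTextMap is made up, and the table assumes the imported text is single-byte Windows-1252. On text that is already UTF-8 it would corrupt valid characters, because these byte values also occur inside multi-byte sequences.

```php
<?php
// Hypothetical helper: the name and the selection of bytes are
// illustrative assumptions, extend the table with whatever you hit.
function cleanImportedTextMap(string $body): string
{
    // Windows-1252 byte => HTML entity.
    // Assumes $body is a single-byte Windows-1252 string, NOT UTF-8.
    $map = array(
        "\x85" => "&hellip;", // horizontal ellipsis
        "\x91" => "&lsquo;",  // left single quotation mark
        "\x92" => "&rsquo;",  // right single quotation mark
        "\x93" => "&ldquo;",  // left double quotation mark
        "\x94" => "&rdquo;",  // right double quotation mark
        "\x95" => "&bull;",   // bullet
        "\x96" => "&ndash;",  // en dash
        "\x97" => "&mdash;",  // em dash
        "\x99" => "&trade;",  // trademark sign
    );

    // strtr() with an array replaces every key with its value in one pass.
    return strtr($body, $map);
}

echo cleanImportedTextMap("Acme\x99 \x95 \x93quoted\x94"), "\n";
```

Keeping each byte next to its entity in one array makes it harder for the two lists to drift out of order than with a pair of parallel arrays.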
