Reading files in PHP: an analysis of methods for fixing garbled Chinese text (UTF-8)

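This article uses examples to show how to read a file in PHP and fix garbled Chinese text by converting it to UTF-8. It is shared here for reference; the details are as follows: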

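// First attempt: request UTF-8 through a stream context.
$opts = array(
  'file' => array(
    'encoding' => "utf-8"
  )
);
// Note: this second assignment overwrites the one above.
$opts = array('http' => array('encoding' => 'utf-8'));
$ctxt = stream_context_create($opts);
// The second argument is $use_include_path; the original passed `file_text` here,
// which is not a defined constant.
$content = file_get_contents($filepath, false, $ctxt);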

The simplest approach is to convert GB2312 to UTF-8:

$str = iconv("gb2312", "utf-8", $str);

This one does not work:

$content = mb_convert_encoding($content, "utf-8", "auto");
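
One likely reason "auto" falls short is that it expands to mbstring's configured detection order, which typically does not include any Chinese encoding. A minimal sketch of the same idea with an explicit candidate list follows. The sample bytes and the candidate list ('UTF-8', 'CP936') are assumptions for illustration, and this only helps for files without a BOM that are in one of the listed encodings; it does not cover UTF-16 or UTF-32 files.

```php
<?php
// Hypothetical sample data: the two characters "中文" encoded as GBK bytes.
$content = "\xD6\xD0\xCE\xC4";

// The candidate list is an assumption. Order matters, and strict mode (true)
// rejects candidates for which the bytes are not valid.
$from = mb_detect_encoding($content, array('UTF-8', 'CP936'), true);

// Convert only when something other than UTF-8 was detected.
if ($from !== false && $from !== 'UTF-8') {
    $content = mb_convert_encoding($content, 'UTF-8', $from);
}
```

Putting UTF-8 first means bytes that are already valid UTF-8 are left untouched, and CP936 is only tried as the fallback.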

****** An ugly divider to tell you that the approaches above are no good; what follows is the correct method... haha... ******

define('utf32_big_endian_bom', chr(0x00) . chr(0x00) . chr(0xfe) . chr(0xff));
define('utf32_little_endian_bom', chr(0xff) . chr(0xfe) . chr(0x00) . chr(0x00));
define('utf16_big_endian_bom', chr(0xfe) . chr(0xff));
define('utf16_little_endian_bom', chr(0xff) . chr(0xfe));
define('utf8_bom', chr(0xef) . chr(0xbb) . chr(0xbf));

$text = file_get_contents($newpath);
$first2 = substr($text, 0, 2);
$first3 = substr($text, 0, 3);
$first4 = substr($text, 0, 4); // a UTF-32 BOM is 4 bytes long
$encodtype = "";
if ($first3 == utf8_bom)
  $encodtype = 'utf-8 bom';
else if ($first4 == utf32_big_endian_bom)
  $encodtype = 'utf-32be';
else if ($first4 == utf32_little_endian_bom)
  $encodtype = 'utf-32le';
else if ($first2 == utf16_big_endian_bom)
  $encodtype = 'utf-16be';
else if ($first2 == utf16_little_endian_bom)
  $encodtype = 'utf-16le';

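// $text already holds the file contents, so the file is not read a second time.
$content = $text;

// Caveat: 'utf-8 bom' is not an encoding name that iconv recognizes, and an
// empty $encodtype (no BOM found) is not handled here; the final version covers both.
$content = iconv($encodtype, "utf-8", $content);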

The final version:

$text = file_get_contents($filepath);
//$encodtype = mb_detect_encoding($text);
define('utf32_big_endian_bom', chr(0x00) . chr(0x00) . chr(0xfe) . chr(0xff));
define('utf32_little_endian_bom', chr(0xff) . chr(0xfe) . chr(0x00) . chr(0x00));
define('utf16_big_endian_bom', chr(0xfe) . chr(0xff));
define('utf16_little_endian_bom', chr(0xff) . chr(0xfe));
define('utf8_bom', chr(0xef) . chr(0xbb) . chr(0xbf));
$first2 = substr($text, 0, 2);
$first3 = substr($text, 0, 3);
$first4 = substr($text, 0, 4); // a UTF-32 BOM is 4 bytes long
$encodtype = "";
if ($first3 == utf8_bom)
  $encodtype = 'utf-8 bom';
else if ($first4 == utf32_big_endian_bom)
  $encodtype = 'utf-32be';
else if ($first4 == utf32_little_endian_bom)
  $encodtype = 'utf-32le';
else if ($first2 == utf16_big_endian_bom)
  $encodtype = 'utf-16be';
else if ($first2 == utf16_little_endian_bom)
  $encodtype = 'utf-16le';
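// The checks below mainly deal with ANSI-encoded files.
if ($encodtype == '') { // no BOM: a txt file created with the default ANSI encoding
  $content = iconv("gbk", "utf-8", $text);
} else if ($encodtype == 'utf-8 bom') { // already UTF-8, no conversion needed
  $content = $text;
} else { // every other format is simply converted to UTF-8
  $content = iconv($encodtype, "utf-8", $text);
}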

The final version above handles txt files created on Chinese-language Windows in the ANSI, UTF-8, and Unicode encodings.
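
One case the final version does not cover is a UTF-8 file saved without a BOM: it has no BOM, so it is treated as GBK and converted a second time. The sketch below wraps the same BOM checks in a function and adds a validity check for that case. The function name `to_utf8`, the `$fallback` parameter, and the use of mbstring with 'CP936' in place of iconv with "gbk" are assumptions for illustration, not part of the original code.

```php
<?php
// Hypothetical helper: convert the raw bytes of a text file to UTF-8,
// using the BOM when one is present.
function to_utf8(string $text, string $fallback = 'CP936'): string
{
    // BOM => encoding name. The 4-byte BOMs come first because the UTF-32LE
    // BOM starts with the same two bytes as the UTF-16LE BOM.
    $boms = array(
        "\x00\x00\xFE\xFF" => 'UTF-32BE',
        "\xFF\xFE\x00\x00" => 'UTF-32LE',
        "\xEF\xBB\xBF"     => 'UTF-8',
        "\xFE\xFF"         => 'UTF-16BE',
        "\xFF\xFE"         => 'UTF-16LE',
    );
    foreach ($boms as $bom => $encoding) {
        if (strncmp($text, $bom, strlen($bom)) === 0) {
            $body = substr($text, strlen($bom)); // drop the BOM itself
            return $encoding === 'UTF-8'
                ? $body
                : mb_convert_encoding($body, 'UTF-8', $encoding);
        }
    }
    // No BOM: keep the bytes if they are already valid UTF-8, otherwise
    // assume the fallback encoding (ANSI on Chinese-language Windows).
    if (mb_check_encoding($text, 'UTF-8')) {
        return $text;
    }
    return mb_convert_encoding($text, 'UTF-8', $fallback);
}
```

It would be called as `$content = to_utf8(file_get_contents($filepath));`. Unlike the code above, this variant also removes the BOM from the result, which may or may not be wanted depending on how the text is used afterwards.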

黄山市民网：https://www.huangshanshimin.com/
