Original poster |
Posted on 2006-10-24 22:06:16
<p class="MsoNormal">Here is a passage quoted from the web on the rules for converting Unicode to UTF-8:</p>
<table>
<tr><th>Unicode (hex)</th><th>UTF-8 (binary)</th></tr>
<tr><td>0000 - 007F</td><td>0xxxxxxx</td></tr>
<tr><td>0080 - 07FF</td><td>110xxxxx 10xxxxxx</td></tr>
<tr><td>0800 - FFFF</td><td>1110xxxx 10xxxxxx 10xxxxxx</td></tr>
</table>
<p class="MsoNormal">For example, the Unicode code point of the character "汉" is 6C49. 6C49 lies between 0800 and FFFF, so it needs the three-byte template: 1110xxxx 10xxxxxx 10xxxxxx. Written in binary, 6C49 is 0110 1100 0100 1001. Split this bit stream the way the three-byte template requires, into 0110 110001 001001, and substitute the groups for the x positions in order. The result is 1110-0110 10-110001 10-001001, that is, E6 B1 89, which is the UTF-8 encoding of the character.</p>
<p class="MsoNormal">While we are here, let me mention a famous oddity. Create a new file in Windows Notepad, type the two characters "联通" (the name of China Unicom), save, close, and open the file again. The two characters are gone, replaced by garbage! Heh, some people say this is why China Unicom cannot beat China Mobile.</p>
<p class="MsoNormal">The real cause is a collision between the GB2312 encoding and the UTF-8 encoding.</p>
<p class="MsoNormal">When you create a new text file, Notepad's default encoding is ANSI. If you type Chinese characters under ANSI, they are actually stored in a GB-family encoding, and in that encoding the internal code of "联通" is:</p>
<table>
<tr><th>Byte (hex)</th><th>Binary</th></tr>
<tr><td>c1</td><td>1100 0001</td></tr>
<tr><td>aa</td><td>1010 1010</td></tr>
<tr><td>cd</td><td>1100 1101</td></tr>
<tr><td>a8</td><td>1010 1000</td></tr>
</table>
<p class="MsoNormal">Notice anything? The first and second bytes, and likewise the third and fourth, begin with "110" and "10", which is exactly the two-byte template in the UTF-8 rules. So when the file is opened again, Notepad mistakes it for a UTF-8 file. Strip the 110 from the first byte and the 10 from the second and we get "00001 101010". Right-align the bits and pad with leading zeros and we get "0000 0000 0110 1010", which, sorry to say, is Unicode 006A, the lowercase letter "j". (A strict UTF-8 decoder would reject C1 AA, because a value that fits in one byte must not be spread over two; Notepad's encoding guess evidently does not check for that.) The next two bytes decode as UTF-8 to 0368, a combining mark (COMBINING LATIN SMALL LETTER C) that does not show up as a recognizable character on its own. That is why a file containing only the two characters "联通" cannot be displayed properly in Notepad.</p>
<p class="MsoNormal">If you type a few more characters after "联通", the bytes of those characters will not necessarily start with 110 and 10 as well. When the file is reopened, Notepad no longer insists that it is UTF-8 and reads it as ANSI instead, so the garbage does not appear.</p>
<p class="MsoNormal">Every monk who has been initiated into network programming knows that an important issue when passing data around is the order in which the high and low bytes of a value are read. Some computers send the low-order byte first, for example the Intel architecture used in our PCs; this is called little endian. Others send the high-order byte first; this is called big endian. To check that both sides agree on the byte order when exchanging text, a very simple method is used: a marker is sent at the start of the text stream. If the text that follows is high-byte-first, the marker is sent as "FEFF"; otherwise it is sent as "FFFE". If you don't believe it, open a UTF-16 file in a hex viewer and see whether the first two bytes are one of these pairs. (UTF-8 is a byte stream and has no byte-order issue; when a UTF-8 file carries the marker at all, it appears as the three bytes EF BB BF.)</p>
<p class="MsoNormal">A word on where the networking terms little endian and big endian come from. In <i>Gulliver's Travels</i>, the people of Lilliput split into factions over whether an egg should be cracked at the big end or the little end. Wars were fought over it, and an emperor even lost his life. In the development of computing, communication between different families of hardware ran into an equally serious problem over whether the big end or the little end goes first, so the more humorous among the engineers (which is to say the vast majority of them) adopted "endian", a term with strong political overtones.</p>
<p class="MsoNormal">All right, now I can finally answer NICO's question. In a database, string types with the n prefix are Unicode types. Such a type uses a fixed two bytes per character (at least for characters in the 0000 - FFFF range covered above), whether the character is Chinese, an English letter, or anything else.</p>
<p class="MsoNormal">The following example should show the difference between Unicode fields and ANSI fields.</p>
<p class="MsoNormal">Create a table with the following fields. The behavior described here is what Microsoft SQL Server shows with a Chinese collation; other database systems may count lengths differently.</p>
<p class="MsoNormal">nc nchar(10)</p>
<p class="MsoNormal">c char(10)</p>
<p class="MsoNormal">Then try to insert the following records:</p>
<p class="MsoNormal">"1234567890", "1234567890"</p>
<p class="MsoNormal">"一二三四五六七八九十", "一二三四五六七八九十"</p>
<p class="MsoNormal">For the first record, both fields accept all 10 characters, and neither has room for even one more.</p>
<p class="MsoNormal">For the second record, the nc field stores everything from "一" through "十", while the c field can only hold up to "五"; anything more causes an error.</p>
<p class="MsoNormal">Why? In an nchar field a Chinese character counts as one character, so a field 10 characters wide holds 10 Chinese characters. In a char field the width is counted in bytes and a Chinese character takes two of them, so a field 10 wide holds only 5 Chinese characters.</p>
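<p class="MsoNormal">For anyone who wants to check the numbers in this post, here is a minimal sketch in Python (my choice of language; the quoted passage contains no code). The function names <code>utf8_encode_bmp</code> and <code>lenient_two_byte_decode</code> are made up for illustration. Using the <code>gbk</code> codec as a stand-in for Notepad's ANSI code page on Simplified Chinese Windows, and GBK versus UTF-16 as the storage behind char and nchar, are assumptions on my part. It applies the three templates to "汉", repeats the "联通" misreading, shows the two byte-order markers, and counts the bytes in the two example records.</p>

```python
import codecs


def utf8_encode_bmp(code_point: int) -> bytes:
    """Encode one code point in U+0000..U+FFFF using the three templates."""
    if not 0 <= code_point <= 0xFFFF:
        raise ValueError("this sketch only covers U+0000..U+FFFF")
    if 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("surrogate code points cannot be encoded")
    if code_point <= 0x7F:                       # 0xxxxxxx
        return bytes([code_point])
    if code_point <= 0x7FF:                      # 110xxxxx 10xxxxxx
        return bytes([0xC0 | (code_point >> 6),
                      0x80 | (code_point & 0x3F)])
    return bytes([0xE0 | (code_point >> 12),     # 1110xxxx 10xxxxxx 10xxxxxx
                  0x80 | ((code_point >> 6) & 0x3F),
                  0x80 | (code_point & 0x3F)])


def lenient_two_byte_decode(lead: int, trail: int) -> int:
    """Strip the 110 and 10 prefixes WITHOUT the overlong check.

    Hypothetical helper: it imitates the bit-pattern reading described
    in the post, not what a conforming UTF-8 decoder does.
    """
    if lead >> 5 != 0b110 or trail >> 6 != 0b10:
        raise ValueError("not a 110xxxxx 10xxxxxx pair")
    return ((lead & 0x1F) << 6) | (trail & 0x3F)


def strict_utf8_accepts(data: bytes) -> bool:
    """Return True if Python's strict UTF-8 decoder accepts the bytes."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# 1. The three-byte template applied to U+6C49.
han_utf8 = utf8_encode_bmp(0x6C49)
print(han_utf8.hex())                       # e6b189

# 2. "联通" stored in GBK (assumed ANSI code page), then misread as UTF-8.
liantong_gbk = "联通".encode("gbk")
print(liantong_gbk.hex())                   # c1aacda8
misread = [lenient_two_byte_decode(liantong_gbk[i], liantong_gbk[i + 1])
           for i in (0, 2)]
print([hex(cp) for cp in misread])          # ['0x6a', '0x368']
print(strict_utf8_accepts(liantong_gbk))    # False

# 3. The byte-order marker U+FEFF in both byte orders.
bom_be = "\ufeff".encode("utf-16-be")       # high byte first
bom_le = "\ufeff".encode("utf-16-le")       # low byte first
print(bom_be.hex(), bom_le.hex())           # feff fffe

# 4. Byte counts behind the char(10) / nchar(10) example.
digits, numerals = "1234567890", "一二三四五六七八九十"
ansi_bytes = (len(digits.encode("gbk")), len(numerals.encode("gbk")))
utf16_units = (len(digits.encode("utf-16-le")) // 2,
               len(numerals.encode("utf-16-le")) // 2)
print(ansi_bytes, utf16_units)              # (10, 20) (10, 10)
```

<p class="MsoNormal">The last line is the whole story of the two fields: counted in two-byte units both strings are 10 long, but counted in GBK bytes the Chinese string is 20 long, so only its first five characters fit in 10 bytes.</p>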