最後更新: 2018-05-29
目錄
- Charset 與 Collate
- Character Set
- DB, Table, Column Level 的 Character Set 設定
- Character set and Collation 的順序
- my.cnf
- SQL 更改 Character Set
- utf8 vs utf8mb4
-
PHP Settings
- php.ini
- Code - Troubleshoot
Charset 與 Collate
分別
- Charset: 儲存值時用
- Collate: 值比較時用 (簡單格式: bin)
The server sets below variables whenever the default database changes
- collation_database
- character_set_database
# To see the default character set and collation for a given database
USE db_name;
SELECT @@character_set_database, @@collation_database;
+--------------------------+----------------------+ | @@character_set_database | @@collation_database | +--------------------------+----------------------+ | utf8 | utf8_general_ci | +--------------------------+----------------------+
# Show MySQL default character set
SHOW VARIABLES LIKE 'char%';
+--------------------------+----------------------------+ | Variable_name | Value | +--------------------------+----------------------------+ | character_set_client | utf8 | | character_set_connection | utf8 | | character_set_database | latin1 | | character_set_filesystem | binary | | character_set_results | utf8 | | character_set_server | latin1 | | character_set_system | utf8 | | character_sets_dir | /usr/share/mysql/charsets/ | +--------------------------+----------------------------+
P.S.
character_set_database (Deprecated)
If there is no default database, the variable has the same value as character_set_server.
Checking
show charset;
SHOW VARIABLES WHERE Variable_name LIKE 'character\_set\_%' OR Variable_name LIKE 'collation%';
Character Set
查看本機支援什麼 Character Set:
# To see the default collation for each character set
mysql> SHOW CHARACTER SET;
- _ci 大小写不敏感
- _cs 大小写敏感
- _bin 二元
查看某 Character Set 的 collation
mysql> SHOW COLLATION WHERE Charset = 'big5';
+-----------------+---------+----+---------+----------+---------+ | Collation | Charset | Id | Default | Compiled | Sortlen | +-----------------+---------+----+---------+----------+---------+ | big5_chinese_ci | big5 | 1 | Yes | Yes | 1 | | big5_bin | big5 | 84 | | Yes | 1 | +-----------------+---------+----+---------+----------+---------+
utf8_*
utf8_bin: compare strings by the binary value of each character in the string
utf8_general_ci: compare strings using general language rules and using case-insensitive comparisons
utf8_general_cs: compare strings using general language rules and using case-sensitive comparisons
For example, the following will evaluate at true with either of the UTF8_general collations, but not with the utf8_bin collation: Code:
Ä = A Ö = O Ü = U
With the utf8_general_ci collation, they would also return true even if not the same case.
None of these is inherently "better"; they simply have different functionalities. You need to choose which one best suits your particular needs.
INFORMATION_SCHEMA.SCHEMATA
Database options are stored in the data dictionary and
can be examined by checking the INFORMATION_SCHEMA.SCHEMATA table.
DB, Table, Column Level 的 Character Set 設定
DB Level
CREATE DATABASE db_name [ DEFAULT CHARACTER SET charset_name [COLLATE collation_name] ]
CREATE DATABASE `domain_db` DEFAULT CHARACTER SET utf8
Table Level
CREATE TABLE tbl_name (column_list) [ DEFAULT CHARACTER SET charset_name [COLLATE collation_name] ]
Column Level
// CHAR、VARCHAR或TEXT類型
col_name {CHAR | VARCHAR | TEXT} (col_length) [CHARACTER SET charset_name [COLLATE collation_name]]
Character set and Collation 的順序
If both CHARACTER SET charset_name and COLLATE collation_name are specified,
- character set charset_name and collation collation_name are used.
If CHARACTER SET charset_name is specified without COLLATE,
character set charset_name and its default collation are used.
mysql> SHOW CHARACTER SET WHERE CHARSET = 'utf8';
+---------+---------------+-------------------+--------+ | Charset | Description | Default collation | Maxlen | +---------+---------------+-------------------+--------+ | utf8 | UTF-8 Unicode | utf8_general_ci | 3 | +---------+---------------+-------------------+--------+
If COLLATE collation_name is specified without CHARACTER SET,
- the character set associated with collation_name and collation collation_name are used
Otherwise (neither CHARACTER SET nor COLLATE is specified)
- the server character set and server collation are used.
If there is no default database,
the variables have the same value as the corresponding server-level system variables,
- character_set_server
- collation_server
my.cnf
For mariadb-5.5 (Centos 7)
/etc/my.cnf.d/server.cnf
[mysqld] character-set-server=utf8
/etc/my.cnf.d/client.cnf
[client] default-character-set=utf8
* character-set 有分 server 及 client 的 !!!
* It is still necessary for applications to configure their connection using "SET NAMES" after they connect
SQL 更改 Character Set
SET NAMES utf8;
相當於
SET character_set_client = utf8; SET character_set_connection = utf8; SET character_set_results = utf8;
Remark: Checking
SHOW VARIABLES LIKE 'char%';
SHOW VARIABLES LIKE 'collation%';
SET CHARACTER SET utf8;
相當於
SET character_set_client = utf8;
SET character_set_results = utf8;
SET collation_connection = @@collation_database;
P.S.
character_set 與 collation 是兩樣不同東西來
可用的 CHARACTER SET
SHOW CHARACTER SET;
+----------+-----------------------------+---------------------+--------+ | Charset | Description | Default collation | Maxlen | +----------+-----------------------------+---------------------+--------+ | big5 | Big5 Traditional Chinese | big5_chinese_ci | 2 | ...
utf8 vs utf8mb4
The utf8 character set in MySQL only supports a subset of Unicode characters,
specifically those that can be represented with 3 bytes.
It cannot handle certain characters: emojis.
The utf8mb4 is an extension of utf8 that supports the full range of Unicode characters (4 bytes)
Converting from utf8mb4 to utf8
1. Backup
2. DATABASE
ALTER DATABASE database_name CHARACTER SET utf8 COLLATE utf8_general_ci;
3. TABLE
SHOW TABLES;
ALTER TABLE table_name CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
PHP Settings
php.ini
default_charset = "UTF-8"
Code
Version >= 5.2.3
bool mysql_set_charset ( string $charset [, resource $link_identifier = NULL ] )
<?php $link = mysql_connect('localhost','user1','pass1',TRUE); mysql_selectdb('db1',$link); mysql_set_charset('utf8',$link); ?>
Troubleshoot
解決: had no or invalid character set, and default character set is multi-byte, so character column sizes may have changed
[mysqld] init-connect='SET NAMES big5' # character set (--character-set-server=) character-set-server = big5 # collation (--collation-server=) collation-server=big5_chinese_ci default-character-set = big5
mysql> show variables like 'char%';
mysql> show variables like 'collation%';
Mysql 4.0 Character Sets and Corresponding Mysql 4.1 Character Set/Collation Pairs
If you want to start mysqld from a 4.1.x distribution with data created by MySQL 4.0,
you should start the server with the same character set and collation.
In this case, you won't need to reindex your data.
4.0 4.1 4.1 Collation big5 big5 big5_chinese_ci
Change Tables Character Set
If you use ALTER TABLE to convert a column from one character set to another,
MySQL attempts to map the data values,
but if the character sets are incompatible, there may be data loss.