innodb

最後更新: 2019-03-19

 

 


ENGINE

 

SHOW ENGINE INNODB STATUS \G;

*************************** 1. row ***************************
  Type: InnoDB
  Name:
Status:
=====================================
141215 16:32:13 INNODB MONITOR OUTPUT
=====================================
Per second averages calculated from the last 12 seconds
----------
SEMAPHORES
----------
OS WAIT ARRAY INFO: reservation count 1445, signal count 1391
Mutex spin waits 0, rounds 174859, OS waits 806
RW-shared spins 1397, OS waits 363; RW-excl spins 3892, OS waits 190
------------
TRANSACTIONS
------------
Trx id counter 0 350629575
Purge done for trx's n:o < 0 350628900 undo n:o < 0 0
History list length 67
LIST OF TRANSACTIONS FOR EACH SESSION:
---TRANSACTION 0 0, not started, process no 13553, OS thread id 2647452528
MySQL thread id 9027, query id 320348 localhost root
SHOW ENGINE INNODB STATUS
---TRANSACTION 0 0, not started, process no 13553, OS thread id 2655841136
MySQL thread id 2886, query id 264782 x.x.x.x mysql-server
---TRANSACTION 0 350627208, not started, process no 13553, OS thread id 2727345008
MySQL thread id 4, query id 319436 myserver 192.168.0.2 mysql-server
---TRANSACTION 0 350629519, ACTIVE 1 sec, process no 13553, OS thread id 2726742896, thread declared inside InnoDB 417
mysql tables in use 2, locked 2
MySQL thread id 49, query id 320254 myserver 192.168.0.2 mysql-server Sending data
select deliveryqueue_id,host into _id,_host from DeliveryQueue
  where user_id= NAME_CONST('_user_id',91)
  and status=0
  and (host not in (select host from SMTPHost where smtpserver_id= NAME_CONST('_smtpserver_id',5)))
  and (Find_IN_SET(host, NAME_CONST('_allow_list',NULL))  or  ( NAME_CONST('_allow_list',NULL) is null) )
  and ( (not Find_IN_SET(host, NAME_CONST('_block_list',NULL))) or ( NAME_CONST('_block_list',NULL) is null) )
  limit 1
Trx read view will not see trx with id >= 0 350629520, sees < 0 350629520
--------
FILE I/O
--------
I/O thread 0 state: waiting for i/o request (insert buffer thread)
I/O thread 1 state: waiting for i/o request (log thread)
I/O thread 2 state: waiting for i/o request (read thread)
I/O thread 3 state: waiting for i/o request (write thread)
Pending normal aio reads: 0, aio writes: 0,
 ibuf aio reads: 0, log i/o's: 0, sync i/o's: 0
Pending flushes (fsync) log: 0; buffer pool: 0
2256 OS file reads, 24713 OS file writes, 19498 OS fsyncs
1.08 reads/s, 23945 avg bytes/read, 14.58 writes/s, 8.17 fsyncs/s
-------------------------------------
INSERT BUFFER AND ADAPTIVE HASH INDEX
-------------------------------------
Ibuf: size 1, free list len 278, seg size 280,
836 inserts, 722 merged recs, 169 merges
Hash table size 1106407, node heap has 314 buffer(s)
188409.80 hash searches/s, 1884.68 non-hash searches/s
---
LOG
---
Log sequence number 5 1684606307
Log flushed up to   5 1684606307
Last checkpoint at  5 1684564401
0 pending log writes, 0 pending chkp writes
19101 log i/o's done, 7.83 log i/o's/second
----------------------
BUFFER POOL AND MEMORY
----------------------
Total memory allocated 310276636; in additional pool allocated 1048064
Dictionary memory allocated 134864
Buffer pool size   16384
Free buffers       12190
Database pages     3880
Modified db pages  102
Pending reads 0
Pending writes: LRU 0, flush list 0, single page 0
Pages read 3530, created 350, written 8549
1.58 reads/s, 0.00 creates/s, 8.58 writes/s
Buffer pool hit rate 1000 / 1000
--------------
ROW OPERATIONS
--------------
1 queries inside InnoDB, 0 queries in queue
2 read views open inside InnoDB
Main thread process no. 13553, id 2678983536, state: sleeping
Number of rows inserted 2843, updated 203389, deleted 5318, read 981977795
0.58 inserts/s, 4.92 updates/s, 2.08 deletes/s, 510744.44 reads/s
----------------------------
END OF INNODB MONITOR OUTPUT
============================

1 row in set (0.00 sec)

ERROR:
No query specified

 


Tablespaces

 

Tablespaces are physical datafiles stored in the host file system holding data for one or more tables and indexes.

# InnoDB
CREATE [UNDO] TABLESPACE tablespace_name
    [ADD DATAFILE 'file_name']
    [AUTOEXTEND_SIZE [=] value]
    [FILE_BLOCK_SIZE = value]
    [ENCRYPTION [=] {'Y' | 'N'}]

 


Buffer Pool

 

InnoDB manages the pool as a list, using a variation of the least recently used (LRU) algorithm.

"midpoint insertion strategy"

This algorithm keeps blocks that are heavily used by queries in the new sublist.
The old sublist contains less-used blocks; these blocks are candidates for eviction.

The LRU algorithm operates as follows by default:

3/8 of the buffer pool is devoted to the old sublist.

By default, blocks read by queries immediately move into the new sublist, meaning they will stay in the buffer pool for a long time. A table scan (such as performed for a mysqldump operation, or a SELECT statement with no WHERE clause) can bring a large amount of data into the buffer pool and evict an equivalent amount of older data, even if the new data is never used again.

# The default value is 128MB

innodb_buffer_pool_size=128M

OR

innodb_buffer_pool_size=134217728

innodb_buffer_pool_instances

* When the size of the buffer pool is greater than 1GB,
   setting innodb_buffer_pool_instances to a value greater than 1
   can improve the scalability on a busy server.

* The time to initialize the buffer pool is roughly proportional to its size. (Linux x86_64 server: 10GB / 6 sec)

   Divides the buffer pool into a user-specified number of separate regions,

   each with its own LRU list and related data structures,

   to reduce contention during concurrent memory read and write operations.

innodb_old_blocks_pct

# Specifies the approximate percentage of the buffer pool that InnoDB uses for the old block sublist.
# The range of values is 5 to 95.
# The default value is 37 (that is, 3/8 of the pool).

innodb_old_blocks_time (Unit: ms)

A block inserted into the old sublist must stay there after its first access before it can be moved to the new sublist.

The default value is 0:

A block inserted into the old sublist moves to the new sublist when Innodb has evicted 1/4 of the inserted block's pages from the buffer pool, no matter how soon after insertion the access occurs.

If the value is greater than 0:

blocks remain in the old sublist until an access occurs at least that many ms after the first access.

For example, a value of 1000 causes blocks to stay in the old sublist for 1 second after the first access before they become eligible to move to the new sublist.
 


Thread

 

# Default    0 (which is infinite concurrency)
innodb_thread_concurrency = 0

InnoDB tries to keep the number of operating system threads concurrently inside InnoDB less than or equal to the limit given by this variable. Once the number of threads reaches this limit, additional threads are placed into a wait state within a “First In, First Out” (FIFO) queue for execution. Threads waiting for locks are not counted in the number of concurrently executing threads.

When using LinuxThreads, you should see a minimum of three mysqld processes running

one thread for the LinuxThreads manager
one thread to handle connections
one thread to handle alarms and signals

 


IO

 

# Default    200
innodb_io_capacity

upper limit on the I/O activity performed by the InnoDB background tasks,

such as flushing pages from the buffer pool and merging data from the insert buffer.

 

# The number of I/O threads for read operations in InnoDB.
# Default: 4, Max: 64

innodb_read_io_threads=4

# The number of I/O threads for write operations in InnoDB.

innodb_write_io_threads =4

 


innodb log

 

log file 的數量及大小

Mysql 的 innodb Default 會有 2 個 log File  # (innodb_log_files_in_group) [redo log]

在 /var/lib/mysql 裡                               # (innodb_log_group_home_dir)

它們的大小為 8 MB                                   # (innodb_log_file_size)

[The larger the value, the less checkpoint flush activity is needed in the buffer pool, saving disk I/O.]

[range from 1MB to 1/N-th of the size of the buffer pool, where N is the number of log files in the group.]

i.e.

ib_logfile0
ib_logfile1

Checking

select @@innodb_log_buffer_size;

select @@innodb_log_files_in_group;

log file buffer

# log buffer 大小(Default 8M)

# A large log buffer enables large transactions to run without a need to write the log to disk before the transactions commit

innodb_log_buffer_size = 8M

P.S.

# A wait is required for it to be flushed before continuing.
# This status shows the number of times that the log buffer was too small.

SHOW GLOBAL STATUS LIKE 'innodb_log_waits';

修改 log file 的數量及大小

innodb_log_file_size = 10M
innodb_log_files_in_group = 4

當舊的 logfile 存在時就會有 Error (不論修改數量或大小)

141215 15:53:19  InnoDB: Completed initialization of buffer pool
InnoDB: Error: log file ./ib_logfile0 is of different size 0 5242880 bytes
InnoDB: than specified in the .cnf file 0 10485760 bytes!
141215 15:53:19 [ERROR] Plugin 'InnoDB' init function returned error.
141215 15:53:19 [ERROR] Plugin 'InnoDB' registration as a STORAGE ENGINE faile

innodb_fast_shutdown

1 (Default)

There are still some transactional data embedded in the old ib_logfiles in conjunction with the ibdata1 file.)

所以 remove logfile 前要將 innodb_fast_shutdown 改成 "0"

0

does a "full purge", "merge insert buffer", "flush dirty pages" before a shutdown

remove logfile Step

mysql -u... -p... -ANe"SET GLOBAL innodb_fast_shutdown = 0"

/etc/init.d/mysql.server stop

mv ib_logfile0 ib_logfile0.OLD

mv ib_logfile1 ib_logfile1.OLD

/etc/init.d/mysql.server start

 


Shrink/Purge ibdata1 file

 

When you delete innodb tables, MySQL does not free the space inside the ibdata file,

that's why it keeps growing. These files hardly ever shrink.

# enabled by default as of version 5.6.6 of MySQL.
# each table, including its indexes, is stored as a separate file

[mysqld]
innodb_file_per_table=1

When you delete some rows, the pages are marked as deleted to reuse later but the space is never recovered.

The only way is to start the database with fresh ibdata1.

To do that you would need to take a full logical backup with mysqldump.

Then stop MySQL and remove all the databases, ib_logfile* and ibdata* files.

When you start MySQL again it will create a new fresh shared tablespace.

Then, recover the logical dump.

Example

amavisd db

在 my.cnf 有 innodb_file_per_table 設定, 所有 Data 都不要的情況下

  1. mysqldump --routines --no-data amavisd -p > amavisd.sql
  2. service amavisd stop
  3. mysql -p -e 'DROP DATABASE amavisd'
  4. mysql -p -e 'CREATE DATABASE amavisd CHARACTER SET utf8'
  5. mysql amavisd -p < amavisd.sql

du -sh /var/lib/mysql/amavisd/

736K    /var/lib/mysql/amavisd/

 


innodb_file_per_table

 

如處

 * store specific tables on separate storage devices (SSD)

 * reclaim disk space when truncating or dropping

Truncating or dropping tables stored in the shared system tablespace creates free space internally in the system tablespace data files (ibdata files)

    which can only be used for new InnoDB data.

 * move individual InnoDB tables rather than entire databases.

 * Tables created in file-per-table tablespaces use the Barracuda file format.
    Which format enables features such as compressed and dynamic row formats.

Check Setting

SHOW VARIABLES LIKE 'innodb_file_per_table';

Naming

  • table_name.frm
  • table_name.ibd

.ibd

# The data file & index for file-per-table tablespaces and general tablespaces.
# The .ibd file extension does not apply to the system tablespace,
# which consists of one or more ibdata files.

.isl

# A file that specifies the location of an .ibd file for an InnoDB table
# You can store InnoDB tablespaces outside the database directory

i.e.

CREATE TABLE t1 (c1 INT PRIMARY KEY) DATA DIRECTORY = '/alternative/directory';

ls /alternative/directory

t1.frm
t1.isl

Performance

When using a shared table-space, read and write operations can sometimes/often be combined
so that the server reads a swatch of data from multiple tables in one go from ibdata.

However, if the data is spread out amongst multiple files,

then it has to perform a separate I/O operation for each one individually.

The real-world performance impact would depend on size, query frequency, and internal fragmentation of the shared table-space.

 


method used to flush

 

# Defines the method used to flush data to InnoDB data files and log files

# Default: NULL ( Linux:  fsync )

innodb_flush_method=?

fsync(0)           # uses the fsync() system call

O_DSYNC(1)    # uses O_SYNC to open and flush the log files,
                      # and fsync() to flush the data files.

littlesync(2)     # This option is used for internal performance testing

nosync(3)        # This option is used for internal performance testing

O_DIRECT(4)    # uses O_DIRECT to open the data files,

                       # and uses fsync() to flush both the data and log files.

O_DIRECT_NO_FSYNC(5)      # InnoDB uses O_DIRECT during flushing I/O,

                                           # but skips the fsync() system call afterward.

Check setting ("Innodb_data_fsyncs")

# The number of fsync() operations so far.

SHOW GLOBAL STATUS LIKE 'Innodb_data_fsyncs';

+--------------------+-------+
| Variable_name      | Value |
+--------------------+-------+
| Innodb_data_fsyncs | 3     |
+--------------------+-------+
1 row in set (0.00 sec)

O_DIRECT 應用場景:

數據庫系統(其高速緩存和IO優化機制均自成一體), 無需內核消耗CPU時間和內存去完成相同的任務.

 


OPTIMIZE TABLE Syntax

 

OPTIMIZE TABLE works for InnoDB, MyISAM, and ARCHIVE tables.

OPTIMIZE [NO_WRITE_TO_BINLOG | LOCAL] TABLE tbl_name [, tbl_name] ...

For InnoDB tables

OPTIMIZE TABLE is mapped to ALTER TABLE,

which rebuilds the table to update index statistics and free unused space in the clustered index.

creates a new .ibd file with a temporary name, using only the space required to store actual data.

using only the space required to store actual data.

When the optimization is complete, InnoDB removes the old .ibd file and replaces it with the new one.

For MyISAM tables

對 variable-length rows 才有幫助 (VARCHAR, VARBINARY, BLOB, or TEXT columns).

Keyword: LOCA (NO_WRITE_TO_BINLOG)

 * By default, the server writes OPTIMIZE TABLE statements to the binary log so that they replicate to replicas.

 


Converting an Existing Table to InnoDB

 

 

To convert a non-InnoDB table to use InnoDB use ALTER TABLE:

ALTER TABLE table_name ENGINE=InnoDB;

 


ibtmp1

 

ibtmp1 is a temporary working file for MySQL.

allows transactions to run without having to write the log to disk before the transactions commit.

Default Value

>= 5.7.6     16777216

<= 5.7.5     8388608

When you turn the MySQL server off and on again the file is deleted

Limit ibtmp1 size

# The ibtmp1 once created can't be shrink by any method without restarting mysql service.

# file_name:file_size[:autoextend[:max:max_file_size]]
innodb_temp_data_file_path = ibtmp1:12M:autoextend:max:1G

 


innodb row format

 

介紹

It determines how its rows are physically stored

 * affect the performance of queries

ROW_FORMAT:

Redundant, Compact, Dynamic, Compressed

P.S.

不同的 File format 支援不同的 ROW_FORMAT

Barracuda file format: All

Antelope file format: Redundant, Compact

Global Setting

Default row format for InnoDB tables is defined by "innodb_default_row_format variable"

# Checking

SELECT @@innodb_default_row_format;

dynamic

show variables like "innodb_file_format";

Barracuda

Row Format of a Table

# Defining the Row Format of a Table

CREATE TABLE t1 (c1 INT) ROW_FORMAT=DYNAMIC;

# Determining the Row Format of a Table

mysql> SHOW TABLE STATUS IN test1\G

REDUNDANT Row Format

store the first 768 bytes of variable-length column values in the index record within the B-tree node,
with the remainder stored on overflow pages.

If the value of a column is 768 bytes or less, an overflow page is not used,
and some savings in I/O may result, since the value is stored entirely in the B-tree node.

COMPACT Row Format

reduces row storage space by about 20% compared to the REDUNDANT row format,
at the cost of increasing CPU use for some operations.
If the workload is limited by CPU speed, compact format might be slower.

DYNAMIC Row Format

 * Barracuda file format

can store long variable-length column values(for VARCHAR, VARBINARY, and BLOB and TEXT types) fully off-page

Whether columns are stored off-page depends on the page size and the total size of the row.
When a row is too long, the longest columns are chosen for off-page storage until the clustered index record fits on the B-tree page.
TEXT and BLOB columns that are less than or equal to 40 bytes are stored in line.

Using DYNAMIC or COMPRESSED means that InnoDB stores varchar/text/blob fields that don't fit in the page completely off-page.

InnoDB only supports indexes of 767 bytes per column. You can raise this 3072 bytes by setting innodb_large_prefix=1 and using either DYNAMIC or COMPRESSED row format.

COMPRESSED Row Format

 * Barracuda file format

the KEY_BLOCK_SIZE option controls how much column data is stored in the clustered index,
and how much is placed on overflow pages.

 

 

 

 

 

 

Creative Commons license icon Creative Commons license icon