Last updated: 2019-10-23
RAM usage
free -m
             total       used       free     shared    buffers     cached
Mem:          3133       2705        428          0         84       2269
-/+ buffers/cache:        352       2781   # <= real used and free after subtracting cache and buffers
Swap:         1906          1       1905
Real free and usage:
total used = real_usage + buffers + cached = 352 + 84 + 2269 = 2705
real free = total - real_usage = 3133 - 352 = 2781
Summary:
Only the second line (-/+ buffers/cache) is really useful.
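The arithmetic above can be sketched as a small shell helper. This is only an illustration: the function name real_usage is made up here, and it assumes the old free layout shown above, where buffers and cached are the 6th and 7th fields of the Mem: line.

```shell
# real_usage: read old-style `free -m` output on stdin and print
# "<real_used> <real_free>" computed from the Mem: line.
# Assumes the column order: total used free shared buffers cached.
real_usage() {
    awk '$1 == "Mem:" {
        used = $3; free = $4; buffers = $6; cached = $7
        # real used excludes buffers/cache; real free includes them
        print used - buffers - cached, free + buffers + cached
    }'
}
```

Usage on a system whose free still prints the buffers and cached columns: free -m | real_usage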
PAGE_SIZE
# A page is a fixed length block of main memory
# The kernel swaps and allocates memory in units of pages
getconf PAGE_SIZE
4096
buffer vs. cache
Buffer: temporarily holds data for active I/O operations (inodes, filesystem metadata)
Read/write buffer for block devices
Cache: frequently accessed data (the result of completed I/O operations) (related to write-through and write-back)
Filesystem cache
Write-through / back
Write-through: write is done synchronously both to the cache and to the backing store.
Write-back: writing is done only to the cache.
The write to the backing store is postponed until the modified content is about to be replaced by another cache block.
Each cache entry in memory has a (dirty) bit indicating that the data has been modified by the CPU but not yet written back to the storage device.
Diagram
CPU <-> Cache Buffer <-> Device
drop_caches
# To free pagecache:
echo 1 > /proc/sys/vm/drop_caches
# To free dentry and inodes (slab cache memory):
echo 2 > /proc/sys/vm/drop_caches
* Dirty objects cannot be freed; run sync first.
Remark
- An inode is a data structure that represents a file
- A dentry is a data structure that represents a directory entry (it maps a name to an inode)
# To free pagecache, dentries and inodes:
echo 3 > /proc/sys/vm/drop_caches
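Since dirty objects cannot be freed, the sync and the write to drop_caches usually go together. A minimal sketch of a wrapper follows; the function name and its optional second argument (an alternative target file, handy for a dry run) are assumptions made for this example, not an existing tool. Writing to the real /proc file requires root.

```shell
# drop_caches MODE [FILE]
#   MODE 1 = pagecache, 2 = dentries and inodes, 3 = both.
#   FILE defaults to /proc/sys/vm/drop_caches (hypothetical override for dry runs).
drop_caches() {
    mode=$1
    target=${2:-/proc/sys/vm/drop_caches}
    case $mode in
        1|2|3) ;;
        *) echo "usage: drop_caches 1|2|3 [file]" >&2; return 2 ;;
    esac
    # Flush dirty pages first so that they become clean and can be dropped
    sync || return 1
    echo "$mode" > "$target"
}
```

Usage (as root): drop_caches 3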
Kernel swap preference (swappiness)
Linux uses a Split Least Recently Used (LRU) page replacement strategy.
Check the current value
cat /proc/sys/vm/swappiness
60 <--- default
value:
0: The kernel will swap only to avoid an out of memory condition.
100: The kernel will swap aggressively. (prefer to find inactive pages and swap them out)
Change the setting
sysctl -w vm.swappiness=30
OR
/etc/sysctl.conf
vm.swappiness=60
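The notes switch between the sysctl key (vm.swappiness) and the file under /proc/sys; both name the same setting. A small sketch of the mapping, with a made-up helper name, assuming keys whose components contain no dots of their own (true for the vm.* keys used here):

```shell
# sysctl_path KEY: print the /proc/sys file that corresponds to a sysctl key.
# Simply turns every "." into "/", so it is only valid for keys without
# embedded dots in a component (fine for vm.* keys).
sysctl_path() {
    printf '/proc/sys/%s\n' "$(printf '%s' "$1" | tr . /)"
}
```

Usage: cat "$(sysctl_path vm.swappiness)"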
vm.min_free_kbytes
Setting
# Minimum amount of free memory the system maintains, in KiB (not pages); when free memory drops below N, kswapd starts working
vm.min_free_kbytes = 102400
min_free_kbytes
allows this memory to be instantly available and reduces the memory pressure when new processes need to start,
run and finish while there is a high memory load and a full buffer cache.
This controls the amount of memory that is kept free for use by special reserves including "atomic" allocations
Set too low
prevents the system from reclaiming memory. (Under pressure this can end in OOM-killing multiple processes.)
Set too high (5-10% of total memory)
results in the system spending too much time reclaiming memory.
(Reason: Linux is designed to use all available RAM to cache file system data.)
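To judge whether a candidate value falls into the "too high" range, it helps to express it as a percentage of total RAM. A rough sketch; the function name is hypothetical and both arguments are expected in kB (for example MemTotal from /proc/meminfo):

```shell
# min_free_percent MIN_FREE_KB TOTAL_KB
# Print min_free_kbytes as a percentage of total memory, two decimals.
min_free_percent() {
    awk -v m="$1" -v t="$2" 'BEGIN { printf "%.2f\n", m * 100 / t }'
}
```

Usage: min_free_percent 102400 "$(awk '$1 == "MemTotal:" { print $2 }' /proc/meminfo)"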
pdflush (Writeout of dirty data)
dirty_background_ratio
cat /proc/sys/vm/dirty_background_ratio
10
# Unit: %
# Writeout of dirty data begins in the background
dirty_ratio
# absolute maximum amount of system memory that can be filled with dirty pages
cat /proc/sys/vm/dirty_ratio
20
dirty_ratio vs dirty_bytes
dirty_bytes contains the amount of dirty memory at which a process generating disk writes will itself start writeback.
dirty_bytes is the counterpart of dirty_ratio. Only one of them may be specified at a time.
When one sysctl is written it is immediately taken into account to evaluate the dirty memory limits and
the other appears as 0 when read.
page-cluster
logarithmic value (base 2): 0 => 1 page (disables swap readahead), 1 => 2 pages, 2 => 4 pages
number of pages up to which consecutive pages are read in from swap in a single attempt.
Default: 3
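Because the value is a base-2 logarithm, the number of pages read per attempt is 2 to the power of page-cluster. A one-line sketch (the function name is made up for this example):

```shell
# page_cluster_pages N: number of pages read from swap in one attempt
# for a given vm.page-cluster value N (2^N).
page_cluster_pages() {
    echo $(( 1 << $1 ))
}
```

Usage: page_cluster_pages "$(cat /proc/sys/vm/page-cluster)"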
vfs_cache_pressure
percentage value
controls the tendency of the kernel to reclaim the memory which is used for caching of directory and inode objects
n < 100 => the kernel prefers to retain dentry and inode caches
Default: 100
value
0: never reclaim (can easily lead to out-of-memory conditions)
100: reclaim at a "fair" rate
>100: prefer to reclaim (may have a negative performance impact)
overcommit_memory (vm.overcommit_memory)
Default value is 0
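# 0: allocate only if there appears to be enough memory (OVERCOMMIT_GUESS) [heuristic]
# 1: always allow the allocation, regardless of the current memory state (OVERCOMMIT_ALWAYS)
# 2: OVERCOMMIT_NEVER (total commit must stay below swap + Total_RAM x /proc/sys/vm/overcommit_ratio %)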
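The limit that applies in mode 2 can be estimated from the formula above. The sketch below is an approximation under stated assumptions: the function name is hypothetical, overcommit_ratio is treated as a percentage, and huge pages (which the kernel excludes from the RAM part) are ignored. The kernel's own figure is the CommitLimit line in /proc/meminfo.

```shell
# commit_limit RAM SWAP RATIO
# Approximate commit limit in mode 2: SWAP + RAM * RATIO / 100.
# RAM and SWAP must use the same unit; the result is in that unit.
commit_limit() {
    echo $(( $2 + $1 * $3 / 100 ))
}
```

With the figures from the free output at the top (3133 MB RAM, 1906 MB swap) and a ratio of 50, commit_limit 3133 1906 50 gives 3472 MB.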
/proc/meminfo
Buffers
The amount, in kibibytes, of temporary storage for raw disk blocks.
SwapCached
The amount of memory, in kibibytes, that has once been moved into swap,
then back into the main memory, but still also remains in the swapfile.
This saves I/O, because the memory does not need to be moved into swap again.
Active
The amount of memory, in kibibytes, that has been used more recently and is usually not reclaimed unless absolutely necessary.
Active(anon)
The amount of anonymous and tmpfs/shmem memory, in kibibytes, that is in active use,
or was in active use since the last time the system moved something to swap.
Active(file)
The amount of file cache memory, in kibibytes, that is in active use,
or was in active use since the last time the system reclaimed memory.
Unevictable
The amount of memory, in kibibytes, discovered by the pageout code,
that is not evictable because it is locked into memory by user programs.
Mlocked
The total amount of memory, in kibibytes,
that is not evictable because it is locked into memory by user programs.
Dirty
The total amount of memory, in kibibytes, waiting to be written back to the disk.
Writeback
The total amount of memory, in kibibytes, actively being written back to the disk.
Shmem
The total amount of memory, in kibibytes, used by shared memory (shmem) and tmpfs.
Shmem in /proc/meminfo = shared in free
Slab
The total amount of memory, in kibibytes, used by the kernel to cache data structures for its own use.
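Any of the fields above can be pulled out with a short awk filter. A minimal sketch; the function name and the optional file argument (useful for reading a saved copy) are assumptions for this example:

```shell
# meminfo_field NAME [FILE]
# Print the numeric value of one /proc/meminfo field, e.g. Dirty or "Active(anon)".
# FILE defaults to /proc/meminfo.
meminfo_field() {
    awk -v key="$1:" '$1 == key { print $2 }' "${2:-/proc/meminfo}"
}
```

Usage: meminfo_field Shmem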
Dirty Memory
Page cache (disk cache) is used to reduce the number of disk reads.
Setting: /etc/sysctl.conf
dirty ratio
Dirty memory: | no writeback  B  background (nonblocking) writeback  H  writers block |
Background limit (B)
# the kernel starts writing out dirty data in the background
# triggers pdflush/flush/kdmflush
# a percentage of total available memory
vm.dirty_background_ratio = 10
Hard limit (H)
# a process doing writes blocks and waits for the kernel to write dirty pages out to disk
vm.dirty_ratio = 20
# How often the pdflush/flush/kdmflush processes wake up. Unit: hundredths of a second
vm.dirty_writeback_centisecs = 500
# How old a dirty page must be before the next pdflush run writes it out (purpose: safeguard against data loss)
vm.dirty_expire_centisecs = 3000
_bytes 與 _ratio
# If you set the _bytes version the _ratio version will become 0, and vice-versa.
vm.dirty_background_bytes and vm.dirty_bytes
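To move from a _ratio setting to the equivalent _bytes setting, the percentage has to be turned into an absolute amount. The helper below is only a rough sketch: its name is made up, and it takes the "available memory" figure as an argument, whereas the kernel uses its own notion of dirtyable memory, so the result is an estimate.

```shell
# dirty_limit_kb AVAILABLE_KB RATIO
# Approximate dirty threshold in kB for a given ratio (percent).
# Multiply the result by 1024 to get a value suitable for the _bytes sysctls.
dirty_limit_kb() {
    echo $(( $1 * $2 / 100 ))
}
```

Usage: dirty_limit_kb "$(awk '$1 == "MemAvailable:" { print $2 }' /proc/meminfo)" 10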
egrep -w "Dirty|Writeback" /proc/meminfo
Dirty: 200 kB # drops to 0 after sync
Writeback: 0 kB
egrep "nr_dirty|nr_writeback" /proc/vmstat
nr_dirty 50 # nr = number
nr_writeback 0
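The two views agree once pages are converted to kB: /proc/vmstat counts pages, /proc/meminfo reports kB. A small sketch of the conversion (hypothetical function name; the page size defaults to getconf PAGE_SIZE):

```shell
# pages_to_kb PAGES [PAGE_SIZE_BYTES]
# Convert a page count (as in /proc/vmstat) to kB (as in /proc/meminfo).
pages_to_kb() {
    echo $(( $1 * ${2:-$(getconf PAGE_SIZE)} / 1024 ))
}
```

With a 4096-byte page, pages_to_kb 50 4096 gives 200, matching nr_dirty 50 and Dirty: 200 kB.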
Read cache test
Test
echo 1 > /proc/sys/vm/drop_caches
free -m
total used free shared buff/cache available
Mem: 15857 7576 8091 27 189 8002
Swap: 7167 922 6245
cat vda.qcow2 > /dev/null
free -m
total used free shared buff/cache available
Mem: 15857 7576 6362 27 1917 7940
Swap: 7167 922 6245
buff/cache grows from 189 to 1917
Reading a file with cat, cp, or dd puts its contents into the page cache.
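The before/after comparison can be scripted by picking the buff/cache column out of the free output. A minimal sketch, assuming the newer free layout shown in this test (buff/cache is the 6th field of the Mem: line); the function name is made up:

```shell
# buff_cache: read new-style `free -m` output on stdin and print the
# buff/cache value from the Mem: line.
# Assumes the column order: total used free shared buff/cache available.
buff_cache() {
    awk '$1 == "Mem:" { print $6 }'
}
```

Usage: before=$(free -m | buff_cache); cat vda.qcow2 > /dev/null; after=$(free -m | buff_cache); echo $(( after - before ))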
Bypass Copy Using Cache
fincore
count pages of file contents in core
Others