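Standard HugePages

Standard HugePages were introduced in the Linux 2.6 kernel.
They use a larger memory page size to keep pace with ever-growing system memory, and they let the operating system take advantage of the large-page support in modern hardware architectures.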

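Transparent HugePages

Transparent Huge Pages, abbreviated THP, are enabled by default for all applications in RHEL 6.
The kernel tries to allocate huge pages whenever possible, and the main kernel address space itself is mapped with huge pages, which reduces TLB pressure from kernel code.
The kernel always attempts to satisfy a memory allocation with huge pages.
If no huge pages are available (for example, because physically contiguous memory cannot be found), the kernel falls back to regular 4 KB pages.
THP is also swappable, unlike hugetlbfs.
Swapping works by splitting a huge page into smaller 4 KB pages, which are then swapped out normally.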

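Problems with Transparent HugePages:

In its testing, the Oracle Linux team found that I/O read/write performance dropped by 30% with THP enabled on Linux,
and returned to normal once THP was disabled.
Oracle also advises against using THP with Oracle Database.
Oracle officially recommends not enabling THP on RedHat 6, OEL 6, SLES 11, and UEK2 kernels, because of the following problems:

  1. In RAC environments, THP can cause unexpected node reboots and performance problems;
  2. In single-instance environments, THP can also cause unexpected performance problems.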

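Differences between standard HugePages and Transparent HugePages:

The two differ in how huge pages are allocated: standard HugePages are preallocated, while Transparent HugePages are allocated dynamically.
Currently, mixing Transparent HugePages with traditional HugePages causes problems, leading to performance issues and system reboots.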

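How to enable standard HugePages

Applies to:
1. Kernel version 2.6 and later.
2. Oracle Automatic Memory Management (AMM) is incompatible with HugePages, so make sure AMM is disabled before enabling HugePages.

Steps to enable HugePages:

1 Run the following command to determine whether the kernel supports HugePages
$ grep Huge /proc/meminfo

2 Configure memlock
Set the memlock value in /etc/security/limits.conf. The memlock setting is in KB.
When HugePages is enabled, the maximum locked memory limit should be at least 90% of the server's current memory.
When HugePages is disabled, the maximum locked memory limit should be at least 3145728 KB (3 GB).
For example, with 64 GB of RAM installed, add the following entries to raise the maximum locked memory address space:

* soft memlock 60397977
* hard memlock 60397977

You can also set memlock to a value higher than the SGA requires.
Log in again as the oracle user and run ulimit -l to verify the new memlock setting:
su - oracle
$ ulimit -l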
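
The 90% figure in the example can be reproduced with integer arithmetic. The following is a minimal sketch; the function name `memlock_kb` is made up for illustration, and the commented line that reads MemTotal is only one possible way to obtain the RAM size on a live system (MemTotal is usually slightly lower than the installed RAM).

```shell
#!/bin/sh
# 90% of total RAM, in KB, using integer arithmetic (the fraction is truncated).
memlock_kb() {
    total_kb=$1
    echo $(( total_kb * 90 / 100 ))
}

# 64 GB expressed in KB, matching the example in the text.
ram_kb=$(( 64 * 1024 * 1024 ))
limit=$(memlock_kb "$ram_kb")
printf '* soft memlock %s\n* hard memlock %s\n' "$limit" "$limit"

# On a live system, MemTotal from /proc/meminfo could be used instead:
# memlock_kb "$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)"
```

For 64 GB this prints the same 60397977 value used in the limits.conf entries.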
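3 Start the instance
Check whether the instance is running:
srvctl status instance -d dbname -i instance_name
If it is not running, start it manually:
srvctl start instance -d dbname -i instance_name -o open

4 Use the script to compute the recommended HugePages setting for the current shared memory segments
Run as root:
chmod +x hugepages_settings.sh
./hugepages_settings.sh
The script comes from My Oracle Support note 401749.1; its full contents are listed at the end of this article.

5 Stop the database instance
srvctl stop instance -d dbname -i instance_name -o immediate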
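6 Set the vm.nr_hugepages kernel parameter
Write it to the configuration file so that it persists across reboots:

vi /etc/sysctl.conf
vm.nr_hugepages=<value from above>
sysctl -p
# sysctl -w vm.nr_hugepages=<value from above>    (temporary change, lost on reboot)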
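
Editing /etc/sysctl.conf by hand can leave duplicate entries if the step is repeated. As a hedged sketch, a helper along these lines replaces any existing assignment before appending the new one. The function name `set_sysctl_value` is hypothetical, the demonstration runs against a scratch file rather than the real /etc/sysctl.conf, and the value 179 is only a sample page count.

```shell
#!/bin/sh
# Replace or append "key = value" in a sysctl-style file.
# usage: set_sysctl_value FILE KEY VALUE
set_sysctl_value() {
    conf_file=$1
    key=$2
    value=$3
    tmp_file=$(mktemp) || return 1
    if [ -f "$conf_file" ]; then
        # Keep every line whose name (text before "=") is not exactly KEY.
        awk -F= -v k="$key" '{ name = $1; gsub(/[ \t]/, "", name); if (name != k) print }' \
            "$conf_file" > "$tmp_file"
    fi
    printf '%s = %s\n' "$key" "$value" >> "$tmp_file"
    # Overwrite in place so the original file's ownership and mode are kept.
    cat "$tmp_file" > "$conf_file"
    rm -f "$tmp_file"
}

# Demonstration on a scratch file.
demo_conf=$(mktemp)
printf 'vm.swappiness = 10\nvm.nr_hugepages=100\n' > "$demo_conf"
set_sysctl_value "$demo_conf" vm.nr_hugepages 179
cat "$demo_conf"
```

Against the real file, the call would be followed by `sysctl -p` as in the step above.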

7 Start the instance
srvctl start instance -d dbname -i instance_name -o open

8 Check the available HugePages
$ grep Huge /proc/meminfo
If the setting has not taken effect, reboot the server.
[root@rac1 ~]# grep Huge /proc/meminfo
AnonHugePages: 0 kB
HugePages_Total: 179
HugePages_Free: 9
HugePages_Rsvd: 7
HugePages_Surp: 0
Hugepagesize: 2048 kB
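
To read these counters at a glance, the fields can be condensed into one line. The sketch below uses a made-up function name, `hugepage_summary`, and feeds it the sample values from the output above. It relies on HugePages_Rsvd counting pages that are reserved but not yet allocated, so free minus reserved is the number of pages nobody has claimed yet.

```shell
#!/bin/sh
# Summarize HugePages counters from /proc/meminfo-style text on stdin.
hugepage_summary() {
    awk '
        $1 == "HugePages_Total:" { total = $2 }
        $1 == "HugePages_Free:"  { free = $2 }
        $1 == "HugePages_Rsvd:"  { rsvd = $2 }
        $1 == "Hugepagesize:"    { size_kb = $2 }
        END {
            printf "total=%d free=%d rsvd=%d unreserved_free=%d pool_kb=%d\n",
                   total, free, rsvd, free - rsvd, total * size_kb
        }'
}

# Sample values copied from the output in the text.
sample='HugePages_Total: 179
HugePages_Free: 9
HugePages_Rsvd: 7
HugePages_Surp: 0
Hugepagesize: 2048 kB'
printf '%s\n' "$sample" | hugepage_summary
# On a live system: hugepage_summary < /proc/meminfo
```

With the sample values, 2 of the 179 pages are both free and unreserved, and the pool occupies 366592 KB.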
References:

My Oracle Support note 401749.1
My Oracle Support note 361323.1
Database Administrator's Reference for Linux and UNIX System-Based Operating Systems

How to disable Transparent HugePages

THP is enabled by default on RHEL 7. Check the release of the system:
[root@DB ~]# cat /etc/redhat-release
Red Hat Enterprise Linux Server release 7.2 (Maipo)
Disable THP. First check the current state:
[root@DB ~]# cat /sys/kernel/mm/transparent_hugepage/enabled
[always] madvise never
The default state is always; it needs to be changed to never.
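
The active mode is the token in square brackets. For scripting, it can be extracted with a small filter; this is a sketch, and `thp_active_mode` is an illustrative name rather than an existing tool.

```shell
#!/bin/sh
# Print the bracketed (active) choice from a line such as "[always] madvise never".
thp_active_mode() {
    sed -n 's/.*\[\([a-z]*\)].*/\1/p'
}

echo '[always] madvise never' | thp_active_mode
echo 'always madvise [never]' | thp_active_mode
# On a live system:
# thp_active_mode < /sys/kernel/mm/transparent_hugepage/enabled
```

The two sample lines print `always` and `never` respectively.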
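Several ways to disable THP

Method 1:

[root@DB ~]# vi /etc/default/grub
GRUB_CMDLINE_LINUX="rd.lvm.lv=rhel/swap rd.lvm.lv=rhel/root rhgb quiet transparent_hugepage=never"
Run the following command to regenerate the GRUB configuration; the change takes effect after the next reboot:
[root@DB ~]# grub2-mkconfig -o /boot/grub2/grub.cfg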
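
After the reboot, one way to confirm that the boot parameter was picked up is to look for it on the kernel command line. The sketch below parses a command line passed on stdin; `cmdline_thp_setting` is a hypothetical helper name, and the sample string is the GRUB_CMDLINE_LINUX value from this method.

```shell
#!/bin/sh
# Print the value of transparent_hugepage= from a kernel command line on stdin.
# Prints nothing when the parameter is absent.
cmdline_thp_setting() {
    tr ' ' '\n' | sed -n 's/^transparent_hugepage=//p'
}

echo 'rd.lvm.lv=rhel/swap rd.lvm.lv=rhel/root rhgb quiet transparent_hugepage=never' | cmdline_thp_setting
# After rebooting, the live value can be read with:
# cmdline_thp_setting < /proc/cmdline
```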

Method 2:
[root@DB ~]# vi /etc/rc.local
if test -f /sys/kernel/mm/transparent_hugepage/enabled; then
    echo never > /sys/kernel/mm/transparent_hugepage/enabled
fi
if test -f /sys/kernel/mm/transparent_hugepage/defrag; then
    echo never > /sys/kernel/mm/transparent_hugepage/defrag
fi

Once the script has run (for example, after a reboot), verify:
[root@DB ~]# cat /sys/kernel/mm/transparent_hugepage/enabled
always madvise [never]

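Method 3:

[root@DB ~]# echo never > /sys/kernel/mm/transparent_hugepage/enabled
[root@DB ~]# cat /sys/kernel/mm/transparent_hugepage/enabled
always madvise [never]
This change applies to the running system only and is lost on reboot.

Check whether THP is disabled
[root@DB ~]# cat /sys/kernel/mm/transparent_hugepage/enabled
always madvise [never]
[always] in the output means THP is enabled; [never] means THP is disabled.
On RHEL 6 kernels the path is /sys/kernel/mm/redhat_transparent_hugepage/enabled.
[root@DB ~]# grep -i HugePages_Total /proc/meminfo
[root@DB ~]# cat /proc/sys/vm/nr_hugepages
Both of these report the standard (preallocated) HugePages pool, not THP; a value of 0 means that no standard HugePages are configured. THP usage appears in the AnonHugePages field of /proc/meminfo.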
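
Two sysfs locations for THP appear in this article: transparent_hugepage and the RHEL 6 specific redhat_transparent_hugepage. A check script can probe for whichever one exists. The following is a sketch only; `find_thp_dir` is a made-up name, and it takes the sysfs root as an optional argument so that it can be exercised against a scratch directory instead of the real /sys.

```shell
#!/bin/sh
# Print the THP sysfs directory under a given sysfs root (default /sys).
# Returns non-zero when neither layout is present.
find_thp_dir() {
    sys_root=${1:-/sys}
    for name in transparent_hugepage redhat_transparent_hugepage; do
        if [ -f "$sys_root/kernel/mm/$name/enabled" ]; then
            echo "$sys_root/kernel/mm/$name"
            return 0
        fi
    done
    return 1
}

# Demonstration against a scratch directory that mimics the RHEL 6 layout.
fake_sys=$(mktemp -d)
mkdir -p "$fake_sys/kernel/mm/redhat_transparent_hugepage"
echo 'always madvise [never]' > "$fake_sys/kernel/mm/redhat_transparent_hugepage/enabled"
thp_dir=$(find_thp_dir "$fake_sys")
cat "$thp_dir/enabled"
```

On a real host, calling `find_thp_dir` with no argument probes /sys directly.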

The contents of hugepages_settings.sh are as follows:

#!/bin/bash
#
# hugepages_settings.sh
#
# Linux bash script to compute values for the
# recommended HugePages/HugeTLB configuration
# on Oracle Linux
#
# Note: This script does calculation for all shared memory
# segments available when the script is run, no matter it
# is an Oracle RDBMS shared memory segment or not.
#
# This script is provided by Doc ID 401749.1 from My Oracle Support
# 
# Welcome text
echo "
This script is provided by Doc ID 401749.1 from My Oracle Support
where it is intended to compute values for
the recommended HugePages/HugeTLB configuration for the current shared
memory segments on Oracle Linux. Before proceeding with the execution please note following:
 * For ASM instance, it needs to configure ASMM instead of AMM.
 * The 'pga_aggregate_target' is outside the SGA and
   you should accommodate this while calculating the overall size.
 * In case you changes the DB SGA size,
   as the new SGA will not fit in the previous HugePages configuration,
   it had better disable the whole HugePages,
   start the DB with new SGA size and run the script again.
And make sure that:
 * Oracle Database instance(s) are up and running
 * Oracle Database 11g Automatic Memory Management (AMM) is not setup
   (See Doc ID 749851.1)
 * The shared memory segments can be listed by command:
     # ipcs -m
Press Enter to proceed..."
read
# Check for the kernel version
KERN=`uname -r | awk -F. '{ printf("%d.%d\n",$1,$2); }'`
# Find out the HugePage size
HPG_SZ=`grep Hugepagesize /proc/meminfo | awk '{print $2}'`
if [ -z "$HPG_SZ" ];then
    echo "The hugepages may not be supported in the system where the script is being executed."
    exit 1
fi
# Initialize the counter
NUM_PG=0
# Cumulative number of pages required to handle the running shared memory segments
for SEG_BYTES in `ipcs -m | cut -c44-300 | awk '{print $1}' | grep "[0-9][0-9]*"`
do
    MIN_PG=`echo "$SEG_BYTES/($HPG_SZ*1024)" | bc -q`
    if [ $MIN_PG -gt 0 ]; then
        NUM_PG=`echo "$NUM_PG+$MIN_PG+1" | bc -q`
    fi
done
RES_BYTES=`echo "$NUM_PG * $HPG_SZ * 1024" | bc -q`
# An SGA less than 100MB does not make sense
# Bail out if that is the case
if [ $RES_BYTES -lt 100000000 ]; then
    echo "***********"
    echo "** ERROR **"
    echo "***********"
    echo "Sorry! There are not enough total of shared memory segments allocated for
HugePages configuration. HugePages can only be used for shared memory segments
that you can list by command:
    # ipcs -m
of a size that can match an Oracle Database SGA. Please make sure that:
 * Oracle Database instance is up and running
 * Oracle Database 11g Automatic Memory Management (AMM) is not configured"
    exit 1
fi
# Finish with results
case $KERN in
    '2.4') HUGETLB_POOL=`echo "$NUM_PG*$HPG_SZ/1024" | bc -q`;
           echo "Recommended setting: vm.hugetlb_pool = $HUGETLB_POOL" ;;
    '2.6') echo "Recommended setting: vm.nr_hugepages = $NUM_PG" ;;
    '3.8') echo "Recommended setting: vm.nr_hugepages = $NUM_PG" ;;
    '3.10') echo "Recommended setting: vm.nr_hugepages = $NUM_PG" ;;
    '4.1') echo "Recommended setting: vm.nr_hugepages = $NUM_PG" ;;
    '4.14') echo "Recommended setting: vm.nr_hugepages = $NUM_PG" ;;
    '5.4') echo "Recommended setting: vm.nr_hugepages = $NUM_PG" ;;
    *) echo "Kernel version $KERN is not supported by this script (yet). Exiting." ;;
esac
# End
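
The per-segment arithmetic in the script above can be isolated as follows. This sketch mirrors the script's rule: a segment smaller than one huge page contributes nothing, and any other segment contributes its whole-page quotient plus one extra page. The function name `pages_for_segment` and the three segment sizes are illustrative assumptions, not values from a real system.

```shell
#!/bin/sh
# Pages counted for one shared memory segment:
# floor(bytes / page_bytes) + 1, or 0 when the segment is smaller than one huge page.
pages_for_segment() {
    seg_bytes=$1
    hpg_kb=$2
    min_pg=$(( seg_bytes / (hpg_kb * 1024) ))
    if [ "$min_pg" -gt 0 ]; then
        echo $(( min_pg + 1 ))
    else
        echo 0
    fi
}

# Sum over a list of hypothetical segment sizes: 4 GiB, 1 MiB, 100 MiB.
total=0
for seg in 4294967296 1048576 104857600; do
    total=$(( total + $(pages_for_segment "$seg" 2048) ))
done
echo "vm.nr_hugepages = $total"
```

With 2048 KB pages, the three segments contribute 2049, 0, and 51 pages, for a total of 2100.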

Reposted from chenoracle

Last modified: April 3, 2022