Why should you use HugePages?
- Larger Page Size and Fewer Pages: The default page size is 4K, whereas the HugeTLB size is 2048K. That means the system needs to handle 512 times fewer pages.
- Reduced Page Table Walking: Since a HugePage covers a greater contiguous virtual address range than a regular-sized page, the probability of a TLB hit per TLB entry is higher with HugePages than with regular pages. This reduces the number of times page tables are walked to obtain a physical address from a virtual address.
- Less Overhead for Memory Operations: On virtual memory systems (any modern OS) each memory operation is actually two abstract memory operations: the virtual-to-physical translation and the access itself. With HugePages there are fewer pages to work on, so the potential bottleneck on page table access is clearly avoided.
- Less Memory Usage: From the Oracle Database perspective, with HugePages the Linux kernel uses less memory to create page tables for maintaining the virtual-to-physical mappings of the SGA address range, compared with regular-sized pages. This makes more memory available for process-private computations or PGA usage.
- No Swapping: Swapping must be avoided on Linux altogether (see Document 1295478.1). HugePages are not swappable (whereas regular pages are), so there is no page replacement mechanism overhead. HugePages are universally regarded as pinned.
- No 'kswapd' Operations: kswapd gets very busy if there is a very large area to be paged (e.g. 13 million page table entries for 50 GB of memory) and uses an incredible amount of CPU resource. When HugePages are used, kswapd is not involved in managing them. See also Document 361670.1.
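The arithmetic behind these bullets is easy to sketch. The 13-million-entry figure for a 50 GB memory area can be reproduced with plain shell arithmetic (50 GB is just the illustrative size from the bullet above):

```shell
# How many pages does a 50 GB memory area need?
SGA_KB=$((50 * 1024 * 1024))           # 50 GB expressed in KB
SMALL_PAGES=$((SGA_KB / 4))            # default 4 KB pages
HUGE_PAGES=$((SGA_KB / 2048))          # 2 MB (2048 KB) HugePages
echo "4K pages : $SMALL_PAGES"         # 13107200 (~13 million)
echo "HugePages: $HUGE_PAGES"          # 25600
echo "factor   : $((SMALL_PAGES / HUGE_PAGES))x fewer pages"   # 512x
```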
What should you know about using HugePages?
- Transparent HugePages are known to cause unexpected node reboots and performance problems with RAC.
- Oracle strongly advises disabling the use of Transparent HugePages.
- Oracle highly recommends the use of standard HugePages, as recommended for previous releases of Linux.
- Automatic Memory Management (AMM) is not compatible with Linux HugePages; use Automatic Shared Memory Management (ASMM) and Automatic PGA Management instead.
- If standard HugePages are not configured, RACCHECK fails with: "the Operating system hugepages count does not satisfy total SGA requirements".
- HugePages are allocated in a lazy fashion, so the "HugePages_Free" count drops as the pages get touched and are backed by physical memory. The idea is that this is more efficient: you do not use memory you do not touch.
- If you set the instance initialization parameter PRE_PAGE_SGA=TRUE (for suitable settings see Document 30793.1), all of the pages are allocated from HugePages up front.
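The lazy allocation described above is easy to observe with a one-liner (standard Linux tools, nothing Oracle-specific):

```shell
# Snapshot the HugePages counters; run this before and after instance
# startup (or in a loop) to watch HugePages_Free drop as SGA pages are
# touched. With PRE_PAGE_SGA=TRUE the drop happens at startup time.
grep -E 'HugePages_(Total|Free|Rsvd)' /proc/meminfo
```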
To convert a RAC database from AMM to ASMM with HugePages, the following steps are needed:
- Disable Transparent HugePages
- Convert the database from AMM to ASMM
- Configure standard HugePages
Disable Transparent HugePages
Check the status of Transparent HugePages:

```
# cat /sys/kernel/mm/transparent_hugepage/enabled
[always] madvise never
```

--> [always] marks that this system is using Transparent HugePages.

Get the current Hugepagesize:

```
# grep Huge /proc/meminfo
AnonHugePages:    485376 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
```

--> Hugepagesize: 2 MByte
--> AnonHugePages: 485376 kB: the kernel is using Transparent HugePages, as AnonHugePages > 0 kB

Because the kernel currently uses Transparent HugePages only for anonymous memory blocks like stack and heap, the AnonHugePages value in /proc/meminfo is the current amount of Transparent HugePages the kernel is using.

To disable Transparent HugePages, add transparent_hugepage=never to /etc/grub.conf (see Note 1557478.1 for using /sys/kernel/). After changing /etc/grub.conf, reboot your system and verify that Transparent HugePages are disabled:

```
# cat /sys/kernel/mm/transparent_hugepage/enabled
always madvise [never]

# grep AnonHugePages /proc/meminfo
AnonHugePages:         0 kB
```
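In addition to the grub.conf change, Transparent HugePages can be switched off for the running kernel without a reboot. A sketch (requires root; the grubby command, available on RHEL/OL systems, is an alternative to editing /etc/grub.conf by hand):

```shell
# Turn THP off immediately (not persistent across reboots):
echo never > /sys/kernel/mm/transparent_hugepage/enabled
echo never > /sys/kernel/mm/transparent_hugepage/defrag

# Persist the setting by adding the kernel boot argument
# (equivalent to editing /etc/grub.conf manually):
grubby --update-kernel=ALL --args="transparent_hugepage=never"
```

Since these commands change system state, run them only on a host where you intend to disable THP.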
Switching a RAC database from Automatic Memory Management (AMM) to Automatic Shared Memory Management (ASMM)
Please read the following link to achieve this.
Setup standard HugePages
Configure the memlock user limit in the /etc/security/limits.conf file. Set the value (in KB) close to the installed RAM, e.g. for 4 GB of installed RAM (4*1024*1024 = 4194304) you may set 4200000:

```
# Needed for hugepages setup ( 4 GByte )
*   soft   memlock   4200000
*   hard   memlock   4200000
```

Log in as the oracle user and verify the setting:

```
# csh
$ limit | grep memorylocked
memorylocked 4200000 kbytes

# sh
$ ulimit -l
4200000
```

Start the instance(s) and verify your current HugePages usage:

```
$ grep Huge /proc/meminfo
AnonHugePages:         0 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
```

--> Standard HugePages are not yet configured.

Calculate nr_hugepages using the script from Document 401749.1:

```
$ ./hugepages_settings.sh

This script is provided by Doc ID 401749.1 from My Oracle Support
(http://support.oracle.com) where it is intended to compute values for
the recommended HugePages/HugeTLB configuration for the current shared
memory segments. Before proceeding with the execution please note following:
 * For ASM instance, it needs to configure ASMM instead of AMM.
 * The 'pga_aggregate_target' is outside the SGA and
   you should accommodate this while calculating SGA size.
 * In case you changes the DB SGA size,
   as the new SGA will not fit in the previous HugePages configuration,
   it had better disable the whole HugePages,
   start the DB with new SGA size and run the script again.
And make sure that:
 * Oracle Database instance(s) are up and running
 * Oracle Database 11g Automatic Memory Management (AMM) is not setup
   (See Doc ID 749851.1)
 * The shared memory segments can be listed by command:
     # ipcs -m

Press Enter to proceed...
```
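For reference, the core of what hugepages_settings.sh computes can be sketched as follows: it sums the sizes of the currently allocated shared memory segments (the SGAs) and converts them into a HugePages count. This is a simplified approximation, not a replacement for the MOS script:

```shell
# Simplified sketch of the Doc ID 401749.1 calculation.
HPG_SZ=$(awk '/Hugepagesize/ {print $2}' /proc/meminfo)   # page size in KB
NUM_PG=0
# Column 5 of `ipcs -m` is the segment size in bytes.
for SEG_BYTES in $(ipcs -m | awk '$5 ~ /^[0-9]+$/ {print $5}'); do
    MIN_PG=$(( SEG_BYTES / (HPG_SZ * 1024) ))
    [ "$MIN_PG" -gt 0 ] && NUM_PG=$(( NUM_PG + MIN_PG + 1 ))
done
echo "Recommended setting: vm.nr_hugepages = $NUM_PG"
```

Run it as a user who can see the database shared memory segments; with no instances up it simply reports 0.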
```
Recommended setting: vm.nr_hugepages = 708
```

Set the kernel parameter and reboot the system. Edit the file /etc/sysctl.conf and set the vm.nr_hugepages parameter there:

```
vm.nr_hugepages = 708
```

Check the available HugePages after rebooting the system and restarting the RAC instances. As the oracle user verify ( HugePages_Free < HugePages_Total --> HugePages are in use ):

```
$ grep Huge /proc/meminfo
AnonHugePages:         0 kB
HugePages_Total:     708
HugePages_Free:      399
HugePages_Rsvd:      396
HugePages_Surp:        0
Hugepagesize:       2048 kB

$ id
uid=54321(oracle) gid=54321(oinstall) groups=54321(oinstall),501(vboxsf),506(asmdba),54322(dba)

$ limit | grep memorylocked
memorylocked 4200000 kbytes

$ cat /sys/kernel/mm/transparent_hugepage/enabled
always madvise [never]
```

Verify the alert.log for Huge Page usage:

```
****************** Large Pages Information *****************
Total Shared Global Region in Large Pages = 1410 MB (100%)

Large Pages used by this instance: 705 (1410 MB)
Large Pages unused system wide = 3 (6144 KB) (alloc incr 16 MB)
Large Pages configured system wide = 708 (1416 MB)
Large Page size = 2048 KB
***********************************************************
```
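A reboot is the reliable way to get the full pool, but on a lightly loaded system the pool can also be reserved at run time (allocation may only partially succeed if physical memory is already fragmented):

```shell
# Request the pool immediately (requires root); /etc/sysctl.conf still
# needs the entry so the setting survives the next reboot.
sysctl -w vm.nr_hugepages=708

# Check how many pages the kernel actually managed to reserve:
grep HugePages_Total /proc/meminfo
```

If HugePages_Total ends up lower than requested, free memory by stopping instances (or reboot) and try again.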
For troubleshooting, check the note HugePages on Oracle Linux 64-bit (Doc ID 361468.1):
| Symptom | Possible Cause | Troubleshooting Action |
|---|---|---|
| System is running out of memory or swapping | Not enough HugePages to cover the SGA(s); the area reserved for HugePages is wasted while the SGAs are allocated through regular pages | Review your HugePages configuration to make sure that all SGA(s) are covered |
| Databases fail to start | memlock limits are not set properly | Make sure the settings in limits.conf apply to the database owner account |
| One database fails to start while another is up | The SGA of the specific database could not find available HugePages and the remaining RAM is not enough | Make sure that the RAM and HugePages are enough to cover all your database SGAs |
| Cluster Ready Services (CRS) fail to start | HugePages configured too large (maybe larger than the installed RAM) | Make sure the total SGA is less than the installed RAM and re-calculate HugePages |
| HugePages_Total = HugePages_Free | HugePages are not used at all: no database instances are up, or they are using AMM | Disable AMM and make sure that the database instances are up. See Doc ID 1373255.1 |
| Database starts successfully but performance is slow | The SGA of the specific database could not find enough available HugePages, so the SGA is handled by regular pages, which leads to slow performance | Make sure there are enough HugePages to cover all your database SGAs |
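The "HugePages_Total = HugePages_Free" symptom above lends itself to a quick scripted check (a minimal sketch; the warning text is our own wording):

```shell
# Flag a configured-but-unused HugePages pool.
TOTAL=$(awk '/HugePages_Total/ {print $2}' /proc/meminfo)
FREE=$(awk '/HugePages_Free/ {print $2}' /proc/meminfo)
if [ "$TOTAL" -gt 0 ] && [ "$TOTAL" -eq "$FREE" ]; then
    echo "WARNING: $TOTAL HugePages reserved but none in use" \
         "(instances down, or still on AMM?)"
fi
```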
Memory considerations when using HugePages - monitoring tool: top
Without TFA, CRS, RAC RDBMS - pure OS with HugePages disabled:

```
Mem:   3602324k total,   686116k used,  2916208k free,    36936k buffers
Swap:  6373372k total,        0k used,  6373372k free,   271596k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 3099 root      20   0 1065m 114m 9.8m S  0.7  3.2   0:08.47 java
 2694 root      20   0  149m  47m 7132 S  0.3  1.3   0:02.17 Xorg
 2838 root      20   0  197m 1476  960 R  0.3  0.0   0:00.22 VBoxClient
```

--> The OS alone uses only about 600 MByte of resident memory - about 3 GByte are free.

Without TFA, CRS, RAC RDBMS - pure OS with HugePages enabled ( vm.nr_hugepages = 708 ):

```
Mem:   3602324k total,  1996272k used,  1606052k free,    34716k buffers
Swap:  6373372k total,        0k used,  6373372k free,   245076k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 2613 root      20   0  144m  42m 7632 S  0.0  1.2   0:04.23 Xorg
 2860 root      20   0  918m  21m  14m S  0.0  0.6   0:01.49 nautilus
```

--> HugePages are allocated during boot time.
--> Free memory drops from 2.9 GByte to 1.6 GByte ( 708 x 2 MByte = 1.4 GByte ).
--> HugePages are not pageable - all of the HugePage memory is taken from free RAM.

Active components: TFA, OS with HugePages:

```
Mem:   3602324k total,  2130804k used,  1471520k free,    38464k buffers
Swap:  6373372k total,        0k used,  6373372k free,   273624k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 4470 root      20   0 1067m 104m 9.8m S  0.0  3.0   0:09.72 java     <-- TFA
 2613 root      20   0  144m  42m 7632 S  0.0  1.2   0:04.23 Xorg
 2860 root      20   0  918m  21m  14m S  0.0  0.6   0:01.49 nautilus
```

--> Even though TFA uses ~1 GByte of virtual memory, it only uses ~100 MByte of resident (RSS) memory.

Active components: TFA, CRS, ASM, OS with HugePages:

```
Mem:   3602324k total,  3136516k used,   465808k free,    43168k buffers
Swap:  6373372k total,        0k used,  6373372k free,   720812k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 5690 grid      RT   0  646m 115m  53m S  1.0  3.3   0:02.46 ocssd.bin
 4470 root      20   0 1067m 105m 9.8m S  0.0  3.0   0:10.80 java
 5678 root      RT   0  636m  94m  55m S  0.0  2.7   0:00.18 cssdagent
 5658 root      RT   0  635m  93m  55m S  0.0  2.7   0:00.25 cssdmonitor
 5646 root      RT   0  628m  86m  55m S  1.7  2.5   0:02.74 osysmond.bin
 2613 root      20   0  144m  42m 7896 S  0.3  1.2   0:06.75 Xorg
```

--> After loading CRS there is a large drop of about 1 GByte of free memory because:
--> Lots of CRS processes get started and use resident memory.
--> ASM gets started.
--> Lots of Oracle shared libraries get mapped into memory.

Active components: RAC RDBMS instance, TFA, CRS, ASM, OS with HugePages:

```
Mem:   3602324k total,  3519732k used,    82592k free,    47796k buffers
Swap:  6373372k total,    14692k used,  6358680k free,   740624k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 9369 root      RT   0  431m 144m  60m D  1.7  4.1   0:35.32 ologgerd
 5690 grid      RT   0  646m 116m  53m S  0.7  3.3   0:25.15 ocssd.bin
 4470 root      20   0 1067m 107m 9.8m S  0.0  3.0   0:18.15 java
 5678 root      RT   0  636m  94m  55m S  0.3  2.7   0:01.96 cssdagent
 5658 root      RT   0  635m  93m  55m S  0.3  2.7   0:02.22 cssdmonitor
 5646 root      RT   0  629m  87m  55m S  2.3  2.5   0:36.55 osysmond.bin
```

--> After the RDBMS instance start, only ~400 MByte are allocated from free memory:
--> Most of the Oracle libraries are already mapped by the CRS startup.
--> The SGA is already allocated during boot time as we are using HugePages.
--> Oracle RDBMS processes still take some resident memory ( ~400 MByte per instance ).
References
- HugePages on Oracle Linux 64-bit (Doc ID 361468.1) <-- Read this first
- ALERT: Disable Transparent HugePages on SLES11, RHEL6, OEL6 and UEK2 Kernels (Doc ID 1557478.1)
- HugePages and Oracle Database 11g Automatic Memory Management (AMM) on Linux (Doc ID 749851.1)
- HugePages on Linux: What It Is… and What It Is Not… (Doc ID 361323.1)
- Shell Script to Calculate Values Recommended Linux HugePages / HugeTLB Configuration (Doc ID 401749.1)