By Dirk Michel Power Systems Performance 2012 updates by Bob Welgan, IBM Systems ISV Enablement SAP Performance Benchmarking
Active Memory Expansion performance

Preface
Acknowledgments
Disclaimer
1 Introduction
2 Active Memory Expansion overview
  2.1 Active Memory Expansion concept
  2.2 Active Memory Expansion value
3 Active Memory Expansion performance considerations
  3.1 Memory expansion factor
  3.2 Application response time
4 Active Memory Expansion performance measurements
  4.1 ERP overview
    4.1.1 ERP workload performance measurements
    4.1.2 ERP workload single partition throughput
    4.1.3 ERP workload summary
5 Conclusion
Appendix A: POWER7 ERP workload server throughput
Trademarks and special notices
Preface
This document introduces the basic concepts of Active Memory Expansion, showing the principles of operation and performance characteristics of this new component of the AIX operating system. Active Memory Expansion is available on IBM POWER7 systems starting with AIX 6.1 TL04 SP2. This paper is an update to an earlier paper and it contains data collected from experiments on an IBM POWER7+ system. The audience for this document consists of computer users, administrators, programmers, and performance analysts, as well as team leaders and management who need to understand the basics of this technology. You can find a detailed description of design points and detailed instructions for the configuration of Active Memory Expansion in the AIX information center at
http://pic.dhe.ibm.com/infocenter/aix/v7r1/topic/com.ibm.aix.prftungd/doc/prftungd/intro_ame_process.htm
Acknowledgments:
We would like to thank the people who made invaluable contributions to this paper as well as the developers of this remarkable technology. Contributions included authoring, insights, reviews, critiques and reference documents. Co-authors of this document are: Thuy Nguyen, Boyd Murrah, Walter Orb, Joerg Droste, and Jose Escalera
Disclaimer:
All performance data contained in this publication was obtained in the specific operating environment and under the conditions described below and is presented as an illustration. Performance obtained in other operating environments may vary and customers should conduct their own testing.
1 Introduction
All computers have a limited amount of random access memory (RAM) in which to run programs. Therefore, one of the perennial design issues for all computer systems is how to make the best use of the RAM that is physically available in the system, in order to run as many programs concurrently as possible in the limited space available. Active Memory Expansion, originally a POWER7 feature, supplies a new technique for making better use of RAM: portions of programs that are infrequently used are compressed into a smaller space in RAM. This, in turn, expands the amount of RAM available for the same or other programs. Starting with POWER7+, Active Memory Expansion memory page compression and decompression are offloaded to a hardware accelerator.

This white paper describes the process of Active Memory Expansion in more detail in the next sections, and then reports measurements on a typical workload that illustrate the beneficial effects. Among the benefits of Active Memory Expansion, this paper shows the following scenarios and their performance results:
1. Reducing the physical memory requirement of a logical partition (LPAR), resulting in 125% memory expansion on a POWER7+ system
2. Increasing the effective memory capacity and throughput of a memory-constrained LPAR, resulting in a 54% increase in application throughput on a POWER7+ system
[Figure 1: diagram with the labels Memory, AME, Compressed Pool, and Expanded Memory]
Figure 1 demonstrates an example of taking memory away from an existing LPAR and running it with less physical memory. The figure shows:
1. For a given data set in RAM, compression of data in inactive (or seldom active) pages reduces the total memory requirement for that data set. If the logical memory size in the LPAR is not changed, this expands the space available for active pages in running programs. Alternatively, the extra space can be given to another LPAR in the same frame, or the number of LPARs in the frame can be increased.
2. Because compressed memory cannot be directly used by running programs, the logical memory in any LPAR using Active Memory Expansion must be partitioned into active (uncompressed) and inactive (compressed) page pools, as the diagram illustrates.
Active Memory Expansion Performance.doc Page 5
3. The degree of compression shown in the figure is for illustration only. The compressibility of workloads varies.
4. Pages in the uncompressed pool are migrated to the compressed pool when they have been unreferenced for a sufficient length of time and there is a demand for memory pages. Pages in the compressed pool are migrated back on demand when they are referenced. These operations are transparent to the application.
5. The size of the compressed memory pool is not fixed. It is dynamically controlled by the operating system, and is determined by the expansion factor and the compressibility of the data.
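The pool behavior described in the list above can be sketched as a toy model. This is only an illustration under stated assumptions, not the AIX implementation: the class name `ToyExpandedMemory`, the least-recently-used eviction order, the fixed limit on uncompressed pages, and the use of `zlib` as the compressor are all hypothetical choices made for the example.

```python
import zlib
from collections import OrderedDict

PAGE_SIZE = 4096  # bytes; a 4 KB page, chosen here for illustration


class ToyExpandedMemory:
    """Toy model (hypothetical): an uncompressed pool kept in
    least-recently-used order, plus a compressed pool."""

    def __init__(self, max_uncompressed_pages):
        # Assumption: at least one page must fit in the uncompressed pool.
        assert max_uncompressed_pages >= 1
        self.max_uncompressed_pages = max_uncompressed_pages
        self.uncompressed = OrderedDict()  # page id -> raw bytes, oldest first
        self.compressed = {}               # page id -> zlib-compressed bytes

    def _make_room(self):
        # Under memory demand, move the least recently referenced pages
        # into the compressed pool.
        while len(self.uncompressed) > self.max_uncompressed_pages:
            page_id, data = self.uncompressed.popitem(last=False)
            self.compressed[page_id] = zlib.compress(data)

    def write(self, page_id, data):
        assert len(data) == PAGE_SIZE
        self.compressed.pop(page_id, None)   # drop any stale compressed copy
        self.uncompressed[page_id] = data
        self.uncompressed.move_to_end(page_id)
        self._make_room()

    def read(self, page_id):
        if page_id in self.compressed:
            # Referencing a compressed page migrates it back on demand.
            data = zlib.decompress(self.compressed.pop(page_id))
            self.uncompressed[page_id] = data
        self.uncompressed.move_to_end(page_id)
        self._make_room()
        return self.uncompressed[page_id]

    def physical_bytes(self):
        # Physical footprint: raw pages plus compressed blobs.
        raw = len(self.uncompressed) * PAGE_SIZE
        packed = sum(len(blob) for blob in self.compressed.values())
        return raw + packed


# Four logical pages, but room for only two uncompressed pages.
mem = ToyExpandedMemory(max_uncompressed_pages=2)
for i in range(4):
    mem.write(i, bytes([i]) * PAGE_SIZE)
print(sorted(mem.compressed), mem.physical_bytes())
```

The caller reads and writes pages by identifier and never sees which pool a page is in, which mirrors the transparency noted in item 4. The example pages are highly compressible by construction, so the footprint reduction it prints is far larger than a real workload would show.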
In environments where the number of LPARs is limited by the physical amount of memory that is available to the system, Active Memory Expansion reduces the physical memory consumption of the individual LPARs and frees up memory that can be used for additional LPARs to drive more workload on the system.
[Figure: chart with the labels Application Throughput and CPU Utilization]
[Figure 3 diagram labels: Uncompressed Pool, Compressed Pool, Expanded Memory]
Figure 3. Impact on the sizes of the compressed and uncompressed pools for memory-expansion factors
A reasonable memory-expansion factor maintains a small compressed pool and a relatively large uncompressed pool. An aggressive memory-expansion factor leads to a larger compressed pool and a smaller uncompressed pool.
The performance of a workload is less affected by an aggressive memory-expansion factor if the workload has a small active working set and therefore a low amount of compression and decompression activity. However, the amount of compression and decompression activity increases if the memory-expansion factor is set too aggressively. The performance of workloads with a large active working set is more sensitive to an aggressive memory-expansion factor. A reasonable memory-expansion factor can be determined by running the Active Memory Expansion Planning Tool, amepat, from the command line.
Users on systems with large amounts of memory might experience performance degradation when using Active Memory Expansion if their applications use and benefit from memory pages that are larger than 4 KB. When enabled, Active Memory Expansion automatically reconfigures the system to use only 4 KB memory pages.
[Figure 4: CPU % (0 to 100) plotted against memory expansion % (0% to 225%)]
Figure 4 demonstrates the processor utilization of the sample SAP ERP workload at different memory expansions. Figure 5 shows the relative performance of the same workload as in the previous graph. There is virtually no measurable performance impact due to Active Memory Expansion until the memory expansion exceeds 125% and even then the performance impact is very slight. This demonstrates how Active Memory Expansion can help maintain application performance even in a reduced physical memory footprint.
[Figure 5: relative performance plotted against memory expansion % (0% to 204%)]
As described in the previous section, the SAP ERP workload tests the ability of a configuration to support multiple users following the simulated steps to complete a number of business transactions. Figure 5 shows the relative application performance of configurations with Active Memory Expansion enabled, compared to a base configuration that has all the necessary physical resources available. In this memory-constrained environment, as physical memory is reduced, Active Memory Expansion compensates by using available processor cycles to make more memory available. In spite of the additional work dispatched to the processor, the number of transactions completed per second is not affected. The memory expansion, as shown, can continue to increase as long as processor resources are available. After a threshold of processor availability is crossed, performance stability decreases. As long as there are processor cycles available, Active Memory Expansion allows the user to size the configuration for a workload beyond the ordinary limitations imposed by the lack of available physical memory.
[Chart: Throughput. Transactions per second and average page-ins per second versus number of simulated users (2,496 to 5,056), with AME off and AME on; the region of OS paging is marked.]
[Chart: CPU Utilization. CPU % busy versus number of simulated users (2,496 to 5,056), with AME off and AME on.]
Figure 7. Processor utilization without and with Active Memory Expansion turned on
Up to the 3,264 user level, the throughput and processor utilization are nearly identical in both cases. With Active Memory Expansion turned off, there is already a significant amount of OS paging at the 3,264 user level, which starts to affect the throughput. At 3,392 users, the throughput actually decreased because of heavy OS paging. At this level, only 60% of the available processor resources are in use, but the simulated users suffer from increased response times (and thus decreased throughput). With Active Memory Expansion turned on, the same workload runs easily and the load can be scaled up all the way to 5,056 users at a processor utilization of 90%. The throughput increased by 54% going from 3,392 users to 5,056 users.
5 Conclusion
The two scenarios outlined in this paper, reducing the physical memory requirement of a partition and increasing a partition's memory capacity, have shown that Active Memory Expansion for AIX on POWER7+ technology provides a significant improvement in system throughput and utilization by expanding system memory capacity.
Appendix A: POWER7 ERP workload server throughput
The next set of measurements, conducted on a POWER7 system, was intended to demonstrate Active Memory Expansion in a production-like setup with multiple partitions running on a single server. This test was conducted on an AIX 6.1 system with DB2 Version 9.5. This test simulated an SAP 3-tier setup with a single SAP system on a server with 48 GB of physical memory. The first partition was running the SAP ERP database server and an application instance. Three more partitions were configured as SAP application servers connected to the database server running in the first partition. Table 2 shows the configuration of the server partitions:

Partition   Number of    Partition memory (GB)   Partition memory (GB)   Role
            processors   AME off                 AME on
1           8            20                      18                      Database server and application server
2           8            14                      10                      Application server
3           8            14                      10                      Application server
4           8            0*                      10                      Application server

* Partition 4 was deactivated during the runs with Active Memory Expansion turned off
For the runs with Active Memory Expansion disabled, the simulated users were distributed over the first three partitions. The fourth partition was deactivated, because all of the server's installed physical memory was in use. After turning Active Memory Expansion on, the physical memory allocation for each of the first three partitions was reduced. This freed up 10 GB of physical memory, which allowed the activation of the fourth partition. The simulated users were then distributed over all four partitions. Figure 8 and Figure 9 show the results of scaling up the workload for both configurations. Note that these are shown as column graphs because the increase in workload was not done in equidistant steps (unlike in the previous section). Figure 8 shows the combined throughput of all active partitions in transactions per second (TPS) at a specific workload level. Figure 9 shows the respective processor utilization of the physical server for that workload. For the test case with Active Memory Expansion turned off, the eight deactivated processors of partition 4 were included in the processor utilization calculation as idle processors.
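One way to read this calculation is as a processor-weighted average in which a deactivated partition contributes its processors to the total but no busy time. The sketch below assumes exactly that. The function name `server_utilization` and the per-partition busy percentages are hypothetical and are not measurements from this paper.

```python
def server_utilization(partitions):
    """Processor-weighted utilization of the whole server.

    partitions: list of (number_of_processors, busy_percent) tuples.
    A deactivated partition is entered with busy_percent = 0, so its
    processors count as idle in the server-wide figure.
    """
    total_processors = sum(cpus for cpus, _ in partitions)
    busy_weighted = sum(cpus * busy for cpus, busy in partitions)
    return busy_weighted / total_processors


# Hypothetical example: three active 8-processor partitions at 80% busy,
# plus the deactivated fourth partition counted as idle.
ame_off = [(8, 80), (8, 80), (8, 80), (8, 0)]
print(server_utilization(ame_off))
```

Counting the idle processors keeps the two configurations comparable, because both are then expressed against the same 32 processors of the physical server.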
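[Figure 8: Server Throughput. Combined throughput in TPS at 2900, 3200, 3780, 4500, and 5000 users, with AME off and AME on; heavy OS paging is marked.]
[Figure 9: Server Utilization. CPU % at the same user levels, with AME off and AME on; heavy OS paging is marked.]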
For the first two data points, the throughput and processor utilization are almost identical for both test cases, regardless of whether the simulated users were distributed over three partitions (with Active Memory Expansion turned off) or over four partitions (with Active Memory Expansion turned on). The partitions that had Active Memory Expansion turned off started to page in the operating system at the 2900 user level. Because paging is unwanted in a production environment, the test used the 2900 user level as the baseline for the comparison in this section.

At the 3780 user level, the throughput for the configuration with Active Memory Expansion turned off decreased because of heavy paging in the operating system. The processor utilization dropped as well, because the runnable user threads were slowed down by having to wait for their required memory pages to be paged in. In this particular test case, the OS statistics showed an average of 400 page-in operations per second in a partition, with peaks of more than 4800 page-ins per second. The average response time of the simulated users went up from below 200 milliseconds to over 4 seconds. In practice, a system with such heavy OS paging would be basically unusable, even though there is plenty of spare processor capacity to handle additional workload.

The configuration with four partitions and Active Memory Expansion enabled was easily able to handle the same workload, and the workload was subsequently scaled up to 5000 simulated users with a subsecond response time. The throughput increased by 60% going from 3200 users to 5000 users. The total virtual memory demand for all partitions at the 5000 user level was a little more than 65 GB, so the memory expansion for this test case was about 35%. This number is lower than the ones shown in section 4.1.1 ERP workload performance measurements for two reasons:
1. The workload was significantly increased, and the major part of the spare processor capacity was used to process the increased business workload.
2. The virtual memory footprint of an application server-only partition is smaller than that of a partition running both a database and an application server. For the same workload, the percentage of actively used memory pages in an application server-only system is higher than in a database and application server partition.
A test case with four partitions each running its own SAP database and application server instance would result in overall higher virtual memory demand and lead to higher memory expansion numbers. A test case similar to the one in section 4.1.1 ERP workload performance measurements, using the four-partition setup with a fixed workload of 3600 users, was also run. During the test sequence, the physical memory configuration for each partition was reduced. Table 3 shows the test results for this test case:

Memory (GB)        48    44    39    34    32    28
Memory expansion   0%    9%    23%   41%   50%   71%
Throughput (TPS)   356   356   355   353   354   343
Processor %        60    60    71    77    81    90
The first row shows the combined physical memory allocation for all partitions. The measured throughput stays virtually the same up to 50% memory expansion and decreases slightly, from 356 to 343 TPS, at the 71% memory expansion level.
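The memory expansion percentages in Table 3 can be reproduced with a short calculation. The sketch below assumes that memory expansion is the ratio of the memory the workload sees to the physical memory backing it, minus one, and that the memory seen by the workload stays at 48 GB while the physical allocation is reduced. Both assumptions are inferred from the table values rather than stated in the text, and the function name is hypothetical.

```python
def memory_expansion_percent(expanded_gb, physical_gb):
    """Memory expansion in percent: how much larger the memory seen by
    the workload is than the physical memory that backs it."""
    return (expanded_gb / physical_gb - 1.0) * 100.0


# Assumption: the workload keeps seeing 48 GB in every Table 3 run.
EXPANDED_GB = 48

# Physical memory (GB) -> memory expansion (%) as reported in Table 3.
table3 = {48: 0, 44: 9, 39: 23, 34: 41, 32: 50, 28: 71}

for physical_gb, reported in table3.items():
    computed = memory_expansion_percent(EXPANDED_GB, physical_gb)
    print(f"{physical_gb} GB physical: computed {computed:.1f}%, reported {reported}%")

# The multi-partition run: about 65 GB of demand on 48 GB of physical memory.
print(f"{memory_expansion_percent(65, 48):.1f}%")
```

Under these assumptions the computed values round to the percentages reported in Table 3, and the 65 GB demand on the 48 GB server corresponds to the memory expansion of about 35% quoted for the 5000 user run.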