Coupling Technology: What’s Right for Me?
Desired outcome and choices help determine setup
8/5/2015 12:30:41 AM
By Barbara Weiler and Gary King
Parallel Sysplex technology has been around since the early 1990s, and its complexity often has people asking, “What’s the right stuff for me?” This article is intended to help you make an informed decision in answer to that question. In the beginning there weren’t many choices to make when creating a sysplex, but the number of choices has grown with the introduction of new technology and functions.
A sysplex is built of some number of z/OS images and their associated DASD, one or more coupling facility (CF) images, and a number of coupling links to connect the images (see Figure 1). The sysplex can be used for resource sharing, data sharing and workload management. How much attention each specific technology requires depends on what you use the sysplex for: cross-system CF, global resource serialization star or various logs for resource sharing; locking and/or group buffer pools for data sharing; workload management using VTAM, CICS Transaction Server, IMS SMQ or MQ Shared Queues; or some combination of these.
For example, if the sole purpose of the sysplex is for resource sharing then the speed of the coupling links may be less of a concern than if there is a significant amount of data sharing being done. Regardless of the purpose for the sysplex, making the right technology choices is important to achieving the expected overall performance.
There are three key things to consider when choosing CF technology: functionality, capacity and service time. How the CF functions is related to the Coupling Facility Control Code (CFCC) level which determines the capabilities of the CF. Recently added CFCC capabilities include coupling thin interrupt support in CF Level 19 and later (for IBM zEnterprise zEC12 GA2, IBM zEnterprise zBC12 and IBM z13 servers) and support for the new Integrated Coupling Adapter SR coupling links in CF Level 20 (for the z13). Thus, if a specific capability is required, that would dictate the CFCC level and limit the hardware choices.
Next to consider are the CF partition options. CF images can run on Internal Coupling Facility (ICF) engines or on Central Processors (CP). There are financial and performance advantages to running on ICFs especially with larger sysplexes but some smaller sysplexes can run fine using CPs.
With either engine type there is a choice to run with the CF engines dedicated to the partition or shared with other partitions. For production environments dedicated engines are recommended to provide optimal performance. Prior to CF Level 19 and the introduction of coupling thin interrupts, using shared engines meant that only one of the CF images sharing the engines would achieve good service times while the others had poor to bad service times. While it is clear that this isn’t necessarily desired for high volume production environments it could suffice for test environments. With CF Level 19 and coupling thin interrupts the opportunity for using shared CF engines has greatly improved.
The next item to consider under CF functionality is whether to use internal CFs (on the same Central Electronic Complex (CEC) as some number of z/OS images in the same sysplex) or external, standalone CFs. Figure 2 illustrates standalone CFs, shown as green triangles. This configuration inherently provides failure isolation. It’s the easiest to maintain, provides the most connectivity and is most commonly used for large sysplexes and intensive data sharing workloads. Figure 3 shows a combination of an internal CF, represented by the green rectangle, and a standalone CF, while Figure 4 illustrates having only internal CFs.
An all-internal CF configuration is less costly because it doesn’t require a separate footprint for the CFs. It also has the advantages of upgrading both the CF and host technologies simultaneously and being able to use Internal Coupling (IC) links. This configuration does, however, require system-managed CF structure duplexing (CF Duplexing) to provide failure isolation. Note that the cost of CF Duplexing for intensive data sharing workloads could be prohibitive. Internal CF partitions are mostly used for smaller sysplexes or those for resource sharing or low-intensity data sharing workloads.
What is CF Duplexing? In general, it means configuring a primary and a secondary structure to provide failover capability should one or the other structure have an issue. There are two key types of CF Duplexing: User Managed (UM) and System Managed (SM). With UM CF structure duplexing, which is only available for DB2 group buffer pools and the IMS shared virtual storage option, the user requests the primary and secondary structures, writes updates to both and synchronizes via already held locks. With SM CF structure duplexing, the installation instead requests the duplexing option for the specific exploiters or structures. Then, the system creates the primary and secondary structures, writes updates to both, and synchronizes using CF-to-CF operations.
As noted above, using the SM CF Duplexing feature can be costly and thus choosing to implement the function requires an evaluation of the value received and the cost incurred. Duplexing will provide faster recovery times from a CF failure. In fact, log recovery times will be on the order of 40 times faster while rebuild times will be around 4 times faster. The other value to CF Duplexing is, as noted previously, the failure isolation to exploit internal CFs. On the cost side of the equation there is an increase in resources required.
Additional host CPU, CF CPU and CF links will be required. User Managed duplexing costs two times the write activity to the simplex structure. The write activity to the types of structures that support User Managed duplexing is typically 20 percent but could range from 1 percent to 100 percent of the structure activity. SM Duplexing, typically used for list and lock structures, can cost from three to five times the simplex cost. In these structures nearly 100 percent of the structure activity is write activity. Given the described costs, it is highly recommended that CF Duplexing be enabled selectively, only where the value outweighs the cost.
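The duplexing cost figures above can be turned into a rough estimate. The following Python sketch is illustrative only: it assumes that "two times the write activity" means each write request costs twice its simplex cost while reads are unaffected, and the function names are hypothetical rather than part of any IBM tool.

```python
def um_relative_cost(write_fraction):
    """Relative cost of a User Managed duplexed structure (simplex = 1.0).

    Assumption: writes cost twice as much as in simplex mode and
    reads cost the same as before.
    """
    if not 0.0 <= write_fraction <= 1.0:
        raise ValueError("write_fraction must be between 0 and 1")
    return (1.0 - write_fraction) + 2.0 * write_fraction


def sm_relative_cost_range():
    """Range quoted in the article for System Managed duplexing (simplex = 1.0)."""
    return (3.0, 5.0)


if __name__ == "__main__":
    # Write fractions quoted in the article: 1 percent, typical 20 percent, 100 percent.
    for w in (0.01, 0.20, 1.00):
        print(f"UM duplexing, {w:.0%} writes: {um_relative_cost(w):.2f}x simplex")
    low, high = sm_relative_cost_range()
    print(f"SM duplexing: {low:.0f}x to {high:.0f}x simplex")
```

Under these assumptions a typical User Managed structure with 20 percent writes costs about 1.2 times its simplex cost, which is well below the three to five times quoted for SM Duplexing.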
CF capacity is clearly related to some of the choices made regarding CF functionality. Recommended capacity planning should be followed to allow for enough capacity for handling the desired request rate while keeping the CF utilization below 50 percent. Note that in the case where there may be only a single CF engine the recommendation is to keep the CF utilization below 30 percent to accommodate single engine queuing and long-running CF commands that might block shorter running commands.
CF capacity can be increased by either adding CF engines or by moving to faster engines. Capacity planning should be done utilizing recommended tools such as IBM Processor Capacity Reference (zPCR) and/or zCP3000.
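As a rough illustration of the utilization guidelines, the Python sketch below finds the smallest number of CF engines that keeps utilization under the recommended limit. It assumes, for simplicity, that capacity scales linearly with the number of engines, which real planning tools such as zPCR do not assume; the function names and the measure of "engine equivalents of work" are hypothetical.

```python
def utilization_limit(engines):
    """Guideline from the article: below 30 percent for one engine, else below 50 percent."""
    return 0.30 if engines == 1 else 0.50


def engines_needed(busy_engine_equivalents, max_engines=64):
    """Smallest engine count that keeps CF utilization under the guideline.

    busy_engine_equivalents: CF work expressed as a number of fully busy
    engines (for example 1.2 means 120 percent of one engine).
    Assumption: capacity scales linearly with the number of engines.
    """
    if busy_engine_equivalents < 0:
        raise ValueError("busy_engine_equivalents must not be negative")
    for engines in range(1, max_engines + 1):
        if busy_engine_equivalents / engines < utilization_limit(engines):
            return engines
    raise ValueError("workload does not fit within max_engines")


if __name__ == "__main__":
    for work in (0.2, 0.4, 1.2):
        print(f"{work:.1f} engines of work -> {engines_needed(work)} CF engine(s)")
```

A result from a sketch like this is only a starting point; moving to faster engines changes the amount of work per request and should be evaluated with the recommended capacity planning tools.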
Among the things that influence the CF service time are the speed of the CF engines and the coupling link technology. The various link channel types available on the z13 in order of relative speed are: ICP; CS5; and the Parallel Sysplex InfiniBand (IFB) channels CIB 12x IFB3, CIB 12x IFB, and CIB 1x IFB. Additional details relating to the various channel types can be found in the reference material at the end of this article.
The exploiter can determine whether to send the CF request synchronously or asynchronously relative to the host processor. In either case there’s a software and a hardware cost associated with the request. Synchronous requests incur a software cost related to the time spent by the exploiter and cross-system extended services (XES) in making the request for CF services and in processing the resulting response. The hardware cost for synchronous requests can be thought of as the time XES is waiting for the CF operation to be completed. During this time the host processor is also waiting for the CF operation to complete. The synchronous CF service time is the sum of these software and hardware costs. This service time is directly related to host data sharing cost.
When the exploiter decides to request the CF operation asynchronously the cost components are somewhat different. In this case the software cost, in addition to the exploiter and XES components as described in the synchronous operation, also includes a cost associated with service request blocks. This added activity is related to the task switch that would allow other work to be done on the host processor while waiting for the CF operation to complete. This cost would also include any impact to the host hardware due to the task. Additional latency to recognize that the CF operation is complete must also be included.
Meanwhile, the hardware component as described in the synchronous execution is virtually non-existent in terms of data sharing cost. However, this hardware component is included in asynchronous CF service time. The added latency for XES to recognize the completion of an asynchronous CF operation is variable and is related to the amount of z/OS activity. This added latency was improved by the z/OS exploitation of coupling thin interrupts in z/OS V2.1, or with the proper maintenance on V1.13 and V1.12. The amount of improvement is related to the amount of z/OS activity, with the larger improvement coming when z/OS is less active.
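The difference between the two request types can be summarized in a small cost model. The Python sketch below is a simplification under stated assumptions: the field names and sample numbers are hypothetical, and the asynchronous service time is modeled as the plain sum of all components, which the article does not state explicitly.

```python
from dataclasses import dataclass


@dataclass
class CfRequestCosts:
    """Cost components of one CF request, in microseconds (illustrative values only)."""
    software_us: float                   # exploiter and XES time to issue and complete the request
    hardware_us: float                   # time the CF operation itself takes, including the link
    srb_us: float = 0.0                  # asynchronous only: service request block and task switch cost
    completion_latency_us: float = 0.0   # asynchronous only: delay before XES notices completion


def sync_costs(c):
    """Synchronous: the host waits, so host cost and service time are the same sum."""
    host_cost = c.software_us + c.hardware_us
    return host_cost, host_cost


def async_costs(c):
    """Asynchronous: the hardware wait is not charged to the host but still elapses."""
    host_cost = c.software_us + c.srb_us
    service_time = host_cost + c.hardware_us + c.completion_latency_us
    return host_cost, service_time


if __name__ == "__main__":
    costs = CfRequestCosts(software_us=5.0, hardware_us=15.0,
                           srb_us=10.0, completion_latency_us=30.0)
    print("sync  (host cost, service time):", sync_costs(costs))
    print("async (host cost, service time):", async_costs(costs))
```

With these made-up numbers the asynchronous request costs the host less than the synchronous one but has a longer service time, which is the trade-off described above.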
Given this basic understanding of synchronous and asynchronous CF operations and how they impact data sharing costs and CF service time, it’s important to know that there are times when the exploiter’s choice to send a request synchronously can be overridden. This can happen if XES determines by way of a heuristic algorithm that the synchronous command is taking too long compared to a “break even” cost associated with asynchronous processing. This is a means of limiting the data sharing cost. Additionally, a CF request can be changed from synchronous to asynchronous if the synchronous request encounters a subchannel busy condition.
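The conversion rules described above can be sketched as a simple decision function. This is not the actual XES heuristic, whose details the article does not give; the function name, parameters and threshold comparison are assumptions made purely for illustration.

```python
def dispatch_mode(requested, expected_sync_us, async_break_even_us,
                  subchannel_busy=False):
    """Return "sync" or "async" for a CF request (illustrative, not the XES algorithm).

    requested: mode the exploiter asked for, "sync" or "async".
    expected_sync_us: recently observed synchronous service time.
    async_break_even_us: cost at which asynchronous processing becomes cheaper.
    subchannel_busy: True if no subchannel is free for a synchronous request.
    """
    if requested not in ("sync", "async"):
        raise ValueError("requested must be 'sync' or 'async'")
    if requested == "async":
        return "async"   # asynchronous requests are never converted in this sketch
    if subchannel_busy:
        return "async"   # a subchannel busy condition forces conversion
    if expected_sync_us > async_break_even_us:
        return "async"   # waiting synchronously would cost more than it saves
    return "sync"


if __name__ == "__main__":
    print(dispatch_mode("sync", expected_sync_us=12.0, async_break_even_us=40.0))
    print(dispatch_mode("sync", expected_sync_us=90.0, async_break_even_us=40.0))
    print(dispatch_mode("sync", expected_sync_us=12.0, async_break_even_us=40.0,
                        subchannel_busy=True))
```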
The rate of CF requests can also have an impact on the CF service time. Periods of high CF activity or bursts of requests can tend to exhibit higher service times due to queuing or constraints on resources.
Perhaps utmost in the CF service time discussion is the Host Capacity Effect. This is directly related to the amount of CF activity and can be calculated as the product of the CF request rate and the sum of the software and hardware cost as described previously. It will vary based on the portion of the workload involved in data sharing; the access rate to shared data; and the type of hardware for host, CF and coupling links. Typical system hardware effects range from 2 to 3 percent for Resource Sharing and 5 to 10 percent for data sharing primary production environments. Individual transaction or job effects can have a wide variation.
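The host effect calculation described above can be written out as a short Python sketch. The units are an assumption: the request rate is taken per second, the costs in microseconds, and the result is normalized by the number of host engines to give a fraction of total host capacity. The function name and the sample numbers are hypothetical.

```python
def host_effect(requests_per_sec, software_us, hardware_us, host_engines):
    """Fraction of host capacity spent on CF requests.

    Product of the CF request rate and the sum of the software and
    hardware cost per request, divided by the available engine time.
    Assumption: all requests are synchronous and have the same cost.
    """
    if host_engines <= 0:
        raise ValueError("host_engines must be positive")
    busy_us_per_sec = requests_per_sec * (software_us + hardware_us)
    return busy_us_per_sec / 1_000_000 / host_engines


if __name__ == "__main__":
    # Made-up example: 20,000 requests per second at 5 + 15 microseconds each on 8 engines.
    effect = host_effect(20_000, software_us=5.0, hardware_us=15.0, host_engines=8)
    print(f"host effect: {effect:.1%}")
```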
Figure 5 shows some production examples of the Host Effect for the primary application involved in data sharing. The chart is based on a survey of data sharing customers. Figure 6 also illustrates the host effect for the primary application involved in data sharing, but in this view the chart is based on nine CF operations per million instructions (ops/Mi). Note that this was the highest CF access rate in Figure 5. The chart then provides the host effect associated with the various host processors (across the top) for the CF processor and coupling link technology (down the left column). For example, a z13 host coupled to a zEC12 CF with CIB 12x IFB3 coupling links would have a host effect of 12 percent. That is, 12 percent of the z13 MIPS involved in the production data sharing application that is doing nine CF ops/Mi will be spent on data sharing. The table can be scaled linearly for applications that do more or less CF ops/Mi.
Choosing the appropriate coupling technology isn’t a simple decision. It involves knowing which CF functions you wish to exploit and determining a CF configuration that provides the desired performance characteristics.
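The linear scaling of the Figure 6 values can be expressed in a few lines of Python. The sketch uses the 12 percent example from the text; the function name is hypothetical and other table values are not reproduced here.

```python
REFERENCE_OPS_PER_MI = 9.0  # access rate on which the Figure 6 values are based


def scale_host_effect(table_percent, ops_per_mi):
    """Scale a Figure 6 host effect linearly to a different CF access rate."""
    if ops_per_mi < 0:
        raise ValueError("ops_per_mi must not be negative")
    return table_percent * ops_per_mi / REFERENCE_OPS_PER_MI


if __name__ == "__main__":
    # z13 host, zEC12 CF, CIB 12x IFB3 links: 12 percent at nine CF ops/Mi.
    for rate in (9.0, 4.5, 3.0):
        print(f"{rate} ops/Mi -> {scale_host_effect(12.0, rate):.1f} percent")
```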
Distance is another item to consider when looking at CF service time. The CF service time will be elongated 10 microseconds per kilometer of distance due to the speed of light through the fiber. This service time increase can force the CF requests to be changed from synchronous to asynchronous with respect to the host processor. Remember that sending the request asynchronously adds to the CF service time but the host effect will be limited due to being able to run other work while waiting. This increase in service time can have an additional impact on the application performance.
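The distance rule of thumb is simple arithmetic, shown here as a Python sketch. The 10 microseconds per kilometer figure comes from the text; the function name and the local service time used in the example are hypothetical.

```python
ELONGATION_US_PER_KM = 10.0  # service time added per kilometer of fiber, per the article


def service_time_with_distance(local_service_us, distance_km):
    """Estimate CF service time once the host and CF are separated by distance_km."""
    if distance_km < 0:
        raise ValueError("distance_km must not be negative")
    return local_service_us + ELONGATION_US_PER_KM * distance_km


if __name__ == "__main__":
    # Made-up local service time of 20 microseconds.
    for km in (0, 1, 10, 20):
        print(f"{km:>2} km -> {service_time_with_distance(20.0, km):.0f} microseconds")
```

Even a modest distance dominates a service time measured in tens of microseconds, which is why remote requests are likely to be converted to asynchronous.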
Since the transaction waits for both synchronous and asynchronous requests, there is potential impact to the subsystem queues and lock contention. Figure 7 illustrates the impact of distance on the transaction response time. Note the significant growth in response time when the LPAR is remote to the CF as opposed to when it’s local. While the service time elongation and the host effect being capped by the conversion of synchronous to asynchronous activity might sound fairly straightforward, distance effects are difficult to predict. It’s likely that more subchannels will be required between the host and the remote CF. Taking advantage of the 32 subchannels per channel-path identifier available with CIB 1x IFB links helps here. Each application will react differently, so the potential application performance impact noted earlier is difficult to predict. It’s recommended that application stress testing over simulated distance be done prior to moving to a sysplex that spans a considerable distance.
Deciding what coupling technology is right is complex. Many factors will play a role in the decision. The following checklist, along with the information provided, should help guide the decision-making process.
- CF functionality: CFCC level, dedicated vs. shared, standalone vs. internal
- CF capacity: Need enough to handle the request rate and keep the CF utilization less than 50 percent; add CF engines or move to faster CF engines if more capacity is required
- CF service time:
  - Affected by: CF engine speed, link technology, distance between the host and CF
  - Affects: Cost of data sharing due to host processor time waiting for synchronous requests, application performance due to transaction waiting on requests (most likely when there is significant distance involved)
  - Impact relative to: Rate of requests to the CF, speed of the host processor
When making a decision to implement a sysplex or change the configuration of the sysplex it is important to follow the recommended guidelines in order to achieve the best possible performance.
“Coupling Facility Configuration Options” and “System Managed Coupling Facility Structure Duplexing” whitepapers
System z Parallel Sysplex Best Practices
Coupling Facility Structure Sizer Tool
“Coupling Thin Interrupts” whitepaper
Barbara Weiler is in IBM z Systems Performance. She has been a member of the IBM system performance community for 33 years, specializing most recently in Parallel Sysplex performance. Weiler has a master’s in Secondary Mathematics Education from SUNY New Paltz and a bachelor’s in Mathematics from Marist College.
Gary King is an IBM Distinguished Engineer in z Systems Design and Performance. Since joining IBM in 1974, he has been involved in the design and evaluation of the major system resource managers of z/OS, Parallel Sysplex, coupling facilities and high-end servers. He has developed a number of techniques and methodologies for performance analysis and capacity planning which have been used both to direct product development and to assist clients with performance management of their systems.