http://www.mvsbook.fsnet.co.uk/chap03c.htm
Contents
The Web version of this chapter is split into 4 pages - this is page 3. Page contents are as follows:

Section 3.6 - Serialisation
  3.6.1 Introduction
  3.6.2 Running Disabled
  3.6.3 The ENQ/DEQ mechanism
  3.6.4 Locks
  3.6.5 Intersects
Section 3.7 - Program Management
  3.7.1 Program Fetch
  3.7.2 Program Modes, Attributes, and Properties
  3.7.3 The Linklist and LLA
3.6 Serialisation
3.6.1 Introduction
One potential problem in a multiprogramming operating system (i.e. one which can interleave multiple units of work, running them all concurrently) is the danger of two units of work attempting to update the same resource at the same time, or attempting to use a process which can only handle one requestor at a time. This could lead to serious integrity problems if it was allowed to occur. Imagine, for example, two tasks attempting to update the same record of a dataset at the same time. Each would read the record, update its copy in storage, then write its updated copy back to the dataset. The second updated copy to be written back, however, would overwrite the first one, and the first update would be lost.

MVS provides several mechanisms to prevent such problems. These are:

* running disabled for interrupts
* the ENQ/DEQ mechanism
* the lock mechanism
* the intersect mechanism

The following sections will cover each of these in turn.
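The lost-update sequence described above can be reproduced with a small deterministic simulation (a sketch only - the record layout and task names are invented for illustration):

```python
# Simulate two tasks updating the same "record" without any serialisation.
# Each task reads the record, updates its private copy, then writes it back.

record = {"balance": 100}

def write_back(snapshot, deposit):
    # The task works on the stale copy it read earlier, then writes it
    # back, clobbering any update made in the meantime.
    record["balance"] = snapshot + deposit

# Both tasks read the record *before* either writes it back.
a_copy = record["balance"]   # task A reads 100
b_copy = record["balance"]   # task B reads 100

write_back(a_copy, 50)       # A writes 150
write_back(b_copy, 25)       # B writes 125 - A's update is lost

print(record["balance"])     # 125, not the 175 both updates should give
```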
3.6.2 Running Disabled

Disabling the processor is an effective mechanism for serialising interrupt processing, but it would be impractical for other types of serialisation. It is essentially a hardware mechanism which is suitable only for controlling hardware events. It relates to an individual processor, as the PSW bits it uses to control serialisation are kept in a register of the processor concerned, so it is not suitable for serialising resources which are shared between multiple processors. And it would require a dedicated bit in the PSA for each resource requiring serialisation, which would be totally impractical for serialising access to resources such as datasets, of which there could be tens of thousands in your installation, and which have unpredictable names - how would the system know which bit related to which dataset?
3.6.3 The ENQ/DEQ mechanism

Each ENQ request can have a "scope" of SYSTEMS, SYSTEM, or STEP. SYSTEMS means the resource is to be serialised across all MVS systems known to (and communicating with) the GRS address space on the current system. This is discussed in more detail in Chapter 14. SYSTEM means the resource is to be serialised across all address spaces on the current MVS system, and STEP means the resource is to be serialised only within the current address space. The scope should correspond to the usage of the resource - if a dataset is shared between multiple MVS systems, the scope of an ENQ for it should be SYSTEMS, for example, to ensure that requestors of the same resource on another system sharing the dataset are not able to ignore the first requestor. Similarly, if a control block exists only within a given address space and can only be used by that address space, an ENQ for it should only have a scope of STEP, or the ENQ could hold up requestors of a control block with the same name in a different address space!

Each ENQ must also specify whether the requestor requires SHARED or EXCLUSIVE control of the resource. Usually, requestors who only wish read-type access to a resource will ENQ it as SHARED, while those who need to update it will ENQ it with EXCLUSIVE control. GRS will allow multiple requestors to hold the same resource with SHARED control, but only one to hold it with EXCLUSIVE control. For example, if you specify DISP=SHR on a JCL DD statement for a dataset, MVS will ENQ it for SHARED access at allocation time, but if you specify DISP=OLD, MVS will ENQ it for EXCLUSIVE access at allocation time. As a result, only one job at a time can allocate a given dataset with DISP=OLD, whereas many can allocate it concurrently with DISP=SHR.

The action GRS takes in response to an ENQ macro is as follows:

* check the existing QCBs in its address space to see if there is an outstanding ENQ for this resource
* if there is not, create a QCB for the resource and add it to the QCB chain, then create a QEL for this request and add it to the queue for this QCB, and finally return control to the task issuing the ENQ
* if there is already a QCB for this resource, add a QEL representing this request to the queue for the resource, then check to see if the request can be satisfied
* if the request is for exclusive control and there is already a QCB for the resource (and therefore a QEL representing an outstanding ENQ), then the requestor will be suspended, and GRS will not return control to it until the request reaches the top of the queue
* if the request is for shared control and there is a QCB for the resource and a QEL representing an outstanding ENQ for exclusive control, then this requestor will also be suspended (whether the previous request for exclusive control has been granted yet or not) until all previous requests for exclusive control have been released with DEQ instructions
* if the request is for shared control and there is a QCB but all QELs on the queue represent requests for shared control, then this requestor will be allowed to proceed, and GRS will return control to it

When the DEQ macro is issued, GRS will:

* delete the QEL representing the request
* if this is the only QEL for the QCB concerned, delete the QCB
* if there are other QELs for this QCB, rebuild the QEL chain without the deleted QEL
* if there are other QELs for this QCB which were suspended but now can be released, allow the tasks owning them to resume execution

This is an effective method of serialising access to a wide range of resources, but there are a couple of potential problems which make it unsuitable for serialising on vital system resources.
One is simply the inefficiency of holding up shared requests when there are exclusive requests above the shared request in the queue but all the current holders of the resource are only actually holding shared control. Thus, one user mistakenly specifying DISP=OLD on a widely shared dataset (e.g. an ISPF dataset) can cause many others to grind to a halt, with little hope of the exclusive request ever getting to the top of the queue itself. This can only be dealt with by cancelling the user concerned and ensuring they do not make the same mistake again.

More insidious, however, is the case of the "deadly embrace". This can occur if task A gains exclusive control of resource X, then attempts to gain control of resource Y, while task B has already gained exclusive control of resource Y and is now attempting to gain control of resource X. Similar problems can occur with a chain of tasks making interrelated requests, but the two-task two-resource version is the commonest. In this case it is logically impossible for either task ever to resume, as they will both wait indefinitely for each other. The only solution once again is to cancel one of the tasks and try to prevent the recurrence of the problem. In practice, this can usually be done by amending the program code concerned so that all tasks which will attempt to gain control of the same group of resources always do the ENQs for the different resources in the same order. Thus, in our example above, if both tasks attempted to ENQ on X first and then Y, the deadly embrace would never occur.

These problems can be resolved for applications, as there are tools available which will display the contention occurring and allow the cancellation of the guilty tasks.
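The ordering fix described above can be sketched as a helper that always issues its ENQs in one canonical order (the function, resource names, and conflict handling here are invented for illustration - a real requestor would be suspended rather than told to stop):

```python
def enq_in_order(task, resources, held):
    """Issue ENQs for `resources` on behalf of `task`, always in sorted
    (canonical) order.  `held` maps resource name -> holding task.
    Returns the resources acquired and the first conflicting resource
    (or None if everything was acquired)."""
    acquired = []
    for res in sorted(resources):
        owner = held.get(res)
        if owner is not None and owner != task:
            return acquired, res    # a real task would be suspended here
        held[res] = task
        acquired.append(res)
    return acquired, None

held = {}
# Task A asks for Y and X; the helper ENQs X first, then Y.
got_a, conflict_a = enq_in_order("A", {"Y", "X"}, held)
print(got_a)               # ['X', 'Y']

# Task B, asking for the same pair, now blocks on X *before* it holds
# anything at all, so the deadly embrace cannot arise.
got_b, conflict_b = enq_in_order("B", {"X", "Y"}, held)
print(got_b, conflict_b)   # [] X
```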
For system code, however, which may bring the whole system to a halt when it is forced to wait, long waits and deadlocks are unacceptable - both because of the impact on other users, and because it may be impossible to recover at all without re-IPLing the system (the tools for diagnosing and solving the problem may themselves be locked out).
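The grant and release rules listed above can be modelled as a small simulation (a sketch only - the class and method names are invented, and the QCB with its QEL chain is reduced to a Python queue of (requestor, mode) pairs):

```python
from collections import deque

class ResourceQueue:
    """Toy model of one QCB and its queue of QELs.
    Each QEL is (requestor, mode); mode is 'SHR' or 'EXC'."""
    def __init__(self):
        self.qels = deque()

    def enq(self, requestor, mode):
        """Add a QEL for this request; return True if granted at once."""
        self.qels.append((requestor, mode))
        return self.is_granted(requestor)

    def is_granted(self, requestor):
        """Exclusive requests are granted only at the head of the queue;
        shared requests are granted only if no exclusive request is ahead."""
        for i, (req, mode) in enumerate(self.qels):
            if req == requestor:
                if mode == "EXC":
                    return i == 0
                return all(m == "SHR" for _, m in list(self.qels)[:i])
        return False

    def deq(self, requestor):
        """Delete the requestor's QEL; waiters ahead of no one may now run."""
        self.qels = deque(q for q in self.qels if q[0] != requestor)

q = ResourceQueue()
q.enq("job1", "SHR")          # granted
q.enq("job2", "SHR")          # granted - shared requests coexist
q.enq("job3", "EXC")          # suspended behind the shared holders
q.enq("job4", "SHR")          # suspended behind the exclusive request
q.deq("job1"); q.deq("job2")  # both shared holders issue DEQ
print(q.is_granted("job3"))   # True - the exclusive request is now at the head
```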
3.6.4 Locks
Locks are the serialisation mechanism used by MVS for fundamental system modules. They do not suffer from the "deadly embrace" danger, they have a much shorter "path-length" than ENQ/DEQ (i.e. they use fewer instructions, so execute faster), and unlike the process of disabling the processor for interrupts, they are capable of serialising access to resources which are shared between multiple processors in a multiprocessor CPC. They are used for serialising access to processes and resources (particularly control blocks) used in storage management (real, virtual, and auxiliary), dispatching, the I/O supervisor, and cross-memory services, among others.
Locks differ from ENQs in that the system does not need to search a chain of control blocks to see if a lock can be satisfied - instead, there is a specific location, known as a lockword, whose contents can be tested to reveal the status of a lock. Lockwords are tested and set by an MVS component known as the Lock Manager, in response to requests made using the SETLOCK macro.

There is one lockword for each type of lock, except for locks which control multiple resources of the same type (e.g. UCBs). These are known as "multiple locks", and have one lockword for each occurrence of the resource (in the case of a UCB, for example, the lockword for each UCB is found in an area of storage immediately in front of the UCB concerned). Each CPU also has a group of bits in its PSA which are used to indicate which locks it currently holds. When a requestor issues the SETLOCK macro, the Lock Manager first checks the bits in the PSA, then accesses the lockword if the bits indicate that the request can proceed. The lockwords are in storage which is shared between all processors in the complex, and the value in them is either hex zeros (indicating the lock is not held) or an indicator of the CPU-id of the processor holding the lock. If the lock is not already held by another processor, the Lock Manager will set the value of the requesting CPU in the lockword and return control to the requestor. If the lock is already held, the Lock Manager will force the requestor to wait.

However, there are different types of lock, and the precise action the Lock Manager takes will depend on the type of lock. For a SUSPEND lock, the Lock Manager will suspend the requestor if the lock is unavailable, which allows other tasks to be dispatched on the processor in the meantime. For a SPIN lock, however, the Lock Manager will place the requestor in a spin loop and disable it for interrupts if the lock is unavailable.
This prevents other work from being dispatched - even interrupts - and ensures the requestor obtains the lock as quickly as possible. The Lock Manager also disables the requestor for interrupts when it does obtain a SPIN lock, in order to allow it to release the lock again as quickly as possible. One implication of this is that units of work requesting SPIN locks must ensure they will not encounter page faults during the time they hold the lock (since these cannot be resolved without taking an I/O interrupt), by fixing any pages of virtual storage they will require in real storage before asking for the lock.

Locks are also divided into "local" and "global" locks, analogous to STEP and SYSTEM scope for ENQs - local locks serialise access to resources within an address space, while global locks serialise across the entire system.

The feature of locks which prevents deadlocks occurring is the existence of a hierarchy of locks. If a task has obtained one lock, it may only obtain other locks which are higher in the hierarchy - if it attempts to obtain a lower lock than one it holds already, it is abended. This enforces the procedure recommended above for preventing deadlocks using ENQ - i.e. it ensures that requests for different resources are always made in the same order, so there is no danger of one task asking for lock A then lock B, while another is asking for lock B then lock A. It is the process of setting and checking the lock bits in the PSA which determines whether a higher-level lock is held when a SETLOCK macro is issued. Figure 3.7 shows some of the lock types in hierarchical order. Note that all the global spin locks appear at the top, followed by global suspend locks and finally local suspend locks (there are no local spin locks).
Figure 3.7: Some lock types in hierarchical order

Lock Name   Category   Type      Resource Serialised
---------   --------   -------   -------------------
ASM         Global     Spin      Auxiliary Storage Management
RSM         Global     Spin      Real Storage Management
DISP        Global     Spin      Dispatcher (ASVT and dispatching queue)
IOSUCB      Global     Spin      UCB updates
SRM         Global     Spin      SRM control blocks
CMS         Global     Suspend   Cross Memory Services
LOCAL       Local      Suspend   Local storage
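The interplay of the PSA lock bits, the lockwords, and the hierarchy check can be sketched as a single-threaded simulation (names and data structures here are invented; real lockwords are tested and set with interlocked hardware instructions, which this sketch does not attempt to model):

```python
# Lock hierarchy from Figure 3.7, lowest first: having obtained one lock,
# a unit of work may only request locks that appear later in this list.
HIERARCHY = ["LOCAL", "CMS", "SRM", "IOSUCB", "DISP", "RSM", "ASM"]

lockwords = {name: 0 for name in HIERARCHY}    # 0 = free, else holder's CPU id

class Processor:
    def __init__(self, cpu_id):
        self.cpu_id = cpu_id
        self.held = set()                      # stands in for the PSA lock bits

    def setlock(self, name):
        level = HIERARCHY.index(name)
        # PSA check first: requesting a lock at or below one already held
        # violates the hierarchy (the real system abends the requestor).
        if any(HIERARCHY.index(h) >= level for h in self.held):
            raise RuntimeError("hierarchy violation: " + name)
        if lockwords[name] != 0:
            return False                       # would spin or suspend here
        lockwords[name] = self.cpu_id          # record ownership in the lockword
        self.held.add(name)
        return True

    def release(self, name):
        lockwords[name] = 0
        self.held.discard(name)

cpu1 = Processor(1)
cpu1.setlock("CMS")        # granted
cpu1.setlock("ASM")        # granted - ASM is higher in the hierarchy
try:
    cpu1.setlock("LOCAL")  # LOCAL is lower than a lock already held
except RuntimeError as err:
    print(err)             # hierarchy violation: LOCAL
```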
3.6.5 Intersects
Intersects are used in addition to the lock mechanism to serialise access to the queues used by the MVS dispatcher. They are designed to give the dispatcher itself priority in the use of its queues, but to allow other MVS functions to update those queues when the dispatcher is not using them.
Whenever a function other than the dispatcher wishes to use one of these queues, it must first obtain the relevant lock (the local lock if the queue is one which belongs to a specific address space, such as the TCB queue, or the dispatcher lock for a shared queue, such as the ASCB ready queue). This ensures that only one function other than the dispatcher itself can attempt to access a given queue at any one time. It must then "request an intersect" to find out if the dispatcher is using the queue. If it is, it will have set a bit to indicate this; this bit will be detected by the intersect process, and the requestor will spin until the dispatcher frees the resource by resetting the bit (these bits are in the ASCB for local queues, and the Supervisor Vector Table - SVT - for global queues). Once the requestor can gain control of the resource, it sets the bit itself. Once it has completed, it resets the bit, releasing the intersect, and then relinquishes the lock.

Thus the dispatcher need only check and set the intersect bit to obtain control of one of its queues, while any other function must first obtain the relevant lock and then check and set the intersect bit.
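The two paths onto a dispatcher queue can be sketched like this (a toy single-threaded model with invented names; where the text says the requestor spins while holding the lock, the sketch simply backs out and reports failure):

```python
class DispatcherQueue:
    """One dispatcher queue, guarded by a lock plus an intersect bit."""
    def __init__(self):
        self.lock_holder = None    # the local/dispatcher lock (invented form)
        self.intersect = False     # the bit the dispatcher sets directly

    def dispatcher_use(self):
        self.intersect = True      # the dispatcher needs no lock at all

    def dispatcher_done(self):
        self.intersect = False

    def other_function_use(self, name):
        if self.lock_holder is not None:
            return False           # another non-dispatcher function holds the lock
        self.lock_holder = name    # step 1: obtain the relevant lock
        if self.intersect:
            # The dispatcher is using the queue; a real requestor would spin
            # here, still holding the lock.  The sketch backs out instead.
            self.lock_holder = None
            return False
        self.intersect = True      # step 2: claim the intersect bit
        return True

    def other_function_done(self):
        self.intersect = False     # release the intersect, then the lock
        self.lock_holder = None

q = DispatcherQueue()
q.dispatcher_use()
print(q.other_function_use("RSM"))   # False - the dispatcher owns the queue
q.dispatcher_done()
print(q.other_function_use("RSM"))   # True - lock obtained, bit now set
```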
* if the program is not found in any of these, program fetch will abend the caller with system abend code 806.

If the program is found in a tasklib, STEPLIB, JOBLIB, or linklist library, it must now be loaded into virtual storage in the caller's private address space. Fetch has its own IOS driver (see the I/O Management section above) which bypasses EXCP and access methods and optimises the process of loading in the program. The text of the load module is passed to the "relocating loader", which resolves any pointers in the program which are dependent on the virtual address at which it is loaded into storage. Finally the JPA queue is updated to reflect the addition of the new module, and control is passed to the fetched program (or back to the caller if the fetch was performed in response to a LOAD macro).
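The relocating loader's job can be illustrated with a toy relocation step (the module image and the relocation-dictionary layout here are invented simplifications of the real RLD format):

```python
def relocate(module_words, rld_offsets, load_address):
    """Return a copy of `module_words` with every field named in
    `rld_offsets` adjusted by `load_address`.  Those fields are assumed to
    hold module-relative addresses, as recorded by the linkage editor."""
    image = list(module_words)
    for off in rld_offsets:
        image[off] += load_address   # resolve the address constant
    return image

# A toy module image of four words; word 1 is an address constant that
# points at word 3 (i.e. offset 3 from the start of the module).
module = [0x47F0, 3, 0x0000, 0x00C3]
image = relocate(module, rld_offsets=[1], load_address=0x5000)
print(hex(image[1]))   # 0x5003 - the pointer now reflects the load address
```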
paging works for this area of storage. The copy of the PLPA pages on auxiliary storage (i.e. in the paging datasets) is created at IPL time, and from then on, PLPA pages are never paged out. If the RSM decides to steal a PLPA page, it always does so without a page-out operation: as these areas are never updated, the original copy in the page dataset is always considered to be usable, and the next page-in for the page will always page in the copy which was created at IPL time. As any page of the PLPA could be stolen at any time, it is clear that any modification to it would be lost when the page was next paged back in. To prevent problems of this sort, PLPA pages are page-protected, so they cannot be updated, and programs to be loaded into the PLPA must be refreshable. If you place non-refreshable programs into the PLPA, they will abend with an 0C4 as soon as they attempt to modify themselves.

Another attribute which can be assigned at link-edit time is APF authorisation, though this is only effective if the library into which the program is linked is also APF authorised. This is discussed in more depth in the Storage Management section earlier in this chapter.

Properties can also be assigned to programs in the MVS Program Properties Table. This can be used to make programs non-swappable or non-cancellable, or to assign special storage keys to them. Prior to MVS 2.2 you had to reassemble the module IEFSD060 to update this table, but you can now specify these properties in the SCHEDxx member of SYS1.PARMLIB which you select at IPL time.
for every module moved by the compress operation would be invalid, leading to abends whenever a user attempted to load one of these, until a refresh command was issued.

With MVS/ESA, LLA has been renamed Library Lookaside, as it can now be extended to cover non-linklist libraries, and it now uses an ESA facility called the Virtual Lookaside Facility (VLF) to extend its function to include keeping commonly used load modules in virtual storage, so that fetches for them can be resolved without any disk I/O at all. Refreshes can now be issued for a single member or a single library, and libraries can be dynamically added to or removed from LLA control.

Probably the most interesting extension to LLA, however, is the use of VLF. As LLA resolves BLDL requests, it keeps a record of the most heavily used load modules, and calculates an index known as the "net staging value" for each module, which takes account of how often it is loaded, how large it is, and how great the response time saving arising from keeping it in memory would be. It then "stages" the modules with the highest net staging values into a VLF dataspace, and any future FETCH requests for them will be resolved from here instead of through physical I/O to the dataset on disk. Modules may later be purged from VLF for various reasons: for example, if the dataspace runs short of storage, it will purge the least recently used modules, and if a refresh is issued which covers a module, that module will be purged.

LLA now also has two modes - a "freeze" mode and a "nofreeze" mode. Freeze mode is the default for linklist libraries, and works as we have described LLA processing above. Nofreeze mode, however, bypasses LLA directory processing and only does VLF staging for the library concerned. This is useful for libraries which are frequently updated, or updated by application programmers, as it means that it is unnecessary to issue refreshes every time a member of the library is updated.
When a FETCH request is issued for a module which is already in the VLF dataspace but whose CCHHRR address has changed, LLA detects that the module has been updated, purges it from VLF, and resolves the FETCH request from disk. This does not work if you update a program in place (e.g. using AMASPZAP), so you still need to issue a refresh command when you do this.
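The staging decision might be sketched as follows. The weighting formula and the sample figures below are invented purely for illustration - the actual net staging value calculation is internal to LLA and is not reproduced here:

```python
def staging_value(fetch_count, size_bytes, io_time_saved):
    # Invented weighting: heavily fetched modules whose disk fetches are
    # expensive score highest, while large modules are penalised for the
    # virtual storage they would occupy in the VLF dataspace.
    return fetch_count * io_time_saved / size_bytes

# module name -> (fetches, size in bytes, est. I/O time saved per fetch)
modules = {
    "ISPLINK": (500, 40_000, 25.0),
    "RAREMOD": (2, 40_000, 25.0),
    "BIGMOD":  (500, 4_000_000, 25.0),
}

scores = {name: staging_value(*stats) for name, stats in modules.items()}
staged = sorted(scores, key=scores.get, reverse=True)[:2]   # stage top two
print(staged)   # ['ISPLINK', 'BIGMOD']
```

Note how the rarely fetched module loses out even to a much larger one that is fetched constantly - frequency of use dominates when the other factors are equal.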
This page last changed: 5 July 1998. All text on this site © David Elder-Vass. Please see conditions of use. E-mail comments to: dave@mvsbook.fsnet.co.uk (Please check the FAQ's page first!)
None of the statements or examples on this site are guaranteed to work on your particular MVS system. It is your responsibility to verify any information found here before using it on a live system.