TITLE: OpenVMS VMS731_XFC-V0100 Alpha V7.3-1 XFC ECO Summary
NOTE: An OpenVMS saveset or PCSI installation file is stored
on the Internet in a self-expanding compressed file.
For OpenVMS savesets, the name of the compressed saveset
file will be kit_name.a-dcx_vaxexe for OpenVMS VAX or
kit_name.a-dcx_axpexe for OpenVMS Alpha. Once the OpenVMS
saveset is copied to your system, expand the compressed
saveset by typing RUN kitname.dcx_vaxexe or kitname.dcx_axpexe.
For PCSI files, once the PCSI file is copied to your system,
rename the PCSI file to kitname.pcsi-dcx_axpexe or
kitname.pcsi-dcx_vaxexe; it can then be expanded by typing
RUN kitname.pcsi-dcx_axpexe or kitname.pcsi-dcx_vaxexe. The
resultant file will be the PCSI installation file which can be
used to install the ECO.
New Kit Date: 08-MAY-2003
Modification Date: 02-JUL-2003
Modification Type: Kit released with corrected hidden information
Copyright (c) Hewlett-Packard Company 2003. All rights reserved.
OP/SYS: OpenVMS Alpha
SOURCE: Hewlett-Packard Company
ECO Kit Name: VMS731_XFC-V0100
ECO Kits Superseded by This ECO Kit: None
ECO Kit Approximate Size: 8816 Blocks
Kit Applies To: OpenVMS Alpha V7.3-1
System/Cluster Reboot Necessary: Yes
Rolling Re-boot Supported: Yes
Installation Rating: INSTALL_2
To be installed by all customers using the following
feature(s): XFC (Extended File Cache)
The following remedial kit(s) must be installed BEFORE
installation of this kit: None
In order to receive all the corrections listed in this
kit, the following remedial kits should also be installed: None
ECO KIT SUMMARY:
An ECO kit exists for XFC components on OpenVMS Alpha V7.3-1.
This kit addresses the following problems:
PROBLEMS ADDRESSED IN VMS731_XFC-V0100 KIT
o Multiple XFC bug fixes and enhancements have been made:
- Files written by a DFS client to a disk drive served by a
cluster node can end up with stale data on the cluster
nodes not serving the drive.
- CPU spinwait bugchecks. Some conditions (large numbers of
non-cached I/Os) can result in a very long internal XFC
queue. On very large systems, searching this queue can take
30 or more seconds. A suggested workaround was to limit
the XFC cache to 4 or 5 GB. This is no longer necessary.
XFC was inadvertently acquiring the FILSYS and SCS spinlocks
in the wrong order. The MTAACP (magnetic tape ACP) also uses
both spinlocks, which can result in a deadlock and a
subsequent CPUSPINWAIT bugcheck. This problem does not
show up with BACKUP, but only when doing file-system access
to a tape drive (e.g. COPY X.X MTA0:), and then only if
the timing is just right.
It was possible for XFC file truncate processing to take
enough time to result in a spinwait bugcheck. This
problem was identified during a code inspection of the XFC
truncate processing and was never observed on a customer system.
- Volume depose speedup. A volume dismount requires that all
files in the cache for that volume be deposed from the
cache (on the current node). This operation proceeded
at about 1 file per second, resulting in very long times to
free memory. In addition, the code deposed the first file
synchronously, which could cause noticeable delays for the
process performing the dismount.
- Minimum cache size enforced. XFC would allow any value
for VCC_MAX_CACHE, including zero. The result was either
caching being disabled cluster-wide or a memory management
bugcheck on the local node during boot. This fix ensures
that about 5 MB of memory is always allocated to XFC
allowing the node to boot (there is also a message output
on the console).
- ASSERTFAIL bugcheck copying file to spooled device on
standalone nodes. XFC assumed that all file deletes
passed through XFC allowing XFC to properly depose the
cache. On standalone nodes only, this assumption led to
XFC attempting to release a lock it didn't own and
crashing with an ASSERTFAIL bugcheck. This typically
showed up while attempting to copy to a spooled device.
This does not occur on nodes in clusters.
- Performance data not being updated. XFC was not calling
routine pms_std$end_rq() prior to completing disk I/Os.
This resulted in performance data collectors seeing I/O
starts, but not I/O completions.
- Corrupt LRU queue after truncate. During I/O completion,
XFC cleans up structures associated with the I/O including
adjusting positions of extents (ECBs) in the LRU queue.
Occasionally, these elements have either been deallocated
or used for another I/O which results in a bugcheck. This
is an extremely rare event. It was seen at one
internal site almost a year ago and at three customer sites.
The XFC truncate code had an implicit assumption that
there would not be active I/Os on the file. The code
neglected to account for either XFC readahead I/Os or
asynchronous I/Os issued prior to the call to truncate.
The XFC truncate code was completely rewritten to properly
synchronize with concurrent I/Os to the file being truncated.
- Public counters overflow. The XFC public counters used by
the DCL command 'SHOW MEMORY/CACHE' were stored in
unsigned longwords, limiting the maximum counts to
approximately 4 billion. These counters have been
increased to unsigned quadwords. In addition, the public
interface to the internal counters (CACHE$GET_STAT()) has
been enhanced to return up to 8 bytes of data for each
counter.
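The longword-to-quadword change can be illustrated with plain arithmetic. This is a hypothetical sketch: the masks stand in for the fixed-width counters, not for the real CACHE$GET_STAT() layout.

```python
# 32-bit (longword) vs 64-bit (quadword) unsigned counters.
MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1

def bump(counter, n, mask):
    """Increment a fixed-width unsigned counter, wrapping on overflow."""
    return (counter + n) & mask

# A longword counter saturated near 4 billion wraps back to zero...
wrapped = bump(MASK32, 1, MASK32)   # -> 0
# ...while a quadword counter keeps counting past that point.
kept = bump(MASK32, 1, MASK64)      # -> 4294967296
```

On a busy system a cache-hit counter can pass 2**32 in days, so the wrap made 'SHOW MEMORY/CACHE' statistics misleading; the quadword counters remove that limit for any realistic count.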
- ASSERTFAIL bugchecks in XFC lock processing. If a write
happens for a file which is in read sharing mode, XFC
attempts to convert the File Arbitration Lock (FAL) from
PR mode (caching cluster-wide) to PW mode (caching locally
only). If this conversion fails, then XFC moves the FAL
to CW mode and starts a thread to move the FAL back to a
caching mode. This thread is called a FAL up conversion.
During this sequence, it was possible for a blocking AST
on the FAL to fire. It would also lead to a FAL up
conversion being started. If the timing were just right,
then two FAL up conversions could be in progress. One of
the two would find the FAL in the wrong state and bugcheck.
- ASSERTFAIL in routine XfcLockIsFALHeld() or
XfcLockReleaseFALViaEX(). Under some conditions, it was
possible for a file truncate operation to happen while
an I/O was in progress. The truncate operation would
leave data in the cache, but with the XFC file
arbitration lock in a state not allowing valid data. XFC
crashed with an ASSERTFAIL bugcheck when this
inconsistency was discovered. This has been fixed by a
complete rewrite of the XFC truncate processing.
- Volume and file latencies incorrectly calculated. XFC
provides statistics on average access latencies (via the
XFC SDA extensions and the CACHE$GET_STATVOL system
service). It does this by accumulating the total latency
times as the accesses are completed and then, when the
average is requested, dividing by the number of accesses.
Unfortunately, the access counts include accesses for
which the latency could not be determined (because the
access began on one CPU and finished on another; the
per-CPU cycle counter is used to determine the elapsed
time) and which, therefore, were not included in the
accumulated latency times. XFC's statistics-gathering
fields already include counts of the accesses not counted
in the latency accumulations, so the change is to exclude
those accesses when computing the averages.
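The corrected average can be sketched as follows (hypothetical parameter names; the real counters live in XFC's internal statistics structures):

```python
def average_latency(total_latency, total_accesses, unmeasured_accesses):
    """Average latency over only the accesses whose latency was accumulated.

    Accesses that started on one CPU and completed on another contribute
    to total_accesses but not to total_latency, so they must be excluded
    from the divisor or the average comes out too low.
    """
    measured = total_accesses - unmeasured_accesses
    return total_latency / measured if measured > 0 else 0.0

# Before the fix the divisor was total_accesses: 900 time units over
# 9 measured accesses was reported as 900 / 10 == 90 instead of 100.
```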
- Improved performance of non-cached I/Os. XFC was adding
overhead to I/Os which were not being cached - for example,
very large I/Os (6000 blocks). This extra overhead has
been eliminated.
- XFC SDA extension enhancements
1. Help for XFC SDA extension has been updated.
2. The SDA command XFC SHOW FILE now displays the
file name. In addition, the output of the SDA command
XFC SHOW FILE /BRIEF is sorted by volume.
This kit requires a system reboot. HP strongly recommends that
a reboot be performed immediately after kit installation to
avoid system instability.
If you have other nodes in your OpenVMS cluster, they must also be
rebooted in order to make use of the new image(s). If it is not
possible or convenient to reboot the entire cluster at this time, a
rolling re-boot may be performed.
Install this kit with the POLYCENTER Software Installation utility
by logging into the SYSTEM account and typing the following at the
DCL prompt:
PRODUCT INSTALL VMS731_XFC /SOURCE=[location of Kit]
The kit location may be a tape drive, CD, or a disk directory that
contains the kit.
Additional help on installing PCSI kits can be found by typing
HELP PRODUCT INSTALL at the system prompt.
Special Installation Instructions:
o Scripting of Answers to Installation Questions
During installation, this kit asks several questions that
require a user response. If you wish to automate the
installation of this kit and avoid having to provide responses
to these questions, create a DCL command procedure
that includes the following definitions and commands:
- $ DEFINE/SYS NO_ASK$BACKUP TRUE
- $ DEFINE/SYS NO_ASK$REBOOT TRUE
- Add the following qualifiers to the PRODUCT INSTALL
command and add that command to the DCL procedure.
- De-assign the logical names after the installation completes.
For example, a sample command file to install the
VMS731_XFC-V0100 kit would be:
$ DEFINE/SYS NO_ASK$BACKUP TRUE
$ DEFINE/SYS NO_ASK$REBOOT TRUE
$ PROD INSTALL VMS731_XFC/PROD=DEC/BASE=AXPVMS/VER=V1.0
$ DEASSIGN/SYS NO_ASK$BACKUP
$ DEASSIGN/SYS NO_ASK$REBOOT
All trademarks are the property of their respective owners.