---------------------------
SIMH Tape Format Extensions
---------------------------

The SIMH tape format specification ("SIMH Magtape Representation and Handling")
says that a conforming tape image contains a series of objects representing
either "metadata markers," such as tape marks, or data records. Each object is
introduced by a 32-bit control word. Several four-byte markers are defined, as
is the format of data records, which begin and end with identical data length
words that bracket the data payload. All of the remaining control word bit
patterns are reserved.

Data record control words use bit 31 for an error indicator, reserved bits 30-24
"must be zero," and record length bits 23-0 "must be non-zero." However, the
current SIMH tape library does not enforce this. It strips bit 31 and uses bits
30-0 as the record length.

Enforcing the reserved bits restriction would limit individual data records to
16 MB each. As the library only reads and writes full records, a simulator
capable of reading or writing a record of maximum size would require a 16 MB
buffer. With enforcement, however, error detection and recovery improve, and
several extension options open up.

At any given file position, interpretation of a tape image begins with a
four-byte control word. The current specification defines two divisions of
control words: markers and record lengths.
The assignments are:

  Control Value Range    Assignment
  -------------------    --------------------------------------------
  00000000               Tape mark
  00000001 - 00FFFFFF    Good data record, 1 to 16,777,215 bytes long
  01000000 - 80000000    Reserved
  80000001 - 80FFFFFF    Bad data record, 1 to 16,777,215 bytes long
  81000000 - FFFEFFFE    Reserved
  FFFEFFFF               Erase gap (half-gap in forward reads)
  FFFF0000 - FFFF00FF    Erase gap (half-gap in reverse reads)
  FFFF0100 - FFFF7FFF    Reserved
  FFFF8000 - FFFF80FF    Erase gap (half-gap in reverse reads)
  FFFF8100 - FFFFFFFD    Reserved
  FFFFFFFE               Erase gap (primary value)
  FFFFFFFF               End of medium

Graphically, the primary control words are as follows:

   31  30  29  28  27  26  25  24  23  22  21   [...]   2   1   0
  +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
  | 0   0   0   0   0   0   0   0 | 0   0   0   [...]   0   0   0 |  tape mark
  +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
  | 0 | 0   0   0   0   0   0   0 |          length > 0           |  good data record
  +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
  | 1 | 0   0   0   0   0   0   0 |          length > 0           |  bad data record
  +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
  | 1   1   1   1   1   1   1   1 | 1   1   1   [...]   1   1   0 |  erase gap
  +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
  | 1   1   1   1   1   1   1   1 | 1   1   1   [...]   1   1   1 |  end of medium
  +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+

A problem with the current implementation is that a conforming reader that
encounters a reserved control word in the image file does not know how to
recover. This hobbles the introduction of new SIMH tape features, as a simulator
using an older tape library version will fail when it encounters a tape image
written by a newer library version.

Having no interpretation of the reserved control word, the reader can recover
only by advancing the file position to resynchronize with the known format. The
problem is that the unknown word may be a single four-byte marker or may
introduce a record of some undefined length.
Without knowing this, the reader can only report a fatal error.

An additional limitation of the current format is that data records carry only
two identifications: "good" or "bad." Especially when generating tape images
from physical magnetic tapes, it may be desirable to retain additional
information with the image. For example, a "marginal data" indication (i.e., the
data was recovered through error correction rather than a clean read) may be
pertinent. It may be desirable to keep data associated with the physical tape
(e.g., data density, locations of parity errors within "bad" records, text from
the tape label) with the image. It may even be desirable to retain the original
NRZI or PE flux changes along with the recovered data.

Also, specific tape drive simulators may wish to store private data with the
image. For instance, the HP 9144A Cartridge Tape Drive can report to the user
information from the tape that indicates whether the cartridge was factory or
user certified. This information is stored "outside" of the user-accessible data
area. This drive also differentiates between a data record that has been written
and one that has been "initialized" but never written. There is no way to
represent these requirements within the existing format.

Finally, some tape controllers allow a single data record spanning the entire
length of a 2400-foot reel to be written. At 6250 bpi, this represents over 170
MB of data. The existing 16 MB limit prevents such a record from being stored in
a SIMH tape image without artificially dividing it into smaller sections.

As a result of these limitations, three changes to the interpretation of the
existing format are proposed:

 1. All control words, including reserved values, are placed into either the
    marker division or the record division.

 2. These two divisions are further subdivided into classes "reserved for SIMH
    use" and classes "reserved for private use."

 3.
    The length allocation for the record division is extended from 24 to 28
    bits.

The SIMH tape library will be revised to ignore all SIMH-reserved objects
present in an image file. Private-reserved objects will also be ignored unless
private data support is explicitly enabled. A simulator requesting such
"extended" support will be able to write private markers and data records, and
will obtain private objects when reading. If extended support is disabled, the
library's record reading routines will advance past such objects until a known
marker or data record is encountered. That is, they will be treated as though
they were erase gaps, taking up space in the file but otherwise "invisible" to
the caller.

The benefits of this proposal are:

 - Single data records up to 256 MB are possible.

 - Private data can be kept in the same file as SIMH-standard information.

 - A conforming reader will automatically ignore unrecognized objects in an
   image file. In particular, the standard data part of a tape image containing
   private data can be successfully read by a reader that does not understand
   the extended format.

 - Existing simulators will not be affected either by private data or newer
   SIMH-standard formats.

To provide this support, the following control word interpretation is proposed:

   31  30  29  28  27  26  25  24  23  22  21   [...]   2   1   0
  +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
  | control class |             marker-specific value             |  marker
  +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
  | control class |               data length value               |  record
  +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+

The following backward-compatible control class assignments are proposed:

  Class   Value    Assignment
  -----  -------   -------------------------------------------------
    0       0      Tape mark
    0      >0      "Good" data record
    1      any     \
   ...              | Private data records
    6      any     /
    7      any     Private single-word marker
    8       0      "Bad" data record, no data recovered
    8      >0      "Bad" data record, some data recovered
    9      any     \
   ...              | SIMH-reserved data records
    E      any     /
    F    FFFFFFE   Erase gap
    F    FFFFFFF   End of medium
    F     other    SIMH-reserved single word marker

Two new status codes are introduced:

 - #define MTSE_RESERVED 12
 - #define MTSE_PRIVATE 13

The use of these codes is explained below in connection with the tape read and
write routines. These codes are only returned if the unit is configured for
extended SIMH format.

An attached unit indicates that it wants to use the extended SIMH format by
including a new MT_F_STDEX symbol in the static initialization of the unit's
flags:

 - #define MTUF_F_STDEX 5
 - #define MT_F_STDEX (MTUF_F_STDEX << MTUF_V_FMT)

The macro defines the unit flag bits that specify the extended SIMH format. A
simulator can include this symbol in the static UNIT initialization, and it uses
a unit flag area that is already reserved, so the user unit flags are not
affected.

A tape unit may be configured for multiple tape formats by calling the existing
"sim_tape_set_fmt" routine for a SET FORMAT or ATTACH -F command that specifies
the new format. This routine is modified to add a new "SIMHEX" format name,
corresponding to the MT_F_STDEX value, to its table of formats.

When the format is set to MT_F_STDEX, the read routines will return private data
records and markers, rather than skipping over them. A simulator not prepared to
receive private objects must not permit the user to select the extended SIMH
format. To ensure this, the "sim_tape_set_fmt" routine is modified to check the
current format on entry. If it is MT_F_STDEX, then a "dynamic unit flag" is set
to indicate that the "SIMHEX" format is allowed. A simulator that supports
private objects will initialize the unit format to MT_F_STDEX, while existing
simulators will default the format to MT_F_STD.
The initial entry format establishes whether the routine will allow the extended
format for current and future calls.

Once the extended format is selected, private objects may be written or read by
tape library routines. This is accomplished by adding a new routine to write
private markers:

 - t_stat sim_tape_wrmrk (UNIT *uptr, t_mtrlnt mk);

The "mk" parameter specifies the private marker class in the upper four bits and
the marker-specific value in the lower 28 bits. If the class is not the private
marker class, MTSE_RESERVED is returned. If the class is private but the format
is not MT_F_STDEX, MTSE_FMT is returned.

To accommodate private data records, the existing write routine is overloaded as
follows:

 - t_stat sim_tape_wrrecf (UNIT *uptr, uint8 *buf, t_mtrlnt cc);

The "cc" parameter specifies a data record class in the upper four bits and a
data record length in the lower 28 bits. When writing data records, the class
can be either the standard "good" or "bad" classes, i.e., upper four bits are
zero or eight, or a private record class. If the specified class is a
SIMH-reserved class, MTSE_RESERVED is returned. If the class is private but the
format is not MT_F_STDEX, MTSE_FMT is returned.

Private markers and data records are read with the standard read routines,
overloaded as follows:

 - t_stat sim_tape_rdrecf (UNIT *uptr, uint8 *buf, t_mtrlnt *cc, t_mtrlnt max);
 - t_stat sim_tape_rdrecr (UNIT *uptr, uint8 *buf, t_mtrlnt *cc, t_mtrlnt max);

When the standard format is enabled, the current semantics are unchanged. The
"cc" parameter returns just the length portion of the data record marker, and
the return status is MTSE_OK for a "good" record, MTSE_RECE for a "bad" record,
or a status code corresponding to a standard marker. All other objects present
in the tape image are ignored.

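As a rough illustration of the control word layout that these routines operate
on, the following sketch composes and classifies control words according to the
proposed class assignments. It is not library code: every name in it (obj_kind,
cw_make, cw_class, cw_value, classify, visible_in_standard_format) is
hypothetical and chosen only for this example. The visibility test assumes the
standard-format behavior described above, in which erase gaps and all private
or SIMH-reserved objects are skipped.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names for illustration; none are defined by the SIMH library. */
typedef enum {
    OBJ_TAPE_MARK,                      /* class 0, value 0 */
    OBJ_GOOD_RECORD,                    /* class 0, value > 0 */
    OBJ_PRIVATE_RECORD,                 /* classes 1-6 */
    OBJ_PRIVATE_MARKER,                 /* class 7 */
    OBJ_BAD_RECORD,                     /* class 8 */
    OBJ_RESERVED_RECORD,                /* classes 9-E */
    OBJ_ERASE_GAP,                      /* class F, value FFFFFFE */
    OBJ_END_OF_MEDIUM,                  /* class F, value FFFFFFF */
    OBJ_RESERVED_MARKER                 /* class F, any other value */
} obj_kind;

/* Upper four bits: control class. */
uint32_t cw_class (uint32_t cw) { return cw >> 28; }

/* Lower 28 bits: marker-specific value or data length. */
uint32_t cw_value (uint32_t cw) { return cw & 0x0FFFFFFFu; }

/* Build a control word from a class and a 28-bit value. */
uint32_t cw_make (uint32_t cls, uint32_t value)
{
    assert (cls <= 0xFu && value <= 0x0FFFFFFFu);
    return (cls << 28) | value;
}

/* Map a control word onto the rows of the proposed class table. */
obj_kind classify (uint32_t cw)
{
    uint32_t cls = cw_class (cw);
    uint32_t val = cw_value (cw);

    if (cls == 0x0u) return val == 0 ? OBJ_TAPE_MARK : OBJ_GOOD_RECORD;
    if (cls <= 0x6u) return OBJ_PRIVATE_RECORD;
    if (cls == 0x7u) return OBJ_PRIVATE_MARKER;
    if (cls == 0x8u) return OBJ_BAD_RECORD;
    if (cls <= 0xEu) return OBJ_RESERVED_RECORD;
    if (val == 0x0FFFFFFEu) return OBJ_ERASE_GAP;
    if (val == 0x0FFFFFFFu) return OBJ_END_OF_MEDIUM;
    return OBJ_RESERVED_MARKER;
}

/* Returns 1 if a standard-format read would report the object to the caller,
   or 0 if the object would be skipped like an erase gap. */
int visible_in_standard_format (uint32_t cw)
{
    switch (classify (cw)) {
    case OBJ_TAPE_MARK:
    case OBJ_GOOD_RECORD:
    case OBJ_BAD_RECORD:
    case OBJ_END_OF_MEDIUM:
        return 1;
    default:
        return 0;
    }
}
```

For example, cw_make (0x7, 0x2A) yields 7000002A, which classify reports as a
private marker and which a standard-format read would skip.
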
With the extended format, the "cc" parameter returns either the private marker
class in the upper four bits and the marker-specific value in the lower 28 bits,
or a data record class in the upper four bits and a data record length in the
lower 28 bits. Standard markers are indicated by the appropriate MTSE return
status; private markers are indicated by a new MTSE_PRIVATE error return. When a
private marker is read, the "buf" and "max" parameters are not used. When a
standard or private data record is read, a new MTR_C macro may be used to
extract the class, and the MTR_L macro may be used to extract the data length.

For the extended format only, the variable addressed by the "cc" parameter must
be set on routine entry to a bitmap of the object classes to return. Each of the
standard and private classes is represented by its corresponding bit, i.e., bit
0 represents class 0, bit 1 represents class 1, etc. The routine will return
only objects from the selected classes. Objects of unselected classes are
skipped until the first object of a selected class is seen. This allows a
simulator to declare those classes it understands (e.g., standard classes 0 and
8, plus private class 2) and those classes it wishes to ignore. Setting the
bitmap to zero enables the return of all classes. Setting the bitmap to just
classes 0 and 8 is equivalent for reading to selecting the standard SIMH tape
format.
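
The class bitmap test described above might look like the following sketch. The
proposal names the MTR_C and MTR_L macros but does not give their definitions,
so the EX_CLASS, EX_LENGTH, and CLASS_BIT macros and the object_selected
function here are hypothetical stand-ins. The sketch also assumes that the
bitmap applies only to the standard and private classes 0 through 8; how the
library treats classes 9 through F is not modeled.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins; the proposal does not define these. */
#define EX_CLASS(cw)   ((uint32_t) (cw) >> 28)           /* upper four bits */
#define EX_LENGTH(cw)  ((uint32_t) (cw) & 0x0FFFFFFFu)   /* lower 28 bits */
#define CLASS_BIT(c)   (1u << (c))                       /* bit n = class n */

/* Returns 1 if the object introduced by control word "cw" belongs to a class
   selected by "bitmap", or 0 if a read would skip over it. A zero bitmap
   selects every class. The caller must pass only control words of classes
   0 through 8 (an assumption of this sketch). */
int object_selected (uint32_t bitmap, uint32_t cw)
{
    uint32_t cls = EX_CLASS (cw);

    assert (cls <= 0x8u);

    if (bitmap == 0)
        return 1;

    return (bitmap & CLASS_BIT (cls)) != 0;
}
```

A simulator that understands standard classes 0 and 8 plus private class 2 would
preset the bitmap to CLASS_BIT (0) | CLASS_BIT (8) | CLASS_BIT (2). With that
value, a class 2 record is selected and a class 3 record is skipped.
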