SIMH Implementation Notes
=========================

The following notes pertain to the way certain features (or planned features)
are implemented in SIMH.


---------------------------------------
Address Width and Increment, Data Width
---------------------------------------

The DEVICE structure contains three fields that affect how device data is
examined and deposited. They are:

  - awidth : address width in bits, 1-64
  - aincr  : increment between successive addresses, nominally 1
  - dwidth : data width in bits, 1-64

The address width determines the range of addresses, the increment determines
the step between one data element and the next, and the data width determines
the size of the data elements.

Here are a few example devices from the HP2100 simulator:

  Device  awidth  aincr  dwidth  Address Range
  ------  ------  -----  ------  -------------
   CPU      20      1      16    1 megaword
   DA       26      1      16    128 megabytes
   LPT      32      1       8    4 gigabytes
   PTR      31      1       8    2 gigabytes

These fields affect SCP in the following ways.

awidth
------
  - validates address range in exdep_addr_loop
  - address display width in ex_addr
  - address display width in sim_brk_show

aincr
-----
  - decision to print words or bytes in fprint_capac
  - element count for UNIT_MUSTBUF memory allocation in attach_unit
  - element count for UNIT_MUSTBUF file write from buffer in detach_unit
  - address increment for examine loop in sim_save
  - address increment for deposit loop in sim_rest
  - address increment for sim_eval examine in fprint_stopped_gen
  - last valid address calculation for [ALL] in exdep_cmd
  - address increment for ex_addr call in exdep_addr_loop
  - addressing units consumed for ex_addr
  - address increment for examine loop in get_aval
  - address to byte offset calculation for fseek in get_aval
  - rounding number of sim_eval words for deposit loop in dep_addr
  - address to byte offset calculation for fseek in dep_addr
  - address increment for sim_eval print loop in eval_cmd

dwidth
------
  - decision to print words or bytes in fprint_capac
  - data display width in fprint_stopped_gen
  - data display width in ex_addr
  - sim_eval data mask in get_aval
  - sim_eval data mask in dep_addr
  - sim_eval data mask in eval_cmd
  - data display width in eval_cmd
  - byte count for UNIT_MUSTBUF memory allocation in attach_unit
  - byte count for UNIT_MUSTBUF file read into buffer in attach_unit
  - byte count for UNIT_MUSTBUF file write from buffer in detach_unit
  - byte count for UNIT_MUSTBUF buffer copy in sim_save
  - byte count for UNIT_MUSTBUF buffer copy in sim_rest
  - byte count for file read in get_aval
  - byte count for file write in dep_addr

For the DC device, we want to:

  - write words in big-endian format for HPDrive compatibility
  - examine and deposit bytes and 16-bit words
  - examine and deposit machine instructions

Endianness is a problem. Currently, we have aincr = 1 and dwidth = 8. We
cannot specify dwidth = 16 because the default sim_fread in get_aval will
assume that the file is little-endian. But with dwidth = 8, sim_eval gets only
one byte per element, and fprint_cpu will fail. We cannot supply our own
examine routine that reads a pair of bytes into each element because get_aval
masks each element to the dwidth.

This could be handled in fprint_sym and parse_sym by repacking the sim_eval
array when the routines are called for a DC unit. But that adds
device-specific tests to generic routines, which seems undesirable. Also, we
only get half the number of words needed for the longest instruction. We could
get around that by setting sim_emax to twice the number of words, although
that would be inefficient for the normal case of examining CPU memory. (In
fact, the VAX does just this, but it specifies sim_emax as 60!)

Might it be possible to define aincr = 2 and dwidth = 16? We could then write
a device-specific examine to read a pair of bytes into sim_eval in the correct
order. fprint_sym should then "just work," although the returned number of
"addressing units" would have to be doubled for aincr = 2 devices.
Addresses would still be specified in bytes, but they would increment by 2
(the PDP-11 appears to work this way). Specifying an odd address for an
aincr = 2 device could either be rejected or could be handled in fprint_sym by
printing the pair of bytes.

---

Actions:

  Command           Displays
  ----------------  ----------------------------------------
  EX DC0 0-3        bytes 0, 1, 2, 3 in octal
  EX -A DC0 0-3     bytes 0, 1, 2, 3 as characters
  EX -W DC0 0-3     words 0, 2 in octal
  EX -W -H DC0 0-3  words 0, 2 in hex
  EX -M DC0 0-3     words 0, 2 as machine instructions
  EX -C DC0 0-3     words 0, 2 as character pairs
  EX DC0 1          byte 1
  EX -W DC0 1       Command not allowed

Fallback display prints "dwidth" field as PV_RZRO, increments by aincr.
Custom display increments by (1 - return).

Option 1 (aincr = 1, dwidth = 8):

  - sim_eval contains bytes
  - sim_emax must be 20 (10 instructions * 2 bytes/instruction)
  - no custom examine routine needed
  - fallback can be used to display bytes
  - word formats (-C, -M, -W) must pack bytes into words, increment 2

Option 2 (aincr = 2, dwidth = 16):

  - sim_eval contains words
  - sim_emax can be 10
  - custom examine routine reads big-endian word at even address into sim_eval
  - fallback can be used to display words
  - default format displays high/low byte, increments 1
  - word formats (-C, -M, -W) increment 2


---------------------------------
Serial Modem Control Line Support
---------------------------------

Support for modem control lines exists in the following devices:

  - 12587A Asynchronous Data Set Interface
  - 12920A 16-channel multiplexer
  - 12966A BACI
  - 30061A TCI

The lines supported are:

          ----- Control -----  ------- Status --------
          CD  CA  SBA CH  SCA  CC  CF  SBB CB  CE  SCF
  Card    DTR RTS STx SRS SRS  DSR CD  SRx CTS RI  SCD
  ------  --- --- --- --- ---  --- --- --- --- --- ---
  12587A   X   X   X   -   -    X   X   X   X   X   -
  12920A   X   X   X   X   -    X   X   X   X   -   -
  12966A   X   X   X   -   X    X   X   X   X   X   X
  30061A   X   X   X   X   -    X   X   X   X   -   -
  PC Ser   X   X   -   -   -    X   X   -   X   X   -

For any given communication line, there are four possible situations:

  1. Telnet connection, no modem control
  2. Serial connection, no modem control
  3. Telnet connection, modem control
  4. Serial connection, modem control

Modem control is indicated by the simulator calling either
"tmxr_modem_control" or "tmxr_modem_status" for a given line. Once a call has
been made for a given line, that line is considered to be under modem control
from then on (though it might be desirable to have a control call that resets
modem control status).

Modem control is indicated by a flag in the EX_TMLN structure, which means
that the structure must be allocated by either a serial attach or by a modem
control call, and also that it cannot be freed, as modem control status must
persist through a detach/attach sequence.

Modem control and status calls for a serial connection affect the hardware
serial port. Control calls for a Telnet connection have no effect, except that
a DTR drop disconnects the Telnet session. Status calls for a Telnet
connection are simulated; CD, CTS, and DSR are asserted if the line is
connected and denied if the line is disconnected.

The only decision based on modem control is whether to assert DTR and RTS when
a serial connection is made and deny them when the connection is broken. This
is necessary when not under modem control to enable transmission on the port.
To avoid interference, it must not be done if the simulator is operating the
modem lines explicitly.


--------------------------------------
The User View of Terminal Multiplexers
--------------------------------------

The terminal multiplexer devices (BACI, MPX, and MUX for the 2100, ATCD for
the 3000) attempt to present a logical picture of the multiplexer to the user
when interacting via the ATTACH, DETACH, SET, and SHOW commands. This is
complicated by the requirement for a network listening port and associated
attachable unit, and the potential presence of additional units for
controllers or timers.

For example, the 12792 8-channel multiplexer for the 1000 ideally would be
modeled as an 8-unit device, where units 0-7 correspond to multiplexer ports
0-7. However, this device simulation (MPX) also requires a controller unit
(unit 8) and a unit to hold the listening port (unit 9). These units must be
hidden, so they won't appear in a SHOW MPX report. Moreover, we want to
prohibit user access to the hidden units. But we must provide a mechanism to
allow attachment of the listening port.

To meet these goals, we want to allow these commands:

  - ATTACH MPX     to attach the listening port
  - ATTACH MPX0-7  to attach individual serial ports
  - DETACH MPX     to detach the listening port
  - DETACH MPX0-7  to detach individual serial ports

We disallow these commands:

  - ATTACH MPX8-9  to attach the controller and listening unit directly
  - DETACH MPX8-9  to detach the controller and listening unit directly

In addition, we must allow the indirect actions invoked by these commands:

  - RESTORE        to attach both listening and serial ports
  - DETACH ALL     to detach both listening and serial ports
  - EXIT           to detach both listening and serial ports

Complicating the model is the fact that RESTORE and DETACH ALL will call the
MPX attach and detach routines directly for the poll unit (unit 9), which we
must allow, and that EXIT will call the MPX detach routine for all
unattachable units (units 0-7 and unit 8), which we must ignore.

Further complications arise from wanting to be compatible between the three
possible front-ends (3.10 extended, 3.10 base, and 4.0).

For a 3.10xtd ATTACH or DETACH command, the "sim_ref_type" variable is set to
REF_DEV if a device is specified or to REF_UNIT if a unit is specified. For a
3.10xtd RESTORE, DETACH ALL, or EXIT command, the variable is set to REF_NONE.
For 3.10 or 4.0, "sim_ref_type" is a constant REF_DEV. For all versions,
RESTORE sets the SIM_SW_REST switch, EXIT sets the SIM_SW_SHUT switch, and
DETACH ALL sets no switches.
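
The rules in the preceding paragraph can be summarized as a small classifier.
This is a minimal sketch only, not code from SCP. The numeric values given to
the REF_* and SIM_SW_* symbols are placeholders, and the "mux_context"
enumeration, the "extended" parameter, and the "classify_context" function are
hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Placeholder values; the real definitions come from the SCP headers */
enum ref_type { REF_NONE = 0, REF_DEV = 1, REF_UNIT = 2 };

#define SIM_SW_REST (1u << 27)          /* assumed bit position */
#define SIM_SW_SHUT (1u << 30)          /* assumed bit position */

/* Hypothetical summary of what caused an attach or detach routine call */
enum mux_context {
    CTX_EXPLICIT_DEVICE,                /* e.g., ATTACH MPX  under 3.10 extended */
    CTX_EXPLICIT_UNIT,                  /* e.g., ATTACH MPX0 under 3.10 extended */
    CTX_RESTORE,                        /* RESTORE (SIM_SW_REST set) */
    CTX_EXIT,                           /* EXIT (SIM_SW_SHUT set) */
    CTX_DETACH_ALL,                     /* REF_NONE with no switch set */
    CTX_AMBIGUOUS                       /* base 3.10 or 4.0: constant REF_DEV, no switch */
};

/* Classify the calling context from the reference type and switches.
   "extended" is nonzero for the 3.10 extended front end. */
enum mux_context classify_context (int extended, enum ref_type ref,
                                   uint32_t switches)
{
    if (switches & SIM_SW_REST)         /* set by RESTORE in all versions */
        return CTX_RESTORE;

    if (switches & SIM_SW_SHUT)         /* set by EXIT in all versions */
        return CTX_EXIT;

    if (!extended)                      /* ATTACH, DETACH, and DETACH ALL */
        return CTX_AMBIGUOUS;           /*   all look alike without sim_ref_type */

    switch (ref) {
        case REF_DEV:  return CTX_EXPLICIT_DEVICE;
        case REF_UNIT: return CTX_EXPLICIT_UNIT;
        default:       return CTX_DETACH_ALL;
    }
}
```

The CTX_AMBIGUOUS result reflects the point made above: without a variable
"sim_ref_type", an explicit ATTACH or DETACH cannot be told apart from
DETACH ALL by these inputs alone.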

What we want, then, are these actions:

  Command       3.10 extended  3.10 base      4.0
  ------------  -------------  -------------  -------------
  ATTACH MPX    attach net     attach net     attach net
  ATTACH MPX0   attach serial  SCPE_NOATT     SCPE_NOATT
  ATTACH MPX8   SCPE_UDIS      SCPE_UDIS      SCPE_UDIS
  ATTACH MPX9   SCPE_UDIS      SCPE_UDIS      SCPE_UDIS

  DETACH MPX    detach net     detach net     detach net
  DETACH MPX0   detach serial  SCPE_NOATT     SCPE_NOATT
  DETACH MPX8   SCPE_UDIS      SCPE_UDIS      SCPE_UDIS
  DETACH MPX9   SCPE_UDIS      SCPE_UDIS      SCPE_UDIS

  DETACH ALL    detach net     detach net     detach net
                detach serial  impossible     impossible

  RESTORE mpx   detach net     detach net     detach net
                attach net     attach net     attach net
  RESTORE mpx0  detach serial  impossible     impossible
                attach serial  impossible     impossible

These conditions pertain to the listed actions:

  Action       Unit  3.10 extended          3.10 base             4.0
  -----------  ----  ---------------------  --------------------  --------------------
  ATTACH MPX     0   REF_DEV                REF_DEV               REF_DEV
  ATTACH MPX0    0   REF_UNIT               REF_DEV               REF_DEV
  ATTACH MPX8    8   REF_UNIT               REF_DEV               REF_DEV
  ATTACH MPX9    9   REF_UNIT               REF_DEV               REF_DEV

  RESTORE        0   REF_NONE, SIM_SW_REST  impossible            impossible
  RESTORE        8   impossible             impossible            impossible
  RESTORE        9   REF_NONE, SIM_SW_REST  REF_DEV, SIM_SW_REST  REF_DEV, SIM_SW_REST

  DETACH MPX     0   REF_DEV                REF_DEV               REF_DEV
  DETACH MPX0    0   REF_UNIT               REF_DEV               REF_DEV
  DETACH MPX8    8   REF_UNIT               REF_DEV               REF_DEV
  DETACH MPX9    9   REF_UNIT               REF_DEV               REF_DEV

  DETACH ALL     0   REF_NONE               REF_DEV               REF_DEV
                 1   REF_NONE               REF_DEV               REF_DEV
                 8   impossible             impossible            impossible
                 9   REF_NONE               REF_DEV               REF_DEV

  EXIT           0   REF_NONE, SIM_SW_SHUT  REF_DEV, SIM_SW_SHUT  REF_DEV, SIM_SW_SHUT
                 1   REF_NONE, SIM_SW_SHUT  REF_DEV, SIM_SW_SHUT  REF_DEV, SIM_SW_SHUT
                 8   REF_NONE, SIM_SW_SHUT  REF_DEV, SIM_SW_SHUT  REF_DEV, SIM_SW_SHUT
                 9   REF_NONE, SIM_SW_SHUT  REF_DEV, SIM_SW_SHUT  REF_DEV, SIM_SW_SHUT

What we want, then, are these actions for mpx_attach:

  Called for   Unit  3.10 extended  3.10 base      4.0
  -----------  ----  -------------  -------------  -------------
  ATTACH MPX     0   attach net     attach net     attach net
  ATTACH MPX0    0   attach serial  SCPE_NOATT     SCPE_NOATT
  ATTACH MPX8    8   SCPE_UDIS      SCPE_UDIS      SCPE_UDIS
  ATTACH MPX9    9   SCPE_UDIS      SCPE_UDIS      SCPE_UDIS

  RESTORE        0   attach serial  impossible     impossible
  RESTORE        8   impossible     impossible     impossible
  RESTORE        9   attach net     attach net     attach net

The 3.10 extended SCP will set sim_ref_type to REF_DEVICE for an explicit
device reference, to REF_UNIT for an explicit unit reference, and to REF_NONE
for implicit references (i.e., by RESTORE, DETACH ALL, and EXIT). So for 3.10
extended, this is all that is needed:

  Attach
  -----------------------------------------------
  if sim_ref_type = REF_DEVICE
    attach poll_unit
  else
    attach specified_unit

  Detach
  -----------------------------------------------
  if sim_ref_type = REF_DEVICE
    detach poll_unit
  else
    detach specified_unit

...because sim_ref_type can be only one of the three values.

To support 3.10 base as well, the above must be augmented as follows, assuming
that REF_UNIT has been redefined locally to an undefined value (i.e., one that
doesn't match what SCP returns for a unit reference):

  Attach
  -----------------------------------------------
  if sim_ref_type = REF_DEVICE
    attach poll_unit
  else if sim_ref_type = REF_UNIT or REF_NONE
    attach specified_unit
  else
    error

  Detach
  -----------------------------------------------
  if sim_ref_type = REF_DEVICE
    detach poll_unit
  else if sim_ref_type = REF_UNIT or REF_NONE
    detach specified_unit
  else
    error

In this case, we want to prevent all unit attaches and detaches, except in the
case of RESTORE or DETACH ALL, which will attach or detach the poll unit
directly. If we redefine the REF_UNIT value above to something other than the
REF_UNIT value set by the ATTACH/DETACH command processors, then any attempt
to attach or detach a unit via these commands will be rejected.

For 4.x support, the issue is complicated by the absence of the sim_ref_type
value. The above code would work for 4.x if sim_ref_type was set to REF_DEVICE
for ATTACH and DETACH (except DETACH ALL) and to REF_NONE otherwise.
We might do this by hooking ATTACH and DETACH for 4.x only. Otherwise, the
code above must be modified as follows, assuming that sim_ref_type has been
defined as a constant with value REF_DEVICE:

  Attach
  ----------------------------------------------------------------------------
  if sim_ref_type = REF_DEVICE and specified_unit = unit_0
    attach poll_unit
  else if sim_ref_type = REF_UNIT or REF_NONE or SIM_SW_REST in sim_switches
    attach specified_unit
  else
    error

  Detach
  ----------------------------------------------------------------------------
  if sim_ref_type = REF_DEVICE and specified_unit = unit_0 or poll_unit
    detach poll_unit
  else if sim_ref_type = REF_UNIT or REF_NONE or SIM_SW_SHUT in sim_switches
    detach specified_unit
  else
    error

While the above works "universally," it contains redundancies for 3.10
extended. If the value is REF_DEVICE, the specified unit will always be
unit_0, and if sim_switches is SIM_SW_REST or SIM_SW_SHUT, the value will
always be REF_NONE. Moreover, this extra code must be replicated in all
multiplexer devices.

Support for 4.x is problematic and may have to be discontinued, depending on
future changes. So it would be better to have all of the 4.x dependencies
centralized, ideally in a separate source module that could be dropped, but
otherwise at least in the hp_sys module, isolated by conditional compilation.

-----

Could this be simplified? What about:

  status = tmxr_attach_unit (&mpx_desc, &mpx_poll, uptr, cptr);   // or mpx_poll_number?

...where the extended version does the above REF_DEVICE test, and the
non-extended version uses this substitution:

  if (uptr == mpx_desc.dptr->units                                // if unit 0
    || uptr == &mpx_poll && (sim_switches & SIM_SW_REST))         // or poll unit and restoring
      status = tmxr_attach (&mpx_desc, &mpx_poll, cptr);
  else
      status = SCPE_NOATT;

...expressed as:

  #define tmxr_attach_unit(mptr,pptr,uptr,cptr) \
            (((uptr) == (mptr)->dptr->units || (uptr) == (pptr)) \
              ? tmxr_attach (mptr, pptr, cptr) \
              : SCPE_NOATT)

For detach, what about:

  status = tmxr_detach_unit (&mpx_desc, &mpx_poll, uptr);         // or mpx_poll_number?

...for the extended version, and:

  if (uptr == mpx_desc.dptr->units                                // if unit 0
    || uptr == &mpx_poll)                                         // or poll unit
      status = tmxr_detach (&mpx_desc, &mpx_poll);
  else
      status = SCPE_NOATT;

...expressed as:

  #define tmxr_detach_unit(mptr,pptr,uptr) \
            (((uptr) == (mptr)->dptr->units || (uptr) == (pptr)) \
              ? tmxr_detach (mptr, pptr) \
              : SCPE_NOATT)

If this works, then we won't need to simulate "sim_ref_type" nor trap the
ATTACH and DETACH commands in hp_sys.c, and we wouldn't need a special 4.x in
hp_defs.h.


---------------------------------
Calibrated Timers and Breakpoints
---------------------------------

Each time SCP stops execution for a breakpoint, in particular string
breakpoints, all calibrated timers are reinitialized. This is due to the
"sim_rtcn_init_all" call just before the "sim_instr" call in "run_cmd". This
resets the timers to their initial values, which are typically much faster
than the eventual calibrations, and restarts the calibration process. In a
command file that has a large number of prompt/response pairs, the timers are
continually reinitialized, so calibrated operation never occurs.

This can be seen in the HP 3000 "diag-online.sim" execution, where MPE reports
six seconds of CPU time, even though only about three seconds of wall-clock
time have elapsed. Running the same command file with 4.x reports one second
of CPU time.

SIMH 4.x provides for this by replacing the "sim_rtcn_init_all" call with a
call to a new "sim_start_timer_services" routine (sim_timer.c). The comments
for that routine say:

  If we're quickly running again after being stopped for less than the time of
  one calibrated clock tick, then don't force a complete recalibration of any
  timers that may have been previously running.
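
The policy that the quoted comment describes can be sketched as follows. This
is an illustration only and is not the "sim_start_timer_services"
implementation: the "should_reinit_timers" function, its parameters, and the
use of a millisecond wall clock are all hypothetical, and the tick length is
passed in rather than assumed.

```c
#include <assert.h>
#include <stdint.h>

/* Decide whether calibrated timers should be fully reinitialized on resume.

   stop_ms   = wall-clock time at which simulation last stopped
   resume_ms = wall-clock time at which simulation is restarting
   tick_ms   = length of one calibrated clock tick

   Returns nonzero if a full recalibration should be forced, or zero if the
   existing calibration may be kept because the stop was shorter than a tick. */
int should_reinit_timers (uint32_t stop_ms, uint32_t resume_ms, uint32_t tick_ms)
{
    uint32_t stopped_for = resume_ms - stop_ms;     /* unsigned subtraction tolerates wraparound */

    return stopped_for >= tick_ms;
}
```

Under this sketch, a scripted prompt/response stop lasting a few milliseconds
keeps the current calibration, whereas a stop during which the user types at
the console forces the timers to be reinitialized.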

Basically, the initialization call is skipped if the difference between the
current (i.e., restarting) time and the time at which simulation last stopped
is less than ten milliseconds.

We'd like to do the same thing in 3.x, but there currently isn't any way to do
that without replacing "run_cmd" in its entirety. We already shim "run_cmd",
but we can't, e.g., save and restore the calibrated timer arrays around the
initialization call in "sim_timer.c" because they aren't global. Attempting to
re-initialize each timer to its current value in the "sim_instr" postlude
helps somewhat, but it also restarts the one-second initial calibration
period. Typically, the prompts and responses occur much faster than one second
apart, so the calibration period never completes.

It would be possible to hook the call in "run_cmd", but the mistiming only
shows up in situations where large numbers of prompt/response pairs are
scripted and timer calibration is important. In practice, whether the
diagnostic reports the correct CPU time is irrelevant, and in most cases where
the time is important, scripting only occupies a small startup regime, e.g.,
to set the system clock.

At the moment, this issue is unresolved.


----------------------
Buffered Serial Output
----------------------

Buffering TMXR output speeds up Telnet nicely, but it slows down serial
output. Consider a 100-character write. MPE writes 80 bytes plus an ENQ to the
output buffer, taking "n" milliseconds. The ENQ forces a flush, and those 81
bytes are sent to the terminal at 9600 baud. Then a delay ensues until the
terminal returns ACK, and another delay (<= 10 milliseconds) ensues until the
input poll is performed. Then the serial line is idle again while the output
buffer is filled.

This only affects 3.x output, as 4.x serial output is not buffered.

Serial buffering is desirable in the case of the HP2100 MPX device.
When processing input editing characters, MPX may need to output multiple
characters in a single event service call. For example, it responds to
receiving BS by sending SPACE and BS, and to receiving DEL by sending
BACKSLASH, CR, and LF. The current code cannot handle SCPE_STALL, which would
occur (and does occur in 4.x) if the serial line is unbuffered. So MPX calls
"tmxr_linemsg" which, for 4.x, detects the stall and sits in a 10-millisecond
"sleep" loop while waiting for the character to be accepted. This isn't ideal,
as it may cause the clock to lose wall calibration. The 3.x version of
"tmxr_linemsg" just blindly assumes that "tmxr_putc_ln" cannot fail and would
lose characters if a stall occurred. Moreover, the buffer size is a constant
in 3.x, so there's no easy way of shortening the buffer length except by
shimming "tmxr_putc_ln" and calling "tmxr_poll_tx" after each character
output.

Ideally, the first few characters would be flushed from the buffer so that the
serial port could begin output almost immediately after generation. Then,
while those are being output (at a maximum rate of 9600 baud, or about one per
millisecond), the remainder of the buffer could be filled and then posted to
the serial port. This would maximize throughput.

In practice, this isn't that much of a concern, for two reasons. First, a full
line of 80 characters can be output to the buffer in FASTTIME mode in about 3
milliseconds. So the delay only represents an increase of about 4%. Second, an
unconditional buffer flush is performed at every poll service entry, i.e.,
every 10 milliseconds. So that 80-character output string may fall partly
across the poll time, in which case the serial output will begin before the
full string is output. In this case, the 4% delay represents a maximum delay;
the average will be less.
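
The "about 4%" figure can be checked with a little arithmetic. The sketch
below assumes 10 bits per character on the wire (one start bit, eight data
bits, and one stop bit), which the text does not state explicitly; the
function names are illustrative only.

```c
#include <assert.h>

/* Time in milliseconds needed to send "chars" characters at "baud" bits per
   second, assuming 10 bits per character (start + 8 data + stop). */
double line_time_ms (int chars, int baud)
{
    return chars * 10.0 * 1000.0 / baud;
}

/* Added delay, as a percentage of the line time, when transmission cannot
   start until the buffer has been filled in "fill_ms" milliseconds. */
double buffering_overhead_pct (int chars, int baud, double fill_ms)
{
    return 100.0 * fill_ms / line_time_ms (chars, baud);
}
```

For 80 characters at 9600 baud, the line time is about 83 milliseconds, so a
3-millisecond fill time adds roughly 3.6%, which is consistent with the
estimate above.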

As noted above, it would be possible to shim "tmxr_putc_ln" and automatically
flush the buffer after, say, every 30 characters (the BACI does an ENQ/ACK
handshake every 33 characters, whereas the other multiplexers use 80
characters as the threshold). But it isn't necessary and probably wouldn't
have a visible gain for the added complexity.


------------------------------
VM-Specific Handler Interfaces
------------------------------

Optional VM-specific handlers are implemented as pointers that are statically
initialized to NULL. A VM may assign one or more of these to point at routines
that will be called by the SCP. For example:

  extern void (*sim_vm_post) (t_bool from_scp);

and then:

  sim_vm_post = &local_post;

Generally, this is done in the CPU power-on reset section when the initial
memory pointer is NULL, or in the optional one-time VM init. The latter is
done by defining:

  void (*sim_vm_init) (void) = &local_init;

...which overrides the default NULL value that is established otherwise.


---------------------------
Simulation SAVE and RESTORE
---------------------------

The simulator SAVE command saves the values of the following unit structure
fields:

  filename -- the name of the attached file
  time     -- the unit activation time
  flags    -- the user flag values
  capac    -- the unit capacity
  u3       -- a user-specified value
  u4       -- a user-specified value
  u5       -- a user-specified value
  u6       -- a user-specified value

It does not save:

  pos      -- the current file offset
  buf      -- the I/O buffer word
  wait     -- the current service request time

...or any variable not referenced by a REGister element.


------------
Attach Modes
------------

The SCP "attach_unit" routine tries to open all files in read/write mode. This
is to permit EXAMINE and DEPOSIT to work on all device files, regardless of
the underlying device. So, for example, it is deemed desirable to DEPOSIT to a
paper tape reader file or EXAMINE a line printer output file.

This does not usually present any sort of problem, except in a few cases:

  * Read-only devices, such as paper tape readers, will create a new
    zero-length file if the specified file does not exist, unless the -E
    switch is specified.

  * A UNIT_SEQ device, such as a printer, cannot be attached to a pipe because
    the "fseek" resulting from that flag fails.

  * Output to a Unix pipe fails; opening in read/write mode appears to connect
    both ends of the pipe to the same program.

It is desirable to resolve these problems without altering the existing
semantics.

The first problem is easily solved by automatically adding the -E switch to
all read-only devices. This will require adding an attach routine to the HP
2100 PTR device, which is the only device that relies exclusively on the
default attach behavior.

The second problem can be solved by removing UNIT_SEQ from write-only devices,
although this will eliminate the possibility of positioning the output medium
by setting POS. Alternately, UNIT_SEQ can be added or removed dynamically in
the device attach routine, depending on a test for a pipe.

The third problem cannot be accommodated by the existing "attach_unit" action.
This is because all files are opened in read/write (i.e., update) mode, and
this interferes with pipe operation by opening both ends of the pipe. An
additional mode (write-only) must be added for proper pipe operation.

The current "attach_unit" open behavior is influenced by the -E, -N, and -R
switches, and the UNIT_ROABLE flag, as follows:

  -R  -N  -E  RO  Mode  Action
  --  --  --  --  ----  ---------------------------------------------------
   Y   x   x   N   --   "Read only operation not allowed"
   Y   x   x   Y   rb   open for reading
   N   Y   x   x   wb+  truncate to zero length or create file for update
   N   N   x   x   rb+  open file for update

If rb+ fails, then if the error code is EROFS (file resides on a read-only
file system) or EACCES (permission is denied):

   N   N   x   N   --   "Read only operation not allowed"
   N   N   x   Y   rb   open for reading

If the error code is something else:

   N   N   Y   x   --   "File open error"
   N   N   N   x   wb+  truncate to zero length or create file for update

Logically, the existing code does:

  if -R then
    if UNIT_ROABLE then
      rb (read)
    else
      SCPE_RO
  else if -N then
    wb+ (read/write new)
  else
    try rb+ (read/write existing)
    if write not allowed then
      if UNIT_ROABLE then
        rb (read)
      else
        SCPE_RO
    else if -E then
      SCPE_OPENERR
    else
      wb+ (read/write new)

One option is to add a new -W switch that would work in conjunction with the
-N switch as follows:

  if -W then
    if -N then
      wb (write new)
    else
      ab (append new)

This will work, but it has the drawback that append mode does not allow file
positioning ("Opening a file with append mode causes all subsequent writes to
the file to be forced to the then current end-of-file, regardless of
intervening calls to the fseek function"). This would prevent setting POS to
reposition, and so would not require the UNIT_SEQ flag, eliminating the fseek
call prior to execution resumption.

Ideally, for normal files we would want mode "wb+" for -N and "rb+" otherwise,
with an fseek to the EOF after attaching, and mode "wb" for a pipe file. In
the latter case, positioning does not make sense.
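
The decision logic above, together with the -W option that was considered, can
be sketched in C. This is not the "attach_unit" code. The "attach_request"
structure and the two functions are hypothetical, a NULL return stands in for
the SCPE_RO and SCPE_OPENERR error returns, and giving -R precedence over -W
is an assumption, as the text does not say how the two would interact.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct attach_request {
    int sw_r;                           /* -R switch present */
    int sw_n;                           /* -N switch present */
    int sw_e;                           /* -E switch present */
    int sw_w;                           /* proposed -W switch present */
    int roable;                         /* unit has the UNIT_ROABLE flag */
};

/* Return the fopen mode for the first open attempt, or NULL if the request
   is rejected outright (SCPE_RO). */
const char *first_open_mode (const struct attach_request *rq)
{
    if (rq->sw_r)
        return rq->roable ? "rb" : NULL;        /* read-only, if permitted */

    if (rq->sw_w)
        return rq->sw_n ? "wb" : "ab";          /* proposed write-only modes */

    if (rq->sw_n)
        return "wb+";                           /* new file for update */

    return "rb+";                               /* existing file for update */
}

/* Return the mode to retry with after an "rb+" open fails, or NULL if the
   attach must fail.  "write_denied" is nonzero when the error code was
   EROFS or EACCES. */
const char *fallback_open_mode (const struct attach_request *rq, int write_denied)
{
    if (write_denied)
        return rq->roable ? "rb" : NULL;        /* NULL = SCPE_RO */

    if (rq->sw_e)
        return NULL;                            /* NULL = SCPE_OPENERR */

    return "wb+";                               /* create the missing file */
}
```

A caller would try "first_open_mode" and consult "fallback_open_mode" only
when an "rb+" open fails.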

The HP simulator devices use the following unit flags for devices that can be
attached to files (i.e., that call "attach_unit"):

  Device  SEQ  FIX  ROA
  ------  ---  ---  ---
  DS       -    Y    Y
  LP       Y    -    -

  DA       -    Y    Y
  DP       -    Y    Y
  DQ       -    Y    Y
  DR       -    Y    -
  DS       -    Y    Y
  LPS      Y    -    -
  LPT      Y    -    -
  PTR      Y    -    Y
  PTP      Y    -    -
  TTYPUN   Y    -    -

...and the following that call "sim_tape_attach":

  Device  SEQ  FIX  ROA
  ------  ---  ---  ---
  MS       -    -    Y

  MSC      -    -    Y
  MTC      -    -    Y

Pipes only make sense for UNIT_SEQ devices, because they cannot be positioned.
The decision as to which end of the pipe to open (read or write) can be made
by looking at the UNIT_ROABLE flag, which is not present on write-only
devices.

The "stat" function can be called to get the file type of a specified filename
prior to opening. The "st_mode" field will be S_IFIFO for a pipe and S_IFREG
for a normal file. Note that Unix uses S_IFIFO, MSVC uses _S_IFIFO, Mingw uses
_S_IFIFO but defines S_IFIFO as an alias, and Cygwin uses S_IFIFO but defines
_S_IFIFO as an alias. "stat" returns 0 if the file exists and -1 if it does
not.

To handle pipes transparently, we could shim the "attach_unit" routine. The
"ex_attach_unit" shim would operate as follows:

  is_pipe := stat () = 0 and then S_IFIFO in st_mode

  if is_pipe then
    if not UNIT_SEQ then
      error
    else if UNIT_ROABLE then
      add -R
    else
      add -W

  status := attach_unit ()

  if status = OK then
    if is_pipe then
      remove UNIT_SEQ
    else if not UNIT_ROABLE then
      sim_fseek (SEEK_END)
      if ferror then
        clearerr

Callers of sequential devices must set the UNIT_SEQ flag in the associated
device attach routine. Otherwise, a "detach_unit" shim would have to be
created to restore UNIT_SEQ to the unit if the (open) stream refers to a pipe;
this would require an fstat call.

This would then require only this "attach_unit" addition in SCP:

  if -W then
    wb (write new)

...because -R would open "rb" (read-only).

This arrangement does not alter normal file handling but allows pipes to be
specified as output devices.
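
The pipe test at the start of the shim can be sketched with the POSIX "stat"
call. This is a minimal illustration under stated assumptions, not the
"ex_attach_unit" implementation: "path_is_pipe" and "pipe_switch" are
hypothetical names, and the sketch uses the POSIX S_ISFIFO test macro, which
checks the same file-type bits as the S_IFIFO comparison described above. A
build with MSVC would need the _S_IFIFO spelling instead.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/stat.h>

/* Return 1 if "path" exists and names a FIFO (pipe), or 0 otherwise */
int path_is_pipe (const char *path)
{
    struct stat info;

    if (stat (path, &info) != 0)        /* the file does not exist or cannot be examined */
        return 0;

    return S_ISFIFO (info.st_mode) ? 1 : 0;
}

/* Return the switch letter to add when attaching a pipe: 'R' for a unit that
   may be read-only (pipe opened for reading), 'W' for a write-only unit, or 0
   if the unit is not sequential and so cannot use a pipe. */
char pipe_switch (int unit_seq, int unit_roable)
{
    if (!unit_seq)
        return 0;                       /* pipes cannot be positioned */

    return unit_roable ? 'R' : 'W';
}
```

The shim would call "path_is_pipe" before "attach_unit" and, for a pipe, add
the switch that "pipe_switch" selects.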


---------------------------
SIMH Tape Format Extensions
---------------------------

The SIMH tape format specification ("SIMH Magtape Representation and
Handling") says that a conforming tape image contains a series of objects
representing either "metadata markers," such as tape marks, or data records.
Each object is introduced by a 32-bit control word. Several four-byte markers
are defined, as is the format of data records, which begin and end with
identical data length control words that bracket the data payload. All of the
remaining control word bit patterns are reserved.

Data record control words use bit 31 for an error indicator, reserved bits
30-24 "must be zero," and record length bits 23-0 "must be non-zero." However,
the current SIMH tape library does not enforce this. It strips bit 31 and uses
bits 30-0 as the record length.

Enforcing the reserved bits restriction would limit individual data records to
16 MB each. As the library only reads and writes full records, a simulator
capable of reading or writing a record of maximum size would require a 16 MB
buffer. In normal use, this would be far larger than most records. However,
some tape drives are capable of writing a single record that encompasses an
entire tape reel. This cannot be accommodated with the current format.

At any given file position, interpretation of a tape image begins with a
four-byte control word. The current specification defines two divisions of
control words: markers and record lengths.
The assignments are:

  Control Value Range  Assignment
  -------------------  --------------------------------------------
  00000000             Tape mark
  00000001 - 00FFFFFF  Good data record, 1 to 16,777,215 bytes long
  01000000 - 80000000  Reserved
  80000001 - 80FFFFFF  Bad data record, 1 to 16,777,215 bytes long
  81000000 - FFFEFFFE  Reserved
  FFFEFFFF             Erase gap (half-gap in forward reads)
  FFFF0000 - FFFF00FF  Erase gap (half-gap in reverse reads)
  FFFF0100 - FFFF7FFF  Reserved
  FFFF8000 - FFFF80FF  Erase gap (half-gap in reverse reads)
  FFFF8100 - FFFFFFFC  Reserved
  FFFFFFFE             Erase gap (primary value)
  FFFFFFFF             End of medium

Graphically, the primary control words are as follows:

   31  30  29  28  27  26  25  24  23  22  21   [...]   2   1   0
  +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
  | 0   0   0   0   0   0   0   0 | 0   0   0   [...]   0   0   0 |  tape mark
  +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
  | 0 | 0   0   0   0   0   0   0 |          length > 0           |  good data record
  +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
  | 1 | 0   0   0   0   0   0   0 |          length > 0           |  bad data record
  +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
  | 1   1   1   1   1   1   1   1 | 1   1   1   [...]   1   1   0 |  erase gap
  +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
  | 1   1   1   1   1   1   1   1 | 1   1   1   [...]   1   1   1 |  end of medium
  +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+

A problem with the current implementation is that a conforming reader that
encounters a reserved control word in the image file does not know how to
recover. This hobbles the introduction of new SIMH tape features, as a
simulator using an older tape library version will fail when it encounters a
tape image written by a newer library version.

Having no interpretation of the reserved control word, the reader can recover
only by advancing the file position to resynchronize with the known format.
The problem is that the unknown word may be a single four-byte marker or may
introduce a record of some undefined length.
Without knowing this, the reader can only report a fatal error. An additional limitation of the current format is that data records carry only two identifications: "good" or "bad." Especially when generating tape images from physical magnetic tapes, it may be desirable to retain additional information with the image. For example, a "marginal data" indication (i.e., the data was recovered through error correction rather than a clean read) may be pertinent. It may be desirable to keep data associated with the physical tape (e.g., data density, locations of parity errors within "bad" records, text from the tape label) with the image. It may even be desirable to retain the original NRZI or PE flux changes along with the recovered data. Also, specific tape drive simulators may wish to store private data with the image. For instance, the HP 9144A Cartridge Tape Drive can report to the user information from the tape that indicates whether the cartridge was factory or user certified. This information is stored "outside" of the user-accessible data area. This drive also differentiates between a data record that has been written and one that has been "formatted" but never written. There is no way to represent these requirements within the existing format. Finally, some tape controllers allow a single data record spanning the entire length of a 2400-foot reel to be written. At 6250 bpi, this represents over 170 MB of data. The existing 16 MB limit prevents such a record from being stored in a SIMH tape image without artificially dividing it into smaller sections. As a result of these limitations, three changes to the interpretation of the existing format are proposed: 1. All control words, including reserved values, are placed into either the marker division or the record division. 2. These two divisions are further subdivided into classes "reserved for SIMH use" and classes "reserved for private use." 3. 
The length allocation for the record division is extended from 24 to 28
bits.

The SIMH tape library will be revised to ignore all unknown SIMH-reserved
objects present in an image file. Private-reserved objects will also be
ignored unless private data support is explicitly enabled. A simulator
requesting such "extended" support will be able to write private markers and
data records, and will obtain private objects when reading. If extended
support is disabled, the library's record reading routines will advance past
such objects until a known marker or data record is encountered. That is,
they will be treated as though they were erase gaps, taking up space in the
file but otherwise "invisible" to the caller.

The benefits of this proposal are:

  - Data records up to 256 MB are possible, ensuring that a single record
    spanning the entire tape reel can be represented.

  - Private data can be kept in the same file as SIMH-standard information.

  - A conforming reader will automatically ignore unrecognized objects in an
    image file. In particular, the standard data part of a tape image
    containing private data can be successfully read by a reader that does
    not understand the extended format.

  - Existing simulators will not be affected either by private data or newer
    SIMH-standard formats.

To provide this support, the following control word interpretation is
proposed:

   31  30  29  28  27  26  25  24  23  22  21   [...]   2   1   0
  +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
  | control class |             marker-specific value             |  marker
  +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
  | control class |               data length value               |  record
  +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+

Control words are written in little-endian format in the tape image file,
regardless of host platform orientation, maintaining compatibility with the
current SIMH format.
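As an illustration of the proposed layout, the following sketch assembles a
control word from four bytes in file order and splits it into its 4-bit class
and 28-bit value. The function names are hypothetical and are not part of the
SIMH tape library; the code assumes only the bit positions and byte order
stated above.

```c
#include <stdint.h>

/* Hypothetical helpers for illustration; not part of the SIMH tape library. */

/* Assemble a 32-bit control word from four bytes in file (little-endian)
   order, independent of the host byte order. */
static uint32_t ctl_from_bytes (const uint8_t b [4])
{
return (uint32_t) b [0]
     | (uint32_t) b [1] << 8
     | (uint32_t) b [2] << 16
     | (uint32_t) b [3] << 24;
}

/* Control class: bits 31-28 of the control word. */
static unsigned ctl_class (uint32_t ctl)
{
return (unsigned) (ctl >> 28);
}

/* Marker-specific value or data record length: bits 27-0. */
static uint32_t ctl_value (uint32_t ctl)
{
return ctl & 0x0FFFFFFFu;
}
```

For example, the file bytes FE FF FF FF assemble to FFFFFFFE, which decodes as
class F with value FFFFFFE.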
The following backward-compatible control class assignments are proposed:

  Class  Value    Assignment
  -----  -------  --------------------------------------
    0       0     Tape mark
    0      >0     "Good" data record
    1      any    \
   ...            | Private data records
    6      any    /
    7      any    Private single-word marker
    8       0     "Bad" data record, no data recovered
    8      >0     "Bad" data record, some data recovered
    9      any    \
   ...            | SIMH-reserved data records
    E      any    /
    F      any    SIMH-reserved single word marker

Currently, two Class F markers are defined: one indicating an erase gap, and
the other indicating the end-of-medium.

An erase gap appears in a file as a set of four-byte erase gap markers. The
count of markers reflects the physical length of the gap at the assumed
density. For example, a three-inch gap at 800 bpi would occupy 2400 bytes on
a physical tape. In a tape image representing an 800 bpi tape, 600 four-byte
erase markers would be written.

A tape file need not end with an EOM marker; the physical end-of-file serves
the same purpose. However, if an EOM is present, the SIMH tape library will
not read or position past the marker. This implies that an EOM will never be
seen when reading an image in the reverse direction.

Part of the Class F marker range must be reserved to recognize
"half-erase-gap" markers. These arise because data records occupy a multiple
of two bytes, while markers occupy four bytes. If a data record that
overwrites a longer erase gap occupies a multiple of four bytes, then it
would overlay an integral number of erase gap markers. Interpretation in
this case is straightforward, as the first control word following the
trailing data record length word is the defined erase gap marker. However,
if the overwriting data record occupies only a multiple of two bytes, then
it will overlay half of the erase gap marker that follows the trailing
record length word. Graphically, the problem is as follows:

  ...
FE FF FF FF FE FF FF FF Original erase gap 02 00 00 00 | | | New trailing data record length word | | \ / \ 02 00 00 00 FF FF FE FF FF FF Resulting tape image ----------- ------------ Length Word Erase Gap A forward read of the image after the data record retrieves the Class F marker value of FF FF FE FF because it reads half of the overwritten marker and half of the following full marker. This special value is recognized as a "half-gap" marker, and reading is realigned by backing up the file position by two bytes. A forward read then continues with the first full erase gap marker of the remaining gap. When reading in reverse, the problem is more difficult. Referring to the above diagram, after reading full four-byte erase gap markers as the file position retreats toward the start of the remaining erase gap, the four-byte value preceding the last full marker of the original gap consists of half of the overwritten gap marker (FF FF) and half of the upper two bytes of the preceding data record length word (00 00). In this case, the Class F marker will have the value FF FF 00 00. This too is recognized as a half-gap marker, and realignment is done by backing up the file position by two bytes to point at the first byte of the erase gap. A reverse read then continues with the data record by retrieving the complete four-byte trailing data record length word. The difficulty is that "the half-gap marker" is actually a range of Class F marker values. They all start with FF FF (the truncated half of the first erase gap marker), but the following two bytes from the upper part of the length word may assume almost the full range of 16-bit values: from 00 00 through FF FD. We cannot allow the values FF FE or FF FF, because then the marker would have the same value as a full erase gap or EOM marker. 
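The half-gap cases described above can be sketched as a small classifier. This
is a minimal illustration, not the library's code: the enumeration and
function names are hypothetical, and the control word is assumed to have
already been assembled from its little-endian file bytes, so that the bytes
FF FF FE FF appear as FFFEFFFF and the bytes 00 00 FF FF appear as FFFF0000.

```c
#include <stdint.h>

/* Hypothetical classification of the Class F values discussed above. */
typedef enum {
    CF_ERASE_GAP,                       /* FFFFFFFE: full erase gap marker */
    CF_EOM,                             /* FFFFFFFF: end of medium */
    CF_HALF_GAP_FWD,                    /* FFFEFFFF: half-gap, forward read */
    CF_HALF_GAP_REV,                    /* FFFF0000 - FFFFFFFD: half-gap, reverse read */
    CF_OTHER                            /* any other control word */
    } cf_kind;

static cf_kind cf_classify (uint32_t ctl)
{
if (ctl == 0xFFFFFFFFu)                 /* full markers are tested first */
    return CF_EOM;
else if (ctl == 0xFFFFFFFEu)
    return CF_ERASE_GAP;
else if (ctl == 0xFFFEFFFFu)            /* half old marker + half next marker */
    return CF_HALF_GAP_FWD;
else if ((ctl >> 16) == 0xFFFFu)        /* upper half FFFF, lower half 0000-FFFD */
    return CF_HALF_GAP_REV;
else
    return CF_OTHER;
}
```

In either half-gap case, a reader would then adjust the file position by two
bytes, as described above, before continuing.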
Graphically, then, the range of values that must be interpreted as half-gap
markers is shown below:

        FE FF FF FF     Erase gap (primary value)
  01 00 00 00   |       Data class 0, lowest count
  FF FF FD FF   |       Marker class F, highest value
          |     |
        ----- -----
        00 00 FF FF     Half-gap read in reverse direction (lowest)
        FD FF FF FF     Half-gap read in reverse direction (highest)

As the lower two bytes of the half-gap marker come from the upper two bytes
of the preceding control word, Class F markers with values above FF FD FF FF
must be reserved for half-gap interpretation and cannot be assigned as valid
markers. Any Class F marker from the start of the range to the above value
can be designated as a future marker value.

Therefore, the Class F range assignments are as follows:

  F0000000 - FFFDFFFF  Reserved for future use (available)
  FFFE0000 - FFFFFFFD  Reserved for erase gap interpretation
  FFFFFFFE             Erase gap (primary value)
  FFFFFFFF             End of medium

Values within the reserved erase-gap interpretation subrange are as follows:

  FFFE0000 - FFFEFFFE  Illegal (would be seen as full gap in reverse reads)
  FFFEFFFF             Interpret as half-gap in forward reads
  FFFF0000 - FFFFFFFD  Interpret as half-gap in reverse reads

A conforming writer will never write the illegal marker values, and a
conforming reader will recognize the half-gap marker values and
resynchronize as described above.


Library Implementation
~~~~~~~~~~~~~~~~~~~~~~

A simulator indicates that it wants to use the extended SIMH tape format for
a given unit by including a new MT_F_STDEX symbol in the static
initialization of the unit's flags. It is defined as:

  #define MTUF_F_STDEX    5
  #define MT_F_STDEX      (MTUF_F_STDEX << MTUF_V_FMT)

The macro defines the unit flag bits that specify the extended SIMH format.
A simulator can include this symbol in the static UNIT initialization, and
it uses a unit flag area that is already reserved, so the user unit flags
are not affected.
A magnetic tape simulator that wants to support multiple tape formats
including extended SIMH format must declare extended support initially and
allow the user to change formats with calls to "sim_tape_attach" that
specify the -F switch, or to "sim_tape_set_fmt". When these routines are
called with extended format currently enabled, a flag is set in the unit's
dynamic flags field that allows future calls to return to extended format.
An attempt to change to extended format when the flag is not set will be
rejected with an "Invalid argument" error. This ensures that a tape
simulator that does not understand the extended format will not permit the
user to select the format.

When the format is set to MT_F_STDEX, the read routines will return private
data records and markers, rather than skipping over them. A simulator not
prepared to receive private objects must not permit the user to select the
extended SIMH format. To ensure this, the "sim_tape_set_fmt" routine is
modified to check the current format on entry. If it is MT_F_STDEX, then a
"dynamic unit flag" is set to indicate that the "SIMHEX" format is allowed.
A simulator that supports private objects will initialize the unit format to
MT_F_STDEX, while existing simulators will default the format to MT_F_STD.
The initial entry format establishes whether the routine will allow the
extended format for current and future calls.

A simulator supporting extended format may see one new status code:

  #define MTSE_RESERVED   12

...as described below. This code is only returned if the unit is configured
for extended SIMH format.

Once the extended format is enabled, private objects may be written or read
by tape library routines. This is accomplished by adding a new routine to
write private markers:

  - t_stat sim_tape_wrmrk (UNIT *uptr, t_mtrlnt mk);

The "mk" parameter specifies the private marker class in the upper four bits
and the marker-specific value in the lower 28 bits.
If the format is not MT_F_STDEX, an MTSE_FMT error is returned. If the class
is not the private marker class, an MTSE_RESERVED error is returned.

To accommodate private data records, the existing write routine is
overloaded as follows:

  - t_stat sim_tape_wrrecf (UNIT *uptr, uint8 *buf, t_mtrlnt cc);

The "cc" parameter specifies a data record class in the upper four bits and
a data record length in the lower 28 bits.

If the format is not MT_F_STDEX, then the class can be either the standard
"good" or "bad" class, i.e., upper four bits are zero or eight; if another
class is specified, MTSE_FMT is returned. If a "good" record is specified
with a zero length, then the routine returns MTSE_OK with nothing done.

If the format is MT_F_STDEX, then a private data record class may also be
specified. If the specified class is a SIMH-reserved class or the private
marker class, MTSE_RESERVED is returned. If a "good" record is specified
with a zero length, MTSE_INVRL is returned.

Private markers and data records are read with the standard read routines,
overloaded as follows:

  - t_stat sim_tape_rdrecf (UNIT *uptr, uint8 *buf, t_mtrlnt *cc, t_mtrlnt max);

  - t_stat sim_tape_rdrecr (UNIT *uptr, uint8 *buf, t_mtrlnt *cc, t_mtrlnt max);

When the standard format is enabled, the current semantics are unchanged.
The "cc" parameter returns just the length portion of the data record
marker, and the return status is MTSE_OK for a "good" record, MTSE_RECE for
a "bad" record, or a status code corresponding to a standard marker, such as
MTSE_TMK. All other objects present in the tape image are ignored.

When the extended format is enabled, the variable addressed by the "cc"
parameter must be set before calling the routine to a bitmap of the object
classes to return. Each of the classes is represented by its corresponding
bit, i.e., bit 0 represents class 0, bit 1 represents class 1, etc. The
routine will return only objects from the selected classes.
Unselected class objects will be ignored by skipping over them until the first selected class object is seen. This allows a simulator to declare those classes it understands (e.g., standard classes 0 and 8, plus private classes 2 and 7) and those classes it wishes to ignore. Markers and data records in the SIMH-reserved classes are read and interpreted by the tape library, regardless of whether or not the corresponding class bits are set. The bits only affect whether the objects are returned to the caller. Setting the bitmap to zero (no classes selected) will cause the routine to return only when it encounters a tape mark, end-of-medium, or the physical end of file -- an action identical to that of the "space record forward" routine. On return, the variable addressed by the "cc" parameter contains either the marker class in the upper four bits and the marker-specific value in the lower 28 bits, or a data record class in the upper four bits and a data record length in the lower 28 bits. The new MTR_C macro may be used to extract the class, and the MTR_L macro may be used to extract the data length. Standard markers are indicated by the appropriate MTSE return status values. If the SIMH-reserved marker class is selected, the marker will be returned in addition to being interpreted by the tape library. If a "bad" class data record is selected and read, MTSE_RECE will be returned, in addition to the class and length in the "cc" parameter variable. Reads of the "good" data class and all private data and marker classes return MTSE_OK. If SIMH-reserved classes are selected and read, MTSE_RESERVED is returned if the object is not recognized by the tape library; otherwise, MTSE_OK is returned for data records, or the MTSE value appropriate for the standard marker is returned. Reads of marker class objects do not use "buf" and "max" parameters. 
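The class selection bitmap described above can be sketched as follows. The
macro and function names are hypothetical and chosen only for illustration;
the sketch assumes the stated convention that bit n of the bitmap selects
class n, and that the class occupies the upper four bits of a control word.

```c
#include <stdint.h>

/* Hypothetical helpers for illustration; not part of the SIMH tape library. */

#define CLASS_BIT(c)    ((uint32_t) 1u << (c))      /* bit n selects class n */

/* Classes a simulator might declare: standard classes 0 and 8, plus private
   classes 2 and 7, as in the example in the text. */
#define MY_CLASSES      (CLASS_BIT (0) | CLASS_BIT (8) \
                        | CLASS_BIT (2) | CLASS_BIT (7))

/* Return nonzero if the class of the object (bits 31-28 of its control word)
   is selected in the bitmap. */
static int class_selected (uint32_t bitmap, uint32_t ctl)
{
return (bitmap & CLASS_BIT (ctl >> 28)) != 0;
}
```

Under these assumptions, a caller would store MY_CLASSES (hexadecimal 0185) in
the variable addressed by "cc" before calling a read routine, and a class 1
private record would be skipped while a class 2 record would be returned.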
When a standard or private data record is read, a new MTR_C macro may be used to extract the class, and the MTR_L macro may be used to extract the data length. If a simulator supports multiple tape formats, the extended format is selected by specifying the name "SIMHEX" to a SET FORMAT or ATTACH -F command. These commands call "sim_tape_set_fmt" or "sim_tape_attach", respectively (the latter calls "sim_tape_set_fmt" internally). This routine is modified to add a new "SIMHEX" format name, corresponding to the MT_F_STDEX value, to its table of formats. Rejected Alternates ~~~~~~~~~~~~~~~~~~~ Alternate ways of enabling the extended SIMH tape format: - A new MTUF_STDX unit flag. Including this symbol in the static UNIT initialization enables the use of the extended SIMH format when the standard SIMH format is selected, i.e., when MT_F_STD is used. If the unit also accepts the ATTACH -F command to set the tape format, then the flag is ignored when one of the other formats is enabled. A drawback is that this method reduces the available user unit flags for tape devices, which may impact existing simulators. - A "sim_tape_set_fmt" routine call. This requires a call from the unit's power-on reset routine that specifies a new format string identifier (e.g., "SIMHEX"). A drawback is that if the tape device supports multiple formats, and the user has selected a different format (e.g., TPC), this will reset it if RESET -P is entered. - A new "sim_tape_extend" routine call. This would establish the extended SIMH format. It requires adding a new library routine that must be called from the unit's power-on reset routine. This has the same drawback as above, i.e., resetting the preference if the user enters the RESET -P command. 
For the second option, private objects are written with these two new tape library routines: - t_stat sim_tape_wprecf (UNIT *uptr, uint8 *buf, t_mtrlnt cc); The "cc" parameter specifies the private record class in the upper four bits and the data record length in the lower 28 bits. The class must be one of the private record classes, or MTSE_RSRVD is returned. If the format is not STDX, MTSE_FMT is returned. - t_stat sim_tape_wrmrk (UNIT *uptr, t_mtrlnt mk); (Operation is as described earlier.) Private objects are read with these new tape library routines, which also read SIMH-provided objects: - t_stat sim_tape_rprecf (UNIT *uptr, uint8 *buf, t_mtrlnt *cc, t_mtrlnt max); - t_stat sim_tape_rprecr (UNIT *uptr, uint8 *buf, t_mtrlnt *cc, t_mtrlnt max); The "cc" parameter returns the class in the upper four bits and the record length in the lower 28 bits. The MTR_C macro may be used to return the class, and the MTR_L macro may be used to return the data length. [...except that sim_tape_rprecX will read either a regular data record, a private data record, or a private marker. Regular markers are indicated by the MTSE code. This is identical to option 1...] -------------------- Expanded MTAB Access -------------------- The modifier structure (MTAB) used by the device- and unit-specific SET and SHOW commands provides a simple and elegant method of manipulating the 16 bits of the user portion of the UNIT flags field -- the regular MTAB. For other modifier targets or to modify numeric value fields, the extended MTAB mechanism offers complete flexibility, albeit at the cost of the simplicity of regular MTABs. The main cost is that extended MTABs require the use of validation and print routines. Often those routines are trivial, doing nothing more than setting or clearing a flag in the DEVICE flags field, or printing a word if the flag bit is set. 
If the extended SET takes a numeric value parameter, then the validation
routine must parse the parameter, validate against any minimum and maximum
value restrictions, mask the target, and insert the new value. If several
fields are to be set, then several validation and print routines must be
written, each of which is only slightly different from the others.

The extra work to use extended MTABs to manipulate DEVICE flags creates the
temptation to use a UNIT flag for an option that is logically part of the
device (e.g., a strap on the controller card) rather than part of the
connected peripheral (e.g., a strap on the drive). For devices with single
units, distinguishing between the device and the unit is usually not needed.
For devices with multiple units, though, this blurs the distinction between
them, potentially creating confusion for the user when issuing SET or SHOW
commands ("Do I specify the device name or the unit name?").

It would be helpful to have a mechanism that combines the ease of use of
regular MTABs with the flexibility of extended MTABs. This is a proposal for
such a mechanism -- the expanded MTAB. The goal is to eliminate the need for
validation and print routines in the majority of cases, or, if elimination
is not possible, then to reduce the code in such routines to just the unique
processing required.

Expanded MTABs work with device and unit SET/SHOW commands to modify:

  - The user portion of the "flags" field of the DEVICE or UNIT structure.

  - Any portion of a user field (e.g., "u4") of the UNIT structure (or of
    the DEVICE structure, should one be added in the future). For units, the
    field designated in unit 0 identifies the field that will be modified
    when another unit is specified (e.g., a command specifying unit 3
    modifies the indicated field in the fourth UNIT in the units array).

  - Any portion of a global "uint32" scalar variable for a device command or
    a global "uint32" array element corresponding to the unit number for a
    unit command.
The modifications may apply to: - A flag or set of flags indicated by a "SET