HP 2000A Source Reconstruction
================================

J. David Bryan
2019-09-12


Al Kossow's "hp_dsd_read_20190610.zip" archive contains the "20596.tap" tape
image.  This tape contains the 36 source files for revision F of HP 2000A
Time-Shared BASIC (20596-80001 through 20596-80036).  However, 20596-80017 is
bad.  It starts with routine "#STST" at address 23105 of the binary and ends
with instruction "STA SBPTR" at address 23145.  The next source file,
20596-80018, resumes with routine "TAPER" at address 24407.  So some 600+
instructions are missing.

Fortunately, absolute binaries for revision F were also available from
Bitsavers.  Also, each TSB version (A, B, C, etc.) was built on the prior
version, and the sources for 2000E and 2000F are available, as is a listing of
2000C.  So much of the missing code could be resurrected by identifying those
parts of C, E, and F that were common or similar.

The reconstruction process began with reverse-assembling the binaries to
produce pseudo-source listings.  This proved to be somewhat difficult, in that
while there are a number of inverse assemblers in the HP 1000 Contributed
Software Library (CSL), all but four of them handle relocatable binaries only.
One of the four that do handle absolute binaries (program "IMREL") produces
output compatible with the RTE-6/VM "MACRO" assembler, rather than the earlier
"ASMB" format.  Of the remaining three, two run under SIO and BCS, which are
awkward to use.  The remaining one (the "DOS-M ABSOLUTE OBJECT DECODER") runs
under DOS.  This program was run on a DOS-III system running on the HP2100 SIMH
simulator.

The missing sections were first delineated from the inverse assembly listing
and then identified by comparing the assembly-language instruction sequences to
those in the 2000E source.  Many of the missing routines were unchanged in the
E source and could be copied verbatim to the reconstructed 80017 source file.
A few had minor changes (e.g., for the difference in supported disc drive
geometries) that were easily identified and added.  Some of the routines had
been reordered from A to E, so some visual scanning of the sources was needed.

The only difficult part was the section headed "ARITHMETIC SUBROUTINES."  This
is because 2000A used software floating-point routines, whereas 2000E uses the
hardware floating-point instructions present in its target CPU.  Fortunately,
the 2000C listing had software FP routines that were similar -- but not
identical -- to those present in the 2000A inverse-assembly listing.  So some
changes had to be introduced to ensure that the resulting reconstructed source
matched the existing binary.

Once the 80017 file was reconstructed, it was verified by assembling the entire
2000A system consisting of files 20596-80008 through 20596-80036 (80001 through
80007 comprise the TSB loader).  Assembly was performed on a (simulated)
RTE-IVB system using the RTE assembler.

The resulting absolute binary file could not be compared directly to the
existing TSB binaries.  This is because the various HP assembler versions
produce differing -- though equally valid -- absolute binary files.  Consider,
for example, a file containing one absolute record of 20 words loading at
address 100, and another file containing two absolute records of 10 words
loading at addresses 100 and 110.  The files would not compare, even though the
CPU memory images from loading the two files would be identical.  So to compare
the reconstructed binary with the existing binary, both had to be run through
the inverse assembler to produce the equivalent assembly-language listings.
These were compared and found to be identical, proving that the reconstructed
source file matches the original (except, perhaps, for the comments).

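The record-splitting example above can be illustrated with a minimal Python
sketch.  It is not the tool used for the reconstruction (that comparison was
done on inverse-assembly listings), and it does not parse the real HP absolute
binary format.  As an assumption made purely for illustration, each record is
modeled as a (load address, list of words) pair, the helper name "load_records"
is hypothetical, and the addresses are treated as plain integers.

```python
def load_records(records):
    """Build a memory image (address -> word) from (load_address, words) records.

    The record model is an illustrative assumption, not the HP file format.
    """
    memory = {}
    for address, words in records:
        for offset, word in enumerate(words):
            memory[address + offset] = word
    return memory


# Twenty arbitrary data words standing in for assembled code.
data = list(range(20))

# One record of 20 words loading at address 100.
one_record = [(100, data)]

# Two records of 10 words each, loading at addresses 100 and 110.
two_records = [(100, data[:10]), (110, data[10:])]

# The record streams differ, so a direct file comparison would fail...
files_identical = one_record == two_records

# ...but the memory images produced by loading them are the same.
images_identical = load_records(one_record) == load_records(two_records)

print("files identical: ", files_identical)
print("images identical:", images_identical)
```

Comparing at the level of the loaded image (or, as was done here, at the level
of the listing derived from it) removes the dependence on how a particular
assembler chose to split its output into records.
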
The reconstructed assembly-language statements are delineated by the comments:

  [JDB] START OF RECONSTRUCTION OF 20596-80017

...and:

  [JDB] END OF RECONSTRUCTION OF 20596-80017

in the 20596-80017 source file.