Wednesday, 30 May 2012

My First Program, My First Software Bug, My First Software Engineering Lesson (1966)

Introduction to Computer Programming

In 1966, in the last year of my secondary schooling, I had the great fortune to attend a week-end camp for science students, addressed by a number of scientists from a range of disciplines. One speaker who caught my attention spoke about computer programming and introduced us to FORTRAN-2. After less than two hours of instruction, I wrote a short program on coding forms, which our lecturer took back to his office that night, punched onto (Hollerith) cards and ran on his work computer, bringing the output listing back the next day.

The Program

I had chosen to compute and list the surface area and volume of a series of spheres over a range of diameters.

The Bug

Astute or experienced programmers will probably have guessed the bug already. Yes, I had used a floating-point variable as an iterator, with an equality test for loop termination. So of course my loop never terminated, and my program ran for its full two-minute time allocation, producing a couple of hundred pages of output.
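For readers who'd like to see the trap in a modern notation, here is a minimal sketch in C of what my FORTRAN-2 loop did (the diameter step, the limit and the variable names are my inventions for illustration):

    #include <stdio.h>

    int main(void)
    {
        double pi = 3.14159265;

        /* BUG: 0.1 has no exact binary representation, so d never
           compares exactly equal to 5.0 and the loop never terminates. */
        for (double d = 0.1; d != 5.0; d += 0.1)
            printf("%6.2f %10.4f %10.4f\n",
                   d, pi * d * d, pi * d * d * d / 6.0);

        /* FIX: iterate on an integer and derive the diameter from it. */
        for (int i = 1; i <= 50; i++) {
            double d = i / 10.0;
            printf("%6.2f %10.4f %10.4f\n",
                   d, pi * d * d, pi * d * d * d / 6.0);
        }
        return 0;
    }

The first loop is exactly the mistake: after fifty additions of 0.1 the accumulated value sits a hair above 5.0, the equality test never fires, and the program runs until the operator (or the time allocation) kills it.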

The Lesson

I hadn't yet learnt about the imprecision of representing real numbers in finite binary form. The lesson was not just about the limitations of numerical representation, but that computing is a harsh master, so fastidiously precise that it is almost dumb. I had learnt that software engineering spans theory (the mathematical logic of algorithms) and experiment - that to err is human and the norm - that the job's not finished until the inevitable errors are found and debugged.

If I had not made that first mistake, I wouldn't have learnt much!

Tuesday, 29 May 2012

Hospital Laboratory Data Acquisition (1971)

This was my first full-time job after completing my BSc in Computer Science, taken during a 12-month deferment in 1970, before returning to do my Honours year and MSc research. It was supposed to be a small job: finishing off some debugging and documentation. Little did I realize that I would end up using much of the software engineering and operating-systems knowledge I had just finished studying.

At this time, computing in hospitals was almost exclusively administrative batch processing in remote data centres. Real-time, on-line processing was very new, rare and isolated. Commercialized automated equipment was still to make its mark.

Introduction to Biochemical Analysis

The biochemistry department already used a degree of automation, in the form of continuous-flow chemical analyzers (flame photometers and colorimeters), and was an early adopter of computing. Patients' biochemical samples are placed into 10ml sample cups on a carousel of 40 cups, interspersed with standard assay samples for calibration. Each sample is aspirated in turn (alternating with distilled water to separate the samples) into the continuous-flow system of tubes, where it is split into multiple parallel streams (up to 8 parallel tests are performed). Each stream is mixed with reagents that produce a colour proportional to the concentration of the chemical being tested (measured by a colorimeter), or is sprayed into a flame photometer where the colour of the flame indicates the concentration of the chemical (typically the sodium and potassium electrolytes).

The output signals from the colorimeters and photometers are recorded onto paper-roll chart recorders. The flow of samples and reagents in the plastic tubes is interspersed with bubbles of air to minimise inter-sample contamination. The charts record a wavy line in which the height of each peak represents the result of a single sample. In the manual method of reading the results, an operator drew a curve through the peaks of the 5 standard assays of known concentrations at the start of each batch. The height of the peak corresponding to each patient's sample was read off against the standard curve to give the concentration of that chemical in that patient's sample. The results were hand-written onto each patient's master record card, a photocopy of which was sent back to the requesting doctor.

Computing Meets Biochemical Analysis

The system I inherited was a small PDP-8/S (16K 12-bit words of memory and 32K words of directly addressable drum storage) interfaced through ADCs (Analogue-to-Digital Converters) to the chart recorders.

A data entry process records the identifications of all the samples in each batch to be tested. As the batch test runs, the computer "reads" the height of each peak on each chart. The peak heights and assay concentrations of the standards are used to generate a 3rd-order best-fit polynomial representation of the standards curve, against which the height of each patient's sample peak is read off. The results from all the charts are collated (due to different flow lag times, the charts are not in sync) and printed on a strip of adhesive label along with the patient/sample identifier.
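For the curious, here is a small sketch in C of that curve-fitting step - my own reconstruction, not the original assembler, with made-up standard values - fitting concentration as a cubic in peak height via the normal equations, then reading a sample off with Horner's method:

    #include <stdio.h>

    #define NSTD 5    /* five standard assays per batch */
    #define NCF  4    /* a cubic has 4 coefficients     */

    /* Least-squares fit of conc = c0 + c1*h + c2*h^2 + c3*h^3 to the
       standards: build the normal equations, then solve the 4x4 system
       by Gaussian elimination (no pivoting - fine for illustration). */
    static void fit_cubic(const double h[], const double conc[], double c[])
    {
        double a[NCF][NCF + 1] = {{0}};       /* augmented normal matrix */

        for (int k = 0; k < NSTD; k++) {
            double p[2 * NCF - 1];
            p[0] = 1.0;
            for (int m = 1; m < 2 * NCF - 1; m++)
                p[m] = p[m - 1] * h[k];       /* powers h^0 .. h^6 */
            for (int i = 0; i < NCF; i++) {
                for (int j = 0; j < NCF; j++)
                    a[i][j] += p[i + j];
                a[i][NCF] += p[i] * conc[k];
            }
        }
        for (int i = 0; i < NCF; i++)         /* forward elimination */
            for (int r = i + 1; r < NCF; r++) {
                double f = a[r][i] / a[i][i];
                for (int j = i; j <= NCF; j++)
                    a[r][j] -= f * a[i][j];
            }
        for (int i = NCF - 1; i >= 0; i--) {  /* back substitution */
            c[i] = a[i][NCF];
            for (int j = i + 1; j < NCF; j++)
                c[i] -= a[i][j] * c[j];
            c[i] /= a[i][i];
        }
    }

    /* Read a sample peak off the fitted standards curve. */
    static double read_off(const double c[], double height)
    {
        return ((c[3] * height + c[2]) * height + c[1]) * height + c[0];
    }

    int main(void)
    {
        double h[NSTD]    = { 12.0, 25.0, 41.0, 58.0, 76.0 };  /* peak heights */
        double conc[NSTD] = {  1.0,  2.0,  4.0,  6.0,  8.0 };  /* known concs  */
        double c[NCF];
        fit_cubic(h, conc, c);
        printf("peak height 50.0 -> concentration %.3f\n", read_off(c, 50.0));
        return 0;
    }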

Debugging the System Flaw

When I arrived, the system was essentially working, except that at irregular intervals the software "crashed". A technical 'aside' is in order so you can understand the primitive environment. The PDP-8 came with a couple of rudimentary "monitors" (one could hardly call them "operating systems"), neither of which was suitable for these needs. So, apart from a floating-point arithmetic library, the system (from the interrupt-handler level up) was written from scratch in assembler. Now, this very small (cheap) system didn't have the capacity to compile/assemble our code, so a cross-assembler had been written for a CDC-3200 mainframe: 2 trays of punched cards assembled to a 4-inch roll of punched paper-tape binary. Needless to say, assembly was an overnight job. The paper tape was loaded through an ASR-33 (10cps) teletype (did I mention this was a "cheap" system?).

Debugging was done through the console toggle switches (in binary) and patches were saved on short pieces of punched-tape.

The Bug that Broke the System's Back

The bug I finally found was critical - indeed terminal - for the system. The system had 2 teletypewriters: one for data entry and listings, the other essentially a real-time log. The real-time logging was triggered from within an interrupt routine, which called the printing sub-routine. Now, the PDP-8 did not have a "stack" architecture but used the (now) out-dated method of storing the return address at the entry point of the sub-routine. My problem was that if the batch printing code was executing inside the printer sub-routine when the logging print interrupt occurred, the same sub-routine was called a second time and the original return address was over-written. Once I found it, I immediately recognized it as a classic software-engineering problem: a "critical code section". I tried a couple of methods to protect this section and serialize access to it, with mixed success. There were a number of other issues with the system, around capacity and flexibility, that convinced me a major rework was in order.
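The failure is easy to reproduce in miniature. This C caricature of the PDP-8's JMS convention (the names and octal addresses are mine) stores the "return address" in a single cell at the sub-routine's entry:

    #include <stdio.h>

    static int return_cell;    /* the sub-routine's "first word" */

    static void subroutine(int return_addr, int simulate_interrupt);

    static void logging_interrupt(void)
    {
        subroutine(0200, 0);          /* nested call, same entry cell */
    }

    static void subroutine(int return_addr, int simulate_interrupt)
    {
        return_cell = return_addr;    /* what JMS did on entry */
        if (simulate_interrupt)
            logging_interrupt();      /* fires mid-print, re-enters */
        printf("returning to %04o\n", return_cell);
    }

    int main(void)
    {
        subroutine(0100, 1);   /* batch code "returns" to 0200, not 0100 */
        return 0;
    }

Both calls print "returning to 0200": the batch code's return address is gone, and on the real machine control jumped off into the weeds. The standard cure - and what I attempted - is to make the call a critical section, for example by disabling interrupts (the PDP-8's IOF/ION instructions) around it.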

The Rewrite

The printer control was rewritten as a proper, "operating-systemesque", table-driven, generic, fully buffered printer handler (we now had 3 printers to handle).
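Reduced to a C sketch, the handler looked something like this (the names and buffer size are my own; the original was assembler):

    #include <stdio.h>

    #define NDEV  3      /* three printers in the rewritten system */
    #define BUFSZ 64     /* ring-buffer size; my choice, not the original's */

    /* One table entry per device: a ring buffer plus its indices.
       Callers only ever append; the interrupt side drains the buffer. */
    struct device {
        char buf[BUFSZ];
        int  head, tail;    /* head: next to print, tail: next free slot */
    };

    static struct device devtab[NDEV];

    /* Called by any process: queue one character for device d.
       Returns 0 if the buffer is full (caller must wait and retry). */
    static int dev_putc(int d, char ch)
    {
        struct device *dv = &devtab[d];
        int next = (dv->tail + 1) % BUFSZ;
        if (next == dv->head)
            return 0;                    /* full */
        dv->buf[dv->tail] = ch;
        dv->tail = next;
        return 1;
    }

    /* Called from the device's "printer ready" interrupt: emit one
       queued character, if any.  stdout stands in for the hardware. */
    static void dev_interrupt(int d)
    {
        struct device *dv = &devtab[d];
        if (dv->head != dv->tail) {
            putchar(dv->buf[dv->head]);
            dv->head = (dv->head + 1) % BUFSZ;
        }
    }

    int main(void)
    {
        const char *msg = "K+ 4.2\n";
        for (int i = 0; msg[i]; i++)
            dev_putc(0, msg[i]);         /* batch code queues a result  */
        for (int i = 0; i < BUFSZ; i++)
            dev_interrupt(0);            /* "interrupts" drain device 0 */
        return 0;
    }

The point of the table is genericity: adding a printer is a new table entry, not new code.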

All the data entry functions were moved into "over-lay" sections, loaded on request from the drum storage. This freed up memory for data storage, so that multiple batches could be set up ahead of time.

The core data processing fell into a series of quite distinct, chronological chunks. To support increased, flexible data storage and improved "multi-processing", a data-driven, "message passing" style of architecture was used. Data storage was broken up into a pool of blocks that moved from the free pool through a series of queues, each attached to a piece of processing code. As each section of code finished processing its current block of data, it queued that block up for the next processing step, then commenced processing the next block from its input queue, or put itself on the idle process queue. The processing steps were broken up at each point where a delay could occur, so no "time-slicing" was required.
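A minimal sketch in C of the block-and-queue scheme (the names, sizes and number of steps are my inventions; the original was assembler):

    #include <stddef.h>

    #define NBLOCKS 16
    #define NSTEPS  4     /* e.g. read peaks, fit curve, collate, print */

    struct block {                /* one unit of work moving through the system */
        struct block *next;
        double data[32];          /* payload; size is illustrative */
    };

    struct queue { struct block *head, *tail; };

    static struct queue freeq;           /* pool of unused blocks */
    static struct queue stepq[NSTEPS];   /* one input queue per processing step */

    static void enqueue(struct queue *q, struct block *b)
    {
        b->next = NULL;
        if (q->tail) q->tail->next = b; else q->head = b;
        q->tail = b;
    }

    static struct block *dequeue(struct queue *q)
    {
        struct block *b = q->head;
        if (b) { q->head = b->next; if (!q->head) q->tail = NULL; }
        return b;
    }

    /* Each step runs to completion on one block, then hands it on; the
       steps are split wherever a delay could occur, so no time-slicing. */
    static void run_step(int s, void (*process)(struct block *))
    {
        struct block *b;
        while ((b = dequeue(&stepq[s])) != NULL) {
            process(b);
            if (s + 1 < NSTEPS)
                enqueue(&stepq[s + 1], b);   /* pass to the next step */
            else
                enqueue(&freeq, b);          /* done: back to the pool */
        }
    }

    static void step_stub(struct block *b) { (void)b; /* real work here */ }

    int main(void)
    {
        static struct block pool[NBLOCKS];
        for (int i = 0; i < NBLOCKS; i++)
            enqueue(&freeq, &pool[i]);

        enqueue(&stepq[0], dequeue(&freeq));   /* a sample "arrives" */
        for (int s = 0; s < NSTEPS; s++)
            run_step(s, step_stub);            /* drive the pipeline */
        return 0;
    }

Because every step runs to completion on one block before yielding, the queues themselves provide all the scheduling.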

In Summary

The rewritten system performed very smoothly thereafter and was implemented in two of Melbourne's biggest hospitals, then later at a New Zealand hospital. However, like many home-grown systems, it was soon overtaken by commercial equivalents. But it was a system I was proud of and "we did it first"!

PS. The 'S' in PDP-8/S stood for 'serial' (and 'slow' (and cheap)). The machine had a 1 (one) bit ALU (Arithmetic Logic Unit), so the bits of the 12-bit words were 'shifted' through the ALU (and the result shifted out) to perform addition. Subtraction required negation then addition. Multiplication and division had to be library functions.
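For the curious, serial addition looks like this C sketch: one full-adder evaluation per bit, twelve per 12-bit word (the octal test values are mine):

    #include <stdio.h>

    /* Add two 12-bit words one bit at a time, as the PDP-8/S's 1-bit
       ALU did: a single full adder visited by each bit pair in turn. */
    static unsigned serial_add(unsigned a, unsigned b)
    {
        unsigned sum = 0, carry = 0;
        for (int i = 0; i < 12; i++) {
            unsigned ai = (a >> i) & 1, bi = (b >> i) & 1;
            sum  |= (ai ^ bi ^ carry) << i;           /* full-adder sum   */
            carry = (ai & bi) | (carry & (ai ^ bi));  /* full-adder carry */
        }
        return sum & 07777;          /* stay within 12 bits (octal mask) */
    }

    int main(void)
    {
        /* subtraction = two's-complement negation, then addition */
        unsigned a = 0123, b = 045;
        unsigned diff = serial_add(a, serial_add(~b & 07777, 1));
        printf("%04o - %04o = %04o\n", a, b, diff);
        return 0;
    }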

PPS. The data-driven multi-processing approach was used in "Carbine", a very early Totalizator system for Melbourne's racing industry, and was described by its designer, John Marquet, in "Operating Systems for On-line Commercial Work on Mini-computers", presented at the 6th Professional Development Seminar of the Victorian branch of the Australian Computer Society, c.1971.

References

Computerised Data Acquisition in Biochemical Laboratories. Digest of the Ninth International Conference on Medical and Biological Engineering, 1971, p.133.

Auto-Analyser Data Acquisition System. Proc. DECUS-Australia Symposium, May 1971, pp.33-37.

MUDBOS, A Multi-Device Buffered Output System. Proc. DECUS-Australia Symposium, May 1971, pp.39-41.