Thursday 19 July 2012

Engineering Maintenance Management (1985)

Introduction

Although this company was a mini-computer supplier, computerisation of its own business functions was still slow, especially in this small Australian subsidiary of the US manufacturer. Most business systems were still paper-based data collection keyed into batch systems for reporting.

We were in the process of bringing the spare-parts inventory control system on-line (a project initiated by finance and audit). A new National Field Engineering Manager had been hired who had started asking questions that the existing systems could not support, such as device type failure rates, average cost to repair, customer call response times, etc.

Existing Systems

The spare-parts inventory management system, described elsewhere, introduced tracking of good parts taken out on service calls and the non-equivalent part swap-out process.

A customer equipment maintenance contract system had records of all equipment under service but was primarily an invoicing system.

Every service call had a work-sheet completed detailing call, response, travel and completion times, equipment repaired, parts used, etc. A local system had been developed for entry of this data, with some basic reporting. The annual budgeting system used gross counts of staff and service calls to compute utilization and, combined with equipment population and sales projections, produced projected staff requirements.

Enhancements - Stage 1

The first stage was to enhance the work-sheet data entry application with full data validation of customer contracts, equipment, spare-parts and engineers. The database created for the spare-parts inventory system, with its transaction logging facility, was enhanced to record the work-sheet data, with improved data structures and indexing for better reporting. Outputs from this system then fed automatically into the annual budgeting process, supplying actual totals for the year.

Some interesting results were starting to be seen in the reports from this data, which led to the decision to go ahead with a full Call Centre System.

Enhancements - Stage 2 - Call Centre Management

There were two key drivers for the call-centre system. First was to capture and validate customer call information while the customer was still on the phone, including precise identification of the equipment at fault. We had found a number of customers were not putting all their equipment under service contract and were logging so-called contract service calls for non-contract equipment (this was especially easy with terminals - a customer might have 20 terminals under contract, but in fact have 100, often bought via the "grey market").

The second driver was to maximise engineer productivity. The engineer would call in job completion from the customer site, being led through a predetermined list of responses to capture his full "job sheet", including (for the first time) the actual device id of the equipment being repaired. He could then be directed immediately to his next call without having to return to the depot.

Successes

All in all, the above systems proved very successful. Three successes stand out.

First, by exactly identifying the equipment items being serviced, we were able to bring a lot of "grey" equipment under contract.

The volume of terminals being serviced by "swap-out" brought to light the idea of having a service van full of terminals circulating in the city with a courier who could do the "swap-over", rather than incurring the cost of a full engineer service call.

By the end of the first year, when we started analysing device type failure rates and cost-to-repair, it became obvious that a particular model of terminal was so fault prone that its average cost-to-repair was not covered by the contract service price. A heavily discounted replacement sales programme was put into place to upgrade all these devices to more modern, more reliable types. This was a result that our American head-office had not even picked up on.
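As a rough illustration of the kind of per-model analysis involved, the sketch below (in Python, with made-up field names such as model, labour_cost, parts_cost and contract_price - none of them from the original system) aggregates service-call costs per device type and flags models whose average cost-to-repair exceeds the contract service price.

    # Minimal sketch only - record fields and names are assumptions, not the
    # original system's schema.
    from collections import defaultdict

    def cost_to_repair_by_model(service_calls, contract_prices):
        totals = defaultdict(lambda: [0.0, 0])          # model -> [total cost, call count]
        for call in service_calls:
            cost = call["labour_cost"] + call["parts_cost"]
            totals[call["model"]][0] += cost
            totals[call["model"]][1] += 1
        report = []
        for model, (total, count) in sorted(totals.items()):
            average = total / count
            price = contract_prices.get(model, 0.0)
            report.append((model, count, average, price, average > price))
        return report                                   # last flag: losing money on this model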



Who's the Client? What's the Deliverable? (1995-7)

Introduction

I hope you will understand my discretion in suppressing certain details and names of companies. The project was a massive, city-wide, integrated, distributed system with a high daily transaction rate and cash flow, developed for, and to be run on behalf of, the state government.

The contract was won by a Joint Venture Consortium formed between a hardware supplier, a multi-national computer supplier (my employer) supplying the central computers and central management and reporting software, and a cash-services company.

The Warning Signs

We Software Engineers pressed on in typical bespoke software development fashion, collecting and documenting requirements and developing functional design specifications for review and sign-off by the client (end-user SMEs). But obtaining "sign-off" was like trying to extract "hen's teeth" - we put it down to typical public servants' reluctance to put their name "on the dotted line" and take responsibility for their decisions.

Next was the continuous pressure for requirements "creep". Whilst both major suppliers had formal documentation, quality and change-control processes, they were different and weren't coordinated - one came from an engineering discipline and the other from a commercial software development background. What was acceptable to one supplier was a Change Request to the other.

As the pressure of time and budget increased, the above scope-creep issues had a strange effect on the overall programme management style. I call it "pendulum management". In alternate months, the key management message swung between "tighten up, work to budget and time-line" and "keep the customer happy, give him whatever he wants".

The "aha" moment came when we came to specify the functionality for managing the discrepancies between sales transaction data and cash collected. We went to the client users for their "requirements" and, to our surprise", were told, "Its not our problem. Its the consortium's problem.  We simply require that you pay us the higher of the cash collected (assuming sales transactions were lost), or the sale transactions amount (assuming cash has been lost)" - it was a classic "heads they win, tails we lose" situation.

Who's the Client?

This was the point when the "client dilemma" really struck home (to us software engineers at least). The Consortium's contract was in fact a 10-year "Service Contract": first of all to build the system for an up-front payment (with ownership retained by the consortium), and then to operate the system, including provision of enquiry and reporting services to the public servants. The "requirements" of the system being built were to "provide the contracted service".

Our "cash-to-sales reconciliation" problem (above) needed to be resolved by the consortium's operational and accounting staff, who had as yet not been appointed. In fact, at this point, the consortium's operating company comprised a single project manager! The two major partners in the consortium had forged ahead almost independently with no thought or plans for how the ongoing service would be provided and any of its impacts on the requirements of the systems being built.

Our Cash-to-Sales Reconciliation Solution

A major source of issues in cash-to-sales reconciliation was the distributed nature of the POS equipment and, in about half the cases, the manual transfer of transaction data (by key-drive type device) to interface into the central system. Transaction batch identification had been included, along with a certain degree of redundancy. What was agreed to be required was an initial reconciliation when collected cash was counted, per identified transaction batch (or group of data batches). Data completeness integrity controls would be needed so that an alert could be flagged when it was known that some data was missing. Then, when/if the missing data did arrive (possibly via error correction), an adjustment could be made against the matching reconciliation and a CR/DR raised.
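A minimal sketch of that reconciliation logic is given below, assuming hypothetical batch records with batch_id and amount fields (not the consortium's actual structures): cash is reconciled against the identified batches received, missing batches raise a completeness alert, and a late-arriving batch raises an adjustment against the original reconciliation rather than rewriting it.

    # Illustrative only; field names are assumptions.
    def reconcile(cash_counted, batch_ids_expected, batches_received):
        """Initial reconciliation for one cash collection."""
        sales_total = sum(b["amount"] for b in batches_received)
        missing = set(batch_ids_expected) - {b["batch_id"] for b in batches_received}
        return {
            "sales_total": sales_total,
            "cash_counted": cash_counted,
            "discrepancy": cash_counted - sales_total,
            "missing_batches": sorted(missing),     # completeness alert if non-empty
            "complete": not missing,
        }

    def late_batch_adjustment(original_reconciliation, late_batch):
        """A missing batch that arrives later (e.g. via error correction) raises a
        CR/DR adjustment against the matching reconciliation."""
        return {
            "against": original_reconciliation,
            "batch_id": late_batch["batch_id"],
            "adjustment_amount": late_batch["amount"],
        }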

Needless to say, such a level of audit control had not been anticipated nor built by the hardware supplier, and some robust negotiation around changes to the core data interfaces was required. To be fair to the hardware supplier, they had an enormous micro-software change-control problem in the coordinated distribution of updates across hundreds of devices, not to mention the data storage chips attached to every cash container.

(Technical aside: the core issue revolved around a forward singly-linked list being less robust than a bi-directional, doubly-linked list.)
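To illustrate the aside (with assumed field names, not the supplier's actual record format): if each batch record carries both a previous and a next batch identifier, the receiving system can bound a gap in the chain from both sides, even when the record that pointed forward to the lost batch was itself lost.

    # Rough illustration only.
    def find_missing_batches(batches):
        """batches: received records with 'id', 'prev_id' and 'next_id'
        (None at the ends of the chain)."""
        received = {b["id"] for b in batches}
        missing = set()
        for b in batches:
            if b["next_id"] is not None and b["next_id"] not in received:
                missing.add(b["next_id"])   # forward link told us what should follow
            if b["prev_id"] is not None and b["prev_id"] not in received:
                missing.add(b["prev_id"])   # backward link catches the gap from the other side
        return sorted(missing)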

The End-Game

The total system finally went into production, grossly late and over budget (there were political imperatives (read election) that the system must not "fail"). The "service" has run well past the original 10 years, since the replacement system has had numerous problems of its own.

There was "robust negotiation" (litigation?) between the government and the consortium over the cost of the over-run and scope-creep (with eventual confidential settlement). I was involved in extensive "function point analysis" of the original (very vague) contractual (service) requirements and the system "as built" in order to identify and quantify the scope increase. But even this exercise was predicated on the bespoke software development model.

Tuesday 26 June 2012

Secure Buildings Access Control (1993)

Something Seems "Fishy"

I should have realized at the start that this project smelt a bit "fishy". I had been hired by a manager who was a past colleague, on a 6-month contract to "just wrap up this major project to sign-off acceptance". The warning sign was that the original development team had resigned en masse.

But times were tough, the clients were prestigious and I was up for a challenge. The projects: "Building Security and Access Control" at the Australian Federal Parliament House (the security pass office operated by the Australian Federal Police), and at the Department of Defence Security Office (so at least we didn't have the prospect of the clients going bankrupt on us).

My colleague took over the project management; I was the general technical "dogsbody" - tester, implementer, trainer and day-to-day client liaison - and we got one developer back to do code fixes.

The purpose of this blog is not to describe the whole project, but just to highlight some interesting, even humorous, aspects of the projects and lessons to be learned.

Be Very Careful of System Parameters with a Sensitive Human Aspect

One aspect of the implementation at DoD was interfacing to an "intelligent" revolving door. These doors had two special security features. One was an air-volume displacement measure to detect whether two people tried to go through the door together. The other was a sonic detector of flat surfaces to detect people exiting with boxes etc. These detectors were connected to a recorded-voice warning annunciator.

The first warning that we had of the need to tweak the parameters was when one of the more 'robust' ladies of the DoD tried to enter - the door went into reverse to back her out, announcing "One at a time please!" Oops!

Next we had an Admiral try to exit wearing his (flat-top) hat. Oops! (Staff, including Admirals, had to be reminded that the protocol barring the wearing of hats indoors included inside the exit doors.)

The Most Difficult Part of Government Contracts is Getting Paid

The major problem of both projects was the lack of any robust, agreed set of Acceptance Criteria. And government departments are especially sticklers for not paying a cent until satisfactory acceptance has been signed off. This is a real trap for small technology companies, as this one was, which seem to think that all they need to do is develop a good technical solution.

One thing I have learned over the years is that "Acceptance" is NOT "Testing" and that "User Acceptance Testing" is a misnomer.  

The Acceptance "Demonstration" must be specified with a well defined set of inputs which will produce the expected, documented outputs. "Dry runs" of "Acceptance" must be run before-hand error-free before attempting the sign-off demonstration to the customer. The acceptance criteria must NOT allow unfettered trial-and-error usage. There must be no unexpected errors. Demonstration to the customer for acceptance sign-off is not the time to be still discovering bugs. After acceptance, there will be a warranty period in which bugs uncovered by the customer during normal use, can be fixed.

If a "big bang" acceptance sign-off is of concern to the customer, then negotiate the contract with staged payments with the majority paid on primary "acceptance", and the balance after some defined settling-in period - define incident/problem severity levels and acceptance of settling-in defined along the lines of "2 weeks with no 'Severity 1' errors, and no more than 2 'Severity 2' errors per week" (as an example).

Getting Burned

Did we finally get paid? Not in my time there. Four months in, my pay cheques stopped coming. I stuck it out to the end of my term, but soon after, the company filed for bankruptcy and I lost two months' salary and a month's travel and accommodation expenses. Just before the end, two more clients had been signed up for the system. That part of the business was the only section to survive, sold off at 10 cents in the dollar, and the administrators were left chasing payment of the above contracts.


Monday 25 June 2012

Client-Server By Mag-tape Exchange (1983)

I had been hired by the Australian national office of a multi-national mini-computer manufacturer to set up its internal data processing systems.

Spare-Parts Inventory Tracking

A major issue was their spare-parts management which was getting "slammed" by the auditors for enormous write-offs/adjustments after every annual stock-take.

The spare-parts system was a national, warehouse-centric system using the US head-office software. All transactions came in on "movement tickets" for keying. These covered engineer "consumption" of parts in the repair of client machines, inter-office transfers (of both good and faulty parts) and repairs (faulty in, good repaired out). The potential for transcription and keying errors was high. There was the issue of "handedness", where left-hand and right-hand variants had similar part numbers. But from a financial point of view, variants of circuit boards were more critical - a more expensive board with more memory or a faster processor might vary in part number only by a suffix.

Customer "Loans" Issue

But it was "home-brew" software by one branch office for tracking their own spare parts use, that finally brought home the real culprit. On service calls, engineers took out some good parts in anticipation of what might be wrong, then on site replaced the faulty customer part with a good part. If they were identical part numbers then all good. But if an identical good part was not available, the engineer would put in a good part of a higher rating (eg. more memory), so the faulty part brought in (for repair) had a different part number. In theory, when the customer's part was repaired, it was supposed to be taken back, installed and the original good part brought back. This "home-brew" local system kept track of these "loans" and reminded the engineers. A similar issue arose with sales reps. "lending" customers a higher rated board to trial before buying. Again, the national system did not cater for this concept of "loaning" parts to customers.

On-line Data Entry Client with Transaction Logging

My solution had a couple of phases. Firstly, the new system would be multi-user, on-line data entry at each point where the original movement tickets would have been written. On-line data entry then provided immediate validation of part numbers. A new transaction type of "Loan" was then introduced - this was expanded from the "loan to customer" to recording of ALL good parts that a service engineer took on his service call, and recording everything he brought back, good, faulty or of an alternate part number. Initially, inter-office movements still needed the manually prepared movement ticket from the non-automated office.

The second phase was implementation of this system in all the branch offices. This was especially important in a country as geographically dispersed as Australia. In the states of Queensland and Western Australia, engineers went out on week-long circuits of preventive maintenance of remote sites and so had to take a large stock of spare parts with them.

Transaction Server

Both phases had issues that led to a single solution. Real-time data entry into a system that did not have a robust database with transaction-level data integrity had the potential for data loss, with no fall-back paperwork for data re-entry. At this time, the Internet was not yet available, and leased-line networking to all the branch offices was not economical.

My solution was a custom-developed transaction processing system, which we would now call "client-server". Multiple data-entry front-ends accessed the database read-only for data validation, and compiled a transaction record that was sent (by inter-process communication) to a single-threaded database updater process. The first step of the updater was to write the transaction records to a serial transaction log. These transaction logs were backed up daily. In the branch offices, the transaction log files were written to mag tape and sent to head-office along with the regular parts-for-repair transfer.

At head office, the transaction log files for all branches were then read (off tape) and fed into the same head-office database updater process, so that a national database was now available. For the first time, the national spare-parts planner had a reasonably up-to-date picture of the distribution of parts across all offices. At head-office, parts receipting now simply checked off the electronic parts transfer "ticket" received from the sending office. New weekly reports to each branch manager listed all parts currently out on "loan" that had to be retrieved or swapped back.
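The sketch below shows the shape of that design in modern terms (Python, with invented names such as apply_txn and a JSON-lines log; the original looked nothing like this): front-ends queue transaction records, a single-threaded updater appends each record to a serial log before applying it, and the same apply routine replays a branch's log file at head office.

    # Minimal sketch of the pattern, not the original code.
    import json, queue, threading

    txn_queue = queue.Queue()

    def apply_txn(db, txn):
        # e.g. adjust on-hand quantity of a part at a location (schema is illustrative)
        key = (txn["part_no"], txn["location"])
        db[key] = db.get(key, 0) + txn["qty_delta"]

    def updater(db, log_path):
        """Single-threaded database updater: log first, then update."""
        with open(log_path, "a") as log:
            while True:
                txn = txn_queue.get()
                if txn is None:                         # shutdown sentinel
                    break
                log.write(json.dumps(txn) + "\n")       # step 1: serial transaction log
                log.flush()
                apply_txn(db, txn)                      # step 2: apply to the database

    def replay(db, log_path):
        """Feed a branch's transaction log (e.g. read off mag tape) through the
        same update logic at head office."""
        with open(log_path) as log:
            for line in log:
                apply_txn(db, json.loads(line))

    db = {}
    threading.Thread(target=updater, args=(db, "branch.log"), daemon=True).start()
    txn_queue.put({"part_no": "1234-A", "location": "MEL", "qty_delta": -1, "type": "LOAN"})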

The test of this system of processing transaction log files came when the National system first went live. The Melbourne (National) office had been using what would become the Branch system. When the National database was initialized with the backup from the last stock-take, we simply processed six months' worth of transaction logs to bring the National system up to date, without a hiccup.

International Adoption

Subsequently, this system was adopted by the European subsidiaries who were having the same issues with the US warehouse-centric system.

Integration in Service Call Management

Subsequently, this system was integrated into a Maintenance Call Centre system for dispatching engineers to service calls and recording travel, time and parts used, all of which fed into the service-cost analysis system.

Friday 8 June 2012

Where's the Fire? (1987)

I was working at one of Melbourne's large stock-broking firms, implementing the interface between their back-office system and the accounting software we marketed, as well as developing customized reports. One morning, a whistle blew. I looked around, thinking it was perhaps a fire drill, but no-one was moving. Oh well, it couldn't be important, so I worked on. About 15 minutes later, the whistle sounded again, and again no-one moved.

I asked the client colleague working near me what the whistle was about. He explained that some 6 months earlier, soon after the back-office software was first installed, they found they were getting intermittent system crashes. The software supplier was called in and, after investigation, declared that there was a critical administrative update that conflicted with other normal operations. The solution they provided was a whistle that the administrator had to blow to warn other users that a critical update was about to be performed and that they needed to exit any update functions they were performing. On completion of the critical update, the whistle was blown again to signal the "all clear".

As a work-around, their simple "hardware" solution worked and was certainly cheap. I don't know whether they actually provided a software interlock on the critical code section in the next software release or not!
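For what it's worth, the software interlock I had in mind would look something like the sketch below (an assumed design in Python, not whatever the vendor eventually shipped): ordinary updates run concurrently, while the critical administrative update waits for them to drain and blocks new ones until it signals the "all clear".

    # Sketch of a software interlock playing the role of the whistle.
    import threading
    from contextlib import contextmanager

    class UpdateInterlock:
        def __init__(self):
            self._cond = threading.Condition()
            self._active_updates = 0
            self._admin_running = False

        @contextmanager
        def normal_update(self):
            with self._cond:
                while self._admin_running:          # the "whistle" has been blown
                    self._cond.wait()
                self._active_updates += 1
            try:
                yield
            finally:
                with self._cond:
                    self._active_updates -= 1
                    self._cond.notify_all()

        @contextmanager
        def critical_update(self):
            with self._cond:
                self._admin_running = True
                while self._active_updates:         # wait for users to exit update functions
                    self._cond.wait()
            try:
                yield
            finally:
                with self._cond:
                    self._admin_running = False     # the "all clear"
                    self._cond.notify_all()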

Tuesday 5 June 2012

Fingers in the Till - Forensic Initiation (c.1986)

I was doing a little maintenance work on the software we had built for a large cinema chain (support for new ticket-printing hardware). Their IT manager called me aside confidentially (oops, what have I done?). He explained that management thought they had a problem with cash going missing from the till of the concession (snack bar). Would I be able to analyse their cash register data log to report shortages/surpluses per shift per user, with long-term averages?

Fortunately, the cash/sales reconciliation software had been well designed and recorded everything needed, so without too much effort we could see that one particular operator quite consistently recorded a cash shortage, whereas the average of all other operators was essentially a zero balance. With a Statutory Declaration attached to my signed report, I heard nothing more, other than that the staff member in question no longer worked for them.
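The analysis itself amounted to little more than the sketch below (field names are illustrative, not the cinema chain's actual log format): group each shift's cash variance by operator and compare each operator's long-term average against the rest.

    # Sketch only - field names are assumptions.
    from collections import defaultdict

    def variance_by_operator(shift_records):
        """shift_records: one entry per operator per shift, with
        'operator', 'recorded_sales' and 'cash_counted'."""
        per_operator = defaultdict(list)
        for rec in shift_records:
            per_operator[rec["operator"]].append(rec["cash_counted"] - rec["recorded_sales"])
        return {
            operator: {"shifts": len(variances),
                       "average_variance": sum(variances) / len(variances)}
            for operator, variances in per_operator.items()
        }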

It was quite a simple job, but it was my first taste of "forensic" data analysis, and it reinforced that, when it comes to money, it is important not just to update a database but also to record who did what, and when.

Interpretive Dynamic Graphic Displays (1980)

This is a follow-on post to "Sewage Works Data Acquisition and Process Control".

Graphics Displays "Tailorable" by End Users?

Yes, that was in the "Requirements". When I arrived on the project, a couple of engineers were looking at how they could design a "paint" program! That seemed like total overkill for a piece of functionality that would be very rarely used. Sometimes we must be "creative" in interpreting the wording of "requirements".

The Graphics Hardware and Drawing Language

Fortunately, the engineering company had selected and purchased a sophisticated, programmable graphics display terminal (RAMTEK). It had an inbuilt interpreter for a drawing language. The drawing instructions were fed into the terminal as an ASCII data stream.  For example:-

        [COLOR RED][RECT <x-value>,<y-value>,<width value>,<height value>]

All <values> were in pixels. (Forgive me, RAMTEK, if I haven't remembered your syntax correctly.) So it was in fact quite reasonable to draw up the required layout on graph paper and code the required drawing sequence. In today's parlance, we could call it a "mark-up language".

Language Extensions to make it Dynamic

But the graphics we required had to display dynamic data in "real-time". There would be meter readings, red/yellow/green signals, switches/stop-valves that indicated "open/closed", arrows with changeable direction, etc. - think "Dashboard".

My solution was to extend the RAMTEK language with three simple constructs.

  • Firstly, values could be replaced by named variables that referred to values in the in-memory data streams. There were two types of variable, booleans (ON/OFF) and integers, represented as "Bnnn" and "Innn".
  • Secondly, labels could be assigned to any statement - eg. "Lnnn:".
  • Thirdly, a simple variable-test construct was defined, with simple boolean and logical operators, which if it tested "true" directed "execution" to a specified label (eg. [IF B123 ON L456]).

In today's parlance, this is known as "templating" (eg. PHP). The implementation was our own interpreter program that processed a template file, substituting variables with their real-time values, performing the testing and branching, and continuously outputting a stream of drawing commands to the terminal. The end of the template file simply looped back to the start, so the "real-time" screen refresh rate was determined by how quickly this interpretation loop executed.
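The sketch below gives the flavour of that interpreter in Python (the statement syntax is only as well as I remember it, and the parsing is greatly simplified): Bnnn/Innn variables are substituted from the current data, Lnnn: labels are collected, and an [IF ...] statement that tests true branches to its label; everything else is emitted as a drawing command.

    # A much-simplified sketch; variable, label and IF forms follow the three
    # extensions above, but the exact syntax is illustrative.
    import re

    def run_template(lines, booleans, integers, emit):
        """One pass over a template (the real system looped back to the start).
        lines    - template statements, optionally prefixed with a label "Lnnn:"
        booleans - e.g. {"B001": True};  integers - e.g. {"I010": 100}
        emit     - called with each completed drawing command string."""
        labels, body = {}, []
        for raw in lines:
            m = re.match(r"\s*(L\d+):\s*(.*)", raw)
            if m:
                labels[m.group(1)] = len(body)
                body.append(m.group(2))
            else:
                body.append(raw.strip())
        pc = 0
        while pc < len(body):
            stmt = body[pc]
            test = re.match(r"\[IF\s+(B\d+)\s+(ON|OFF)\s+(L\d+)\]", stmt)
            if test and booleans.get(test.group(1), False) == (test.group(2) == "ON"):
                pc = labels[test.group(3)]               # branch to the label
                continue
            if not test:
                # substitute integer variables with their current real-time values
                stmt = re.sub(r"I\d+", lambda m: str(integers.get(m.group(0), 0)), stmt)
                emit(stmt)
            pc += 1

    run_template(
        ["[IF B001 ON L900]",                            # skip the fault symbol while B001 is ON
         "[COLOR RED][RECT I010,I011,20,20]",
         "L900: [COLOR GREEN][RECT 300,40,20,20]"],
        booleans={"B001": True}, integers={"I010": 100, "I011": 200},
        emit=print,
    )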

Was It "Tailorable" by End Users?

The "template" file (program?) was a simple text file of reasonably readable instructions. Sure, it was tedious to update but essentially clerical, so we argued that it was "tailorable".

The Actual Display Graphics

There were three different displays delivered with the system.

  • The hydraulic flows through the works (the first stage of the plant is to filter and separate the liquid from the solid 'sludge').

  • The "sludge" system monitoring. (The sludge is pumped through large anaerobic digester tanks, the active bacteria in which are highly sensitive to temperature and acidity, so monitoring of temperatures and pump flows is critical in such a plant).

(apologies for the hand-drawn schematic - that's all I have left from the project)

  • The electrical generation system. The above anaerobic digesters produce methane which is drawn off and feeds electricity generators which power the plant as well as sending excess power back into the city power grid. Its graphic display showed generators' status, voltages, currents, circuit-breaker statuses, etc.

Sewage Works Data Acquisition and Process Control (1980)

"Real Time" or "Data Processing"?

The sewage works of this large New Zealand city was in need of major redevelopment and modernization.

A civil engineering company was the prime contractor, and an electrical automation engineering company was sub-contracted to build the plant control systems. They had experience in micro-processor controllers for basic automation, but they needed someone to take on the back-end computing requirements of control-room data presentation and all levels of reporting.

When I arrived, their engineers were looking at (small) technical issues around the periphery of the requirements. The first thing I did was to review the requirements holistically and perform a chronological functional decomposition, which produced the following Data Flow Diagram.

(apologies for the hand-drawn diagram - that's all I have left from the project)

The engineering management were stunned when I showed them that less than 20% of the system was "real time" and the majority was "data processing" not unlike many commercial systems.

The Central Data Flow

The raw input data from the plant comprised 700 analog measurements (flow rates, tank levels, temperatures, voltage and current), and 150 digital (binary) "alarm" signals. All these inputs were sampled 4 times/second. A simple data communications protocol was designed to interface with the micro-processor controlled electrical connection panels.

Current (Real Time) Data

The "current" data status of the plant was required to be available for display on graphic displays in the control room and selected "alarm" status changes needed to be logged on a printer. The very nature of the type of plant being monitored and the data processing responsiveness of the graphics and printing showed that a 2 second response time was quite adequate, so the design was to average input data in blocks of 8 values to give 2 second averages.

But this data rate was still too fast to write to disk, and the "real time" graphic display would not have been responsive enough if it had to continually retrieve data from disk. The solution was to use a mapped, shared-memory data pool. The data was also accumulated in buffers for writing to disk files at 30-second intervals.
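The data-rate reduction can be sketched as below (in Python, with structures that are only illustrative; the original used a mapped, shared-memory data pool rather than a dictionary): every 2 seconds, 8 raw samples per channel are averaged into the "current data" pool read by the graphics, and the averages are buffered for a 30-second disk write.

    # Sketch only - structures and file name are assumptions.
    current_data = {}      # latest 2-second average per channel - read by the graphics
    disk_buffer = []       # accumulated averages, flushed every 30 seconds
    ticks = 0              # one tick = one 2-second averaging period

    def on_tick(samples_by_channel):
        """samples_by_channel: for each channel, the 8 raw readings taken at
        4 samples/second over the last 2 seconds."""
        global ticks
        for channel, samples in samples_by_channel.items():
            average = sum(samples) / len(samples)
            current_data[channel] = average             # shared "current data" pool
            disk_buffer.append((ticks, channel, average))
        ticks += 1
        if ticks % 15 == 0:                             # 15 x 2 seconds = 30-second disk write
            with open("current.dat", "a") as f:
                for tick, channel, average in disk_buffer:
                    f.write(f"{tick},{channel},{average}\n")
            disk_buffer.clear()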

Data Accumulation

A series of batch jobs then ran hourly, 8-hourly (per shift), daily and monthly, to summarize results from the lower level and write them into the accumulation file at the next level.

Flexible Data Management - self-describing meta-data

A core feature of the system requirements was maximum flexibility in describing data, both the real-time inputs and the as-yet-undetermined manual inputs (typically the results of chemical analyses). Similarly, report layouts and contents required maximum flexibility of design and change. A highly table-driven system was designed.

But all these tables would need to be maintained (and possibly additional tables added later), so a table-driven table-maintenance method was designed, with its own table of meta-data that described the contents of the other tables to be maintained. The obvious (?) approach was to make it "self-describing" - the first entries in the meta-data table were manually entered to describe the meta-data table itself. The data maintenance program was then used to populate the remainder of the meta-data table, which could then be used to populate the system data tables.

Files were defined as a collection of records of a given type, and records were defined as a set of data items of various types.  Data items were defined with a long and short name, a data type and some very simple range constraints.
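The bootstrap can be illustrated with the sketch below (column and table names such as TABLE, ITEM and INPUTS are my own, not the original layout): the first rows of the meta-data table describe the meta-data table itself, after which the same generic maintenance program can define every other table.

    # Illustrative only - column and table names are assumptions.
    META = "METADATA"

    metadata = [
        # (table, item,       long name,          type,   low,  high)
        (META,  "TABLE",     "Table name",        "CHAR", None, None),
        (META,  "ITEM",      "Data item name",    "CHAR", None, None),
        (META,  "LONGNAME",  "Descriptive name",  "CHAR", None, None),
        (META,  "TYPE",      "Data type",         "CHAR", None, None),
        (META,  "LOW",       "Low range limit",   "REAL", None, None),
        (META,  "HIGH",      "High range limit",  "REAL", None, None),
    ]

    def describe(table):
        """The generic maintenance program reads the meta-data to learn the layout
        of any table it is asked to maintain - including METADATA itself."""
        return [row for row in metadata if row[0] == table]

    # The same maintenance path then defines an ordinary system table:
    metadata += [
        ("INPUTS", "TAG",   "Input channel tag", "CHAR", None, None),
        ("INPUTS", "UNITS", "Engineering units", "CHAR", None, None),
        ("INPUTS", "VALUE", "Current value",     "REAL", 0.0,  1.0e6),
    ]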

Process Control

A generic process control module was developed, with control tables associating outputs with the feedback inputs along with a set of process control parameters (2nd order function).

Generic Report Definitions

Shift and Daily reports were quite simple lists. Report definition tables specified which variables were to be reported.

The Monthly reports were all data-analysis type reports. A given, fixed set of analysis algorithms was required (various sorts of means, statistical measures, etc.). Again, report definition tables specified which algorithm was to be applied to which set of variables for each report.

Summary

In all, a very satisfactory result was achieved. Assisted by two new university graduates on a vacation contract, I completed the system in 6 months.

Wednesday 30 May 2012

My First Program, My First Software Bug, My First Software Engineering Lesson (1966)

Introduction to Computer Programming

In 1966, in the last year of my secondary schooling, I had the great fortune to be able to attend a week-end camp for science students, addressed by a number of scientists from a range of disciplines. One speaker who caught my attention spoke about computer programming and introduced us to FORTRAN-2. After less than 2 hours of learning, I wrote a short program on coding forms, which our lecturer took back to his office that night, punched onto (Hollerith) cards and ran on his work computer, bringing the output listing back the next day.

The Program

I had chosen to compute and list the surface area and volume of a series of spheres over a range of diameters.

The Bug

The astute/experienced programmers will probably have guessed the bug already. Yes, I had used a floating-point variable as an iterator with an equality test for loop termination. So of course my loop never terminated, and my program ran for its full 2-minute time allocation, producing a couple of hundred pages of output.
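Reconstructed from memory in Python rather than FORTRAN-2, the bug looked essentially like this (the extra "d < 12.0" guard is mine, added so the sketch actually halts):

    # The same class of bug: a floating-point iterator with an equality test.
    import math

    d = 0.0
    while d != 10.0 and d < 12.0:                 # 0.1 has no exact binary representation,
        area = 4.0 * math.pi * (d / 2.0) ** 2     # surface area of a sphere of diameter d
        volume = (math.pi / 6.0) * d ** 3         # volume of the same sphere
        print(d, area, volume)
        d += 0.1                                  # ...so d never lands exactly on 10.0

    # The safe forms: iterate over an integer count, or compare with a tolerance,
    # e.g.  while d < 10.0 + 1e-9: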

The Lesson

I hadn't yet learnt about the imprecision of representing real numbers in finite binary form. The lesson was not just about the limitations of numerical representation, but that computing is a harsh master, so fastidiously precise that it is almost dumb. I had learnt that software engineering spans theory (the mathematical logic of algorithms) and experiment - that to err is human and the norm - and that the job's not finished until the inevitable errors are found and debugged.

If I had not made that first mistake, I wouldn't have learnt much!

Tuesday 29 May 2012

Hospital Laboratory Data Acquisition (1971)

This was my first full-time job after completing my BSc in Computer Science, during 12 months' deferment in 1970 before returning to do my Honours year and MSc research. It was supposed to be a small job to finish off some debugging and documentation. Little did I realize that I would end up utilizing a lot of the software engineering and operating systems knowledge I had just finished studying.

At this time, computing in hospitals was almost exclusively administrative, batch processing in remote data centres. Real-time, on-line processing was very new, rare and isolated. Commercialized automated equipment was still to make its mark.  

Introduction to Biochemical Analysis

The biochemistry department already used a form of automation, in the form of continuous-flow chemical analyzers (flame photometers and colorimeters), and was an early adopter of computing. Patients' biochemical samples are placed into 10ml sample cups on a carousel of 40 cups. The samples are interspersed with standard assay samples for calibration. Each sample is aspirated in turn (alternating with distilled water to separate the samples) into the continuous-flow system of tubes, where it is split into multiple parallel flows (up to 8 parallel tests are performed). Each stream is mixed with reagents that produce a colour proportional to the concentration of the chemical being tested (measured by a colorimeter), or is sprayed into a flame photometer where the colour of the flame indicates the concentration of the chemical (typically the Sodium and Potassium electrolytes).

The output signals from the colorimeters and photometers are recorded onto paper-roll chart recorders. The flow of samples and reagents in the plastic tubes is interspersed with bubbles of air to minimise inter-sample contamination. The charts record a wavy line where the height of each peak represents the result of a single sample. In the manual method of reading the results, an operator drew a curve through the peaks of the 5 standard assays of known concentrations at the start of each batch. The height of each peak corresponding to a patient's sample was read off against the standard curve to give the concentration of that chemical in that patient's sample. The results were hand-written onto each patient's master record card, a photocopy of which was sent back to the requesting doctor.

Computing Meets Biochemical Analysis

The system I inherited was a small PDP-8/S (16K x 12bit words of memory and 32K words of directly addressable drum storage) interfaced through ADCs (Analogue-Digital Converters) to the chart recorders.

A data entry process records the identification of all the samples in each batch to be tested. As the batch test runs, the computer "reads" the height of each peak on each chart. The peak heights and assay concentrations of the standards are used to generate a 3rd-order best-fit polynomial representation of the standards curve, against which the height of each patient's sample peak is read off. The results from all the charts are collated (due to different flow lag times, the charts are not in sync) and printed on a strip of adhesive label along with the patient/sample identifier.
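The calibration step amounts to the sketch below (NumPy is used here purely for brevity; the original was hand-coded on the PDP-8 with a floating-point library, and the numbers are invented):

    import numpy as np

    def fit_standards_curve(standard_heights, standard_concentrations):
        """3rd-order best-fit polynomial: concentration as a function of peak height."""
        return np.polynomial.Polynomial.fit(standard_heights, standard_concentrations, deg=3)

    def read_off(curve, patient_peaks):
        """Read each patient sample's peak height off the standards curve."""
        return {sample_id: float(curve(height)) for sample_id, height in patient_peaks.items()}

    curve = fit_standards_curve([5.0, 18.0, 34.0, 52.0, 71.0],   # peak heights of the 5 standards
                                [1.0, 2.5, 5.0, 7.5, 10.0])      # their known concentrations
    print(read_off(curve, {"PAT-001": 40.0, "PAT-002": 12.5}))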

Debugging the System Flaw

When I arrived, the system was essentially working, except that at irregular intervals the software "crashed". A technical 'aside' is in order so you can understand the primitive environment. The PDP-8 came with a couple of rudimentary "monitors" (one could hardly call them "operating systems"), neither of which was suitable for these needs. So, apart from a floating-point arithmetic library, the system (from the interrupt handler level up) was written from scratch in assembler. Now, this very small (cheap) system didn't have the capacity to compile/assemble our code, so a cross-assembler had been written for a CDC-3200 main-frame - 2 trays of punch cards assembled to a 4-inch roll of punched paper-tape binary. Needless to say, assembly was an over-night job. The paper-tape was loaded through an ASR-33 (10cps) teletype (did I mention this was a "cheap" system?).

Debugging was done through the console toggle switches (in binary) and patches were saved on short pieces of punched-tape.

The Bug that Broke the System's Back

The bug I finally found was system-critical. The system had 2 teletypewriters, one for data entry and listings, and the other essentially a real-time log. The real-time logging was triggered from within an interrupt routine which called the printing subroutine. Now, the PDP-8 did not have a "stack" architecture but used the (now) outdated method of storing the return address at the entry point of the subroutine. My problem was that if the batch printing code was executing inside the printer subroutine when the logging print interrupt occurred, the same subroutine was called a second time and the original return address was over-written. Once I found it, I immediately recognized a classic software-engineering problem: a "critical code section". I tried a couple of methods to protect this section and serialize access to it, with mixed success. There were a number of other issues with the system, around capacity and flexibility, that convinced me a major rework was in order.
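A toy model of the failure, in Python terms (the names are mine and the mechanism is only an analogy for the PDP-8's convention of storing the return address in the subroutine's first word):

    # Toy model only: a single storage word holds the caller's return address,
    # so a second call from interrupt level overwrites the first.
    return_word = None

    def print_string(text, return_addr, interrupt=None):
        global return_word
        return_word = return_addr                 # JMS-style: save the caller's return address
        for i, ch in enumerate(text):
            if interrupt and i == 2:              # an interrupt arrives part-way through printing
                interrupt()
        return return_word                        # may no longer be the original caller's address

    # the "logging interrupt" re-enters the same non-reentrant routine:
    result = print_string("BATCH RESULTS", "batch_code",
                          interrupt=lambda: print_string("LOG", "interrupt_code"))
    print(result)   # prints "interrupt_code" - the batch code's return address has been lost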

The Rewrite

The printer control was rewritten as a proper, "operating-systemesque", table-driven, generic, fully buffered printer handler (we now had 3 printers to handle).

All the data entry functions were moved into "overlay" sections, loaded on request from the drum storage. This freed up memory for data storage so that multiple batches could be set up ahead of time.

The core data processing fell into a series of quite distinct, chronological chunks. To support increased, flexible data storage and improved "multi-processing", a data-driven, "message passing" type of architecture was used. Data storage was broken up into a pool of blocks that moved from the free pool through a series of queues attached to each piece of processing code. As each section of code finished processing its current block of data, it queued the data up for the next processing step, then commenced processing the next block of data from its input queue, or put itself on the idle process queue. The processing steps were broken up at each point where a delay could occur, so no "time-slicing" was required.
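In Python terms the architecture looks roughly like the sketch below (step names and block contents are illustrative): a fixed pool of data blocks moves through one queue per processing step, and a simple scheduler runs whichever steps have work waiting.

    # Sketch only - step names and block contents are assumptions.
    from collections import deque

    free_pool = deque({"data": None} for _ in range(8))       # fixed pool of data blocks
    queues = {step: deque() for step in
              ("read_peaks", "fit_curve", "collate", "print_labels")}
    next_step = {"read_peaks": "fit_curve", "fit_curve": "collate",
                 "collate": "print_labels", "print_labels": None}

    def submit(raw_batch):
        """Take a block from the free pool and start it down the pipeline."""
        block = free_pool.popleft()
        block["data"] = raw_batch
        queues["read_peaks"].append(block)

    def scheduler(handlers):
        """Run each step on whatever is waiting in its input queue. No time-slicing
        is needed because steps hand over only at natural delay points."""
        while any(queues.values()):
            for step, q in queues.items():
                while q:
                    block = q.popleft()
                    handlers[step](block)                     # process this block
                    if next_step[step]:
                        queues[next_step[step]].append(block) # queue for the next step
                    else:
                        block["data"] = None
                        free_pool.append(block)               # recycle the block

    submit({"chart": "Na", "peaks": [12, 40, 33]})
    scheduler({step: (lambda block: None) for step in queues})   # stand-in handlers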

In Summary

The rewritten system performed very smoothly thereafter and was implemented in two of Melbourne's biggest hospitals, then later at a New Zealand hospital. However, like many home-grown systems, it was soon overtaken by commercial equivalents. But it was a system I was proud of and "we did it first"!

PS. The 'S' in PDP-8/S stood for 'serial' (and 'slow' (and cheap)).  The machine had a 1 (one) bit ALU (Arithmetic Logic Unit) so bits of the 12 bit words were 'shifted' through the ALU (and the result shifted out) to perform addition. Subtraction required negation then addition. Multiplication and division had to be library functions.

PPS. The data-driven multi-processing approach was used in "Carbine", a very early Totalizator system for Melbourne's racing industry, and was described by its designer, John Marquet, in "Operating Systems for On-line Commercial Work on Mini-computers", presented at the 6th Professional Development Seminar of the Victorian branch of the Australian Computer Society, c.1971.

References

Computerised Data Acquisition in Biochemical Laboratories. Digest of the Ninth International Conference on Medical and Biological Engineering, 1971, p.133.

Auto-Analyser Data Acquisition System. Proc. DECUS-Australia Symposium, May 1971, pp.33-37.

MUDBOS, A Multi-Device Buffered Output System. Proc. DECUS-Australia Symposium, May 1971, pp.39-41.