Thirty-five years ago someone offered to pay me for writing a computer program. "What a great way to make a living," I thought.
When the program was working, I proudly showed the result to my customer. He had the sheer audacity to check the results by hand. When the numbers didn't add up, I was, shall we say, chagrined.
Since that time I have written a lot more programs for money, but I've never forgotten the lesson learned on that first assignment. Writing programs that are not only "done" but also demonstrably correct has been an important goal in my professional life.
Thus when I saw the teaser "How to Prevent Software Disasters" on the front of the June 2006 "Scientific American," along with an article, "Dependable Software by Design," by Daniel Jackson, you can be sure it had my attention.
After the usual scare stories -- in this case the ones involving the Denver Airport, the IRS, the FBI, and the FAA -- along with cautionary scenarios of out-of-control computers running planes, trains, automobiles, financial markets, and machine tools (the author forgot medical equipment -- oh, never mind, that showed up in a sidebar a couple of pages later), the article got down to business.
My first warning that this was going to be yet another silver-bullet article came when I read, "Almost all grave software problems can be traced to conceptual mistakes made before programming started."
Um, excuse me, but out here in the field it turns out that the overwhelming majority of mistakes have to do with the little fiddly bits, not the grand scheme of things. For every Denver Airport baggage system that couldn't possibly work due to faulty analysis of the problem to be solved, there are hundreds or thousands of "little problems" like the Mars Climate Orbiter missing the planet because someone used the wrong measurement system, or the software worms that were able to cripple the internet because some programmer neglected to include a simple check for the length of an incoming message.
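To make the "fiddly bits" point concrete, here is a minimal sketch in Python of exactly the kind of length check that gets forgotten. The message format and the parse_message function are inventions of mine for illustration, not anything from the article:

```python
import struct

MAX_PAYLOAD = 1024  # the size the rest of the system assumes (an assumption)

def parse_message(data: bytes) -> bytes:
    """Parse a message whose first 4 bytes declare the payload length."""
    if len(data) < 4:
        raise ValueError("truncated header")
    (claimed_len,) = struct.unpack("!I", data[:4])

    # The fiddly bit: omitting this check and trusting claimed_len is the
    # classic mistake behind buffer-overflow worms.
    if claimed_len > MAX_PAYLOAD or claimed_len > len(data) - 4:
        raise ValueError("declared length exceeds actual data or buffer")

    return data[4 : 4 + claimed_len]
```

No grand conceptual analysis would have flagged that missing `if`; only attention to the little details does.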
"Ok," I thought. "We'll ignore that one and move on. There's often a pony to be found in this type of article." And then I came to the statement: "The idea is to simulate every state that the software can take to determine that none leads to a failure." I must admit I broke out laughing at that point.
Consider the climate control system in my car. It's not very complicated. A rough estimate of the "possible states" for the heater and air conditioning controls on the dashboard is around 2000. The estimate is rough because the temperature control is analog, so I'm estimating about ten possible settings. Factoring in the radio gets the number of possible states for the dashboard controls well into the millions, and we haven't even handled half of the dashboard. I daresay generating all possible states for my single automobile would take modern computers (all of them!) more time than has passed since the automobile was invented.
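The arithmetic is easy to check. Here is a back-of-the-envelope sketch in Python; the control counts are my rough guesses, not measurements of any particular car:

```python
from math import prod

# Rough guesses at the dashboard controls (assumptions, not specifications).
hvac = {
    "fan_speed": 5,
    "temperature": 10,  # an analog dial; call it ten distinguishable settings
    "air_mode": 5,      # face / feet / defrost and the mixes
    "ac_on": 2,
    "recirculate": 2,
}
radio = {
    "power": 2,
    "band": 3,          # AM / FM1 / FM2
    "station": 40,
    "volume": 20,
}

hvac_states = prod(hvac.values())
total = hvac_states * prod(radio.values())
print(f"HVAC alone: {hvac_states:,} states")   # 2,000
print(f"HVAC plus radio: {total:,} states")    # 9,600,000
```

And that's just two clusters of controls; every independent knob multiplies the total.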
Yes, it turns out the author is aware of the state explosion problem, and he didn't really mean "simulate every state..." He really meant "use some heuristics to search for possible pathological states..." -- apparently those states that the developer has identified as pathological (unreachable files, in his example). Once again, referring to my experience in "the real world," it's not the pathological cases you can think up ahead of time that get you. It's the ones that come as a surprise. "Gosh, I never imagined that a trader would enter the price in the quantity field and vice versa. There goes the company."
Simulating a model is just another form of testing, in which a negative result (we found a failure case) is important and significant, but a positive result (no problems found here!) can be used, along with $5.00, to buy a fancy cup of coffee at Starbucks. The author is describing an automated testing tool that is driven from a model of the program's behavior.
To make this system work, the author has invented a new language, Alloy, to describe the design of the program. Presumably Alloy expresses the design at a higher, more abstract level than the typical languages (Java, Ruby, C++, Python, ...) used by programmers, and therefore allows a higher level of validation (aka testing) to be applied to the design. The question not answered by the author is what relationship exists (if any) between the Alloy model of the program design and the actual program as executed by the computer.
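To give a feel for what such a tool does, here is a toy sketch in Python (not Alloy, and the real tool works quite differently under the hood) of the bounded-checking idea: enumerate every state reachable from an initial state within some bound, and check an invariant in each one. The little model -- a lock flag, a dirty flag, and a counter -- is entirely my own invention:

```python
def transitions(state):
    """Yield the successor states of a made-up three-part model."""
    locked, dirty, count = state
    yield (not locked, dirty, count)        # toggle the lock
    if not locked:
        yield (locked, True, count + 1)     # write while unlocked
    if dirty and locked:
        yield (locked, False, count)        # flush while locked

def invariant(state):
    locked, dirty, count = state
    return count <= 3                       # an arbitrary safety property

def check(initial, max_depth=10):
    """Breadth-first search over every state reachable within the bound."""
    seen, frontier = {initial}, [initial]
    for _ in range(max_depth):
        next_frontier = []
        for s in frontier:
            if not invariant(s):
                return s                    # counterexample found
            for t in transitions(s):
                if t not in seen:
                    seen.add(t)
                    next_frontier.append(t)
        frontier = next_frontier
    return None                             # nothing bad found within the bound

bad = check((True, False, 0))
if bad:
    print("counterexample:", bad)
else:
    print("no violation within the bound")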
Some possibilities include automatically generating the "low level" computer code from the high level model, automatically verifying the correspondence between the model and the actual implementation, or depending on the intelligence and understanding of the software analysts and developers involved in the project to be sure that the relationship holds as expected. Each of these possibilities has corresponding problems.
If the actual code is auto-generated, what you have is an executable model. In fact, you've invented a new programming language. Now rather than programming in Java, the software developers will be programming in Alloy. This is not necessarily a bad thing, but it means the Alloy modeling language needs to be rich enough to address all the issues that software developers must worry about when writing programs. And, just as mistakes are often made by programmers writing in Java or Python, mistakes will be made in Alloy. If it turns out that Alloy is a good language and can hide more of the complexity of programming than traditional languages can, then programmers will make fewer mistakes. However, if Alloy is not seen as a programming language and analyzed as such, we'll never know how expressive and safe it is.
Suppose we try the approach of automatically verifying the equivalence between the Alloy model and the "real" program. Once again Alloy is a programming language, but this time it does not need to handle all the low level details. They can be addressed in the "actual" programming language (goodness, I hope the programmer gets the details right). All that needs to happen is that the high level abstraction expressed in Alloy must be compared to the abstraction expressed in Java or Python. To put this another way, we need a tool that can examine a Java program and understand what it is supposed to do.
Making this comparison is an interesting, and quite difficult, problem to solve. One wonders, if we had a way to extract the higher level meaning of a program from the lower level C# or VB or Java source, why we couldn't just validate that representation of what the program means directly, using the same techniques that are used to validate an Alloy program (er... model).
This does, however, raise an interesting possibility. Rather than inventing a new higher level language in which to express the program, maybe it would be more productive to have two teams write the same program and automate a comparison of the results of the two teams' work. Surely it is easier to compare two C++ programs to each other than it is to compare a program written in an abstract modeling language to one written in a low level implementation language.
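Here is a minimal sketch of such an automated comparison, in Python rather than C++ for brevity. The two square-root routines stand in for the two teams' programs; everything here is my own invention for illustration:

```python
import random

# Two independently written implementations of the same specification.
# Both are stand-ins; imagine each came from a different team.

def sqrt_team_a(x: float) -> float:
    return x ** 0.5

def sqrt_team_b(x: float, iterations: int = 40) -> float:
    """Newton's method, written without consulting team A."""
    if x == 0:
        return 0.0
    guess = x
    for _ in range(iterations):
        guess = (guess + x / guess) / 2
    return guess

def compare(f, g, trials=10_000, tolerance=1e-9):
    """Feed both programs the same random inputs and flag any disagreement."""
    for _ in range(trials):
        x = random.uniform(0, 1e6)
        a, b = f(x), g(x)
        if abs(a - b) > tolerance * max(1.0, abs(a)):
            return x, a, b              # at least one implementation is wrong
    return None

mismatch = compare(sqrt_team_a, sqrt_team_b)
print("disagreement:", mismatch if mismatch else "none found in this run")
```

Disagreement means at least one team is wrong; agreement, as with the model checker above, proves nothing about the inputs you didn't try.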
As an aside, let me mention the stiff resistance such a plan will encounter in the "real world" of managers and business people who already think it takes too long to develop software -- so don't even think of using two people to do one job! I happen to know this because many developers would like to spend half their time writing code that tests whether the other half of their code behaves the way they expect it to. We call it "unit testing," and it's mighty difficult to get the funding for it from some short sighted organizations.
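For readers who haven't seen the practice, here is a minimal unit-testing example using Python's standard unittest module; the parse_quantity function under test is made up (with a nod to the unlucky trader above):

```python
import unittest

def parse_quantity(text: str) -> int:
    """The code under test: a made-up function that parses an order quantity."""
    value = int(text.strip())
    if value <= 0:
        raise ValueError("quantity must be positive")
    return value

class ParseQuantityTests(unittest.TestCase):
    def test_plain_number(self):
        self.assertEqual(parse_quantity("42"), 42)

    def test_surrounding_whitespace(self):
        self.assertEqual(parse_quantity("  7 \n"), 7)

    def test_rejects_zero(self):
        with self.assertRaises(ValueError):
            parse_quantity("0")

    def test_rejects_garbage(self):
        # The fiddly bit a hand-check would catch: non-numeric input.
        with self.assertRaises(ValueError):
            parse_quantity("price")

if __name__ == "__main__":
    unittest.main()
```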
The remaining approach for verifying the correspondence between the Alloy model and the actual code is to depend on the native intelligence and experience of the software developers. This is a good thing to the extent that it helps the developers involved analyze and understand the program at a higher level than the one at which they normally work. In fact I speculate that a great deal of the benefit claimed for Alloy comes because of just this effect. In order to create an Alloy model the developers must come to a deeper understanding of the basic design of the program, and in doing so, they end up creating a better system.
So, I congratulate the author on addressing a difficult topic and bringing some new ideas to the table (or at least some interesting new variants of old ideas that have been around for a while), but I do wish he had done a better job of understanding the limits of this work. I cringe when he suggests that "governments may even establish inspection and licensing regulations that enforce high-quality program construction techniques," implying that modeling the program in Alloy is such a good thing that it should be imposed by fiat.
Tuesday, May 23, 2006