This is a response to Corvus’ Designing for Failure. I’m only posting it here because it got way too long to be a mere comment. Feel free to point out flaws and holes in my logic. I know that systems can be designed to react to failures, and I feel that self-repairing systems are possible given enough knowledge of the system, and enough data for reconstruction.
Let’s look at the hardware engineering aspect first. Hardware is a limited system. The only parts that exist are those that have been designed into the system; very rarely in hardware systems do you get any sort of emergent output. Failure in a hardware system means pulling out a block from the Jenga tower of the system. If the system is designed well, the tower may not fall, but it does become less stable. Of course, we can do a couple of things to alleviate this. First, we can design in redundancy, essentially building a very robust tower. Or we can design systems to monitor the original system. This is a process of diagnosis: we take the output that the system produces and double-check it against expected output. Note that this is not feasible in every case, nor is it foolproof. The largest flaw is that we assume the diagnostic system itself is functioning correctly; without accurate diagnostic data, the system self-corrupts.
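The monitoring idea above can be sketched in a few lines. This is a minimal, hypothetical example (the function names and tolerance are my own, not from the original post): readings from the system are cross-checked against the values a model predicts, and anything that drifts too far is flagged. Note it inherits exactly the flaw described above — it trusts that the expected values are themselves correct.

```python
def within_tolerance(actual, expected, tolerance=0.05):
    """Accept a reading if it is within a relative tolerance of the model."""
    return abs(actual - expected) <= tolerance * max(abs(expected), 1.0)

def diagnose(readings, expected_values):
    """Return the indices of readings that fail the cross-check.

    Assumes the expected values are trustworthy -- the same assumption
    the prose above calls out as the largest flaw of this approach.
    """
    return [i for i, (a, e) in enumerate(zip(readings, expected_values))
            if not within_tolerance(a, e)]
```

A reading of 9.0 against an expected 3.0 would be flagged; a reading matching its expectation would pass.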
This is completely different in the realm of software, primarily because software is fluid in nature rather than fixed like hardware. A system of self-diagnosis is both easier to produce and far more reliable*. But it is also possible to design a self-repairing, or even (as Craig has suggested) a self-generating system. Code can write itself for the purposes of repair or active design, although this is not an easy task.
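One simple form of software self-repair combines the two hardware ideas above — redundancy plus diagnosis. As a minimal sketch (my own illustration, not anything proposed in the original post), a value stored in several redundant copies can be diagnosed by majority vote, and the minority copies rewritten:

```python
from collections import Counter

def repair(replicas):
    """Majority-vote across redundant copies and rewrite corrupted ones.

    Returns the repaired replica list and the value the vote settled on.
    Still assumes a majority of copies are intact -- an assumption about
    data integrity, per the footnote on self-diagnosis.
    """
    majority, _ = Counter(replicas).most_common(1)[0]
    return [majority] * len(replicas), majority
```

Given copies `['a', 'a', 'b']`, the vote settles on `'a'` and the corrupted third copy is restored.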
As a simple diagnostic system, your engine would be very capable of tracking the progress and subsequent missteps of users. You would easily be able to detect when a user has wandered too far from existing plot points, providing you with the information to generate new ones. I don’t think this is far from the designer/tracking tools that you have outlined. You would also be able to easily spot plot holes as users encounter them, and perhaps even before. As the system ages, it may be able to fix certain flaws in the narrative based on previous fixes that have been applied. This could be done by having it assess story flow and place appropriate sub-plots, characters, and known archetypes as needed to focus the players along a narrative path (or paths).
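The “wandered too far” check is easy to express concretely. Here is a toy sketch under my own assumptions (a 2D world, plot points as coordinates, an arbitrary distance threshold — none of this is from the engine being discussed):

```python
def distance_to_nearest_plot_point(player_pos, plot_points):
    """Manhattan distance from the player to the closest known plot point."""
    return min(abs(player_pos[0] - p[0]) + abs(player_pos[1] - p[1])
               for p in plot_points)

def has_wandered(player_pos, plot_points, threshold=10):
    """Flag a player who has strayed beyond the threshold of every plot point.

    A flagged player is the signal to generate a new plot point nearby.
    """
    return distance_to_nearest_plot_point(player_pos, plot_points) > threshold
```

In a real engine the “distance” would more likely be measured in narrative state (quests touched, characters met) than in map coordinates, but the diagnostic shape is the same.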
It is hard to design to prevent failure, and doing so requires a lot of testing and time. You have to understand the whole system, its dependencies, its weak points, and its strengths. Then you have to determine what level of failure is acceptable and design fail-safes to ensure that the weak points stay within acceptable operability. This is usually an iterative process: each successive version has a better design because the flaws of the previous one were analyzed and removed. New flaws will always be created, because no system is perfect.
The same process would exist in a system designed for failure (and self-repair). Each iteration would become stronger from having its flaws removed. But, because it is an emergent process, new flaws will tend to be created. I think that the core of this challenge is in representing the narrative and story elements within the computer system in such a way that they can be easily reused and refurbished. This would require a deep understanding of story mechanics and elements. It would not work in all cases at first, but over time, as associations are built within the program, it would get better at representing, fixing, and designing stories.

* Not strictly true, as assumptions about the integrity of the data are still being made.
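To make the “reusable and refurbishable” representation a little less abstract, here is one possible shape for it — entirely my own speculation, with hypothetical field names: story elements tagged by archetype, so that when a plot hole is diagnosed, the engine can rank known elements by how well they match the hole and splice one in.

```python
from dataclasses import dataclass, field

@dataclass
class StoryElement:
    """A reusable narrative unit: a sub-plot, character, or archetype."""
    name: str
    archetype: str                              # e.g. "mentor", "betrayal"
    tags: set = field(default_factory=set)      # themes the element serves

def candidate_fixes(hole_tags, library):
    """Rank library elements by tag overlap with a diagnosed plot hole."""
    return sorted(library, key=lambda e: len(e.tags & hole_tags), reverse=True)
```

A hole tagged `{"conflict"}` would rank an element tagged with `"conflict"` above one that only offers `"guidance"` — a crude association, but the kind that could be refined as the system accumulates fixes.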