The NOTAM Database Crash: What Happened
A nearly 20-hour crash of the FAA's NOTAM database last week occurred because of a drive failure that took place "in the middle of updating the information on the hard drive," which in turn "screwed up the database," Barry Davis, manager of aeronautical information management for the FAA, told ComputerWorld.com. According to the FAA, the box in question was a Sun Microsystems Inc. server that was nearing the end of its life expectancy. Its failure left controllers to disseminate NOTAM information to pilots. Davis' team already had replacement equipment on hand but had not yet installed it. Because of that, the hardware recovery portion of the fix "was quite simple -- we just put the boxes in," said Davis. Unfortunately, in doing so they carried a data error over to the backup system, corrupting it and causing the system to run slowly, with performance that appeared to be deteriorating. In the end, the latest information had to be pulled from the corrupted database, re-imported into the new database and resynchronized with all the subsystems. Davis' team then put the system back online and stayed into the evening to make sure there were no more surprises.