Say Again? #13:
System Safety Theory and Practice
Air traffic control is in the news again, which means more people asking more questions (and making more uneducated statements). To head some of them off at the pass, AVweb's Don Brown uses this month's Say Again column to explain how the U.S. ATC system uses many layers of safety to prevent midair collisions, and what would cause it to fail.
I know it will come as no surprise to you that the recent midair collision on the Swiss-German border has been dominating my thoughts this week. It's the ultimate nightmare in ATC, for pilots, passengers, and controllers alike. We will have to wait months before we know precisely what went wrong.
The waiting will be difficult, but we have all seen that jumping to conclusions leads to more trouble than it gets us out of. As unpleasant a reality as it is, this tragedy will focus much attention on our system from people that rarely think about how our system works. Our ATC system in America is recognized universally as being safe. I suspect the same was true of the Swiss ATC system, though. And let me make that really clear while we're here: I don't know a thing about the Swiss ATC system. This article is in no way going to be about the Swiss ATC system or any other country's ATC system. It's about ours.
Inevitably, somebody is going to ask, "Could it happen here?" Beyond the obvious reply of, "Anything is possible," there isn't a good answer for that question. We don't know what happened yet. What I can do — what I will attempt to do — is tell you the things that would have to go wrong in our ATC system for it to fail.
The Cheese Model
The best way I've found to think of system safety was in a paper I read a few months ago. You can find a similar paper on the same subject here. In a horrible coincidence, the analogy device used in the paper was Swiss cheese. The coincidence didn't even hit me until I started typing this article. I've thought long and hard about how to get around it. I couldn't, so we'll just have to face it and move on.
The analogy compares the layers of safety built into a system with layers of Swiss cheese. Each layer has the inevitable holes in it. But if you have enough layers in your system, your overall system won't have any holes all the way through. You will have a solid barrier against any system failure. That will be the basis for this article. We're going to look at some layers in the enroute ATC system and see how many holes are in them.
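To put rough numbers on the analogy, here is a minimal sketch. It assumes each layer fails independently of the others, which real layers do not, and the hole probabilities are invented for illustration; nothing here is measured ATC data.

```python
from math import prod

def breach_probability(hole_probs):
    """Chance that a hazard slips through every layer.

    Assumption (not from the article): layers fail independently,
    so the individual "hole" probabilities simply multiply.
    """
    return prod(hole_probs)

# Hypothetical figures: each layer misses 1 conflict in 10.
one_layer = breach_probability([0.1])
five_layers = breach_probability([0.1] * 5)

print(f"one layer:   {one_layer:.5f}")
print(f"five layers: {five_layers:.5f}")
```

Under these assumptions, even leaky layers stack into a strong barrier. The same arithmetic shows what happens when a layer is thinned or removed: the product gets larger.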
The first time any controller becomes aware of a flight is by the Flight Progress Strip. On this strip of paper, all the information necessary to handle the flight is included: callsign, type aircraft, altitude, route of flight, time, and so on. At least one strip is printed, per flight, for each sector the flight will enter. The strip is then delivered to the appropriate sector.
Flight Progress Strip
The sector controller will take the strip, review it for any errors, and then place it, in time sequence, along with all the other strips on flights that will enter the sector. The controller will look for any conflicts with other traffic based on altitude (is anyone else at the same altitude?), route of flight (is anyone else at the same altitude on a conflicting route of flight?), and then by time (if the altitude is the same and the route crosses [or conflicts], how far apart will they be?).
Keep in mind that this should all happen five to 15 minutes before the flight actually enters the sector. This process is at the very heart of enroute ATC, and was developed in the very beginning. To this day, it is still the basis for non-radar ATC. Accordingly, it is mostly used by the "non-radar" controller, also known as the "data controller." Hence the term D-side controller.
Should the controller notice any conflicts, he will place a red W on both strips next to altitude. The first layer is in place. The controllers realize they have a problem and start searching for the best solutions.
The Holes in Layer One
While this is a very old and straightforward process, there are very few people that possess the capability to accomplish it. Almost anyone can do it with two airplanes. For instance, both flights are at FL330, one is eastbound on J22, and the other one is northbound on J53. They are both estimated to cross PSK at 1111Z. In other words, they'll be at the same place, at the same time, and at the same altitude. That's simple enough. But try doing it with 10 flights. The degree of difficulty rises exponentially for each flight added and for each crossing point added. Add in a few aircraft that need to change altitudes, and it gets far more complex.
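The altitude, route, and time check can be sketched in a few lines. This is a simplification under stated assumptions, not how any real system works: each strip carries a single crossing fix, the callsigns are made up, and the 10-minute window is an illustrative parameter rather than a published separation standard.

```python
from dataclasses import dataclass
from itertools import combinations

@dataclass(frozen=True)
class Strip:
    callsign: str      # hypothetical callsigns below
    flight_level: int  # e.g. 330 for FL330
    fix: str           # one crossing point per strip (simplification)
    eta_min: int       # estimated time over the fix, minutes after 0000Z

def find_conflicts(strips, window_min=10):
    """Return pairs that share altitude and fix within window_min minutes.

    window_min is an assumed, illustrative threshold.
    """
    conflicts = []
    for a, b in combinations(strips, 2):
        same_altitude = a.flight_level == b.flight_level
        same_fix = a.fix == b.fix
        close_in_time = abs(a.eta_min - b.eta_min) < window_min
        if same_altitude and same_fix and close_in_time:
            conflicts.append((a.callsign, b.callsign))
    return conflicts

strips = [
    Strip("EAST1", 330, "PSK", 11 * 60 + 11),   # eastbound on J22
    Strip("NORTH2", 330, "PSK", 11 * 60 + 11),  # northbound on J53
    Strip("OTHER3", 350, "PSK", 11 * 60 + 12),  # different altitude
]
conflicts = find_conflicts(strips)
pairs_for_10 = len(list(combinations(range(10), 2)))

print(conflicts)
print(pairs_for_10)
```

With two strips there is one pair to compare. With 10 there are 45, before counting multiple crossing points or altitude changes, and the controller does this mentally while the board keeps changing.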
This ability is what has always separated controllers from the general public. We have this innate ability to "see" this "picture" in our heads and to keep it constantly updated. Much like professional musicians can see a piece of sheet music and "hear" it in their heads, a controller can look at a board full of strips and "see" the traffic.
It is a mentally demanding task that requires a talented individual trained to very high standards. In other words, an expensive individual. It's expensive to find them, expensive to train them, and it's expensive to keep them trained. Add into the mix that it's much easier to just see the traffic on the radar, and that we don't even have D-sides half the time ... and you're left with a piece of cheese that is so thin that it's more "hole" than it is cheese.
5-4-6. RECEIVING CONTROLLER HANDOFF
The receiving controller shall:
b. Issue restrictions that are needed for the aircraft to enter your sector safely before accepting the handoff.
This means, simply, that before you accept a handoff on an aircraft that wants to enter your sector, it should be free of conflicts, or you need to issue control instructions to resolve any conflicts first. So, if you didn't use your strips to separate the traffic with non-radar procedures, this will be the first opportunity to use radar separation. If there is a conflict, you will need to move the aircraft under your control (assuming one of the aircraft is actually in your sector), or have the transferring controller move the aircraft he is trying to hand off.
One reason separation should be ensured prior to taking the handoff is that a transfer of radio communication is anything but a sure thing. Controllers forget to switch airplanes, pilots dial in wrong frequencies, and radios can fail. Any number of things can happen and none of them are good.
Once again, human nature, and a strong desire to make the system more efficient, start adding more holes than originally existed. All controllers have a personal level of comfort in working traffic. Some get nervous when aircraft will have less than 20 miles between them. Some don't get nervous until they see the aircraft will have less than two miles of separation. The phrase "enter your sector safely" is wide-open to interpretation. Radar separation, despite being vastly more precise than non-radar separation, isn't as cut and dried as you might think. Besides, when it comes down to moving an aircraft nearly every time you take another handoff, it's easier to believe that it might work (and fix it later if it doesn't) than to make a hundred more transmissions or telephone calls a day.
If layers one and two still leave a hole through our system, then we have at least one more chance on the human side of the equation:
5-1-8. MERGING TARGET PROCEDURES
b. Issue traffic information to those aircraft listed in subpara a whose targets appear likely to merge unless the aircraft are separated by more than the appropriate vertical separation minima.
c. If the pilot requests, vector his/her aircraft to avoid merging with the target of previously issued traffic.
NOTE - Aircraft closure rates are so rapid that when applying merging target procedures, controller issuance of traffic must be commenced in ample time for the pilot to decide if a vector is necessary.
How this affects separation is not obvious to most people. But if you'll stop and think about it, in order to make the judgment that the aircraft are separated "by more than the appropriate vertical separation minima," the controller actually has to determine the altitude of both aircraft that are merging before he can decide whether or not to apply merging target procedures. I've said it before but it bears repeating: If you think that a controller hasn't ever applied the merging target procedure only to discover that both aircraft were at the same altitude, then you'd be wrong. It's rare, but it has happened.
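The decision can be sketched as a small function. Notice that it cannot be evaluated without both altitudes, which is the point made above. The vertical minima used here (1,000 feet at or below FL290, 2,000 feet above) are an assumption for illustration, and the function names are hypothetical.

```python
def vertical_minimum_ft(altitude_ft):
    """Assumed minima for illustration: 1,000 ft at or below FL290,
    2,000 ft above it."""
    return 1000 if altitude_ft <= 29000 else 2000

def must_issue_traffic(alt_a_ft, alt_b_ft):
    """True when two targets that appear likely to merge are NOT
    separated by more than the applicable vertical minimum."""
    required = max(vertical_minimum_ft(alt_a_ft),
                   vertical_minimum_ft(alt_b_ft))
    return abs(alt_a_ft - alt_b_ft) <= required

print(must_issue_traffic(33000, 31000))  # exactly at the minimum
print(must_issue_traffic(33000, 33000))  # same altitude
print(must_issue_traffic(37000, 33000))  # more than the minimum
```

Under these assumed minima, two aircraft exactly 2,000 feet apart above FL290 still get a traffic call, because the rule exempts only aircraft separated by more than the minimum.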
If you're scratching your head about now, don't worry. I assure you some controllers reading this are also. I admit that I've taken a convoluted path in presenting these three layers. These three layers aren't the only layers in our safety system but I wanted to group them together. Each one, applied properly, can affect the separation of aircraft. But as we all know, it's all but impossible to apply any one safety procedure correctly 100 percent of the time.
The reason I grouped these three layers together is because they have fallen into disuse. I'll pay you five bucks if you can find a pair of red W's on anybody's strip board in Atlanta Center. Because of the way some airspace has been designed, virtually every aircraft in the sector would require a red W. Furthermore, "thinking non-radar" is tough. Random routes (without any common crossing points) make it even tougher. Throw in 40 to 80 flight progress strips and it becomes virtually impossible.
To further flog this dead horse, think about the hub-and-spoke system. The whole system, by its very design, is supposed to put all the aircraft at the same place at the same time. It's the antithesis of ATC theory. What's interesting is that, in a backwards sort of way, it validates the theory of non-radar control. The airlines don't have a radar scope they use to vector all their aircraft to arrive at the airport at the same time. They do it strictly based on time. And if you've ever seen an inbound push into a hub airport, you know that it works very well indeed.
Okay, okay. I'll move on.
Coordinating prior to taking a handoff is still pretty common among many controllers. But their number is diminishing rapidly. A paradigm shift (yeah, I can use buzzwords too) is taking place in enroute ATC. We are drifting away from "positive control" and becoming more and more reactive. Instead of "reaching out" to move an aircraft long before a potential conflict, we are waiting to see if it really is a conflict. Of course, if there really is a conflict, then we have less time to resolve it.
The Last Chance
The application of merging target procedures is perhaps the most obvious area of concern. With the advent of advanced navigation and the increase in traffic, the instances where the procedure should be applied are growing rapidly. Yet, it's being applied less and less. Before I give you the number-one reason why it's being applied less and less, let me ask if this sounds familiar:
"Airliner123, traffic twelve o'clock, one zero miles, opposite direction, at flight level three three zero."
In a bored voice the pilot replies, "We're IMC, Airliner123."
"Roger. Airliner354, traffic twelve o'clock, eight miles, opposite direction, at flight level three one zero, a Boeing seven thirty seven."
In a really bored voice the pilot replies, "We're IMC too, Center." From the tone of voice used you really expect to hear "you big dummy" tacked onto the end of the pilot's transmission.
Go back and read the rule. Does it say, "Call the traffic unless the aircraft is in IMC?" I didn't think so. As with most rules, there's more than one reason for it. The most important reason for this rule is to keep you alive (you big dummy).
The number-one reason controllers aren't calling traffic? "I don't need to call the traffic. He sees him on TCAS, you big dummy." I don't think anyone will be espousing that excuse anymore. They'll probably revert to the "I'm too busy" excuse. Unfortunately, there's a lot of truth to that excuse. Oftentimes controllers are too busy to call all the traffic they are working.
The really thick layer of "cheese" lies between layers two and three. A controller notices the vast majority of conflicting traffic merely by continuously scanning the radar scope and looking for conflicts. This layer is strengthened by additional personnel. The D-side, when not otherwise engaged in other duties, scans the radar scope also. If traffic volume warrants it, a third controller called a Tracker (some call them Handoff Controllers) will plug into the sector and scan the radar also. As history has proven, it is a very effective and safe system. But it's not perfect.
An automated safety layer to back up a controller's scan is the Conflict Alert system. The computer continuously scans all the traffic, searching for predicted conflicts. Should the computer predict a conflict within a certain period of time (three minutes at Atlanta Center), the computer will cause the data blocks of both aircraft to start blinking. It's a real attention getter if you aren't expecting it. Again, it is a very effective system, but it isn't perfect. The software logic doesn't work well for an aircraft that is in a turn (holding patterns are a major problem), and there are some other technical and human-factors problems. And yes, it is occasionally taken down for maintenance.
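The article doesn't describe the actual Conflict Alert logic, so the following is only a sketch of one common way to predict a conflict: extrapolate both aircraft in a straight line and find their closest point of approach. The three-minute look-ahead comes from the text above. The 5-mile lateral and 1,000-foot vertical thresholds, the flat x/y coordinates, and the function names are all assumptions for illustration.

```python
from math import hypot

def closest_approach(p1, v1, p2, v2):
    """Time (minutes) and distance (nm) at closest approach, assuming
    both aircraft hold their current velocity.
    Positions are (x, y) in nm; velocities are (x, y) in nm per minute."""
    rx, ry = p2[0] - p1[0], p2[1] - p1[1]   # relative position
    vx, vy = v2[0] - v1[0], v2[1] - v1[1]   # relative velocity
    speed_sq = vx * vx + vy * vy
    # No relative motion, or already diverging: closest point is now.
    t = 0.0 if speed_sq == 0 else max(0.0, -(rx * vx + ry * vy) / speed_sq)
    return t, hypot(rx + vx * t, ry + vy * t)

def conflict_alert(p1, v1, alt1_ft, p2, v2, alt2_ft,
                   lookahead_min=3.0, lateral_nm=5.0, vertical_ft=1000):
    """True if predicted separation is lost within the look-ahead time.
    Thresholds other than the look-ahead are assumed values."""
    t, dist = closest_approach(p1, v1, p2, v2)
    return (t <= lookahead_min
            and dist < lateral_nm
            and abs(alt1_ft - alt2_ft) < vertical_ft)

# Head-on at FL330, 40 nm apart, each doing 8 nm per minute (480 knots).
near = conflict_alert((0, 0), (8, 0), 33000, (40, 0), (-8, 0), 33000)
# Same geometry, 80 nm apart: closest approach is beyond the look-ahead.
far = conflict_alert((0, 0), (8, 0), 33000, (80, 0), (-8, 0), 33000)
print(near, far)
```

A straight-line projection like this one would mispredict an aircraft in a turn or a holding pattern, because its velocity changes from moment to moment. That is consistent with the limitation noted above, though the real software may fail there for other reasons.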
I have two hopes in writing this article. First, I hope you'll gain a better understanding and a deeper appreciation of how robust our safety system is. Its performance over the length of my career has been truly outstanding. At some point in time, I've seen every layer in this system fail. But every time I saw a layer fail, some other layer prevented the whole system from failing.
Space limitations won't permit me to list every layer (besides, I'm not even sure I know them all), but I must mention the greatest layer: the people that operate the system. I have seen controllers, pilots, and technicians perform extraordinary feats to keep this system from failing. Many people (including myself) are quick to point out human frailties in a system. But their abilities far outweigh any shortcomings.
Hope for the Future
My second hope is that, with a better understanding of the safety layers built into our system, you'll understand the pressures involved in maintaining them. The pressure to further automate ATC is enormous. The need to do so is obvious. What isn't so obvious to outsiders (and many insiders) is the effect this automation has on the various layers of safety built into the system. Each layer is important. How important depends on the circumstances. We cannot, and must not, allow the pressure to make the system more efficient degrade or remove any layer of safety.
Have a safe flight!
Facility Safety Representative
National Air Traffic Controllers Association