NOTAM System Outage Halts U.S. Flights (UPDATED)


Technicians were doing a rare reboot of the NOTAM system when the decision was made to issue a ground stop early Wednesday. The system got glitchy on Tuesday afternoon and the agency found a single corrupted file in both the main and backup systems, according to sources interviewed by CNN. After the system was nudged along through Tuesday night, the decision was made to reboot it in the early morning hours of Wednesday, the network reported. The system took longer than anticipated to come back, and at 7:30 a.m. the decision was made to halt all departures. The ground stop didn’t last long, but the effect was far-reaching.

As of 8:15 a.m. Eastern Standard Time (EST), departures were resuming at Newark Liberty International Airport (EWR) and Hartsfield-Jackson Atlanta International Airport (ATL). The ground stop was lifted and domestic departures at all other airports began to resume at 9 a.m. EST. As of late Wednesday, the system was mostly back to normal, but Mother Nature has other plans. Major weather systems on the West Coast, across the Plains and in the Southeast threaten to add to the misery.

A cyberattack has been all but ruled out as a cause. White House press secretary Karine Jean-Pierre stated that there was no evidence that the outage was caused by a cyberattack, noting that the Department of Transportation will be conducting a full investigation. An estimated 5,000 commercial flights had been delayed by the issue as of 10 a.m. EST. Both Transportation Secretary Pete Buttigieg and President Joe Biden said an investigation will be launched into the outage.

Kate O'Connor
Kate O’Connor works as AVweb's Editor-in-Chief. She is a private pilot, certificated aircraft dispatcher, and graduate of Embry-Riddle Aeronautical University.

59 COMMENTS

  1. “White House press secretary Karine Jean-Pierre stated that there is currently no evidence that the outage was caused by a cyberattack”

    Why would blaming the outage on internal (administration) failures make it any less scary?

    • Internal problems (and they generally are not “administration” related, i.e. someone screwed up) can be expected from time to time. For example, we all lose things occasionally. A cyberattack is the equivalent of someone breaking into your house and TAKING things. Most people would be more concerned about the latter than the former.

      • If nationwide outages can occur if just one person trips over a power cord then that is still the administration.

  2. Inexcusable, especially the ground stops. Next thing you know the FAA will be putting out nationwide ground stops for cloud cover!

    • “Inexcusable” is rather harsh. Everyone wants everything to work flawlessly all the time, but that is most certainly unrealistic. And as far as thinking a ground stop is overkill (comparing NOTAMS to cloud cover), NOTAMS are a key component to making the US aviation system as safe as possible. While these things are certainly an interruption, safety must be kept as the first priority. Certainly over “convenience”.

      • With respect, baloney! Sorry about being direct, but today’s ground stop was ridiculous. I am a pt 135 pilot who flew a trip after the GA part of the ground stop was ended. There are ways around this “notam outage” that may involve more work but are not unsafe. I would tell that to any FAA official, or anybody else who would question my departure. I stand by my original comments!

      • The NOTAMS were still available, already downloaded in multiple planning systems. New NOTAMs weren’t loading, but flights could safely be flown with updates via ATIS, over the radio, and by delaying the execution of the underlying work or outages for that day. Don’t let the “Men and equipment working near runways” start mowing the grass until the issue is resolved.

  3. Are we making ourselves ever more vulnerable with the increase in and mania for computerized everything? Not saying we shouldn’t use them, but with the Russians, Red Chinese, and North Koreans prowling around on the internet we are capable of being had. Also think EMP. Love her statement “currently there is no evidence.” It happened last night and this is just a tad premature.

    • In short, yes.

      The longer answer is that resilient and secure computer systems are not cheap or easy to maintain, yet IT staff who maintain the systems are often some of the first people cut, and many organizations don’t want to pay for a fully-resilient system, so compromises are made. And even if a fully-resilient system is purchased and maintained properly, there are still ways in which it can be disrupted (often as a result of human error).

      This isn’t to say we shouldn’t use these systems. But critical infrastructure should have some form of backup system, at least to allow reduced or delayed service levels. We have procedures to fly IFR in the case of lost-comms or non-radar environments, but apparently we don’t have procedures for the loss of the NOTAM system.

    • As a person who spent much of his working life in the high quality end of the tech world, I get bitter about this. In the early 2000’s there were a lot of software people taking big advantage of bad decisions made by the customers who let their vendors put them in positions where they could easily be leveraged for ridiculous support terms. Read up on Computer Associates and their business model if you are interested. One outcome of this was the open source movement which helped destroy any chance at security against malicious actors and bad software. Open source is a religion, like socialized medicine, and very few people are able to rationally discuss it.
      On the other hand, the entire IT world, including me, chased the easy to use benefit allowing more and more idiots onto the systems. Sometimes I feel like I spent well over a decade increasing the speed of garbage in, garbage out from paper systems to web enabled ones.
      And all that just magnifies the problem of management making bad choices. So many bad managers.

      • “…the open source movement which helped destroy any chance at security against malicious actors and bad software. Open source is a religion, like socialized medicine, and very few people are able to rationally discuss it.”

        As a person who spent much of his working life in the tech world, though I won’t claim it was always the “high quality” end, I disagree with you that the open source movement is one of the top destructors of “any chance at security against malicious actors and bad software”. But then I may be part of the open source religion, just as I benefit from accessible medical care without fear of bankruptcy, thanks to socialised medicine.

        But I suspect you and I would agree that quality and resilience of software-based systems does not come cheap or easy. It requires strategic commitment by the organisation, it requires a lot of effort from designers, testers, and managers, and it requires money. Too many organisations have balked at those requirements, and as a result they have fragile systems.

        Add the NOTAM system to the case studies of Southwest’s crew scheduling system, the SolarWinds Orion software supply chain, and many many more.

        • “Destroying any chance” is admittedly a bit hyperbolic, but the destruction of a proper market in operating systems is likely costing the customers more than Linux will ever save them.
          Don’t even get me started on socialized medicine. I had a very bad outcome precisely due to classic central command issues and there was no chance of relief.
          I’ve worked with government and corporate customers and they can make a lot of the same mistakes as well as different ones.
          It will be interesting to hear what happened if we ever get close to the real story.

      • Oh well, at least most companies have stopped using Sysman as the top level password…. Or is that too big an assumption for the FAA?

    • “ “currently there is no evidence.” It happened last night and this is just a tad premature.”

      What’s premature about saying “currently”? At the time she made the statement they didn’t have any evidence of a hack by bad actors. She didn’t say that we weren’t hacked, she simply said there’s no evidence of it yet.

      • I personally have no faith or trust in anything which comes from this “administration.” “Tad premature” and “currently” because these people shoot first and think later, maybe.

    • Now that you’ve hurled the first offensive stone, Horace, do you really think the Secretary of Transportation, an Oxford Rhodes Scholar who spent seven years as an officer in the Naval Reserves, and voluntarily deployed to Afghanistan, had anything to do with the NOTAM outage?

      Or did you just want an excuse to spew your ignorant homophobic spittle?

      • Well, if he’s a military officer, and in the chain of command, he ought to at least feel responsible.
        I will be impressed if he shows any earnest sense of failure over this.

  4. The Swedish NOTAM system was down for many months due to a new computer system, but we could always log into the NOTAM systems in neighbouring countries.
    I assume you in the US also have the standard that when a NOTAM is issued it goes into the international NOTAM system at the same time as it is logged in the national system.
    So the US system should have been redundant?

    • Lars, that makes a great deal of sense. A document posted on the FAA’s website on December 2, 2022 appears to state that the FAA is in the process of bringing its format and system into compliance with ICAO standards so that NOTAMs are posted worldwide on other countries’ computer systems. But it doesn’t appear that US NOTAMS are available at this point.
      https://www.faa.gov/air_traffic/flight_info/aeronav/notams/

  5. For the past few years, the FAA has been preparing for/upgrading the old NOTAM system to be more compatible with those of other countries (not all) around the world. That was supposed to finish around summer of this year. It would not surprise me if someone threw a switch before something was ready and brought the whole thing to its knees. Or, worse, they thought it was ready and forgot to separate the backup/standby system prior to implementing a change (making it unrecoverable as the standby system incorporated the change when it rolled into the production system).

    For a system that has such a high impact on air travel, it seems the design is not as resilient as folks thought. Much like the ground control issue in Florida a week or so ago that caused delays in the region until they figured out how to resolve it.

  6. Curious that the NOTAM system could have this effect. What other secondary systems that the FAA has running could also shut things down in this way? Whether this was an attack or an “administrative problem”, it shows a weakness in the system or the way the system is used/thought of by the FAA.

    Since I operate under Part 91, I’ve never looked at what 121 and 135 say about the requirement for a NOTAM briefing before a flight. Part 91 doesn’t seem to require it as the closest I see is 91.103 which says “Each pilot in command shall, before beginning a flight, become familiar with all available information concerning that flight.” The key word is “available” … if the NOTAM system is not available does that excuse a pre-flight briefing requirement of knowing what applicable NOTAMs there are at the time of the flight?

    Personally, I’d want to know the NOTAMs that affected my flight before I make the flight and I suspect that’s the safety philosophy of the FAA on this one.

    It’ll be interesting to see what really was the cause of this system failure and also what safeguards will go into place for the future (for all secondary IT systems like the NOTAM system) as the cost of ground stopping all flights is enormous!

    • Same for pt135. Don’t know about pt121. There was absolutely no reason to ground stop everything because of the notam computer failure. Telephone calls to FBOs to check for airport issues still work; they just take a little more time and effort. Like I said earlier, sooner or later the FAA will start ground stopping everything for cloud cover if we don’t clean house at the FAA.

    • I think it might be 121.601(a): “The aircraft dispatcher shall provide the PIC all available current reports or information on airport conditions and irregularities of navigation facilities that may affect the safety of flight”. If the NOTAM system is down, the dispatcher can’t provide information on the “airport conditions and irregularities of navigation facilities”.

      I believe the wording “all available” assumes that the TFR, NOTAM, etc sources are functional.

      • Assumptions can be hazardous. If we can assume all sources (TFR, NOTAM) are functional, there’d be no reason to have the ‘available’ qualifier in the rule.

  7. “the system went down at 2028Z on Jan. 10”.

    According to the article 2028Z on Jan 10th would be 1528EST yesterday afternoon.

    The article and the ATCSCC Advisory do not agree. The advisory (quoted below) states the outage occurred on 1/11/2023 at 2028Z. That would be 1528EST today which is well after the event. ???

    ATCSCC Advisory
    ATCSCC ADVZY 013 DCC 01/11/2023 NOTAM SYSTEM EQUIPMENT OUTAGE_FYI
    MESSAGE:

    EVENT TIME: 10/2028 – 11/0700

    ***REPLACES/EXTENDS ADVZY 006***

    THE UNITED STATES NOTAM SYSTEM FAILED AT 2028Z. SINCE THEN NO NEW
    NOTAMS OR AMENDMENTS HAVE BEEN PROCESSED. TECHNICIANS ARE CURRENTLY
    WORKING TO RESTORE THE SYSTEM AND THERE IS NO ESTIMATE FOR
    RESTORATION OF SERVICE AT THIS TIME. THERE IS CURRENTLY A HOTLINE IN
    EFFECT WHICH HAS NAIMES/FAA FACILITIES/STAKEHOLDERS IN ATTENDANCE.
    THIS HOTLINE INFORMATION IS CONTAINED WITHIN ADVZY 004. THIS ADVZY
    WILL BE UPDATED AS NECESSARY.

    EFFECTIVE TIME:

    110418 – 110730
    SIGNATURE:

    23/01/11 04:18

    • Rick C.: the text you quoted does not say that the outage occurred “on 1/11/2023”. The advisory is _dated_ 1/11/2023, and says the outage occurred “at 2028Z”, date unspecified. It is reasonable to infer that the advisory was sent _after_ the event it describes, rather than before it. Hence, _if_ the advisory was sent on 1/11/2023, on or before 1/11/2023 20:27Z, then the most reasonable interpretation of “at 2028Z” would be “at 1/10/2023 20:28Z”.

      • Jim D.: Yes sir! I think you are correct. It does make me wonder though, if the outage occurred yesterday, Jan 10 at 3:28 pm EST, why did it take so long to implement a nationwide ground stop?

        • Also, just FYI, I see the original AVWEB article has been edited to remove the original reference to the Jan 10 date. The title now adds the word ‘UPDATED’. I guess the author could have been confused by the wording as well.
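
The date inference worked out in the replies above can be checked with a few lines of Python. This is only a sketch under stated assumptions: the helper name `resolve_zulu_time` is hypothetical, the advisory's signature line (23/01/11 04:18) is taken to be UTC, and EST is treated as a fixed UTC-5 offset, which holds in January.

```python
from datetime import datetime, timedelta, timezone

# Assumption: Eastern Standard Time is a fixed UTC-5 offset (true in January).
EST = timezone(timedelta(hours=-5), "EST")


def resolve_zulu_time(hhmm: str, issued_utc: datetime) -> datetime:
    """Return the most recent UTC datetime at or before `issued_utc`
    whose clock time is `hhmm` (for example "2028")."""
    hour, minute = int(hhmm[:2]), int(hhmm[2:])
    candidate = issued_utc.replace(hour=hour, minute=minute, second=0, microsecond=0)
    if candidate > issued_utc:
        # The same-day reading would be in the future, so use the previous day.
        candidate -= timedelta(days=1)
    return candidate


# Advisory signature line quoted in the thread: 23/01/11 04:18 (assumed UTC).
issued = datetime(2023, 1, 11, 4, 18, tzinfo=timezone.utc)
outage_utc = resolve_zulu_time("2028", issued)
outage_est = outage_utc.astimezone(EST)

print(outage_utc.strftime("%Y-%m-%d %H:%MZ"))      # 2023-01-10 20:28Z
print(outage_est.strftime("%Y-%m-%d %H:%M %Z"))    # 2023-01-10 15:28 EST
```

Because an advisory cannot describe an event that has not happened yet, the helper picks the latest 20:28Z that precedes the 04:18Z signature time, which lands on Jan. 10 at 3:28 p.m. EST.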

  8. A little more detail about this Ground Stop would be helpful. The system apparently went down at 2028Z yesterday, but flights were still departing; my commute home last night departed on time at 2340Z. A quick search shows no significant delays of last night’s red-eyes, nor this morning’s 1100Z departures, so just when was this Ground Stop?

    • What I want to know is who was the idiot at the FAA who decided to issue a ground stop in the first place. If the airlines felt they needed to stop flying then that is their choice. I flew a trip this morning after GA stop was discontinued. Some support from my company along with a couple of phone calls to confirm any destination issues and away I went.

  9. NOTAMs communicate information concerning the rules and regulations that govern flight operations, the use of navigation facilities, and designation of that airspace in which the rules and regulations apply.

    (c) When a NOTAM has been issued under this section, no person may operate an aircraft, or other device governed by the regulation concerned, within the designated airspace except in accordance with the authorizations, terms, and conditions prescribed in the regulation covered by the NOTAM.

    § 91.139 Emergency air traffic rules.
    (a) This section prescribes a process for utilizing Notices to Airmen (NOTAMs) to advise of the issuance and operations under emergency air traffic rules and regulations and designates the official who is authorized to issue NOTAMs on behalf of the Administrator in certain matters under this section.

    91.103 Preflight action. Each pilot in command shall, before beginning a flight, become familiar with all available information concerning that flight.

    There have been several NTSB decisions in the past decade(s) stating that just because the PIC couldn’t access “it”, didn’t relinquish responsibility of the PIC to still ‘obtain’-ALL information concerning that flight before commencing upon that flight. That information includes NAV aid outages, runway lighting, TFR’s, etc.
    AC 91-92 – Pilot’s Guide to a Preflight Briefing
    https://www.faa.gov/regulations_policies/advisory_circulars/index.cfm/go/document.information/documentID/1036892

  10. 7.4.1.8 NOTAMs. Check NOTAM information affecting the flight. This includes:
    • Domestic NOTAMs.
    • International NOTAMs (when flight extends beyond U.S. airspace).
    • Special Use Airspace (SUA) NOTAMs (e.g., restricted areas, aerial
    refueling, night vision goggles (NVG) operations, military operations
    areas, military training routes, and warning areas).
    • NOTAMs for field conditions (FICON).

  11. If the FAA couldn’t easily fix the LODA issue which THEY caused by having the Administrator merely wave his magic twanger, what makes anyone think that a more serious issue like this could be fixed or prevented in the first place? Having to have the Congress include fixing it in a Bill on a totally unrelated subject is tantamount to criminality.

    Computers are nice until they don’t work. I’ve been in FL when the power was out for days and stores couldn’t deal with figuring out how to sell the stuff they had in the store manually.

    Heaven help us all.

  12. The FAA said the crippling delays that affected thousands of flights appear to have been caused by a problem with a corrupted file in the Notice to Airmen (NOTAM) computer system, which sends pilots vital information they are required to obtain in order to fly.

    A corrupted file affected both the primary and backup system, a senior government official said Wednesday evening, adding that officials continue to investigate.

    “The FAA is continuing a thorough review to determine the root cause of the Notice to Air Missions (NOTAM) system outage,” the agency said in a statement. “Our preliminary work has traced the outage to a damaged database file. At this time, there is no evidence of a cyber-attack.”

    “One of the questions we need to look at right now, and one of the things I’m asking from the FAA, is what’s the state of the art in this form of message traffic?” “And again, how is it possible for there to be this level of disruption?” stated Transportation Secretary Pete Buttigieg.

  13. Outage was caused by a “corrupted” file in the Database? Really? Files don’t just “corrupt” themselves. Something happened. A hardware failure (did the array behind the DB have a problem and lose data?), a transfer failure (someone uploaded a file and no one bothered to compare the checksum against it to ensure its integrity?), an approved change (the admins incorporated design changes into the database which caused a slow-down after dropping indexes which should have remained active; this can slow things down considerably, and the recovery would require rebuilding indexes, slowing things down considerably during the rebuild), an unapproved change where someone entered the wrong commands on the wrong system (I thought I was on the development box and wiped out a bunch of production files; let’s try restoring them one-by-one and see if we can cover our butts!), or outright intentional sabotage (most cyber attacks come from within the organization).

    We should be asking how it was corrupted so we can implement steps to ensure it never happens again.
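
The checksum comparison mentioned above as a guard against transfer failures can be sketched as follows. Nothing in the FAA's statements says how the NOTAM database is actually verified, so this is purely illustrative: the helper names `sha256_of` and `is_intact` and the file names are hypothetical, and SHA-256 is just one common choice of digest.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Hash a file in chunks so large files need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_intact(path: Path, expected_hex: str) -> bool:
    """True when the file's digest matches the digest recorded at the source."""
    return sha256_of(path) == expected_hex.lower()


# Demonstration with throwaway files (hypothetical names, made-up contents).
with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "source.db"
    copy = Path(tmp) / "copy.db"
    source.write_bytes(b"example record data\n" * 1000)
    copy.write_bytes(source.read_bytes())

    expected = sha256_of(source)
    copy_ok = is_intact(copy, expected)

    # Flip a single byte to simulate corruption in transit or on disk.
    data = bytearray(copy.read_bytes())
    data[10] ^= 0xFF
    copy.write_bytes(bytes(data))
    damaged_ok = is_intact(copy, expected)

print(copy_ok, damaged_ok)  # True False
```

A check like this only detects that a copy differs from its source. It would not help if the source file itself were already damaged and then replicated to a backup, which is the scenario described in the reporting.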

  14. A corrupted single file brought down the system and stopped US aviation? Not a very robust system architecture. Get ready for the next breakage while the politicians begin the finger pointing.

    • I do not care “why” a file was corrupted (terrorism, accident, etc.). I care that the whole system could be downed by a single point of failure. The buck stops with the administration, otherwise the FAA needs to change its name to FA.

  15. So a government agency, FAA, causes a massive, nationwide airline delay, and its parent, DOT, fines the airlines for the delay that its incompetence caused. Bernie Sanders, Elizabeth Warren, Richard Blumenthal, want to make the fine timetable even shorter.

    Only Government could do something like this.

  16. I don’t believe the FAA has the authority to regulate non-commercial aviation. Remember, the FAA is an arm of the Executive branch of government, and derives its Constitutional authority from the so-called “interstate commerce clause.” Our non-commercial GA flight was grounded needlessly, and without proper authority. There was no emergency, and all the information for the flight was already known.

  17. Odd… this system wasn’t considered ‘mission critical’… no backup systems. The DATA is backed up, but there is no standby system as there is for RADAR, Comm, etc…
    Instant NOTAMS were not the norm until recently. Now they are critical for any IFR flight.

  18. The “corrupted” file was titled “NOTICE_TO_AIRMEN.dbs”. The updated front end software called for a file named “NOTICE_TO_AIR_MISSION.dbs”. In other words, it wasn’t “corrupt”, just woke. 😀
