A Czech ISP made some changes this morning that caused routing updates to increase from a few thousand per second to around 25k per second at the peak. Newly connected BGP routers typically announce information about themselves that propagates to each and every other BGP router on the internet. One of these pieces of information is the AS path, the list of AS (autonomous system) numbers a route has passed through, which is used to make sure there are no loops in the network. AS paths also indicate a preference among multiple paths between the same endpoints: all else being equal, the shorter path wins, which helps data flow over the best path.
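As a rough illustration of the two jobs the AS path does, here is a minimal Python sketch. It is not how any real router implements BGP: the function names are made up for this example, the AS numbers come from the private range, and real route selection weighs several other attributes before path length.

```python
# Minimal sketch of AS-path loop detection and shortest-path preference.
# Function names and AS numbers are hypothetical, chosen for illustration.

def accepts_route(local_as: int, as_path: list[int]) -> bool:
    """Loop check: reject a route whose AS path already contains our own AS."""
    return local_as not in as_path


def best_path(candidates: list[list[int]]) -> list[int]:
    """Pick the candidate with the fewest AS hops (first one wins a tie)."""
    return min(candidates, key=len)


LOCAL_AS = 64512  # assumed private AS number for "our" router

looped = [64513, 64512, 64514]   # our AS is already in the path
clean_long = [64513, 64514, 64515]
clean_short = [64516, 64515]

print(accepts_route(LOCAL_AS, looped))       # route would loop back through us
print(accepts_route(LOCAL_AS, clean_long))   # safe to accept
print(best_path([clean_long, clean_short]))  # the two-hop path is preferred
```

A route that has already passed through our own AS is dropped, and among the remaining routes the one with the fewest AS numbers is chosen.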
If you were connected to two ISPs via BGP and wanted to shift more traffic through ISP B than A, you could ‘simply’ increase the path cost to ISP A by padding the AS path you advertise through it with extra copies of your own AS number. The problem occurred when the length of the AS path exceeded 126 numbers, which older Cisco routers balk at. A second bug could trigger when the AS path exceeds 255 numbers.
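The traffic-shifting trick above is usually done by repeating your own AS number in the path you announce to the less-preferred ISP. The sketch below shows the idea in Python under simplifying assumptions: all names and AS numbers are hypothetical, and the remote network is assumed to compare AS path length only.

```python
# Minimal sketch of padding the AS path to make one upstream less attractive.
# All AS numbers are hypothetical values from the private range.

OWN_AS = 64512  # our network
ISP_A = 64601   # upstream we want to carry less traffic
ISP_B = 64602   # upstream we want to carry more traffic


def announce(own_as: int, extra_prepends: int = 0) -> list[int]:
    """AS path we send to an upstream: our AS, repeated to pad the path."""
    return [own_as] * (1 + extra_prepends)


def forwarded_by(upstream_as: int, as_path: list[int]) -> list[int]:
    """An upstream adds its own AS at the front before passing the route on."""
    return [upstream_as] + as_path


def preferred(paths: dict[str, list[int]]) -> str:
    """A remote network's choice when only AS path length differs."""
    return min(paths, key=lambda name: len(paths[name]))


paths = {
    "via ISP A": forwarded_by(ISP_A, announce(OWN_AS, extra_prepends=2)),
    "via ISP B": forwarded_by(ISP_B, announce(OWN_AS)),
}

for name, path in paths.items():
    print(name, len(path), path)
print("preferred:", preferred(paths))
```

With two extra copies of our AS on the announcement to ISP A, the path through A is four entries long against two through B, so a remote network comparing lengths picks B. The padding count is an ordinary configuration value, which is how a single mistake can produce a very long path.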
Internet routing protocols are a labyrinth of complexity and can easily be broken, as the problematic ISP, SuproNet, showed today. It was not all their fault, as a number of unpatched systems were the real culprit. In his book “Internet Routing Architectures” about BGP, Sam Halabi writes, “Some people are surprised when networks fail. I’m surprised when they don’t.”