Routing Protocols ~ OSPF ~ Part II

Hello homies and homettes. I hope you are all having a great day so far. Welcome to Part 2 of the routing protocols series regarding OSPF. This time we are going to dig deeper into OSPF and try to visualize the “low-level” detail behind it. Allow me to take your hand and walk you through OSPF once again, but this time there is a lot more coming at you so hold my hand tight. I hope the OSPF introduction part made you curious and you can’t wait to learn more about this bad boy. No rant or personal opinion this time, let’s get straight to the topic. Get ready to get bombarded with more theory, lads. I will try to add as many pictures as possible so it can be entertaining and fun to read. The topics that will be covered in this article are:

  • The OSPF big picture
  • Metric
  • Neighbor Relationships

Let’s start off with the picture behind OSPF. In the previous article, we talked about areas but we didn’t get into enough detail. We mainly focused on one area, the most important one for getting OSPF working: area 0. Let’s have a look at the picture below.

This is our topology for now. R1, R2, R3 are routers as you may have guessed and they belong to area 0. However, a single area has its limitations. The routers in a single area have to have the same link state database. What do I mean by that? You can either see it as a topology table or as the information the routers use in order to generate their routing table. Imagine each of those 3 routers being connected to 300 different networks. So R1 is gonna send all 300 networks to R3, R3 is going to send all 600 networks to R2 (his and R1’s) and R2 is going to send his 300 networks to R3. In the end, each of these blue mofos will have 900 networks in their routing table. If you don’t understand why, the key is “link state.” I have a link for you to read in case you are confused, which will make you understand why each router has added to its routing table the networks of the others as well. Don’t hesitate, give it a shot.

Single area OSPF does not support the key concept I mentioned in my previous article, which is summarization. Summarization allows us to take a bunch of networks and sum them up into fewer advertisements. The goal is to use one advertisement. So let’s say we have the networks 192.168.1, 192.168.2, 192.168.3 etc. We could sum all these up into 192.168.0.0/16. If you don’t understand why /16, then you need some subnetting revision, my friends, but don’t worry because I got your back. You can either have a look at my subnetting article, or if my article makes no sense, you can google!
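If you want to convince yourself that the /16 really covers those networks, here is a minimal sketch using Python's standard ipaddress module. The /24 length for the individual networks is my assumption (the text only gives the first three octets), and covered_by is just a helper name made up for this example.

```python
import ipaddress

# The networks from the text, assumed to be /24s for this sketch.
networks = [ipaddress.ip_network(f"192.168.{i}.0/24") for i in (1, 2, 3)]
summary = ipaddress.ip_network("192.168.0.0/16")

def covered_by(summary_net, nets):
    """Return True if every network in nets falls inside summary_net."""
    return all(net.subnet_of(summary_net) for net in nets)

print(covered_by(summary, networks))  # True
# How many /24 networks fit inside the single /16 advertisement.
print(summary.num_addresses // networks[0].num_addresses)  # 256
```

So one advertisement can stand in for up to 256 of those /24 networks.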

What the summary 192.168.0.0/16 says to the other routers is “Hey, I have every single network that starts with 192.168.” So practically we replaced a huge number of routing table entries with just one. Much smarter and more efficient, right? That manoeuvre makes 2 things possible:

  1. The routers are faster, because the fewer routes a router’s friends get to know, the faster they are able to route the packets.

  2. Fault Containment. What do I mean by that? If one of the networks R1 is connected to went down, he would have to send an update about it to the rest of the routers. Why? Because all the routers in the area have to have the same routing information. It’s the requirement of a Link State protocol. And in case you didn’t know, OSPF is a Link State routing protocol. Assume R2 goes offline. Then 300 networks would disappear. R1 and R3 would be like “oh crap they are down, I repeat they are down. Mark them as unreachable.” Now assume R2 is back again. R1 and R3 are back at it again saying “Wow wow they are back they are back. They are actually reachable.” and then boom, the routers blow up. Ok they don’t blow up but they definitely get confused because they are constantly processing these routing updates and they end up dropping traffic because there is no way of containment. Imagine that situation happening, because it’s possible. If R2 keeps going up and down, it turns into a never-ending loop.

WELCOME TO PART 2

I know what you are thinking. This is a magnificent piece of art and it’s also so confusing. I know, I know but have no fear. I promise I will do my best to explain it to you as simply as possible.

Let me introduce you to a key term, known as Area Border Router. Where does the name come from? These routers sit on the border of an area. I know, it still does not make sense so let’s “zoom in” a little bit on the picture. Specifically I’m going to zoom in on the part of the picture where R1 “sits” on the border between Area 0 and Area 1. This is what it actually looks like.

By putting R1’s interfaces in more than one area (one interface in Area 0, another in Area 1), we make R1 an Area Border Router, aka ABR. What is an ABR able to do?

Summarize - Fewer advertisements as I mentioned before.

Fault Containment - For example, if a router dies in Area 0, what’s going to happen? I want your full attention now. Focus! Fault containment depends 100% on summarization. And since I love examples, let me give you one.

Looking at the picture above, let’s assume Area 1 has a bunch of routers belonging to the 192.168.1, 192.168.2, 192.168.3, 192.168.4, 192.168.etc networks. In other words, we have all the 192.168 networks in that area. First of all let me tell you this. Fyi, you can set up an ABR that does not do summarization. So he can take all those networks and pass them through from Area 1 to Area 0. That would totally defeat the purpose of having areas but that’s the default setting when configuring OSPF.

Back to our example. If something goes wrong with the 192.168.1 network, then the router would send a message to all the routers including the ABR, and the ABR would pass it on to Area 0. But if we implement summarization by summing all those networks up to 192.168.0.0/16, the ABR would think “Hmm, fair enough, the networks on my left are all in the 192.168 network.”

Then what’s going to happen if the 192.168.3 network goes down in Area 1? As soon as the info about the network going down reaches the ABR he is going to think “I know it is down, but I’m not going to pass that on because frankly nobody on my right (Area 0) even knows that 192.168.3 exists.” Think about it for a second. The routers in Area 0 have no clue about the 192.168.3 network. All they know about is the 192.168.0.0 network. So even if the ABR passed on the info that a network went down from Area 1 to Area 0, the cheeky mofos on the right would think “Well, ok but, we don’t have any 192.168.3 network in our routing table anyway. Why would we care?” It kinda suppresses the update if you can tell.

Something tells me you have a question though. “What if computer B from Area 0, wants to send a message to a computer X in Area 1?” What’s going to happen? The message is going to travel through each node of Area 0 thinking that X is alive but once it reaches ABR (ABR has the specific information. He knows everything that is going on in Area 1) he is going to say “Sorry mate, that network is down, I’m going to drop your packet. Good luck next time.”
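To make the fault containment part a bit more concrete, here is a rough sketch of what the ABR advertises into Area 0 with and without summarization. This is not how a real router is implemented. advertised_to_backbone is a made-up function, the /24 lengths are my assumption, and the rule "keep the summary as long as at least one network inside it is still up" is the simplified behaviour the example relies on.

```python
import ipaddress

def advertised_to_backbone(area_networks, summary=None):
    """Prefixes a hypothetical ABR would advertise into Area 0.

    Without a summary, every network is passed through as-is.
    With a summary, the networks inside it are replaced by the single
    summary, which stays advertised while at least one of them is up.
    """
    if summary is None:
        return set(area_networks)
    inside = {net for net in area_networks if net.subnet_of(summary)}
    outside = set(area_networks) - inside
    return outside | ({summary} if inside else set())

area1 = {ipaddress.ip_network(f"192.168.{i}.0/24") for i in range(1, 5)}
summary = ipaddress.ip_network("192.168.0.0/16")
failed = ipaddress.ip_network("192.168.3.0/24")
area1_after_failure = area1 - {failed}

# No summarization: Area 0 sees a change and has to process an update.
print(advertised_to_backbone(area1) != advertised_to_backbone(area1_after_failure))  # True
# Summarization: from Area 0's point of view nothing changed at all.
print(advertised_to_backbone(area1, summary)
      == advertised_to_backbone(area1_after_failure, summary))  # True
```

The second line is the whole point: with the summary in place, the failure of 192.168.3 never shows up in what Area 0 is told.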

Do you see how convenient this is? Do you see how much more efficient this is than sending messages all the time from Area to Area when someone goes offline? An area could have 500 routers. I cringe just by thinking about it. So that’s the job of an Area Border Router.

Moving on, looking at the first picture, there is a term standing on the right side named as “Autonomous System Boundary Router”. Confusing term isn’t it? Autonomous System Boundary Router connects you to networks outside of your own, such as the Internet. You can see it as a boundary of the OSPF network as a whole. Big concept is about to be thrown at you.

~ ABR and ASBR are the only routers in OSPF that can summarize. ~

The rest of the routers inside an area can’t summarize. They don’t have this ability. The only places where summarization can happen are on the border of an area (ABR) and on the border of the OSPF network as a whole (ASBR). I hope it makes sense. Keep in mind they don’t become ABR and ASBR by themselves. We, as network admins, configure them that way.

Two more design rules worth mentioning:

  1. Area 0 is always the first area you create. When you first start building your network, you have to start with area 0, aka the Backbone Area, which goes along with the second rule.

  2. All other areas have to connect directly to the backbone. Meaning (have a look at the second picture), Area 2 has to have a router that has one interface in the backbone and one interface in area 2. Same goes for area 1’s router.
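The second rule is easy to check on paper. Here is a tiny sketch of it, where each router is described only by the set of areas its interfaces sit in. The router names and the topology are invented for the example, and the function only checks the rule exactly as stated above.

```python
def areas_missing_backbone_link(router_areas):
    """Return the areas that have no router with an interface in area 0.

    router_areas maps a router name to the set of area numbers
    in which that router has at least one interface.
    """
    all_areas = set().union(*router_areas.values())
    attached = set()
    for areas in router_areas.values():
        if 0 in areas:
            # This router touches the backbone, so every area it sits in
            # is directly connected to area 0.
            attached |= areas
    return sorted(all_areas - attached)

# Hypothetical topology: R4 hangs area 3 off area 2 instead of the backbone.
topology = {"R1": {0, 1}, "R2": {0}, "R3": {0, 2}, "R4": {2, 3}}
print(areas_missing_backbone_link(topology))  # [3]
```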

In conclusion, an OSPF multi-area environment is created solely for the sake of summarization. You would not want to break your network into multiple areas if you didn’t have the need for summarization. I know I’m repeating the word “summarization” quite a lot but it’s an important term to grasp, that’s why.


Solidifying The OSPF Neighbor Relationship

Let’s talk about neighbor relationships now. This is the key in order to understand how OSPF works and how to troubleshoot it if something goes wrong. More than 95% of the time, OSPF troubleshooting deals with the neighbor relationship because if 2 routers can become neighbors, then there is a high chance of them exchanging routes. There are a couple of steps before that happens though.

STEP 1

  • Determine their Router-ID

If you remember from the last article, R1 is going to send a hello message in order to form a neighbor relationship. Before it ever does that, it has to identify itself by picking its router ID. I know, confusing sentence once again but I’m here to explain it to you. What is the router ID? It’s the router’s name in the OSPF process. Think of yourself: while saying hello to somebody, most of the time you say “Hello, my name is x1337x.” The router’s name is a 32-bit number written in the same dotted format as an IP address, so it can even be one of the router’s own IP addresses. The command to configure the ID, on a Cisco device at least, is router-id followed by the ID you want. However, if you don’t type in a router ID, the default option (talking again about Cisco devices) is the highest IP address on a loopback interface or, if there is no loopback, the highest IP address on an active interface when OSPF starts. What do I mean by the second part of the last sentence? If your router has no loopback and its interfaces are assigned the IPs 10.0.0.1, 172.20.0.3, 192.168.1.4, it’s going to pick 192.168.1.4 as its router ID.
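Here is the selection order as a small sketch: a manually configured ID wins, otherwise the highest loopback IP, otherwise the highest IP on an active interface. pick_router_id and its parameters are names I made up for illustration; this mimics the order of preference and nothing more.

```python
import ipaddress

def pick_router_id(configured=None, loopbacks=(), active_interfaces=()):
    """Pick a router ID: configured ID, else highest loopback IP,
    else highest IP address on an active interface."""
    if configured is not None:
        return configured
    for candidates in (loopbacks, active_interfaces):
        if candidates:
            # Compare as addresses, not as strings, so 192.x beats 10.x.
            return str(max(ipaddress.ip_address(ip) for ip in candidates))
    raise ValueError("no router ID could be determined")

interfaces = ["10.0.0.1", "172.20.0.3", "192.168.1.4"]
print(pick_router_id(active_interfaces=interfaces))  # 192.168.1.4
# A loopback wins even if its address is numerically lower.
print(pick_router_id(loopbacks=["1.1.1.1"], active_interfaces=interfaces))  # 1.1.1.1
```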

STEP 2

  • Add interfaces to the Link State database (network command)

Let me refresh your memory about the Cisco “network” command.

  1. Identifies which interface(s) to use to send hello packets.
  2. Identifies which network(s) to advertise.

So if I want to send hello packets on the 172.168.1 interface and at the same time advertise that network out of the other interfaces, I have to type in the Cisco IOS: network 172.168.1.0 0.0.0.255 area 0. The 0.0.0.255 part is a wildcard mask, the inverse of a subnet mask. And in case we want to advertise the 10.0.1 network to R2, we would have to type: network 10.0.1.0 0.0.0.255 area 0.
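If the wildcard mask looks weird, this sketch shows how an interface address is matched against a network statement: bits that are 0 in the wildcard have to match, bits that are 1 are ignored. The function name and the interface addresses are made up for the example.

```python
import ipaddress

def matches_network_statement(interface_ip, network, wildcard):
    """Return True if interface_ip matches the network/wildcard pair.

    Bits set to 0 in the wildcard must match; bits set to 1 are ignored.
    """
    ip = int(ipaddress.ip_address(interface_ip))
    net = int(ipaddress.ip_address(network))
    # Invert the wildcard to get the bits we actually care about.
    care_bits = ~int(ipaddress.ip_address(wildcard)) & 0xFFFFFFFF
    return (ip & care_bits) == (net & care_bits)

# network 172.168.1.0 0.0.0.255 area 0
print(matches_network_statement("172.168.1.1", "172.168.1.0", "0.0.0.255"))  # True
print(matches_network_statement("10.0.1.1", "172.168.1.0", "0.0.0.255"))     # False
```

An interface that matches gets hellos sent out of it and has its network advertised, which are the two jobs of the network command listed above.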

STEP 3

  • Send hello messages on chosen interfaces.

This step includes quite a lot of info and I mentioned some of them in the previous article. But here is a bigger list:

  1. Router ID
  2. Hello and Dead Timers *
  3. Network Mask *
  4. Area ID *
  5. Neighbors
  6. Router Priority
  7. DR / BDR IP Address
  8. Authentication Password *

That’s all I can remember right now. The asterisks mean that those MUST match between the routers.

Think of the hello message as an envelope and when the router receives it, it pulls out the piece of paper that is inside the envelope and reads all the info. Let’s focus on the DR / BDR right now because there is a tiny theory behind those. Those? Yes those! DR / BDR are the designated and backup designated router respectively. Let’s say we have the 0x00sec company that looks like this:

Imagine all those routers representing different offices in different cities, all connected to the central building of 0x00sec. Of course, there will be a bunch of routers in the 0x00sec office as well. Let’s have a closer look, shall we?

Imagine those lines that are drawn on the routers representing the interfaces for the external 0x00sec offices. There is a problem though. Without the DR / BDR concept, all these routers will form neighbor relationships with each other. You may be asking “Isn’t that what we want?” Well, sure, but what if one of the links goes down? The router is going to send an update to all of his neighbors: “Dude, office here in freaking Iceland, I’m down!” When the routers receive this message, they will update their neighbors and so on. You can see where I’m going with this. All of them are going to be like “aaah office down” “aaah office down”. C-h-a-o-s.

In order to prevent this, there is the DR / BDR concept, where one router will be elected as the designated router and another one as the backup designated router. This way, when something goes wrong, the routers are going to send their updates ONLY to the DR and BDR. Thus, the rest of the routers won’t be full neighbors with each other. They will see each other but they will think “you know what? there is a DR.” Meaning, everybody will form a full neighbor relationship only with the DR and the BDR. So when a link goes down, the router will send its update to the DR and then the DR will get that message out to the rest of the peeps, therefore stopping the chaos I mentioned before. So the router priority that was in the list above decides who is going to be the DR and BDR. The one with the highest priority becomes the DR, and if the priorities are tied, the highest router ID wins.
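Here is a simplified sketch of the idea: rank the routers by priority, break ties with the highest router ID, and skip routers with priority 0, which never become DR or BDR. The Router type, the example IDs and the priorities are invented for illustration. The real election has more rules than this. For example, a DR that is already elected is not kicked out when a router with a higher priority shows up later. The second function counts how many full neighbor relationships exist with and without the DR / BDR concept.

```python
import ipaddress
from typing import NamedTuple

class Router(NamedTuple):
    router_id: str
    priority: int = 1

def elect_dr_bdr(routers):
    """Return (DR, BDR) router IDs for one shared segment (simplified)."""
    eligible = [r for r in routers if r.priority > 0]  # priority 0 never wins
    ranked = sorted(
        eligible,
        key=lambda r: (r.priority, int(ipaddress.ip_address(r.router_id))),
        reverse=True,
    )
    dr = ranked[0].router_id if ranked else None
    bdr = ranked[1].router_id if len(ranked) > 1 else None
    return dr, bdr

def full_adjacencies(n, with_dr=True):
    """Number of full neighbor relationships among n routers on a segment."""
    if n < 2:
        return 0
    if not with_dr:
        return n * (n - 1) // 2  # everybody is fully adjacent with everybody
    return 2 * (n - 2) + 1       # each other router <-> DR and BDR, plus DR <-> BDR

routers = [Router("1.1.1.1"), Router("2.2.2.2"),
           Router("3.3.3.3", priority=100), Router("4.4.4.4", priority=0)]
print(elect_dr_bdr(routers))                 # ('3.3.3.3', '2.2.2.2')
print(full_adjacencies(6, with_dr=False))    # 15
print(full_adjacencies(6))                   # 9
```

With 6 routers the difference is small, but the full mesh grows with the square of the number of routers while the DR / BDR version grows in a straight line.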

If you are still confused about this concept don’t worry, it’s advanced. Just focus on the high level view of it.

STEP 4

  • Receive Hello

The router receives the hello message and checks whether it is compatible with the sender on the fields below:

  1. Check hello / dead timers
  2. Check subnet masks
  3. Check authentication
  4. Check area ID

IF AND ONLY IF they are compatible he will move to the next step.
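The check can be sketched like this. The Hello class and its field names are made up and simply mirror the list from step 3, not the real packet layout on the wire. The values in the example are arbitrary.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Hello:
    router_id: str
    hello_interval: int
    dead_interval: int
    network_mask: str
    area_id: int
    auth_password: str
    neighbors: tuple = ()

# The fields marked with an asterisk in the list from step 3.
MUST_MATCH = ("hello_interval", "dead_interval", "network_mask",
              "area_id", "auth_password")

def mismatches(mine, received):
    """Return the names of the must-match fields that differ."""
    return [f for f in MUST_MATCH if getattr(mine, f) != getattr(received, f)]

r1 = Hello("1.1.1.1", 10, 40, "255.255.255.0", 0, "secret")
r2 = Hello("2.2.2.2", 10, 30, "255.255.255.0", 0, "secret")
print(mismatches(r1, r2))  # ['dead_interval']
```

An empty list means the two routers are compatible and can move on. Anything else is exactly the kind of thing you look for when a neighbor relationship refuses to come up.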

STEP 5

  • Send Hello Reply

Specifically, the router is going to think “Am I listed already as your neighbor in your hello packet? If yes, then I will reset your dead timer. If no, I will add you as my new neighbor.” (Remember, each hello packet includes the neighbors of the router.) So what do I mean with the reset dead timer part? Let’s say the router who sends the message sends a hello every 10 seconds and its dead timer is 30 seconds. If the neighbor router has already formed a relationship with the one who sent the hello message, it is going to say “I will reset your dead timer back to 30 seconds and I will wait for your next hello in 10 seconds so you can prove to me you are still reachable.”
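A minimal sketch of the dead timer logic, using the 10 and 30 second values from the example. It is simplified: it only looks at the router's own neighbor table, the time is passed in as a plain number of seconds, and the function names are made up.

```python
def process_hello(neighbors, sender_id, dead_interval, now):
    """Add the sender as a new neighbor or reset its dead timer.

    neighbors maps a router ID to the time (in seconds) at which
    that neighbor will be declared dead if no further hello arrives.
    """
    action = "reset" if sender_id in neighbors else "added"
    neighbors[sender_id] = now + dead_interval
    return action

def expire_dead(neighbors, now):
    """Remove and return the neighbors whose dead timer has run out."""
    dead = [rid for rid, deadline in neighbors.items() if now >= deadline]
    for rid in dead:
        del neighbors[rid]
    return dead

neighbors = {}
print(process_hello(neighbors, "2.2.2.2", 30, now=0))   # added
print(process_hello(neighbors, "2.2.2.2", 30, now=10))  # reset (dead at t=40 now)
print(expire_dead(neighbors, now=35))                   # []
print(expire_dead(neighbors, now=40))                   # ['2.2.2.2']
```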

Wow, this is insane. I’m looking back to what I wrote and there is so much more new info thrown at you. I think it’s already too much to handle so I will stop the steps here or maybe add them later if I see that you guys can take it. Either way, as you can see OSPF is huge but it’s also quite awesome and secure. I hope when you read that sentence you have understood a bit of what I’m saying at least if not all. I hope this article has been informative and I thank you for taking the time to read it. As always post your questions down below or PM me.

P.S. The article is quite lengthy and I did my best to correct as many syntax mistakes as possible. If you find any, don’t hesitate to let me know. English isn’t my main language. Thank you.

Later…

Great job. Well written, and frankly one of the best articles I’ve seen on the subject. Keep the great work up!

Fantastic @airth. Congrats!