Slow LSP failover on down links
Posted: 19 Aug 2015 13:43
by jwatki
Hey guys,
I have a core network with multiple 7750s. I recently had an issue where the link between two of them went down, and the failover was either too slow or didn't occur at all; traffic simply came back when the failed link was restored. Everything I read says 50 ms switchover, but mine took over 2 seconds, slow enough for one of my major customers (a cell tower) to see it and tell me before I even knew about it. The only thing I can find that I might be missing on my LSPs is the "adaptive" command. I'm not sure that is the cause, and these are production routers, so I can't go "testing" without a lot of headache and overnight work.

Any suggestions as to what I'm missing? Let me know if you need to see a different part of the config.
Currently running C-10.0.R5
Sample of my config:
Code:
mpls
    path "igp"
        no shutdown
    exit
    path "follow-igp"
        no shutdown
    exit
    lsp "BSRA01-BSRA02"
        to 172.xx.xx.x
        cspf
        fast-reroute facility
        exit
        primary "igp"
        exit
        no shutdown
Re: Slow LSP failover on down links
Posted: 20 Aug 2015 03:25
by zeips
Hi,
The adaptive option is enabled by default. Can you verify that any protection actually exists for your LSP? If failover didn't occur at all, maybe there are no bypass tunnels; in that case, when the link fails, the LSP has to be recalculated (after waiting for the IGP) and is eventually re-established through a new path. Anyway, check this: show router mpls lsp "BSRA01-BSRA02" path detail
This command shows a lot: whether any protection exists for this LSP and, if so, whether that protection was ever used. During a failure you should also see the node that reports the failure, and so on.
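If you save that output to a file, the first few things to look at can be pulled out with a small script. This is only a rough sketch in Python, assuming the flattened "Key : Value Key : Value" layout the command prints; the helper names (first_field, protection_summary) and the sample values are made up for illustration. Note that these fields only show that FRR was requested on the LSP. Whether a bypass was actually signalled shows up in the flags on the Actual Hops lines.

```python
import re

# Illustrative sample in the layout printed by
# `show router mpls lsp "<name>" path detail` (values are assumptions).
SAMPLE = """\
FRR : Enabled Oper FRR : Enabled
FRR NodePro*: Enabled Oper FRR NP : Enabled
Failure Code: noError Failure Node: n/a
"""


def first_field(text, key):
    """Return the value that follows `key` at the start of a line, or None."""
    pattern = r"^" + re.escape(key) + r"\s*:\s*(\S+)"
    match = re.search(pattern, text, re.MULTILINE)
    return match.group(1) if match else None


def protection_summary(text):
    """Collect the fields that say whether FRR was requested and if a failure is logged."""
    return {
        "frr": first_field(text, "FRR"),
        "node_protect": first_field(text, "FRR NodePro*"),
        "failure_code": first_field(text, "Failure Code"),
    }


if __name__ == "__main__":
    print(protection_summary(SAMPLE))
```

A failure_code other than noError, or an frr value of Disabled, would be the first thing to chase.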
Re: Slow LSP failover on down links
Posted: 25 Aug 2015 16:57
by jwatki
Thanks for that command, zeips. Here is the output for the LSP on which the link failed. It shows the MBB event on 8/19, which is when I had problems. Do you see anything out of the ordinary, or anything missing?
Code:
Adm State : Up Oper State : Up
Path Name : igp Path Type : Primary
Path Admin : Up Path Oper : Up
OutInterface: lag-102:0 Out Label : 261063
Path Up Time: 259d 15:12:22 Path Dn Time: 0d 00:00:00
Retry Limit : 0 Retry Timer : 30 sec
RetryAttempt: 0 NextRetryIn : 0 sec
Adspec : Disabled Oper Adspec : Disabled
CSPF : Enabled Oper CSPF : Enabled
CSPF-FL : Disabled Oper CSPF-FL: Disabled
Least Fill : Disabled Oper LeastF*: Disabled
FRR : Enabled Oper FRR : Enabled
FRR NodePro*: Enabled Oper FRR NP : Enabled
FR Hop Limit: 16 Oper FRHopL*: 16
Prop Adm Grp: Disabled Oper PropAG : Disabled
Neg MTU : 9190 Oper MTU : 9190
Bandwidth : No Reservation Oper Bw : 0 Mbps
Hop Limit : 255 Oper HopLim*: 255
Record Route: Record Oper RecRou*: Record
Record Label: Record Oper RecLab*: Record
SetupPriori*: 7 Oper SetupP*: 7
Hold Priori*: 0 Oper HoldPr*: 0
Class Type : 0 Oper CT : 0
Backup CT : None
MainCT Retry: n/a
Rem :
MainCT Retry: 0
Limit :
Include Grps: Oper InclGr*:
None None
Exclude Grps: Oper ExclGr*:
None None
Adaptive : Enabled Oper Metric : 4
Preference : n/a
Path Trans : 14 CSPF Queries: 37830
Failure Code: noError Failure Node: n/a
ExplicitHops:
No Hops Specified
Actual Hops :
172.xx @ n Record Label : N/A
-> 172.xx @ Record Label : 261063
-> 172.xx Record Label : 260988
ComputedHops:
172.xx -> 172.xx -> 172.xx
ResigEligib*: False
LastResignal: 08/25/2015 16:42:35 CSPF Metric : 4
Last MBB :
MBB Type : TimerBasedResignal MBB State : Success
Ended At : 08/19/2015 03:55:48 Old Metric : 8
Signaled BW: 0 Mbps
Re: Slow LSP failover on down links
Posted: 28 Aug 2015 03:53
by zeips
It looks fine actually:
Actual Hops :
172.xx @ n Record Label : N/A
-> 172.xx @ Record Label : 261063
-> 172.xx Record Label : 260988
You have both link and node protection at your head end, so signaling of the bypass tunnel was successful. The MBB was successful as well.
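For anyone who wants to run the same check across many LSPs, here is a rough Python sketch that reads the Actual Hops lines and lists hops with no protection flag. It assumes the flag meanings from the legend the command prints ("@" detour/bypass available, "#" detour/bypass in use, "n" node protected), so check the legend on your own release. The function names are made up for illustration.

```python
import re

# The three hop lines quoted above, in the same layout.
SAMPLE = """\
172.xx @ n Record Label : N/A
-> 172.xx @ Record Label : 261063
-> 172.xx Record Label : 260988
"""

# Optional "->", hop address, optional flags, then the recorded label.
HOP = re.compile(r"^(?:->\s*)?(\S+)\s*(.*?)\s*Record Label\s*:\s*(\S+)$")


def parse_hops(text):
    """Return a list of (address, [flags], label) tuples, one per hop line."""
    hops = []
    for line in text.splitlines():
        match = HOP.match(line.strip())
        if match:
            address, flags, label = match.groups()
            hops.append((address, flags.split(), label))
    return hops


def unprotected_hops(hops):
    """Hops without '@' or '#'. The last hop is skipped: it is the egress,
    so there is no downstream link or node left to protect."""
    return [
        address
        for address, flags, _label in hops[:-1]
        if "@" not in flags and "#" not in flags
    ]


if __name__ == "__main__":
    hops = parse_hops(SAMPLE)
    print(hops)
    print("unprotected:", unprotected_hops(hops))
```

On the three lines above this reports no unprotected hops, which matches reading the flags by eye.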
So MPLS looks fine, but this doesn't prove there weren't any issues. Maybe the problem occurred somewhere at the service layer or in the application itself. It would be good to test this in a lab and check everything step by step.
If anything else comes to mind, I will let you know.