OS6360 loop-detection problem

Posted: 24 Mar 2025 08:30
by dzoki
Hi all,

On an Alcatel-Lucent Enterprise OS6360-P24 switch (8.9.94.R04 GA) I configured ports 1/1/1-24 for loop protection:
qos no user-port filter user-port shutdown bpdu
policy port group UserPorts 1/1/1-24
qos apply

and also

loopback-detection enable
loopback-detection port 1/1/1-24 enable

To test this, I connected a dummy switch to one port and then created a loop on the dummy switch. The CPU on the Alcatel goes to 100% even though the port is shut down:

switch-> show interfaces 1/1/23
Chassis/Slot/Port : 1/1/23
Operational Status : down,
Port-Down/Violation Reason: bpdu,
Last Time Link Changed : Sun Jan 18 03:29:30 1970,
Number of Status Change : 26,
Type : Ethernet,
SFP/XFP : N/A,
Interface Type : Copper,
EPP : Disabled,
Link-Quality : N/A,
MAC address : 94:24:e1:e9:f0:1e,
BandWidth (Megabits) : - , Duplex : -,
Autonegotiation : 1 [ 1000-F 100-F 100-H 10-F 10-H ],
Long Frame Size(Bytes) : 1552,
Inter Frame Gap(Bytes) : 12,
loopback mode : N/A,
Rx :
Bytes Received : 21801379813, Unicast Frames : 2,
Broadcast Frames: 54881206, M-cast Frames : 283965374,
UnderSize Frames: 0, OverSize Frames: 0,
Lost Frames : 0, Error Frames : 4,
CRC Error Frames: 0, Alignments Err : 0,
Tx :
Bytes Xmitted : 71713, Unicast Frames : 0,
Broadcast Frames: 262, M-cast Frames : 566,
UnderSize Frames: 0, OverSize Frames: 0,
Lost Frames : 0, Collided Frames: 0,
Error Frames : 0, Collisions : 0,
Late collisions : 0, Exc-Collisions : 0

Which command shuts a port down permanently when a loop is detected, without attempting to bring it back up?
I solved this temporarily with the command interfaces 1/1/23 admin-state disable, but this is not an acceptable solution.
Thank you all.
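
To put the Rx counters from the show interfaces 1/1/23 output into proportion, here is a minimal Python sketch. It uses only the numbers copied from that output; the variable names are made up for illustration.

```python
# Rx counters copied from the 'show interfaces 1/1/23' output in this post.
rx_unicast = 2
rx_broadcast = 54_881_206
rx_multicast = 283_965_374
rx_bytes = 21_801_379_813

# Total received frames and the share that is broadcast or multicast.
total_frames = rx_unicast + rx_broadcast + rx_multicast
flood_share = (rx_broadcast + rx_multicast) / total_frames

# Average received frame size in bytes.
avg_frame_size = rx_bytes / total_frames

print(f"total Rx frames: {total_frames}")
print(f"broadcast + multicast share: {flood_share:.6%}")
print(f"average Rx frame size: {avg_frame_size:.1f} bytes")
```

Practically every received frame is broadcast or multicast, and the average size is only slightly above 64 bytes, which would be consistent with a flood of small frames coming from the looped switch.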

Re: OS6360 loop-detection problem

Posted: 24 Mar 2025 08:37
by Cristek

violation recovery-maximum 0
violation recovery-time 0
The 'maximum' is 10 by default (so the port would only recover 10 times).
The 'time' is 300s by default (the port recovers after 300s, as long as it hasn't exceeded those 10 recoveries).
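
As a rough illustration of how the two settings interact, here is a toy Python model of the behaviour described in this post. It is only a sketch and not AOS code: the class and method names are hypothetical, the defaults (10 and 300s) are the ones quoted above, and it takes at face value the statement that a maximum of 0 means the port never recovers. It also assumes the port goes straight back into violation after every recovery.

```python
from dataclasses import dataclass


@dataclass
class RecoveryPolicy:
    """Toy model of the recovery settings as described in this post."""
    recovery_maximum: int = 10  # default quoted in this post
    recovery_time: int = 300    # seconds, default quoted in this post

    def recovery_times(self, violation_at: int, horizon: int) -> list[int]:
        """Return the times (in seconds) at which the port would be
        auto-recovered, assuming it re-enters violation immediately
        after each recovery."""
        times = []
        t = violation_at
        while len(times) < self.recovery_maximum:
            t += self.recovery_time
            if t > horizon:
                break
            times.append(t)
        return times


default_policy = RecoveryPolicy()
never_recover = RecoveryPolicy(recovery_maximum=0, recovery_time=30)

# One hour after a violation at t=0.
print(default_policy.recovery_times(0, 3600))
print(never_recover.recovery_times(0, 3600))
```

With the quoted defaults the model stops after ten recoveries; with a maximum of 0 the list stays empty whatever the timer is set to.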

Re: OS6360 loop-detection problem

Posted: 24 Mar 2025 08:48
by dzoki
@Cristek Thank you for the commands.
I tried these commands:
switch-> violation recovery-maximum 0
switch-> violation recovery-time 0
ERROR: Retry time should be in the range 30-600.
switch-> violation recovery-time 30

I saved and tried again, but the CPU on the switch is still at 100% and every service on the switch is stuck. So these commands don't solve the problem.

Re: OS6360 loop-detection problem

Posted: 24 Mar 2025 08:58
by Cristek
The 'recovery time' doesn't really matter because you told the switch to never recover at all.
Still, those commands work perfectly for me, so I'm guessing maybe it's because you are a few versions behind?

Fcc->
Fcc-> 
Fcc-> sh sy
System:
  Description:  Alcatel-Lucent Enterprise OS6360-P48X 8.9.221.R03 GA, October 12, 2023.,
  Object ID:    1.3.6.1.4.1.6486.801.1.1.2.1.16.1.9,
  Up Time:      224 days 3 hours 39 minutes and 23 seconds,
  Contact:      Alcatel-Lucent Enterprise, https://www.al-enterprise.com,
  Name:         Fcc,
  Location:     Unknown,
  Services:     78,
  Date & Time:  MON MAR 24 2025 12:52:33 (BST)
Flash Space:
    Primary CMM:
      Available (bytes):  609288192,
      Comments         :  None

Fcc->
Fcc-> 
Fcc-> 
Fcc-> sh con sn port-manager 
! Port_Manager: 
violation recovery-maximum infinite
violation recovery-time 600

Fcc-> 
Fcc-> 
Fcc-> 
But try upgrading to the latest version as you are like 3 or 4 versions behind. Maybe that's a weird bug or something.

EDIT: I just realized that the site I copy-pasted this from uses a different config than yours. But as long as you set your 'recovery-maximum' to 0, the port never recovers, no matter the timer.

Re: OS6360 loop-detection problem

Posted: 24 Mar 2025 09:04
by dzoki
switch-> sh system
System:
Description: Alcatel-Lucent Enterprise OS6360-P24 8.9.94.R04 GA, March 28, 2024.,
Object ID: 1.3.6.1.4.1.6486.801.1.1.2.1.16.1.5,
Up Time: 0 days 5 hours 28 minutes and 37 seconds,
Contact: Alcatel-Lucent Enterprise, https://www.al-enterprise.com,
Name: switch,
Location: unknown
Services: 78,
Date & Time: SUN JAN 18 1970 04:07:45 (UTC)
Flash Space:
Primary CMM:
Available (bytes): 699023360,
Comments : None

switch-> sh con sn port-manager
! Port_Manager:
violation recovery-maximum 0
violation recovery-time 30

I will try, but your version is older than the one on my switch :)

Re: OS6360 loop-detection problem

Posted: 24 Mar 2025 09:08
by Cristek
Anything in the logs that points you in the right direction?

'show log events/swlog' may give you additional info.
When you physically remove the cable from the switch once the port is down, does the performance go back to normal?
Other than this, no ideas.

Re: OS6360 loop-detection problem

Posted: 24 Mar 2025 09:44
by dzoki
This is the log:

1970 Jan 18 04:43:43.589 switch swlogd portMgrCmm main EVENT: CUSTLOG CMM Port 1/1/23 violation cleared - reason Admin Up/Down
1970 Jan 18 04:43:43.590 switch swlogd portMgrCmm main INFO: pvr trap: Violation clear, chass 1, slot 1, port 23: source --, reason --
1970 Jan 18 04:43:43.591 switch swlogd portMgrCmm main INFO: : [pmRemoveViolationOnGPort:3973] Violation on gport 22 Removed
1970 Jan 18 04:43:43.594 switch swlogd intfNi Drv INFO: niEsmSendLinkStatusChgMsg(1153): linkstatus UP sent on peerId=1
1970 Jan 18 04:43:43.594 switch swlogd stpNi _SOKt INFO: stpnimsg_processMsgFromPM: PM_LINK_STATUS_MSGID gPort=x16 linkStatus=1
1970 Jan 18 04:43:43.595 switch swlogd intfCmm Mgr EVENT: CUSTLOG CMM Link 1/1/23 operationally up
1970 Jan 18 04:43:43.613 switch swlogd intfCmm Mgr INFO: esmSetRateLimit: Txing limit=49, trafficType=0, limitType=1 to zslot=0,
1970 Jan 18 04:43:43.613 switch swlogd intfCmm Mgr INFO: esmSendConfMsg: chassis 1 zslot 0 zport 22 Txed conf msgId:393231
1970 Jan 18 04:43:43.613 switch swlogd intfCmm Mgr INFO: esmSetRateLimit: Txing limit=49, trafficType=1, limitType=1 to zslot=0,
1970 Jan 18 04:43:43.613 switch swlogd intfCmm Mgr INFO: esmSendConfMsg: chassis 1 zslot 0 zport 22 Txed conf msgId:393231
1970 Jan 18 04:43:43.613 switch swlogd intfCmm Mgr INFO: esmSetRateLimit: Txing limit=49, trafficType=2, limitType=1 to zslot=0,
1970 Jan 18 04:43:43.614 switch swlogd intfCmm Mgr INFO: esmSendConfMsg: chassis 1 zslot 0 zport 22 Txed conf msgId:393231
1970 Jan 18 04:43:43.614 switch swlogd intfCmm Mgr INFO: cmmEsmHandleNiMsg: Rx CMM_ESM_LINK_STATUS_CHG from chassis 1 NI 1
1970 Jan 18 04:43:43.614 switch swlogd intfNi Drv INFO: niEsmGetEvent:RX CMM_ESMDRV_CMD_CONFIG_MSGID
1970 Jan 18 04:43:43.614 switch swlogd intfNi Drv INFO: eniApplyConfig: cmd:21 zport:22 cmdHy:0, apMedia:2
1970 Jan 18 04:43:43.614 switch swlogd intfNi Drv INFO: niEsmGetEvent:RX CMM_ESMDRV_CMD_CONFIG_MSGID
1970 Jan 18 04:43:43.614 switch swlogd intfNi Drv INFO: eniApplyConfig: cmd:28 zport:22 cmdHy:0, apMedia:2
1970 Jan 18 04:43:43.614 switch swlogd intfNi Drv INFO: niEsmGetEvent:RX CMM_ESMDRV_CMD_CONFIG_MSGID
1970 Jan 18 04:43:43.614 switch swlogd intfNi Drv INFO: eniApplyConfig: cmd:34 zport:22 cmdHy:0, apMedia:2
1970 Jan 18 04:43:48.478 switch swlogd healthCmm main EVENT: CUSTLOG CMM Port 1/1/23 rising above receive threshold.
1970 Jan 18 04:43:58.486 switch swlogd healthCmm main EVENT: CUSTLOG CMM Port 1/1/23 falling below receive threshold.
1970 Jan 18 04:46:10.252 switch swlogd stpNi _SOKt INFO: stpnimsg_processMsgFromPM: PM_LINK_STATUS_MSGID gPort=x16 linkStatus=0
1970 Jan 18 04:46:10.254 switch swlogd intfNi Drv INFO: niEsmSendLinkStatusChgMsg(1153): linkstatus DOWN sent on peerId=1
1970 Jan 18 04:46:10.257 switch swlogd intfCmm Mgr EVENT: CUSTLOG CMM Link 1/1/23 operationally down
1970 Jan 18 04:46:10.289 switch swlogd intfCmm Mgr INFO: cmmEsmHandleNiMsg: Rx CMM_ESM_LINK_STATUS_CHG from chassis 1 NI 1
1970 Jan 18 04:46:10.290 switch swlogd portMgrCmm main EVENT: CUSTLOG CMM Port 1/1/23 in violation - source 2 reason bpdu
1970 Jan 18 04:46:18.599 switch swlogd healthCmm main EVENT: CUSTLOG CMM NI 1/1 rising above CPU threshold.
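
To make the sequence of events easier to follow, the EVENT-level CUSTLOG lines can be pulled out of the log above. This is a minimal Python sketch: the regular expression is only an assumption based on the line layout visible in this excerpt, and the function name is made up for illustration.

```python
import re

# Lines copied from the log above (seven EVENT lines plus one INFO line).
LOG = """\
1970 Jan 18 04:43:43.589 switch swlogd portMgrCmm main EVENT: CUSTLOG CMM Port 1/1/23 violation cleared - reason Admin Up/Down
1970 Jan 18 04:43:43.590 switch swlogd portMgrCmm main INFO: pvr trap: Violation clear, chass 1, slot 1, port 23: source --, reason --
1970 Jan 18 04:43:43.595 switch swlogd intfCmm Mgr EVENT: CUSTLOG CMM Link 1/1/23 operationally up
1970 Jan 18 04:43:48.478 switch swlogd healthCmm main EVENT: CUSTLOG CMM Port 1/1/23 rising above receive threshold.
1970 Jan 18 04:43:58.486 switch swlogd healthCmm main EVENT: CUSTLOG CMM Port 1/1/23 falling below receive threshold.
1970 Jan 18 04:46:10.257 switch swlogd intfCmm Mgr EVENT: CUSTLOG CMM Link 1/1/23 operationally down
1970 Jan 18 04:46:10.290 switch swlogd portMgrCmm main EVENT: CUSTLOG CMM Port 1/1/23 in violation - source 2 reason bpdu
1970 Jan 18 04:46:18.599 switch swlogd healthCmm main EVENT: CUSTLOG CMM NI 1/1 rising above CPU threshold.
"""

# Layout assumed from the excerpt:
# <year> <month> <day> <time>.<ms> <host> swlogd <app> <subapp> EVENT: CUSTLOG CMM <message>
PATTERN = re.compile(
    r"^\d{4} \w{3} +\d{1,2} (?P<time>\d{2}:\d{2}:\d{2})\.\d+ "
    r"\S+ swlogd \S+ \S+ EVENT: CUSTLOG CMM (?P<msg>.+)$"
)


def custlog_events(text):
    """Return (time, message) for every EVENT-level CUSTLOG line."""
    events = []
    for line in text.splitlines():
        m = PATTERN.match(line)
        if m:
            events.append((m.group("time"), m.group("msg")))
    return events


events = custlog_events(LOG)
for ts, msg in events:
    print(ts, msg)
```

Read this way, the port comes up at 04:43:43 after the manual admin up/down, the bpdu violation takes the link down at 04:46:10, and the CPU threshold event is logged at 04:46:18, after the link is already reported down.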

Re: OS6360 loop-detection problem

Posted: 24 Mar 2025 09:45
by dzoki
Cristek wrote: 24 Mar 2025 09:08 Anything in the logs that points you in the right direction?

'show log events/swlog' may give you additional info.
When you physically remove the cable from the switch once the port is down, does the performance go back to normal?
Other than this, no ideas.

Yes, everything is back to normal then.

Re: OS6360 loop-detection problem

Posted: 24 Mar 2025 13:06
by silvio
dzoki wrote: 24 Mar 2025 08:30 Which command shuts a port down permanently when a loop is detected, without attempting to bring it back up?
I solved this temporarily with the command interfaces 1/1/23 admin-state disable, but this is not an acceptable solution.

To check the violation use:

show violation
You can end the shutdown (including a permanent one) with:

> clear violation port ...
If the port is down, then the CPU should no longer be at 100%.
The permanent shutdown you have configured with "violation recovery-maximum x" applies to user-ports and a lot of other features, but not to LBD.
I prefer to set the transmission timer for LBD to the lowest value (5 sec).
BR Silvio

Re: OS6360 loop-detection problem

Posted: 26 Mar 2025 07:43
by dzoki
Hi all,

I solved the loop-detection issue by turning off the policy port group UserPorts.
After turning it off, the qos config now looks like this:
qos no user-port filter user-port shutdown bpdu
qos apply


Thank you.