Funky Network Error

ShaneF

Post by ShaneF »

Hey Guys,

I have these incidents from the log on my OXE R6 which happened last week just before the system rebooted itself - could anyone assist?

Also, is there a command to be able to see what processes are running - there's another issue where the 4760 doesn't seem to get all the tickets now so the logs are incomplete and useless.

Thanks :D

22/06/06 08:25:41 000001M|02/01/-/---|=3:4401=Ethernet broadcast reception disabled due to excessive traffic
22/06/06 08:25:41 000001M|01/00/-/---|=3:4401=Ethernet broadcast reception disabled due to excessive traffic
22/06/06 08:25:42 000001M|--/--/-/---|=3:1566=Ethernet broadcast reception disabled due to excessive traffic
22/06/06 08:25:57 000001M|--/--/-/---|=3:1566x002=Ethernet broadcast reception disabled due to excessive traffic
22/06/06 08:25:58 000001M|01/00/-/---|=5:0409=The inter-ACT link over IP from (1 9 1) is up
22/06/06 08:26:04 000001M|--/--/-/---|=0:1600=IO1 driver error, 255 No C1 wrong in output
22/06/06 08:26:09 000001M|01/--/-/---|=2:2043=Loss of the 1 CRYSTAL
22/06/06 08:26:09 000001M|01/00/-/---|=2:2042=Loss of a GD type cpl
22/06/06 08:26:09 000001M|01/01/-/---|=2:2042=Loss of a UA type cpl
22/06/06 08:26:09 000001M|01/02/-/---|=2:2042=Loss of a UA type cpl
22/06/06 08:26:09 000001M|01/03/-/---|=2:2042=Loss of a UA type cpl
22/06/06 08:26:09 000001M|01/04/-/---|=2:2042=Loss of a UA type cpl
22/06/06 08:26:09 000001M|01/06/-/---|=2:2042=Loss of a PRA type cpl
22/06/06 08:26:09 000001M|--/--/-/---|=2:2140=Alarm : TRUNK resources quantity critical
22/06/06 08:26:09 000001M|01/07/-/---|=2:2042=Loss of a PRA type cpl
22/06/06 08:26:09 000001M|01/07/0/000|=4:2085=Access T2, board (1,7) : synchronization stopped
22/06/06 08:26:09 000001M|02/--/-/---|=2:2043=Loss of the 2 CRYSTAL
22/06/06 08:26:09 000001M|03/--/-/---|=2:2043=Loss of the 3 CRYSTAL
22/06/06 08:26:09 000001M|01/27/-/---|=3:2490=Loss of a virtual coupler GPA (1,27) of the associated coupler GD (1,0)
22/06/06 08:26:09 000001M|02/00/-/---|=3:2490=Loss of a virtual coupler GPA (2,0) of the associated coupler MEX (2,0)
22/06/06 08:26:09 000001M|02/01/-/---|=2:2042=Loss of a GA type cpl
22/06/06 08:26:09 000001M|02/02/-/---|=2:2042=Loss of a UA type cpl
22/06/06 08:26:09 000001M|--/--/-/---|=2:2140=Alarm : SET resources quantity critical
22/06/06 08:26:09 000001M|02/03/-/---|=2:2042=Loss of a GA type cpl
22/06/06 08:26:09 000001M|02/04/-/---|=2:2042=Loss of a UA type cpl
22/06/06 08:26:09 000001M|02/05/-/---|=2:2042=Loss of a UA type cpl
22/06/06 08:26:09 000001M|02/06/-/---|=2:2042=Loss of a UA type cpl
22/06/06 08:26:09 000001M|02/07/-/---|=2:2042=Loss of a UA type cpl
22/06/06 08:26:09 000001M|02/08/-/---|=2:2042=Loss of a Z type cpl
22/06/06 08:26:09 000001M|02/09/-/---|=3:2490=Loss of a virtual coupler GPA (2,9) of the associated coupler GA (2,1)
22/06/06 08:26:09 000001M|02/10/-/---|=3:2490=Loss of a virtual coupler GPA (2,10) of the associated coupler GA (2,3)
22/06/06 08:26:10 000001M|03/00/-/---|=3:2490=Loss of a virtual coupler GPA (3,0) of the associated coupler MEX (3,0)
22/06/06 08:26:10 000001M|03/01/-/---|=2:2042=Loss of a UA type cpl
22/06/06 08:26:10 000001M|03/02/-/---|=2:2042=Loss of a UA type cpl
22/06/06 08:26:10 000001M|03/03/-/---|=2:2042=Loss of a UA type cpl
22/06/06 08:26:10 000001M|03/04/-/---|=2:2042=Loss of a UA type cpl
22/06/06 08:26:10 000001M|03/05/-/---|=2:2042=Loss of a UA type cpl
22/06/06 08:26:10 000001M|03/06/-/---|=2:2042=Loss of a UA type cpl
22/06/06 08:26:10 000001M|03/07/-/---|=2:2042=Loss of a UA type cpl
22/06/06 08:26:10 000001M|03/08/-/---|=2:2042=Loss of a UA type cpl
22/06/06 08:26:20 000001M|01/01/0/000|=3:1404=ABCA Wrong MSG 0 IE 1 12 : 4
22/06/06 08:26:26 000001M|--/--/-/---|=3:1404=ABCA Wrong MSG 0 IE 1 12 : 4
22/06/06 08:28:29 000001M|01/00/-/---|=4:0740=Beginning of an INT/IP downloading @:00.80.9f.34.0f.3e (starttscip)
22/06/06 08:28:29 000001M|01/00/-/---|=5:0741=End of downloading of an INT/IP board @:00.80.9f.34.0f.3e (starttscip)
22/06/06 08:28:30 000001M|--/--/-/---|=2:0746=Configuration error during the download of a TSC/IP set
22/06/06 08:28:31 000001M|01/00/-/---|=4:0740=Beginning of an INT/IP downloading @:00.80.9f.34.0f.3e (binmg)
22/06/06 08:28:31 000001M|01/00/-/---|=5:0741=End of downloading of an INT/IP board @:00.80.9f.34.0f.3e (binmg)
22/06/06 08:28:46 000001M|01/00/-/---|=5:0409=The inter-ACT link over IP from (19 1) is up
22/06/06 08:28:47 000001M|01/00/-/---|=0:5857=GD/GA/INTIP/RGD : reason of reboot 2
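
To see at a glance which incident numbers dominate a dump like the one above, the lines can be tallied with a short script. This is only a sketch: the field layout (date, time, node, address, then `=severity:incident=` and free text) is inferred from the posted lines, not taken from Alcatel documentation, and reading the `x002` suffix as a repeat counter is an assumption.

```python
import re
from collections import Counter

# Assumed layout, inferred only from the log lines posted in this thread:
#   date time node|address|=severity:incident[xNNN]=text
LINE_RE = re.compile(
    r"^(?P<date>\d{2}/\d{2}/\d{2}) (?P<time>\d{2}:\d{2}:\d{2}) "
    r"(?P<node>[^|\s]+)\|(?P<address>[^|]*)\|"
    r"=(?P<severity>\d+):(?P<incident>\d+)(?:x(?P<repeat>\d+))?="
    r"(?P<text>.*)$"
)


def parse_incident(line):
    """Return the fields of one incident line as a dict, or None if it does not match."""
    match = LINE_RE.match(line.strip())
    return match.groupdict() if match else None


def summarize(lines):
    """Count how many times each incident number appears."""
    counts = Counter()
    for line in lines:
        record = parse_incident(line)
        if record is not None:
            counts[record["incident"]] += 1
    return counts


sample = [
    "22/06/06 08:25:41 000001M|02/01/-/---|=3:4401=Ethernet broadcast reception disabled due to excessive traffic",
    "22/06/06 08:25:57 000001M|--/--/-/---|=3:1566x002=Ethernet broadcast reception disabled due to excessive traffic",
    "22/06/06 08:26:09 000001M|01/--/-/---|=2:2043=Loss of the 1 CRYSTAL",
    "22/06/06 08:26:09 000001M|02/--/-/---|=2:2043=Loss of the 2 CRYSTAL",
]

# Print incident numbers, most frequent first.
for incident, count in summarize(sample).most_common():
    print(incident, count)
```

The resulting incident numbers are the ones worth looking up first on the PBX itself.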
Ben S

Post by Ben S »

Yikes, doesn't look good at all. You can get more information about the errors from the command line by typing incinfo US0 (or the language you prefer) followed by the error number (2043 is the loss of the CRYSTAL, 1600 is the IO1 driver error, etc.).

In terms of processes running, I'm not 100% sure but you can try the trusty old *nix command - top

Cheers.. Ben
ShaneF

Cheers - one more thing

Post by ShaneF »

Thanks Ben! Gold star for you.

I knew it was *nix based, but I am quite a newbie when it comes to all things *nix at the moment, just installed my first FC5 server at home. Joy...

Anyways, could someone please have a look at these and tell me if there are any that shouldn't be running?

I am the only admin on this system, and I have only included process lines initiated by user mtcl, thinking that someone else has logged in and had a fiddle...

12:39pm up 9 days, 16:34, 1 user, load average: 11.20, 11.14, 11.06
221 processes: 220 sleeping, 1 running, 0 zombie, 0 stopped
CPU states: 6.9% user, 11.0% system, 0.0% nice, 81.9% idle
Mem: 127128K av, 124160K used, 2968K free, 0K shrd, 16596K buff
Swap: 524624K av, 2136K used, 522488K free 55664K cached

PID USER CLS PRI NI SIZE RSS SHARE STAT %CPU %MEM TIME COMMAND
8765 mtcl UNIX 24 0 1208 1208 936 R 16.2 0.9 1:16 top
757 mtcl FIFO 99 -12 892 892 756 S < 0.0 0.7 0:00 #cmisd
758 mtcl UNIX 36 -12 880 880 764 S < 0.0 0.6 0:00 sqlsrv
760 mtcl UNIX 36 -12 892 892 756 S < 0.0 0.7 0:06 cmisd
761 mtcl UNIX 36 -12 892 892 756 S < 0.0 0.7 0:02 cmisd
763 mtcl UNIX 36 -12 540 540 456 S < 0.0 0.4 0:01 remad
770 mtcl FIFO 99 -12 1608 1608 1208 S < 0.0 1.2 0:00 #mailsys
771 mtcl UNIX 34 -11 1608 1608 1208 S < 0.0 1.2 0:15 mailsys
772 mtcl RR 24 -12 1608 1608 1208 S < 0.0 1.2 0:00 mailsys
773 mtcl UNIX 34 -10 1608 1608 1208 S < 0.0 1.2 0:00 mailsys
774 mtcl UNIX 36 -12 1608 1608 1208 S < 0.0 1.2 0:00 mailsys
777 mtcl UNIX 36 -12 572 572 500 S < 0.0 0.4 0:00 gwLinux
779 mtcl UNIX 34 -11 1608 1608 1208 S < 0.0 1.2 0:00 mailsys
780 mtcl UNIX 36 -12 1608 1608 1208 S < 0.0 1.2 0:00 mailsys
781 mtcl UNIX 34 -11 1608 1608 1208 S < 0.0 1.2 0:00 mailsys
880 mtcl UNIX 36 -12 624 624 544 S < 0.0 0.4 0:00 btracer
881 mtcl UNIX 36 -12 552 552 476 S < 0.0 0.4 0:00 blackbox
882 mtcl UNIX 36 -12 680 680 556 S < 0.0 0.5 0:00 mtracer
885 mtcl UNIX 36 -12 688 688 600 S < 0.0 0.5 0:00 dectobsproc
942 mtcl FIFO 99 -12 1044 1044 640 S < 0.0 0.8 0:00 #download
945 mtcl FIFO 99 -12 1708 1708 1400 S < 0.0 1.3 0:00 #abca_serv
947 mtcl UNIX 36 -12 1044 1044 640 S < 0.0 0.8 0:01 download
949 mtcl UNIX 36 -12 1044 1044 640 S < 0.0 0.8 0:00 download
953 mtcl UNIX 36 -12 1044 1044 640 S < 0.0 0.8 0:00 download
956 mtcl FIFO 80 -12 1708 1708 1400 S < 0.0 1.3 0:00 abca_serv
957 mtcl FIFO 78 -6 1708 1708 1400 S < 0.0 1.3 0:00 abca_serv
962 mtcl FIFO 78 -12 1708 1708 1400 S < 0.0 1.3 0:00 except
964 mtcl FIFO 77 -12 1708 1708 1400 S < 0.0 1.3 0:00 sympac
965 mtcl FIFO 78 -12 1708 1708 1400 S < 0.0 1.3 0:00 sbc
966 mtcl FIFO 77 -12 1708 1708 1400 S < 0.0 1.3 0:00 fms
967 mtcl FIFO 77 -12 1708 1708 1400 S < 0.0 1.3 0:09 vms
968 mtcl FIFO 77 -12 1708 1708 1400 S < 0.0 1.3 0:00 blf
969 mtcl FIFO 78 -12 1708 1708 1400 S < 0.0 1.3 0:00 phb
970 mtcl FIFO 78 -12 1708 1708 1400 S < 0.0 1.3 1:12 server
971 mtcl FIFO 77 -12 1708 1708 1400 S < 0.0 1.3 0:00 mao
972 mtcl FIFO 78 -12 1708 1708 1400 S < 0.0 1.3 0:00 popup
973 mtcl FIFO 78 -12 1708 1708 1400 S < 0.0 1.3 0:00 surv
989 mtcl UNIX 36 -12 628 628 552 S < 0.0 0.4 0:02 ml_serv
1007 mtcl FIFO 99 -12 900 900 768 S < 0.0 0.7 0:00 #srv_obs
1009 mtcl UNIX 36 -12 900 900 768 S < 0.0 0.7 0:00 srv_obs
1010 mtcl UNIX 36 -12 900 900 768 S < 0.0 0.7 0:00 srv_obs
1011 mtcl UNIX 36 -12 900 900 768 S < 0.0 0.7 0:00 srv_obs
1012 mtcl UNIX 36 -12 900 900 768 S < 0.0 0.7 0:00 srv_obs
1013 mtcl UNIX 36 -12 900 900 768 S < 0.0 0.7 0:00 srv_obs
1068 mtcl FIFO 99 -12 616 616 532 S < 0.0 0.4 0:00 #sig_h
1069 mtcl FIFO 87 -12 616 616 532 S < 0.0 0.4 0:12 sig_h
1070 mtcl FIFO 75 -6 616 616 532 S < 0.0 0.4 0:00 sig_h
1164 mtcl FIFO 99 -12 1608 1608 1100 S < 0.0 1.2 0:00 #vmail
1165 mtcl FIFO 90 -12 1608 1608 1100 S < 0.0 1.2 0:00 vmail
1166 mtcl FIFO 78 -6 1608 1608 1100 S < 0.0 1.2 0:00 vmail
1167 mtcl FIFO 88 -12 1608 1608 1100 S < 0.0 1.2 0:00 vmail
1194 mtcl FIFO 99 -12 808 808 712 S < 0.0 0.6 0:00 #vnet
1195 mtcl FIFO 99 -12 2340 2340 1384 S < 0.0 1.8 0:02 #internal_ta
1196 mtcl UNIX 36 -12 548 548 480 S < 0.0 0.4 0:00 eaccsrv
1197 mtcl UNIX 36 -12 808 808 712 S < 0.0 0.6 0:00 vnet
1198 mtcl FIFO 99 -12 5704 5704 3844 S < 0.0 4.4 0:00 #maoagent
1199 mtcl UNIX 36 -12 2340 2340 1384 S < 0.0 1.8 1:11 internal_tax
1200 mtcl UNIX 36 -12 2340 2340 1384 S < 0.0 1.8 0:00 internal_tax
1201 mtcl UNIX 36 -12 5704 5704 3844 S < 0.0 4.4 0:00 maoagent
1205 mtcl UNIX 36 -12 5704 5704 3844 S < 0.0 4.4 0:00 maoagent
1209 mtcl UNIX 36 -12 2340 2340 1384 S < 0.0 1.8 4:06 saving
1210 mtcl UNIX 36 -12 1004 1004 872 S < 0.0 0.7 0:00 qlsrv
1211 mtcl UNIX 36 -12 2340 2340 1384 S < 0.0 1.8 0:00 printing
1212 mtcl UNIX 36 -12 2340 2340 1384 S < 0.0 1.8 0:31 compression
1213 mtcl UNIX 36 -12 2340 2340 1384 S < 0.0 1.8 1:26 routing
1214 mtcl UNIX 36 -12 5704 5704 3844 S < 0.0 4.4 0:08 maoagent
1215 mtcl UNIX 36 -12 732 732 648 S < 0.0 0.5 0:00 broadcast
1216 mtcl FIFO 99 -12 700 700 580 S < 0.0 0.5 0:00 #eventmon
1217 mtcl UNIX 36 -12 748 748 676 S < 0.0 0.5 0:00 ioreveil
1220 mtcl UNIX 36 -12 556 556 484 S < 0.0 0.4 0:00 annu_process
1221 mtcl UNIX 36 -12 5704 5704 3844 S < 0.0 4.4 0:17 maoagent
1222 mtcl UNIX 36 -12 1164 1164 988 S < 0.0 0.9 1:06 qlsrv
1223 mtcl UNIX 36 -12 700 700 580 S < 0.0 0.5 0:00 eventmon
1224 mtcl FIFO 99 -12 1232 1232 952 S < 0.0 0.9 0:00 #cstamono
1225 mtcl UNIX 36 -12 700 700 580 S < 0.0 0.5 0:00 eventmon
1226 mtcl UNIX 36 -12 700 700 580 S < 0.0 0.5 0:00 eventmon
1229 mtcl UNIX 36 -12 5704 5704 3844 S < 0.0 4.4 0:03 maoagent
1230 mtcl UNIX 36 -12 1152 1152 992 S < 0.0 0.9 0:03 qlsrv
1231 mtcl UNIX 36 -12 1232 1232 952 S < 0.0 0.9 0:00 cstamono
1232 mtcl UNIX 36 -12 1232 1232 952 S < 0.0 0.9 0:00 cstamono
1233 mtcl UNIX 36 -12 1232 1232 952 S < 0.0 0.9 0:00 cstamono
1234 mtcl UNIX 36 -12 1232 1232 952 S < 0.0 0.9 0:00 cstamono
1235 mtcl UNIX 36 -12 1032 1032 896 S < 0.0 0.8 0:06 qlsrv
1236 mtcl UNIX 36 -12 5704 5704 3844 S < 0.0 4.4 0:05 maoagent
1237 mtcl UNIX 36 -12 1136 1136 976 S < 0.0 0.8 0:08 qlsrv
1238 mtcl UNIX 36 -12 5704 5704 3844 S < 0.0 4.4 0:06 maoagent
1243 mtcl FIFO 99 -12 824 824 716 S < 0.0 0.6 0:00 #srv_suprout
1250 mtcl UNIX 36 -12 824 824 716 S < 0.0 0.6 0:00 srv_suprout
1251 mtcl UNIX 36 -12 824 824 716 S < 0.0 0.6 0:00 srv_suprout
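
A listing this long is easier to review when the lines are grouped by command name. The sketch below assumes the column layout shown in the pasted top output (the command is the last field, and the STAT column may be one or two tokens such as "R" or "S <"). Treating a leading "#" as part of the same command family is a guess made purely for grouping; the thread does not say what the "#" means.

```python
from collections import defaultdict


def parse_top_line(line):
    """Parse one process line of the pasted top output; return None for other lines."""
    tokens = line.split()
    # A process line has at least 13 fields and starts with a numeric PID.
    if len(tokens) < 13 or not tokens[0].isdigit():
        return None
    # Index from the right because STAT can be one token ("R") or two ("S <").
    return {
        "pid": int(tokens[0]),
        "user": tokens[1],
        "cls": tokens[2],
        "cpu": float(tokens[-4]),
        "mem": float(tokens[-3]),
        "time": tokens[-2],
        "command": tokens[-1],
    }


def group_by_command(lines):
    """Map each command name (leading '#' stripped, an assumption) to its PIDs."""
    groups = defaultdict(list)
    for line in lines:
        proc = parse_top_line(line)
        if proc is not None:
            groups[proc["command"].lstrip("#")].append(proc["pid"])
    return dict(groups)


sample = [
    "PID USER CLS PRI NI SIZE RSS SHARE STAT %CPU %MEM TIME COMMAND",
    "8765 mtcl UNIX 24 0 1208 1208 936 R 16.2 0.9 1:16 top",
    "757 mtcl FIFO 99 -12 892 892 756 S < 0.0 0.7 0:00 #cmisd",
    "760 mtcl UNIX 36 -12 892 892 756 S < 0.0 0.7 0:06 cmisd",
]

for command, pids in sorted(group_by_command(sample).items()):
    print(command, len(pids), pids)
```

Grouping does not say which processes belong on the system; it only shrinks the list to a handful of names that can be compared against a known-good OXE.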

The system is a standard OXE setup, with three shelves, and NO IP-telephony installed.

I'm experiencing issues with tickets for calls either not being logged or not being passed to 4760 for reporting. I have two reports daily, one for incoming and another for outgoing calls.

Ever since the previous-mentioned outage on the 22nd, the report content has shrunk to nothing.


Thank you all ever so much for your time - much appreciated!

Shane
frank
Alcatel Unleashed Certified Guru
Posts: 3386
Joined: 06 Jul 2004 00:18
Location: New York

Post by frank »

Once you log in as mtcl, go to /DHS3dyn/account and run account ascii; it will list all the files containing tickets. It is NOT one file per ticket: a file can hold 3, 5 or 10 tickets. Once you are sure that you have tickets for every day, you can look inside a file by typing accview -mtf file_name, and you will see whether the tickets you think are missing are actually there.

How do you know tickets are missing?
Did you change anything on the filters?
Are you sure it's not the 4760?
Usually the PBX has nothing to do with that. It's most likely the 4760 that is upset.
Code Free Or Die
cavagnaro

Post by cavagnaro »

I'd recommend checking your network. That broadcast problem causes very annoying issues, such as the Ethernet link dropping or lost connections with clients such as CSTA and similar applications.
frank

Post by frank »

Cavagnaro,

If you have such problems, I would really check the configuration and the quality of the network..

I have 2 big clients here, with 10 and 15 nodes networked all over the USA, and no such problems..
jimbob

Post by jimbob »

Now that you mention NETWORK PROBLEMS arrrgggghhh.......

We began having problems about 6 months ago, the symptoms being:
- alarms in 4760 saying that it could not communicate with the 4400 PBX
- can ping the IP of both the main and standby CPU
- cannot telnet to the main CPU but can still ping it
- can sometimes get in to the main CPU via the CBRMA board but it takes ages

This required a bascul to clear the problem but it always returned a few days later. Finally, wanting to isolate our network but still maintain management of the system, I connected the spare NIC on the 4760 server to the PBX with a cross-over network cable, set the default gateway, subnet mask etc on the NIC so that only the relevant IPs would be on this wire. This seemed to get around the problem but it eventually returned last week. It also happens to two other 4400 systems.

I keep thinking it is some counter in the 4400 network interface that reaches a limit and shuts down the TCP side of the IP stack, which would explain why telnet fails while ping (which uses ICMP, not TCP) still works. Our maintainers haven't got a clue, and most of the hot air from Alcatel France is "upgrade to the latest OS" ($90k+).
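
Because ping uses ICMP, a host can answer pings while refusing or ignoring every TCP connection. A quick way to confirm that symptom from the 4760 server is to attempt a plain TCP connect to the telnet port. The sketch below is a generic check using only the Python standard library; the address in the usage comment is a placeholder, not a value from this thread.

```python
import socket


def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port completes within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


# Example usage (placeholder address, replace with the main CPU's IP):
#   tcp_port_open("192.0.2.10", 23)   # 23 is the standard telnet port
```

If this returns False while ping still succeeds, the fault is in the TCP path (or the service behind it) rather than in basic IP reachability. Logging the result every few minutes would also show exactly when the lock-up starts.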

Any help would be greatly appreciated.

R5.0Ux-d2.314

Regards,
James.
frank

Post by frank »

How do you have your 4400 connected to the network?
Are you using one or two interfaces? If you are using two interfaces, this sounds like a spanning-tree problem. If both interfaces are connected to your network, you NEED to have spanning tree activated on your switches. If you are using only one Ethernet interface, then I would definitely do a network audit.
jimbob

Post by jimbob »

Hi Frank. The 4400 WAS on the main network, just from a Cisco switch. The 4760 server has 2 NICs. One to the main network with the standard default gateway and subnet mask. The other is the crossover with a different IP, gateway and subnet mask. The subnet mask is open just enough to include both the 4760 and the 4400 IP addresses.
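
As a worked example of a mask that is "open just enough", the sketch below finds the smallest subnet in which two addresses are both usable host addresses (neither the network nor the broadcast address). The two addresses are made up for illustration; the real ones are not given in this thread.

```python
import ipaddress


def smallest_common_network(addr_a, addr_b):
    """Return the smallest network where both addresses are usable hosts."""
    a = ipaddress.ip_address(addr_a)
    b = ipaddress.ip_address(addr_b)
    # Start with the longest prefix and widen one bit at a time.
    for prefix in range(a.max_prefixlen, -1, -1):
        net = ipaddress.ip_network(f"{a}/{prefix}", strict=False)
        reserved = (net.network_address, net.broadcast_address)
        if b in net and a not in reserved and b not in reserved:
            return net
    raise ValueError("no common network (different IP versions or reserved addresses)")


# Hypothetical addresses for the 4760 NIC and the 4400 CPU.
net = smallest_common_network("192.168.10.5", "192.168.10.6")
print(net, net.netmask)
```

For these two sample addresses the result is a /30, which leaves room for exactly two hosts on the crossover link.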

The outcome is that I can connect my 4760 client to the 4760 server (NIC 1) via the main network. Then the 4760 server (NIC 2) connects to the 4400 via the crossover cable.

But alas the problem still occurred (although after a much longer time span).

Regards,
James.
frank

Post by frank »

Hi..

What about the PBX itself ?
Is it connected using E0, or E0 and E1 ?

If you don't have a network setup or IP phones, try connecting the 4760 to a hub and the hub to the 4400, then check the status of the network.

If this is OK, then it really means that your network is fucked up :-)