OXE login problem

slabr_DUP

OXE login problem

Post by slabr_DUP »

Hi.

I have an OXE 6.1.1 with ssh enabled and a redundant CPU. One CPU is currently missing because its HDD failed. When I log in via ssh, it takes a very long time after I enter the password before I get a prompt. Logging in via the console is impossible because of a "Login timed out" error. I have had this kind of situation a few times before, and a bascul resolved the problem. Now that I have only one CPU, a bascul during rush hours is not possible. But I wonder what could be wrong. There are no errors in the incidents, but every minute syslog reports that httpd is not running and that monit is trying to start it:
Dec 14 15:55:47 xa4400a monit[7934]: Process `httpd' is not running.
Dec 14 15:55:47 xa4400a monit[7934]: Start: (httpd) /etc/rc.d/init.d/httpd start
What could be wrong, and if I do a full restart, will the OXE come back up normally?
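To see how often monit reports a process as down, the syslog can be scanned with a short script. This is only a minimal Python sketch, assuming the lines look exactly like the two quoted above; the function name `count_not_running` and the regular expression are made up for illustration and are not part of OXE or monit.

```python
import re
from collections import Counter

# Pattern modelled on the monit "is not running" line quoted above.
# Assumption: classic syslog layout "Mon DD HH:MM:SS host monit[pid]: ...".
NOT_RUNNING = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+(?P<host>\S+)\s+"
    r"monit\[\d+\]:\s+Process\s+[`'](?P<proc>[^']+)'\s+is not running\."
)

def count_not_running(lines):
    """Count monit 'is not running' reports per process name."""
    counts = Counter()
    for line in lines:
        match = NOT_RUNNING.match(line)
        if match:
            counts[match.group("proc")] += 1
    return counts

# The two lines from the post; only the first one is a "not running" report.
sample = [
    "Dec 14 15:55:47 xa4400a monit[7934]: Process `httpd' is not running.",
    "Dec 14 15:55:47 xa4400a monit[7934]: Start: (httpd) /etc/rc.d/init.d/httpd start",
]
print(count_not_running(sample))
```

On a real system the lines would come from the syslog file instead of the `sample` list. A count that grows by one per minute means httpd never stays up after monit starts it.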

Regards
Slawek
Eliott_DUP

Re: OXE login problem

Post by Eliott_DUP »

I think you have not provided enough information to solve your problem. :?

Try to check the logfiles in /var/log/httpd/ (as root).
slabr_DUP

Re: OXE login problem

Post by slabr_DUP »

Well, there is nothing special in the httpd logs. Only error.log has this as its latest entry:
[Fri Dec 14 09:42:38 2007] [notice] caught SIGTERM, shutting down
So I decided to restart my OXE. During the shutdown, at the final restart stage, there was one error:
/DHS3bin/incid/mailsys
module version: r_lnxemu_33.9.4 - linuxemu@cb100s025 - Wed Jan 31 21:05:57 MET 2007 - Linux 2.4.1-ll-dhs3
040 S mtcl e-Mediate Nmi received
NMI Cause : reg=04.
--> 16/12/07 22:35:55 - End of alarm
alcnmi_monitor: 1524 8
891 883 0 63 -12 - 353Unable to handle kernel NULL pointer dereference at virtual address 00000004
printing eip:
c0114204
*pde = 00000000
Oops: 0002
Dumping to device 0x311 [ide0(3,17)] ...
Writing dump header ...

After that, during startup Linux had to check all partitions because they had not been cleanly unmounted (grrr, I don't like this kind of problem). So, is this a serious problem, or should I not worry about it?
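As a side note on reading the dump: on i386 Linux the number after "Oops:" is the page-fault error code in hexadecimal, and its low three bits say what kind of access failed. A minimal Python sketch of that decoding follows, assuming the standard i386 bit layout; the function name `decode_oops_code` is made up for illustration.

```python
def decode_oops_code(code_text):
    """Decode the low three bits of an i386 page-fault error code (hex string)."""
    code = int(code_text, 16)
    return {
        # bit 0: 0 = page not present, 1 = protection fault
        "cause": "protection fault" if code & 0x1 else "page not present",
        # bit 1: 0 = read access, 1 = write access
        "access": "write" if code & 0x2 else "read",
        # bit 2: 0 = kernel mode, 1 = user mode
        "mode": "user" if code & 0x4 else "kernel",
    }

# Value taken from the "Oops: 0002" line in the dump above.
print(decode_oops_code("0002"))
```

For the value 0002 this gives a write, in kernel mode, to a page that is not present, which fits the "NULL pointer dereference at virtual address 00000004" message. It says nothing by itself about whether the cause is hardware or software.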

Regards
Slawek
phengvanna

Re: OXE login problem

Post by phengvanna »

Please try to disable IP redundancy.
In R.6, redundancy is active while both CPUs are running, and when one of them fails it does not allow login.

Cheers
Vanna.
torrentula

Re: OXE login problem

Post by torrentula »

You should be able to log in; you just can't make changes to the database while the 2nd CPU is offline. If I remember right, you can enable the mao and everything is fine.
slabr_DUP

Re: OXE login problem

Post by slabr_DUP »

Hmmm, the problem was not that I couldn't log in at all. Via ssh it was possible, but there was a pause of about 2 minutes between the login prompt and the password prompt (and after login, many things worked very slowly: shutdown -r now started about 2 minutes after I pressed Enter). Via RMA it was not possible to log in because of a timeout error. Now, after the restart, everything works fine: login time is normal, duplication configuration is on (even without the 2nd CPU), and I can make any changes to the configuration. My question is why it happened, because it was not the first time. Earlier, when both CPUs were OK, the problem was not so serious because a bascul resolved it at any time. But when only one CPU is active, a restart is not possible during normal working hours.

Regards
Slawek
Eliott_DUP

Re: OXE login problem

Post by Eliott_DUP »

Mhh. A crashdump is not good.
Maybe a hard disk failure?
Did you try smartctl -e /dev/hda and smartctl -a /dev/hda as root?
Any errors there? Errors like "Bad sector" on console or in incvisu?
On redundant CPUs the mao can be enabled in mgr - IP - Redundancy state = On
But you have to start the standby with a mastercopy afterwards (which should normally be the way to start a standby anyway).
For your kind of error, I suggest you prepare a new HDD before you run into more trouble.
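If the smartctl on the box comes from smartmontools, its exit status is a bit mask that summarizes what it found, which is handy for scripted checks. The following is only a minimal Python sketch, assuming the bit meanings documented in the smartmontools man page; the older smartctl shipped with OXE 6.1.1 may behave differently, and the names `EXIT_BITS` and `decode_smartctl_exit` are made up for illustration.

```python
# Bit meanings as documented in the smartmontools smartctl man page.
# Assumption: the installed smartctl follows that convention.
EXIT_BITS = {
    0: "command line did not parse",
    1: "device open failed",
    2: "a SMART or ATA command failed, or a checksum error occurred",
    3: "SMART status check returned DISK FAILING",
    4: "prefail attributes at or below threshold",
    5: "attributes were at or below threshold in the past",
    6: "device error log contains errors",
    7: "device self-test log contains errors",
}

def decode_smartctl_exit(status):
    """Return the list of conditions encoded in a smartctl exit status."""
    return [text for bit, text in EXIT_BITS.items() if status & (1 << bit)]

# Example: a status of 72 (64 + 8) has bits 3 and 6 set.
for problem in decode_smartctl_exit(72):
    print(problem)
```

An exit status of 0 decodes to an empty list, meaning smartctl itself raised no flags. That is a good sign but not proof of a healthy disk.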
slabr_DUP

Re: OXE login problem

Post by slabr_DUP »

Well, the HDD on the 2nd CPU crashed a few days ago and I am now waiting for the replacement. For the 1st CPU I have just enabled SMART reporting as you suggested, and thank God there are no errors for this HDD (my first thought was that the second HDD was failing too). There are no errors on the console or in incvisu either (neither now nor before the restart). I don't know what is going on, and until the HDD on the 2nd CPU is replaced, I will not sleep calmly...