
Server reboots weekly



Hi all, hope someone can help shed some light on this problem I’m having.

I recently started a new job and have been having problems with one of the servers here. Every week, between Tuesday and Wednesday, the server reboots for no apparent reason, and when I arrive on Wednesday morning I have to check that all services are running correctly. There is no event in Event Viewer showing that a shutdown occurred, and when I first log on I have to give a reason for the unexpected shutdown. I have spoken with other engineers here and they say it has been happening for quite a while and they haven’t been able to solve it. There are no scheduled tasks or scripts running that might cause this, and nothing specific happens between Tuesday and Wednesday either. The server OS is Windows Server 2003 SP1, fully patched, with dual 2.4 GHz Xeon HT CPUs and 3 GB RAM. It is currently configured as a terminal server and also runs certain bespoke applications, IIS and SQL Server 2000. Unless I really have to, I would like to rule out all OS and software-related possibilities first before checking hardware.

Does anyone have any idea where I can start my search for the cause of the problem, or am I going to have to delve right in and start thinking hardware?

Thanks for any advice offered

Rob



If there are no events before the reboot and it's not generating a dump, I would check for environmental issues first (cleaners unplugging devices to plug in their vacuum cleaners; it's not an urban myth). Then, if your BIOS supports Automatic Server Recovery (ASR), disable it.

ASR can corrupt memory dumps, or cause spontaneous reboots when the BIOS decides the machine is "unresponsive" and hard-resets it.

Is the reboot always at the same time of day?
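To answer the time-of-day question with more than a hunch, something like the following could help. It is only a rough sketch: the timestamps are made-up examples standing in for whatever boot times you can collect (for instance by noting them from the System log), and `reboot_pattern` is a hypothetical helper name.

```python
from collections import Counter
from datetime import datetime

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]

def reboot_pattern(boot_times):
    """Count boots per (weekday, hour) so a recurring slot stands out."""
    return Counter((DAY_NAMES[t.weekday()], t.hour) for t in boot_times)

# Hypothetical boot timestamps, entered by hand for illustration only.
boots = [
    datetime(2006, 3, 1, 2, 5),
    datetime(2006, 3, 8, 2, 4),
    datetime(2006, 3, 15, 2, 6),
]

# Most frequent (weekday, hour) slots first.
for (day, hour), count in reboot_pattern(boots).most_common():
    print(f"{day} {hour:02d}:00  x{count}")
```

If almost every boot lands in the same weekday and hour, that points towards something scheduled (a watchdog, a UPS self-test, a job on another box) rather than a random hardware fault.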

Edited by Mr Snrub

Thanks for the input. It's definitely not automatic updates, because updates are distributed through WSUS and then installed manually on the servers. As for ASR, I will check it next time the server reboots. We don't have any other servers available at the moment to take over the roles, and with our busy environment I can't just shut down a server while I have a poke about.

The cleaning lady is also a no-go, as the server room is locked and no one has access to it through the night.

As soon as I have any more information I will update the post.

Thanks again

Rob


I know you said no hardware yet, but I had a very similar problem with my last server: it would restart a couple of times during the week for no apparent reason. The cause turned out to be a faulty power supply, so that may be something to check if all else fails.


I would make sure the server is configured for a complete memory dump, that the paging file is on C: (and is at least RAM + 50 MB), and that "Automatically reboot" is checked as well. Especially if these servers are Compaq/HP, you also need to disable ASR in the BIOS so the dump can be generated.

So you don't see any SaveDump/1001 events (assuming Windows 2000/2003)?
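For the paging-file rule of thumb above, here is the arithmetic as a small sketch. The function names are just for illustration, and the free-space check assumes the dump file needs roughly as much room as the paging file, which is an approximation.

```python
def min_pagefile_mb(ram_mb, overhead_mb=50):
    """Rule of thumb from this thread: physical RAM plus 50 MB."""
    return ram_mb + overhead_mb

def dump_fits(ram_mb, free_disk_mb):
    """Rough check (assumption): is there room for a dump of about that size?"""
    return free_disk_mb >= min_pagefile_mb(ram_mb)

# The server in this thread has 3 GB of RAM.
ram_mb = 3 * 1024
print("Minimum paging file:", min_pagefile_mb(ram_mb), "MB")
print("Fits in 4000 MB free:", dump_fits(ram_mb, 4000))
```

So for the 3 GB box the paging file on the system partition would need to be at least 3122 MB.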


I have been thinking power supply, as I had a similar problem with a desktop machine about a year ago; however, the company I'm working for seems to dismiss hardware problems as a myth. There are no dump files generated, but the server is an HP, so I will definitely check the BIOS and disable ASR.

As soon as I have any further info I will update.

Thanks

Rob


I have the same issue on an HP NAS. After disabling ASR in the BIOS the server no longer reboots, but it seems to freeze for two to three hours every Wednesday morning at six o'clock. All available BIOS and firmware upgrades have been applied, and Windows 2003 is also up to date with the latest service pack and patches.


Assuming these machines have PS/2 keyboards, and that any KVM they are attached to doesn't capture the Scroll Lock key, do the following:

1. If you have a feature like Compaq's Automatic Server Recovery (ASR), please disable it. This setting is usually found in the BIOS. With this feature enabled, if the BIOS does not detect a heartbeat from the OS, it will restart the server. This will interrupt the dump process.

2. Create or set the following registry value:

Key: HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\i8042prt\Parameters

Value: CrashOnCtrlScroll

Type: REG_DWORD

Data: 1

Refer to the following Knowledge Base article for more information on this registry key:

244139 Windows Feature Allows a Memory.dmp File to Be Generated with Keyboard

http://support.microsoft.com/?id=244139

3. Right-Click on the "My Computer" icon on the desktop and select "Properties"; this will open the "System Properties" window. Go to the "Advanced" tab and click "Performance Options". Click "Change" under "Virtual Memory". Set the pagefile to be located on the partition where the OS is installed, and set it to be equal to Physical RAM + 50 MB.

4. Also in the "System Properties" window, click on the "Advanced" tab, then click "Startup and Recovery". Make sure "Complete Memory Dump" is selected (see 4a if this is not in the list). You can change the location of the memory dump file to a different local partition if you do not have enough room on the partition where the OS is installed.

4a. If the "Complete Memory Dump" option in step 4 is not available, you will need to manually set this registry value:

Key: HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\CrashControl

Value: CrashDumpEnabled

Type: REG_DWORD

Data: 1

5. You will need to reboot the server for these changes to take effect.

The next time that the server is hung, you can go to the console and hold down the RIGHT CTRL key and press the SCROLL LOCK key twice to cause the server to bugcheck and create a memory.dmp file. This .dmp file will show what the server was (or wasn't) doing when it hung.

Note that if you try this with a PS/2 keyboard and it doesn't work, you've got a hardware problem for sure :).
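If you would rather not click through regedit for the two registry values in steps 2 and 4a, they can be written out as .reg file text. This is a rough sketch only: `reg_file` is a hypothetical helper, it handles REG_DWORD values only, and the output should be reviewed before being saved and imported on the server.

```python
def reg_file(entries):
    """Build .reg file text for REG_DWORD values.

    entries maps a registry key path to a {value name: integer data} dict.
    """
    lines = ["Windows Registry Editor Version 5.00", ""]
    for key, values in entries.items():
        lines.append(f"[{key}]")
        for name, data in values.items():
            # REG_DWORD data is written as eight hex digits.
            lines.append(f'"{name}"=dword:{data:08x}')
        lines.append("")
    return "\r\n".join(lines)

# The two values described in steps 2 and 4a.
settings = {
    r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\i8042prt\Parameters":
        {"CrashOnCtrlScroll": 1},
    r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\CrashControl":
        {"CrashDumpEnabled": 1},
}

text = reg_file(settings)
print(text)
```

The reboot in step 5 is still needed after importing.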


Thanks everyone for all the responses. There is some downtime scheduled for the end of the week, so it looks like I’m going to get to check these settings sooner than I thought. As soon as I have gone over the settings I will update the post.

Thanks once again

Rob


Is there a UPS attached to this server?

I am aware of a configuration option on APC UPSs which enables a weekly self-test of the battery. If the battery is dead, the test would cut power to the server.


Well, again I would like to say thank you: disabling ASR has solved the problem. This morning the server hadn't rebooted like it normally would have on a Wednesday, and I have earned some kudos at work.

Cheers

Rob
