
Random Reboots on (desktop) Server 2003 R2


dhope


Hi All,

First question here, so apologies if I'm in the wrong forum etc.

My home desktop is a Shuttle SN45Gv3

Athlon 3200+ (XP, Barton)

2GB PC3200 (Crucial)

2 x 320GB SATA2 (Quantum 6V320F0)

nVidia GeForce 6800

The SN45Gv3 motherboard is an nforce2. I'm not using the onboard NVRaid.

Running Windows Server 2003 R2 (v5.2.3790) as my desktop OS (though I've not used any of the 'conversion' tools, just turned off the bits I don't need manually).

So, basics done: the machine reboots randomly, and there doesn't seem to be any reason why. The event logs don't know anything about it; the first they record is after a reboot, when I see a "previous shutdown at xx was unexpected" message.

I've just set the machine to not reboot automatically, so I'll post the BSODs when they appear.

I've run memtest and MS' memory diag tool without any problems.

Basically I'm at a bit of a loss where to look next. I reinstalled Windows not too long ago and it made no difference, so I don't think it's a corrupt DLL or similar.

Any help greatly appreciated, Cheers

Duncan



Hi there, thanks for the quick response.

Temp is fine I think - it's worked happily in the past with IDE drives running XP Pro with the graphics card overclocked (so more heat). Just downloaded MBM5 (Motherboard Monitor 5) and it's showing case 52C, CPU 44C.

Just after MBM I got my first BSOD though.

MEMORY_MANAGEMENT

STOP: 0x1A (0X401, 0XC00183CC, 0X18FDE067, 0XC0018319)

Which leads me to believe it may be the RAM after all. I've owned Shuttles in the past and they seem to be notoriously picky about which RAM they choose to work with. This is a matched pair from Crucial though that passes all the tests I can throw at it...

From http://www.ocforums.com/showthread.php?t=420646

0x1A, 0x4E, 0x50: These errors are all most likely caused by a device driver or other system program running amok. Antivirus software can actually cause these errors as well (NortonAV had an issue that would repeatedly generate a MEMORY_MANAGEMENT or PAGE_FAULT_IN_NONPAGED_AREA, for instance), since it's somewhat privileged on the system. If you have recently installed a new driver, roll it back. That said, if you've shifted drivers around a lot, they may be indicative of a hardware problem.
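For anyone reading along, here is a minimal Python sketch of how the STOP line above breaks down into a bugcheck code and its four parameters. The function name `parse_stop` and the small lookup table are my own for illustration; the table only covers the three codes mentioned in the quote, using their standard Windows bugcheck names.

```python
import re

# Standard Windows names for the three bugcheck codes quoted above.
BUGCHECK_NAMES = {
    0x1A: "MEMORY_MANAGEMENT",
    0x4E: "PFN_LIST_CORRUPT",
    0x50: "PAGE_FAULT_IN_NONPAGED_AREA",
}

# Matches e.g. "STOP: 0x1A (0X401, 0XC00183CC, 0X18FDE067, 0XC0018319)"
STOP_RE = re.compile(r"STOP:\s*(0x[0-9a-f]+)\s*\(([^)]*)\)", re.IGNORECASE)


def parse_stop(line):
    """Hypothetical helper: split a STOP line into (code, name, parameters)."""
    match = STOP_RE.search(line)
    if match is None:
        raise ValueError("not a STOP line: %r" % line)
    code = int(match.group(1), 16)
    params = [int(p.strip(), 16) for p in match.group(2).split(",")]
    return code, BUGCHECK_NAMES.get(code, "unknown"), params


code, name, params = parse_stop(
    "STOP: 0x1A (0X401, 0XC00183CC, 0X18FDE067, 0XC0018319)"
)
print(hex(code), name, [hex(p) for p in params])
```

The name alone only says which part of the kernel noticed the problem; it takes the dump to see who actually caused it.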

Drivers were also something I was wondering about - I've not installed any new drivers, but it is the first time Server 2003 R2 has been installed on this machine with this memory and these hard disks, so there might be problems there? I don't know much about analysing memory dumps to find out if a specific driver is causing the problem, but if there's value in saving anything then let me know.

Cheers


  • 3 weeks later...

I suggest that you use a utility to monitor the voltages on your motherboard: either something supplied by the mobo maker or Everest.

I've seen in the past that voltages can vary widely, even between different cables coming out of the SMPS (the power supply). Over time they deteriorate further, and they can play havoc with your PC and data.


Each memory dump shows kernel pool memory corruption - that means it is possible that it's a hardware problem (which can be borne out by a memory tester), but it is likely it's a driver that is corrupting a kernel pool tag header if the memory hardware test passes.

What happens in this scenario is that at some point while the system is running, a driver overruns or underruns its allocation into another driver's allocation in the kernel nonpaged pool or kernel paged pool, corrupting pool memory in that specific area. When the same driver or a different one later comes along to read or write that corrupted area, we bugcheck with this STOP code. These are hard to track down without enabling special pool tagging in the kernel, and then we do need a kernel dump.
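To make the overrun idea concrete, here is a toy Python model of two adjacent allocations. This is only an illustration, not how the kernel lays out pool: the 8-byte header (a 4-byte tag plus a 4-byte size), the tags `DrvA`/`DrvB`, and the helpers `alloc` and `header_ok` are all made up for the sketch.

```python
import struct

# Toy header: 4-byte tag + 4-byte size. NOT the real kernel pool header layout.
HEADER = struct.Struct("<4sI")


def alloc(pool, offset, tag, size):
    """Write a toy header at `offset`; return the offset of the usable data."""
    HEADER.pack_into(pool, offset, tag, size)
    return offset + HEADER.size


def header_ok(pool, offset, tag, size):
    """Check that the header at `offset` still holds the expected tag and size."""
    return HEADER.unpack_from(pool, offset) == (tag, size)


pool = bytearray(64)

# Two back-to-back 16-byte allocations owned by two hypothetical drivers.
a_data = alloc(pool, 0, b"DrvA", 16)
b_header = a_data + 16
b_data = alloc(pool, b_header, b"DrvB", 16)
intact_before = header_ok(pool, b_header, b"DrvB", 16)

# Driver A writes 20 bytes into its 16-byte allocation: a 4-byte overrun
# that lands on the tag of driver B's header.
pool[a_data:a_data + 20] = b"\xAA" * 20
intact_after = header_ok(pool, b_header, b"DrvB", 16)

print("before overrun:", intact_before)
print("after overrun: ", intact_after)
```

Nothing fails at the moment of the bad write. The damage is only noticed later, when someone looks at driver B's header, which is why the crash usually points at the victim rather than the culprit.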

Here's how to enable special pool and configure the server for a kernel memory dump:

1. Create or set the following registry value:

Key: HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Session Manager\Memory Management

Value: PoolTag

Type: REG_DWORD

Data: *

2. Right-click on the "My Computer" icon on the desktop and select "Properties"; this will open the "System Properties" window. Go to the "Advanced" tab and click "Settings" under "Performance" to open "Performance Options". On its "Advanced" tab, click "Change" under "Virtual memory". Set the pagefile to be located on the partition where the OS is installed, and set it to be equal to physical RAM + 50 MB.

3. Also in the "System Properties" window, click on the "Advanced" tab, then click "Startup and Recovery". Make sure "Kernel Memory Dump" is selected. You can change the location of the memory dump file to a different local partition if you do not have enough room on the partition where the OS is installed.

4. You will need to reboot the server for these changes to take effect.
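The sizing rule in step 2 is simple arithmetic; a tiny Python sketch, with `pagefile_mb` being a name I made up, shows what it works out to for the RAM sizes mentioned in this thread (2 x 1024 MB, and a pair of 512 MB sticks).

```python
def pagefile_mb(physical_ram_mb, headroom_mb=50):
    """Pagefile size from step 2: physical RAM plus 50 MB of headroom."""
    if physical_ram_mb <= 0:
        raise ValueError("physical RAM must be positive")
    return physical_ram_mb + headroom_mb


for ram in (2048, 1024):
    print("%d MB RAM -> %d MB pagefile" % (ram, pagefile_mb(ram)))
```

Set both the initial and maximum size to that figure so the pagefile on the OS partition is always large enough to hold the dump.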

The next time the issue occurs, you should then get a kernel memory dump that will contain "special pool tagging" data, allowing us to see what drivers make what pool allocations, and if an overrun or underrun occurs the driver itself can be pinpointed (note that this does not necessarily mean it's a hardware driver - anything using a file system filter driver, like A/V and antispyware software, can cause this as well). As long as it's not a bad index pointer from a driver causing this, a kernel memory dump with special pool enabled should tell us where the problem lies.


  • 2 weeks later...

Cluberti, many thanks for your advice, apologies for the slow reply.

This weekend I picked up some old RAM from home that I know works fine with this machine (a matched pair of 512s to replace the matched pair of 1024s). It's still rebooting, so I've made the changes you suggested and will post the dump file when it next happens.

Cheers again,

Duncan


Thanks for enabling special pool - you can safely disable that now. Here's the analysis, now that I can see pool:

CRITICAL_OBJECT_TERMINATION (f4)
A process or thread crucial to system operation has unexpectedly exited or been
terminated.
Several processes and threads are necessary for the operation of the
system; when they are terminated (for any reason), the system can no
longer function.
Arguments:
Arg1: 00000003, Process
Arg2: 86534140, Terminating object
Arg3: 865342a4, Process image file name
Arg4: 8095aa0e, Explanatory message (ascii)

Debugging Details:
------------------
PROCESS_OBJECT: 86534140
IMAGE_NAME: csrss.exe
DEBUG_FLR_IMAGE_TIMESTAMP: 0
MODULE_NAME: csrss
FAULTING_MODULE: 00000000

PROCESS_NAME: csrss.exe
EXCEPTION_RECORD: f70b6d10 -- (.exr 0xfffffffff70b6d10)
ExceptionAddress: 7c81555f
ExceptionCode: c0000005 (Access violation)
ExceptionFlags: 00000000
NumberParameters: 2
Parameter[0]: 00000000
Parameter[1]: 7f00700c
Attempt to read from address 7f00700c

EXCEPTION_CODE: (NTSTATUS) 0xc0000005 - The instruction at "0x%08lx" referenced memory at "0x%08lx". The memory could not be "%s".

CURRENT_IRQL: 0

ERROR_CODE: (NTSTATUS) 0xc0000005 - The instruction at "0x%08lx" referenced memory at "0x%08lx". The memory could not be "%s".
READ_ADDRESS: 7f00700c
BUGCHECK_STR: 0xF4_C0000005



!THREAD 85f5a7a0 Cid 0424.042c Teb: 7ffda000 Win32Thread: e1717ea8 WAIT: (Executive) KernelMode Non-Alertable
f7197560 Mutant - owning thread 86778518
Impersonation token: e31bc030 (Level Impersonation)
DeviceMap e26729d8
Owning Process 86534140 Image: csrss.exe
Wait Start TickCount 2835 Ticks: 0
Context Switch Count 786 LargeStack
UserTime 00:00:00.062
KernelTime 00:00:00.015
Win32 Start Address 0x00001a0d
LPC Server thread working on message Id 1a0d
Start Address 0x75a548d8
Stack Init f7597000 Current f759649c Base f7597000 Limit f7594000 Call 0
Priority 14 BasePriority 13 PriorityDecrement 0
ChildEBP RetAddr Args to Child
f75964b4 80820128 85f5a7a0 85f5a848 00006101 nt!KiSwapContext+0x25 (FPO: [Uses EBP] [0,0,4])
f75964cc 8081f9de e25583b8 e25583b8 00000000 nt!KiSwapThread+0x83 (FPO: [Non-Fpo]) (CONV: fastcall)
f7596510 f71a9224 f7197560 00000000 00000000 nt!KeWaitForSingleObject+0x2e0 (FPO: [Non-Fpo]) (CONV: stdcall)
f759653c f71ad605 e25583b8 00000000 00000000 Ntfs!NtfsDeleteInternalAttributeStream+0x3f (FPO: [Non-Fpo]) (CONV: stdcall)
f7596558 f717b64d f75967e0 e25583b8 f7596700 Ntfs!NtfsRemoveScb+0x77 (FPO: [Non-Fpo]) (CONV: stdcall)
f7596574 f71ab2b5 f75967e0 e25582f0 00000000 Ntfs!NtfsPrepareFcbForRemoval+0x52 (FPO: [Non-Fpo]) (CONV: stdcall)
f75965bc f71ab0b4 f75967e0 e25582f0 00000000 Ntfs!NtfsTeardownStructures+0x62 (FPO: [Non-Fpo]) (CONV: stdcall)
f75965d8 f71afa3e 0194c400 00000000 dbc4c500 Ntfs!NtfsCommonCreate+0x18a4 (FPO: [SEH]) (CONV: stdcall)
f75967c0 f71aac90 f75967e0 854eb008 f7596908 Ntfs!NtfsCommonCreate+0x143b (FPO: [Non-Fpo]) (CONV: stdcall)
f759698c f725ed90 854eb008 f7596bfc 866f1020 Ntfs!NtfsNetworkOpenCreate+0x96 (FPO: [Non-Fpo]) (CONV: stdcall)
f75969ac f726cabd 000000f2 00000000 f75969e4 fltMgr!FltpPerformFastIoCall+0x300 (FPO: [Non-Fpo]) (CONV: stdcall)
f7596a04 8090f0fc 854eb008 f7596bfc 86734b78 fltMgr!FltpFastIoQueryOpen+0xa1 (FPO: [Non-Fpo]) (CONV: stdcall)
f7596af8 80902fad 86736030 00000000 8654db30 nt!IopParseDevice+0x917 (FPO: [Non-Fpo]) (CONV: stdcall)
f7596b78 80906a15 00000000 f7596bb8 00000040 nt!ObpLookupObjectName+0x5a9 (FPO: [Non-Fpo]) (CONV: stdcall)
f7596bcc 809265cd 00000000 00000000 ffffff01 nt!ObOpenObjectByName+0xea (FPO: [Non-Fpo]) (CONV: stdcall)
f7596d54 8082337b 0068f640 0068f608 0068f670 nt!NtQueryFullAttributesFile+0x152 (FPO: [Non-Fpo]) (CONV: stdcall)
f7596d54 7c82ed54 0068f640 0068f608 0068f670 nt!KiFastCallEntry+0xf8 (FPO: [0,0] TrapFrame @ f7596d64)
WARNING: Frame IP not in any known module. Following frames may be wrong.
0068f670 00000000 00000000 00000000 00000000 0x7c82ed54

!PROCESS 86534140 SessionId: 0 Cid: 0424 Peb: 7f007000 ParentCid: 039c
DirBase: 3176d000 ObjectTable: e191fca0 HandleCount: 448.
Image: csrss.exe

dt nt!_KMUTANT f7197560
+0x000 Header : _DISPATCHER_HEADER
+0x010 MutantListEntry : _LIST_ENTRY [ 0x86778528 - 0x86778528 ]
+0x018 OwnerThread : 0x86778518 _KTHREAD
+0x01c Abandoned : 0 ''
+0x01d ApcDisable : 0 ''
This thread 85f5a7a0 is waiting on KMUTANT !mu f7197560



!THREAD 86778518 Cid 054c.043c Teb: 7ffdf000 Win32Thread: e2fbd808 READY
Not impersonating
DeviceMap e26729d8
Owning Process 8658d4b8 Image: utorrent.exe
Wait Start TickCount 2835 Ticks: 0
Context Switch Count 265 LargeStack
UserTime 00:00:00.015
KernelTime 00:00:00.000
Win32 Start Address 0x00401000
Start Address 0x77e6b5ff
Stack Init ba578000 Current ba577648 Base ba578000 Limit ba573000 Call 0
Priority 8 BasePriority 8 PriorityDecrement 0
ChildEBP RetAddr Args to Child
ba577660 80820128 86778518 867785c0 00000101 nt!KiSwapContext+0x25 (FPO: [Uses EBP] [0,0,4])
ba577678 8081f9de 00000000 00000000 e3192a28 nt!KiSwapThread+0x83 (FPO: [Non-Fpo]) (CONV: fastcall)
ba5776bc f71bfcf9 f7197560 00000000 00000000 nt!KeWaitForSingleObject+0x2e0 (FPO: [Non-Fpo]) (CONV: stdcall)
ba577728 f71ad831 ba577b1c e3192a28 00000001 Ntfs!NtfsCreateInternalStreamCommon+0x53 (FPO: [Non-Fpo]) (CONV: stdcall)
ba577750 f71b1304 ba577b1c e3192a28 00000004 Ntfs!ReadIndexBuffer+0x26 (FPO: [Non-Fpo]) (CONV: stdcall)
ba577780 f71ac4dc ba577b1c e1565068 e31d71e0 Ntfs!FindFirstIndexEntry+0x196 (FPO: [Non-Fpo]) (CONV: stdcall)
ba5778a8 f71aab06 ba577b1c e3192bb8 e3192a28 Ntfs!NtfsRestartIndexEnumeration+0x6c (FPO: [Non-Fpo]) (CONV: stdcall)
ba577acc f71ac180 ba577b1c 85306750 86412368 Ntfs!NtfsQueryDirectory+0x54c (FPO: [Non-Fpo]) (CONV: stdcall)
ba577b00 f71ac452 ba577b1c e3192a28 85f6a8d8 Ntfs!NtfsCommonDirectoryControl+0xbc (FPO: [Non-Fpo]) (CONV: stdcall)
ba577c70 f73a9aea 86412288 85306750 85f6a8d8 Ntfs!NtfsFsdDirectoryControl+0xad (FPO: [Non-Fpo]) (CONV: stdcall)
WARNING: Stack unwind information not available. Following frames may be wrong.
ba577ca0 80828c95 867606a8 86412288 85306750 sptd+0x14aea
ba577cb8 f725fd36 8091c136 866d3bc8 85306750 nt!IofCallDriver+0x45 (FPO: [Non-Fpo]) (CONV: fastcall)
ba577ce4 80828c95 85f6a8d8 85306750 0012f7b8 fltMgr!FltpDispatch+0x152 (FPO: [Non-Fpo]) (CONV: stdcall)
ba577cf8 8090b989 ba577d64 0012f7b8 8091c136 nt!IofCallDriver+0x45 (FPO: [Non-Fpo]) (CONV: fastcall)
ba577d0c 8091c193 85f6a8d8 85306750 86467c18 nt!IopSynchronousServiceTail+0x10b (FPO: [Non-Fpo]) (CONV: stdcall)
ba577d30 8082337b 00000224 00000000 00000000 nt!NtQueryDirectoryFile+0x5d (FPO: [Non-Fpo]) (CONV: stdcall)
ba577d30 7c82ed54 00000224 00000000 00000000 nt!KiFastCallEntry+0xf8 (FPO: [0,0] TrapFrame @ ba577d64)
0012fa8c 00000000 00000000 00000000 00000000 0x7c82ed54

!PROCESS 8658d4b8 SessionId: 0 Cid: 054c Peb: 7ffdc000 ParentCid: 036c
DirBase: 2aec4000 ObjectTable: e3099420 HandleCount: 137.
Image: utorrent.exe


start end module name
f7395000 f7466000 sptd (no symbols)
Loaded symbol image file: sptd.sys
Image path: sptd.sys
Image name: sptd.sys
Timestamp: Sat Jun 24 05:18:51 2006 (449D037B)
CheckSum: 000A554B
ImageSize: 000D1000
Translations: 0000.04b0 0000.04e0 0409.04b0 0409.04e0

sptd.sys is the SCSI pass-through filter driver for Daemon Tools, and that's the driver that went in and corrupted pool. We bugchecked when another application's kernel filter driver (we can't see which due to the corruption, but it's not relevant anyway) went through csrss.exe to read data in pool, and csrss.exe, a process the system can't run without, died with an access violation on the corrupted information. From the information here, I'd say disabling the sptd.sys driver and then removing Daemon Tools would be a good start.
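As a cross-check on the stack above, the return address f73a9aea can be matched against the sptd module range from the dump by hand. Here is a minimal Python sketch of that lookup; the `Module` tuple and the `resolve` helper are names I made up, and the module list holds only the one entry shown in this dump.

```python
from typing import NamedTuple, Optional


class Module(NamedTuple):
    name: str
    start: int
    end: int  # exclusive upper bound, as in the "start end module name" listing


# The one module range shown in the dump above.
MODULES = [Module("sptd", 0xF7395000, 0xF7466000)]


def resolve(address, modules=MODULES) -> Optional[str]:
    """Return 'module+0xoffset' if the address falls inside a known module."""
    for mod in modules:
        if mod.start <= address < mod.end:
            return "%s+0x%x" % (mod.name, address - mod.start)
    return None


print(resolve(0xF73A9AEA))  # return address from the utorrent.exe stack
print(hex(MODULES[0].end - MODULES[0].start))  # compare with ImageSize
```

The offset matches the `sptd+0x14aea` frame the debugger printed, and the range length matches the reported ImageSize of 000D1000, so the frame really does sit inside sptd.sys.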


Hi there,

I noticed I was running a slightly old version of Daemon Tools (4.03) and have updated to 4.06.

I've also updated from sptd 1.25 to 1.35, and stopped Daemon Tools and Alcohol (since I'm guessing its drive emulation works the same way). I'll see if it makes the system more stable.

It might be that Vista will cope better with my system/software - guess we'll find out in the next month or so when the final release arrives on MSDN.

Thanks again, will keep the board updated with any results.


I upgraded to Daemon Tools 4.06 and SPTD 1.35 but was still getting reboots.

I next downloaded and installed the nForce 9.35 Unified Remix, a collection of the latest drivers (some from Vista, others with tweaked .inf files). While dealing with a couple of problems with the internal NIC disappearing, I noticed that I had StarForce installed (copy protection software that can screw with Daemon Tools/SPTD), probably from some game I've since removed. I uninstalled StarForce and everything seems to be more stable since. Civ4 Warlords had been playing up a lot, but it's been rock solid since. So I'm thinking either one of the Vista/customised drivers works better with Win2003 R2, or StarForce was causing the problems. I know it's a hugely unpopular piece of software with a track record of screwing up people's machines...

I'll keep an eye on things and update in a week or so, but it could well be resolved. If so, then many thanks for your help in tracking down the problems.

Duncan
