Jump to content

Need Help Boot/Login Tracing


jcarle

Recommended Posts

Almost 100,000 forum members... someone's must have an answer to this.

SOMETHING hangs when I login. I'm not sure if it's a boot-hang or a login-hang, but something hangs.

Before anyone even says it, I did try BootVis. Unfortunately, for some odd reason, it won't trace anything other then Disk I/O and Process Creation.

I'm guessing the problem revolves around a driver perhaps but since BootVis won't trace driver delays, I'm out of luck.

Does anyone know a method I could use to identify what's hanging?

Link to comment
Share on other sites


1. Configure for a complete memory dump via the keyboard (has to be a PS/2 keyboard, USB won't work at this point in the boot process):

- Create or set the following registry value:

Key: HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\i8042prt\Parameters

Value: CrashOnCtrlScroll

Type: REG_DWORD

Data: 1

- Right-Click on the "My Computer" icon on the desktop and select "Properties"; this will open the "System Properties" window. Go to the "Advanced" tab and click "Performance Options". Click "Change" under "Virtual Memory". Set the pagefile to be located on the partition where the OS is installed, and set it to be equal to Physical RAM + 50 MB.

- Also in the "System Properties" window, click on the "Advanced" tab, then click "Startup and Recovery". Make sure "Complete Memory Dump" is selected. You can change the location of the memory dump file to a different local partition if you do not have enough room on the partition where the OS is installed.

You will need to reboot for these changes to take effect.

2. Time your next boot, so you know exactly how long (in seconds) it takes from the symptoms of the hang to start and stop - we want to dump it sometime in the middle of the "hang" period, so we need to know how long the hang behavior lasts

3. Reboot again, but this time hold down the RIGHT CTRL key and press the SCROLL LOCK key twice when the box is at the middle of the "hang" - this will cause the box to bugcheck and create a memory.dmp file.

4. Once the box reboots and comes back up, let me know and I'll give you a place to upload the memory.dmp file.

Link to comment
Share on other sites

Just give me a q-tip, a stick of gum, and a sock and I can fix anything. :sneaky:

However, there are 2 specific reasons why I would suggest doing it this way:

1. Dumps never lie - they can be misleading, but they always contain the data needed to resolve a problem

2. There are specific load orders for kernel drivers and services, and I think one (or more) kernel service driver is causing your issue. This is currently the only way to troubleshoot that (short of uninstalling everything non-Microsoft and rebooting, and even that may not resolve it if anything is left behind).

A dump of the issue should give me registers, memory, and call stacks of all processes and threads on the system at the time the system is hanging, and I can debug system hangs in my sleep :).

Link to comment
Share on other sites

I've identified part of the problem.

The issue occurs in part because of Diskeeper 2007. If I uninstall it, the problem goes away. As soon as it is re-installed, the problem comes back. However, I'm not certain that Diskeeper is solely responsible. During the 60 - 90 second hang, Windows remains responsive, everything seems normal except I cannot access the network or the internet. Outlook will hang, IE will remain responsive yet will not be able to access any webpage (it keeps searching for it like a down server) and accessing the network hangs Explorer. AnyDVD also refuses to load during this hang. I can load most applications during the hang, things like Notepad have no issue starting during the hang.

As soon as the system un-hangs itself, everything goes back to normal. AnyDVD loads, the internet and the network become accessible. I know the hang is over when the AnyDVD icon shows up in my system tray.

I have three applications that load on start-up. AnyDVD, Serv-U and Diskeeper. If you need a list of other system processes, services, background applications or anything else, let me know. PM sent for the memory dump.

Link to comment
Share on other sites

What appears to be happening from the dump is that we have 7 threads in services.exe waiting on a service to start / respond to a control request. Note that while services.exe is hung, a machine will chug along, but will not be able to open applications properly or get out to the internet until services.exe continues and un-hangs. That's probably why everything seems to still run, but hangs until the issue is cleared. I thought I'd show you what I was seeing in the dump, before I tell you what I think. To start, here's one of the service threads that is stuck waiting, as an example (all 7 threads in a hung state look basically the same):

!THREAD 89420890  Cid 0290.01e0  Teb: 7ffa2000 Win32Thread: 00000000 WAIT: (UserRequest) UserMode Non-Alertable
891a2f18 SynchronizationEvent
Not impersonating
DeviceMap e1000160
Owning Process 89080da0 Image: services.exe
Wait Start TickCount 1854 Ticks: 1209 (0:00:00:18.890)
Context Switch Count 16
UserTime 00:00:00.0000
KernelTime 00:00:00.0000
Win32 Start Address 0x00000ccd
LPC Server thread working on message Id ccd
Start Address kernel32!BaseThreadStartThunk (0x7c810659)
Stack Init abb77000 Current abb76ca0 Base abb77000 Limit abb74000 Call 0
Priority 9 BasePriority 8 PriorityDecrement 0 DecrementCount 0
ChildEBP RetAddr Args to Child
abb76cb8 80502b17 89420900 89420890 804fad6c nt!KiSwapContext+0x2f (FPO: [Uses EBP] [0,0,4])
abb76cc4 804fad6c 00000000 00000000 00000000 nt!KiSwapThread+0x6b (FPO: [0,0,0])
abb76cec 805befc4 00000001 00000006 abb76c01 nt!KeWaitForSingleObject+0x1c2 (FPO: [Non-Fpo])
abb76d50 8054060c 000003cc 00000000 00000000 nt!NtWaitForSingleObject+0x9a (FPO: [Non-Fpo])
abb76d50 7c90eb94 000003cc 00000000 00000000 nt!KiFastCallEntry+0xfc (FPO: [0,0] TrapFrame @ abb76d64)
0105f7a4 7c90e9c0 7c91901b 000003cc 00000000 ntdll!KiFastSystemCallRet (FPO: [0,0,0])
0105f7a8 7c91901b 000003cc 00000000 00000000 ntdll!ZwWaitForSingleObject+0xc (FPO: [3,0,0])
0105f830 7c90104b 0001a344 01005f6e 0101a344 ntdll!RtlpWaitForCriticalSection+0x132 (FPO: [Non-Fpo])
0105f838 01005f6e 0101a344 00373f10 00000000 ntdll!RtlEnterCriticalSection+0x46 (FPO: [1,0,0])
0105f860 010060fd 00373f10 00000000 00000000 services!ScStartServiceAndDependencies+0x1d (FPO: [Non-Fpo])
0105f884 010066fe 000d2530 00000000 00000000 services!RStartServiceW+0x8c (FPO: [Non-Fpo])
0105f8d0 77e79dc9 000d2530 00000000 00000000 services!RStartServiceA+0xbd (FPO: [Non-Fpo])
0105f8f0 77ef321a 010066d8 0105f904 00000003 RPCRT4!Invoke+0x30
0105fcf8 77ef36ee 00000000 00000000 000d526c RPCRT4!NdrStubCall2+0x297 (FPO: [Non-Fpo])
0105fd14 77e7988c 000d526c 000afec8 000d526c RPCRT4!NdrServerCall2+0x19 (FPO: [Non-Fpo])
0105fd48 77e797f1 01002579 000d526c 0105fdf0 RPCRT4!DispatchToStubInC+0x38 (FPO: [Non-Fpo])
0105fd9c 77e7971d 0000001f 00000000 0101a138 RPCRT4!RPC_INTERFACE::DispatchToStubWorker+0x113 (FPO: [Non-Fpo])
0105fdc0 77e7bd0d 000d526c 00000000 0101a138 RPCRT4!RPC_INTERFACE::DispatchToStub+0x84 (FPO: [Non-Fpo])
0105fdfc 77e7bb6a 000d9028 000a70b0 000d5010 RPCRT4!LRPC_SCALL::DealWithRequestMessage+0x2db (FPO: [Non-Fpo])
0105fe20 77e76784 000a70ec 0105fe38 000d5010 RPCRT4!LRPC_ADDRESS::DealWithLRPCRequest+0x16d (FPO: [Non-Fpo])
0105ff80 77e76c22 0105ffa8 77e76a3b 000a70b0 RPCRT4!LRPC_ADDRESS::ReceiveLotsaCalls+0x28f (FPO: [Non-Fpo])
0105ff88 77e76a3b 000a70b0 00000048 00fbfc98 RPCRT4!RecvLotsaCallsWrapper+0xd (FPO: [Non-Fpo])
0105ffa8 77e76c0a 000ad248 0105ffec 7c80b683 RPCRT4!BaseCachedThreadRoutine+0x79 (FPO: [Non-Fpo])
0105ffb4 7c80b683 000d83a0 00000048 00fbfc98 RPCRT4!ThreadStartRoutine+0x1a (FPO: [Non-Fpo])
0105ffec 00000000 77e76bf0 000d83a0 00000000 kernel32!BaseThreadStart+0x37 (FPO: [Non-Fpo])

Here's the thread that's talking from dkservice.exe to services.exe:

!THREAD 88ea7a00  Cid 06c8.06fc  Teb: 7ffdd000 Win32Thread: e1fe3008 WAIT: (WrLpcReply) UserMode Non-Alertable
88ea7bf4 Semaphore Limit 0x1
Waiting for reply to LPC MessageId 00001964:
Current LPC port e27ca180
IRP List:
8923fa18: (0006,0094) Flags: 00000800 Mdl: 00000000
Not impersonating
DeviceMap e1000160
Owning Process 89125910 Image: DkService.exe
Wait Start TickCount 2028 Ticks: 1035 (0:00:00:16.171)
Context Switch Count 2511 LargeStack
UserTime 00:00:00.0000
KernelTime 00:00:00.0156
Win32 Start Address ADVAPI32!ScSvcctrlThreadA (0x77deb479)
Start Address kernel32!BaseThreadStartThunk (0x7c810659)
Stack Init baaf8000 Current baaf7c50 Base baaf8000 Limit baaf4000 Call 0
Priority 8 BasePriority 8 PriorityDecrement 0 DecrementCount 16
ChildEBP RetAddr Args to Child
baaf7c68 80502b17 88ea7a70 88ea7a00 804fad6c nt!KiSwapContext+0x2f (FPO: [Uses EBP] [0,0,4])
baaf7c74 804fad6c 88ea7bf4 88ea7bc8 88ea7a00 nt!KiSwapThread+0x6b (FPO: [0,0,0])
baaf7c9c 805a1dcf 00000001 00000011 00e9e301 nt!KeWaitForSingleObject+0x1c2 (FPO: [Non-Fpo])
baaf7d50 8054060c 000000c8 0019c9b0 0019c9b0 nt!NtRequestWaitReplyPort+0x63d (FPO: [Non-Fpo])
baaf7d50 7c90eb94 000000c8 0019c9b0 0019c9b0 nt!KiFastCallEntry+0xfc (FPO: [0,0] TrapFrame @ baaf7d64)
00e9dedc 7c90e3ed 77e7c968 000000c8 0019c9b0 ntdll!KiFastSystemCallRet (FPO: [0,0,0])
00e9dee0 77e7c968 000000c8 0019c9b0 0019c9b0 ntdll!ZwRequestWaitReplyPort+0xc (FPO: [3,0,0])
00e9df2c 77e7a716 0019c9e8 00e9df4c 77e7a747 RPCRT4!LRPC_CCALL::SendReceive+0x228 (FPO: [Non-Fpo])
00e9df38 77e7a747 00e9df68 77de1d58 00e9e344 RPCRT4!I_RpcSendReceive+0x24 (FPO: [Non-Fpo])
00e9df4c 77ef3675 00e9df94 0019ca04 00000000 RPCRT4!NdrSendReceive+0x2b (FPO: [Non-Fpo])
00e9e328 77debc14 77de1d58 77de5768 00e9e344 RPCRT4!NdrClientCall2+0x222 (FPO: [Non-Fpo])
00e9e33c 77debbca 0017ff10 00000000 00000000 ADVAPI32!RStartServiceW+0x1b (FPO: [Non-Fpo])
00e9e37c 5999587c 0017ff10 00000000 00000000 ADVAPI32!StartServiceW+0x1e (FPO: [Non-Fpo])
00e9e3e4 77e42180 00000000 00e9ec90 0017ae90 wmiaprpl!WmiAdapterWrapper::Open+0x85 (FPO: [Non-Fpo])
00e9ead4 77e40e5c 0017ae90 0000007a 77df4d50 ADVAPI32!OpenExtObjectLibrary+0x58f (FPO: [Non-Fpo])
00e9ec48 77e09c8e 00e9ec90 00000000 00e9f158 ADVAPI32!QueryExtensibleData+0x3d8 (FPO: [Non-Fpo])
00e9f020 77df4406 80000004 00e9f13c 00000000 ADVAPI32!PerfRegQueryValue+0x513 (FPO: [Non-Fpo])
00e9f110 77dd7054 80000004 00e9f13c 00e9f160 ADVAPI32!LocalBaseRegQueryValue+0x306 (FPO: [Non-Fpo])
00e9f148 740072e7 00000000 00e9f1d8 00038000 ADVAPI32!RegQueryValueExW+0xa2 (FPO: [Non-Fpo])
00e9f1b0 7400862b 80000004 009462a0 00e9f1d8 pdh!GetSystemPerfData+0x66 (FPO: [Non-Fpo])
00e9f3e4 740052cb 009420e0 00000000 00000007 pdh!GetMachine+0x205 (FPO: [Non-Fpo])
00e9f430 740096b3 00941fb8 00000000 009420c0 pdh!InitCounter+0x1e0 (FPO: [Non-Fpo])
00e9f44c 74009807 00941ef8 00941f68 00000000 pdh!PdhiAddCounter+0xba (FPO: [Non-Fpo])
00e9f4a0 00481b17 00941ef8 00e9f500 00000000 pdh!PdhAddCounterW+0xdd (FPO: [Non-Fpo])
WARNING: Stack unwind information not available. Following frames may be wrong.
00e9f500 006d0065 0072006f 005c0079 00200025 DkService!DecryptPasswordStr+0x1e7
00e9f504 0072006f 005c0079 00200025 006f0043 0x6d0065
00e9f508 005c0079 00200025 006f0043 006d006d 0x72006f
00e9f50c 00200025 006f0043 006d006d 00740069 0x5c0079
00e9f510 006f0043 006d006d 00740069 00650074 0x200025
00e9f514 006d006d 00740069 00650074 00200064 0x6f0043
00e9f518 00740069 00650074 00200064 00790042 0x6d006d
00e9f51c 00650074 00200064 00790042 00650074 0x740069
00e9f520 00200064 00790042 00650074 00200073 0x650074
00e9f524 00790042 00650074 00200073 006e0049 0x200064
00e9f528 00650074 00200073 006e0049 00550020 0x790042
00e9f52c 00200073 006e0049 00550020 00650073 0x650074
00e9f530 006e0049 00550020 00650073 00000000 0x200073
00e9f534 00550020 00650073 00000000 00000000 0x6e0049
00e9f538 00650073 00000000 00000000 00000000 0x550020
00e9f53c 00000000 00000000 00000000 00000000 0x650073

And that thread is waiting on a response from this thread, also in dkservice.exe:

!THREAD 89136da8  Cid 06c8.06cc  Teb: 7ffdf000 Win32Thread: e267f860 WAIT: (Executive) UserMode Non-Alertable
891befec NotificationEvent
IRP List:
89187c20: (0006,0094) Flags: 00000900 Mdl: 00000000
Not impersonating
DeviceMap e1000160
Owning Process 89125910 Image: DkService.exe
Wait Start TickCount 1652 Ticks: 1411 (0:00:00:22.046)
Context Switch Count 216 LargeStack
UserTime 00:00:00.0062
KernelTime 00:00:00.0000
Win32 Start Address 0x004a58b7
Start Address kernel32!BaseProcessStartThunk (0x7c810665)
Stack Init ae16f000 Current ae16ec2c Base ae16f000 Limit ae16b000 Call 0
Priority 6 BasePriority 6 PriorityDecrement 0 DecrementCount 0
ChildEBP RetAddr Args to Child
ae16ec44 80502b17 89136e18 89136da8 804fad6c nt!KiSwapContext+0x2f (FPO: [Uses EBP] [0,0,4])
ae16ec50 804fad6c 00000103 00000000 89187c20 nt!KiSwapThread+0x6b (FPO: [0,0,0])
ae16ec78 8057e6e6 00000001 00000000 893c0101 nt!KeWaitForSingleObject+0x1c2 (FPO: [Non-Fpo])
ae16eca0 8057b761 893c01e8 00000103 891bef90 nt!IopSynchronousServiceTail+0xc6 (FPO: [Non-Fpo])
ae16ed38 8054060c 000000c0 00000000 00000000 nt!NtReadFile+0x55d (FPO: [Non-Fpo])
ae16ed38 7c90eb94 000000c0 00000000 00000000 nt!KiFastCallEntry+0xfc (FPO: [0,0] TrapFrame @ ae16ed64)
0012fba0 7c90e288 7c801875 000000c0 00000000 ntdll!KiFastSystemCallRet (FPO: [0,0,0])
0012fba4 7c801875 000000c0 00000000 00000000 ntdll!NtReadFile+0xc (FPO: [9,0,0])
0012fc0c 77deb3cb 000000c0 0012fcd8 00000216 kernel32!ReadFile+0x16c (FPO: [Non-Fpo])
0012fc38 77deb25f 000000c0 0012fcd8 00000216 ADVAPI32!ScGetPipeInput+0x2a (FPO: [Non-Fpo])
0012fcac 77deb568 000000c0 0012fcd8 00000216 ADVAPI32!ScDispatcherLoop+0x3f (FPO: [Non-Fpo])
0012ff0c 0044a1ed 0012ff20 004d65b4 00000001 ADVAPI32!StartServiceCtrlDispatcherW+0xe3 (FPO: [Non-Fpo])
WARNING: Frame IP not in any known module. Following frames may be wrong.
0012ffc0 7c816fd7 0007fbc8 00000000 7ffd4000 0x44a1ed
0012fff0 00000000 004a58b7 00000000 78746341 kernel32!BaseProcessStart+0x23 (FPO: [Non-Fpo])

That thread is in the middle of an NtReadFile operation, and it is causing the delay - not sure why, but here's the culprit in the dkservice.exe process:

!handle 0x0000c0 F 0x89125910
processor number 0, process 89125910
PROCESS 89125910 SessionId: 0 Cid: 06c8 Peb: 7ffd4000 ParentCid: 0290
DirBase: 0a540260 ObjectTable: e1fba198 HandleCount: 285.
Image: DkService.exe

Handle table at e27ba000 with 285 Entries in use
00c0: Object: 891bef90 GrantedAccess: 0012019f Entry: e27ba180
Object: 891bef90 Type: (89e62ca0) File
ObjectHeader: 891bef78
HandleCount: 1 PointerCount: 3
Directory Object: 00000000 Name: \net\NtControlPipe8 {NamedPipe}

You'll probably have to get with Diskeeper to determine what the process is looking for on service startup, because without symbols to the Diskeeper product I can't tell why it's looking for this file, nor why it waits so long after completing this handle operation to a named pipe that it continues. One thing you could try though is to uninstall all of the other software with fsf drivers, like antivirus or antispyware software, to make sure they aren't causing the delay - I've seen other slowness on machines with the VMware server vmount and net drivers installed as well, so removal of vmware server from the machine might help as well. Just a thought.

Link to comment
Share on other sites

The culprit was the fact that the paging file was set to disabled. Apparently Diskeeper does not function properly without the pagefile. Also, turning off the pagefile caused the PerfOS.dll counters to throw an error, which set them to disabled, which in turn caused an error with vmauthd (Authorization service of vmWare workstation).

I enabled the page file, rebooted, re-enabled the PerfOS.dll counters, rebooted, cleared the event logs, rebooted and everything is beautiful.

Link to comment
Share on other sites

Well its good to see Matrix jobs gets done in real life :blink: (but how the ****)

ok i wanna learn.where can i find a manual or help file to those sympthoms and other stuffs in dump file?

i really like to know what they means cause i always thoughts they are useless.

Cheers.

Link to comment
Share on other sites

The best things to do, when wanting to get into this stuff, is probably the following two things:

1. Read (many times over) the book "Windows Internals, 4th Edition" by Mark Russinovich and David Solomon. It teaches you everything you need to know about the innards and inner workings of the Windows NT kernel including memory management, security model, priority, dispatching, etc.

2. Get into a class (preferrably in your area) on Windows Kernel debugging, whether that be from your local training center or some other source.

With a good deal of reading and practice, you should get the hang of it. It's not rocket science, thankfully - more of an art form :).

Link to comment
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now
  • Recently Browsing   0 members

    • No registered users viewing this page.
×
×
  • Create New...