Mr Snrub

Patron
  • Posts

    765
  • Country

    Sweden

Everything posted by Mr Snrub

  1. See, if you wait long enough someone smarter than me comes along. Story of my life
  2. For easily reproducible issues it can be quicker to do simple "one at a time" tests, so I consider them part of root cause analysis (even if a test only rules a component out because the problem is still present without it). Experience. Smarter people than me might be able to, but due to the way device and filter drivers work it's more of a "go with your gut" from me.
  3. Did you test uninstalling Symantec AV? The dump still has it loaded, with those modules from 2006 present... The pool tagging just confirms what we suspected - the nonpaged pool is exhausted through allocations tagged "Irp ", which come from I/O request packets. The I/Os themselves are completed, but the pool allocations are not freed, most likely due to some driver. The I/Os also seem to be aimed at the various USB root hubs, which is why I also asked about any USB devices that may have been connected to the system recently. If I were a betting man, I would say it's Symantec AV causing the problem from the information we have so far - I would start by uninstalling that and watching the system for ~20 hours (the dumps so far seem to take 16-19 hours to get to the point where they crash).
  4. I believe Release To World is scheduled for October 22nd.
  5. I don't know how conclusive it is, but I tried launching \Windows\explorer.exe from my Vista x64 partition whilst booted into Windows 7 x64, and it just throws error 0xc0000142 immediately - I had no intention of trying to replace any system/shell DLLs to test further. Personally the taskbar in Windows 7 has really grown on me, and I no longer miss the quicklaunch bar.
  6. While we wait for the dump with pool tagging enabled...

     Nonpaged (or nonpageable) pool memory is for dynamic memory allocations in the kernel that cannot be paged out to disk - drivers have to use this pool for data that must be available at all times, as a page fault (a request for a virtual page that is not resident in physical RAM, but in the page file on disk) is not allowed when they have control. A driver developer who assumes otherwise gets the classic IRQL_NOT_LESS_OR_EQUAL bugcheck.

     Because the nonpaged pool region has to take physical memory, and is a subset of the 2GB kernel space, its absolute maximum is capped at 256MB (but systems with less than ~768MB RAM, or using /3GB, would have less than this as their limit). Because it is a finite system resource, once it is no longer required an allocation is meant to be returned to the pool by marking it as free.

     (The other, larger pool is paged pool - this is the same concept of dynamic memory allocations in the kernel, but these ones are non-critical data that we can put into the page file as needed to free physical memory.)

     What do you have in the way of USB devices connected to the machine? I ask because I had a poke around the nonpaged pool region to see if there are any clues, and saw a lot of Irps (I/O request packets), and so ran the !irpfind command to get a summary:

         1: kd> !irpfind
         unable to get large pool allocation table - either wrong symbols or pool tagging is disabled
         Searching NonPaged pool (827b6000 : 8a7b6000) for Tag: Irp?
         Irp [ Thread ] irpStack: (Mj,Mn) DevObj [Driver] MDL Process
         827b64a8 [00000000] Irp is complete (CurrentLocation 4 > StackCount 3)
         827b6b28 [00000000] Irp is complete (CurrentLocation 4 > StackCount 3)
         827b8008 [00000000] Irp is complete (CurrentLocation 4 > StackCount 3)
         827b83c0 [00000000] Irp is complete (CurrentLocation 4 > StackCount 3)
         827b8b20 [00000000] Irp is complete (CurrentLocation 4 > StackCount 3)
         827b9008 [00000000] Irp is complete (CurrentLocation 4 > StackCount 3)
         827b9d98 [00000000] Irp is complete (CurrentLocation 4 > StackCount 3)
         827bad98 [00000000] Irp is complete (CurrentLocation 4 > StackCount 3)
         827bb008 [00000000] Irp is complete (CurrentLocation 4 > StackCount 3)
         ...
         ffbddb28 [00000000] Irp is complete (CurrentLocation 4 > StackCount 3)
         ffbde008 [00000000] Irp is complete (CurrentLocation 4 > StackCount 3)
         ffbde3d8 [00000000] Irp is complete (CurrentLocation 4 > StackCount 3)
         ffbde648 [00000000] Irp is complete (CurrentLocation 4 > StackCount 3)
         ffbdeb28 [00000000] Irp is complete (CurrentLocation 4 > StackCount 3)
         ffbded98 [00000000] Irp is complete (CurrentLocation 4 > StackCount 3)

     There are 148,962 Irps listed in the output in total. Taking a look at the first in the list... !pool lets us confirm the allocation is from nonpaged pool and is an IRP, then !irp can give us some details on the I/O taking place, and !devstack lets us see the underlying device:

         1: kd> !pool 827b64a8
         Pool page 827b64a8 region is Nonpaged pool
          827b6000 size: 270 previous size: 0 (Allocated) P_. (Protected)
          827b6270 size: 230 previous size: 270 (Free) ....
         *827b64a0 size: 270 previous size: 230 (Allocated) *Irp
                 Pooltag Irp : Io, IRP packets
          827b6710 size: 270 previous size: 270 (Allocated) ..3. (Protected)
          827b6980 size: 1a0 previous size: 270 (Free) Attv
          827b6b20 size: 270 previous size: 1a0 (Allocated) Irp
          827b6d90 size: 270 previous size: 270 (Allocated) P_. (Protected)

         1: kd> !irp 827b64a8
         Irp is active with 3 stacks 4 is current (= 0x827b6584)
          No Mdl: No System Buffer: Thread 00000000: Irp is completed.
              cmd flg cl Device File Completion-Context
          [ 0, 0] 0 0 00000000 00000000 00000000-00000000
                 Args: 00000000 00000000 00000000 00000000
          [ 0, 0] 0 0 00000000 00000000 00000000-00000000
                 Args: 00000000 00000000 00000000 00000000
          [ f, 0] 0 0 89764618 00000000 bad750ac-89763748
                 \Driver\usbuhci usbhub!USBH_FdoIdleNotificationRequestComplete
                 Args: 00000000 00000000 00000000 00000000

         1: kd> !devstack 89764618
           !DevObj !DrvObj !DevExt ObjectName
           89763690 \Driver\usbhub 89763748 000000f6
         > 89764618 \Driver\usbuhci 897646d0 USBPDO-0
         !DevNode 89b2fa90 :
           DeviceInst is "USB\ROOT_HUB\4&56cb44e&0"
           ServiceName is "usbhub"

     I can see some processes that hint at something related to communications (USB, IrDA, Bluetooth):

         PROCESS 884f5020 SessionId: 0 Cid: 0554 Peb: 7ffd9000 ParentCid: 0400
             DirBase: 2f333000 ObjectTable: e15bd2e8 HandleCount: 62.
             Image: btwdins.exe
         PROCESS 88043430 SessionId: 0 Cid: 0c84 Peb: 7ffdf000 ParentCid: 04d4
             DirBase: 3dc31000 ObjectTable: e7f42c78 HandleCount: 235.
             Image: BTSTAC~1.EXE
         PROCESS facf5020 SessionId: 0 Cid: 1908 Peb: 7ffde000 ParentCid: 1560
             DirBase: 5d729000 ObjectTable: e8e88850 HandleCount: 67.
             Image: NclUSBSrv.exe
         PROCESS fa91c8c0 SessionId: 0 Cid: 1954 Peb: 7ffd9000 ParentCid: 1560
             DirBase: 47e38000 ObjectTable: e16a4260 HandleCount: 145.
             Image: NclBCBTSrv.exe
         PROCESS f9f7c020 SessionId: 0 Cid: 1708 Peb: 7ffd8000 ParentCid: 1560
             DirBase: 7ed19000 ObjectTable: e17ba830 HandleCount: 47.
             Image: NclIrSrv.exe
         PROCESS facf0020 SessionId: 0 Cid: 1504 Peb: 7ffdf000 ParentCid: 1560
             DirBase: 46b65000 ObjectTable: e67a6b60 HandleCount: 45.
             Image: NclRSSrv.exe

     And then there's always AV to consider:

         a6c30000 a6c441e0 naveng \??\C:\PROGRA~1\COMMON~1\SYMANT~1\VIRUSD~1\20090705.003\naveng.sys
         a6c45000 a6d19440 navex15 \??\C:\PROGRA~1\COMMON~1\SYMANT~1\VIRUSD~1\20090705.003\navex15.sys
         a9c23000 a9c40000 EraserUtilRebootDrv \??\C:\Program Files\Common Files\Symantec Shared\EENGINE\EraserUtilRebootDrv.sys
         a9c40000 a9c9e000 eeCtrl \??\C:\Program Files\Common Files\Symantec Shared\EENGINE\eeCtrl.sys
         a9d60000 a9dc2000 SPBBCDrv \??\C:\Program Files\Common Files\Symantec Shared\SPBBC\SPBBCDrv.sys
         a9e2c000 a9e6e000 symidsco \??\C:\PROGRA~1\COMMON~1\SYMANT~1\SymcData\SCFIDS~1\20090625.001\symidsco.sys
         a9e6e000 a9e97000 SYMFW \SystemRoot\System32\Drivers\SYMFW.SYS
         aa19a000 aa1ae000 Savrtpel \??\C:\Program Files\Symantec Client Security\Symantec AntiVirus\Savrtpel.sys
         aa1ae000 aa1d0000 SYMEVENT \??\C:\Program Files\Symantec\SYMEVENT.SYS
         aa1d0000 aa228000 savrt \??\C:\Program Files\Symantec Client Security\Symantec AntiVirus\savrt.sys

     First rule of troubleshooting a new problem - did you change or install anything recently? In particular anything related to USB, Bluetooth or chipset drivers? Maybe mobile phone sync software, or even fingerprint scanner drivers?

     Secondly, try to reduce the problem to its bare minimum - is there a particular piece of software that causes the problem to occur? Whilst running without AV is not a long-term solution, it's a valid test for problems that occur routinely - I would uninstall the Symantec software and see if the symptom disappears (note: disabling is not the same as uninstalling, the kernel drivers are still present and get involved in I/O).
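To get a feel for how much nonpaged pool those completed IRPs hold, the tally can be sketched in a few lines of Python. This is only a rough illustration, not part of the original analysis: it assumes the !irpfind output has been saved as text, and that every leaked IRP block is 0x270 bytes like the ones shown by !pool above (the dump may well contain other sizes). The names `count_completed_irps`, `leaked_bytes` and `sample` are made up for this sketch.

```python
import re

# Matches one "completed IRP" row of !irpfind output, for example:
# 827b64a8 [00000000] Irp is complete (CurrentLocation 4 > StackCount 3)
IRP_ROW = re.compile(r"^[0-9a-fA-F]{8}\s+\[[0-9a-fA-F]{8}\]\s+Irp is complete")

# Assumption: each leaked IRP block is 0x270 bytes, as in the sampled !pool output.
ASSUMED_BLOCK_SIZE = 0x270

# The 256MB nonpaged pool ceiling mentioned in the post.
NONPAGED_POOL_CAP = 256 * 1024 * 1024


def count_completed_irps(log_text: str) -> int:
    """Count rows in saved !irpfind output that report a completed IRP."""
    return sum(1 for line in log_text.splitlines() if IRP_ROW.match(line.strip()))


def leaked_bytes(irp_count: int, block_size: int = ASSUMED_BLOCK_SIZE) -> int:
    """Estimate the pool bytes held by irp_count blocks of block_size bytes."""
    return irp_count * block_size


# A three-row excerpt of the output; the header row must not be counted.
sample = """\
Irp [ Thread ] irpStack: (Mj,Mn) DevObj [Driver] MDL Process
827b64a8 [00000000] Irp is complete (CurrentLocation 4 > StackCount 3)
827b6b28 [00000000] Irp is complete (CurrentLocation 4 > StackCount 3)
ffbded98 [00000000] Irp is complete (CurrentLocation 4 > StackCount 3)
"""

if __name__ == "__main__":
    print("rows in sample:", count_completed_irps(sample))
    total = leaked_bytes(148_962)  # the total count reported in the post
    print("estimated bytes:", total)
    print("share of 256MB cap:", f"{total / NONPAGED_POOL_CAP:.1%}")
```

Under that size assumption, 148,962 blocks come to 92,952,288 bytes, roughly a third of the 256MB ceiling. That only shows the IRPs are a large consumer, not that they account for all of the exhaustion.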
  7. Nonpaged pool totally exhausted, something has leaked. The output from !poolused 7 will be long - it is sorted in descending order of nonpaged bytes, so the first few lines are the most interesting. This will give a clue as to the pooltags used for the allocations, and maybe a direct indicator as to who might have made them. AV filter drivers are common leakers of pool memory - what AV do you have installed? My comment on SP3 was intended as: "why isn't SP3 installed?"
  8. I think given the speed of Vista installation on a newly-created partition, it's a quick format used in that GUI stage. When I use a completely brand-spanking-new hard disk and first partition it, I tend to do a full format, and that's the only time I do. During Vista or Win7 setup, at the partition selection/setup stage, I hit Shift-F10 to get the command prompt up and use format from there before selecting the target partition. Newly created partitions on a brand new disk - full format. Existing partition which contains data - quick format.
  9. Actually, it might be some system resource getting exhausted... as you found, csrss.exe was the critical process that got killed: The line I think of interest, and its breakdown: And the "failed at" address is the module address in the thread that raised the exception (the process, csrss.exe): I would guess the page in the virtual address space for csrss.exe was paged out to disk, then at some point a context switch occurred to continue executing which incurred the inpage operation - but when pulling the data from disk the I/O failed, making the thread go boom, which terminates the process, and it was a critical process so we bugcheck. Most commonly in my experience the cause of failing inpage operations is a disk or disk controller failure (the device suddenly vanishes from the system), sometimes due to a driver fault or an I/O mode setting in the BIOS (e.g. AHCI being used)... however here there is the extra bit of info "Insufficient system resources exist to complete the API". The output from !vm might be useful, to see if it's pool memory or PTE shortage - of course there's a chance it could be a bogus status code if the origin is a dodgy CPU or heat related... Not running SP3?
  10. No worries, I see this a lot due to the unfortunate naming. OT - you know the 5th Edition of Windows Internals is out now, covering NT 6? Waiting for my copy to arrive.
  11. The .DEFAULT key under HKEY_USERS is actually used by the Local System user account, it has nothing to do with interactive or default users. The NTUSER.DAT in the Default User profile (on disk, not in the registry) is the template user profile registry hive used when users log on for the first time.
  12. There is a reference to AntiVir in there too (though not in the list of running processes at the time of the report - maybe tested and removed?). If you have more than one of Prevx, Avast! and AntiVir in the Add/Remove Programs list, uninstall (don't just disable) all but 1 to ensure the kernel filter drivers are not loaded. If the problem continues after a reboot, use Process Explorer to hover the mouse over the svchost.exe with high CPU utilization and the tooltip will show the services hosted by the process. Make a note of the list of services, then from an elevated command prompt you can enter the following (where XXX is the service name): sc config XXX type= own When you restart the service it will now create a separate svchost.exe for it - now you can track the CPU time back to an individual service in Process Explorer. To restore a service back to its default state, enter (in an elevated command prompt again): sc config XXX type= share
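When several services share the busy svchost.exe, typing the two sc commands for each one gets tedious. The sketch below only builds the command lines and does not run anything; `sc_type_command` is a hypothetical helper name, and `wuauserv` is just an illustrative service name. It assumes the value sc documents for the shared setting, `share`, and passes `type=` and its value as separate tokens because sc expects the space after the equals sign.

```python
from typing import List


def sc_type_command(service: str, isolate: bool) -> List[str]:
    """Build the argv for `sc config <service> type= own` or `type= share`.

    isolate=True gives the service its own svchost.exe after a restart;
    isolate=False puts it back into a shared host process.
    """
    if not service:
        raise ValueError("a service name is required")
    # "type=" and the value are separate tokens, matching the space sc expects.
    return ["sc", "config", service, "type=", "own" if isolate else "share"]


if __name__ == "__main__":
    # Illustrative service list noted from the Process Explorer tooltip.
    for name in ["wuauserv"]:
        print(" ".join(sc_type_command(name, isolate=True)))
        print(" ".join(sc_type_command(name, isolate=False)))
```

The resulting lists could be handed to `subprocess.run` from an elevated prompt, but printing them first and pasting them by hand is the safer way to test.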
  13. No worries, glad you got it sorted out. File association is one of the things I think became a little trickier rather than more intuitive with Vista.
  14. @Glen9999: By far the best mitigation you have already mentioned - running as a standard user rather than Administrator. The vast majority of malicious activity in my experience has been through social engineering and users not understanding the implications of clicking flashy things on the screen - reduce the user's power and the system becomes more secure implicitly. This has much more value when NTFS is used as the file system, otherwise there is no way to protect the OS files from any user able to log on (I've not seen first-hand any malware employing alternate data streams or locking down ACLs that the user could not unlock that would warrant using no form of protection on the file system). Deploying a client behind a NAT router (basically any home broadband router on the market these days) should provide protection against drive-by scans, but it's still worth having the Windows Firewall service running as it's so lightweight. Reading between the lines it looks like you may be setting up a PC for a not-so-IT-literate person and want to keep the system ticking over by itself - I would enable Automatic Updates to install hotfixes as it detects them, and have an AV product with realtime scanning and automatic updates (set up for a weekly full system scan too). "Security is the enemy of usability" a colleague of mine loves to cite frequently, so it depends on how far you want to go to protect the system from the user - if there are USB ports present and there will never be any USB devices connected, you can consider disabling them in the BIOS to cover another potential back door, for example. Automating cleaning of temp files can be dangerous, due to how they may be present during the lifetime of an application, or until they are cleaned up after a reboot - if you clean out *.TMP, for example, on a scheduled basis then you may run into a problem only after restarting (typically this can be seen for anything doing a self update).
Teaching the user how to make backups could be useful too - a system restore to a known good point in time can be much quicker than a reinstall of the OS and all the apps (though this is more a "reactive recovery" point in the event the system has been compromised or become unstable). @JustinStacey: The DNS Client service is the DNS name resolution cache, it's not a listening service - just curious as to what security hardening this achieves? Also, the Client for Microsoft Networks is the plumbing of the Workstation service on a per-interface basis, so it's necessary for outbound SMB and disabling this would break the machine's ability to browse other machines, if there are any. The File and Printer Sharing setting is the per-interface SMB plumbing for the Server service, so I agree it can be useful to disable this if you don't share resources on the LAN.
  15. How about the Windows key, does that now bring up the Start menu? If you enter C: in the Start/Search (or Start/Run) fields, does a window open up with the contents of that volume?
  16. By "explorer" do you mean the desktop & icons, or you can now successfully start an explorer.exe process and get a window up, or you can click on your user name from the Start menu and actually get a window up in which you can navigate between folders? And when you say you cannot enter any of your drives but can see the properties (through the right-click context menu?), do you get any error when you double-click on a drive letter, or does nothing happen, or does the window process hang?
  17. Creating a new user account is always a useful test, as the Administrator account is already present, only disabled. Having a new, never-logged-on-before and non-well-known-GUID user account log on is a useful method of distinguishing between a user profile issue and a system issue. So is the desktop back for all users, or just when you log on as this new test user?
  18. Can you create a new user account and log on as that user to verify if they have the same problem? Also, did I read correctly that if you boot from the Vista DVD you get nothing but the blue-ish background and a mouse pointer, you don't even get any menu at all?
  19. To be honest, if you just reinstalled the day before, I would wipe & start over, given how quick the installation is. It might save you a lot of headaches in the long run.
  20. It sounds like one or more of the many shell DLLs has a problem, or maybe some shell extension - did you do any kind of "takeown" under %systemroot%, or clean up of the WinSxS folder at some point in the past? Was the installation vLite'd? Or was there some custom/unattended installation used? I see a fair bit of file recovery being done in your Component Based Servicing log...
  21. Is there anything special about the location of gallery.exe? Is it on a removable drive, a UNC path, a folder with a custom ACL? I would test by renaming gallery.exe to temp.exe, then copy notepad.exe to where gallery.exe was and go through the test again - if the second Notepad entry appears then it would appear to be something up with gallery.exe itself (and you could afterwards close the Notepad-that-is-gallery.exe, delete it and rename temp.exe back to gallery.exe).
  22. As would every malware author out there, I bet. So the logged-on user isn't an admin, and each of the programs called from the batch file is attempting to run elevated (hence the multiple prompts) from a batch file that was not launched elevated... this is expected behaviour. The only way I would expect a user to be able to select such an option and not have privilege issues would be if a call was being made to a service running under a privileged user account - then the idea would be that the service does the job on behalf of the user (hopefully in a secured, non-exploitable manner). Similar to how AV interfaces allow a regular user to "clean" an infection when they would not have the explicit privilege to access the file. What exactly are these programs trying to do with the contents of the Recycle Bin?
  23. Admittedly I'm doing this test on Win7, not Vista, but the principle should be the same... I created a file on my desktop named test.gal, then copied Notepad.exe to gallery.exe on my desktop. I double-clicked test.gal and got prompted with the following: I selected the second option and clicked OK, then I got the "Open with" dialogue window: (The list of programs is built from apps registered in Windows through an installer, so it's unlikely to be the same on any 2 machines.) I clicked the Browse button and navigated to my Desktop folder, then double-clicked on the gallery.exe icon - I was returned to the previous "Open with" dialogue, now with the additional icon selected (it had the icon and name for "Notepad" as this comes out of the PE header, not the executable name). I clicked OK and the icon for the file on my desktop changed to that of Notepad, indicating the new association had worked. I double-clicked the file and it opened in Notepad. I ran Task Manager and on the Processes tab I verified the command line for the process was C:\Users\{user}\Desktop\gallery.exe (not C:\Windows\Notepad.exe). So when you go through the process of selecting gallery.exe and get returned to the "Open with" dialogue, it hasn't added Gallery to that list and selected it for you?
  24. Bit of a change from August 2008, now looks like this: Main machines for myself and my wife were upgraded to Core i7 w/12GB - my wife likes to play with rendering in Poser, and I play with games and virtual machines so we get the use of the RAM. My wife's old machine now acts as the server, Virtual Server replaced with Hyper-V and the web server finally migrated from a virtual W2K3 SP2 to virtual W2K8 SP2. With the higher-spec, Hyper-V capable server it made sense to set up a domain and run the DC as 1 virtual machine and a separate virtual machine for a file (and Squeezebox) server. The host machine is still standalone, but the VMs and clients 1 & 2 are now domain-joined (makes life easier for roaming profiles & folder redirection when testing builds of Windows 7). Client3 will likely end up being a pet project for having clustered Hyper-V hosts for highly available VMs, though I'd need to figure out something for the iSCSI targets... Edit: 6th October - upgraded Internet connection to 50/10, changed details for servers upgraded from W2K8 SP2 to W2K8 R2
  25. Well, poking the registry directly should be a last resort - it's only a database which has APIs specifically to update it in a controlled manner. If you double-click a .gal file, do you get prompted to select a program with which to open the file? If nothing at all happens, do you have an "Open with" option if you right-click the file?