Posted

I have a number of 2003 R2 x64 and x86 SP1 servers attached to a fabric-switched IBM Fibre-Channel SAN. The servers are all IBM xSeries servers attached using QLogic QLA2340 HBAs. We are using MPIO and M$'s Storport driver (the latest version, of course) for multipathing on all servers. Furthermore, we're using IBM's StorageManager Agents (again, latest version) on all hosts. Also part of that SAN is a Quantum PX502 robotic tape library, which is also Fibre-Channel and attached directly to the SAN (i.e. not physically attached to a server). We are not using any kind of SAN partitioning, so all hosts attached to the SAN see the tape drives and robot.

Here's what happens. After rebooting the tape library, some or all of my x64 servers BSOD with a 0x0A stop error and your typical IRQL_NOT_LESS... message. x86 servers have yet to be affected. Debugging the resulting memory dump shows that storport.sys is the culprit. Additionally, shortly before the server BSODs, the system event log has entries from PlugPlayManager saying that the tape drives and robot disappeared without being prepared for removal (Event ID 12). Obviously, preparing the hardware for removal on all my servers is out of the question; besides, the hardware never shows up in the list of items that can be safely removed.

I'm very aware that SP2 is out for 2k3, and I intend to install that someday (once I recover from all the late-night work I've had to put in dealing with this problem); however, I'm not confident that will solve the problem since I will still have the same version of the storport driver.

So, short of calling M$ and paying for a support incident, any other bright ideas? I'd appreciate being spared the basic "update firmware" and "update driver" suggestions, as those are obvious and already done.

Thanks guys (and gals).


Posted

How did you determine storport.sys was the culprit, again? I'm just curious. Also, what version of storport.sys do you have, and is it certified to work with your HBA driver version? The latest storport is 5.2.3790.2929 for SP1, and 5.2.3790.4073 for SP2, btw.

Posted (edited)

I ran the crash dump file that was generated after the BSOD (MEMORY.DMP) through M$'s WinDbg program and got the information below (ignore the symbol errors). And yes, this version of storport (5.2.3790.2880) is certified to work with our HBAs. We won't run a newer version of storport until it is "generally released" and, as such, better tested.

Windows Server 2003 Kernel Version 3790 (Service Pack 1) MP (4 procs) Free x64
Product: Server, suite: Enterprise TerminalServer SingleUserTS
Built by: 3790.srv03_sp1_gdr.060315-1609
Kernel base = 0xfffff800`01000000 PsLoadedModuleList = 0xfffff800`011d60c0
Debug session time: Sat Mar 17 23:39:08.656 2007 (GMT-6)
System Uptime: 0 days 21:16:50.781
*********************************************************************
* Symbols can not be loaded because symbol path is not initialized. *
* *
* The Symbol Path can be set by: *
* using the _NT_SYMBOL_PATH environment variable. *
* using the -y <symbol_path> argument when starting the debugger. *
* using .sympath and .sympath+ *
*********************************************************************
*** ERROR: Symbol file could not be found. Defaulted to export symbols for ntkrnlmp.exe -
Loading Kernel Symbols
....................................................................................................
...........
Loading unloaded module list
....
Loading User Symbols
*******************************************************************************
* *
* Bugcheck Analysis *
* *
*******************************************************************************

Use !analyze -v to get detailed debugging information.

BugCheck D0, {a9943, 2, 1, fffff800011a70c5}

***** Kernel symbols are WRONG. Please fix symbols to do analysis.

*************************************************************************
*** ***
*** ***
*** Your debugger is not using the correct symbols ***
*** ***
*** In order for this command to work properly, your symbol path ***
*** must point to .pdb files that have full type information. ***
*** ***
*** Certain .pdb files (such as the public OS symbols) do not ***
*** contain the required information. Contact the group that ***
*** provided you with these symbols if you need this command to ***
*** work. ***
*** ***
*** Type referenced: nt!_KPRCB ***
*** ***
*************************************************************************
*************************************************************************
*** ***
*** ***
*** Your debugger is not using the correct symbols ***
*** ***
*** In order for this command to work properly, your symbol path ***
*** must point to .pdb files that have full type information. ***
*** ***
*** Certain .pdb files (such as the public OS symbols) do not ***
*** contain the required information. Contact the group that ***
*** provided you with these symbols if you need this command to ***
*** work. ***
*** ***
*** Type referenced: nt!_KPRCB ***
*** ***
*************************************************************************
*** ERROR: Symbol file could not be found. Defaulted to export symbols for storport.sys -
Probably caused by : storport.sys ( storport!StorPortQuerySystemTime+229c )

Edited by ErEkoSuave

Posted

Could you perhaps upload the dump here? Running !analyze -v is not likely to give you a good root cause (especially on a STOP 0x0A), although it could still be storport.sys. I'd like to take a look, if you don't have any objections.
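
A side note for anyone reading along: the BugCheck line in the WinDbg output above reports code D0, not 0A. As far as Microsoft's bug check reference goes, 0x0A (IRQL_NOT_LESS_OR_EQUAL) and 0xD0 (DRIVER_CORRUPTED_MMPOOL) use the same four-parameter layout: the memory address referenced, the IRQL at the time, a read/write flag, and the address of the code that made the reference. The snippet below is only a rough sketch of how that line can be read under that assumption. The function name `decode_bugcheck` and the returned field names are made up for illustration; none of this comes from WinDbg itself, and it is no substitute for loading proper symbols.

```python
import re

# Names for the two bug check codes that come up in this thread.
# Assumption: both follow the documented four-parameter layout
# (address referenced, IRQL, read/write flag, referencing code address).
BUGCHECK_NAMES = {
    0x0A: "IRQL_NOT_LESS_OR_EQUAL",
    0xD0: "DRIVER_CORRUPTED_MMPOOL",
}

# Matches a line such as: BugCheck D0, {a9943, 2, 1, fffff800011a70c5}
BUGCHECK_RE = re.compile(r"BugCheck\s+([0-9A-Fa-f]+),\s*\{([^}]*)\}")


def decode_bugcheck(line):
    """Split a WinDbg 'BugCheck' summary line into labelled fields.

    Hypothetical helper for illustration only.
    """
    match = BUGCHECK_RE.search(line)
    if match is None:
        raise ValueError("not a BugCheck summary line")

    code = int(match.group(1), 16)
    args = [int(part.strip(), 16) for part in match.group(2).split(",")]
    if len(args) != 4:
        raise ValueError("expected exactly four bug check parameters")

    referenced, irql, access, instruction = args
    return {
        "code": code,
        "name": BUGCHECK_NAMES.get(code, "unknown"),
        "referenced_address": referenced,
        "irql": irql,
        # Bit 0 of the third parameter: 0 = read, 1 = write.
        "operation": "write" if access & 1 else "read",
        "faulting_instruction": instruction,
    }


if __name__ == "__main__":
    info = decode_bugcheck("BugCheck D0, {a9943, 2, 1, fffff800011a70c5}")
    print(hex(info["code"]), info["name"])
    print("referenced address:", hex(info["referenced_address"]))
    print("IRQL:", info["irql"])
    print("operation:", info["operation"])
    print("faulting instruction:", hex(info["faulting_instruction"]))
```

Read this way, the line from the dump describes a write to address 0xa9943 at IRQL 2 (DISPATCH_LEVEL). That still says nothing about which driver is really at fault, which is why a look at the full dump with correct symbols is the sensible next step.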

Posted

The dump is 843 MB (the server is set to do a complete kernel dump). I doubt your boards will allow files of that size to be uploaded. B)

Posted

Make sure the file is zipped. Check your PM for an upload location.
