
Windows 2003 gets disconnected from network



Hi all,

I recently installed two Windows 2003 Server Standard Edition servers.

At random times the network freezes. The server keeps working properly except for networking. It doesn't matter which server; both have the same problem.

Both servers are domain controllers running AD in mixed mode; one is the PDC and the other is the BDC.

After I disable and re-enable the network interface, network access comes back. So I suppose the problem is with the TCP/IP stack.

Hardware used: IBM BladeServer HS20 Model 8843.

Integrated NIC: Broadcom 57xx NetXtreme Gigabit Card.

I have tried updating drivers, installing the latest hotfixes and, most recently, installing Service Pack 1 for Windows 2003. I have been in contact with IBM and updated the BIOS and BMC, but with no improvement. The hardware is good (I have the same model running Windows 2000 Server and I don't have this problem there).

I think the "network hang" occurs when the two servers replicate domain information with each other.

I am very worried about this and really desperate. Please Help.

Thank you very much.

Albert.-



What does the Event Viewer say? Any errors reported there?

This may or may not work, but you may want to give it a shot as it solved a similar problem I had. Performing these actions will reset your TCP/IP properties and DNS server settings, so note those details down first.

At a cmd prompt, type:

ipconfig /release

ipconfig /flushdns

netsh int ip reset resetlog.txt

netsh winsock reset

Restart, then re-enter your details (IP address, DNS servers, etc.).


Have you tried to manually replicate the AD information? If not, install the Windows Support Tools and use replmon.exe to monitor replication and, if possible, verify whether the network hangs during the domain information exchange.
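
If a command-line check is easier than the replmon GUI, the same Support Tools package also includes repadmin and dcdiag. The lines below are only a sketch: DC1 is a hypothetical server name, so substitute your own domain controller, and compare the timestamps in the output with the times the network hangs.

```batch
rem DC1 is a hypothetical name; replace it with one of your domain controllers.
rem Show the inbound replication partners and the result of the last attempt.
repadmin /showrepl DC1

rem Ask DC1 to synchronize all of its naming contexts with its partners.
repadmin /syncall DC1 /A /e

rem Run the replication test from the domain controller diagnostics.
dcdiag /s:DC1 /test:Replications
```

If the hang shows up right after the /syncall step, that would support the idea that replication traffic triggers it; if it does not, that alone does not rule it out.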


Hi,

Thanks for your responses.

Jpatto,

Event Viewer shows nothing useful, only Win32Time and replication errors caused by the lack of network access.

The IP reset steps you wrote didn't work. I had already tried those steps before installing Service Pack 1, with no good results.

klasika,

I haven't tried replmon.exe. I think it is a good suggestion. I'll try it and let you know.

Marsden,

Using a different NIC is not possible. The BladeServer doesn't have this option. The NIC is integrated on the board.

Thanks.

...still trying new things.

Any suggestions ?


@Marsden - he won't be able to swap in another NIC, as the HS20s don't have NICs on the blades themselves. They connect to the backplane, which shares a common gigabit switch that usually trunks to your core switch.

@bertovic - Have you tried hard-coding the port speed of the NIC in the adapter settings and in the switch configuration? If not, consider that. Broadcom's network utility can also configure the port speed of the NIC on the blade.

Outside of that, the others' suggestions to check the logs and try manual replication are good ones.

Also check the adapter binding order in the Advanced Properties of Network Connections. If you are teaming NICs, make sure your team is set up correctly; refer to the IBM/Broadcom documentation.

Hope this helps.


Hi,

tguy,

Changing the port speed configuration is not an option. The NIC configuration properties dialog offers no way to do this; it seems that this NIC is designed to work at 1 Gb only. The other options available are: 802.1p QoS, checksum offload, flow control, jumbo MTU, large send offload and locally administered address.

I have also checked the switch module in the BladeCenter chassis and there is no option to change the port speed. The only parameters that I can change are state (enabled/disabled) and flow control (enabled/disabled).

In the advanced properties of Network Connections the binding order seems correct, but note that I am using the Microsoft iSCSI driver. The order in the connections listbox is: iSCSI Adapter (gigabit #2), LAN Adapter (gigabit #1), Remote Access Connections. I think I could put the LAN adapter in first place, couldn't I?

For the iSCSI adapter I have not checked any client (File and Printer Sharing and the Microsoft client are both unchecked).

In my tests I tried disabling the iSCSI Initiator service, with no results.

One more thing: I am not teaming NICs.

For the moment I have to try manual replication and look into Replication Monitor. I am new to this and I have to work out how to do it.

Thank you very much.


Just a stab in the dark but...

If your NIC is 1000M only, then you have made sure that you are connecting it to a 10/100/1000M (or 100/1000M, or 1000M-only) switch port, right?

Connecting it to a 10/100M port won't work properly, or at all, if it's 1000M only... er... right, guys?

Also, as I haven't got any 1000M connections: is there a length limit on CAT5e cable when running at 1000M?

Just a few random things off the top of my head ;)

Regards,

N.


Hi,

it_ybd,

Yes, I know. The hardware is simply designed this way. The IBM BladeServer HS20 connects to the BladeCenter chassis backplane, which shares a common gigabit switch with 14 internal ports at 1 Gb only and 4 external ports at 10/100/1000 that you can trunk to your core switch. So the problem is not the speed, the switch or the cable.

Thanks for your thoughts.

For the moment, to get around this problem, I created a scheduled task that disables and re-enables the adapter when it freezes.

=================================

chekif.cmd

=================================

*******************************************

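@echo off
rem chekif.cmd: ping a known host and restart the LAN interface if there is no reply.

SET TESTIPADDR=192.168.61.20
SET LOGFILE=c:\windows\iferror.log

ping %TESTIPADDR%
rem ping returns errorlevel 0 when at least one reply was received.
if %errorlevel% LSS 1 goto okay

echo *ERROR* >> %LOGFILE%
date /T >> %LOGFILE%
time /T >> %LOGFILE%
echo Interface Restart >> %LOGFILE%
rem %~dp0 is the folder of this batch file; both scripts are assumed to sit next to it.
rem cscript runs them in the console, one after the other.
cscript //nologo "%~dp0ifstop.vbs"
cscript //nologo "%~dp0ifstart.vbs"
echo ----------------------------------- >> %LOGFILE%
goto fin

:okay
echo *OK*

:fin
echo Return=%errorlevel%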

******************************************

=================================

ifstop.vbs

=================================

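' ifstop.vbs: disable the LAN connection through the Network Connections shell folder.
' The folder name and verb captions below come from a Spanish-language Windows
' install; they must match the localized UI strings on the server.
Const ssfCONTROLS = 3

sConnectionName = "LAN"
sEnableVerb = "&Activar"
sDisableVerb = "&Desactivar"

Set shellApp = CreateObject("Shell.Application")
Set oControlPanel = shellApp.Namespace(ssfCONTROLS)

' Locate the Network Connections folder inside Control Panel.
Set oNetConnections = Nothing
For Each folderitem In oControlPanel.Items
    If folderitem.Name = "Conexiones de red" Then
        Set oNetConnections = folderitem.GetFolder
        Exit For
    End If
Next
If oNetConnections Is Nothing Then
    ' WScript.Echo prints to the console under cscript, so an unattended run does not block.
    WScript.Echo "Couldn't find the 'Conexiones de red' folder"
    WScript.Quit
End If

' Locate the connection by name (case-insensitive).
Set oLanConnection = Nothing
For Each folderitem In oNetConnections.Items
    If LCase(folderitem.Name) = LCase(sConnectionName) Then
        Set oLanConnection = folderitem
        Exit For
    End If
Next
If oLanConnection Is Nothing Then
    WScript.Echo "Couldn't find '" & sConnectionName & "' item"
    WScript.Quit
End If

' If the Enable verb is offered, the connection is currently disabled.
bEnabled = True
Set oEnableVerb = Nothing
Set oDisableVerb = Nothing
For Each verb In oLanConnection.Verbs
    If verb.Name = sEnableVerb Then
        Set oEnableVerb = verb
        bEnabled = False
    End If
    If verb.Name = sDisableVerb Then
        Set oDisableVerb = verb
    End If
Next

' Disable only when the connection is enabled and the Disable verb was found.
If bEnabled And Not (oDisableVerb Is Nothing) Then
    oDisableVerb.DoIt
End If

' Give the shell time to finish before the script exits.
WScript.Sleep 8000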

=================================

ifstart.vbs

=================================

' ifstart.vbs: enable the LAN connection through the Network Connections shell folder.
' The folder name and verb captions below come from a Spanish-language Windows
' install; they must match the localized UI strings on the server.
Const ssfCONTROLS = 3

sConnectionName = "LAN"
sEnableVerb = "&Activar"
sDisableVerb = "&Desactivar"

Set shellApp = CreateObject("Shell.Application")
Set oControlPanel = shellApp.Namespace(ssfCONTROLS)

' Locate the Network Connections folder inside Control Panel.
Set oNetConnections = Nothing
For Each folderitem In oControlPanel.Items
    If folderitem.Name = "Conexiones de red" Then
        Set oNetConnections = folderitem.GetFolder
        Exit For
    End If
Next
If oNetConnections Is Nothing Then
    ' WScript.Echo prints to the console under cscript, so an unattended run does not block.
    WScript.Echo "Couldn't find the 'Conexiones de red' folder"
    WScript.Quit
End If

' Locate the connection by name (case-insensitive).
Set oLanConnection = Nothing
For Each folderitem In oNetConnections.Items
    If LCase(folderitem.Name) = LCase(sConnectionName) Then
        Set oLanConnection = folderitem
        Exit For
    End If
Next
If oLanConnection Is Nothing Then
    WScript.Echo "Couldn't find '" & sConnectionName & "' item"
    WScript.Quit
End If

' If the Enable verb is offered, the connection is currently disabled.
bEnabled = True
Set oEnableVerb = Nothing
Set oDisableVerb = Nothing
For Each verb In oLanConnection.Verbs
    If verb.Name = sEnableVerb Then
        Set oEnableVerb = verb
        bEnabled = False
    End If
    If verb.Name = sDisableVerb Then
        Set oDisableVerb = verb
    End If
Next

' Enable only when the connection is currently disabled.
If Not bEnabled Then
    oEnableVerb.DoIt
End If

' Give the shell time to finish before the script exits.
WScript.Sleep 8000

*******************************************

*******************************************

Note that this is a temporary workaround, not the solution.

Has anyone had a similar problem before? @Jpry565?

Regards.
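
For anyone who wants to reproduce the scheduled task, it could be registered from the command line roughly like this. This is only a sketch under assumptions: the C:\scripts folder and the task name CheckLan are hypothetical, and the 5-minute interval is just an example.

```batch
rem Assumes chekif.cmd, ifstop.vbs and ifstart.vbs are saved in C:\scripts (hypothetical path).
rem Create a task named "CheckLan" (hypothetical name) that runs every 5 minutes.
schtasks /create /tn "CheckLan" /tr "C:\scripts\chekif.cmd" /sc minute /mo 5

rem List the scheduled tasks to confirm that it was registered.
schtasks /query
```

Because the .vbs scripts drive the Network Connections folder through the shell, the task probably has to run under an account that can reach that folder, so test it by hand before relying on it.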


Actually, it could be your switch. Depending on how it is set up, a switch port can disable itself when it exceeds a certain traffic load.

What about half duplex, full duplex and auto-negotiate? Check those options. Are they consistent with the other devices connecting to your servers?

The reason I'm suggesting the switch is that when you disable your network card, traffic stops going through the switch port, which in turn re-enables the port. When the load exceeds the threshold again, the port is disabled again.

We had a 100 Mb fiber uplink to a cabinet with 8 switches. From there they were configured as a stack through gigabit ports. Although this worked, it wasn't very reliable and we would experience intermittent network connectivity.

If everything on your BladeCenter chassis is gigabit and it then links into your core switch at 100 Mb (assuming!), that could be your problem.

Just a thought

Minus Human


I have run into a similar situation with a 2k3 box (DC, AD, DNS, DHCP, IIS, Exchange 2k3). Whenever I use Remote Desktop to run updates, etc., and the process requires a restart, the box comes back up but email and IIS are down, and you can't RDC back to the server. It does this whether I let it auto-restart or restart it through Windows Security. If I log on locally to the box, it can reach the internet and ping the whole network, but nothing on the network (nor outside it, for email and web pages) can connect to it. The only way to get it up and running again is to uninstall the NIC. I still haven't figured out why it does this.


Hi,

Thanks for your replies, but I have ruled out hardware-related problems. Before installing Windows 2003 Server I ran Windows 2000 Server with no problems, so I rule out NIC problems and switch problems.

The network problems began when I installed Windows 2003 Server. It was a clean install (not a migration). The problem is that I can't roll back to 2000.

This is very strange... and I am really desperate.

For the moment I will place a service call with MS, which of course I will have to pay for.

Thanks.
