Advice on a 64bit system upgrade


Posted

Thanks for keeping us up-to-date!

Tell the people at Supermicro they're awesome! I, for one, am very impressed! :thumbup

That's how customer support should work, but dedication like that is beyond rare nowadays, regrettably.

Good to hear from you.

Cheers!

  • 3 weeks later...

Posted

Hi again, and sorry again for the prolonged silence!

The Supermicro support guys now seem to agree with me that the PECI Agent 2 temperature monitoring on my second processor is almost certainly faulty.

Although apparently processor 1 is always more stressed than processor 2 in dual processor systems, the fact that my processor 2 is always reading "low" on the temperature monitoring system, even under stress, cannot be right.

They think it's a hardware fault, which of course is almost certainly not fixable.

 

I suppose that's the chance you take with buying a used motherboard, but I don't think it's really a problem as long as I'm aware of it.

I am reasonably happy with the system now, getting the better coolers has made all the difference, and I'm only very occasionally getting overheat alarms now, even with all the covers on.

 

The single-bit ECC errors are now being reported again, usually from one of the "new" DIMMs, but occasionally from the other one too.

Research seems to indicate that this isn't actually a problem, as the whole ECC memory system is designed to cope with that level of error, and it's really just a warning that the memory modules could be on their way to more serious errors that can't be coped with.
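To make the "designed to cope" point concrete, here is a toy sketch of single-error-correcting, double-error-detecting (SECDED) coding. It uses a tiny 8-bit codeword protecting 4 data bits. ECC DIMMs store 72 bits for every 64 data bits, so this is only an illustration of the principle, not the code the memory controller on this board actually runs. The function names `encode` and `decode` are made up for the example.

```python
def encode(nibble):
    """Encode 4 data bits into an 8-bit SECDED codeword (list of 0/1)."""
    d = [(nibble >> i) & 1 for i in range(4)]
    code = [0] * 8                      # index 0: overall parity, 1..7: Hamming positions
    code[3], code[5], code[6], code[7] = d
    code[1] = code[3] ^ code[5] ^ code[7]   # parity over positions 1, 3, 5, 7
    code[2] = code[3] ^ code[6] ^ code[7]   # parity over positions 2, 3, 6, 7
    code[4] = code[5] ^ code[6] ^ code[7]   # parity over positions 4, 5, 6, 7
    code[0] = sum(code[1:]) % 2             # makes the total number of 1s even
    return code


def decode(code):
    """Return (status, nibble); status is 'ok', 'corrected' or 'uncorrectable'."""
    code = list(code)
    syndrome = 0
    for pos in range(1, 8):
        if code[pos]:
            syndrome ^= pos                 # XOR of the positions holding a 1
    overall_odd = sum(code) % 2 == 1

    if syndrome == 0 and not overall_odd:
        status = "ok"
    elif overall_odd:
        # Exactly one bit flipped: the syndrome is its position
        # (0 means the overall parity bit itself was hit).
        code[syndrome] ^= 1
        status = "corrected"
    else:
        # Non-zero syndrome with even overall parity: two bits flipped.
        return "uncorrectable", None

    nibble = code[3] | (code[5] << 1) | (code[6] << 2) | (code[7] << 3)
    return status, nibble


if __name__ == "__main__":
    word = encode(0b1011)
    word[5] ^= 1                            # simulate a single-bit memory error
    print(decode(word))                     # data is recovered and the event is flagged
    word[2] ^= 1                            # a second flipped bit in the same word
    print(decode(word))                     # detected, but cannot be repaired
```

The "corrected" case corresponds to the single-bit events being logged: the data comes back intact, and the log entry is just a record that a repair happened. The "uncorrectable" case is the multi-bit error that is still worth logging.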

There is an option in the BIOS to switch off the reporting of ECC errors, so I'm wondering if I should just do that, on the basis that unless it gets a lot worse it won't affect the performance of the system anyway.

:)

Posted

Good to hear from you!
 

The Supermicro support guys now seem to agree with me that the PECI Agent 2 temperature monitoring on my second processor is almost certainly faulty.
Although apparently processor 1 is always more stressed than processor 2 in dual processor systems, the fact that my processor 2 is always reading "low" on the temperature monitoring system, even under stress, cannot be right.
They think it's a hardware fault, which of course is almost certainly not fixable.

 

If that's the case, you may ignore or disable the sensors and measure the true temperature directly with thermocouples or thermal diodes. Here's a reference. It is, of course, not the only way to do it, just a good description of one of the many possible methods.

Posted (edited)

Thanks Den, but that does look a bit too complicated to go into!

I assume that Core Temp is reading the correct temperatures anyway, as I gather that reads straight off the processors' sensors, not using the PECI system.

:)

BTW, I guess my entry on the "Day-to-day running Win 9x/ME with more than 1 GiB RAM" thread now needs updating!

Edited by Dave-H
Posted

Do you have access to the readings from the individual processors available to compare with the PECI readings?

That sure is better than what I was suggesting!

 

BTW, PM me the new entry for the list and I shall duly add it, OK?

Posted

PM sent.

 

Post #107 has a grab of Core Temp's output.

The guy at Supermicro said it reads directly from the processors and does not use the PECI system.

Personally I'd much rather have actual temperature readings anyway than just the "high" "medium" or "low" readings.

That seems rather useless to me! Not one of Intel's better ideas!

 

What do you think about the ECC errors?

I've found that I can switch off the logging of just the single-bit ones in the BIOS, so it only records multi-bit errors now.

I'm hoping that's OK to do.

:)

Posted

Well, I'm not used to working with ECC RAM...

I sure don't like any hardware errors around, but ECC RAM is made to recover from errors.

That said, I think it odd that those errors should be so frequent.

I think you should both search the 'net for more info and ask your contacts at Supermicro about it.

They've been so helpful up to now, I'm sure they can give you reliable guidance regarding ECC RAM, since their boards always use it (or, at least, support it).

Having just looked at the pic in post # 107:

Core Temp does give what you want... the first column is current temp, core by core; the next two columns keep record of the maximum and minimum reached during the monitored period, so they help you become aware of peaks of temp when you're not at the console. Seems pretty complete for my taste. So, if you have balanced readings from both processors, then, of course, the PECI system is really faulty, but irrelevant, because you've got a much better way to keep track of the temps, using Core Temp. Great!
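As a rough illustration of the current/minimum/maximum bookkeeping described above, here is a minimal sketch. It is not how Core Temp is implemented. The class name `CoreTempLog`, the alarm threshold, and the sample readings are all hypothetical, and actually reading the sensors is left out.

```python
class CoreTempLog:
    """Keep the current, lowest and highest reading seen for each core."""

    def __init__(self, alarm_c):
        self.alarm_c = alarm_c      # overheat threshold in degrees Celsius (assumed value)
        self.current = {}
        self.lowest = {}
        self.highest = {}

    def record(self, core, temp_c):
        """Store one reading; return True if it is at or above the alarm threshold."""
        self.current[core] = temp_c
        self.lowest[core] = min(temp_c, self.lowest.get(core, temp_c))
        self.highest[core] = max(temp_c, self.highest.get(core, temp_c))
        return temp_c >= self.alarm_c


if __name__ == "__main__":
    log = CoreTempLog(alarm_c=80)
    # Made-up readings for one core of each processor.
    for core, temp in [("CPU0 core 0", 52), ("CPU1 core 0", 49),
                       ("CPU0 core 0", 71), ("CPU1 core 0", 68),
                       ("CPU0 core 0", 58), ("CPU1 core 0", 55)]:
        if log.record(core, temp):
            print("overheat:", core, temp)
    for core in log.current:
        print(core, log.current[core], log.lowest[core], log.highest[core])
```

If both processors show highs and lows that rise and fall together under load, as in the made-up numbers here, that supports the view that the PECI "low" reading is what is wrong, not the processor.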

Posted

Yes, I think Core Temp will do everything I need, it even has its own overheat notification system.

I wish you could completely turn off the PECI monitoring system, but I see no way of doing that unfortunately.

 

I had the same ECC errors on one of the four DIMMs that came with the motherboard, which is why I decided to get the others.

Also I reasoned that two DIMMs weren't going to produce as much heat as four, so I went for a couple of 4GB ones which doubled the amount of memory with half the number of modules!

It seemed fine at first, but now I'm getting the same error messages from one of the replacements, and occasionally from the other one.

Again I guess that's the risk you take when you buy used hardware!

I have set it to hopefully still log multi-bit errors, so I will know if it gets worse.

:)

Posted (edited)

FWIW, you might -carefully- check the RAM contacts on the MoBo. I've used an -extremely- fine grit piece of sandpaper (folded) and run it through the slot to clear off the contacts then blown the residue out. It did help. Remember, -carefully- since they're -spring-loaded- so make sure they fully contact/"spring". Just a suggestion. :unsure:

Edited by submix8c
Posted

Thanks for that.

The first thing I did was properly clean the contacts on the DIMMs, but I only sprayed Isopropanol Cleaning Solvent into the sockets, as I was a bit scared of actually doing physical damage if I put any cleaning tool into them.

I convinced myself anyway that it was the modules that were faulty, not the sockets, because the errors moved to another slot when I swapped the modules.

That was a great relief as you can imagine!

If it persists I will try what you suggest though.

:)

Posted

Did you clean the contacts on the modules, too? I bet you did, but maybe not this way: for that I habitually use a soft white rubber eraser intended for retouching pencil drawings. They are somewhat softer than the erasers that sometimes come on the end of pencils, but they do a great job of cleaning module contacts, and you don't run the risk of shorting any contacts if a few rubber crumbs stay stuck to the module after you brush it off before mounting. BTW, do your new modules have heat spreaders/dissipators?

Posted (edited)

Yes I did clean the modules' contacts, as I said, but only with cleaning solvent on a cotton bud.

Perhaps I'll try the rubber eraser method next time I've got the covers off the machine.

The DIMMs do have some sort of metal heatsinks on them.

 

[Attached image: post-84253-0-02052300-1413826304_thumb.j]

 

These are the original 1GB DIMMs that came with the motherboard, but the present 4GB ones look physically the same.

When they say "hot surface" they're not kidding!

 

Just going ahead a bit now, and sorry this is probably now not in the right forum, but I'm starting to think about my 64 bit OS installation, which was the whole point of the motherboard upgrade.

I ran the MS Windows 8 compatibility checker, and it said that my processors "might" not support NX, which I gather is a malicious code execution prevention technology. If they don't, Windows 8/8.1 will not install.

According to the Sysinternals CoreInfo utility, my processors do support NX!

The MS utility says they don't or it's disabled in the BIOS, but there is no BIOS option for it that I can see.
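For reference, Windows 8 checks for three CPU features (PAE, NX and SSE2), and each one is a single bit in the CPUID registers. The sketch below only decodes those bits from register values that have already been obtained some other way, for example from a dump produced by a CPUID tool. The function name `win8_cpu_features` and the sample register values are illustrative assumptions, not values taken from this machine.

```python
# Bit positions as documented in the Intel and AMD CPUID references.
PAE_BIT = 6      # CPUID leaf 0x00000001, register EDX
SSE2_BIT = 26    # CPUID leaf 0x00000001, register EDX
NX_BIT = 20      # CPUID leaf 0x80000001, register EDX


def win8_cpu_features(leaf1_edx, ext_leaf1_edx):
    """Report whether the three features Windows 8 requires are advertised.

    leaf1_edx     -- EDX returned by CPUID leaf 0x00000001
    ext_leaf1_edx -- EDX returned by CPUID leaf 0x80000001
    """
    return {
        "PAE": bool((leaf1_edx >> PAE_BIT) & 1),
        "SSE2": bool((leaf1_edx >> SSE2_BIT) & 1),
        "NX": bool((ext_leaf1_edx >> NX_BIT) & 1),
    }


if __name__ == "__main__":
    # Illustrative register values only.
    print(win8_cpu_features(0xBFEBFBFF, 0x20100000))   # NX bit advertised
    print(win8_cpu_features(0xBFEBFBFF, 0x20000000))   # NX bit not advertised
```

A tool that reports NX as present and an installer check that reports it as missing are both, in the end, looking at whether that one bit is advertised, so a disagreement between them usually comes down to how or when each one reads it.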

I will obviously ask Supermicro about this, but I'm now worried that I could pay out for a full 64 bit install of Windows 8.1 Pro, only to find that it refuses to install! Windows 7 will be fine, but I'm still tempted to go for 8.1, simply because it is the most up to date OS, which I'll get the longest support for.

:no:

Edited by Dave-H
Posted

You're much better served by a 7 Ultimate 64 than by anything MS did afterwards, I regret to say, so don't hurry.

Your processor sure should support NX, and CPU-Z is the standard way to confirm it. What does CPU-Z say?

Posted (edited)

Well, CPU-Z doesn't seem to mention NX, at least not by name, which is even more worrying!

I've attached the report so you can have a look.

To me this would seem to indicate that if the processors do support it, it's disabled somehow, but as I said I can't find any specific entry in the BIOS for it.

:no:

CPU-Z.txt

Edited by Dave-H
