IIS web server log files get corrupted after power failure


Recommended Posts

Guest wsxedcrfv
Posted

This is an NT4 server running IIS 4 (hey, sure I can update it to Server 2008, but it runs fine so why mess with it?).

The machine is used as a web server (a very simple web site), and a log file recording the details of each web hit is created every day. Usually the files are 100 KB to 500 KB each. The current day's log file is locked until a little after 12:00 am the next day, when the file is closed and the next day's file is created and opened.

The machine is temporarily running without the benefit of a UPS. What has been happening is that after the machine is restarted following a power failure, the data in the previous 7 to 13 log files is overwritten with null characters. The file-modified dates and the file sizes are untouched. I'm not sure if it's always the same number of log files that get trashed - I know it's been 13 days on at least 2 occasions. I should really say 13.5 log files - it's usually the case that part of one file (the oldest) will be trashed. It will have log entries that start a little after midnight, but the last entry will be from 10 or 11 am, and the rest of the file is replaced with null characters.
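
In case it helps, here's the kind of check I'm describing - a minimal C sketch (nothing IIS-specific, and the file name is just a made-up example) that scans a log file and reports where the real entries end and the null padding begins:

```c
/* Minimal sketch: report where readable log data ends and NUL padding
   begins. "ex100314.log" is a made-up IIS-style log file name. */
#include <stdio.h>

int main(void)
{
    const char *path = "ex100314.log";  /* hypothetical file name */
    FILE *f = fopen(path, "rb");
    long offset = 0, run_start = -1;
    int c;

    if (f == NULL) { perror(path); return 1; }

    while ((c = fgetc(f)) != EOF) {
        if (c == '\0') {
            if (run_start < 0)
                run_start = offset;  /* possible start of the NUL run */
        } else {
            run_start = -1;          /* real data again, reset */
        }
        offset++;
    }
    fclose(f);

    if (run_start >= 0)
        printf("%s: last %ld of %ld bytes are NUL padding\n",
               path, offset - run_start, offset);
    else
        printf("%s: no trailing NUL padding\n", path);
    return 0;
}
```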

Is this a file-system error, or some other more complicated mess related to IIS and the way it handles logging? I would have thought that once a log file has been closed, IIS would never touch it again.

Anyone know what's going on here?


Posted

Usually on an NTFS-based system, corrupted files are actually corrupted (even text files), so seeing a file size *exactly* the same but filled with garbage leads me to believe that file caching is the more likely culprit here. When the system goes down, whatever hasn't actually been *written* to the disk itself (it's still either in system memory or in the disk's cache) is what you see as NULL chars in the file. Obviously fixing the power problem would be the better solution, and I don't remember NT4 much these days, but disabling write caching on those log disks, if that's possible, is probably a good idea. You really shouldn't be caching on any disk that holds log or transaction files anyway, for exactly this reason.
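
For an application you control (I don't believe IIS 4's own logging can be configured this way), the cache can also be bypassed per file handle. A minimal Win32 sketch, assuming a C compiler; the file name and sample record are made up for illustration:

```c
/* Minimal Win32 sketch: write log records through the cache.
   FILE_FLAG_WRITE_THROUGH tells the OS not to lazily cache the writes,
   and FlushFileBuffers() forces anything still pending to the disk. */
#include <windows.h>
#include <stdio.h>

int main(void)
{
    HANDLE h = CreateFileA("myapp.log",        /* made-up log file name */
                           FILE_APPEND_DATA,
                           FILE_SHARE_READ,    /* others can still read it */
                           NULL,
                           OPEN_ALWAYS,
                           FILE_ATTRIBUTE_NORMAL | FILE_FLAG_WRITE_THROUGH,
                           NULL);
    if (h == INVALID_HANDLE_VALUE) {
        fprintf(stderr, "CreateFile failed: %lu\n", GetLastError());
        return 1;
    }

    const char rec[] = "2010-03-14 00:01:02 GET /index.html 200\r\n";
    DWORD written;
    if (!WriteFile(h, rec, sizeof(rec) - 1, &written, NULL))
        fprintf(stderr, "WriteFile failed: %lu\n", GetLastError());

    FlushFileBuffers(h);  /* push any remaining dirty data to the platters */
    CloseHandle(h);
    return 0;
}
```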

Guest wsxedcrfv
Posted

seeing a file size *exactly* the same but filled with garbage leads me to believe that file caching is the more likely culprit here. When the system goes down, whatever hasn't actually been *written* to the disk itself...

What I don't understand is that these log files *have* been written to the disk, because they are accessible over the network by other computers. If the data that is supposed to be contained in those files exists *ONLY* in a write-cache, then how could they be accessible? And we're talking about data that is up to 13 days old. I can't believe that an NTFS-based operating system would be caching write data for that long, unless NTFS is more of a bull-crap file system than I already think it is.

I have little to no respect for NTFS as it is, and this just makes it look worse to me compared to FAT32. These are files that *have* been written to the hard drive. Why NTFS should reach back and wipe them out days later is hard to understand.

Is there anything to the idea that NTFS has changed or been improved since NT4 SP6, and that what I'm seeing would not happen if the machine were running 2K or XP?

Also - the drive in this NT4 system has been slaved to an XP system periodically (virus scan, defrag, etc.) and I understand that whenever a newer version of NT (like XP) touches an older NTFS drive, it does something to the file system to make it essentially a newer version of NTFS. When that happens, and the drive is allowed to boot back into NT4, will that cause a problem? This problem?

Posted

Indeed, NTFS has been improved since NT4 (the NTFS version even changed with NT4 Service Pack 4), and if the drive was plugged into an XP machine, the NTFS structures might have been altered, which could then generate errors back under NT4.

Posted

What I don't understand is that these log files *have* been written to the disk, because they are accessible over the network by other computers. If the data that is supposed to be contained in those files exists *ONLY* in a write-cache, then how could they be accessible? And we're talking about data that is up to 13 days old. I can't believe that an NTFS-based operating system would be caching write data for that long, unless NTFS is more of a bull-crap file system than I already think it is.

A file being accessible over the network has nothing to do with it being actually written to disk. All I/O goes through the ntfs.sys driver (and any other filter drivers installed above it) to provide access to the filesystem, and that can mean serving data from the file cache rather than from the *actual* file on disk. Depending on how busy the server's disks are and how much data is in the cache, NTFS can hold cached data for days, sometimes even weeks. If the server is just a web server and doesn't do much disk writing otherwise, I could see it taking many, many days to actually flush IIS log files to disk.
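
To illustrate the distinction: a normal read can be satisfied entirely from the cache, while FILE_FLAG_NO_BUFFERING forces the read to come from the platters. A minimal sketch (the file name is made up, and the 512-byte sector size is an assumption rather than queried):

```c
/* Minimal Win32 sketch: read the first sector of a file straight from
   the disk. FILE_FLAG_NO_BUFFERING bypasses the cache manager, but it
   requires sector-aligned buffers and transfer sizes; 512 bytes is an
   assumption here (real code would query the actual sector size). */
#include <windows.h>
#include <stdio.h>

#define SECTOR 512  /* assumed sector size */

int main(void)
{
    /* VirtualAlloc returns page-aligned memory, which satisfies the
       alignment requirement of FILE_FLAG_NO_BUFFERING. */
    BYTE *buf = (BYTE *)VirtualAlloc(NULL, SECTOR, MEM_COMMIT, PAGE_READWRITE);
    HANDLE h = CreateFileA("ex100314.log",  /* made-up log file name */
                           GENERIC_READ,
                           FILE_SHARE_READ | FILE_SHARE_WRITE,
                           NULL,
                           OPEN_EXISTING,
                           FILE_FLAG_NO_BUFFERING,
                           NULL);
    DWORD got;

    if (buf == NULL || h == INVALID_HANDLE_VALUE) {
        fprintf(stderr, "setup failed: %lu\n", GetLastError());
        return 1;
    }
    if (ReadFile(h, buf, SECTOR, &got, NULL) && got > 0)
        printf("first on-disk byte: 0x%02x (%lu bytes read)\n",
               buf[0], (unsigned long)got);

    CloseHandle(h);
    VirtualFree(buf, 0, MEM_RELEASE);
    return 0;
}
```
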
I have little to no respect for NTFS as it is, and this just makes it look worse to me compared to FAT32. These are files that *have* been written to the hard drive. Why NTFS should reach back and wipe them out days later is hard to understand.

If NT4's NTFS on a web server is your comparison point against FAT32, you'll get few complaints from me. NT4's NTFS was pretty impressive in its day, but that day was almost 15 years ago. NTFS has gotten MUCH better since then, so I'd suggest you revisit your earlier statement: "This is an NT4 server running IIS 4 (hey, sure I can update it to Server 2008, but it runs fine so why mess with it?)". I would argue that running something business-critical on NT4 is a bit dangerous, and if you aren't going to upgrade to at least Server 2003 / IIS 6, you should either (a) switch to FAT32 or (b) disable write caching on your NTFS log volumes. I'd suggest (b), but (a) is something you can consider.
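
If you want to see what the drive itself reports before changing anything, something like the following sketch queries the disk's cache settings. Caveats: I haven't verified that this IOCTL is available on NT4, it needs administrator rights, and PhysicalDrive0 is only an assumption about which disk holds the logs:

```c
/* Sketch: ask the disk what its cache settings are via
   IOCTL_DISK_GET_CACHE_INFORMATION. Assumptions: administrator rights,
   the IOCTL being supported by the OS and driver (unverified on NT4),
   and PhysicalDrive0 being the disk that holds the log files. */
#include <windows.h>
#include <winioctl.h>
#include <stdio.h>

int main(void)
{
    DISK_CACHE_INFORMATION info;
    DWORD bytes;
    HANDLE h = CreateFileA("\\\\.\\PhysicalDrive0",
                           GENERIC_READ | GENERIC_WRITE,
                           FILE_SHARE_READ | FILE_SHARE_WRITE,
                           NULL, OPEN_EXISTING, 0, NULL);
    if (h == INVALID_HANDLE_VALUE) {
        fprintf(stderr, "open failed: %lu\n", GetLastError());
        return 1;
    }
    if (DeviceIoControl(h, IOCTL_DISK_GET_CACHE_INFORMATION,
                        NULL, 0, &info, sizeof(info), &bytes, NULL)) {
        printf("write cache: %s\n", info.WriteCacheEnabled ? "enabled" : "disabled");
        printf("read cache:  %s\n", info.ReadCacheEnabled  ? "enabled" : "disabled");
    } else {
        fprintf(stderr, "IOCTL failed: %lu\n", GetLastError());
    }
    CloseHandle(h);
    return 0;
}
```
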
Is there anything to the idea that NTFS has changed or been improved since NT4 SP6, and that what I'm seeing would not happen if the machine were running 2K or XP?

The algorithms in the filesystem driver have been tweaked (in W2K, I think) specifically so that things like this don't happen on a server with a low write volume (like an IIS server). I've had this problem before on NT4 NTFS systems, and upgrading them to W2K did resolve the issue. I'd suggest 2003 at least, but given that you don't seem to need supportability or security hotfixes, 2000 might be fine.

Also - the drive in this NT4 system has been slaved to an XP system periodically (virus scan, defrag, etc.) and I understand that whenever a newer version of NT (like XP) touches an older NTFS drive, it does something to the file system to make it essentially a newer version of NTFS. When that happens, and the drive is allowed to boot back into NT4, will that cause a problem? This problem?

Not sure where you would have read that, but it is untrue. The version of the filesystem is not changed, although certain structures can be modified to match how newer versions of NTFS arrange them (the underlying format of the filesystem does not change). It could cause a problem, though, so it might be worth testing: simulate a power failure without having mounted the drive on the XP system first, and see whether you get the same behavior. I would think you would, but it is always worth ruling out possibilities.
