
How to debug out of memory issues in Windows 9x/ME?


Kahenraz

I'm experimenting with Windows 98 and ME on a very fast system (Core 2 Duo) where I can test a variety of hardware combinations. But the most problematic issue I'm having is the operating system running out of some kind of memory somewhere. One of the areas I am experimenting with is X forwarding over SSH to expand the suite of software available on such an old system to include anything and everything that could possibly run on a Linux system, such as modern browsers, development environments, etc.

The problem I'm having is that, no matter how little or how much memory I throw at it, no matter how I configure it, the system keeps running out of some kind of memory, despite there being plenty of available resources during this event. The symptoms are my SSH tunnel failing and then being unable to open additional command prompts. Even if I close all running child processes (sometimes this is impossible), whatever memory is being allocated is not properly freed and I cannot reinitialize a Bash shell, XWin, or whatever, until I reboot the entire system. I suspect that this is somehow related to the SSH client running from within a DOS window, and that the resources being consumed are related to the 16-bit subsystem and not Win32 protected mode.

I have tried monitoring the system resources using Sysinternals Process Explorer and Windows's own System Monitor, but it's difficult to track exactly WHERE the memory is going and WHAT process is using it, as I can't find any way to display how much any individual process has allocated. At best, I can view how memory is being used across the entire system and the swap file, but there is never a significant uptick in memory to suggest a leak, and there is always free memory and swap available at the time of these memory pressure events.

How can I debug this memory issue by monitoring memory allocations for specific processes, what memory exactly is running out, and why isn't it being reclaimed upon program termination?
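One way to approach the "what exactly is running out" question is to log several counters at a fixed interval while the problem is being induced, then look for the one that trends toward zero. Below is a minimal sketch of the comparison step only, in Python purely for illustration. The counter names (phys_free_kb, swap_free_kb, user_free_pct, gdi_free_pct) are hypothetical, and how the samples get collected on a 9x system is left open.

```python
def shrinking_counters(samples, min_drop=0.5):
    """Return the names of counters that lost at least `min_drop` (a fraction)
    of their initial free amount between the first and last sample.

    `samples` is a list of dicts mapping counter name -> free amount,
    oldest first. Names are returned with the largest relative drop first.
    """
    if len(samples) < 2:
        return []
    first, last = samples[0], samples[-1]
    drops = []
    for name, start in first.items():
        # Skip counters missing from the last sample or starting at zero.
        if name not in last or start <= 0:
            continue
        drop = (start - last[name]) / start
        if drop >= min_drop:
            drops.append((drop, name))
    drops.sort(reverse=True)
    return [name for _, name in drops]


# Hypothetical log: physical memory barely moves, one counter collapses.
log = [
    {"phys_free_kb": 200000, "swap_free_kb": 500000, "user_free_pct": 90, "gdi_free_pct": 85},
    {"phys_free_kb": 180000, "swap_free_kb": 500000, "user_free_pct": 40, "gdi_free_pct": 80},
    {"phys_free_kb": 175000, "swap_free_kb": 500000, "user_free_pct": 5, "gdi_free_pct": 78},
]
print(shrinking_counters(log))
```

The values in the example log are made up to show the shape of the data and are not measurements from this system.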

I have tried using a single 256MB stick of memory and up to 2GB of memory (with R. Loew's patch). I also tried limiting vcache in system.ini to various values such as 512MB and 64MB; the system does reflect this value, but it does not mitigate the problem.

I have found many references to this specific memory error as being related to vcache, but I can't confirm that the problem being discussed is related to my issue. Using 256MB of physical memory or artificially limiting the vcache setting does not seem to make a difference.

What I am looking for is help on how to further investigate this issue. I am a software programmer with extensive historical Win32 experience, but primarily on Windows NT based operating systems. I'm unfamiliar with how to properly debug misbehaving software on Windows 9x, and I'm finding that I am otherwise unprepared for these kinds of errors that just don't exist on NT.

 

oom2.png.b58f00e90fda7d6974827298d09db5b3.png

 

oom.png.05bfe3f161879e9b258d099d2fa1a4ce.png

 

memory.png.a395295daf2906ebc981b1cbe4742b05.png

 

system-monitor.png.7d820974562cedcd6da2d9159c2de572.png

 

Edited by Kahenraz

You could profile the programs with Dependency Walker. Heapwalk.exe (Heap Walker), a 16-bit WinNT tool, will work for checking memory allocations, including those of the kernel. It shows addresses, flags, handles, owner, size, lock and type.

Xming runs OK on my ME system but requires the KernelEx setting of XP to install. It should only have needed the 2K setting, though, as it said it needed NT SP5, which is above the NT4 setting of KernelEx. PuTTY requires API-MS-WIN-CORE-FIBRES-L1-1-1 in the registry's KnownDLLs, pointing/redirected to Kernel32.dll. PuTTY works on my ME system, as far as I can tell, using the latest KernelEx.


The problem exists identically with both Cygwin 1.5.25 XWin and Xming 6.9.0.31, but Xming runs on top of Cygwin, so this isn't indicative of much.

Which version of Xming are you using that requires KernelEx?


Your source says:

"To warn Windows that you have more than 512MB of memory installed, add the following line to the [VCache] section of your win.ini file: MaxFileCache = 524288."

It should be system.ini.

Also: did you try a static Vcache, setting MinFileCache and MaxFileCache to the same value?

Best to start with 1024 for both: the most stable setting available, but the slowest.

And with WinME you can't use SMARTDRIVE.EXE (limited to 128G(i)B partitions) to speed things up, unless you make WinME real-mode aware.
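The static VCache suggestion above (MinFileCache and MaxFileCache set to the same value, in KB) can be applied to a copy of system.ini with a small line-based edit. This is only a sketch: the function name set_static_vcache is made up for illustration, and it works on the text line by line rather than with an INI parser because system.ini can contain repeated keys such as device= that a parser may collapse.

```python
def set_static_vcache(ini_text, size_kb):
    """Return ini_text with MinFileCache and MaxFileCache in the [vcache]
    section both set to size_kb (kilobytes). All other lines are kept as-is.
    The section is appended if it does not exist yet."""
    wanted = {"minfilecache", "maxfilecache"}
    # Keys still to be written, in the order they are appended if missing.
    pending = {
        "minfilecache": f"MinFileCache={size_kb}",
        "maxfilecache": f"MaxFileCache={size_kb}",
    }
    out = []
    in_vcache = False
    seen_section = False

    def flush():
        # Append whichever keys were not already present in the section.
        out.extend(pending.values())
        pending.clear()

    for line in ini_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("[") and stripped.endswith("]"):
            if in_vcache:
                flush()  # leaving [vcache]: add any missing keys first
            in_vcache = stripped[1:-1].strip().lower() == "vcache"
            seen_section = seen_section or in_vcache
            out.append(line)
            continue
        if in_vcache and "=" in stripped:
            key = stripped.split("=", 1)[0].strip().lower()
            if key in wanted:
                if key in pending:
                    out.append(pending.pop(key))  # replace the old value
                continue  # drop duplicate entries of the same key
        out.append(line)

    if in_vcache:
        flush()
    elif not seen_section:
        out.append("[vcache]")
        flush()
    # Note: output uses "\n"; convert to CRLF if the target needs it.
    return "\n".join(out) + "\n"


example = "[386Enh]\ndevice=a.vxd\ndevice=b.vxd\n[vcache]\nMaxFileCache=524288\n"
print(set_static_vcache(example, 1024))
```

The example input is invented, and the a.vxd and b.vxd entries are placeholders rather than real drivers.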


Posted (edited)

I have been unable to find a vcache setting that mitigates this issue. It also shouldn't be a problem when I use a compatible memory size, such as 256MB.

Edited by Kahenraz

2 hours ago, Kahenraz said:

Which version of Xming are you using that requires KernelEx?

6.9.0.31. It needs KernelEx in Base mode on my system. I tried PuTTY version 0.77, and it obviously requires KernelEx.


Are you using latest version of Process Explorer for 9x (11.11)?

Do you have IE6 installed? Have you tried using the SHELL32.DLL fix? https://msfn.org/board/topic/84451-98-fe-98-sp1-98-se-me-shell32dll-fix/

Try using AIDA64 to monitor the system, latest version still works on 9x, just use the ZIP version.

Edited by MrMateczko

I have switched from Windows ME to Windows 98 SE and found this issue to still be present. I also tried installing IE6 and then this shell32.dll update, but it did not help with the out of memory error.

I also tried installing AIDA64, but my system immediately bluescreens when I run it.

 

20220729_174926_resize_41.jpg


I can't help with the AIDA64 BSOD, as it's an issue with its Win9x kernel driver, still not fully resolved. Sometimes it works for me, and sometimes it doesn't, mostly doesn't, bummer.

Can you try with a different PC with a different motherboard? Preferably something more period-correct like Socket 370/478/A and older.


2 hours ago, MrMateczko said:

I can't help with the AIDA64 BSOD, as it's an issue with its Win9x kernel driver, still not fully resolved. Sometimes it works for me, and sometimes it doesn't, mostly doesn't, bummer.

Can you try with a different PC with a different motherboard? Preferably something more period-correct like Socket 370/478/A and older.

I have replicated this on a fast i945 with a Pentium D or Core 2 running Windows 98 SE, an era-appropriate i440BX with a Pentium 3 at 650 MHz and 128MB of RAM running Windows ME, and two VMware VMs, one running Windows 98 and the other Windows ME.


Posted (edited)


I managed to replicate this issue and put the system into this out of memory state.

Here is a video of Windows 98 in a VM exhibiting the problem. I've saved it as a suspended state, so I can spin it up at any time. I can still run applications, but I'm unable to open any command prompt windows. Even if I close every other running application with Process Explorer, I can never put the system back into a state where it has the memory to open a DOS window.

 


I have replicated this on fast hardware (i945/Pentium D/Core 2), age-appropriate hardware (i440/Pentium 3/650MHz/128MB RAM), and in a virtual machine (VMware) on a Ryzen 1800X. The easiest way to put the computer in this state is to open and close an application repeatedly (such as Notepad). Sometimes it takes a few minutes. Or an hour. Or eight hours. It's somewhat random. If you're not sure how long it will take, then it's best to do this operation overnight. I've written a program that will automate the process. You'll know it has finished when an error box appears saying "Couldn't find a window to receive WM_CLOSE message", which indicates a failure to open the target application.

Here is a video of the system reporting the memory error when I try to open a DOS window. Note that, while there is in fact some memory pressure, there should still be plenty left to run a DOS command window. I also demonstrate in the video that there are no problems running Win32 programs. I've found that the more memory is installed beyond a minimum of 128MB, the more physical memory is left over once the system reaches this state, so the amount of unused physical memory is not indicative of "free memory" as it pertains to this problem.

The failure state of Windows ME is completely different when trying to induce it like this. The system will either crash Explorer or cause a system fault. On actual hardware, this causes the system to reboot suddenly. In a VM, a fault is reported that caused the virtual CPU to enter a shutdown state. When I'm just working on a real machine, the state comes naturally and I get the familiar popups and the inability to open a DOS window.

I have started configuring both hardware and VMs, Windows 98 and ME, with the following conservative values for vcache, which seem reasonable given the amount of physical memory available:

[vcache]
MinFileCache=1024
MaxFileCache=32768

As for patches, updates, and service packs, here is what I have used or tried:

I need to use this patch to allow Windows 98 to run on my Ryzen CPU, tested 0.7.45 and 0.7.47, with and without Q288430:
https://www.vogons.org/viewtopic.php?f=24&t=88284

I have tried the following service packs for Windows 98 SE (3.65 and 3.66):
http://www.techtalk.cc/viewtopic.php?t=65

Windows ME Service Pack 1.05
https://retrosystemsrevival.blogspot.com/2019/05/windows-me-service-pack-102.html

Auto-Patcher for Windows 98
https://retrosystemsrevival.blogspot.com/2018/01/auto-patcher-for-windows-98se-december.html

I have tried Windows 98 with a stock IE5, IE6SP1, and IE6SP1 with the shell32.dll update:
https://msfn.org/board/topic/84451-98-fe-98-sp1-98-se-me-shell32dll-fix/

I tried varying sizes for the swap file, but with sufficient memory (at least 256MB), it appears that the swap file is never even used (it's never expanded), so the problem seems to lie elsewhere.

I have tried replacing himem.sys with HIMEMX. I have tried with and without rloew's memory patch. I have tried varying amounts of physical memory: 32MB, 128MB, 256MB, 512MB, 1.5GB. I have tried large swap sizes and small swap sizes. I have tried varying the amount of vcache, up to 512MB when large amounts of physical memory are installed.

I understand that my method of replicating this error is unrealistic, opening and closing Notepad or some other program hundreds of times, but it is simply meant to induce the error with the absolute smallest amount of noise, to eliminate the chance that it may be caused by something else. In my case, it is triggered VERY often when I use Cygwin, as it spawns lots of child processes that perform some action and then exit. I have been pursuing this issue for weeks and I've finally made some progress.

I have a copy of the system in this very state and can provide it in a VM, if there is an expert available to help look into this.

Note that the pictures below are from a VM with 128MB of memory and do show memory pressure and swap file usage. The video linked above is a VM with 256MB of memory and shows that this error still occurs without the same kind of memory pressure (plenty of swap and physical memory available).
 

 

me-crash-4.png

me-crash-3.png

me-crash-2.png

me-crash-1.png

98-memory-1.png

98-memory-2.png

98-memory-3.png

98-memory-4.png

98-memory-5.png

98-memory-6.png

Edited by Kahenraz

Posted (edited)

The problem is a state the operating system can fall into in which some kind of "memory" resource has become exhausted. I can induce this state during testing by using my stress-open tool. I wrote this tool after finding my system falling into this state during normal use, as a way to induce it on demand for testing and replication.

This is after a fresh reboot. The tool opens Notepad (a native Win32 program) and closes it gracefully with the WM_CLOSE message (this is what happens when you click the X button). The stress-open tool is also a Win32 binary; there is no 16-bit code here. But somehow the operating system becomes unable to open a DOS window, despite still being able to run Win32 applications.
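For anyone who wants to reproduce this, the open-and-close loop described above can be sketched roughly as follows. This is not the actual stress-open tool, and modern Python does not run on Windows 9x, so it only illustrates the logic. The function name stress_open and the close hook are made up for this example. The default proc.terminate() is not a graceful WM_CLOSE; on Win32 the hook would be the place to find the window (for example with FindWindow) and post WM_CLOSE to it.

```python
import subprocess
import sys
import time


def stress_open(command, max_iterations, lifetime=0.0, close=None):
    """Launch and close `command` repeatedly.

    Returns the 1-based iteration at which launching first failed,
    or None if every launch within `max_iterations` succeeded.
    `close`, if given, is called with the Popen object and should ask
    the process to exit (hypothetical hook for a WM_CLOSE sender).
    """
    for i in range(1, max_iterations + 1):
        try:
            proc = subprocess.Popen(command)
        except OSError:
            # The OS refused to start the program: record when it happened.
            return i
        time.sleep(lifetime)  # let the program finish starting up
        if close is not None:
            close(proc)
        else:
            proc.terminate()  # forceful stand-in for a graceful close
        proc.wait()  # reap the child before starting the next one
    return None


# Stand-in target: a Python child that exits immediately.
result = stress_open([sys.executable, "-c", "pass"], max_iterations=3)
print("first failed launch:", result)
```

On the real system the target would be notepad.exe with a lifetime long enough for its window to appear, and the returned iteration count gives a rough measure of how long the system lasted before entering the failure state.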

What DOES happen right before this state is that my tool tries to open notepad.exe and fails. At this point I can continue to open Notepad as normal, but I can't run anything that uses the DOS subsystem. Maybe it's some kind of race condition that corrupts memory somewhere in the kernel.

Does Windows 98 support JIT debugging? Maybe this can provide some more information as to what causes Notepad to crash.

Edited by Kahenraz