Dietmar (Author), posted April 2, 2024

@pappyN4 Oh wow, this is a crazy nice idea. It just means that when you look at the original ntoskrnl.exe from Win2000, you see only the version with cmpxchg8b. But when the installer notices during Win2000 Setup that it has to live on a 486 CPU, it patches everything around cmpxchg8b. And not only in ntoskrnl.exe, but also in ntdll.dll and every other file from Setup. I will test this tomorrow.

Dietmar

PS: By the way, I noticed the mistake in my new cmpxchg8b emulator: the opcode cmpxchg [pointer to 32 bit in memory], REG changes the 32 bits at that address in memory only if those 32 bits are identical to EAX. Only then is the content of REG written to those 32 bits in memory.
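The behaviour described in the PS above can be sketched in a few lines of Python. This is only a model of the 32-bit `cmpxchg` semantics, not kernel code; the function name `cmpxchg32`, the dictionary used as "memory" and the sample address are assumptions made for illustration.

```python
MASK32 = 0xFFFFFFFF

def cmpxchg32(mem, addr, eax, src):
    """Model of `cmpxchg [addr], src` with 32-bit operands.

    `mem` is a dict mapping an address to a 32-bit value (an assumption
    of this sketch). Returns (new_eax, zf).
    """
    dest = mem[addr] & MASK32
    if dest == (eax & MASK32):
        mem[addr] = src & MASK32      # match: the source register is stored
        return eax & MASK32, True     # ZF = 1, EAX is unchanged
    return dest, False                # mismatch: EAX receives the memory value, ZF = 0

mem = {0x1000: 0x11111111}
# Memory equals EAX, so REG (0xDEADBEEF) is written.
print(cmpxchg32(mem, 0x1000, 0x11111111, 0xDEADBEEF), hex(mem[0x1000]))
# Memory no longer equals EAX, so nothing is written and EAX is reloaded.
print(cmpxchg32(mem, 0x1000, 0x11111111, 0x0), hex(mem[0x1000]))
```

The comparison is always against EAX; the register named in the instruction is only the value that gets stored on a match.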
Dietmar (Author), posted April 2, 2024

20 minutes ago, pappyN4 said:

@pappyN4 Otherwise I would try for AVX on x64, using x86 as a guide. Ask @Mov AX, 0xDEAD for this.

Dietmar

Edited April 2, 2024 by Dietmar
pappyN4, posted April 3, 2024

11 minutes ago, Dietmar said:

    @pappyN4 Otherwise I would try for AVX on x64 using x86 as guide. Ask @Mov AX, 0xDEAD for this Dietmar

Stability of the OS is a higher priority. I have been testing a BSOD problem for a while now that usually happens in about 5 of 100 boots. But I have been able to recreate it very easily by doing a certain thing in a certain game. Server 2003 x86 seems fine; only x64 has the problem. Hopefully, with BSOD dumps, someone smarter than me can figure out why.
user57, posted April 3, 2024

I heard Chappell died a few months ago. Sad story; we could still use him.

Reading Chappell's writing, it says there once was a solution that does not use the instruction if the CPU does not support it. That does not necessarily mean that simply taking a different routine from a different OS version will just work. Maybe, maybe not.

Why would it have to be that other cmpxchg instruction? The Linux one is not perfect. Depending on what the other routines do, the Linux one might work, but it is certainly not 100% correct, while mine is. The Linux one looks almost the same as the one I posted above, but it does not compare the 64 bits for the false result (maybe the Linux solution does not need to, but again, mine is correct and the Linux one is not). So why not "just the right one"? Doing it another way causes more instructions and maybe more fixes. There are certainly multiple solutions.

https://www.felixcloutier.com/x86/cmpxchg

The description might be wrong this time. According to the description here, unlike cmpxchg8b, it always compares EAX with the memory location. The description does not actually say that a register other than EAX can be chosen for the comparison.

Well, this time your code might work, but you are rather trying to fix up the results. When ZF shows no match, the code is set up to just go back. That is OK, but then you have to do this in every function like this. That also makes two lock xchg and two lock cmpxchg.

Do you use cmpxchg because of the atomicity question? If you have to change 64 bits at once, then it might have to be atomic for the 64 bits, that is, 64 bits in one step. Just having the lock prefix does not turn it into a 64-bit mov.
j7n, posted April 3, 2024

On 4/1/2024 at 7:18 PM, Dietmar said:

    CMPXCHG [EBP], EAX instruction compares the value in the EAX register with the value at the memory address pointed to by EBP. If they are equal, the value in EAX is stored at that memory address

Can you explain the purpose of this function? I don't understand why it writes the same value from a register into the memory location. The memory already contains that value (when they are equal).
Dietmar (Author), posted April 3, 2024

@j7n That is not the correct use of this opcode. In the real cmpxchg8b [EBP], there is first a check whether the value in memory (low 32 bits) is the same as in EAX. If so, those lower 32 bits are replaced by the value in EBX. So using CMPXCHG [EBP], EBX offers exactly this functionality, but only for the lower 32 bits of the 64 bits in memory.

Dietmar

Edited April 3, 2024 by Dietmar
user57, posted April 3, 2024

Well, I would think a different offset is possible, but if they are equal, it exchanges the value at that offset with ECX and EBX (it overwrites it with those). If you know about logic circuits, you might know why. But this is exactly why I wanted to say he does not need that instruction; at least I do not see a reason for it.

The function itself seems to compare: "if (the 64-bit entry at this offset still has the same value as EAX and EDX) -> replace the value at that offset with ECX and EBX (so it actually holds a changed 64-bit value afterwards)".

Function descriptions:

https://learn.microsoft.com/en-us/windows-hardware/drivers/ddi/wdm/nf-wdm-exinterlockedflushslist
https://learn.microsoft.com/en-us/windows-hardware/drivers/ddi/wdm/nf-wdm-exinterlockedpopentryslist
https://learn.microsoft.com/en-us/windows-hardware/drivers/ddi/wdm/nf-wdm-exinterlockedpushentrylist

"If there were entries on the specified list, ExInterlockedFlushSList returns a pointer to the first SLIST_ENTRY structure that was an entry on the list; otherwise, it returns NULL."

This raises the question of why they used this instruction to return a pointer to the first SLIST_ENTRY structure and to change that SLIST_ENTRY (which is an internal Windows structure).

About "atomic": it is supposed to be a non-interruptible operation. That is probably why Dietmar cleared the interrupt flag (so no interrupts; actually it still gets them, but that is another story). The other mechanism is the "lock" prefix.

The next question that comes to mind is changing 64 bits at once (non-interruptible) rather than stepwise, which is what cmpxchg8b actually does (even though it is 32-bit; it just uses two registers). BUT I have never seen that to be needed for 64 bits. That is far too small to have an interaction; not even interrupts, when they are changed (and those are constantly used), have that problem.

https://www.quora.com/What-is-the-meaning-of-atomic-in-programming

We are also talking about a function here, so the function itself might actually be "atomic" if it does its job, because the function has a start and an end for this step. We are not in C++, which might have code like "atom (this)"; in assembly you have to write the real instruction that physically does it. (If that were a problem, we might have had more answers by now.)

Dietmar just said that he wants to replace cmpxchg8b with working alternative code. It might be a bit of a road, but over time we will find it. I highly suspect that the list is for threading/multicore.
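The compare-and-swap that user57 paraphrases above ("if the 64-bit entry still equals EDX:EAX, replace it with ECX:EBX") can be written down as a small Python model. It only illustrates the documented result of `cmpxchg8b`; the function name, the dictionary used as memory and the sample values are assumptions of this sketch, and a Python function says nothing about atomicity on real hardware.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF

def cmpxchg8b(mem, addr, eax, edx, ebx, ecx):
    """Model of `cmpxchg8b [addr]`. Returns (eax, edx, zf).

    `mem` maps an address to one 64-bit value (an assumption of this sketch).
    """
    current = mem[addr] & MASK64
    expected = ((edx & MASK32) << 32) | (eax & MASK32)
    if current == expected:
        # Match: the whole 64-bit value is replaced by ECX:EBX in one step.
        mem[addr] = ((ecx & MASK32) << 32) | (ebx & MASK32)
        return eax & MASK32, edx & MASK32, True
    # Mismatch: memory is left alone and EDX:EAX receive the current value.
    return current & MASK32, current >> 32, False

mem = {0x2000: 0x0007000380501234}
print(cmpxchg8b(mem, 0x2000, 0x80501234, 0x00070003, 0, 0x00070000))
print(hex(mem[0x2000]))
```

The point of the single instruction is that the compare and the store cover all 64 bits together, which is what any two-step replacement has to reproduce by other means.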
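Dietmar (Author), posted April 3, 2024

Here is the next attempt at an atomic compare and exchange to emulate CMPXCHG8B.

Dietmar

    ExInterlockedFlushSList proc near
        push ebx
        push ebp
        pushf                       ; save EFLAGS, including IF
        cli                         ; no interrupts on this CPU
        xor ebx, ebx                ; EBX = 0, the new low dword
        mov ebp, ecx                ; EBP = pointer to the list header
        mov edx, [ebp+4]            ; EDX = high dword of the header
        mov eax, [ebp]              ; EAX = low dword (first entry)
        or eax, eax
        jz .done                    ; empty list: return NULL
        mov ecx, edx
        mov cx, bx                  ; clear the low 16 bits of ECX
        lock cmpxchg [ebp], ebx     ; if [ebp] == EAX then [ebp] = 0
        push eax
        mov eax, edx
        lock cmpxchg [ebp+4], ecx   ; if [ebp+4] == EDX then [ebp+4] = ECX
        pop eax                     ; return value: the old first entry
    .done:
        sti                         ; popf restores the saved IF anyway
        popf
        pop ebp
        pop ebx
        ret
    ExInterlockedFlushSList endp

Edited April 3, 2024 by Dietmar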
roytam1, posted April 3, 2024

1 hour ago, Dietmar said:

    Here is next try for to reach am atomic compare and exchange for to emulate CMPXCHG8B
    Dietmar
    (...)
        or eax, eax
        jz .done
        mov ecx, edx
        mov cx, bx                  <-- why does this use 16-bit registers?
        lock cmpxchg [ebp], ebx
    (...)
Dietmar (Author), posted April 3, 2024

@roytam1 In Win2000, the whole list header was set to 0. But in XP SP3, ECX becomes a special structure: ECX = ab cd 00 00, where ab is the highest byte of the 64 bits in memory and cd the next one. This 16-bit operation clears the two following (lower) bytes of ECX, because EBX = 00 00 00 00 and so bx = 00 00.

Dietmar
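The effect of the 16-bit move described above can be checked with a short Python sketch. The helper name `mov_cx_bx` and the sample value are made up for illustration. The comment about Depth and Sequence assumes the documented x86 SLIST_HEADER layout (Next, then a 16-bit Depth, then a 16-bit Sequence), which the post itself does not spell out.

```python
def mov_cx_bx(ecx, ebx):
    """Model of `mov cx, bx`: only the low 16 bits of ECX are replaced."""
    return (ecx & 0xFFFF0000) | (ebx & 0x0000FFFF)

# Hypothetical high dword of the header: upper 16 bits 0xABCD, lower 16 bits 0x0003.
high_dword = 0xABCD0003

# With EBX = 0, the upper two bytes (ab cd) survive and the lower two become 00 00.
# Under the assumed layout this keeps Sequence and resets Depth to zero.
new_high = mov_cx_bx(high_dword, 0x00000000)
print(hex(new_high))  # 0xabcd0000
```

A full 32-bit `mov ecx, ebx` would instead wipe the upper two bytes as well, which matches the Win2000 behaviour of zeroing the whole header.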
roytam1, posted April 3, 2024

15 hours ago, pappyN4 said:

    I did look at win2000 ntoskrnl and some of the older pre release ones. They all have cmpxchg8b also. But if windows 2000 is able to install and run on 486, then it must be bypassed somehow for it to work.

According to Chappell's site, yes, it does. If you search for `cpuid` in ntoskrnl.exe 5.0.2195.1, you can find one in `sub_553EB4`:

    INIT:0055415C                 mov     eax, 1
    INIT:00554161                 cpuid
    INIT:00554163                 test    edx, 100h
    INIT:00554169                 jz      short loc_554177
    INIT:0055416B                 or      ds:dword_47DEEC, 80h
    INIT:00554175                 jmp     short loc_5541EA
    INIT:00554177 ; ---------------------------------------------------------------------------
    INIT:00554177
    INIT:00554177 loc_554177:                             ; CODE XREF: sub_553EB4+29Ej
    INIT:00554177                                         ; sub_553EB4+2B5j
    INIT:00554177                 lea     eax, ExInterlockedCompareExchange64
    INIT:0055417D                 lea     ecx, sub_40078C
    INIT:00554183                 mov     byte ptr [eax], 0E9h
    INIT:00554186                 lea     edx, [eax+5]
    INIT:00554189                 sub     ecx, edx
    INIT:0055418B                 mov     [eax+1], ecx
    INIT:0055418E                 lea     eax, ExInterlockedPopEntrySList
    INIT:00554194                 lea     ecx, loc_4006F0
    INIT:0055419A                 mov     byte ptr [eax], 0E9h
    INIT:0055419D                 lea     edx, [eax+5]
    INIT:005541A0                 sub     ecx, edx
    INIT:005541A2                 mov     [eax+1], ecx
    INIT:005541A5                 lea     eax, ExInterlockedPushEntrySList
    INIT:005541AB                 lea     ecx, loc_400704
    INIT:005541B1                 mov     byte ptr [eax], 0E9h
    INIT:005541B4                 lea     edx, [eax+5]
    INIT:005541B7                 sub     ecx, edx
    INIT:005541B9                 mov     [eax+1], ecx
    INIT:005541BC                 lea     eax, ExInterlockedFlushSList
    INIT:005541C2                 lea     ecx, loc_400714
    INIT:005541C8                 mov     byte ptr [eax], 0E9h
    INIT:005541CB                 lea     edx, [eax+5]
    INIT:005541CE                 sub     ecx, edx
    INIT:005541D0                 mov     [eax+1], ecx
    INIT:005541D3                 lea     eax, sub_402352
    INIT:005541D9                 lea     ecx, ExInterlockedAddLargeInteger
    INIT:005541DF                 mov     byte ptr [eax], 0E9h
    INIT:005541E2                 lea     edx, [eax+5]
    INIT:005541E5                 sub     ecx, edx
    INIT:005541E7                 mov     [eax+1], ecx

and the 2nd `lea` is the `cmpxchg8b`-less version of the upper one.
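Each six-instruction group in the listing above writes a 5-byte `jmp rel32` (opcode E9) over the start of one routine, with the displacement computed as target minus (source + 5). The following Python sketch reproduces that arithmetic and the CPUID bit test. The helper names are invented for illustration, and the source address 0x401000 is a placeholder, since the listing does not show where ExInterlockedFlushSList itself is located; only the target `loc_400714` is taken from the listing.

```python
import struct

CX8_BIT = 0x100  # bit 8 of EDX from CPUID leaf 1: CMPXCHG8B supported

def has_cmpxchg8b(cpuid1_edx):
    """Mirror of `test edx, 100h`: True when the CPU reports CMPXCHG8B."""
    return bool(cpuid1_edx & CX8_BIT)

def make_jmp_patch(src, target):
    """Return the 5 bytes of `jmp rel32` that, placed at `src`, land on `target`."""
    # rel32 is relative to the end of the 5-byte instruction (lea edx, [eax+5]).
    rel32 = (target - (src + 5)) & 0xFFFFFFFF
    return b"\xE9" + struct.pack("<I", rel32)

# Placeholder source address (assumption); target loc_400714 is from the listing.
patch = make_jmp_patch(0x401000, 0x400714)
print(patch.hex())  # e90ff7ffff

# The patch is only applied when the CX8 bit is clear, as on a 486.
print(has_cmpxchg8b(0x00000000), has_cmpxchg8b(0x00000100))
```

Because the jump overwrites the first bytes of the routine in memory, a static disassembly of the file on disk still shows only the `cmpxchg8b` version.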
Dietmar (Author), posted April 3, 2024

@roytam1 Can you make a complete ExInterlockedFlushSList in hex code from it? Because of all the jumps, you cannot see how they do it.

Dietmar
roytam1, posted April 3, 2024

50 minutes ago, Dietmar said:

    @roytam1 Can you make a complete ExInterlockedFlushSList in Hex Code from it? Because of the a lot of jumps, you do not see, how they make it Dietmar

Which version? The listing I posted above is from Win2000 RTM (5.0.2195.1), which may not suit XP's usage.
Dietmar (Author), posted April 3, 2024

@roytam1 That is not so important. I also have Win2000 SP4. It is only to get the idea.

Dietmar
Dietmar (Author), posted April 3, 2024

We have a new, working emulation for CMPXCHG8B.

Dietmar

    53 55 9C FA 33 DB 8B E9 8B 55 04 8B 45 00 0B C0 74 13 8B CA 66 89 D9 F0 0F B1 5D 00 50 8B C2 F0 0F B1 4D 04 58 FB 9D 5D 5B 90 90 90 90 C3

    .data:004762B2 ; Exported entry   7. ExInterlockedFlushSList
    .data:004762B2
    .data:004762B2 ; =============== S U B R O U T I N E =======================================
    .data:004762B2
    .data:004762B2
    .data:004762B2                 public ExInterlockedFlushSList
    .data:004762B2 ExInterlockedFlushSList proc near       ; CODE XREF: sub_45F0DF:loc_45F0F7p
    .data:004762B2                                         ; DATA XREF: .edata:off_5AC2A8o
    .data:004762B2                 push    ebx
    .data:004762B3                 push    ebp
    .data:004762B4                 pushf
    .data:004762B5                 cli
    .data:004762B6                 xor     ebx, ebx
    .data:004762B8                 mov     ebp, ecx
    .data:004762BA                 mov     edx, [ebp+4]
    .data:004762BD                 mov     eax, [ebp+0]
    .data:004762C0                 or      eax, eax
    .data:004762C2                 jz      short loc_4762D7
    .data:004762C4                 mov     ecx, edx
    .data:004762C6                 mov     cx, bx
    .data:004762C9                 lock cmpxchg [ebp+0], ebx
    .data:004762CE                 push    eax
    .data:004762CF                 mov     eax, edx
    .data:004762D1                 lock cmpxchg [ebp+4], ecx
    .data:004762D6                 pop     eax
    .data:004762D7
    .data:004762D7 loc_4762D7:                             ; CODE XREF: ExInterlockedFlushSList+10j
    .data:004762D7                 sti
    .data:004762D8                 popf
    .data:004762D9                 pop     ebp
    .data:004762DA                 pop     ebx
    .data:004762DB                 nop
    .data:004762DC                 nop
    .data:004762DD                 nop
    .data:004762DE                 nop
    .data:004762DF                 retn
    .data:004762DF ExInterlockedFlushSList endp
    .data:004762DF
    .data:004762DF ; ---------------------------------------------------------------------------

Edited April 3, 2024 by Dietmar
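To make the data flow of the routine above easier to follow, here is a step-by-step Python model of it. It is only a sketch: the function names, the dictionary used as memory and the sample header values are assumptions, and the model runs the two 32-bit steps back to back with nothing else touching memory in between, which corresponds to a single CPU with interrupts disabled. It does not show what happens if another processor modifies the header between the two steps.

```python
MASK32 = 0xFFFFFFFF

def cmpxchg32(mem, addr, eax, src):
    """Model of a 32-bit `cmpxchg [addr], src`. Returns (new_eax, zf)."""
    dest = mem[addr] & MASK32
    if dest == (eax & MASK32):
        mem[addr] = src & MASK32
        return eax & MASK32, True
    return dest, False

def flush_slist_emulated(mem, header):
    """Follow the posted routine; `mem` holds the header as two dwords."""
    ebx = 0                                          # xor ebx, ebx
    edx = mem[header + 4]                            # mov edx, [ebp+4]
    eax = mem[header]                                # mov eax, [ebp]
    if eax == 0:                                     # or eax, eax / jz
        return 0                                     # empty list: NULL
    ecx = (edx & 0xFFFF0000) | (ebx & 0xFFFF)        # mov ecx, edx / mov cx, bx
    eax, _ = cmpxchg32(mem, header, eax, ebx)        # lock cmpxchg [ebp], ebx
    saved = eax                                      # push eax
    cmpxchg32(mem, header + 4, edx, ecx)             # mov eax, edx / lock cmpxchg [ebp+4], ecx
    return saved                                     # pop eax

# Hypothetical header: first entry at 0x80501234, high dword 0x00070003.
mem = {0x2000: 0x80501234, 0x2004: 0x00070003}
first = flush_slist_emulated(mem, 0x2000)
print(hex(first), hex(mem[0x2000]), hex(mem[0x2004]))
```

In this uncontended case the result matches what one `cmpxchg8b` with EBX = 0 and ECX = high dword with its low 16 bits cleared would leave behind: the old first entry is returned, the low dword becomes 0 and the upper 16 bits of the high dword are kept.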