Dietmar (author), posted March 31

@user57 These are not jumps. They are the 3 DATA XREF references, which PeMaker cannot handle. They are not a real problem: you can repair them by hand, but the BSOD stays. This code is from NT4 Service Pack 4.

Dietmar

.text:8013CEA0 ; Exported entry 4. ExInterlockedPopEntrySList
.text:8013CEA0
.text:8013CEA0 ; =============== S U B R O U T I N E =======================================
.text:8013CEA0
.text:8013CEA0
.text:8013CEA0                 public ExInterlockedPopEntrySList
.text:8013CEA0 ExInterlockedPopEntrySList proc near    ; CODE XREF: CcScheduleReadAhead+2BBp
.text:8013CEA0                                         ; sub_80108058+10p ...
.text:8013CEA0                 push    ebx
.text:8013CEA1                 push    ebp
.text:8013CEA2                 mov     ebp, ecx
.text:8013CEA4
.text:8013CEA4 loc_8013CEA4:                           ; DATA XREF: .text:loc_80140E17o
.text:8013CEA4                 mov     edx, [ebp+4]
.text:8013CEA7                 mov     eax, [ebp+0]
.text:8013CEAA
.text:8013CEAA loc_8013CEAA:                           ; CODE XREF: ExInterlockedPopEntrySList+1Cj
.text:8013CEAA                 or      eax, eax
.text:8013CEAC                 jz      short loc_8013CEBE
.text:8013CEAE                 mov     ecx, edx
.text:8013CEB0                 add     ecx, 0FFFFh
.text:8013CEB6
.text:8013CEB6 loc_8013CEB6:                           ; DATA XREF: sub_80140AF4:loc_80140AFDo
.text:8013CEB6                                         ; .text:80140D28o
.text:8013CEB6                 mov     ebx, [eax]
.text:8013CEB8                 cmpxchg8b qword ptr [ebp+0]
.text:8013CEBC                 jnz     short loc_8013CEAA
.text:8013CEBE
.text:8013CEBE loc_8013CEBE:                           ; CODE XREF: ExInterlockedPopEntrySList+Cj
.text:8013CEBE                 pop     ebp
.text:8013CEBF                 pop     ebx
.text:8013CEC0                 retn
.text:8013CEC0 ExInterlockedPopEntrySList endp
.text:8013CEC0
.text:8013CEC0 ; ---------------------------------------------------------------------------
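For readers who prefer C, here is a minimal single-threaded sketch of what the NT4 listing above appears to compute. It assumes the usual x86 list-head layout (first-entry pointer at [ebp+0], then a 32-bit word with the depth in the low 16 bits and a sequence number in the high 16 bits). The type and function names are hypothetical, and the atomic cmpxchg8b retry loop is deliberately left out, so this only models the data flow.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model types, not the real kernel definitions. */
typedef struct slist_entry_model {
    struct slist_entry_model *next;
} slist_entry_model;

typedef struct {
    slist_entry_model *next;   /* [ebp+0]: first entry, NULL if the list is empty   */
    uint32_t depth_seq;        /* [ebp+4]: depth in bits 0..15, sequence in 16..31  */
} slist_head_model;

/* Non-atomic model of the pop; the real code retries with cmpxchg8b. */
slist_entry_model *pop_model(slist_head_model *h)
{
    assert(h != NULL);
    slist_entry_model *first = h->next;   /* mov eax, [ebp+0]            */
    if (first == NULL)                    /* or eax, eax / jz            */
        return NULL;
    /* mov ecx, edx / add ecx, 0FFFFh: the low word drops by one and the
       carry out of it bumps the sequence in the high word. */
    h->depth_seq += 0xFFFFu;
    h->next = first->next;                /* mov ebx, [eax], then stored */
    return first;
}
```

The `add ecx, 0FFFFh` trick is the part worth noticing: one addition decrements the depth and increments the sequence at the same time, as long as the depth is at least 1.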
user57, posted March 31

.text:8013CEA0 ExInterlockedPopEntrySList proc near    ; CODE XREF: CcScheduleReadAhead+2BBp
.text:8013CEA0                                         ; sub_80108058+10p ...
.text:8013CEA0                 push    ebx
.text:8013CEA1                 push    ebp
                               pushf
                               cli
.text:8013CEA2                 mov     ebp, ecx
.text:8013CEA4
.text:8013CEA4 loc_8013CEA4:   <-- this seems to have a jump to it    ; DATA XREF: .text:loc_80140E17o
.text:8013CEA4                 mov     edx, [ebp+4]
.text:8013CEA7                 mov     eax, [ebp+0]
.text:8013CEAA
.text:8013CEAA loc_8013CEAA:   // valid                ; CODE XREF: ExInterlockedPopEntrySList+1Cj
.text:8013CEAA                 or      eax, eax
.text:8013CEAC                 jz      short end_of_ExInterlockedPopEntrySList // has to be changed
.text:8013CEAE                 mov     ecx, edx
.text:8013CEB0                 add     ecx, 0FFFFh
.text:8013CEB6
.text:8013CEB6 loc_8013CEB6:   <-- seems to have some jumps to it too ; DATA XREF: sub_80140AF4:loc_80140AFDo
.text:8013CEB6                                         ; .text:80140D28o
.text:8013CEB6                 mov     ebx, [eax]
                               cmp     eax, [ebp+0]
                               jnz     loc_fail        // something we did
                               cmp     edx, [ebp+4]
                               jnz     loc_fail        // again
                               mov     [ebp+0], ebx
                               mov     [ebp+4], ecx
                               jmp     loop_check_ExInterlockedPopEntrySList // the loop check
               loc_fail:
                               mov     eax, [ebp+0]
                               mov     edx, [ebp+4]
.text:8013CEB8 loop_check_ExInterlockedPopEntrySList:
.text:8013CEBC                 jnz     short loc_8013CEAA // valid, but needs a fix to reach the "or eax, eax" loop
.text:8013CEBE
.text:8013CEBE loc_8013CEBE/end_of_ExInterlockedPopEntrySList: ; CODE XREF: ExInterlockedPopEntrySList+Cj
                               sti
                               popf
.text:8013CEBE                 pop     ebp
.text:8013CEBF                 pop     ebx
.text:8013CEC0                 retn
.text:8013CEC0 ExInterlockedPopEntrySList endp

-------------------------------------------------------

Well, data refs don't make sense at these spots. A display bug?

At 8013CEA4 it says there are 1 or more jumps to it, and that 80140E17 is one of them. Since the pushf and cli shift the offsets by 2 bytes, that jump lands in the wrong place unless its target is fixed. It is common to see jumps into different functions and back again.

Location 8013CEB6 seems to be jumped to from sub_80140AF4:loc_80140AFD and .text:80140D28 (you should look at least at those 3 spots for this jump). This looks like the IDA disassembler to me; your best bet is to search for the addresses they are jumped from.

The jz at 8013CEAC has to be fixed to jump to the end, that is to sti / end_of_ExInterlockedPopEntrySList:

.text:8013CEAC                 jz      end_of_ExInterlockedPopEntrySList

.text:8013CEAA loc_8013CEAA: this one is valid, but since we have more code now, the jump that reaches it is a bit longer. That jump is shown in the visible code at 8013CEBC (jnz short loc_8013CEAA).

.text:8013CEAC is valid but also needs an adjustment to reach loc_8013CEBE/end_of_ExInterlockedPopEntrySList.

If you want, you can try to remove sti, cli, popf and pushf (but it has to be all 4).

---------------------

You could actually also use a different method. cmpxchg8b has 4 bytes of opcode and jnz short has 2, so 6 bytes in total. You need 5, which makes a jmp to your own location plus 1 nop.

                               cmpxchg8b qword ptr [ebp+0]
                               jnz     short loc_8013CEAA

Replace those two with a jmp to your memory location plus a nop. At your memory location, then do:

                               cmp     eax, [ebp+0]
                               jnz     loc_fail2       // something we did
                               cmp     edx, [ebp+4]
                               jnz     loc_fail2       // again
                               mov     [ebp+0], ebx
                               mov     [ebp+4], ecx
                               jmp     the_check       // the loop check
               loc_fail2:
                               mov     eax, [ebp+0]
                               mov     edx, [ebp+4]
               the_check:
                               jnz     short loc_8013CEAA // this one is a conditional jmp to that loop (or eax, eax)
                               // now you just have to jump back
                               jmp     to (.text:8013CEBE loc_8013CEBE/end_of_ExInterlockedPopEntrySList) // just back, as if the instruction had executed

I don't know if that NT version can be used for XP; they might have used a different behavior.
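The compare-then-store sequence proposed above can be sketched in C as follows. This is only a model under one loud assumption: nothing else can touch the 8 bytes between the compare and the store (which is what the pushf/cli pair is meant to guarantee on a single CPU). It is not atomic by itself, and the names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the 8 bytes at [ebp+0] and [ebp+4]. */
typedef struct {
    uint32_t lo;   /* [ebp+0] */
    uint32_t hi;   /* [ebp+4] */
} pair64_model;

/* Emulates the visible effect of cmpxchg8b without any atomicity.
   expected plays the role of edx:eax, desired the role of ecx:ebx.
   Returns 1 where the real instruction would set ZF, 0 otherwise. */
int cmpxchg8b_emulated(pair64_model *dest, pair64_model *expected,
                       pair64_model desired)
{
    assert(dest != NULL && expected != NULL);
    /* Both halves are compared before anything is written. */
    if (dest->lo == expected->lo && dest->hi == expected->hi) {
        *dest = desired;       /* mov [ebp+0], ebx / mov [ebp+4], ecx */
        return 1;
    }
    *expected = *dest;         /* loc_fail: reload eax and edx        */
    return 0;
}
```

The important property is that a mismatch in either half leaves the destination completely untouched, which is what the caller's retry loop relies on.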
Dietmar (author), posted April 1 (edited)

New try, but one step back. This is for the first function.

EDIT: Now I think this will not work at all. lock bts dword ptr [ebp], 0 changes the lowest bit of the 64-bit value in memory. So when I do a compare later, the EAX compare will always fail because of this.

Even the attempt with lock bts dword ptr [ebp], 0 followed by jnb acquired is a crazy misunderstanding of how lock bts dword ptr [ebp], 0 works. If the lowest bit of the 64-bit value in memory was 0, it is changed to 1; CF receives the old bit value (0), so jnb jumps, meaning the lock on the 64-bit value was taken. BUT when the lowest bit was already 1 before lock bts dword ptr [ebp], 0, it stays untouched at 1, CF is set, and there is no jump. This means the computer thinks that the LOCK was not successful.

Oh, what a big mistake at the base of this code..

        cli
        push    ebx
        push    ebp
        pushfd
        xor     ebx, ebx
        mov     ebp, ecx
try:
        mov     edx, [ebp + 4]
        mov     eax, [ebp]
        or      eax, eax
        jz      short Efls20
        mov     ecx, edx
        mov     cx, bx
        lock bts dword ptr [ebp], 0
        jnb     acquired
        popfd
        pushfd
        test    dword ptr [ebp], 1
        je      try
acquired:
        cmp     eax, [ebp]
        jne     keep
        cmp     edx, [ebp + 4]
        je      exchange
keep:
        mov     eax, [ebp]
        mov     edx, [ebp + 4]
        jmp     done
exchange:
        mov     [ebp + 4], ecx
        mov     [ebp], ebx
        jmp     done
done:
        mov     byte ptr [ebp], 0
Efls20:
        sti
        popfd
        pop     ebp
        pop     ebx
        retn

the crazy wrong code from Chappell

Dietmar
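To make the `bts` behaviour discussed here concrete, a small C model of the instruction's documented effect is sketched below: CF receives the old value of the selected bit, and the bit is then set. The function name is hypothetical, and the LOCK prefix (atomicity) is not modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Model of `bts dword ptr [mem], bit`.
   Returns what CF would hold afterwards: the OLD value of the bit. */
int bts_model(uint32_t *mem, unsigned bit)
{
    assert(mem != NULL && bit < 32);
    int carry = (int)((*mem >> bit) & 1u);  /* CF := old bit        */
    *mem |= (1u << bit);                    /* the bit is then set  */
    return carry;
}
```

With this model, `jnb acquired` (jump if CF = 0) is taken exactly when the bit was clear before. It also shows the problem raised in the post: after the bit is set, the low dword no longer equals the value that was read into eax beforehand, so a later `cmp eax, [ebp]` cannot match for a value whose bit 0 was clear.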
Dietmar (author), posted April 1

Now I do everything by myself.

Dietmar

try:
        ; Emulation of CMPXCHG8B
        LOCK CMPXCHG [EBP], EAX
        jnz     fail
        LOCK CMPXCHG [EBP+4], EDX
        jnz     fail
        ; If both CMPXCHG conditions are met, perform the exchange
        mov     [ebp+4], ecx
        mov     [ebp+0], ebx
        jmp     check
fail:
        mov     edx, [ebp+4]    ; Reload edx with the value at ebp+4, the higher 32 bits
        mov     eax, [ebp+0]    ; Reload eax with the value at ebp, the lower 32 bits
        jmp     try
check:
        jnz     try
Dietmar (author), posted April 1

.data:004762B2 ; Exported entry 7. ExInterlockedFlushSList
.data:004762B2
.data:004762B2 ; =============== S U B R O U T I N E =======================================
.data:004762B2
.data:004762B2                 public ExInterlockedFlushSList
.data:004762B2 ExInterlockedFlushSList proc near       ; CODE XREF: sub_45F0DF:loc_45F0F7p
.data:004762B2                                         ; DATA XREF: .edata:off_5AC2A8o
.data:004762B2                 push    ebx
.data:004762B3                 push    ebp
.data:004762B4                 pushf
.data:004762B5                 cli
.data:004762B6                 xor     ebx, ebx
.data:004762B8
.data:004762B8 loc_4762B8:                             ; CODE XREF: ExInterlockedFlushSList:loc_4762E1j
.data:004762B8                 mov     ebp, ecx
.data:004762BA                 mov     edx, [ebp+4]
.data:004762BD                 mov     eax, [ebp+0]
.data:004762C0                 or      eax, eax
.data:004762C2                 jz      short loc_4762E3
.data:004762C4                 mov     ecx, edx
.data:004762C6                 mov     cx, bx
.data:004762C9                 LOCK CMPXCHG [EBP], EAX
.data:004762CE                 jnz     short loc_4762DB
.data:004762D0                 LOCK CMPXCHG [EBP+4], EDX
.data:004762D5                 jz      short loc_4762E1
.data:004762D7                 jmp     short loc_4762E1
.data:004762D9 ; ---------------------------------------------------------------------------
.data:004762D9
.data:004762D9 loc_4762DB:                             ; CODE XREF: ExInterlockedFlushSList+1Aj
.data:004762DB                                         ; ExInterlockedFlushSList+1Fj
.data:004762DB                 mov     eax, [ebp+0]
.data:004762DE                 mov     edx, [ebp+4]
.data:004762E1
.data:004762E1 loc_4762E1:                             ; CODE XREF: ExInterlockedFlushSList+27j
.data:004762E1                 jnz     short loc_4762B8
.data:004762E3
.data:004762E3 loc_4762E3:                             ; CODE XREF: ExInterlockedFlushSList+10j
.data:004762E3                 sti
.data:004762E4                 popf
.data:004762E5                 pop     ebp
.data:004762E6                 pop     ebx
.data:004762E7                 nop
.data:004762E8                 nop
.data:004762E9                 nop
.data:004762EA                 nop
.data:004762EB                 nop
.data:004762EC                 nop
.data:004762ED                 nop
.data:004762EE                 nop
.data:004762EF                 retn
.data:004762EF ExInterlockedFlushSList endp
.data:004762EF
.data:004762EF ; ---------------------------------------------------------------------------
user57, posted April 1

36 minutes ago, Dietmar said: (the ExInterlockedFlushSList listing from the previous post, quoted in full)

                LOCK CMPXCHG [EBP], EAX

<-- That already makes a change if the low 32 bits match. What is actually needed is the full 64-bit compare before any change is made, because if both 32-bit halves are not the same, the instruction does not write anything. That is the first mistake again.

The other part makes no decision: it jumps to the same place on both ZF = 1 and ZF = 0.

.data:004762D5                 jz      short loc_4762E1
.data:004762D7                 jmp     short loc_4762E1
Dietmar (author), posted April 1 (edited)

@user57 The LOCK CMPXCHG [EBP], EAX instruction compares the value in the EAX register with the value at the memory address pointed to by EBP. If they are equal, the value in EAX is stored at that memory address, and the zero flag (ZF) is set. If they are not equal, the value at the memory address remains unchanged, and the zero flag is cleared.

So, if the values are equal, the change is made and ZF is set. If the values are not equal, the change is not made and ZF is cleared. The LOCK prefix ensures atomicity, meaning that the operation is performed as an indivisible unit, preventing interference from other processors.

EDIT: Ah, now I see my error. It can happen that the other 32 bits are NOT equal. But in this case, the first half has already been erroneously exchanged. Can you repair my code?

Dietmar
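As a cross-check of the description above, here is a small C model of the documented 32-bit CMPXCHG behaviour (compare EAX with the memory operand; on a match store the source register and set ZF, otherwise load the memory value into EAX and clear ZF). The function name is hypothetical and atomicity is not modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Model of `cmpxchg [mem], src`. EAX is always the implicit comparand,
   whatever register is named as the source. Returns the resulting ZF. */
int cmpxchg32_model(uint32_t *mem, uint32_t *eax, uint32_t src)
{
    assert(mem != NULL && eax != NULL);
    if (*mem == *eax) {
        *mem = src;     /* match: the source register is stored, ZF = 1 */
        return 1;
    }
    *eax = *mem;        /* mismatch: EAX is overwritten, ZF = 0         */
    return 0;
}
```

Two things fall out of this model. With `cmpxchg [ebp], eax` the source is EAX itself, so a successful compare writes the same value back. And `cmpxchg [ebp+4], edx` still compares against EAX, not EDX, so it does not test the high half against the previously read high half. Either way, two 32-bit compare-exchanges do not add up to one 64-bit one.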
roytam1, posted April 1

FYI, in Linux there is an emulator for this instruction (but it doesn't do LOCK): https://android.googlesource.com/kernel/mediatek/+/android-5.1.0_r0.1/arch/x86/lib/cmpxchg8b_emu.S?autodive=0%2F%2F%2F%2F%2F%2F%2F%2F%2F%2F%2F%2F
Dietmar (author), posted April 1

@roytam1 Interesting, this is nearly exactly the first working emulation from @user57. For use in ExInterlockedFlushSList from XP SP3 it is enough, but not for the much more complex function ExInterlockedPopEntrySList from XP SP3. With this emulator I got to the desktop, but it crashes in less than a second. I think this happens because the check is not atomic.

Dietmar
Dietmar (author), posted April 1

@user57 I sharpened your emulator to its maximum. Now the boot time of XP is shorter.

Dietmar

.data:004762B2 ; Exported entry 7. ExInterlockedFlushSList
.data:004762B2
.data:004762B2 ; =============== S U B R O U T I N E =======================================
.data:004762B2
.data:004762B2
.data:004762B2                 public ExInterlockedFlushSList
.data:004762B2 ExInterlockedFlushSList proc near       ; CODE XREF: sub_45F0DF:loc_45F0F7p
.data:004762B2                                         ; DATA XREF: .edata:off_5AC2A8o
.data:004762B2                 push    ebx
.data:004762B3                 push    ebp
.data:004762B4                 pushf
.data:004762B5                 cli
.data:004762B6                 xor     ebx, ebx
.data:004762B8                 mov     ebp, ecx
.data:004762BA                 mov     edx, [ebp+4]
.data:004762BD                 mov     eax, [ebp+0]
.data:004762C0
.data:004762C0 loc_4762C0:                             ; CODE XREF: ExInterlockedFlushSList+2Fj
.data:004762C0                 or      eax, eax
.data:004762C2                 jz      short loc_4762E3
.data:004762C4                 mov     ecx, edx
.data:004762C6                 mov     cx, bx
.data:004762C9                 cmp     eax, [ebp+0]
.data:004762CC                 jnz     short loc_4762DB
.data:004762CE                 cmp     edx, [ebp+4]
.data:004762D1                 jnz     short loc_4762DB
.data:004762D3                 mov     [ebp+0], ebx
.data:004762D6                 mov     [ebp+4], ecx
.data:004762D9                 jmp     short loc_4762E3
.data:004762DB ; ---------------------------------------------------------------------------
.data:004762DB
.data:004762DB loc_4762DB:                             ; CODE XREF: ExInterlockedFlushSList+1Aj
.data:004762DB                                         ; ExInterlockedFlushSList+1Fj
.data:004762DB                 mov     eax, [ebp+0]
.data:004762DE                 mov     edx, [ebp+4]
.data:004762E1                 jmp     short loc_4762C0
.data:004762E3 ; ---------------------------------------------------------------------------
.data:004762E3
.data:004762E3 loc_4762E3:                             ; CODE XREF: ExInterlockedFlushSList+10j
.data:004762E3                                         ; ExInterlockedFlushSList+27j
.data:004762E3                 sti
.data:004762E4                 popf
.data:004762E5                 pop     ebp
.data:004762E6                 pop     ebx
.data:004762E7                 nop
.data:004762E8                 nop
.data:004762E9                 nop
.data:004762EA                 nop
.data:004762EB                 nop
.data:004762EC                 nop
.data:004762ED                 nop
.data:004762EE                 nop
.data:004762EF                 retn
.data:004762EF ExInterlockedFlushSList endp
.data:004762EF
.data:004762EF ; ---------------------------------------------------------------------------
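Read as C, the flush listing above seems to do the following. This is a single-threaded sketch with hypothetical type and function names; it assumes the depth lives in the low 16 bits of the word at [ebp+4] and the sequence in the high 16 bits, and it leaves out the cli/retry protection entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model types, not the real kernel definitions. */
typedef struct flush_entry_model {
    struct flush_entry_model *next;
} flush_entry_model;

typedef struct {
    flush_entry_model *next;   /* [ebp+0] */
    uint32_t depth_seq;        /* [ebp+4] */
} flush_head_model;

/* Detaches the whole chain and returns its first entry (NULL if empty). */
flush_entry_model *flush_model(flush_head_model *h)
{
    assert(h != NULL);
    flush_entry_model *first = h->next;   /* mov eax, [ebp+0]             */
    if (first == NULL)                    /* or eax, eax / jz             */
        return NULL;
    /* mov ecx, edx / mov cx, bx with ebx = 0: depth cleared,
       sequence (high 16 bits) kept as it was. */
    h->depth_seq &= 0xFFFF0000u;
    h->next = NULL;                       /* mov [ebp+0], ebx             */
    return first;
}
```

Because the flush only ever writes "empty list, depth 0", it is a simpler case than the pop, which has to read through the first entry to find the next one.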
Dietmar (author), posted April 1 (edited)

With the debugger WinDbg connected, my XP does NOT crash! I have never seen this before.

Now I have an ntoskrnl.exe (see below) with 2 newly built functions in it, ExInterlockedFlushSList and ExInterlockedPopEntrySList, both now without any cmpxchg8b. For them I use my newly built and sharpened cmpxchg8b emulator.

I am absolutely sure: disconnect WinDbg and boot normally, and I get BSOD 0xA (xxx, 000000FF,...). And I double-checked that my ntoskrnl.exe is indeed the one used and no other^^, see the build date.

Dietmar

ntoskrnl.exe with 2 new functions without cmpxchg8b: https://ufile.io/en45qotb
pappyN4, posted April 2

So... how long does boot take? The last time I remember using a 486 was with Win3.11 as a kid. According to the wiki, Win2k is officially supported on a 486. XP must be slow.
Dietmar (author), posted April 2 (edited)

30 minutes ago, pappyN4 said: So... how long does boot take? Last time I remember using 486 was with Win3.11 as a kid. According to wiki win2k is officially supported on a 486. XP must be slow

@pappyN4 Hi, I know that you are good at assembler. Can you help to improve the emulator for cmpxchg8b, with code like this, for example:

        xor     eax, eax
.loop:
        lock xchg [ebp], eax
        test    eax, eax
        jz      .loop
Dietmar (author), posted April 2 (edited)

Maybe this?

ExInterlockedFlushSList proc near
        push    ebx
        push    ebp
        pushf
        cli
        xor     ebx, ebx
        mov     ebp, ecx
.loop:
        mov     edx, [ebp+4]
        mov     eax, [ebp]
        or      eax, eax
        jz      short .done
        mov     ecx, edx
        mov     cx, bx
        ; Attempt to swap low 32 bits
        lock cmpxchg [ebp], eax
        ; If the low swap was successful, attempt to swap high 32 bits
        jz      .high_swap
        ; If the low swap failed, retry the entire operation
        jmp     .loop
.high_swap:
        ; Attempt to swap high 32 bits
        lock cmpxchg [ebp+4], edx
        ; If the high swap of 32 bits was also successful,
        jz      .rescue
        ; If the high swap failed, retry the entire operation
        jmp     .loop
.rescue:
        ; Save ECX and EBX onto the stack
        push    ecx
        push    ebx
        lock xchg [ebp+4], ecx
        lock xchg [ebp], ebx
        ; Restore ECX and EBX from the stack
        pop     ebx
        pop     ecx
.done:
        sti
        popf
        pop     ebp
        pop     ebx
        ret
ExInterlockedFlushSList endp
pappyN4, posted April 2

5 hours ago, Dietmar said: @pappyN4 Hi, I know that you are good in Assembler. Can you help to improve the Emulator for the cmpxchg8b with code for example like this

You are much better than me. I can make a patch for x64 by looking at what others have done for x86, if the logic is similar and only needs simple jumps. Writing assembly? One class, 20 years ago, so not much help. Otherwise I would try for AVX on x64 using x86 as a guide.

I did look at the Win2000 ntoskrnl and some of the older pre-release ones. They all have cmpxchg8b too. But if Windows 2000 is able to install and run on a 486, then the instruction must be bypassed somehow for it to work. According to Chappell's site:

Quote: In the early days of Windows NT, however, not all the extant processors implemented the cmpxchg8b instruction. In versions before 5.1, every function that uses the instruction has an alternate coding for processors that do not support the instruction. Very early during its initialisation, the kernel checks whether the boot processor supports the cmpxchg8b instruction. If the support is missing, the kernel patches jmp instructions at the start of each of those functions to redirect execution to their alternates. Conversely, if the boot processor does support the instruction, and the functions are left unpatched

If that is accurate, then the original ntoskrnl on the ISO would be unpatched by default. But on an installation of the last version of Win2000 that works on the 486, the running kernel would be patched to use the "alternate functions". So maybe you can install Win2k on a 486, get a copy of the patched kernel, and see what "alternate functions" exist. I do not have a 486, so I can't test this theory.
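The "check the boot processor, then redirect to an alternate" idea from the quote can be illustrated roughly in C. Two caveats: the kernel is described as patching jmp instructions into the functions, whereas this sketch swaps in a function pointer, which is a different mechanism chosen only for readability. And all names and return values below are hypothetical placeholders. The one hard fact used is that CPUID leaf 1 reports cmpxchg8b support (CX8) in bit 8 of EDX; very old CPUs without the CPUID instruction itself would need a separate check first.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CPUID_1_EDX_CX8 (1u << 8)   /* CPUID leaf 1, EDX bit 8: CMPXCHG8B */

typedef int (*pop_impl)(void);

/* Placeholder bodies; the return values only identify which one ran. */
static int pop_using_cmpxchg8b(void) { return 8; }
static int pop_alternate(void)       { return 4; }

/* Takes the EDX value already obtained from CPUID leaf 1. */
int has_cmpxchg8b(uint32_t cpuid1_edx)
{
    return (cpuid1_edx & CPUID_1_EDX_CX8) != 0;
}

pop_impl select_pop(uint32_t cpuid1_edx)
{
    return has_cmpxchg8b(cpuid1_edx) ? pop_using_cmpxchg8b : pop_alternate;
}

int run_selected_pop(uint32_t cpuid1_edx)
{
    pop_impl impl = select_pop(cpuid1_edx);
    assert(impl != NULL);
    return impl();
}
```

On a processor where the CX8 bit is clear, `run_selected_pop` ends up in the alternate path, which is the role the patched-in jmp would play in the older kernels described in the quote.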