
XP running on a 486 cpu


Dietmar


@user57

These are not jumps. They are the 3 DATA XREF references, which PeMaker cannot handle.

They are not a real problem, because you can repair them by hand, but the BSOD remains.

This code is from NT4 Service Pack 4.

Dietmar

.text:8013CEA0 ; Exported entry   4. ExInterlockedPopEntrySList
.text:8013CEA0
.text:8013CEA0 ; =============== S U B R O U T I N E =======================================
.text:8013CEA0
.text:8013CEA0
.text:8013CEA0                 public ExInterlockedPopEntrySList
.text:8013CEA0 ExInterlockedPopEntrySList proc near    ; CODE XREF: CcScheduleReadAhead+2BBp
.text:8013CEA0                                         ; sub_80108058+10p ...
.text:8013CEA0                 push    ebx
.text:8013CEA1                 push    ebp
.text:8013CEA2                 mov     ebp, ecx
.text:8013CEA4
.text:8013CEA4 loc_8013CEA4:                           ; DATA XREF: .text:loc_80140E17o
.text:8013CEA4                 mov     edx, [ebp+4]
.text:8013CEA7                 mov     eax, [ebp+0]
.text:8013CEAA
.text:8013CEAA loc_8013CEAA:                           ; CODE XREF: ExInterlockedPopEntrySList+1Cj
.text:8013CEAA                 or      eax, eax
.text:8013CEAC                 jz      short loc_8013CEBE
.text:8013CEAE                 mov     ecx, edx
.text:8013CEB0                 add     ecx, 0FFFFh
.text:8013CEB6
.text:8013CEB6 loc_8013CEB6:                           ; DATA XREF: sub_80140AF4:loc_80140AFDo
.text:8013CEB6                                         ; .text:80140D28o
.text:8013CEB6                 mov     ebx, [eax]
.text:8013CEB8                 cmpxchg8b qword ptr [ebp+0]
.text:8013CEBC                 jnz     short loc_8013CEAA
.text:8013CEBE
.text:8013CEBE loc_8013CEBE:                           ; CODE XREF: ExInterlockedPopEntrySList+Cj
.text:8013CEBE                 pop     ebp
.text:8013CEBF                 pop     ebx
.text:8013CEC0                 retn
.text:8013CEC0 ExInterlockedPopEntrySList endp
.text:8013CEC0
.text:8013CEC0 ; ---------------------------------------------------------------------------

 

Link to comment
Share on other sites


.text:8013CEA0 ExInterlockedPopEntrySList proc near    ; CODE XREF: CcScheduleReadAhead+2BBp
.text:8013CEA0                                         ; sub_80108058+10p ...
.text:8013CEA0                 push    ebx
.text:8013CEA1                 push    ebp
                               pushf                   ; added: save EFLAGS
                               cli                     ; added: disable interrupts
.text:8013CEA2                 mov     ebp, ecx
.text:8013CEA4
.text:8013CEA4 loc_8013CEA4:                           ; DATA XREF: .text:loc_80140E17o  <-- something seems to jump here
.text:8013CEA4                 mov     edx, [ebp+4]
.text:8013CEA7                 mov     eax, [ebp+0]
.text:8013CEAA
.text:8013CEAA loc_8013CEAA:                           ; CODE XREF: ExInterlockedPopEntrySList+1Cj  (valid)
.text:8013CEAA                 or      eax, eax
.text:8013CEAC                 jz      short end_of_ExInterlockedPopEntrySList ; has to be changed
.text:8013CEAE                 mov     ecx, edx
.text:8013CEB0                 add     ecx, 0FFFFh
.text:8013CEB6
.text:8013CEB6 loc_8013CEB6:                           ; DATA XREF: sub_80140AF4:loc_80140AFDo  <-- seems to have some jumps to it too
.text:8013CEB6                                         ; .text:80140D28o
.text:8013CEB6                 mov     ebx, [eax]

                               cmp     eax, [ebp+0]
                               jnz     loc_fail        ; added: low 32 bits differ
                               cmp     edx, [ebp+4]
                               jnz     loc_fail        ; added: high 32 bits differ
                               mov     [ebp+0], ebx
                               mov     [ebp+4], ecx
                               jmp     loop_check_ExInterlockedPopEntrySList ; go to the loop check
               loc_fail:
                               mov     eax, [ebp+0]
                               mov     edx, [ebp+4]


.text:8013CEB8 loop_check_ExInterlockedPopEntrySList:
.text:8013CEBC                 jnz     short loc_8013CEAA ; valid, but the offset must be fixed up to reach the "or eax, eax" loop
.text:8013CEBE
.text:8013CEBE loc_8013CEBE/end_of_ExInterlockedPopEntrySList:   ; CODE XREF: ExInterlockedPopEntrySList+Cj
                               sti
                               popf
.text:8013CEBE                 pop     ebp
.text:8013CEBF                 pop     ebx
.text:8013CEC0                 retn
.text:8013CEC0 ExInterlockedPopEntrySList endp
-------------------------------------------------------
Well, data refs don't make sense at these spots. Is it a display bug?


At 8013CEA4 it says there are 1 or more jumps to this location, and that one of them comes from 80140E17. Since the pushf and cli
shift the offsets by 2 bytes, that jump will land in the wrong place unless its target is fixed up.

It would be common to see some jumps into different functions and vice versa.

Location 8013CEB6 seems to be jumped to from
sub_80140AF4:loc_80140AFD and .text:80140D28 (you should look at least at those 3 spots for this jump).
This looks like the IDA disassembler to me; your best option is to search for the addresses the jumps come from.

The jz at 8013CEAC has to be fixed so that it jumps to the end, that is, to sti / end_of_ExInterlockedPopEntrySList:
.text:8013CEAC                 jz end_of_ExInterlockedPopEntrySList


.text:8013CEAA loc_8013CEAA: this one is valid, but since there is now more code, the jump that reaches it has to cover a bigger distance.
That jump is the one shown in the visible code at 8013CEBC   jnz     short loc_8013CEAA

.text:8013CEAC is valid too, but also needs adjusting to reach loc_8013CEBE/end_of_ExInterlockedPopEntrySList.


If you want, you can try removing sti, cli, popf and pushf (but it has to be all 4).

---------------------
You could actually also use a different method.
cmpxchg8b has 4 bytes of opcode and jnz short has 2, so 6 bytes in total.
You need 5 for a jmp to your own location, plus 1 nop.

        cmpxchg8b qword ptr [ebp+0]
        jnz     short loc_8013CEAA

Replace those two with a jmp to your memory location plus a nop.

At your memory location, then do:

                               cmp     eax, [ebp+0]
                               jnz     loc_fail2       ; low 32 bits differ
                               cmp     edx, [ebp+4]
                               jnz     loc_fail2       ; high 32 bits differ
                               mov     [ebp+0], ebx
                               mov     [ebp+4], ecx
                               jmp     the_check       ; go to the loop check
               loc_fail2:
                               mov     eax, [ebp+0]
                               mov     edx, [ebp+4]
               the_check:
                               jnz     loc_8013CEAA    ; conditional jump back to the loop (or eax, eax); a short jump may not reach from here
                               ; now you just have to jump back
                               jmp     loc_8013CEBE    ; .text:8013CEBE, end_of_ExInterlockedPopEntrySList,
                                                       ; just as if the original instruction had executed


I don't know whether that NT version can be used for XP; they might have used different behavior.
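
To make the compare-then-store idea above easier to follow, here is a minimal sketch in C of what the emulated sequence is meant to do. The names `slot64` and `emulated_cmpxchg8b` are made up for illustration and appear nowhere in the kernel. The sketch is not atomic by itself: it assumes, as the pushf/cli approach does, that nothing else can touch the 8 bytes while it runs (interrupts off, single processor).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical view of the 8 bytes at [ebp+0] and [ebp+4]. */
typedef struct {
    uint32_t lo;  /* [ebp+0] */
    uint32_t hi;  /* [ebp+4] */
} slot64;

/*
 * Model of the emulated CMPXCHG8B.
 * Expected value is EDX:EAX, new value is ECX:EBX.
 * Returns true where the real instruction would set ZF = 1.
 * On a mismatch it reloads EDX:EAX from memory and returns false (ZF = 0).
 * Assumption: the caller guarantees exclusivity (e.g. interrupts disabled
 * on a uniprocessor); nothing in this function is atomic.
 */
static bool emulated_cmpxchg8b(slot64 *slot,
                               uint32_t *eax, uint32_t *edx,
                               uint32_t ebx, uint32_t ecx)
{
    /* Compare BOTH halves before changing anything. */
    if (slot->lo == *eax && slot->hi == *edx) {
        slot->lo = ebx;
        slot->hi = ecx;
        return true;
    }
    /* Mismatch: hand the current contents back to the caller. */
    *eax = slot->lo;
    *edx = slot->hi;
    return false;
}
```

A caller would loop on this the same way the listing loops on jnz: retry with the reloaded EDX:EAX until the function returns true or the list is found empty.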

Link to comment
Share on other sites

Posted (edited)

New try, but one step back. This is for the first function.

EDIT: Now I think this will not work at all, because lock bts dword ptr [ebp], 0 changes the lowest bit of the 64 bits in memory. So when I do a compare later, the EAX compare will always fail.

Even the attempt with lock bts dword ptr [ebp], 0 followed by jnb acquired is a crazy misunderstanding of how lock bts dword ptr [ebp], 0 works. If the lowest bit of the 64 bits in memory was 0, it is changed to 1, CF receives the old bit (0), and jnb jumps, meaning that locking the 64 bits in memory worked. BUT when the lowest bit was already 1 before lock bts dword ptr [ebp], 0, it stays at 1, CF is set, and there is no jump. This means the computer thinks that the LOCK was not successful. Oh, what a big mistake at the base of this code.

cli
push ebx
push ebp
pushfd

xor ebx, ebx
mov ebp, ecx

try:
    mov edx, [ebp + 4]
    mov eax, [ebp]
    or eax, eax
    jz short Efls20
    mov ecx, edx
    mov cx, bx
    lock bts dword ptr [ebp], 0
    jnb acquired
    popfd
    pushfd

    test dword ptr [ebp], 1
    je try
  
acquired:
    cmp eax, [ebp]
    jne keep
    cmp edx, [ebp + 4]
    je exchange

keep:
    mov     eax, [ebp]
    mov     edx, [ebp + 4]
    jmp done

exchange:
    mov     [ebp + 4], ecx
    mov     [ebp], ebx
    jmp done

done:
    mov byte ptr [ebp], 0
    
Efls20:  
    sti
    popfd
    pop ebp
    pop ebx
 retn

the crazy wrong code from Chappell
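
As a small illustration of the problem described in the EDIT above, the following C sketch models what bts does to the low dword. The function name `bts32` is invented for this example, and the model covers only the immediate-operand form used in the code (bit index taken modulo 32). It shows that a snapshot taken in EAX before the bts no longer matches memory afterwards.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Model of "bts dword ptr [mem], imm": the OLD value of the bit is
 * returned (this is what lands in CF), and the bit is then set to 1.
 */
static bool bts32(uint32_t *word, unsigned bit)
{
    uint32_t mask = (uint32_t)1u << (bit & 31u);
    bool carry = (*word & mask) != 0;  /* CF = old bit */
    *word |= mask;                     /* bit is set in memory */
    return carry;
}
```

With an aligned pointer in memory, bit 0 starts out as 0, so the first bts returns CF = 0 (jnb is taken) but leaves memory one higher than the value read into EAX just before, so the following cmp eax, [ebp] cannot match.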

Dietmar

Edited by Dietmar
Link to comment
Share on other sites

Now I do everything by myself

Dietmar

 try:
; Emulation of CMPXCHG8B
    LOCK CMPXCHG [EBP], EAX
    jnz fail
    LOCK CMPXCHG [EBP+4], EDX
    jnz fail    

    ; If both CMPXCHG conditions are met, perform the exchange
    mov     [ebp+4], ecx
    mov     [ebp+0], ebx    
    jmp check

fail:
        mov     edx, [ebp+4]    ; Reload edx with the value at ebp+4 higher 32 bit
        mov     eax, [ebp+0]    ; Reload eax with the value at ebp, lower 32bit
        jmp try

check:
    jnz try

 

Link to comment
Share on other sites

.data:004762B2 ; Exported entry   7. ExInterlockedFlushSList
.data:004762B2
.data:004762B2 ; =============== S U B R O U T I N E =======================================
.data:004762B2
.data:004762B2                 public ExInterlockedFlushSList
.data:004762B2 ExInterlockedFlushSList proc near       ; CODE XREF: sub_45F0DF:loc_45F0F7p
.data:004762B2                                         ; DATA XREF: .edata:off_5AC2A8o
.data:004762B2                 push    ebx
.data:004762B3                 push    ebp
.data:004762B4                 pushf
.data:004762B5                 cli
.data:004762B6                 xor     ebx, ebx
.data:004762B8
.data:004762B8 loc_4762B8:                             ; CODE XREF: ExInterlockedFlushSList:loc_4762E1j
.data:004762B8                 mov     ebp, ecx
.data:004762BA                 mov     edx, [ebp+4]
.data:004762BD                 mov     eax, [ebp+0]
.data:004762C0                 or      eax, eax
.data:004762C2                 jz      short loc_4762E3
.data:004762C4                 mov     ecx, edx
.data:004762C6                 mov     cx, bx
.data:004762C9                 LOCK CMPXCHG [EBP], EAX
.data:004762CE                 jnz     short loc_4762DB
.data:004762D0                 LOCK CMPXCHG [EBP+4], EDX
.data:004762D5                 jz      short loc_4762E1
.data:004762D7                 jmp     short loc_4762E1
.data:004762D9 ; ---------------------------------------------------------------------------
.data:004762D9
.data:004762D9 loc_4762DB:                             ; CODE XREF: ExInterlockedFlushSList+1Aj
.data:004762DB                                         ; ExInterlockedFlushSList+1Fj
.data:004762DB                 mov     eax, [ebp+0]
.data:004762DE                 mov     edx, [ebp+4]
.data:004762E1
.data:004762E1 loc_4762E1:                             ; CODE XREF: ExInterlockedFlushSList+27j
.data:004762E1                 jnz     short loc_4762B8
.data:004762E3
.data:004762E3 loc_4762E3:                             ; CODE XREF: ExInterlockedFlushSList+10j
.data:004762E3                 sti
.data:004762E4                 popf
.data:004762E5                 pop     ebp
.data:004762E6                 pop     ebx
.data:004762E7                 nop
.data:004762E8                 nop
.data:004762E9                 nop
.data:004762EA                 nop
.data:004762EB                 nop
.data:004762EC                 nop
.data:004762ED                 nop
.data:004762EE                 nop
.data:004762EF                 retn
.data:004762EF ExInterlockedFlushSList endp
.data:004762EF
.data:004762EF ; ---------------------------------------------------------------------------

Link to comment
Share on other sites

LOCK CMPXCHG [EBP], EAX <-- this already performs its exchange as soon as the low 32 bits match,
but a full 64-bit compare is needed before any change is made,

because if the 32 + 32 bits are not both the same, nothing must be changed.

That is the first mistake again.

The other part makes no decision: it jumps both when ZF is 1 and when ZF is 0.
.data:004762D5                 jz      short loc_4762E1
.data:004762D7                 jmp     short loc_4762E1

 

 

Link to comment
Share on other sites

Posted (edited)

@user57

The LOCK CMPXCHG [EBP], EAX instruction compares the value in the EAX register with the value at the memory address pointed to by EBP. If they are equal, the value in EAX is stored at that memory address, and the zero flag (ZF) is set. If they are not equal, the value at the memory address remains unchanged, and the zero flag is cleared.

So, if the values are equal, the change will be made and ZF will be set. If the values are not equal, the change will not be made, and ZF will be cleared. The LOCK prefix ensures atomicity, meaning that the operation is performed as an indivisible unit, preventing interference from other processors.

EDIT: Ah, now I see my error. It can happen that the other 32 bits are NOT equal, but in that case the first 32 bits have already been erroneously exchanged. Can you repair my code?
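
The error described in the EDIT can be reproduced with a small C model. This is only a sketch: `cmpxchg32`, `split_cmpxchg64` and `slot64` are hypothetical names, and unlike the listing above (which passes EAX as the source) the low-half exchange here is given the new value as its source, which is what a real swap attempt would do. It shows the 64-bit slot ending up half-updated when only the low half matches.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint32_t lo;  /* [ebp+0] */
    uint32_t hi;  /* [ebp+4] */
} slot64;

/* Model of 32-bit CMPXCHG: *acc plays EAX, src is the register operand. */
static bool cmpxchg32(uint32_t *mem, uint32_t *acc, uint32_t src)
{
    if (*mem == *acc) {
        *mem = src;   /* equal: store src, ZF = 1 */
        return true;
    }
    *acc = *mem;      /* not equal: load memory into the accumulator, ZF = 0 */
    return false;
}

/*
 * Hypothetical "two halves" swap: exchange the low dword first, then the
 * high dword. If the high compare fails, the low dword has ALREADY been
 * replaced, so the slot is left torn.
 */
static bool split_cmpxchg64(slot64 *slot,
                            uint32_t exp_lo, uint32_t exp_hi,
                            uint32_t new_lo, uint32_t new_hi)
{
    if (!cmpxchg32(&slot->lo, &exp_lo, new_lo))
        return false;
    return cmpxchg32(&slot->hi, &exp_hi, new_hi);
}
```

Calling `split_cmpxchg64` with a matching low half and a stale high half returns false, yet the low half of the slot now holds the new value while the high half holds the old one. The real cmpxchg8b never leaves such a state behind.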

Dietmar

Edited by Dietmar
Link to comment
Share on other sites

@roytam1

Interesting, this is almost exactly the first working emulation from @user57.

It is enough for use in ExInterlockedFlushSList from XP SP3.

But it is not enough for the much more complex function ExInterlockedPopEntrySList from XP SP3.

With this emulator I reached the desktop, but it crashed in less than a second.

I think this happens because the check is not atomic.

Dietmar

Link to comment
Share on other sites

@user57

I have sharpened your emulator as far as it goes. Now XP's boot time is shorter.

Dietmar

.data:004762B2 ; Exported entry   7. ExInterlockedFlushSList
.data:004762B2
.data:004762B2 ; =============== S U B R O U T I N E =======================================
.data:004762B2
.data:004762B2
.data:004762B2                 public ExInterlockedFlushSList
.data:004762B2 ExInterlockedFlushSList proc near       ; CODE XREF: sub_45F0DF:loc_45F0F7p
.data:004762B2                                         ; DATA XREF: .edata:off_5AC2A8o
.data:004762B2                 push    ebx
.data:004762B3                 push    ebp
.data:004762B4                 pushf
.data:004762B5                 cli
.data:004762B6                 xor     ebx, ebx
.data:004762B8                 mov     ebp, ecx
.data:004762BA                 mov     edx, [ebp+4]
.data:004762BD                 mov     eax, [ebp+0]
.data:004762C0
.data:004762C0 loc_4762C0:                             ; CODE XREF: ExInterlockedFlushSList+2Fj
.data:004762C0                 or      eax, eax
.data:004762C2                 jz      short loc_4762E3
.data:004762C4                 mov     ecx, edx
.data:004762C6                 mov     cx, bx
.data:004762C9                 cmp     eax, [ebp+0]
.data:004762CC                 jnz     short loc_4762DB
.data:004762CE                 cmp     edx, [ebp+4]
.data:004762D1                 jnz     short loc_4762DB
.data:004762D3                 mov     [ebp+0], ebx
.data:004762D6                 mov     [ebp+4], ecx
.data:004762D9                 jmp     short loc_4762E3
.data:004762DB ; ---------------------------------------------------------------------------
.data:004762DB
.data:004762DB loc_4762DB:                             ; CODE XREF: ExInterlockedFlushSList+1Aj
.data:004762DB                                         ; ExInterlockedFlushSList+1Fj
.data:004762DB                 mov     eax, [ebp+0]
.data:004762DE                 mov     edx, [ebp+4]
.data:004762E1                 jmp     short loc_4762C0
.data:004762E3 ; ---------------------------------------------------------------------------
.data:004762E3
.data:004762E3 loc_4762E3:                             ; CODE XREF: ExInterlockedFlushSList+10j
.data:004762E3                                         ; ExInterlockedFlushSList+27j
.data:004762E3                 sti
.data:004762E4                 popf
.data:004762E5                 pop     ebp
.data:004762E6                 pop     ebx
.data:004762E7                 nop
.data:004762E8                 nop
.data:004762E9                 nop
.data:004762EA                 nop
.data:004762EB                 nop
.data:004762EC                 nop
.data:004762ED                 nop
.data:004762EE                 nop
.data:004762EF                 retn
.data:004762EF ExInterlockedFlushSList endp
.data:004762EF
.data:004762EF ; ---------------------------------------------------------------------------

 

Link to comment
Share on other sites

Posted (edited)

With the debugger WinDbg connected, my XP does NOT crash!

I have never seen this before. Now I have an ntoskrnl.exe (see below) with 2 newly built functions in it,

ExInterlockedFlushSList and ExInterlockedPopEntrySList,

both now without any cmpxchg8b.

For them I use my newly built and sharpened cmpxchg8b emulator.

I am absolutely sure: disconnect WinDbg and boot normally, and I get

BSOD 0xA (xxx, 000000FF,...)

And I double-checked that my ntoskrnl.exe is indeed the one used and no other^^, see the build date.

Dietmar

ntoskrnl.exe with 2 new functions without cmpxchg8b

https://ufile.io/en45qotb


Edited by Dietmar
Link to comment
Share on other sites

So... how long does boot take?  Last time I remember using 486 was with Win3.11 as a kid.  According to wiki win2k is officially supported on a 486.  XP must be slow:D 

Link to comment
Share on other sites

Posted (edited)
@pappyN4

Hi, I know that you are good at assembler. Can you help improve the emulator for cmpxchg8b, with code like this, for example?

xor eax, eax
.loop:
  lock xchg [ebp], eax
  test eax, eax
  jz .loop
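
For readers who find the xchg loop above hard to read, this is roughly what it does, sketched with C11 atomics. The helper names `take_word` and `give_back` are hypothetical. The loop swaps 0 into the word and keeps retrying for as long as the value it got back was 0, so a 0 in memory acts as "someone else currently holds it".

```c
#include <stdatomic.h>
#include <stdint.h>

/*
 * Model of:   xor eax, eax
 *      .loop: lock xchg [ebp], eax
 *             test eax, eax
 *             jz .loop
 * Returns the non-zero value taken out of *word; *word is left at 0.
 */
static uint32_t take_word(_Atomic uint32_t *word)
{
    uint32_t taken;
    do {
        taken = atomic_exchange(word, 0u);  /* swap 0 in, old value out */
    } while (taken == 0u);                  /* 0 means busy: try again */
    return taken;
}

/* Hypothetical counterpart: publish a value again when finished. */
static void give_back(_Atomic uint32_t *word, uint32_t value)
{
    atomic_store(word, value);
}
```

One design caveat with this scheme: if the word can legitimately be 0 (an empty list, for instance), the loop cannot tell "empty" from "busy" and would spin forever, so the empty case has to be handled separately.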

 

Edited by Dietmar
Link to comment
Share on other sites

Posted (edited)

Maybe this?

ExInterlockedFlushSList proc near
    push    ebx
    push    ebp
    pushf
    cli
    xor     ebx, ebx
    mov     ebp, ecx 

.loop:
    mov     edx, [ebp+4]    
    mov     eax, [ebp]      

    or      eax, eax
    jz      short .done
    mov     ecx, edx
    mov     cx, bx

    ; Attempt to swap low 32 bits
    lock cmpxchg [ebp], eax

    ; If the low swap was successful, attempt to swap high 32 bits
    jz      .high_swap

    ; If the low swap failed, retry the entire operation
    jmp     .loop

.high_swap:

    ; Attempt to swap high 32 bits
    lock cmpxchg [ebp+4], edx

    ; If the high 32-bit swap was also successful, store the new value
    jz      .rescue

    ; If the high swap failed, retry the entire operation
    jmp     .loop

.rescue:
    ; Save ECX and EBX onto the stack
    push    ecx
    push    ebx

    lock xchg [ebp+4], ecx

    lock xchg [ebp], ebx

    ; Restore ECX and EBX from the stack
    pop     ebx
    pop     ecx

    
.done:
    sti
    popf
    pop     ebp
    pop     ebx
    ret
ExInterlockedFlushSList endp

 

Edited by Dietmar
Link to comment
Share on other sites

You are much better than me.  I can make a patch for x64 by looking at what others have done for x86, if the logic is similar and only needs simple jumps.  Writing assembly?  One class, 20 years ago, so not much help.  Otherwise I would try for AVX on x64 using x86 as a guide.

I did look at the win2000 ntoskrnl and some of the older pre-release ones.  They all have cmpxchg8b too.  But if Windows 2000 is able to install and run on a 486, then it must be bypassed somehow for that to work.  According to Chappell's site,

Quote

In the early days of Windows NT, however, not all the extant processors implemented the cmpxchg8b instruction. In versions before 5.1, every function that uses the instruction has an alternate coding for processors that do not support the instruction. Very early during its initialisation, the kernel checks whether the boot processor supports the cmpxchg8b instruction. If the support is missing, the kernel patches jmp instructions at the start of each of those functions to redirect execution to their alternates. Conversely, if the boot processor does support the instruction, and the functions are left unpatched

If accurate, then the original ntoskrnl on the ISO would be unpatched by default. But if you installed the last version of win2000 that works on the 486, then the kernel on C: would be patched to the "alternate functions".  So maybe you can install win2k on a 486, get a copy of the patched kernel, and see what "alternate functions" exist.  I do not have a 486, so I can't test the theory.
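
The mechanism in the quote (check once at startup, then redirect to an alternate routine) can be pictured with a short C sketch. This is not how the kernel does it: the kernel patches jmp instructions, while the sketch simply picks an implementation from a feature bit. The names `slist_impl`, `has_cmpxchg8b` and `select_slist_impl` are invented here. The one hard fact used is that CPUID leaf 1 reports CMPXCHG8B support in EDX bit 8 (CX8); the EDX value is passed in as a parameter because early 486 chips may not have the CPUID instruction at all.

```c
#include <stdbool.h>
#include <stdint.h>

/* CPUID leaf 1, EDX bit 8 (CX8): CMPXCHG8B is supported. */
#define CPUID_EDX_CX8 (1u << 8)

/* Hypothetical tag for which version of a routine should be used. */
typedef enum {
    IMPL_CMPXCHG8B,  /* normal code path using cmpxchg8b */
    IMPL_ALTERNATE   /* alternate coding for older processors */
} slist_impl;

/* cpuid1_edx is the EDX value from CPUID leaf 1, or 0 if CPUID is absent. */
static bool has_cmpxchg8b(uint32_t cpuid1_edx)
{
    return (cpuid1_edx & CPUID_EDX_CX8) != 0;
}

/* One-time decision made during initialisation. */
static slist_impl select_slist_impl(uint32_t cpuid1_edx)
{
    return has_cmpxchg8b(cpuid1_edx) ? IMPL_CMPXCHG8B : IMPL_ALTERNATE;
}
```

On a processor without the feature bit, every caller would end up in the alternate routine, which is the kind of code one would hope to find in a Windows 2000 kernel.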

Link to comment
Share on other sites
