Dietmar Posted March 27 (edited)

Hi,

I am trying to install XP SP3 on the Shuttle Hot 433 board with a 486 CPU. Very early in Setup a message appears saying that the 486 CPU does not support the cmpxchg8b opcode, so XP cannot be installed. I also tried an XP SP3 installation taken from another computer in IDE mode; it crashes at once.

So I searched with IDA Pro for cmpxchg8b in a finished XP SP3 install. On a first try I found it in ntoskrnl.exe (the single-CPU kernel) and in ntdll.dll. There may be other PE files in XP that use this opcode too. The use is always the same: the opcode does an atomic compare and exchange on 64 bits in memory. So once a working solution is found, the replacement in other files is easy!

I am trying to replace it with a series of opcodes that the 486 CPU understands. This is not easy. I found this (Edit: this is wrong):

            push    ebx                  ; save nonvolatile registers
            push    ebp
            xor     ebx, ebx             ; zero out new pointer
            mov     ebp, ecx             ; save listhead address
            mov     edx, [ebp] + 4       ; get current sequence number
            mov     eax, [ebp] + 0       ; get current next link
    Efls10: or      eax, eax             ; check if list is empty
            jz      short Efls20         ; if z set, list is empty
            mov     ecx, edx             ; copy sequence number
            mov     cx, bx               ; clear depth leaving sequence number
            jnz     short Efls10         ; if z clear, exchange failed
    Efls20: pop     ebp                  ; restore nonvolatile registers
            pop     ebx
            ret

I tried this as a replacement for the function ExInterlockedFlushSList in ntoskrnl.exe in XP SP3. The funny thing is that the opcode cmpxchg8b qword ptr [ebp+0] is simply deleted. Maybe it works on NT4, but for me it crashes XP.

EDIT: Maybe this i386 version of ExInterlockedFlushSList really works only on a computer with one CPU and one core, like a 486 from 1992. Then my test on a modern computer will fail. It may also be that I now have a mix of cmpxchg8b, nothing at all, and cmpxchg on one computer, because I replaced only one occurrence of this function in ntoskrnl.exe.

Funny, this is from Cutler, 13 March 1996, and it is still identical in XP SP3. This is the original ExInterlockedFlushSList in XP SP3, first introduced in NT4 Service Pack 4.

Hex code:

    53 55 33 DB 8B E9 8B 55 04 8B 45 00 0B C0 74 0B 8B CA 66 8B CB 0F C7 4D 00 75 F1 5D 5B C3

    .text:0040B0B2 ; Exported entry 7. ExInterlockedFlushSList
    .text:0040B0B2
    .text:0040B0B2 ; =============== S U B R O U T I N E =======================================
    .text:0040B0B2
    .text:0040B0B2                 public ExInterlockedFlushSList
    .text:0040B0B2 ExInterlockedFlushSList proc near       ; CODE XREF: sub_45F0DF:loc_45F0F7p
    .text:0040B0B2                 push    ebx
    .text:0040B0B3                 push    ebp
    .text:0040B0B4                 xor     ebx, ebx
    .text:0040B0B6                 mov     ebp, ecx
    .text:0040B0B8                 mov     edx, [ebp+4]
    .text:0040B0BB                 mov     eax, [ebp+0]
    .text:0040B0BE
    .text:0040B0BE loc_40B0BE:                             ; CODE XREF: ExInterlockedFlushSList+19j
    .text:0040B0BE                 or      eax, eax
    .text:0040B0C0                 jz      short loc_40B0CD
    .text:0040B0C2                 mov     ecx, edx
    .text:0040B0C4                 mov     cx, bx
    .text:0040B0C7                 cmpxchg8b qword ptr [ebp+0]
    .text:0040B0CB                 jnz     short loc_40B0BE
    .text:0040B0CD
    .text:0040B0CD loc_40B0CD:                             ; CODE XREF: ExInterlockedFlushSList+Ej
    .text:0040B0CD                 pop     ebp
    .text:0040B0CE                 pop     ebx
    .text:0040B0CF                 retn
    .text:0040B0CF ExInterlockedFlushSList endp
    .text:0040B0CF
    .text:0040B0CF ; ---------------------------------------------------------------------------

With PE Maker I relocated this function in ntoskrnl.exe. This works(!). I do the relocation because the following replacement is bigger than the original hex code. I split the cmpxchg8b opcode into two parts with lock cmpxchg, because the 486 CPU understands that. But I get a BSOD. I use WinDbg and cannot find the reason. I checked my hex code several times and found no error. The only thing that in my eyes can go wrong is missing synchronization between the two cmpxchg instructions. This does not happen with cmpxchg8b, because the memory is locked for the whole operation.
Here is my last try at a replacement for ExInterlockedFlushSList:

    .data:004762B2 ; ---------------------------------------------------------------------------
    .data:004762B2 ; Exported entry 7. ExInterlockedFlushSList
    .data:004762B2
    .data:004762B2                 public ExInterlockedFlushSList
    .data:004762B2 ExInterlockedFlushSList:                ; CODE XREF: sub_45F0DF:loc_45F0F7p
    .data:004762B2                                         ; DATA XREF: .edata:off_5AC2A8o
    .data:004762B2                 push    ebx
    .data:004762B3                 push    ebp
    .data:004762B4                 xor     ebx, ebx
    .data:004762B6                 mov     ebp, ecx
    .data:004762B8                 mov     edx, [ebp+4]
    .data:004762BB                 mov     eax, [ebp+0]
    .data:004762BE
    .data:004762BE loc_4762BE:                             ; CODE XREF: .data:004762D5j
    .data:004762BE                 or      eax, eax
    .data:004762C0                 jz      short loc_4762DA
    .data:004762C2                 mov     ecx, edx
    .data:004762C4                 mov     cx, bx
    .data:004762C7                 lock cmpxchg [ebp+4], eax
    .data:004762CC                 mov     ecx, edx
    .data:004762CE                 mov     edx, ecx
    .data:004762D0                 lock cmpxchg [ebp+0], eax
    .data:004762D5                 jnz     short near ptr loc_4762BE+1
    .data:004762D7                 nop
    .data:004762D8                 nop
    .data:004762D9                 nop
    .data:004762DA
    .data:004762DA loc_4762DA:                             ; CODE XREF: .data:004762C0j
    .data:004762DA                 pop     ebp
    .data:004762DB                 pop     ebx
    .data:004762DC                 nop
    .data:004762DD                 nop
    .data:004762DE                 nop
    .data:004762DF                 retn
    .data:004762DF ; ---------------------------------------------------------------------------

I put this via relocation at the new address 4762B2. This is in the .data section and not in the .text section. That does not matter, because when I put the original hex code at this new place, it works. The original place at 40B0B2 I fill with 00 00 00 .. to make sure that my function at the new place is really used.

I want to get better at assembler. There is no free AI for assembler on the Internet. Do you have an idea, @Mov AX, 0xDEAD? ChatGPT, Bard AI and Bing go crazy when it comes to hex code.

Dietmar

Edited March 29 by Dietmar
gerwin Posted March 27 (edited)

Surely people at Vogons have tried similar things. I just did a quick search with the terms: site:vogons.org "windows xp" 486 cmpxchg8b

See for example the post from KCompRoom2000 here: https://www.vogons.org/viewtopic.php?t=82914

EDIT: Also the link to the POD tests at winhistory.de here: https://www.vogons.org/viewtopic.php?t=75778

PS: this is my 486 system, with DOS and Windows 95: https://www.vogons.org/viewtopic.php?p=1117089#p1117089

Edited March 27 by gerwin
Dietmar Posted March 27

Hi,

I also found this, but I have no idea how to build an emulation for the 486 CPU from it, because it has a retn, and a second retn is not good in a function.

Dietmar

The single instruction lock cmpxchg8b qword ptr [ebp] is replaceable with the following sequence:

            pushfd
    try:
            cli
            lock bts dword ptr [edi],0
            jnb acquired
            popfd
            pushfd
    wait:
            test dword ptr [edi],1
            je try
            pause                   ; if available
            jmp wait
    acquired:
            cmp eax,[ebp]
            jne keep
            cmp edx,[ebp+4]
            je exchange
    keep:
            mov eax,[ebp]
            mov edx,[ebp+4]
            jmp done
    exchange:
            mov [ebp],ebx
            mov [ebp+4],ecx
    done:
            mov byte ptr [edi],0
            popfd

and this lock cmpxchg8b qword ptr [esi] is replaceable with the following sequence:

            pushfd
    try:
            cli
            lock bts dword ptr [edi],0
            jnb acquired
            popfd
            pushfd
    wait:
            test dword ptr [edi],1
            je try
            pause                   ; if available
            jmp wait
    acquired:
            cmp eax,[esi]
            jne keep
            cmp edx,[esi+4]
            je exchange
    keep:
            mov eax,[esi]
            mov edx,[esi+4]
            jmp done
    exchange:
            mov [esi],ebx
            mov [esi+4],ecx
    done:
            mov byte ptr [edi],0
            popfd
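The quoted sequence guards the 64-bit compare and exchange with a lock bit at [edi]. A minimal Python sketch of that idea follows. It is only a model, not kernel code: the names `Memory64` and `cmpxchg8b_locked` are made up for illustration, and `threading.Lock` stands in for the `lock bts` spin loop with interrupts disabled.

```python
import threading


class Memory64:
    """Hypothetical model of one 8-byte memory operand plus its guard lock."""

    def __init__(self, low, high):
        self.low = low & 0xFFFFFFFF      # dword at [ebp]
        self.high = high & 0xFFFFFFFF    # dword at [ebp+4]
        self.guard = threading.Lock()    # stands in for the lock bit at [edi]


def cmpxchg8b_locked(mem, eax, edx, ebx, ecx):
    """Return (zf, eax, edx) after emulating cmpxchg8b under the guard."""
    with mem.guard:                           # "acquired:"
        if eax == mem.low and edx == mem.high:
            mem.low, mem.high = ebx, ecx      # "exchange:" store ECX:EBX
            return 1, eax, edx
        return 0, mem.low, mem.high           # "keep:" reload EDX:EAX
```

One difference is worth noting: the quoted sequence ends with `popfd`, which restores the flags saved at entry, so ZF does not carry the compare result the way it does after a real cmpxchg8b. The model returns ZF explicitly; a caller that follows the sequence with `jnz` would need some equivalent.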
user57 Posted March 28

Well, you certainly can translate this instruction into a 32-bit variant. You have already used the "cmpxchg" instruction, but it will sometimes do the wrong job, because it compares only 32 bits and then already reacts to those 32 bits. Whether the first 32 bits or the next 32 bits matched changes the result. (.data:004762D5 jnz short near ptr loc_4762BE+1: by then the result of the first 32-bit compare has been erased, and the jump reacts only to the compare of the next 32 bits.) But you need the result of a 64-bit compare!

It seems to me that you can also solve this problem by making two compares with "cmp" for the flags. The point is not to make the same mistake again (if you just do the 32-bit compare again, it reads the next 32 bits and ignores the first 32 bits from the first compare). You need a reaction to the first compare, then do the "cmp" again and react a second time. If both compares matched, you do the reaction as described, otherwise the other reaction described here: https://www.felixcloutier.com/x86/cmpxchg8b:cmpxchg16b

That description does not actually say anything about exchanging values. It says that if the 64-bit compare was equal, the data in ECX and EBX is stored, and otherwise the value goes to EDX and EAX (which does not look like an exchange to me). Maybe the description is lacking; what I usually do then is try it out and take a look. If it were an exchange (reading the code later, I do not see a common exchange; a common exchange would be EAX and EDX swapping their contents), it would be four "mov" instructions (two for the destination and two for the source), or two "xchg" instructions.

But looking at your assembly code, it seems different to me; I do not see an exchange (let me say I am not entirely certain here, but it might help to talk about it). The cmpxchg8b instruction seems to compare the registers EDX and EAX for equality against a memory location given as an offset from the second stack register, EBP (qword ptr [ebp+0]); qword usually describes a 64-bit move (word * 4 = 16 bits * 4). If the result was equal, it should store EAX and EDX at that offset (otherwise it probably loads those values into EDX and EAX). The next instruction is "jnz". It still sees the result of this compare: if they were equal it jumps back to "Efls10" (which looks like a loop to me), if not it continues to the end of the function.

Looking at your code again, "lock cmpxchg [ebp+4], eax" has no reaction after it, but it probably needs one (as said before, both 32-bit halves need a reaction). If the first compare fails, the sequence has to end there, not simply continue. Done your way, the first 32 bits can give a false result, and if the next 32 bits are right the code still does the job when it should not.

---------------------

If the 64-bit guys appear: 64 bit is not necessarily needed. If you have to handle more than 32 bits there are several methods to solve this (to name a few):

1: use two registers and build the behavior from them; there is a 32-bit instruction that is used for that (CDQ - Convert Word to Doubleword/Convert Doubleword to Quadword)
2: an offset to somewhere in memory that is bigger than 32 bits, handled as 64 bits
3: (even more is possible with an offset location) if you have more than 64 bits of flags, you just need an offset to a location where you actually handle the flags or data
4: for data movement there is the REP prefix; the CPU can see that it has to move a certain amount of data and translates the move into something it can actually process

The FSB (quad pumped) to the RAM does such a thing. Unlike what the 64-bit guys might think, you do not need a 64-bit offset for this. Another example would be the cache: HDDs use a cache to buffer the data, and that data can then be processed differently, with 2 bits (wires), 4 bits, 16, 32, 64 or even more (it rather comes down to what the physical cable/wire can do).
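The failure mode described in this post (the first 32-bit compare fails, the second one succeeds, and the write still happens) can be shown with a small model. This is a sketch in Python with hypothetical helper names; memory is a two-element list `[low, high]`, and `split_cas` runs two independent 32-bit compare-exchanges without reacting to the first result.

```python
def cmpxchg32(mem, index, expected, new):
    """Model of lock cmpxchg on one dword: returns (zf, value now in EAX)."""
    if mem[index] == expected:
        mem[index] = new
        return 1, expected
    return 0, mem[index]


def split_cas(mem, exp_low, exp_high, new_low, new_high):
    """Two independent 32-bit steps; only the second ZF survives."""
    cmpxchg32(mem, 1, exp_high, new_high)         # result is ignored
    zf, _ = cmpxchg32(mem, 0, exp_low, new_low)   # reacts to the low half only
    return zf


def cas64(mem, exp_low, exp_high, new_low, new_high):
    """What a 64-bit compare requires: both halves match before any write."""
    if mem[0] == exp_low and mem[1] == exp_high:
        mem[0], mem[1] = new_low, new_high
        return 1
    return 0
```

With memory `[0x55667788, 0x11223344]` and a stale expected high half, `split_cas` still overwrites the low half and reports success, while `cas64` leaves memory untouched and reports failure. The model is single-threaded; on real hardware a second processor could additionally change memory between the two steps.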
Dietmar Posted March 28 (edited)

Now I will describe as well as I can how the function ExInterlockedFlushSList works in XP SP3.

cmpxchg8b works on 64 contiguous bits. Those 64 bits (8 bytes) sit in the memory (RAM) of the computer at a given place. They are passed to cmpxchg8b indirectly through the 32-bit register EBP on the CPU. EBP holds a 32-bit address that points exactly to the first byte of those 64 bits. Even though EBP holds only a 32-bit address in XP, cmpxchg8b qword ptr [ebp+0] works on all 64 bits starting at the RAM location given by EBP. The cmpxchg8b instruction works directly on these 64 bits in memory. So we have cmpxchg8b qword ptr [ebp+0].

Example: the 64 bits in memory are 0x1122334455667788. 11223344 are the higher 32 bits, 55667788 the lower 32 bits. EAX holds 0x55667788 and EDX holds 94712056 (any value). Now only the 32 bits in EAX are compared by cmpxchg8b with the 64 bits in RAM (only the lower 32 bits of each are compared). This behavior is because we have a 32-bit OS. The higher bits in EDX are simply ignored, and so are the higher 32 bits of the 64 bits in RAM.

By the way, this means that when we use "lock cmpxchg" in an emulation, it makes no sense to use "lock cmpxchg" twice. Here we need the "lock" because only cmpxchg8b is atomic by itself, meaning no other processor can disturb the memory during its compare operation. For cmpxchg this is only guaranteed with the lock prefix in front of it.

In my example the lower 32 bits in RAM and in EAX are identical. In this case the lower 32 bits (of the 64-bit value in memory) are replaced with the 32 bits stored in EBX. But EBX = 00 00 00 00. This means the real list in memory is filled with 00 in its lower half. From a 32-bit view, this list is now completely empty. The higher 32 bits in RAM are not changed, whatever is there and whatever is in EDX. The Zero flag is set after a change happens.

If the bits in EAX and the lower 32 bits of the 64 bits in RAM are not identical, cmpxchg8b does nothing with the 64 bits in memory and also changes nothing in EAX, EDX, EBX, ECX, EBP. So in this case cmpxchg8b has the same effect as 90 90 90 90. The Zero flag is NOT set.

Now I see what happens with my attempt when I just replace cmpxchg8b qword ptr [ebp+0] with 90 90 90 90: at once I have an infinite loop, because the Zero flag is never set. It is unclear to me why this loop is there. In my eyes, on the first pass the two lower 32-bit values are identical and are exchanged for 00 00 00 00.

Edited March 29 by Dietmar
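For comparison, the instruction reference linked earlier in the thread (felixcloutier.com) describes cmpxchg8b as comparing the full 64 bits of EDX:EAX with the memory operand: on a match ZF is set and ECX:EBX is stored to memory, on a mismatch ZF is cleared and the memory value is loaded into EDX:EAX. The sketch below models that description in Python using the 0x1122334455667788 example; the function name and the tuple return value are assumptions made for illustration.

```python
MASK32 = 0xFFFFFFFF


def cmpxchg8b(m64, eax, edx, ebx, ecx):
    """Model of the documented behavior: returns (new_m64, eax, edx, zf)."""
    if ((edx << 32) | eax) == m64:
        return (ecx << 32) | ebx, eax, edx, 1   # store ECX:EBX, ZF = 1
    return m64, m64 & MASK32, m64 >> 32, 0      # load m64 into EDX:EAX, ZF = 0


# EAX matches the low half, but EDX = 0x94712056 does not match the high half.
print(cmpxchg8b(0x1122334455667788, 0x55667788, 0x94712056, 0, 0x94710000))
```

In this model the mismatching call leaves memory unchanged, clears ZF and returns EDX = 0x11223344, the current high half. That reload is what gives the `jnz` loop in the listing a purpose: after a failed attempt the registers hold fresh values for the next try.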
Dietmar Posted March 28 (edited)

Now we come to the whole work of the function ExInterlockedFlushSList in XP SP3. After its call, the function starts with

    push ebx          ; Push the value of EBX onto the stack to save it there; its value is not changed.
    push ebp          ; Push the value of EBP onto the stack to save it there; its value is not changed.
    xor ebx, ebx      ; Set EBX to zero (EBX = 00 00 00 00) by a bitwise XOR with itself.
    mov ebp, ecx      ; Copy ECX into EBP (the ECX value has to be prepared outside this function).
    mov edx, [ebp+4]  ; Copy the high 32-bit value stored at RAM address [ebp+4] into EDX (EBP now holds the former ECX).
    mov eax, [ebp+0]  ; Copy the low 32-bit value stored at RAM address [ebp+0] into EAX (EBP now holds the former ECX).

Now EBX is empty, EAX holds the lower 32 bits in RAM from the address in ECX, and EDX holds the higher 32 bits from the address in ECX.

    or eax, eax          ; If EAX was zero, the zero flag is set. If EAX was non-zero, the zero flag is cleared.
    jz short loc_4762CD  ; If EAX was zero, we jump (short) over the whole compare to address 4762CD.
    mov ecx, edx         ; Move the content of EDX to ECX. The old content of ECX is lost, the content of EDX is kept.
                         ; But the old content of ECX is (see above) already saved in EBP.
                         ; ECX now holds the higher 32 bits of the 64 bits in RAM.
    mov cx, bx           ; CX is the lower 16 bits of ECX, BX is the lower 16 bits of EBX.

mov cx, bx copies the lower 16 bits of EBX (BX) into the lower 16 bits of ECX (CX). The upper 16 bits of both EBX and ECX remain unchanged. This means that in ECX only the two highest bytes of the 64 bits in memory survive. They can also be 00 00. So it is not impossible that ECX = 00 00 00 00, but only when the two highest bytes of the 64 bits in memory are also 00 00.

Example:

    EBX = 0x12345678 (upper 16 bits: 0x1234, lower 16 bits: 0x5678)
    ECX = 0x98765432 (upper 16 bits: 0x9876, lower 16 bits: 0x5432)

Now mov cx, bx: EBX remains unchanged (0x12345678). ECX has only its lower 16 bits replaced with the lower 16 bits from BX = 0x5678. The upper 16 bits of ECX remain the same (0x9876). So the only change from mov cx, bx in this example is ECX = 0x98765678.

    jnz short loc_4762BE ; If cmpxchg8b qword ptr [ebp+0] changes RAM via EBX, the Zero flag is set.
                         ; Then we leave the loop and continue with the opcode right after this jnz.
                         ; If the bits in EAX and the lower 32 bits of the 64 bits in RAM are not identical,
                         ; cmpxchg8b qword ptr [ebp+0] does nothing with any memory or register,
                         ; but the Zero flag is not set, so the jump to loc_4762BE happens.
    pop ebp              ; Fetch the topmost value from the stack into EBP and remove it from the stack.
    pop ebx              ; Fetch the now topmost value from the stack into EBX and remove it from the stack.
    retn                 ; Return from ExInterlockedFlushSList to the caller: pop the return address from the
                         ; stack and jump to it, resuming execution at the point where the function was called.

Edited March 28 by Dietmar
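The `mov cx, bx` example above can be checked with a few lines of Python. This is a small sketch; the function name `mov_cx_bx` is made up, and 32-bit registers are modelled as plain integers.

```python
def mov_cx_bx(ecx, ebx):
    """Replace the low 16 bits of ECX with the low 16 bits of EBX."""
    return (ecx & 0xFFFF0000) | (ebx & 0x0000FFFF)


print(hex(mov_cx_bx(0x98765432, 0x12345678)))  # 0x98765678
```

In the function itself EBX is zero at this point, so the same operation simply clears the low 16 bits of ECX and keeps the upper 16 bits.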
Dietmar Posted March 28 (edited)

Here is the function ExInterlockedFlushSList from XP SP3 as relocated by me:

    .data:004762B2 ; Exported entry 7. ExInterlockedFlushSList
    .data:004762B2
    .data:004762B2 ; =============== S U B R O U T I N E =======================================
    .data:004762B2
    .data:004762B2                 public ExInterlockedFlushSList
    .data:004762B2 ExInterlockedFlushSList proc near       ; CODE XREF: sub_45F0DF:loc_45F0F7p
    .data:004762B2                                         ; DATA XREF: .edata:off_5AC2A8o
    .data:004762B2                 push    ebx
    .data:004762B3                 push    ebp
    .data:004762B4                 xor     ebx, ebx
    .data:004762B6                 mov     ebp, ecx
    .data:004762B8                 mov     edx, [ebp+4]
    .data:004762BB                 mov     eax, [ebp+0]
    .data:004762BE
    .data:004762BE loc_4762BE:                             ; CODE XREF: ExInterlockedFlushSList+19j
    .data:004762BE                 or      eax, eax
    .data:004762C0                 jz      short loc_4762CD
    .data:004762C2                 mov     ecx, edx
    .data:004762C4                 mov     cx, bx
    .data:004762C7                 cmpxchg8b qword ptr [ebp+0]
    .data:004762CB                 jnz     short loc_4762BE
    .data:004762CD
    .data:004762CD loc_4762CD:                             ; CODE XREF: ExInterlockedFlushSList+Ej
    .data:004762CD                 pop     ebp
    .data:004762CE                 pop     ebx
    .data:004762CF                 retn
    .data:004762CF ExInterlockedFlushSList endp
    .data:004762CF
    .data:004762CF ; ---------------------------------------------------------------------------

Edited March 28 by Dietmar
Dietmar Posted March 28 (edited)

And now the explanation of what the function ExInterlockedFlushSList really does.

The calling function passes the register ECX to ExInterlockedFlushSList. ECX holds the start address of a 64-bit list in memory. The function ExInterlockedFlushSList then handles two scenarios.

In the first, EAX = 0: ECX = NULL is given back to the calling function, which means that such a list never existed.

In the second, EAX is not NULL. In this case the ONLY thing that ExInterlockedFlushSList does is delete the pointer in the register ECX. But the two highest bytes are kept in ECX, so mostly ECX is not NULL, only when the two highest bytes are 00 00. The list itself stays untouched in memory. But now the calling function has lost all information about where this list is in memory, because of what ExInterlockedFlushSList did to ECX. And the calling function cannot repair this via ECX, because ECX contains only the two highest bytes of the 64 bits in RAM. The whole list is kept in RAM, and also with its higher 32 bits in EDX and its lower 32 bits in EAX.

Edited March 29 by Dietmar
Dietmar Posted March 28

I made a new attempt with my hacked function:

    .text:0040B0B2 ; Exported entry 7. ExInterlockedFlushSList
    .text:0040B0B2
    .text:0040B0B2 ; =============== S U B R O U T I N E =======================================
    .text:0040B0B2
    .text:0040B0B2                 public ExInterlockedFlushSList
    .text:0040B0B2 ExInterlockedFlushSList proc near       ; CODE XREF: sub_45F0DF:loc_45F0F7p
    .text:0040B0B2                                         ; DATA XREF: .edata:off_5AC2A8o
    .text:0040B0B2                 push    ebx
    .text:0040B0B3                 push    ebp
    .text:0040B0B4                 xor     ebx, ebx
    .text:0040B0B6                 mov     ebp, ecx
    .text:0040B0B8                 mov     edx, [ebp+4]
    .text:0040B0BB                 mov     eax, [ebp+0]
    .text:0040B0BE                 or      eax, eax
    .text:0040B0C0                 jz      short loc_40B0C9
    .text:0040B0C2                 mov     ecx, edx
    .text:0040B0C4                 mov     cx, bx
    .text:0040B0C7                 xor     ecx, ecx
    .text:0040B0C9
    .text:0040B0C9 loc_40B0C9:                             ; CODE XREF: ExInterlockedFlushSList+Ej
    .text:0040B0C9                 pop     ebp
    .text:0040B0CA                 pop     ebx
    .text:0040B0CB                 nop
    .text:0040B0CC                 nop
    .text:0040B0CD                 nop
    .text:0040B0CE                 nop
    .text:0040B0CF                 retn
    .text:0040B0CF ExInterlockedFlushSList endp
    .text:0040B0CF
    .text:0040B0CF ; ---------------------------------------------------------------------------

Hex code:

    53 55 33 DB 8B E9 8B 55 04 8B 45 00 09 C0 74 07 8B CA 66 89 D9 33 C9 5D 5B 90 90 90 90 C3

But I get this BSOD:

    kd> !analyze -v
    *******************************************************************************
    *                                                                             *
    *                        Bugcheck Analysis                                    *
    *                                                                             *
    *******************************************************************************

    DRIVER_IRQL_NOT_LESS_OR_EQUAL (d1)
    An attempt was made to access a pageable (or completely invalid) address at an
    interrupt request level (IRQL) that is too high. This is usually
    caused by drivers using improper addresses.
    If kernel debugger is available get stack backtrace.
Arguments: Arg1: 0a130038, memory referenced Arg2: 00000002, IRQL Arg3: 00000000, value 0 = read operation, 1 = write operation Arg4: f7839bd8, address which referenced memory Debugging Details: ------------------ READ_ADDRESS: 0a130038 CURRENT_IRQL: 2 FAULTING_IP: storport!StorPortExtendedFunction+57cd f7839bd8 8b7e24 mov edi,dword ptr [esi+24h] DEFAULT_BUCKET_ID: DRIVER_FAULT BUGCHECK_STR: 0xD1 PROCESS_NAME: System ANALYSIS_VERSION: 6.3.9600.17237 (debuggers(dbg).140716-0327) x86fre DPC_STACK_BASE: FFFFFFFFF78A3000 TRAP_FRAME: f78a2ef8 -- (.trap 0xfffffffff78a2ef8) ErrCode = 00000000 eax=8a619ab8 ebx=00000000 ecx=8a619b4c edx=00000000 esi=0a130014 edi=8a619ab8 eip=f7839bd8 esp=f78a2f6c ebp=f78a2f78 iopl=0 nv up ei pl zr na pe nc cs=0008 ss=0010 ds=0023 es=0023 fs=0030 gs=0000 efl=00010246 storport!StorPortExtendedFunction+0x57cd: f7839bd8 8b7e24 mov edi,dword ptr [esi+24h] ds:0023:0a130038=???????? Resetting default scope LAST_CONTROL_TRANSFER: from 80532747 to 804e3592 STACK_TEXT: f78a2aac 80532747 00000003 f78a2e08 00000000 nt!RtlpBreakWithStatusInstruction f78a2af8 8053321e 00000003 0a130038 f7839bd8 nt!KiBugCheckDebugBreak+0x19 f78a2ed8 804e187f 0000000a 0a130038 00000002 nt!KeBugCheck2+0x574 f78a2ed8 f7839bd8 0000000a 0a130038 00000002 nt!KiTrap0E+0x233 WARNING: Stack unwind information not available. Following frames may be wrong. 
f78a2f78 f783a26e 8a619ab8 8a6129f0 8a4be024 storport!StorPortExtendedFunction+0x57cd f78a2fa8 f782b356 8a610438 8a619ab8 8a610438 storport!StorPortExtendedFunction+0x5e63 f78a2fd0 804dbbd4 8a6129ac 8a612938 00000000 storport!DllInitialize+0xfc5 f78a2ff4 804db89e f789ded8 00000000 00000000 nt!KiRetireDpcList+0x46 f78a2ff8 f789ded8 00000000 00000000 00000000 nt!KiDispatchInterrupt+0x2a 804db89e 00000000 00000009 bb835675 00000128 0xf789ded8 STACK_COMMAND: kb FOLLOWUP_IP: storport!StorPortExtendedFunction+57cd f7839bd8 8b7e24 mov edi,dword ptr [esi+24h] SYMBOL_STACK_INDEX: 4 SYMBOL_NAME: storport!StorPortExtendedFunction+57cd FOLLOWUP_NAME: MachineOwner MODULE_NAME: storport IMAGE_NAME: storport.sys DEBUG_FLR_IMAGE_TIMESTAMP: 6142afab IMAGE_VERSION: 6.1.7601.25735 FAILURE_BUCKET_ID: 0xD1_storport!StorPortExtendedFunction+57cd BUCKET_ID: 0xD1_storport!StorPortExtendedFunction+57cd ANALYSIS_SOURCE: KM FAILURE_ID_HASH_STRING: km:0xd1_storport!storportextendedfunction+57cd FAILURE_ID_HASH: {2d353e86-f9c7-de18-d8db-956bcb502646} Followup: MachineOwner ---------
Dietmar Posted March 28 (edited)

So I think that even on one CPU with one core and one thread, cmpxchg8b qword ptr [ebp+0] is necessary with this approach.

Dietmar

PS: Now I think that I read Cutler's paper wrongly. There is NO .386 version in that paper at all.

Edited March 28 by Dietmar
jumper Posted March 28 Posted March 28 Use a slim lock instead. If an SList node is present, it must be processed (Next and Depth zeroed). A pointer to the next node in the list must be returned.
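This description, together with the comments in the listing from the first post ("clear depth leaving sequence number"), can be put together as a sketch. The Python model below is an illustration under assumptions: the header layout (Next pointer in the low dword, Depth in the low 16 bits of the high dword, Sequence in the upper 16 bits) is inferred from those comments rather than taken from documentation, and `flush_slist` is a made-up name. The header is a two-element list `[low, high]`.

```python
def flush_slist(header):
    """Detach all entries; return the old first-entry pointer (0 if empty)."""
    eax, edx = header[0], header[1]        # mov eax,[ebp+0] / mov edx,[ebp+4]
    while eax != 0:                        # or eax,eax / jz
        ecx = edx & 0xFFFF0000             # mov ecx,edx / mov cx,bx with BX = 0
        if header[0] == eax and header[1] == edx:
            header[0], header[1] = 0, ecx  # cmpxchg8b succeeded: Next = 0, Depth = 0
            break
        eax, edx = header[0], header[1]    # cmpxchg8b failed: retry with fresh values
    return eax                             # value left in EAX for the caller
```

In this model a non-empty header `[0x8A619AB8, 0x00070003]` becomes `[0, 0x00070000]` and the old first-entry pointer comes back to the caller, while an empty header is returned unchanged with a result of 0. The model is single-threaded, so the retry branch only documents what the loop is for.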
Dietmar Posted March 28 (edited)

@jumper I do not think that the register is always set to ECX = NULL, only when the two highest bytes are also 00 00. Because if it were, my fake function from above would always work. Can you please explain to me in detail what you think ExInterlockedFlushSList does?

"If an SList node is present, it must be processed (Next and Depth zeroed). A pointer to the next node in the list must be returned."

To me this sounds as if something of the original list has to be given back to the calling function via the register ECX, meaning ECX is not NULL if a real list exists. But from the code I see that the last 16 bits of ECX are set to zero for sure.

mov ebp, ecx means that the original pointer to the list in ECX is now saved in EBP.

mov edx, [ebp+4] means that the original content in RAM that the pointer, shifted by 4 bytes = 32 bits, points to is now stored in EDX. EBP holds the original pointer from ECX. It points to the lowest byte of the real 64 bits in RAM. So EDX now contains the whole higher 32 bits (not a pointer) of the original 64 bits in RAM.

With mov eax, [ebp+0], EAX holds the original content of the lower 32 bits of the original 64 bits in RAM.

With mov ecx, edx, ECX now also holds the higher 32 bits from RAM (no pointer any more; the address of the 64 bits is lost).

With mov cx, bx, the lowest 16 bits of ECX are set to 00 00, because EBX is completely empty.

What is in ECX now? The two highest bytes of the original 64 bits in RAM, with 00 00 at the end.

[EBP+0]: EBP still holds the pointer to the lowest byte in RAM, but with [ ] it becomes the real original 64 bits in RAM.

Now the lower 32 bits of the original 64 bits in RAM are compared with the content of EAX. EAX also holds the lower 32 bits, so the same bits as at the address [EBP+0]. The lower half of the 64-bit list in memory is filled with 00 00 00 00, because EBX = 00 00 00 00. The upper half of the 64-bit list in memory stays untouched. So there is no loop at all; the Zero flag is set. But ECX = the two highest bytes of the original 64 bits in RAM, followed by 00 00.

Even though no value is directly returned from this function, ECX contains the two highest bytes of the original 64 bits in RAM. EBP and EBX are restored from the stack to their original values from before the function was used. EAX still holds the lower 32 bits of the original 64 bits in RAM. EDX still holds the higher 32 bits from RAM.

So the address (pointer) to the 64 bits in RAM is lost. Also, the real 64-bit list keeps only its upper 32 bits. The lower 32 bits of this list become 00 00 00 00. So where is the flush? The pointer to the 64 bits in RAM is completely destroyed.

An emulation of cmpxchg8b has to show exactly these values in all the registers. This can be tested by hand.

Edited March 29 by Dietmar
Mark-XP Posted March 28 (edited)

6 hours ago, Dietmar said:

    Here is the from me relocated function ExInterlockedFlushSList from XP SP3
    ...
    .data:004762BE loc_4762BE:                             ; CODE XREF: ExInterlockedFlushSList+19j
    .data:004762BE                 or      eax, eax
    .data:004762C0                 jz      short loc_4762CD
    ...

or eax, eax: what does that do? Is it meant to initialize the flags OF and CF, or to modify the SF, ZF and PF flags?

Edited March 28 by Mark-XP
Dietmar Posted March 28

@Mark-XP

    or eax, eax ; If EAX was zero, the zero flag is set. If EAX was non-zero, the zero flag is cleared.

Dietmar
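To round out the answer on the flags: per the x86 instruction reference, OR clears OF and CF and sets SF, ZF and PF according to the result, and with both operands being EAX the register value itself does not change. A short Python sketch of that rule follows; the function name and the dictionary return value are assumptions for illustration.

```python
def or_eax_eax(eax):
    """Flags after `or eax, eax`; EAX itself keeps its value."""
    result = eax & 0xFFFFFFFF
    return {
        "ZF": int(result == 0),                              # set when EAX is zero
        "SF": result >> 31,                                  # copy of bit 31
        "PF": int(bin(result & 0xFF).count("1") % 2 == 0),   # even parity of low byte
        "CF": 0,                                             # always cleared by OR
        "OF": 0,                                             # always cleared by OR
    }
```

In the listing only ZF matters: the following `jz` skips the exchange when the first dword of the list header is zero.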