user57

Member
  • Posts

    105
  • Country

    Germany

Posts posted by user57

  1. And does the firmware translate this correctly?

    It would make sense that the hard drive's firmware knows this and translates these addresses to physical locations on the real drive.

    If the partition can be filled with the cluster layout you want, what is even the problem?

  2. Cixert, the creator of this thread, has mentioned other methods; it always came down to using bigger sectors, and Milkinis mentioned that again.

    Some say that already worked for them.

    It is a similar discussion:

    https://msfn.org/board/topic/176480-2-tib-limit-size-in-mbr-hard-drives/#comments

    User-mode wise it doesn't seem to be a problem to me, since file I/O uses the OVERLAPPED structure.

    It contains two 32-bit offsets (Offset and OffsetHigh, 64 bits together), and those get translated to a physical address on the hard drive (I think I recently pointed out somewhere how 64 bits are passed via a structure).

    https://learn.microsoft.com/en-us/windows/win32/api/minwinbase/ns-minwinbase-overlapped

    The hard drive case is a good example of why and how the 32-bit limit was passed: there were already HDDs with more than 4 GB, so a method to get past it already exists, because hard drives reached that range a lot earlier than RAM did.

    To be honest it doesn't look hard either, since the function can already do that. Sure, I don't know the Windows driver for now ... but that raises the question of why the driver can't do it.

    It looks simple to me, as far as I know about it.

    It just has to convert the 64-bit address given in the OVERLAPPED structure to a physical offset on the disk.

    Whether the "cluster/sector" size is 512, 4096 or whatever,

    that's easy too: it just means you have more data that you can actually use with the 64-bit offset.
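
    As a rough sketch of that conversion (plain Python arithmetic, not a call into any Windows API): the two 32-bit halves, Offset and OffsetHigh, are joined into one 64-bit byte offset, which is then divided by the sector size. The helper names and the sample position are made up for illustration.

```python
def join_offset(offset_low: int, offset_high: int) -> int:
    """Combine the two 32-bit halves (Offset, OffsetHigh) into one 64-bit byte offset."""
    return ((offset_high & 0xFFFFFFFF) << 32) | (offset_low & 0xFFFFFFFF)


def split_offset(byte_offset: int) -> tuple:
    """Split a 64-bit byte offset into (Offset, OffsetHigh)."""
    return byte_offset & 0xFFFFFFFF, (byte_offset >> 32) & 0xFFFFFFFF


def to_sector(byte_offset: int, sector_size: int) -> tuple:
    """Return (sector number, byte position inside that sector)."""
    return divmod(byte_offset, sector_size)


# Hypothetical position 3 TiB into a disk, more than a 32-bit byte offset can express.
position = 3 * 2**40
low, high = split_offset(position)
print(hex(low), hex(high))
print(to_sector(position, 512))
print(to_sector(position, 4096))
```

    The same byte position maps to a smaller sector number when the sectors are bigger, which is the whole point of the bigger-sector idea.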

     

    To give an example: if the sector size were 1 byte, you would have the 4 GB limit with a 32-bit offset, but that simply doesn't use the other 32 bits (which are available).

    If the sector size was 512 and is now 4096, that means you have 8 times more space.

    4 G sectors (32 bit) * 512 bytes = about 2.2 TB (2 TiB)
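
    The arithmetic can be checked with a few lines of Python. This is only a sketch that assumes a 32-bit sector count, as stored in an MBR partition entry; the function name is made up.

```python
def max_capacity_bytes(sector_size: int, lba_bits: int = 32) -> int:
    """Largest size addressable with lba_bits worth of sector numbers."""
    return (2 ** lba_bits) * sector_size


for size in (1, 512, 4096):
    cap = max_capacity_bytes(size)
    print(f"{size:>4} bytes/sector -> {cap} bytes = {cap / 10**12:.2f} TB")
```

    For 512-byte sectors this gives 2,199,023,255,552 bytes (2 TiB); for 4096-byte sectors it gives 8 times that, 16 TiB.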

     

    GPT describes partitions, it is not the disk itself; the partition table is a small structure on the disk (in the past it was easy to corrupt, and you had bad luck if it got damaged). That's why you don't get access to it too easily.

  3. Well, bringing GPT into this might be the wrong idea in this case.

    The idea was an MBR with bigger sectors, even though the title was about reading a GPT partition.

    GPT has no real use except the higher possible disk space.

    The idea that came up was just to increase the MBR sector size; booting from or reading a GPT partition would be a different question.

    That Paragon driver is made from a public driver, but it doesn't increase the MBR sector size.

    That driver probably emulates another disk, on which the driver performs the reads and writes.

    Only if the Windows driver really can't do that would a driver change be needed.

  4. Well, I don't know what this firmware is written in.

    But even if it were pure assembly code, I certainly could change that code to whatever is needed.

    For the firmware I suspect C/C++ (there are some differences between the two, but they are not big and I know them too), combined with some assembly code.

    I certainly can understand such code and change it, but it is something to read into; I don't know all the disk standards.

    But that's something a programmer can do.

    I was involved in Chrome GDI, Supermium, LLVM, SumatraPDF and that HEIC image encoder.

    To say the least, it took some time to read into that codec, but the code I actually understand.

    https://msfn.org/board/topic/185879-winxp-hevcheifheic-image-encoderdecoder/#comment-1254293

     

  5. Since it's finalized, you should document it and make a release.


    You told us it's acting oddly slow? Maybe you should try the code I posted.
    It may be a matter of how it reacts sometimes; one effect can be that the subtraction doesn't let it pop/push that well,
    and then an escape or some other logic has to take it out.

    Happy to see that you found a new section to use too; I told you it's risky to just use other RAM, and the one you had was in use.

    roytam gave you the right solution for this.

    Happy to see the 486 working well.


    Interesting to see that XP actually chose 32 MB instead of 256 MB.

    Caches usually make the computer faster.
    https://learn.microsoft.com/en-us/windows/win32/api/winioctl/ns-winioctl-storage_write_cache_property
    https://www.seagate.com/de/de/support/kb/disabling-the-write-cache-feature-in-windows-2000-xp-vista-and-windows-7-187751en/
     

  6. This is a good time to talk about the CPUID instruction.

    That instruction returns information about the processor.
    It stores that information in EAX, EBX, ECX, and EDX.

    Very interesting for WinXP might be the PSE flag and the PAE flag,

    with an interesting result for the claim we always hear somewhere, that "32 bits or wires are the limit for 32 bits".

    That author actually wrote it like this:
    "Summary of 32-bit paging":
    "This allows a maximum RAM configuration of 2^52 bytes, or 4 petabytes (about 4.5×10^15 bytes)."

    And it tells us Win2k actually made use of these methods: "Windows 2000 Datacenter Memory Limit 32 GB RAM"

    https://en.wikipedia.org/wiki/Physical_Address_Extension
    https://en.wikipedia.org/wiki/PSE-36
    https://en.m.wikipedia.org/wiki/Page_Size_Extension
    We might be able to, but OS, CPU and bus/RAM all have to support it.


    But back to the CPUID instruction.

    It reports which instructions can be used and which "technology" is available on this CPU.

    This includes whether it can execute the CMPXCHG8B instruction.

    In EDX: MMX (bit 23), CX8 (bit 8 = CMPXCHG8B), PSE (Page Size Extension, bit 3), PAE (Physical Address Extension, bit 6); in ECX: AVX (bit 28), SSE4.2, SSE4.1, SSE3 and so on.
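
    To make the bit positions concrete, here is a small Python sketch that decodes the EDX and ECX values CPUID returns for EAX = 1, and clears a single bit. It does not execute CPUID itself; the register value is a made-up example, not a dump from a real CPU, and the helper names are hypothetical.

```python
# Bit positions in the registers returned by CPUID with EAX = 1.
EDX_FLAGS = {"PSE": 3, "PAE": 6, "CX8": 8, "MMX": 23}
ECX_FLAGS = {"SSE3": 0, "SSE4.1": 19, "SSE4.2": 20, "AVX": 28}


def decode(value: int, table: dict) -> dict:
    """Map each flag name to True/False depending on its bit in value."""
    return {name: bool((value >> bit) & 1) for name, bit in table.items()}


def clear_flag(value: int, bit: int) -> int:
    """Return value with one feature bit cleared (kept within 32 bits)."""
    return value & ~(1 << bit) & 0xFFFFFFFF


# Hypothetical EDX value: only PSE and CX8 are set.
edx = (1 << 3) | (1 << 8)
print(decode(edx, EDX_FLAGS))
print(decode(clear_flag(edx, EDX_FLAGS["CX8"]), EDX_FLAGS))
```

    Clearing the CX8 bit in a recorded result is what hiding CMPXCHG8B from the OS would look like on the data side.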

    The operating system usually should know if an instruction is invalid.

    If it just continues, it might use SSE or MMX instructions, which should cause a BSOD.


    So rather be safe and supply a CPUID result that was actually recorded with a CPUID script on an old CPU (a script for CPUID is easy to write and can be found on the web),
    maybe from a late 486 CPU (from what we can google, those are said to have the CPUID instruction).
    Then you know for sure what those CPUs actually gave back as a result
    (the few flags, for example the one for CMPXCHG8B if it was available, you can just clear).

    Then you fill in either the registers or the place where Windows stores that information, so the OS (WinXP) can react to it, in case WinXP doesn't otherwise react when the instruction was not correctly recognized, failed, etc.

  7. Well, honestly I do not want to study the entire thing behind that.

    If it's a PCB control (which I don't know, nor think), you have to study the entire function chain for this, the entirety of Windows in relation to it.
    At least the entire behaviour related to the SLIST_HEADER/PKSPIN_LOCK structures is needed.
    That raises a big question of why those 2 structures would actually be that; it sounds at least very odd to me.
    So I want to say I'm out of this for now.

    I remember Intel removed the LOCK prefix since a virus once used it to hide its activity/itself (if I remember correctly:
    it executes the LOCK prefix, but it no longer has any effect, which lets normal activity continue).

    That description from the MASM archive tells us that LOCK REP was removed already on the 286 CPU, so a 486 is affected (wanna go back to a 186? (joke))

    A different CPU however needs some time to react; whether an interference could happen, to be honest I don't think so.

    And I changed the entire IDT and even made it invalid; not even executing 10 instructions caused a problem. If there were a fault within those 10 instructions then maybe, but
    this is not the case.

    These MOV instructions are however in the nanosecond range; I don't think it can actually be interrupted that fast.
    A thread/CPU switch takes time.
    Rather 10 milliseconds would be the scale here (for others: nanoseconds are a lot faster than milliseconds, 1 ms = 1,000,000 ns).

    If the thinking was about some kind of high-level language problem like a "Java atomic":
    Java and other programming languages don't have atomics tied to the hardware; that rather comes from the programming language itself
    and is not CPU based.

    Only assembly actually does such a thing; assembly doesn't work like a high-level language.

    IRQL, STI/CLI and LOCK

    Two LOCKs around two instructions don't make an "atomic move" either.
    Again, I don't think that is the problem.

    The REP instruction without LOCK would still be executed as 1 instruction; this goes as fast as the CPU can handle it,
    whatever exact cycles that takes on the CPU itself.

    I think if there is a problem, the problem is not with the emulation; the problem is elsewhere, but that is hard to say without writing a lot of code to try things out and looking at the WRK.
    Dietmar could look at those 5 functions in the Win2k kernel too; maybe that helps, or maybe not if the structure handling changed.
    If somebody has proof or the right knowledge, let me know.

    Actually maybe the CMPXCHG8B instruction was not fully used, only a part of what it does.
    Some changes can also be skipped: some are bad, like a BSOD, while others continue without full functionality, while others work correctly,
    and others work but not that well, and others add some code that just doesn't change anything and still function.

    Finding out for certain what controls SLIST_HEADER and PKSPIN_LOCK would be a next step, to see if the functions did the right things.

    But a next fault could also be the problem; it would not be uncommon that once 1 problem is solved, a next problem appears that has nothing to do with the first one anymore (just in case, I wanted to say that; for now hopefully not the problem).

    Let's just say that very likely those 2 structures (if correctly changed by the emulation) will be processed by some further code (why would an atomic move be needed?).

    https://www.nirsoft.net/kernel_struct/vista/SLIST_HEADER.html

  8. Interesting, that's neither "atomic" across the 2 moves nor does it use the interrupt flag.
    I wrote to Dietmar in a private message that he might leave it out.
    Also it doesn't have the checks or the loop, and the CMPXCHG8B compare is not done correctly.

    Maybe it just fulfills that function's needs.

    That can be: instead of just replacing the instruction, the function was written to its real needs.


    So we didn't have to be so specific, just get the correct function behaviour.

    Well done.

  9. roytam gave you the patch code; it is 5 changes:

    lea ecx, sub_40078C <--- that's the first function that gets replaced if it finds that CPUID number (ExInterlockedCompareExchange64)

    lea ecx, loc_4006F0 <-- next (ExInterlockedPopEntrySList)

    lea ecx, loc_400704 (ExInterlockedPushEntrySList)

    lea ecx, loc_400714 (ExInterlockedFlushSList)

    lea ecx, ExInterlockedAddLargeInteger

    The last one is different: it replaces sub_402352 with ExInterlockedAddLargeInteger.

     

     

    somebody can tell that the functions are at these places

  10. Well, I would think a different offset is possible,

    but if they are equal, it exchanges the value at that offset with ECX and EBX (it overwrites it).

    If you know about logic circuits you might know why.

    But exactly this is why I wanted to say he doesn't need that instruction; at least I don't see a reason for it.

    The function itself seems to compare:

    "if (the 64-bit entry at this offset still has the same value as EDX:EAX) -> replace it with ECX:EBX (then there is a changed 64-bit value there)"

    Function descriptions:

    https://learn.microsoft.com/en-us/windows-hardware/drivers/ddi/wdm/nf-wdm-exinterlockedflushslist

    https://learn.microsoft.com/en-us/windows-hardware/drivers/ddi/wdm/nf-wdm-exinterlockedpopentryslist

    https://learn.microsoft.com/en-us/windows-hardware/drivers/ddi/wdm/nf-wdm-exinterlockedpushentrylist
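
    The compare-and-replace described above can be modelled in a few lines of Python. This is only a sketch of the documented CMPXCHG8B behaviour on a dictionary standing in for memory; it says nothing about atomicity or the LOCK prefix, and the addresses and values are made up.

```python
MASK32 = 0xFFFFFFFF


def cmpxchg8b(mem, addr, edx, eax, ecx, ebx):
    """Model of CMPXCHG8B on a dict of 32-bit words; returns (zf, edx, eax)."""
    current = (mem[addr + 4] << 32) | mem[addr]            # high dword at +4, low at +0
    expected = ((edx & MASK32) << 32) | (eax & MASK32)     # EDX:EAX
    if current == expected:
        mem[addr] = ebx & MASK32                           # ECX:EBX goes to memory
        mem[addr + 4] = ecx & MASK32
        return 1, edx, eax                                 # ZF = 1, registers unchanged
    return 0, mem[addr + 4], mem[addr]                     # ZF = 0, memory loaded into EDX:EAX


mem = {0x1000: 0x11111111, 0x1004: 0x22222222}
# First call: EDX:EAX matches memory, so ECX:EBX is stored.
print(cmpxchg8b(mem, 0x1000, 0x22222222, 0x11111111, 0xBBBBBBBB, 0xAAAAAAAA))
# Second call: the old expected value no longer matches, nothing is stored.
print(cmpxchg8b(mem, 0x1000, 0x22222222, 0x11111111, 0xDDDDDDDD, 0xCCCCCCCC))
```

    The second call shows the failure path: memory stays as it is and the caller gets the current value back in EDX:EAX, which is what the retry loops in the SList functions rely on.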

     

    "If there were entries on the specified list, ExInterlockedFlushSList returns a pointer to the first SLIST_ENTRY structure that was an entry on the list; otherwise, it returns NULL."

    This raises the question of why they used this instruction to return a pointer to the first SLIST_ENTRY structure and to change that SLIST_ENTRY (which is an internal Windows structure).

    About that "atomic":

    It is supposed to be a "non-interruptible operation".

    Therefore Dietmar probably cleared the interrupt flag (so no interrupts; actually it still gets them, but that's another story) (the other part is the LOCK prefix).

    The next question that comes to mind is changing 64 bits at once (non-interruptible), not stepwise, which is what CMPXCHG8B actually does (even though it's 32-bit; it just uses 2 registers).

    BUT I have never seen that to be needed for 64 bits; that is far too small to have an interaction. Not even the interrupt entries, if they are changed (and those are constantly used), have that problem.

    https://www.quora.com/What-is-the-meaning-of-atomic-in-programming

    We are also talking about a function here, so the function itself might be "atomic" if it does its job, because the function has a start and an end to complete this step.

    We are not in C++, which might have some code like "atomic(this)"; in assembly you have to write the real instruction that physically does it (if that were a problem we would probably have more answers by now).

    Dietmar just said that he wants to replace CMPXCHG8B with working alternative code; it might be a bit of a road, but over time we will find it.

    I highly suspect that list is for threading/multicore.

  11. I heard Chappell died a few months ago; sad story, we could still use him.
    Reading Chappell's writing, it says that there once was a solution that doesn't use the instruction if it is not supported by the CPU.
    That doesn't necessarily mean that it just works if you take a different one from a different OS version; maybe, maybe not.


    Why would it have to be that other CMPXCHG instruction?

    The Linux one is not perfect. Depending on what the other routines do, the Linux one might work, but it's certainly not 100 % correct, while mine is.

    The Linux one looks almost the same as the one I posted, but it doesn't compare the 64 bits for the false result (maybe the Linux solution doesn't need to,
    but again, mine is correct and the Linux one is not).
    So why not "just the right one"?


    Doing it another way takes more instructions and maybe fixes; there are certainly multiple solutions.

    https://www.felixcloutier.com/x86/cmpxchg
    The description might be wrong this time: according to the description here, unlike CMPXCHG8B, it always compares EAX with the memory location.
    The description doesn't say that a register other than EAX can be chosen.


    Well, this time your code might work.


    But you are rather trying to fix up the results. Setting it to just loop back when ZF doesn't react is OK, but then you have to do this in every function like this.


    Also, that makes 2 LOCK XCHG and 2 LOCK CMPXCHG.

    Do you use CMPXCHG for the atomicity question?

    If you have to change 64 bits at once, then it would be atomic for the 64 bits: doing 64 bits in 1 step.

    Just having the LOCK prefix doesn't turn it into a 64-bit MOV.
     

  12. 36 minutes ago, Dietmar said:

    .data:004762B2 ; Exported entry   7. ExInterlockedFlushSList
    .data:004762B2
    .data:004762B2 ; =============== S U B R O U T I N E =======================================
    .data:004762B2
    .data:004762B2                 public ExInterlockedFlushSList
    .data:004762B2 ExInterlockedFlushSList proc near       ; CODE XREF: sub_45F0DF:loc_45F0F7p
    .data:004762B2                                         ; DATA XREF: .edata:off_5AC2A8o
    .data:004762B2                 push    ebx
    .data:004762B3                 push    ebp
    .data:004762B4                 pushf
    .data:004762B5                 cli
    .data:004762B6                 xor     ebx, ebx
    .data:004762B8
    .data:004762B8 loc_4762B8:                             ; CODE XREF: ExInterlockedFlushSList:loc_4762E1j
    .data:004762B8                 mov     ebp, ecx
    .data:004762BA                 mov     edx, [ebp+4]
    .data:004762BD                 mov     eax, [ebp+0]
    .data:004762C0                 or      eax, eax
    .data:004762C2                 jz      short loc_4762E3
    .data:004762C4                 mov     ecx, edx
    .data:004762C6                 mov     cx, bx
    .data:004762C9                 LOCK CMPXCHG [EBP], EAX
    .data:004762CE                 jnz     short loc_4762DB
    .data:004762D0                 LOCK CMPXCHG [EBP+4], EDX
    .data:004762D5                 jz      short loc_4762E1
    .data:004762D7                 jmp     short loc_4762E1
    .data:004762D9 ; ---------------------------------------------------------------------------
    .data:004762D9
    .data:004762D9 loc_4762DB:                             ; CODE XREF: ExInterlockedFlushSList+1Aj
    .data:004762DB                                         ; ExInterlockedFlushSList+1Fj
    .data:004762DB                 mov     eax, [ebp+0]
    .data:004762DE                 mov     edx, [ebp+4]
    .data:004762E1
    .data:004762E1 loc_4762E1:                             ; CODE XREF: ExInterlockedFlushSList+27j
    .data:004762E1                 jnz     short loc_4762B8
    .data:004762E3
    .data:004762E3 loc_4762E3:                             ; CODE XREF: ExInterlockedFlushSList+10j
    .data:004762E3                 sti
    .data:004762E4                 popf
    .data:004762E5                 pop     ebp
    .data:004762E6                 pop     ebx
    .data:004762E7                 nop
    .data:004762E8                 nop
    .data:004762E9                 nop
    .data:004762EA                 nop
    .data:004762EB                 nop
    .data:004762EC                 nop
    .data:004762ED                 nop
    .data:004762EE                 nop
    .data:004762EF                 retn
    .data:004762EF ExInterlockedFlushSList endp
    .data:004762EF
    .data:004762EF ; ---------------------------------------------------------------------------

    LOCK CMPXCHG [EBP], EAX <-- that already makes a change if the low 32 bits matched,
    but it actually needs the 64-bit compare before making any changes,

    because if both 32 + 32 bits are not the same, it must not do that.

    That's the first mistake again.

    The other part makes no decision; it jumps to the same place on both ZF = 1 and ZF = 0:
    .data:004762D5                 jz      short loc_4762E1
    .data:004762D7                 jmp     short loc_4762E1
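
    Why two separate 32-bit steps go wrong can be shown with a small Python model. It is a simplified illustration, not a transcription of the quoted listing: one function reacts to each half on its own, the other compares all 64 bits before storing anything. All names and values are hypothetical.

```python
def split_compare_exchange(mem, addr, edx, eax, ecx, ebx):
    """Naive emulation: two independent 32-bit compare-and-store steps."""
    if mem[addr] == eax:            # low half reacts on its own
        mem[addr] = ebx
        zf = 1
    else:
        eax = mem[addr]
        zf = 0
    if mem[addr + 4] == edx:        # high half overwrites ZF from the first step
        mem[addr + 4] = ecx
        zf = 1
    else:
        edx = mem[addr + 4]
        zf = 0
    return zf, edx, eax


def full_compare_exchange(mem, addr, edx, eax, ecx, ebx):
    """Compare all 64 bits first, store only if both halves match."""
    if mem[addr] == eax and mem[addr + 4] == edx:
        mem[addr], mem[addr + 4] = ebx, ecx
        return 1, edx, eax
    return 0, mem[addr + 4], mem[addr]


# Low half matches the expected value, high half does not.
start = {0: 0x11111111, 4: 0x22222222}
a, b = dict(start), dict(start)
print(split_compare_exchange(a, 0, 0x99999999, 0x11111111, 0xBBBBBBBB, 0xAAAAAAAA), a)
print(full_compare_exchange(b, 0, 0x99999999, 0x11111111, 0xBBBBBBBB, 0xAAAAAAAA), b)
```

    In the split version the low dword has already been overwritten although the 64-bit compare as a whole failed, and the value handed back in EAX is no longer what is in memory. The full version leaves memory untouched on a mismatch.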

     

     

  13. .text:8013CEA0 ExInterlockedPopEntrySList proc near    ; CODE XREF: CcScheduleReadAhead+2BBp
    .text:8013CEA0                                         ; sub_80108058+10p ...
    .text:8013CEA0                 push    ebx
    .text:8013CEA1                 push    ebp
                                   pushf
                                   cli 
    .text:8013CEA2                 mov     ebp, ecx
    .text:8013CEA4
    .text:8013CEA4 loc_8013CEA4: <-- this seems to be a jump target                ; DATA XREF: .text:loc_80140E17o
    .text:8013CEA4                 mov     edx, [ebp+4]
    .text:8013CEA7                 mov     eax, [ebp+0]
    .text:8013CEAA
    .text:8013CEAA loc_8013CEAA:  // valid                          ; CODE XREF: ExInterlockedPopEntrySList+1Cj
    .text:8013CEAA                 or      eax, eax
    .text:8013CEAC                 jz      short end_of_ExInterlockedPopEntrySList // has to be changed
    .text:8013CEAE                 mov     ecx, edx
    .text:8013CEB0                 add     ecx, 0FFFFh
    .text:8013CEB6
    .text:8013CEB6 loc_8013CEB6: <-- seems to be jumped to as well           ; DATA XREF: sub_80140AF4:loc_80140AFDo
    .text:8013CEB6                                         ; .text:80140D28o
    .text:8013CEB6                 mov     ebx, [eax]

                                   cmp     eax, [ebp+0]
                                   jnz     loc_fail // something we did 
                                   cmp     edx, [ebp+4]
                                   jnz     loc_fail // again 
                                   mov     [ebp+0], ebx
                                   mov     [ebp+4], ecx
                                   jmp loop_check_ExInterlockedPopEntrySList // the loop check
                   loc_fail:  
                                   mov     eax, [ebp+0]
                                   mov     edx, [ebp+4]


    .text:8013CEB8 loop_check_ExInterlockedPopEntrySList:                
    .text:8013CEBC                 jnz     short loc_8013CEAA // valid but need fix to that or eax,eax loop
    .text:8013CEBE
    .text:8013CEBE loc_8013CEBE/end_of_ExInterlockedPopEntrySList:                           ; CODE XREF: ExInterlockedPopEntrySList+Cj
                                   sti  
                                   popf  
    .text:8013CEBE                 pop     ebp
    .text:8013CEBF                 pop     ebx
    .text:8013CEC0                 retn
    .text:8013CEC0 ExInterlockedPopEntrySList endp
    -------------------------------------------------------
    Well, data refs don't make sense at these spots; a display bug?


    At 8013CEA4 it says it has 1 or more references, and that 80140E17 is 1 of them. Since the pushf and cli
    shifted the offsets by 2 bytes, that jump is a gamble if its location is not fixed.

    It would be common to see some jumps into different functions and the opposite.

    Location 8013CEB6 seems to be jumped to from
    sub_80140AF4:loc_80140AFD and .text:80140D28 (you should look at at least these 3 spots for this jump).
    This looks like the IDA disassembler to me; you best search for the addresses where they get jumped from.

    The jz at 8013CEAC has to be fixed to jump to the end, that is sti / end_of_ExInterlockedPopEntrySList:
    .text:8013CEAC                 jz end_of_ExInterlockedPopEntrySList


    .text:8013CEAA loc_8013CEAA: that one is valid, but since we have more code, the jump that goes there is a bit longer;
    that one is shown in the visible code at 8013CEBC   jnz     short loc_8013CEAA

    .text:8013CEAC   is valid but also needs an adjustment to reach (loc_8013CEBE/end_of_ExInterlockedPopEntrySList)


    If you want, you can try to remove sti, cli, popf and pushf (but it has to be all 4).

    ---------------------
    You could actually also use a different method:
    cmpxchg8b has 4 bytes of opcode and jnz short has 2, i.e. 6 bytes.
    You need 5, which makes a jmp to your location + 1 nop.

            cmpxchg8b qword ptr [ebp+0]
            jnz     short loc_8013CEAA

    Those two you replace with a jump to your memory location: use jmp + nop.
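
    The bytes for such a patch can be worked out with a short Python sketch. It builds a near jmp (opcode E9 plus a 4-byte displacement counted from the end of the jmp) and pads the rest with NOP (90). The patch address is where cmpxchg8b sits in the listing above; the target address of the new code is purely hypothetical, as is the function name.

```python
import struct


def make_jmp_patch(patch_addr: int, target_addr: int, patch_len: int) -> bytes:
    """Build 'jmp rel32' (E9 + 4-byte displacement) padded with NOPs to patch_len bytes."""
    if patch_len < 5:
        raise ValueError("a near jmp needs 5 bytes")
    rel32 = target_addr - (patch_addr + 5)      # relative to the instruction after the jmp
    return b"\xE9" + struct.pack("<i", rel32) + b"\x90" * (patch_len - 5)


# Overwrite the 6 bytes at 8013CEB8 (cmpxchg8b + jnz short) with a jump
# to a made-up location 0x80150000 that holds the replacement code.
patch = make_jmp_patch(0x8013CEB8, 0x80150000, 6)
print(patch.hex(" "))
```

    The replacement code then needs its own jumps back into the function, computed the same way from its own address.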

    Your memory location then does:

                                   cmp     eax, [ebp+0]
                                   jnz     loc_fail2 // something we did 
                                   cmp     edx, [ebp+4]
                                   jnz     loc_fail2 // again 
                                   mov     [ebp+0], ebx
                                   mov     [ebp+4], ecx
                                   jmp the_check // the loop check
                   loc_fail2:  
                                   mov     eax, [ebp+0]
                                   mov     edx, [ebp+4]
                   the_check:
                                   jnz     short loc_8013CEAA // this one conditional jmp to that loop (or      eax, eax)
                               // now you just have to jump back
                               jmp to (.text:8013CEBE loc_8013CEBE/end_of_ExInterlockedPopEntrySList) / just backwards,
                               // as if the instruction had executed


    I don't know if that NT version can be used for XP; they might have used different behaviour.

  14. At the moment

    there are 3 encoders (for the image format called .avif):

    AOM

    SVT-AV1

    rav1e

    one called HEIC/H.265/x265

    and hardware (for example NVIDIA's NVDEC)

    All of those claim to be new HEVC (H.265) codecs.

    Is there proof?

    Hardware might be faster at the moment, but not better (see 1:12 or 9:29, the mountain in the background).

    Software is a lot clearer; SVT at both P3 and P6 is better:

    https://youtu.be/5rgteZRNb-A?t=72

    Another candidate for pictures is JXL.

    https://www.youtube.com/watch?v=w7UDJUCMTng

    But we have to consider what settings were used; that actually makes a big difference.

    Others even talk about an H.266 codec:

    https://de.wikipedia.org/wiki/Versatile_Video_Coding

    There should be a real comparison, always going for the best settings the encoder offers, for both still pictures and motion video (also looking at B-frames, because often the secondary pictures are just compressed more strongly; in that case the first picture might look good, but the second looks rather blurred).

  15.   public ExInterlockedPopEntrySList
    .data:004762F2 ExInterlockedPopEntrySList proc near    ; CODE XREF: sub_40E06D+1DAp
    .data:004762F2                                         ; sub_41159B+8Ap ...
    .data:004762F2                 push    ebx             ; ExInterlockedPopEntrySList
    .data:004762F3                 push    ebp
                                   pushf
                                   cli
    .data:004762F4                 mov     ebp, ecx
                  loc_jumper_unknown:
    .data:004762F6                 mov     edx, [ebp+4]
    .data:004762F9                 mov     eax, [ebp+0]


    .data:004762FC
    .data:004762FC loc_4762FC:                             ; CODE XREF: ExInterlockedPopEntrySList+17j
    .data:004762FC                 or      eax, eax
    .data:004762FE                 jz      short loc_end
    .data:00476300                 lea     ecx, [edx-1]
                   loc_jumper_unknown2:
    .data:00476303                 mov     ebx, [eax]
    .data:00476305                 
                  loc_jumper_unknown3:
                                   cmp     eax, [ebp+0]
                               jnz     short loc_4762E5
                                   cmp     edx, [ebp+4]
                                   jnz     short loc_4762E5

    .data:004762ED                 mov     [ebp+0], ebx
    .data:004762F0                 mov     [ebp+4], ecx
    .data:004762F3                 jmp loop_check   
    .data:004762E5                 mov     eax, [ebp+0]
    .data:004762E8                 mov     edx, [ebp+4]                             


                  loop_check:                 
    .data:00476309                 jnz     short loc_4762FC
    .data:0047630B
    .data:0047630B loc_47630B: / loc_end:                            ; CODE XREF: ExInterlockedPopEntrySList+Cj
                                   sti
                                   popf
    .data:0047630B                 pop     ebp
    .data:0047630C                 pop     ebx
    .data:0047630D                 
    .data:0047630E                 
    .data:0047630F                 retn
    .data:0047630F ExInterlockedPopEntrySList endp


    For the C++ code you just have to look at the translation the C++ compiler produced; if equal, good.


    For this other function there is a jmp to "mov     ebx, [eax]"
    from 40a7470 (this means if the code is changed, that jump has to be adjusted to land there; if not, BSOD from another part of the code),
    since you added assembly instructions at the start (we could get rid of pushf, cli, sti and popf to keep that location in place).
    (That extra jump is missing in your 3rd post of code too.)
    It has 3 jumps that have to be fixed from other parts.


    0040B0DE                 jz      short loc_40B0EB (has to be adjusted); it has more code below it now.
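
    Adjusting such a short jump means recomputing its one displacement byte. A minimal Python sketch, assuming the usual 2-byte short form (for example 74 xx for jz short), where the displacement is a signed byte counted from the end of the jump; the 4-byte shift in the second call is a made-up example.

```python
def short_jump_rel8(jump_addr: int, target_addr: int) -> int:
    """Displacement byte for a 2-byte short jump located at jump_addr."""
    rel = target_addr - (jump_addr + 2)
    if not -128 <= rel <= 127:
        raise ValueError("target out of range for a short jump; a near jump is needed")
    return rel & 0xFF


# Addresses from the line above: jz short at 0040B0DE to loc_40B0EB.
print(hex(short_jump_rel8(0x0040B0DE, 0x0040B0EB)))
# Hypothetical: the target moved down by 4 bytes because code was inserted before it.
print(hex(short_jump_rel8(0x0040B0DE, 0x0040B0EB + 4)))
```

    If the inserted code pushes the target more than 127 bytes away, the short form no longer fits and the jump has to be rewritten as a longer one, which again shifts everything behind it.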

    For the others I have written down the locations; I think you can solve this.


    Tell me if this works. That instruction doesn't work in my VM, so I can't actually see how it reacts;
    if I could, that would make it a lot easier.

  16. Happy to see you had a good result; is it working now?

    Does this one work? The jumps have to be fixed to the right locations.

     

    .data:004762B2                 public ExInterlockedFlushSList
    .data:004762B2 ExInterlockedFlushSList proc near       ; CODE XREF: sub_45F0DF:loc_45F0F7p
    .data:004762B2                 push    ebx
    .data:004762B3                 push    ebp
                                             pushf
                                             cli
    .data:004762B4                 xor     ebx, ebx
    .data:004762B6                 mov     ebp, ecx
    .data:004762B8                 mov     edx, [ebp+4]
    .data:004762BB                 mov     eax, [ebp+0]

                   loc_1:
    .data:004762BE                 or      eax, eax
    .data:004762C0                 jz      short loc_end (004762F7)
    .data:004762C2                 mov     ecx, edx
    .data:004762C4                 mov     cx, bx
    .data:004762C7                 
    .data:004762C8
    .data:004762C8                             
    .data:004762C8                
    .data:004762C9                 
    .data:004762CE                 
    .data:004762D0                 
    .data:004762D1
    .data:004762D1 
    .data:004762D1               
    .data:004762D7                
    .data:004762D9           
    .data:004762DB ; --------------------------------------------------------------------------- 
    .data:004762DB ; emulation of CMPXCHG8B
    .data:004762DB                              
    .data:004762DB                cmp     eax, [ebp+0]
    .data:004762DE                jnz     short loc_4762E5
    .data:004762E0                 cmp     edx, [ebp+4]
    .data:004762E3                 jnz     short loc_4762E5
    .data:004762E5
    .data:004762E5  
    .data:004762ED 
    .data:004762ED                mov     [ebp+0], ebx
    .data:004762F0                mov     [ebp+4], ecx
    .data:004762F3                 jmp loop_check (004762F3)
                  

    .data:004762E5                 mov     eax, [ebp+0]
    .data:004762E8                 mov     edx, [ebp+4]
    .data:004762EB                 
    .data:004762ED
    .data:004762ED 
    .data:004762ED ; end emulation of CMPXCHG8B               
    .data:004762F0 ; ---------------------------------------------------------------------------              
                     
                  loop_check:                
    .data:004762F3                 jnz     short loc_1  (004762BE)

                  loc_end:
    .data:004762F7                 sti
    .data:004762F8                 popf
    .data:004762F9                 pop     ebp
    .data:004762FA                 pop     ebx
    .data:004762FB                 retn

                     
    .data:004762FF ExInterlockedFlushSList endp

  17. you making the same mistake

    that command sets flags, and react if the compare was correct or not


    there are 2 problems i can tell for certain

    in the first step

    cmpxchg can have 2 results (if equal it moves the source to memory, if not it moves the memory value to a register)

    (and it should not do either yet, because it has to compare 64 bits)

    with a 32 bit compare it already reacts to those 32 bits (the other 32 bits are ignored)

    then the following happens: the flags of the first compare are lost, although the first 32 bits have already been acted on

    then you run the same code again for the second half

    but the same problems sit here

    now the flags get changed a second time (and they should not)

    the compare reacts only to the next 32 bits (while ignoring the first 32 bits)

    if that compare was equal it sets the value and if not it sets no value (but you need that decision for the whole 64 bits)

    the zero flag (ZF) is then read as if the first 32 bits were not there

    in other words the results get jumbled up

    the solution does not look that hard to me

    you need 2 compares to see if the 64 bits you want to compare are equal,
    before you do anything that depends on the 64 bit result
    if both compares were equal you store the new values to the memory location; otherwise you need a separate path
    that loads the memory value into EDX and EAX
    (the flags should still be valid, unless you use a command that affects flags)


    compare EDX and EAX with the destination operand


    if equal, store ECX and EBX to the destination operand (the destination operand is an 8-byte memory location)

    // CMPXCHG8B should be removed and replaced by this code:

    // CMPXCHG8B - 32 bit emulator

    cmp dword ptr [ebp], eax // eax is supposed to hold the low part
    jne skip_and_load_edx_eax
    cmp dword ptr [ebp+4], edx // edx is supposed to hold the high part
    jne skip_and_load_edx_eax
    // all 64 bits were equal, store ECX and EBX
    mov dword ptr [ebp], ebx // ebx is supposed to hold the low part
    mov dword ptr [ebp+4], ecx // ecx is supposed to hold the high part
    jmp end_of_CMPXCHG8B

    // they were not equal, do as the command is described and load the memory value into EDX and EAX
    skip_and_load_edx_eax:
    mov eax, dword ptr [ebp] // low part
    mov edx, dword ptr [ebp+4] // high part

    end_of_CMPXCHG8B:
    // CMPXCHG8B - 32 bit emulator end

    // normal code continues

    this emulates the single CMPXCHG8B line, and it should also leave the correct flag (ZF) behind

    the jumps might need to be adjusted to their real target locations
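
    to make the expected behavior easier to check, here is a small C model of the emulation above. it is only a sketch: the `Regs` struct and the name `emulate_cmpxchg8b` are made up for this example, `mem[0]` stands for the dword at [ebp] and `mem[1]` for the one at [ebp+4]. the C code is not atomic by itself, so it assumes (like the listing with cli/sti) that nothing else can touch the memory in between.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register file for this sketch; the fields mirror the x86 registers. */
typedef struct {
    uint32_t eax, ebx, ecx, edx;
    int zf; /* zero flag: 1 = the 64 bit compare was equal */
} Regs;

/* Model of the emulation: compare EDX:EAX with the 8 bytes in memory.
   Equal     -> store ECX:EBX to memory, ZF = 1.
   Not equal -> load memory into EDX:EAX, ZF = 0. */
void emulate_cmpxchg8b(uint32_t mem[2], Regs *r)
{
    assert(mem != NULL && r != NULL);

    if (mem[0] == r->eax && mem[1] == r->edx) {
        mem[0] = r->ebx;   /* low half of the new value  */
        mem[1] = r->ecx;   /* high half of the new value */
        r->zf = 1;
    } else {
        r->eax = mem[0];   /* current low half  */
        r->edx = mem[1];   /* current high half */
        r->zf = 0;
    }
}
```

    if the flag model is right, the `jnz` after the emulated command jumps back exactly when `zf` is 0.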

     

    // notice: i could not test the command to see if the order is right (which half is the low and which the high part)
    the description says something about the upper and lower part, but as far as i remember you can never
    be exactly certain about this (in memory on x86, if you have the value 11223344, the 44 is the
    byte stored first, at the lowest address (very old architectures store that differently too - but in this case we don't have that problem, even on a 486))

    if that doesn't work i can certainly fix it, i need a test to be certain how the command reacts

    after that i can see its exact behavior

    the command description, however, says EDX and ECX contain the high part
    https://www.felixcloutier.com/x86/cmpxchg8b:cmpxchg16b
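
    the byte order can also be checked on paper. a minimal C sketch, assuming the usual little-endian layout of x86 (the helper names `store_le64` and `load_le32` are made up here): it writes a 64 bit value byte by byte the way x86 stores it and reads back the dwords at offset 0 ([ebp]) and offset 4 ([ebp+4]).

```c
#include <assert.h>
#include <stdint.h>

/* Write v into buf the way a little-endian CPU (x86) lays it out in memory:
   least significant byte first. */
void store_le64(uint8_t buf[8], uint64_t v)
{
    for (int i = 0; i < 8; i++)
        buf[i] = (uint8_t)(v >> (8 * i));
}

/* Read the 32 bit value that starts at byte offset off
   (0 corresponds to [ebp], 4 to [ebp+4]). */
uint32_t load_le32(const uint8_t buf[8], int off)
{
    assert(off == 0 || off == 4);
    return (uint32_t)buf[off]
         | ((uint32_t)buf[off + 1] << 8)
         | ((uint32_t)buf[off + 2] << 16)
         | ((uint32_t)buf[off + 3] << 24);
}
```

    for the value 1122334455667788 this gives 55667788 at offset 0 and 11223344 at offset 4, so the low part sits at [ebp] and pairs with EAX/EBX, the high part sits at [ebp+4] and pairs with EDX/ECX.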

    if the order of the halves turns out to be different, then just swap the spots from ebp to ebp+4 and ebp+4 to ebp (or swap the registers assigned to those ebp locations):

    // CMPXCHG8B - 32 bit emulator (only if a test shows the halves are the other way around)

    cmp dword ptr [ebp], edx // in that case edx would hold the low part
    jne skip_and_load_edx_eax
    cmp dword ptr [ebp+4], eax // in that case eax would hold the high part
    jne skip_and_load_edx_eax
    // all 64 bits were equal, store ECX and EBX
    mov dword ptr [ebp], ecx // in that case ecx would hold the low part
    mov dword ptr [ebp+4], ebx // in that case ebx would hold the high part
    jmp end_of_CMPXCHG8B

    // they were not equal, do as the command is described and load the memory value into EDX and EAX
    skip_and_load_edx_eax:
    mov edx, dword ptr [ebp] // low part
    mov eax, dword ptr [ebp+4] // high part // your 55667788 example says so

    end_of_CMPXCHG8B:
    // CMPXCHG8B - 32 bit emulator end


     

  18. well, you certainly can translate this command to a 32 bit code variant

    you have already used the "cmpxchg" assembly command
    but it will actually do the wrong job sometimes
    because it only compares 32 bits (and then already reacts to those 32 bits), so whether that compare was equal or not has already changed the result,
    because it can react to either the first 32 bits or the next 32 bits on their own
    (.data:004762D5                 jnz     short near ptr loc_4762BE+1  - doing it again erases the result of the first 32 bit compare and only reacts to the compare of the next 32 bits)

    but you need the result of a 64 bit compare!


    it seems to me that you can also solve this problem by:

    making 2 compares with "cmp" commands for the flags/reaction

    now it is about not making the same mistake (if you just do the 32 bit compare again, it reads the next 32 bits and ignores the first 32 bits
    from the first compare)
    you need a reaction to the first compare (a jump away if it was not equal)
    and then do the "cmp" command again and react a second time


    if both compares were equal you do the reaction just as described (otherwise the other described reaction):

    https://www.felixcloutier.com/x86/cmpxchg8b:cmpxchg16b

    that command description actually doesn't say anything about exchanging the values
    it just describes what happens if the 64 bit compare was equal

    it says: if the compare was equal, it stores the data from ECX and EBX to the destination,
    otherwise it loads the destination into EDX and EAX
    (which doesn't look like an exchange to me) - maybe the description is lacking (what i usually do then is try it out and take a look)


    // if it were an exchange it would be:

    (reading the code later i don't see a common exchange
    a common exchange would be if eax and edx swapped their values - eax getting the value of edx and edx getting the value of eax):
    4 assembly "mov" commands (2 for the destination and 2 for the source)

    or:
    2 times the "xchg" command
    //

    but! looking at your assembly code, it seems different to me
    i don't see an exchange (let me just say i'm not entirely certain here, but it might help to talk about it):

    the cmpxchg8b command seems to compare the registers EDX and EAX
    with the 64 bits at a memory location addressed through the base pointer register "EBP" (qword ptr [ebp+0]) (qword usually describes a 64 bit access (word * 4 (16 bits * 4)))
    if the result was equal it should store ECX and EBX to that location (otherwise it probably loads the values from there into EDX and EAX)

    the next command is "jnz", which still sees the result of this compare: if they were not equal it jumps back to "Efls10" (which looks like a loop to me)
    if they were equal it continues to the end and leaves this function

    looking at your code again, "lock cmpxchg [ebp+4], eax" has no reaction after it, but it needs one (as said before, both 32 bit halves need a reaction)
    if the compare failed it needs to stop right there (not always just continue)
    done that way the first 32 bits can give a false result - and if the next 32 bits match - it still does the job - while it should not
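
    a small C model shows where the two separate 32 bit steps go wrong. this is only a sketch with made-up names (`cmpxchg32`, `broken_cmpxchg64`, `fixed_cmpxchg64`), not the code from the listing: the broken variant throws away the first result, the fixed one checks both halves before it writes anything.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Model of one 32 bit CMPXCHG: compare *mem with *acc (the EAX role),
   store src on a match, otherwise load *mem into *acc. Returns ZF. */
int cmpxchg32(uint32_t *mem, uint32_t *acc, uint32_t src)
{
    assert(mem != NULL && acc != NULL);
    if (*mem == *acc) {
        *mem = src;
        return 1;
    }
    *acc = *mem;
    return 0;
}

/* The mistake: two independent 32 bit steps, only the second ZF survives. */
int broken_cmpxchg64(uint32_t mem[2], uint32_t *eax, uint32_t *edx,
                     uint32_t ebx, uint32_t ecx)
{
    (void)cmpxchg32(&mem[0], eax, ebx);   /* result of the low half is thrown away */
    return cmpxchg32(&mem[1], edx, ecx);  /* ZF reflects the high half only */
}

/* The fix: look at both halves first, write only if all 64 bits matched. */
int fixed_cmpxchg64(uint32_t mem[2], uint32_t *eax, uint32_t *edx,
                    uint32_t ebx, uint32_t ecx)
{
    assert(eax != NULL && edx != NULL);
    if (mem[0] == *eax && mem[1] == *edx) {
        mem[0] = ebx;
        mem[1] = ecx;
        return 1;
    }
    *eax = mem[0];
    *edx = mem[1];
    return 0;
}
```

    with memory 00000002:00000001 and an expected value of 00000002:00000009, the broken variant reports "equal" and overwrites only the high half, while the fixed variant leaves the memory alone and reports "not equal".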

    ---------------------
    in case the 64 bit guys appear: 64 bit is not necessarily needed

    if you have to use more than 32 bits there are several methods to solve this (to name a few)


    1:
    one is using 2 registers and building the 64 bit behavior on top of them

    there is a 32 bit assembly command that is used for exactly that (CDQ - Convert Doubleword to Quadword, it sign-extends EAX into EDX:EAX)
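
    as a rough illustration of method 1, here is a C sketch of what CDQ does and how a 64 bit add can be built from two 32 bit halves (on the CPU that would be ADD for the low halves and ADC for the high halves). `Pair32`, `cdq` and `add64` are names invented for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a register pair such as EDX:EAX. */
typedef struct { uint32_t lo, hi; } Pair32;

/* What CDQ does: copy the sign bit of EAX into every bit of EDX. */
Pair32 cdq(uint32_t eax)
{
    Pair32 p;
    p.lo = eax;
    p.hi = (eax & 0x80000000u) ? 0xFFFFFFFFu : 0u;
    assert(p.hi == 0u || p.hi == 0xFFFFFFFFu); /* EDX is all zeros or all ones */
    return p;
}

/* 64 bit addition from 32 bit pieces: add the low halves,
   then add the high halves plus the carry out of the low add. */
Pair32 add64(Pair32 a, Pair32 b)
{
    Pair32 r;
    r.lo = a.lo + b.lo;
    uint32_t carry = (r.lo < a.lo) ? 1u : 0u; /* unsigned wraparound means a carry */
    r.hi = a.hi + b.hi + carry;
    return r;
}
```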


    2:
    a pointer to a location in memory that holds more than 32 bits, which you then handle as 64 bits

    3 (even more is possible with a memory location):
    if you have more than 64 bits of flags you just need a pointer to a location where you actually handle the flags or data


    4:
    for moving blocks of data there is the REP prefix (for example REP MOVSD)

    the CPU can see that it has to move a certain amount of data, and it can translate that data movement into something it can actually process

    the FSB (quad pumped) to the RAM does such a thing

    unlike what the 64 bit guys might think, you don't need a 64 bit offset for this

    another example would be the cache: HDDs use a cache to buffer the data

    that data can then be transferred in different widths - like 2 bits (wires), 4 bits, 16, 32, 64 or even more (it rather comes down to what the physical cable/wire can do)
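
    the REP idea from point 4 can be modelled in a few lines of C. a sketch, assuming REP MOVSD with the direction flag cleared; `rep_movsd` is a made-up function, the parameter names just mirror the registers ECX, ESI and EDI.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Model of REP MOVSD: ECX counts the dwords still to copy,
   ESI and EDI walk forward through source and destination.
   Returns the final ECX value, which is 0 when the copy is done. */
uint32_t rep_movsd(uint32_t *edi, const uint32_t *esi, uint32_t ecx)
{
    assert(ecx == 0 || (edi != NULL && esi != NULL));
    while (ecx != 0) {
        *edi++ = *esi++;  /* one MOVSD step: copy 32 bits, advance both pointers */
        ecx--;
    }
    return ecx;
}
```

    the amount of data is just a counter, so the block can be far bigger than one register is wide.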

  19. my "DOS" knowledge has gotten a little old, but i remember that "going to DOS mode" from windows often resulted in a DOS that did not work well

    i also remember you had to press the menu key (i think it was F8 before windows starts) and select the DOS boot (instead of windows)

    then you had a native DOS, where most DOS apps actually worked

    in the windows-to-DOS mode not all DOS apps worked

    so does this one fix that problem?

  20. right, unlike in the past, youtube now looks a lot like a commercial tv channel

    the idea, however, was different: it was a platform for the users, who do the broadcasting themselves: (you) - tube

    it was a challenger to exactly this

    there were no ads, there were no odd filters

    i remember the makers had such a discussion in the past and refused the "connection" to the common others

    but slowly it went more and more in that direction

    the same is happening to google search at the moment, it doesn't find anything anymore

    in the past youtube also had challengers like veoh and other video platforms

    it looks hard to me to find a replacement for youtube at the moment - but if youtube continues that way, going to another platform might be the only option

    one challenger that commercial tv won't get rid of is the smartphone, it has basically replaced the common tv

    i have already moved away from google search, here we have alternative platforms
