user57

2024-04-16T07:45:39Z

and does the firmware translate this correctly?

it would make sense that the hard drive's firmware actually knows this and translates these addresses to physical places on the real hard drive

if the partition can be filled with whatever cluster layout you want, what is even the problem?

2024-04-16T00:11:12Z

Cixert, the creator of this thread, has mentioned other methods; it always came down to using bigger sectors, and that was mentioned again by Milkinis

some say that this already worked for them

it is a similar discussion:

https://msfn.org/board/topic/176480-2-tib-limit-size-in-mbr-hard-drives/#comments

user-mode wise it doesn't seem to be a problem to me, since it uses that OVERLAPPED structure

it contains two 32-bit offsets (64 bits together) -> those get translated to a physical address on the hard drive (I think I recently pointed out somewhere that 64 bits get passed via a structure)

https://learn.microsoft.com/en-us/windows/win32/api/minwinbase/ns-minwinbase-overlapped

that hard drive example makes a good example of why and how 32 bits were surpassed; we know there already were HDDs with more than 4 GB, so we actually have a proven method, because hard drives reached those sizes a lot earlier than RAM did

to be honest it doesn't look hard either, since the function already can do that - sure, I might not know enough about the Windows driver for now ... but that raises the question why the driver can't do that

it looks simple to me, up to the point I know about it

it just has to convert the 64-bit address given in the OVERLAPPED structure to a physical offset on the disc

whether the "cluster/sector" size is 512, 4096 or whatever

that's easy too, it just means you have more data that you can actually address with the 64-bit offset
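to make that conversion concrete, here is a minimal sketch in C. The struct is only a hypothetical stand-in for the two offset fields of OVERLAPPED (Offset and OffsetHigh), and the function names are made up for illustration; a real driver does more than this

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the Offset / OffsetHigh fields of OVERLAPPED. */
typedef struct {
    uint32_t offset_low;   /* low 32 bits of the byte offset  */
    uint32_t offset_high;  /* high 32 bits of the byte offset */
} offset_pair;

/* Join the two 32-bit halves into one 64-bit byte offset. */
uint64_t byte_offset(offset_pair o)
{
    return ((uint64_t)o.offset_high << 32) | o.offset_low;
}

/* Number of the sector that contains the given byte offset. */
uint64_t sector_index(uint64_t offset, uint32_t sector_size)
{
    assert(sector_size != 0);
    return offset / sector_size;
}
```

with a byte offset of exactly 2 TiB, a 512-byte sector size gives a sector number that no longer fits in 32 bits, while a 4096-byte sector size still fits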

to give an example: if the sector size were 1 byte you would have the 4 GB limit with a 32-bit offset, but that simply doesn't use the other 32 bits (which are available)

if the sector was 512 bytes and is now 4096 bytes, that means you have 8 times more space

2^32 sectors (32 bit) * 512 bytes = about 2.2 TB (2 TiB)
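the arithmetic can be checked with a few lines of C; the function name is just an illustration, and it assumes the sector number is the only limited field

```c
#include <assert.h>
#include <stdint.h>

/* Largest capacity in bytes when the sector number has `lba_bits` bits. */
uint64_t max_capacity_bytes(unsigned lba_bits, uint32_t sector_size)
{
    assert(lba_bits > 0 && lba_bits <= 32);  /* keeps the product inside 64 bits */
    return ((uint64_t)1 << lba_bits) * sector_size;
}
```

for 32 bits this gives 4 GiB with 1-byte sectors, 2 TiB with 512-byte sectors and 16 TiB with 4096-byte sectors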

GPT describes the partitions, it is not the disc itself; the partition table is a small structure on the disc (in the past it was easy to corrupt, you had bad luck if it got damaged) - that's why you don't get access to it too easily

2024-04-15T22:53:55Z

well, bringing up GPT might be the wrong idea in this case

the idea was an MBR with bigger sectors - even though the title was about reading the GPT partition

GPT doesn't really have a use except the higher possible disc space

the idea that came around was just to increase the MBR sector size; booting from or reading a GPT partition would be a different question then

that Paragon driver is made from a public driver, but it doesn't increase the MBR sector size

that driver probably emulates another disc, on which the driver does the reads and writes

only if the Windows driver really can't do that would a driver change be needed

2024-04-15T21:55:20Z

well, I don't know what this firmware is written in

but even if it were pure assembly code I certainly could change that code to whatever is needed

for the firmware I suspect C/C++ (there are some differences between these, but they are not big and I know them too), combined with some assembly code

I certainly can understand that code and change it, but it's something to read into - I don't know all the disc standards

but that's something a programmer can do

I was involved in Chrome GDI, Supermium, LLVM, SumatraPDF and that HEIC image encoder

to say the least, it took some time to read into that codec, but I actually understand the code

https://msfn.org/board/topic/185879-winxp-hevcheifheic-image-encoderdecoder/#comment-1254293

2024-04-15T18:11:15Z

since it's finalized you should write it up and make a release

you told us it's acting oddly slow? maybe you should try the code I posted above
that can actually be the reaction sometimes; one effect can be that the subtraction doesn't make it pop/push that well
then an escape or some other logic might have to take it out

happy to see that you found a new section to use too; I told you it's risky to just use other RAM, and the one you had was in use

roytam gave you the right solution for this

happy to see the 486 working well

interesting to see that XP actually chose 32 MB instead of 256 MB

caches usually make the computer faster
https://learn.microsoft.com/en-us/windows/win32/api/winioctl/ns-winioctl-storage_write_cache_property
https://www.seagate.com/de/de/support/kb/disabling-the-write-cache-feature-in-windows-2000-xp-vista-and-windows-7-187751en/

2024-04-15T07:14:56Z

i could write assembly or c++ to a firmware

but i think i need a drive to test

Saturday at 01:17 AM

this is a good time to talk about the CPUID command

that command returns info about the processor
it stores that information in EAX, EBX, ECX, and EDX

very interesting for WinXP might be the PSE flag and the PAE flag

with this interesting result, as we always have it around somewhere: "32 bits or wires are the limit for 32 bits"

that guy actually wrote it like this:
"Summary of 32-bit paging":
"This allows a maximum RAM configuration of 2^52 bytes, or 4 petabytes (about 4.5×10^15 bytes)."

and it tells us Win2k actually made use of these methods: "Windows 2000 Datacenter Memory Limit 32 GB RAM"
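the quoted sizes follow directly from the number of address bits; a small C check, where the function name is only an illustration

```c
#include <assert.h>
#include <stdint.h>

/* Bytes reachable with a physical address of `address_bits` bits. */
uint64_t addressable_bytes(unsigned address_bits)
{
    assert(address_bits < 64);  /* 1 << 64 would not fit in uint64_t */
    return (uint64_t)1 << address_bits;
}
```

32 bits give 4 GiB, 35 bits give the 32 GB of the Win2k Datacenter limit, 36 bits (PAE and PSE-36 on the old CPUs) give 64 GiB, and 52 bits give the 2^52 bytes from the quote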

https://en.wikipedia.org/wiki/Physical_Address_Extension
https://en.wikipedia.org/wiki/PSE-36
https://en.m.wikipedia.org/wiki/Page_Size_Extension
we might be able to, but the OS, CPU and bus/RAM all have to support it

but back to the cpuid command

it has information about which commands can be used or which "technology" is available on this CPU

this includes whether it can execute the cmpxchg8b command

in EDX: MMX (flag 23), CX8 (flag 8 = cmpxchg8b), PSE (page size extension, flag 3), PAE (physical address extension, flag 6); in ECX: AVX (flag 28), SSE4.2, SSE4.1, SSE3 and so on
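a minimal C sketch for reading those flags out of a captured register value (CPUID with EAX = 1). The helper names are made up; the bit numbers for SSE3, SSE4.1 and SSE4.2 (0, 19, 20) are taken from the CPUID description, not from the post

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Feature bits in EDX after CPUID with EAX = 1. */
enum { CPUID_EDX_PSE = 3, CPUID_EDX_PAE = 6, CPUID_EDX_CX8 = 8, CPUID_EDX_MMX = 23 };

/* Feature bits in ECX after CPUID with EAX = 1. */
enum { CPUID_ECX_SSE3 = 0, CPUID_ECX_SSE41 = 19, CPUID_ECX_SSE42 = 20, CPUID_ECX_AVX = 28 };

/* True if the given feature bit is set in the register value. */
bool cpuid_has(uint32_t reg, unsigned bit)
{
    assert(bit < 32);
    return ((reg >> bit) & 1u) != 0;
}

/* Return the register value with one feature bit cleared,
   e.g. to report "no cmpxchg8b" on a CPU that lacks it. */
uint32_t cpuid_hide(uint32_t reg, unsigned bit)
{
    assert(bit < 32);
    return reg & ~((uint32_t)1 << bit);
}
```

this only decodes or masks a value; where Windows keeps its copy of these flags is a separate question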

the operating system usually should know if a command is invalid

if it just continues it might use SSE or MMX commands, which should cause a BSOD

so rather be safe and fill them with a CPUID result you actually captured with a cpuid script on an old CPU (a script for cpuid is easy to write and can be found on the web)
maybe from a late 486 CPU (from what we can google, those are said to have the cpuid command)
then you know for sure what those CPUs actually gave back as a result
(the few flags that differ, like the one saying cmpxchg8b is available, you can just clear)

then you fill either the registers or the place where Windows stores that information; then the OS/WinXP can react to that information - if WinXP has a reaction at all for the case that the command was not correctly recognized, failed, etc.

Friday at 07:10 PM

cpuid is not an essential command

however, you should set this command's output to values the OS/WinXP can act on as for a 486 CPU

https://www.felixcloutier.com/x86/cpuid

https://en.m.wikipedia.org/wiki/CPUID

Thursday at 11:22 PM

you don't necessarily have to use a near jmp; a short jump is distance based with a signed byte (-128 to +127)
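whether a short jump still reaches its target can be checked like this in C; the function name is made up, and it assumes the usual 2-byte short jump (opcode + rel8), where the distance counts from the end of the jump

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Compute the rel8 byte of a 2-byte short jump at jump_addr.
   Returns false if the target is out of range, so a near jump is needed. */
bool short_jump_rel8(uint32_t jump_addr, uint32_t target_addr, int8_t *rel8)
{
    assert(rel8 != 0);
    /* The displacement is relative to the address after the jump. */
    int64_t disp = (int64_t)target_addr - ((int64_t)jump_addr + 2);
    if (disp < -128 || disp > 127)
        return false;
    *rel8 = (int8_t)disp;
    return true;
}
```

for the "jnz short loc_4762FC" at 00476309 from the XP listing this gives -15; once extra code is inserted in between, the same check tells you if the short form still fits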

April 5

well, honestly I do not want to study the entire thing behind that

if it's a PCB control (which I don't know - nor think) you have to study the entire function chain for this - the entirety of Windows in relation to this
at least the entire reaction related to that SLIST_HEADER/PKSPIN_LOCK structure is needed
that raises a big question why those 2 structures would actually be that - sounds at least very odd to me
so I want to say I'm out of this for now

I remember Intel removed the lock prefix, as a virus once used it to hide its activity/itself (if I remember correctly
it executes the lock prefix - but it no longer has any effect - that lets normal activity continue)

that description from the MASM archive tells us that lock rep was already removed on a 286 CPU, so a 486 is affected (wanna go back to a 186? (joke))

a different CPU however needs some time to react if an interference should happen; to be honest I don't think it does

and I changed the entire IDT and even made it invalid; not even executing 10 commands caused a problem - if there were a fault in those 10 commands then maybe, but
this is not the case

these mov commands are however in the nanosecond range; I don't think anything can actually interrupt them that fast
a thread/CPU switch takes time
rather 10 milliseconds would be something here (for others: nanoseconds are a lot shorter than milliseconds, 1,000,000 ns = 1 ms)

if the thinking was about some kind of high-level language problem like a "Java atomic"
in Java and other programming languages the atomic-based relations rather come from the programming language itself
and are not CPU based

only assembly actually does such a thing; assembly doesn't work like a high-level language

IRQL, STI/CLI and lock

2 locks, then 2 commands, then locks don't make an "atomic move" either
again, I don't think that is the problem

the REP command without lock would still be done as 1 executed command - this goes as fast as the CPU can handle it
however many cycles exactly that takes on the CPU itself

I think if there is a problem, the problem is not with the emulation, the problem is elsewhere - hard to say without writing a big piece of code to try things out and looking into the WRK
dietmar could look at those 5 functions in the Win2k kernel too; maybe that helps, or maybe not if the structure reaction/s changed
if somebody has proof or the right knowledge - let me know

actually maybe the cmpxchg8b command was not used entirely, only a part of what it does
some changes can actually also be skipped - some are bad, like a BSOD - while others continue without full functionality - while others work correctly
- and while others work but not that well - while others produce some code that just didn't change anything and still functions

finding out for certain what controls SLIST_HEADER and PKSPIN_LOCK would be a next step, to see if the functions did the right things

but also a further fault could be a problem; it would not be uncommon that once 1 problem is solved just the next problem appears - which then has nothing to do with the first problem anymore (just in case, I wanted to say that - for now hopefully not the problem)

let's just say that very likely those 2 structures (if correctly changed by the emulation) will be processed by some following code (why would an atomic move be needed?)

https://www.nirsoft.net/kernel_struct/vista/SLIST_HEADER.html
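a small C sketch of what the listings in this thread seem to do with the second dword of SLIST_HEADER. It assumes the 32-bit layout from the linked page (Next pointer in the first dword, then Depth in the low 16 bits and Sequence in the high 16 bits); the function names are made up for illustration

```c
#include <assert.h>
#include <stdint.h>

/* Second dword of a 32-bit SLIST_HEADER: Depth (low word), Sequence (high word). */
uint16_t slist_depth(uint32_t hi)    { return (uint16_t)(hi & 0xFFFFu); }
uint16_t slist_sequence(uint32_t hi) { return (uint16_t)(hi >> 16); }

/* "mov ecx, edx" + "mov cx, bx" with bx = 0 (flush):
   depth becomes 0, sequence is kept. */
uint32_t slist_flush_word(uint32_t hi)
{
    return hi & 0xFFFF0000u;
}

/* "lea ecx, [edx-1]" (pop in the XP listing): depth - 1. */
uint32_t slist_pop_word_dec(uint32_t hi)
{
    assert(slist_depth(hi) > 0);  /* nothing to pop from an empty list */
    return hi - 1u;
}

/* "add ecx, 0FFFFh" (pop in the NT listing): depth - 1 and sequence + 1. */
uint32_t slist_pop_word_add(uint32_t hi)
{
    assert(slist_depth(hi) > 0);
    return hi + 0xFFFFu;
}
```

if that reading is right, the sequence word is what lets a 64-bit compare notice that the list changed in between, which would explain why both dwords are compared together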

April 5

there might also be the REP command

https://www.felixcloutier.com/x86/rep:repe:repz:repne:repnz

it can have a lock prefix

it is actually used for buffers, not for smaller moves

April 4

interesting, that's neither "atomic" across the 2 moves nor does it clear the interrupt flag
I wrote dietmar in a private message that he might leave it out
also it doesn't have the checks or the loop, and the cmpxchg8b compare is not done correctly

maybe it just fulfills that function's needs

that can be: instead of just replacing the command, the function was written to its real needs

so we didn't have to be so specific, just get the correct function reaction

well done

April 3

roytam gave you the patch code, it is 5 changes

lea ecx, sub_40078C <--- that's the first function that gets replaced if it finds that cpuid number (ExInterlockedCompareExchange64)

lea ecx, loc_4006F0 <-- next (ExInterlockedPopEntrySList)

lea ecx, loc_400704 (ExInterlockedPushEntrySList)

lea ecx, loc_400714 (ExInterlockedFlushSList)

lea ecx, ExInterlockedAddLargeInteger

the last one is different, it replaces sub_402352 with ExInterlockedAddLargeInteger

somebody can check that the functions are at these places

April 3

well, I would think a different offset is possible

but if they are equal it overwrites the value at that offset with ECX and EBX

if you know about logic circuits you might know why

but exactly this is why I wanted to say he doesn't need that command, at least I don't see a reason for it

the function itself seems to compare:

"if (the 64-bit entry at this offset still has the same value as EDX and EAX) -> overwrite the value at that offset with ECX and EBX (then there actually is a changed 64-bit value there)"

functions descriptions:

https://learn.microsoft.com/en-us/windows-hardware/drivers/ddi/wdm/nf-wdm-exinterlockedflushslist

https://learn.microsoft.com/en-us/windows-hardware/drivers/ddi/wdm/nf-wdm-exinterlockedpopentryslist

https://learn.microsoft.com/en-us/windows-hardware/drivers/ddi/wdm/nf-wdm-exinterlockedpushentrylist

"If there were entries on the specified list, ExInterlockedFlushSList returns a pointer to the first SLIST_ENTRY structure that was an entry on the list; otherwise, it returns NULL."

this opens the question why they used this command to return a pointer to the first SLIST_ENTRY structure and to change that SLIST_ENTRY (which is an internal Windows structure)

about that "atomic"

it is supposed to be a "non-interruptible action"

therefore dietmar probably cleared the interrupt flag (so no interrupts - actually there still are some - but that's another story) (the other part is the "lock" prefix)

the next question that comes to mind is changing 64 bits at once (uninterrupted), not stepwise, which is what cmpxchg8b actually does (even though it's 32 bit - it just uses 2 registers)

BUT I have never seen that to be needed for 64 bits, that is far too small to have an interaction; not even the interrupt tables, if they get changed (and those are constantly used), have that problem

https://www.quora.com/What-is-the-meaning-of-atomic-in-programming

we are also talking about a function here, so the function itself might actually be "atomic" if it does its job, because the function has a start and an end for this step

we are not in C++, which might have some "atomic (this)" code; in assembly you have to write the real instructions that physically do it (if that were a problem we might have some more answers by now)

dietmar just said that he wants to replace cmpxchg8b with working alternative code; it might be a bit of a road, but over time we will find it

I highly suspect that list is for threading/multicore

April 3

I heard Chappell died a few months ago, sad story, we could still use him
reading Chappell's writing, it says there once was a solution that doesn't use the command if it's not supported by the CPU
that doesn't necessarily mean that if you just use a different one from a different OS version it just works - maybe, maybe not

why would it have to be that other cmpxchg command

the Linux one is not perfect - depending on what the other routines do, the Linux one might work, but it's certainly not 100 % correct, while mine is

the Linux one looks almost the same as the one I posted above, but it doesn't compare the 64 bits for the false result (maybe the Linux solution doesn't need to,
but again, mine is correct, the Linux one is not)
so why not "just the right one"

doing it another way causes more commands and maybe fixes; there are certainly multiple solutions

https://www.felixcloutier.com/x86/cmpxchg
the description might be wrong this time; the description here says that, unlike cmpxchg8b, it always compares EAX with the memory location
the description actually doesn't say that a register other than EAX can be chosen

well, this time your code might work

but you are rather trying to fix the results; on ZF not set it just goes back, that is OK, but then you have to do this in every function like this

also that makes 2 times lock xchg and 2 times lock cmpxchg

you use cmpxchg for the atomic question?

if you have to change 64 bits at once then it might be atomic for the 64 bits, doing 64 bits in 1 step

just having the lock prefix doesn't turn it into a 64-bit mov

April 1

36 minutes ago, Dietmar said:

.data:004762B2 ; Exported entry   7. ExInterlockedFlushSList
.data:004762B2
.data:004762B2 ; =============== S U B R O U T I N E =======================================
.data:004762B2
.data:004762B2                 public ExInterlockedFlushSList
.data:004762B2 ExInterlockedFlushSList proc near       ; CODE XREF: sub_45F0DF:loc_45F0F7p
.data:004762B2                                         ; DATA XREF: .edata:off_5AC2A8o
.data:004762B2                 push    ebx
.data:004762B3                 push    ebp
.data:004762B4                 pushf
.data:004762B5                 cli
.data:004762B6                 xor     ebx, ebx
.data:004762B8
.data:004762B8 loc_4762B8:                             ; CODE XREF: ExInterlockedFlushSList:loc_4762E1j
.data:004762B8                 mov     ebp, ecx
.data:004762BA                 mov     edx, [ebp+4]
.data:004762BD                 mov     eax, [ebp+0]
.data:004762C0                 or      eax, eax
.data:004762C2                 jz      short loc_4762E3
.data:004762C4                 mov     ecx, edx
.data:004762C6                 mov     cx, bx
.data:004762C9                 LOCK CMPXCHG [EBP], EAX
.data:004762CE                 jnz     short loc_4762DB
.data:004762D0                 LOCK CMPXCHG [EBP+4], EDX
.data:004762D5                 jz      short loc_4762E1
.data:004762D7                 jmp     short loc_4762E1
.data:004762D9 ; ---------------------------------------------------------------------------
.data:004762D9
.data:004762D9 loc_4762DB:                             ; CODE XREF: ExInterlockedFlushSList+1Aj
.data:004762DB                                         ; ExInterlockedFlushSList+1Fj
.data:004762DB                 mov     eax, [ebp+0]
.data:004762DE                 mov     edx, [ebp+4]
.data:004762E1
.data:004762E1 loc_4762E1:                             ; CODE XREF: ExInterlockedFlushSList+27j
.data:004762E1                 jnz     short loc_4762B8
.data:004762E3
.data:004762E3 loc_4762E3:                             ; CODE XREF: ExInterlockedFlushSList+10j
.data:004762E3                 sti
.data:004762E4                 popf
.data:004762E5                 pop     ebp
.data:004762E6                 pop     ebx
.data:004762E7                 nop
.data:004762E8                 nop
.data:004762E9                 nop
.data:004762EA                 nop
.data:004762EB                 nop
.data:004762EC                 nop
.data:004762ED                 nop
.data:004762EE                 nop
.data:004762EF                 retn
.data:004762EF ExInterlockedFlushSList endp
.data:004762EF
.data:004762EF ; ---------------------------------------------------------------------------

LOCK CMPXCHG [EBP], EAX <-- that already makes a change if the first 32 bits matched
but it actually needs the 64-bit compare before making any changes

because if both 32 + 32 bits are not the same it must not do that

that's the first mistake again

that other part makes no decision, it jumps both on ZF = 1 and ZF = 0
.data:004762D5 jz short loc_4762E1
.data:004762D7 jmp short loc_4762E1
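to show the first mistake in isolation, here is a small single-threaded C model (no real LOCK, the names are made up): one variant swaps the two 32-bit halves one after the other, the other compares both halves before it writes anything

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* One 32-bit compare-and-swap step (model only, not atomic). */
bool cas32(uint32_t *dest, uint32_t expected, uint32_t desired)
{
    assert(dest != 0);
    if (*dest != expected)
        return false;
    *dest = desired;
    return true;
}

/* Flawed: the low half is already written before the high half is checked. */
bool split_cas64(uint32_t mem[2], const uint32_t expected[2], const uint32_t desired[2])
{
    if (!cas32(&mem[0], expected[0], desired[0]))
        return false;
    return cas32(&mem[1], expected[1], desired[1]);
}

/* Compare both halves first, write both only if the full 64 bits matched. */
bool checked_cas64(uint32_t mem[2], const uint32_t expected[2], const uint32_t desired[2])
{
    assert(mem != 0 && expected != 0 && desired != 0);
    if (mem[0] != expected[0] || mem[1] != expected[1])
        return false;
    mem[0] = desired[0];
    mem[1] = desired[1];
    return true;
}
```

when only the low half matches, split_cas64 reports failure but has already overwritten the low half; checked_cas64 leaves memory untouched in that case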

March 31

.text:8013CEA0 ExInterlockedPopEntrySList proc near ; CODE XREF: CcScheduleReadAhead+2BBp
.text:8013CEA0 ; sub_80108058+10p ...
.text:8013CEA0 push ebx
.text:8013CEA1 push ebp
pushf
cli
.text:8013CEA2 mov ebp, ecx
.text:8013CEA4
.text:8013CEA4 loc_8013CEA4: <-- this seems to have a jump to it ; DATA XREF: .text:loc_80140E17o
.text:8013CEA4 mov edx, [ebp+4]
.text:8013CEA7 mov eax, [ebp+0]
.text:8013CEAA
.text:8013CEAA loc_8013CEAA: // valid ; CODE XREF: ExInterlockedPopEntrySList+1Cj
.text:8013CEAA or eax, eax
.text:8013CEAC jz short end_of_ExInterlockedPopEntrySList // has to be changed
.text:8013CEAE mov ecx, edx
.text:8013CEB0 add ecx, 0FFFFh
.text:8013CEB6
.text:8013CEB6 loc_8013CEB6: <-- seems to have some jumps to it too ; DATA XREF: sub_80140AF4:loc_80140AFDo
.text:8013CEB6 ; .text:80140D28o
.text:8013CEB6 mov ebx, [eax]

cmp eax, [ebp+0]
jnz loc_fail // something we did
cmp edx, [ebp+4]
jnz loc_fail // again
mov [ebp+0], ebx
mov [ebp+4], ecx
jmp loop_check_ExInterlockedPopEntrySList // the loop check
loc_fail:
mov eax, [ebp+0]
mov edx, [ebp+4]

.text:8013CEB8 loop_check_ExInterlockedPopEntrySList:
.text:8013CEBC jnz short loc_8013CEAA // valid but need fix to that or eax,eax loop
.text:8013CEBE
.text:8013CEBE loc_8013CEBE/end_of_ExInterlockedPopEntrySList: ; CODE XREF: ExInterlockedPopEntrySList+Cj
sti
popf
.text:8013CEBE pop ebp
.text:8013CEBF pop ebx
.text:8013CEC0 retn
.text:8013CEC0 ExInterlockedPopEntrySList endp
-------------------------------------------------------
well, data refs don't make sense at these spots, a display bug?

at 8013CEA4 it says there are 1 or more references to it and that 80140E17 is 1 of them - since the pushf and cli
shifted the offset by 2 bytes, that jump lands in the wrong place if its location is not fixed

it would be common to see some jumps into different functions and the opposite

location 8013CEB6 seems to be jumped to from
_80140AF4:loc_80140AFDo .text:80140D28o (you should look at least at those 3 spots for this jump)
looks like the IDA disassembler to me; at best you search for the addresses they get jumped from

the jz at 8013CEAC has to be fixed to jump to the end, that is sti / end_of_ExInterlockedPopEntrySList
.text:8013CEAC jz end_of_ExInterlockedPopEntrySList

.text:8013CEAA loc_8013CEAA: that one is valid, but since we have more code the jump that goes there is a bit longer
but that one is shown in the visible code at 8013CEBC jnz short loc_8013CEAA

.text:8013CEAC is valid but also needs an adjustment to reach (loc_8013CEBE/end_of_ExInterlockedPopEntrySList)

if you want you can try to remove sti, cli, popf, pushf (but it has to be all 4)

---------------------
you could actually also use a different method
cmpxchg8b has 4 bytes of opcode, jnz short has 2, so 6 bytes
you need 5, that makes a jmp to your location + 1 nop

cmpxchg8b qword ptr [ebp+0]
jnz short loc_8013CEAA

those two you replace with a jmp to your memory location + nop
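the 6 replacement bytes can be built like this in C; the function name and the target address in the note below are made up, only the encoding (E9 + 32-bit relative distance for a near jmp, 90 for nop) is the standard one

```c
#include <assert.h>
#include <stdint.h>

/* Build "jmp rel32" + "nop" (6 bytes) to overwrite cmpxchg8b + jnz short.
   patch_addr:  address of the first overwritten byte
   target_addr: address of the replacement code (hypothetical location) */
void make_jmp_patch(uint32_t patch_addr, uint32_t target_addr, uint8_t out[6])
{
    assert(out != 0);
    /* rel32 counts from the end of the 5-byte jmp; unsigned math wraps mod 2^32. */
    uint32_t rel32 = target_addr - (patch_addr + 5u);
    out[0] = 0xE9;                              /* jmp rel32 */
    out[1] = (uint8_t)(rel32 & 0xFFu);          /* little endian, low byte first */
    out[2] = (uint8_t)((rel32 >> 8) & 0xFFu);
    out[3] = (uint8_t)((rel32 >> 16) & 0xFFu);
    out[4] = (uint8_t)((rel32 >> 24) & 0xFFu);
    out[5] = 0x90;                              /* nop fills the 6th byte */
}
```

for a patch at 8013CEB8 and a made-up target at 80140000 the bytes come out as E9 43 31 00 00 90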

your memory location then does

cmp eax, [ebp+0]
jnz loc_fail2 // something we did
cmp edx, [ebp+4]
jnz loc_fail2 // again
mov [ebp+0], ebx
mov [ebp+4], ecx
jmp the_check // the loop check
loc_fail2:
mov eax, [ebp+0]
mov edx, [ebp+4]
the_check:
jnz short loc_8013CEAA // this one conditional jmp to that loop (or eax, eax)
// now you just have to jump back
jmp to (.text:8013CEBE loc_8013CEBE/end_of_ExInterlockedPopEntrySList) / just backwards
as if the command had happened

I don't know if that NT version can be used for XP, they might have used a different behavior

March 31

at the moment

there are 3 encoders (for the image format called .avif)

AOM

SVT-AV1

rav1e

one called the HEIC/h.265/x265

by hardware (for example nvidias NVDEC)

all of those say that they are new hevc (h.265 codecs)

is there a proof ?

hardware might be faster at the moment but not better (see 1:12 or 9:29, the mountain in the background)

software is a lot clearer, SVT at both P3 and P6 is better:

https://youtu.be/5rgteZRNb-A?t=72

another candidate for pictures is JXL

https://www.youtube.com/watch?v=w7UDJUCMTng

but we have to consider what settings were used, that actually makes a big difference

while others even talk about an H.266 codec

https://de.wikipedia.org/wiki/Versatile_Video_Coding

there should be a real comparison, always going for the best settings the encoder offers, both for pictures and motion video (also looking at B-frames, because often some pictures that are secondary are just compressed more strongly - in this case the first picture might look good, but the second rather looks blurred)

March 31

public ExInterlockedPopEntrySList
.data:004762F2 ExInterlockedPopEntrySList proc near ; CODE XREF: sub_40E06D+1DAp
.data:004762F2 ; sub_41159B+8Ap ...
.data:004762F2 push ebx ; ExInterlockedPopEntrySList
.data:004762F3 push ebp
pushf
cli
.data:004762F4 mov ebp, ecx
loc_jumper_unknown:
.data:004762F6 mov edx, [ebp+4]
.data:004762F9 mov eax, [ebp+0]

.data:004762FC
.data:004762FC loc_4762FC: ; CODE XREF: ExInterlockedPopEntrySList+17j
.data:004762FC or eax, eax
.data:004762FE jz short loc_end
.data:00476300 lea ecx, [edx-1]
loc_jumper_unknown2:
.data:00476303 mov ebx, [eax]
.data:00476305
loc_jumper_unknown3:
cmp eax, [ebp+0]
jnz short loc_4762E5
cmp edx, [ebp+4]
jnz short loc_4762E5

.data:004762ED mov [ebp+0], ebx
.data:004762F0 mov [ebp+4], ecx
.data:004762F3 jmp loop_check
.data:004762E5 mov eax, [ebp+0]
.data:004762E8 mov edx, [ebp+4]

loop_check:
.data:00476309 jnz short loc_4762FC
.data:0047630B
.data:0047630B loc_47630B: / loc_end: ; CODE XREF: ExInterlockedPopEntrySList+Cj
sti
popf
.data:0047630B pop ebp
.data:0047630C pop ebx
.data:0047630D
.data:0047630E
.data:0047630F retn
.data:0047630F ExInterlockedPopEntrySList endp

for the C++ code you just have to look at the translation the C++ compiler produced; if equal, good

for this other function there is a jmp to "mov ebx, [eax]"
from 40a7470 (this means if the code is changed, that jump has to be adjusted to land there (if not, BSOD from another part of this code))
(since you added assembly commands at the start (we could get rid of pushf, cli, sti and popf) to keep that location in place)
(that extra jump is missing in your 3 posts of code too)
it has 3 jumps from other parts that have to be fixed

0040B0DE jz short loc_40B0EB (has to be adjusted) it has more code below now

for the others I have written down the locations, I think you can solve this

tell me if this works; that command doesn't work in my VM so I actually can't see how it reacts
if I could, that would make it a lot easier

March 30

happy to see you had a good result, is it working now?

does this one work? the jumps have to be fixed to the right locations

.data:004762B2 public ExInterlockedFlushSList
.data:004762B2 ExInterlockedFlushSList proc near ; CODE XREF: sub_45F0DF:loc_45F0F7p
.data:004762B2 push ebx
.data:004762B3 push ebp
pushf
cli
.data:004762B4 xor ebx, ebx
.data:004762B6 mov ebp, ecx
.data:004762B8 mov edx, [ebp+4]
.data:004762BB mov eax, [ebp+0]

loc_1:
.data:004762BE or eax, eax
.data:004762C0 jz short loc_end (004762F7)
.data:004762C2 mov ecx, edx
.data:004762C4 mov cx, bx
.data:004762C7
.data:004762C8
.data:004762C8
.data:004762C8
.data:004762C9
.data:004762CE
.data:004762D0
.data:004762D1
.data:004762D1
.data:004762D1
.data:004762D7
.data:004762D9
.data:004762DB ; ---------------------------------------------------------------------------
.data:004762DB ; emulation of CMPXCHG8B
.data:004762DB
.data:004762DB cmp eax, [ebp+0]
.data:004762DE jnz short loc_4762E5
.data:004762E0 cmp edx, [ebp+4]
.data:004762E3 jnz short loc_4762E5
.data:004762E5
.data:004762E5
.data:004762ED
.data:004762ED mov [ebp+0], ebx
.data:004762F0 mov [ebp+4], ecx
.data:004762F3 jmp loop_check (004762F3)

.data:004762E5 mov eax, [ebp+0]
.data:004762E8 mov edx, [ebp+4]
.data:004762EB
.data:004762ED
.data:004762ED
.data:004762ED ; end emulation of CMPXCHG8B
.data:004762F0 ; ---------------------------------------------------------------------------

loop_check:
.data:004762F3 jnz short loc_1 (004762BE)

loc_end:
.data:004762F7 sti
.data:004762F8 popf
.data:004762F9 pop ebp
.data:004762FA pop ebx
.data:004762FB retn

.data:004762FF ExInterlockedFlushSList endp

March 29

you are making the same mistake

that command sets flags, and reacts depending on whether the compare was equal or not

there are 2 problems I can certainly tell

in the first step

cmpxchg can have 2 results (if equal it does the mov to memory, if not it does the mov to a register)

(and it should not do that yet, because it has to compare 64 bits)

if you compare only 32 bits it already reacts to those 32 bits (the other 32 bits are ignored)

then the following happens: the flags are lost, and the reaction - for equal 32 bits - has already happened or not

then you do the code again

but here sits the same problem

now the flags get changed a second time (and they should not)

the compare, depending on whether it is equal, reacts to the next 32 bits (while ignoring the first 32 bits)

if that compare was equal it sets the values and if not it sets no values (but you need the 64 bits)

the flag register (ZF) is then read as if the first 32 bits were not there

in other words the results are scrambled

the solution doesn't look that hard to me

you need 2 compares to see if the 64 bits you want to compare are equal
before you do the 64-bit reaction
if those 2 compares were equal you set the values at the memory location; in the other case you need an extra reaction:
in the other case the reaction loads them into EDX and EAX
(the flag should still be valid, unless you start to use a command that affects flags)

compare EDX and EAX with the destination operand

if equal, store ECX and EBX to the destination operand (the destination operand is an 8-byte memory location)

// CMPXCHG8B should be removed and followed by this code:

// CMPXCHG8B - 32 bit emulator

cmp dword ptr [ebp],eax // eax suppose to be the low part
jne skip_and_load_edx_eax
cmp dword ptr [ebp+4], edx // edx suppose to be the high part
jne skip_and_load_edx_eax
// 64 bits where equal, change with ECX and EBX
mov dword ptr [ebp], ebx // suppose to have the low part
mov dword ptr [ebp+4], ecx // suppose to have the high part
jmp end_of_CMPXCHG8B

// they where not equal do as the command is described and load those to EDX and EAX
skip_and_load_edx_eax:
mov eax, dword ptr [ebp] // suppose to be the low part
mov edx, dword ptr [ebp+4] // suppose to be the high part

end_of_CMPXCHG8B:
// CMPXCHG8B - 32 bit emulator end

// normal code continue

this emulates the 1 line of CMPXCHG8B, and it should also leave the correct flag
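to have something to compare the assembly against, here is a plain C model of what the CMPXCHG8B description says, with the registers as struct fields. It is only a single-threaded model for checking the logic (no LOCK, no interrupts), and the type and function names are made up

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Register file of the model; zf stands for the zero flag. */
typedef struct {
    uint32_t eax, ebx, ecx, edx;
    bool zf;
} regs;

/* Model of "cmpxchg8b qword ptr [dest]":
   if EDX:EAX equals the 64-bit value in memory, store ECX:EBX there and set ZF,
   otherwise load the memory value into EDX:EAX and clear ZF. */
void cmpxchg8b_model(uint64_t *dest, regs *r)
{
    assert(dest != 0 && r != 0);
    uint64_t expected = ((uint64_t)r->edx << 32) | r->eax;
    if (*dest == expected) {
        *dest = ((uint64_t)r->ecx << 32) | r->ebx;
        r->zf = true;
    } else {
        r->eax = (uint32_t)(*dest & 0xFFFFFFFFu);
        r->edx = (uint32_t)(*dest >> 32);
        r->zf = false;
    }
}
```

the two-cmp emulation above should give the same memory, register and ZF result as this model for both the equal and the not-equal case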

jumps might need an adjustment to their usual locations

// notice: I could not test whether the order is right (like lower and higher parts)
the description might have said something about the upper and lower part, but as I remember you can never
be exactly certain about this (if the bytes in memory are 11 22 33 44, the 44 is the byte
that holds the high values (very old architectures store that differently too - but in this case we don't have that problem, even on a 486))

if that doesn't work I certainly can fix this; I need a test to be certain of the command's reaction

after that I can see its exact behavior

the command description however says EDX and ECX contain the high part
https://www.felixcloutier.com/x86/cmpxchg8b:cmpxchg16b
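the order question can be settled without a test machine: x86 is little endian, so the low dword of a 64-bit value sits at the lower address. A small C sketch that writes the bytes out explicitly (so it does not depend on the host CPU); the helper names are made up

```c
#include <assert.h>
#include <stdint.h>

/* Store a 64-bit value the way x86 does: lowest byte at the lowest address. */
void store_le64(uint8_t mem[8], uint64_t value)
{
    assert(mem != 0);
    for (int i = 0; i < 8; i++)
        mem[i] = (uint8_t)((value >> (8 * i)) & 0xFFu);
}

/* Read a 32-bit little-endian value from 4 bytes. */
uint32_t load_le32(const uint8_t *p)
{
    assert(p != 0);
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}
```

for the value 1122334455667788h the dword at offset 0 (like [ebp]) is 55667788h, the low part that goes with EAX/EBX, and the dword at offset 4 (like [ebp+4]) is 11223344h, the high part that goes with EDX/ECX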

if the order is different, then just swap the spots, ebp to ebp+4 and ebp+4 to ebp (or swap the registers assigned to those ebp locations):

// CMPXCHG8B - 32 bit emulator

cmp dword ptr [ebp],edx // if different edx suppose to be the low part
jne skip_and_load_edx_eax
cmp dword ptr [ebp+4], eax // if different eax suppose to be the high part
jne skip_and_load_edx_eax
// 64 bits where equal, change with ECX and EBX
mov dword ptr [ebp], ecx // if different ecx has the low part
mov dword ptr [ebp+4], ebx // if different ebx has the high part
jmp end_of_CMPXCHG8B

// they where not equal do as the command is described and load those to EDX and EAX
skip_and_load_edx_eax:
mov edx, dword ptr [ebp] // suppose to be the low part
mov eax, dword ptr [ebp+4] // suppose to be the high part // your 55667788 example say so

end_of_CMPXCHG8B:
// CMPXCHG8B - 32 bit emulator end

March 28

well you certainly can translate this command to a 32 bit variant code

you already have used the "cmpxchg" assembly command
but it actually should do the wrong job sometimes
because that compares up only 32 bits (and then already react to the 32 bits) (if that compare was the same or not already changed the result
because it can already react to either the first 32 bits or the next 32 bits)
(.data:004762D5 jnz short near ptr loc_4762BE+1 - that done again erased the first 32 compare results and only react to the next 32 bits compare)

but you need the result for 64 bits compare!

it seems to me that you can also solve this problem by :

making 2 compares "cmp" commands for the flags/reaction

now it is about not to make the same mistake (if you do just the 32 bit compare again it reads the next 32 bits and ignored the first 32 bits
from the first compare)
you need a reaction to the first compare (if that was the case)
and making the "cmp" command again and react a second time

if both compares was correct you make the reaction just as described (else the other described reaction) :

https://www.felixcloutier.com/x86/cmpxchg8b:cmpxchg16b

the command description, as i read it, does not really describe an exchange of the values
it says that if the 64 bit compare was equal, it stores ECX and EBX to the memory location
and in the other case it loads the memory location into EDX and EAX
(which doesn't look like an exchange to me) - maybe the description is lacking (what i usually do then is try it out and take a look)
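
to make the description from the link concrete, here is a small C sketch of how i read it. the struct `regs32` and the function `emu_cmpxchg8b` are made-up names for this example only. the register roles are taken from the instruction description: EDX:EAX is the compare value (EDX the high half), ECX:EBX is the new value (ECX the high half). it is not atomic, so it only shows the compare/store/load logic and not what LOCK adds

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: the four registers the instruction touches. */
typedef struct {
    uint32_t eax, ebx, ecx, edx;
} regs32;

/*
 * Sketch of the CMPXCHG8B logic on a 64 bit value stored as two dwords
 * (mem[0] = low half, mem[1] = high half).
 * Returns the resulting zero flag: true if the 64 bit compare was equal.
 * Assumption: single-threaded use, no atomicity is provided here.
 */
static bool emu_cmpxchg8b(uint32_t mem[2], regs32 *r)
{
    uint32_t lo = mem[0];
    uint32_t hi = mem[1];

    /* both halves must match before anything is written */
    if (lo == r->eax && hi == r->edx) {
        mem[0] = r->ebx;   /* new low half  */
        mem[1] = r->ecx;   /* new high half */
        return true;       /* ZF = 1 */
    }

    /* not equal: memory is left alone, the old value goes to EDX:EAX */
    r->eax = lo;
    r->edx = hi;
    return false;          /* ZF = 0 */
}
```

the important part is that the single `if` looks at both halves before the first store happens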

// if it were an exchange it would be:

(reading the code again later, i don't see a common exchange
a common exchange would be eax and edx swapping their values - eax getting edx's value and edx getting eax's):
4 assembly "mov" commands (2 for the destination and 2 for the source)

or:
2 times the "xchg" command
//

but! looking at your assembly code, it seems different to me
i don't see an exchange (let me just say i'm not entirely certain here, but it might help to talk about it):

the cmpxchg8b command seems to compare the registers EDX and EAX against a memory location for equality
the memory location is given as an offset from the stack register "EBP" (qword ptr [ebp+0]) (qword usually describes a 64 bit access (word * 4 (16 bits * 4)))
if the compare was equal it stores ECX and EBX to that location (otherwise it loads the value from there into EDX and EAX)

the next command is "jnz", which still sees the flags from this compare: if the values were not equal it jumps back to "Efls10" (which looks like a loop to me)
if they were equal it continues to the end of this function

looking at your code again, "lock cmpxchg [ebp+4], eax" has no reaction after it, but it needs one (as said before, it needs a reaction to both of the 32 bit halves)
if that compare did not match it needs to end there (not always just continue)
done your way the first 32 bits can have a false result - and if the next 32 bits are right - it still does the job - while it should not
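
the problem case can be shown in a few lines of C. this is only a sketch with made-up names (`cas32`, `cas64_unchecked`, `cas64_checked`), it is not the code from the thread and it is not atomic. `cas32` stands in for one 32 bit "cmpxchg"; the unchecked variant throws away the result of the first one, the checked variant reacts to both halves before writing

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for one 32 bit cmpxchg: store 'desired' only if *dst == *expected. */
static bool cas32(uint32_t *dst, uint32_t *expected, uint32_t desired)
{
    if (*dst == *expected) {
        *dst = desired;
        return true;
    }
    *expected = *dst;   /* report the value that was actually found */
    return false;
}

/* Problem case: the result of the first compare is ignored. */
static bool cas64_unchecked(uint32_t mem[2],
                            uint32_t exp_lo, uint32_t exp_hi,
                            uint32_t new_lo, uint32_t new_hi)
{
    (void)cas32(&mem[0], &exp_lo, new_lo);   /* result thrown away       */
    return cas32(&mem[1], &exp_hi, new_hi);  /* only this compare decides */
}

/* Both halves are compared first, and only then both are written. */
static bool cas64_checked(uint32_t mem[2],
                          uint32_t exp_lo, uint32_t exp_hi,
                          uint32_t new_lo, uint32_t new_hi)
{
    if (mem[0] != exp_lo) return false;  /* reaction to the first compare  */
    if (mem[1] != exp_hi) return false;  /* reaction to the second compare */
    mem[0] = new_lo;
    mem[1] = new_hi;
    return true;
}
```

with memory {1, 2}, an expected low half that is wrong and an expected high half that is right, the unchecked variant still reports success and leaves a half-written value behind, while the checked variant refuses and leaves the memory alone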

---------------------
if the 64 bit guys appear: 64 bit is not necessarily needed

if you have to use more than 32 bits there are several methods to solve this (to name a few)

1:
one is using 2 registers and simply building the behaviour for that yourself

there is even a 32 bit assembly command made for that ( CDQ - Convert Doubleword to Quadword, it sign-extends EAX into the register pair EDX:EAX )
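
as a rough C sketch of method 1: a 64 bit value is kept in two 32 bit halves, the way a register pair like EDX:EAX would hold it. `pair64`, `sign_extend32` and `add64` are names invented for this example. `sign_extend32` mirrors what CDQ does, and `add64` mirrors an ADD on the low halves followed by an ADC (add with carry) on the high halves

```c
#include <stdint.h>

/* Hypothetical register pair: hi:lo together form one 64 bit value. */
typedef struct {
    uint32_t lo;
    uint32_t hi;
} pair64;

/* Like CDQ: the high half becomes all ones for a negative value, else zero. */
static pair64 sign_extend32(int32_t value)
{
    pair64 p;
    p.lo = (uint32_t)value;
    p.hi = (value < 0) ? 0xFFFFFFFFu : 0u;
    return p;
}

/* Like ADD on the low halves followed by ADC on the high halves. */
static pair64 add64(pair64 a, pair64 b)
{
    pair64 r;
    uint32_t carry;

    r.lo = a.lo + b.lo;                  /* wraps around on overflow  */
    carry = (r.lo < a.lo) ? 1u : 0u;     /* wrap-around means a carry */
    r.hi = a.hi + b.hi + carry;
    return r;
}
```

adding 1 to the pair {lo = 0xFFFFFFFF, hi = 0} gives {lo = 0, hi = 1}, so the value crosses the 32 bit border without any 64 bit register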

2:
an offset to a memory location that is bigger than 32 bits, which you then handle as 64 bits

3 (even more is possible with an offset to a location):
if you have more than 64 bits of flags you just need an offset to a location where you actually keep the flags or data
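
method 3 could look roughly like this in C: the flags live in a block of memory and the code only holds a pointer (an offset) to it. the size of 256 flags and the names `flag_block`, `flag_set`, `flag_clear` and `flag_test` are just picked for the example

```c
#include <stdbool.h>
#include <stdint.h>

#define FLAG_COUNT 256u   /* example size, far more than one register holds */

/* The flags are stored in memory as 32 bit words. */
typedef struct {
    uint32_t words[FLAG_COUNT / 32u];
} flag_block;

/* index / 32 selects the word, index % 32 selects the bit inside it. */
static void flag_set(flag_block *f, unsigned index)
{
    f->words[index / 32u] |= (1u << (index % 32u));
}

static void flag_clear(flag_block *f, unsigned index)
{
    f->words[index / 32u] &= ~(1u << (index % 32u));
}

static bool flag_test(const flag_block *f, unsigned index)
{
    return (f->words[index / 32u] & (1u << (index % 32u))) != 0u;
}
```

every access only ever touches one 32 bit word, so a 32 bit CPU handles it without trouble (the caller has to keep the index below `FLAG_COUNT`)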

4:
for moving data (file movements) there is the REP prefix

the CPU gets told that it has to move a certain amount of data, and it translates that movement into something it can actually process

the FSB (quad pumped) to the RAM does such a thing

unlike the 64 bit guys might think, you don't need a 64 bit offset for this
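
the idea behind a REP move can be sketched in C as a counted loop: a count register says how many units are left, and each step moves one 32 bit unit and advances both pointers. `copy_dwords` is a name made up for this example, and it only models the behaviour of something like "rep movsd", not how the CPU does it internally

```c
#include <stdint.h>

/*
 * Counted copy of 32 bit units, modelled on the behaviour of "rep movsd":
 * 'count' plays the role of ECX, 'src' of ESI and 'dst' of EDI.
 */
static void copy_dwords(uint32_t *dst, const uint32_t *src, uint32_t count)
{
    while (count != 0u) {
        *dst++ = *src++;   /* move one dword and advance both pointers */
        count--;           /* one unit less to go */
    }
}
```

the amount of data is given by the counter, the single step always stays 32 bits wide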

another example would be the cache: HDDs use a cache to buffer the data

that data can then be processed differently - with 2 bits (wires), 4 bits, 16, 32, 64 or even more (it rather comes down to what the physical cable/wire can do)

March 20

my "DOS" knowledge has gotten a little old, but i remember that "going to DOS mode" from windows often resulted in a DOS that was not working well

i also remember you had to press the menu key (i think it was F8 before windows starts) and select the DOS boot (instead of windows)

then you had a native DOS, where most DOS apps actually worked

in the windows-to-DOS mode not all DOS apps worked

so does this one fix that problem?

March 16

Dr. Stone

March 11

right, youtube, unlike in the past, looks a lot like a commercial tv channel

the idea however was different: it was a platform for users, you tube - (you)tube

it was a challenger against exactly this

there were no ads, there were no odd filters

i remember the makers had such a discussion about this in the past and refused a "connection" to the common others

but slowly it went more and more in that direction

the same is happening to google search at the moment, it finds nothing anymore

in the past youtube also had challengers like veoh and other video platforms

it looks hard to me to find a replacement for youtube at the moment - but if youtube continues that way, going to another platform might be the only option

one challenger the commercial tv won't get rid of is the smartphone, it has basically replaced the common tv

i have already moved away from google search, for search we do have alternative platforms

