I realise I'm very late and you've already solved it, but someone on VOGONS found this nugget of info:
https://www.geoffchappell.com/studies/windows/km/cpu/cx8.htm
Specifically:
"In the early days of Windows NT, however, not all the extant processors implemented the cmpxchg8b instruction. In versions before 5.1, every function that uses the instruction has an alternate coding for processors that do not support the instruction. Very early during its initialisation, the kernel checks whether the boot processor supports the cmpxchg8b instruction. If the support is missing, the kernel patches jmp instructions at the start of each of those functions to redirect execution to their alternates. Conversely, if the boot processor does support the instruction, and the functions are left unpatched, then the kernel requires that all processors support the instruction, under pain of the bug check"
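In case it helps anyone reading later, here is a rough sketch in C of the logic as I read that quote. It is not the kernel's actual code. All the function names are made up for illustration, a function pointer stands in for the jmp patching, and the two "implementations" are plain C placeholders rather than real atomic operations. The one hard fact it relies on is that CPUID leaf 1 reports cmpxchg8b support (CX8) in bit 8 of EDX.

```c
#include <stdbool.h>
#include <stdint.h>

/* CPUID leaf 1 returns feature flags in EDX; bit 8 is CX8 (cmpxchg8b). */
#define CPUID_EDX_CX8 (1u << 8)

/* Hypothetical signature for a 64-bit compare-and-swap routine. */
typedef bool (*cas64_fn)(uint64_t *target, uint64_t expected, uint64_t desired);

/* True if the given CPUID.1:EDX value advertises cmpxchg8b. */
bool edx_has_cx8(uint32_t edx)
{
    return (edx & CPUID_EDX_CX8) != 0;
}

/* Placeholder for the normal coding. A real kernel would execute
 * cmpxchg8b here; this plain C body is NOT atomic and only keeps the
 * sketch portable. */
bool cas64_with_cx8(uint64_t *target, uint64_t expected, uint64_t desired)
{
    if (*target != expected)
        return false;
    *target = desired;
    return true;
}

/* Placeholder for the alternate coding used when cmpxchg8b is missing.
 * How the real alternates achieve the same effect is not covered by
 * the quote, so this is likewise just a non-atomic stand-in. */
bool cas64_alternate(uint64_t *target, uint64_t expected, uint64_t desired)
{
    if (*target != expected)
        return false;
    *target = desired;
    return true;
}

/* Early-init decision based on the boot processor only. Choosing a
 * function pointer here models patching a jmp to the alternate. */
cas64_fn select_cas64(uint32_t boot_edx)
{
    return edx_has_cx8(boot_edx) ? cas64_with_cx8 : cas64_alternate;
}

/* Check applied to each additional processor. If the boot processor
 * had CX8 (functions left unpatched), every other processor must have
 * it too; a false result models the bug check. Accepting any processor
 * in the patched case is an assumption of this sketch, since the quote
 * only states the requirement for the unpatched case. */
bool secondary_cpu_ok(uint32_t boot_edx, uint32_t cpu_edx)
{
    if (!edx_has_cx8(boot_edx))
        return true;
    return edx_has_cx8(cpu_edx);
}
```

So a boot CPU reporting EDX with bit 8 clear gets the alternate routine, and a boot CPU with bit 8 set gets the normal one, after which any other CPU with bit 8 clear fails `secondary_cpu_ok`, which is the situation the bug check in the quote is for.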