ESXi 7.0 and AVX

From OISecWiki

When upgrading a server from Debian 11 to Debian 12, we found that MongoDB would no longer start. It reported that the CPU had no AVX support:

WARNING: MongoDB 5.0+ requires a CPU with AVX support, and your current system does not appear to have that!

  see https://jira.mongodb.org/browse/SERVER-54407

  see also https://www.mongodb.com/community/forums/t/mongodb-5-0-cpu-intel-g4650-compatibility/116610/2

  see also https://github.com/docker-library/mongo/issues/485#issuecomment-891991814

After some troubleshooting we found that the avx CPU flag was no longer present on Linux VMs running kernel 6.1 or newer.

We then booted the original Debian 11 kernel, 5.10.0-23, and the avx CPU flag was present again.
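A quick way to compare the two kernels is to read the flags line from /proc/cpuinfo after each boot. The following is a minimal sketch: the helper names cpu_flags and has_avx and the sample text are made up for illustration, and it assumes the usual "flags : ..." line that x86 Linux kernels print.

```python
def cpu_flags(cpuinfo_text):
    """Return the CPU flags of the first processor listed in /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def has_avx(path="/proc/cpuinfo"):
    """Check the running system; only meaningful on Linux."""
    with open(path) as handle:
        return "avx" in cpu_flags(handle.read())


# Hypothetical, shortened sample instead of a real /proc/cpuinfo.
SAMPLE = "processor\t: 0\nflags\t\t: fpu sse sse2 xsave avx avx2\n"
print("avx" in cpu_flags(SAMPLE))
```

Splitting the line into a set avoids a false match on avx2 when avx itself is missing. On the affected VM, has_avx() should return a different result under each of the two kernels.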

Comparing the boot messages of both kernels on this machine shows that 5.10.0 recognizes the avx flags, but 6.1 does not.

Kernel 5.10.0:

+ ------------[ cut here ]------------
+ XSAVE consistency problem, dumping leaves
+ WARNING: CPU: 0 PID: 0 at arch/x86/kernel/fpu/xstate.c:692 fpu__init_system_xstate+0x4b2/0x9e1
+ Modules linked in:
+ CPU: 0 PID: 0 Comm: swapper Not tainted 5.10.0-23-amd64 #1 Debian 5.10.179-1
+ RIP: 0010:fpu__init_system_xstate+0x4b2/0x9e1
+ Code: 0f 85 83 fd ff ff 3b 1d 7e ad 26 00 74 2c 80 3d 55 cd be ff 00 75 15 48 c7 c7 c8 d6 2b ab c6 05 45 cd be ff 01 e8 25 e8 c6 fe <0f> 0b 83 3d d7 23 9d ff 00 74 05 e8 88 9d c6 fe 48 8b 35 c9 67 c1
+ RSP: 0000:ffffffffab803e40 EFLAGS: 00010086 ORIG_RAX: 0000000000000000
+ RAX: 0000000000000000 RBX: 0000000000000988 RCX: ffffffffab8b3768
+ RDX: c0000000ffffefff RSI: 00000000ffffefff RDI: 0000000000000047
+ RBP: 0000000000000a88 R08: 0000000000000000 R09: ffffffffab803c60
+ R10: ffffffffab803c58 R11: ffffffffab8cb7a8 R12: 0000000000000010
+ R13: 0000000000000008 R14: 0000000000000001 R15: 0000000000000000
+ FS:  0000000000000000(0000) GS:ffffffffabe0e000(0000) knlGS:0000000000000000
+ CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
+ CR2: ffff88800008b000 CR3: 0000000066caa000 CR4: 00000000000406a0
+ Call Trace:
+  ? fpu__init_system+0x11d/0x15e
+  ? early_cpu_init+0x4cc/0x4f4
+  ? setup_arch+0xb7/0xc6e
+  ? slab_is_available+0x5/0x20
+  ? start_kernel+0x64/0x599
+  ? load_ucode_bsp+0x34/0xd1
+  ? secondary_startup_64_no_verify+0xb0/0xbb
+ ---[ end trace cbf22677bc764666 ]---
+ CPUID[0d, 00]: eax=000002e7 ebx=00000a88 ecx=00000a88 edx=00000000
+ CPUID[0d, 01]: eax=0000000f ebx=00000a88 ecx=00000000 edx=00000000
+ CPUID[0d, 02]: eax=00000100 ebx=00000240 ecx=00000000 edx=00000000
+ CPUID[0d, 03]: eax=00000000 ebx=00000000 ecx=00000000 edx=00000000
+ CPUID[0d, 04]: eax=00000000 ebx=00000000 ecx=00000000 edx=00000000
+ CPUID[0d, 05]: eax=00000040 ebx=00000440 ecx=00000000 edx=00000000
+ CPUID[0d, 06]: eax=00000200 ebx=00000480 ecx=00000000 edx=00000000
+ CPUID[0d, 07]: eax=00000400 ebx=00000680 ecx=00000000 edx=00000000
+ CPUID[0d, 08]: eax=00000000 ebx=00000000 ecx=00000000 edx=00000000
+ CPUID[0d, 09]: eax=00000008 ebx=00000a80 ecx=00000000 edx=00000000
+ CPUID[0d, 0a]: eax=00000000 ebx=00000000 ecx=00000000 edx=00000000
+ CPUID[0d, 0b]: eax=00000000 ebx=00000000 ecx=00000000 edx=00000000
+ CPUID[0d, 0c]: eax=00000000 ebx=00000000 ecx=00000000 edx=00000000
+ CPUID[0d, 0d]: eax=00000000 ebx=00000000 ecx=00000000 edx=00000000
+ CPUID[0d, 0e]: eax=00000000 ebx=00000000 ecx=00000000 edx=00000000
+ CPUID[0d, 0f]: eax=00000000 ebx=00000000 ecx=00000000 edx=00000000
+ CPUID[0d, 10]: eax=00000000 ebx=00000000 ecx=00000000 edx=00000000
+ CPUID[0d, 11]: eax=00000000 ebx=00000000 ecx=00000000 edx=00000000
+ CPUID[0d, 12]: eax=00000000 ebx=00000000 ecx=00000000 edx=00000000
+ CPUID[0d, 13]: eax=00000000 ebx=00000000 ecx=00000000 edx=00000000
+ CPUID[0d, 14]: eax=00000000 ebx=00000000 ecx=00000000 edx=00000000
+ CPUID[0d, 15]: eax=00000000 ebx=00000000 ecx=00000000 edx=00000000
+ CPUID[0d, 16]: eax=00000000 ebx=00000000 ecx=00000000 edx=00000000
+ CPUID[0d, 17]: eax=00000000 ebx=00000000 ecx=00000000 edx=00000000
+ CPUID[0d, 18]: eax=00000000 ebx=00000000 ecx=00000000 edx=00000000
+ CPUID[0d, 19]: eax=00000000 ebx=00000000 ecx=00000000 edx=00000000
+ x86/fpu: Supporting XSAVE feature 0x001: 'x87 floating point registers'
+ x86/fpu: Supporting XSAVE feature 0x002: 'SSE registers'
+ x86/fpu: Supporting XSAVE feature 0x004: 'AVX registers'
+ x86/fpu: Supporting XSAVE feature 0x020: 'AVX-512 opmask'
+ x86/fpu: Supporting XSAVE feature 0x040: 'AVX-512 Hi256'
+ x86/fpu: Supporting XSAVE feature 0x080: 'AVX-512 ZMM_Hi256'
+ x86/fpu: Supporting XSAVE feature 0x200: 'Protection Keys User registers'
+ x86/fpu: xstate_offset[2]:  576, xstate_sizes[2]:  256
+ x86/fpu: xstate_offset[5]:  832, xstate_sizes[5]:   64
+ x86/fpu: xstate_offset[6]:  896, xstate_sizes[6]:  512
+ x86/fpu: xstate_offset[7]: 1408, xstate_sizes[7]: 1024
+ x86/fpu: xstate_offset[9]: 2432, xstate_sizes[9]:    8
+ x86/fpu: Enabled xstate features 0x2e7, context size is 2696 bytes, using 'compacted' format.
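The "Supporting XSAVE feature" lines correspond bit by bit to the mask 0x2e7 that the log reports as enabled (it is also the eax value of CPUID[0d, 00]). As a small illustration, the sketch below decodes that mask; the name table is copied from the log lines above and covers only the bits that appear there, and the function name is made up.

```python
# Bit index -> name, taken from the 5.10.0 boot log above.
XSAVE_FEATURE_NAMES = {
    0: "x87 floating point registers",
    1: "SSE registers",
    2: "AVX registers",
    5: "AVX-512 opmask",
    6: "AVX-512 Hi256",
    7: "AVX-512 ZMM_Hi256",
    9: "Protection Keys User registers",
}


def decode_xsave_mask(mask):
    """Return (bit value, name) for every bit set in an XSAVE feature mask."""
    return [
        (1 << bit, XSAVE_FEATURE_NAMES.get(bit, "not listed in the log"))
        for bit in range(mask.bit_length())
        if (mask >> bit) & 1
    ]


for value, name in decode_xsave_mask(0x2E7):
    print(f"0x{value:03x}: {name}")
```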

This change in behaviour is tied to the CPU's XSAVE feature. Both kernel versions run an XSAVE consistency check at boot, and it fails on both.

Kernel 6.1:

- ------------[ cut here ]------------
- XSAVE consistency problem: size 2440 != kernel_size 2696
- WARNING: CPU: 0 PID: 0 at arch/x86/kernel/fpu/xstate.c:606 fpu__init_system_xstate+0x85a/0xb19
- Modules linked in:
- CPU: 0 PID: 0 Comm: swapper/0 Not tainted 6.1.0-27-amd64 #1  Debian 6.1.115-1
- Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 12/09/2019
- RIP: 0010:fpu__init_system_xstate+0x85a/0xb19
- Code: ca df fb fd 89 c6 41 39 c5 74 26 80 3d 2f 23 b8 ff 00 75 90 44 89 ea 48 c7 c7 90 17 93 a6 c6 05 1c 23 b8 ff 01 e8 f4 c2 01 fe <0f> 0b e9 73 ff ff ff 8b 44 24 10 48 8b 3d 52 12 39 ff 31 f6 44 89
- RSP: 0000:ffffffffa7003e68 EFLAGS: 00010282
- RAX: 0000000000000000 RBX: 0000000000000040 RCX: c0000000ffffefff
- RDX: 0000000000000000 RSI: 00000000ffffefff RDI: 0000000000000001
- RBP: 0000000000000200 R08: 0000000000000000 R09: ffffffffa7003ce0
- R10: 0000000000000003 R11: ffffffffa70d4488 R12: 00000000000002e7
- R13: 0000000000000a88 R14: 0000000000000001 R15: 0000000000000a88
- FS:  0000000000000000(0000) GS:ffff8c4973c00000(0000) knlGS:0000000000000000
- CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
- CR2: ffff8c474e602000 CR3: 000000010cc10001 CR4: 00000000007706f0
- PKRU: 55555554
- Call Trace:
-  <TASK>
-  ? __warn+0x7d/0xc0
-  ? fpu__init_system_xstate+0x85a/0xb19
-  ? report_bug+0xe2/0x150
-  ? handle_bug+0x41/0x70
-  ? exc_invalid_op+0x13/0x60
-  ? asm_exc_invalid_op+0x16/0x20
-  ? fpu__init_system_xstate+0x85a/0xb19
-  ? fpu__init_system_xstate+0x85a/0xb19
-  fpu__init_system+0x15c/0x18b
-  ? static_key_disable+0x16/0x20
-  arch_cpu_finalize_init+0x1e/0x47
-  start_kernel+0x687/0x733
-  secondary_startup_64_no_verify+0xe5/0xeb
-  </TASK>
- ---[ end trace 0000000000000000 ]---
- CPUID[0d, 00]: eax=000002e7 ebx=00000a88 ecx=00000a88 edx=00000000
- CPUID[0d, 01]: eax=0000000f ebx=00000a88 ecx=00000000 edx=00000000
- CPUID[0d, 02]: eax=00000100 ebx=00000240 ecx=00000000 edx=00000000
- CPUID[0d, 03]: eax=00000000 ebx=00000000 ecx=00000000 edx=00000000
- CPUID[0d, 04]: eax=00000000 ebx=00000000 ecx=00000000 edx=00000000
- CPUID[0d, 05]: eax=00000040 ebx=00000440 ecx=00000000 edx=00000000
- CPUID[0d, 06]: eax=00000200 ebx=00000480 ecx=00000000 edx=00000000
- CPUID[0d, 07]: eax=00000400 ebx=00000680 ecx=00000000 edx=00000000
- CPUID[0d, 08]: eax=00000000 ebx=00000000 ecx=00000000 edx=00000000
- CPUID[0d, 09]: eax=00000008 ebx=00000a80 ecx=00000000 edx=00000000
- CPUID[0d, 0a]: eax=00000000 ebx=00000000 ecx=00000000 edx=00000000
- CPUID[0d, 0b]: eax=00000000 ebx=00000000 ecx=00000000 edx=00000000
- CPUID[0d, 0c]: eax=00000000 ebx=00000000 ecx=00000000 edx=00000000
- CPUID[0d, 0d]: eax=00000000 ebx=00000000 ecx=00000000 edx=00000000
- CPUID[0d, 0e]: eax=00000000 ebx=00000000 ecx=00000000 edx=00000000
- CPUID[0d, 0f]: eax=00000000 ebx=00000000 ecx=00000000 edx=00000000
- CPUID[0d, 10]: eax=00000000 ebx=00000000 ecx=00000000 edx=00000000
- CPUID[0d, 11]: eax=00000000 ebx=00000000 ecx=00000000 edx=00000000
- CPUID[0d, 12]: eax=00000000 ebx=00000000 ecx=00000000 edx=00000000
- CPUID[0d, 13]: eax=00000000 ebx=00000000 ecx=00000000 edx=00000000
- CPUID[0d, 14]: eax=00000000 ebx=00000000 ecx=00000000 edx=00000000
- CPUID[0d, 15]: eax=00000000 ebx=00000000 ecx=00000000 edx=00000000
- CPUID[0d, 16]: eax=00000000 ebx=00000000 ecx=00000000 edx=00000000
- CPUID[0d, 17]: eax=00000000 ebx=00000000 ecx=00000000 edx=00000000
- CPUID[0d, 18]: eax=00000000 ebx=00000000 ecx=00000000 edx=00000000
- CPUID[0d, 19]: eax=00000000 ebx=00000000 ecx=00000000 edx=00000000
- CPUID[0d, 1a]: eax=00000000 ebx=00000000 ecx=00000000 edx=00000000
- CPUID[0d, 1b]: eax=00000000 ebx=00000000 ecx=00000000 edx=00000000
- CPUID[0d, 1c]: eax=00000000 ebx=00000000 ecx=00000000 edx=00000000

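The two kernels react differently to the failed check. Kernel 5.10.0 only logs the warning and still enables the XSAVE features, including 'AVX registers', as its log above shows. The newer 6.1 kernel disables AVX support when this consistency check fails.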
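The two sizes in the 6.1 warning (2440 and 2696) can be reproduced from the CPUID dump. The sketch below is a simplified model, not the kernel's code: it assumes the compacted layout packs the enabled features directly after the 512-byte legacy area and the 64-byte XSAVE header, with no extra alignment (ecx is 0 for every sub-leaf in the dump). All names in it are made up for illustration.

```python
# Sub-leaves of CPUID leaf 0x0d copied from the dump above:
# feature index -> (eax = size in bytes, ebx = offset in the standard layout)
XSTATE_LEAVES = {
    2: (0x100, 0x240),  # AVX registers
    5: (0x040, 0x440),  # AVX-512 opmask
    6: (0x200, 0x480),  # AVX-512 Hi256
    7: (0x400, 0x680),  # AVX-512 ZMM_Hi256
    9: (0x008, 0xA80),  # Protection Keys User registers
}
ENABLED_FEATURES = 0x2E7  # CPUID[0d, 00] eax
REPORTED_SIZE = 0xA88     # CPUID[0d, 01] ebx, as reported to the guest

LEGACY_AREA = 512   # x87 and SSE state (features 0 and 1)
XSAVE_HEADER = 64


def extended_features(mask):
    """Indices of the enabled features beyond x87 (0) and SSE (1)."""
    return [i for i in range(2, 64) if (mask >> i) & 1]


def compacted_size(mask, leaves):
    """Size when the enabled features are packed back to back."""
    size = LEGACY_AREA + XSAVE_HEADER
    for index in extended_features(mask):
        size += leaves[index][0]
    return size


def standard_size(mask, leaves):
    """Size when every feature sits at its fixed offset."""
    return max(leaves[i][1] + leaves[i][0] for i in extended_features(mask))


print(compacted_size(ENABLED_FEATURES, XSTATE_LEAVES))  # 2440
print(standard_size(ENABLED_FEATURES, XSTATE_LEAVES))   # 2696
print(REPORTED_SIZE)                                    # 2696
```

Under these assumptions the kernel's own sum for the compacted format is 2440 bytes, while the value reported through CPUID matches the standard-layout size of 2696 bytes. That would explain the "size 2440 != kernel_size 2696" message, but this reading is our interpretation of the dump and is not confirmed by the vendor.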

The consistency check fails because of a bug in ESXi 7.0. Broadcom has fixed this in ESXi 7.0U1[1].

This might be related to the GNU TLS issue we had. See "ESXi 7.0 and apt-get update" for more information about that issue.