Hi Jérôme,
Do you have access to /var/log/mcelog?
Otherwise, according to this thread:
https://forum.manjaro.org/t/mce-hardware-error-cpu-0-machine-check-0-bank-5/137519
it looks like this is always an ECC RAM problem... Can you test
with other, non-ECC modules?
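
In case it helps to read the "Machine Check: 0 Bank N: <status>" lines quoted below without any tool, here is a minimal sketch that splits a status word into its generic flag bits and converts the TIME field. The helper name decode_status and the STATUS_FLAGS table are hypothetical, introduced only for illustration; the sketch assumes the standard MCA layout of the upper status bits (VAL, OVER, UC, EN, MISCV, ADDRV, PCC) and does not try to decode any vendor-specific field.

```python
from datetime import datetime, timezone

# Generic flag bits in the upper part of an MCi_STATUS word
# (assumed standard MCA layout; vendor-specific bits are ignored here).
STATUS_FLAGS = {
    63: "VAL",    # the register holds a valid error
    62: "OVER",   # an earlier error was overwritten (overflow)
    61: "UC",     # the error was not corrected by hardware
    60: "EN",     # reporting of this error was enabled
    59: "MISCV",  # the MISC register is valid
    58: "ADDRV",  # the ADDR register is valid
    57: "PCC",    # processor context may be corrupt
}


def decode_status(status: int) -> dict:
    """Hypothetical helper: list the flags set in a 64-bit status word."""
    flags = [
        name
        for bit, name in sorted(STATUS_FLAGS.items(), reverse=True)
        if (status >> bit) & 1
    ]
    # The low 16 bits carry the MCA error code; it is returned raw.
    return {"flags": flags, "mca_code": status & 0xFFFF}


if __name__ == "__main__":
    # Status words copied from the quoted mce lines.
    for bank, raw in (("Bank 0", 0xBC002800000C0135),
                      ("Bank 27", 0xD82010000004080B)):
        info = decode_status(raw)
        print(bank, info["flags"], hex(info["mca_code"]))

    # The TIME field is a Unix timestamp in seconds.
    print(datetime.fromtimestamp(1717694421, tz=timezone.utc).isoformat())
```

With the values from your log, Bank 0 comes out as VAL, UC, EN, MISCV, ADDRV (an uncorrected error with a valid address), Bank 27 as VAL, OVER, EN, MISCV (an overflowed record without the UC bit), and TIME as 2024-06-06 17:20:21 UTC.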
On Thu, 6 June 2024 at 19:32, Jérôme Kieffer
<jerome.kieffer@???> wrote:
>
> Hello,
>
> My machine crashes (i.e. a hard reboot), and these messages show up at the next boot:
>
> [ 1.078033] ERST: Error Record Serialization Table (ERST) support is initialized.
> [ 1.216216] BERT: Error records from previous boot:
> [ 1.216219] [Hardware Error]: event severity: recoverable
> [ 1.216222] [Hardware Error]: Error 0, type: recoverable
> [ 1.216225] [Hardware Error]: fru_text: ProcessorError
> [ 1.216228] [Hardware Error]: section_type: IA32/X64 processor error
> [ 1.216230] [Hardware Error]: Local APIC_ID: 0x0
> [ 1.216233] [Hardware Error]: CPUID Info:
> [ 1.216237] [Hardware Error]: 00000000: 00830f10 00000000 00100800 00000000
> [ 1.216241] [Hardware Error]: 00000010: 76d8320b 00000000 178bfbff 00000000
> [ 1.216245] [Hardware Error]: 00000020: 00000000 00000000 00000000 00000000
> [ 1.216248] [Hardware Error]: Error Information Structure 0:
> [ 1.216251] [Hardware Error]: Error Structure Type: cache error
> [ 1.216253] [Hardware Error]: Check Information: 0x000000001c4d0077
> [ 1.216256] [Hardware Error]: Transaction Type: 1, Data Access
> [ 1.216258] [Hardware Error]: Operation: 3, data read
> [ 1.216261] [Hardware Error]: Level: 1
> [ 1.216263] [Hardware Error]: Uncorrected: true
> [ 1.216266] [Hardware Error]: Precise IP: true
> [ 1.216268] [Hardware Error]: Restartable IP: true
> [ 1.216270] [Hardware Error]: Instruction Pointer: 0x00000000a92109be
> [ 1.216273] [Hardware Error]: Context Information Structure 0:
> [ 1.216275] [Hardware Error]: Register Context Type: MSR Registers (Machine Check and other MSRs)
> [ 1.216277] [Hardware Error]: Register Array Size: 0x0050
> [ 1.216280] [Hardware Error]: MSR Address: 0xc0002001
> [ 1.216289] [Hardware Error]: Error 1, type: recoverable
> [ 1.216291] [Hardware Error]: fru_text: ProcessorError
> [ 1.216294] [Hardware Error]: section_type: IA32/X64 processor error
> [ 1.216296] [Hardware Error]: Local APIC_ID: 0x0
> [ 1.216298] [Hardware Error]: CPUID Info:
> [ 1.216301] [Hardware Error]: 00000000: 00830f10 00000000 00100800 00000000
> [ 1.216305] [Hardware Error]: 00000010: 76d8320b 00000000 178bfbff 00000000
> [ 1.216308] [Hardware Error]: 00000020: 00000000 00000000 00000000 00000000
> [ 1.216311] [Hardware Error]: Error Information Structure 0:
> [ 1.216313] [Hardware Error]: Error Structure Type: micro-architectural error
> [ 1.216315] [Hardware Error]: Check Information: 0x0000000000850021
> [ 1.216318] [Hardware Error]: Error Type: 5, Internal Unclassified
> [ 1.216321] [Hardware Error]: Overflow: true
> [ 1.216323] [Hardware Error]: Context Information Structure 0:
> [ 1.216325] [Hardware Error]: Register Context Type: MSR Registers (Machine Check and other MSRs)
> [ 1.216327] [Hardware Error]: Register Array Size: 0x0050
> [ 1.216329] [Hardware Error]: MSR Address: 0xc00021b1
> [ 1.216334] [Hardware Error]: Error 2, type: recoverable
> [ 1.216336] [Hardware Error]: fru_text: ProcessorError
> [ 1.216338] [Hardware Error]: section_type: IA32/X64 processor error
> [ 1.216340] [Hardware Error]: Local APIC_ID: 0x30
> [ 1.216342] [Hardware Error]: CPUID Info:
> [ 1.216345] [Hardware Error]: 00000000: 00830f10 00000000 30100800 00000000
> [ 1.216349] [Hardware Error]: 00000010: 76d8320b 00000000 178bfbff 00000000
> [ 1.216352] [Hardware Error]: 00000020: 00000000 00000000 00000000 00000000
> [ 1.216355] [Hardware Error]: Error Information Structure 0:
> [ 1.216357] [Hardware Error]: Error Structure Type: cache error
> [ 1.216359] [Hardware Error]: Check Information: 0x000000001c4d0077
> [ 1.216362] [Hardware Error]: Transaction Type: 1, Data Access
> [ 1.216364] [Hardware Error]: Operation: 3, data read
> [ 1.216366] [Hardware Error]: Level: 1
> [ 1.216368] [Hardware Error]: Uncorrected: true
> [ 1.216370] [Hardware Error]: Precise IP: true
> [ 1.216372] [Hardware Error]: Restartable IP: true
> [ 1.216375] [Hardware Error]: Instruction Pointer: 0x0000000000000000
> [ 1.216377] [Hardware Error]: Context Information Structure 0:
> [ 1.216379] [Hardware Error]: Register Context Type: MSR Registers (Machine Check and other MSRs)
> [ 1.216381] [Hardware Error]: Register Array Size: 0x0050
> [ 1.216383] [Hardware Error]: MSR Address: 0xc0002001
> [ 1.216420] mce: [Hardware Error]: Machine check events logged
> [ 1.216422] mce: [Hardware Error]: CPU 0: Machine Check: 0 Bank 0: bc002800000c0135
> [ 1.216527] mce: [Hardware Error]: TSC 0 ADDR 1000000fed80280 MISC d01c0dff00000000 PPIN 2b497ef4dd64076 IPID b000000000
> [ 1.216635] mce: [Hardware Error]: PROCESSOR 2:830f10 TIME 1717694421 SOCKET 0 APIC 0 microcode 830107a
> [ 1.216730] mce: [Hardware Error]: Machine check events logged
> [ 1.216732] mce: [Hardware Error]: CPU 0: Machine Check: 0 Bank 27: d82010000004080b
> [ 1.216821] mce: [Hardware Error]: TSC 0 MISC d01c0dff00000000 PPIN 2b497ef4dd64076 SYND 5b000000 IPID 1002e00000000
> [ 1.216923] mce: [Hardware Error]: PROCESSOR 2:830f10 TIME 1717694421 SOCKET 0 APIC 0 microcode 830107a
> [ 1.216925] mce: [Hardware Error]: CPU 6: Machine Check: 0 Bank 0: bc002800000c0135
> [ 1.216927] mce: [Hardware Error]: TSC 0 ADDR 1000000f5f00480 MISC d01c0dff00000000 PPIN 2b497ef4dd64076 IPID b000000000
> [ 1.217210] mce: [Hardware Error]: PROCESSOR 2:830f10 TIME 1717694421 SOCKET 0 APIC 30 microcode 830107a
>
> I recently replaced the motherboard because of crashes like this one,
> and the RAM has been tested as well... Could it be the processor?
>
> Any kind of information is welcome.
> --
> Jérôme Kieffer, desperate
>
--
|\ _,,,---,,_ Patrice KARATCHENTZEFF
ZZZzz /,`.-'`' -. ;-;;,_ mailto:patrice.karatchentzeff@gmail.com
|,4- ) )-,_. ,\ ( `'-'
'---''(_/--' `-'\_)