Re: Hardware error ...

Author: Patrice Karatchentzeff
Date:  
To: Jérôme Kieffer
CC: guilde
Subject: Re: Hardware error ...
Hi Jérôme,

Do you have access to /var/log/mcelog?

Otherwise, according to this:
https://forum.manjaro.org/t/mce-hardware-error-cpu-0-machine-check-0-bank-5/137519

it looks like this is always an ECC RAM problem... Can you test
with other, non-ECC modules?
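
If it helps to read the raw "Machine Check" values in the quoted log, here is a minimal Python sketch that splits a status word into its flag bits. It assumes the standard x86 MCi_STATUS layout (bit 63 VAL down to bit 57 PCC, low 16 bits as the error code) and that the TIME field is a Unix timestamp. The function names are made up for illustration, and the vendor-specific fields are deliberately left undecoded.

```python
from datetime import datetime, timezone

# Architectural flag bits of an x86 MCi_STATUS register (bit positions).
# Assumption: only these common bits are decoded; vendor-specific fields are ignored.
STATUS_FLAGS = {
    "VAL": 63,    # register holds a valid error
    "OVER": 62,   # an earlier error was overwritten
    "UC": 61,     # error was not corrected by hardware
    "EN": 60,     # reporting was enabled for this error
    "MISCV": 59,  # MISC register is valid
    "ADDRV": 58,  # ADDR register is valid
    "PCC": 57,    # processor context may be corrupt
}


def decode_status(hex_value):
    """Return the flag bits and low 16-bit error code of a status word (hypothetical helper)."""
    status = int(hex_value, 16)
    flags = {name: bool((status >> bit) & 1) for name, bit in STATUS_FLAGS.items()}
    flags["error_code"] = status & 0xFFFF
    return flags


def mce_time(epoch_seconds):
    """Convert the TIME field to a UTC datetime, assuming it is a Unix timestamp."""
    return datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)


if __name__ == "__main__":
    # Status words copied from the "Machine Check" lines of the quoted log.
    for bank, value in (("0", "bc002800000c0135"), ("27", "d82010000004080b")):
        print("bank", bank, decode_status(value))
    print(mce_time(1717694421).isoformat())
```

On the two values from the log, bank 0 comes out with UC and ADDRV set, while bank 27 has UC clear and OVER set. That matches the log lines, where only the bank 0 entries carry an ADDR field.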

On Thu, 6 June 2024 at 19:32, Jérôme Kieffer
<jerome.kieffer@???> wrote:
>
> Hello,
>
> My machine crashes (i.e. a hard reboot), and on the next boot I get these messages:
>
> [    1.078033] ERST: Error Record Serialization Table (ERST) support is initialized.
> [    1.216216] BERT: Error records from previous boot:
> [    1.216219] [Hardware Error]: event severity: recoverable
> [    1.216222] [Hardware Error]:  Error 0, type: recoverable
> [    1.216225] [Hardware Error]:  fru_text: ProcessorError
> [    1.216228] [Hardware Error]:   section_type: IA32/X64 processor error
> [    1.216230] [Hardware Error]:   Local APIC_ID: 0x0
> [    1.216233] [Hardware Error]:   CPUID Info:
> [    1.216237] [Hardware Error]:   00000000: 00830f10 00000000 00100800 00000000
> [    1.216241] [Hardware Error]:   00000010: 76d8320b 00000000 178bfbff 00000000
> [    1.216245] [Hardware Error]:   00000020: 00000000 00000000 00000000 00000000
> [    1.216248] [Hardware Error]:   Error Information Structure 0:
> [    1.216251] [Hardware Error]:    Error Structure Type: cache error
> [    1.216253] [Hardware Error]:    Check Information: 0x000000001c4d0077
> [    1.216256] [Hardware Error]:     Transaction Type: 1, Data Access
> [    1.216258] [Hardware Error]:     Operation: 3, data read
> [    1.216261] [Hardware Error]:     Level: 1
> [    1.216263] [Hardware Error]:     Uncorrected: true
> [    1.216266] [Hardware Error]:     Precise IP: true
> [    1.216268] [Hardware Error]:     Restartable IP: true
> [    1.216270] [Hardware Error]:    Instruction Pointer: 0x00000000a92109be
> [    1.216273] [Hardware Error]:   Context Information Structure 0:
> [    1.216275] [Hardware Error]:    Register Context Type: MSR Registers (Machine Check and other MSRs)
> [    1.216277] [Hardware Error]:    Register Array Size: 0x0050
> [    1.216280] [Hardware Error]:    MSR Address: 0xc0002001
> [    1.216289] [Hardware Error]:  Error 1, type: recoverable
> [    1.216291] [Hardware Error]:  fru_text: ProcessorError
> [    1.216294] [Hardware Error]:   section_type: IA32/X64 processor error
> [    1.216296] [Hardware Error]:   Local APIC_ID: 0x0
> [    1.216298] [Hardware Error]:   CPUID Info:
> [    1.216301] [Hardware Error]:   00000000: 00830f10 00000000 00100800 00000000
> [    1.216305] [Hardware Error]:   00000010: 76d8320b 00000000 178bfbff 00000000
> [    1.216308] [Hardware Error]:   00000020: 00000000 00000000 00000000 00000000
> [    1.216311] [Hardware Error]:   Error Information Structure 0:
> [    1.216313] [Hardware Error]:    Error Structure Type: micro-architectural error
> [    1.216315] [Hardware Error]:    Check Information: 0x0000000000850021
> [    1.216318] [Hardware Error]:     Error Type: 5, Internal Unclassified
> [    1.216321] [Hardware Error]:     Overflow: true
> [    1.216323] [Hardware Error]:   Context Information Structure 0:
> [    1.216325] [Hardware Error]:    Register Context Type: MSR Registers (Machine Check and other MSRs)
> [    1.216327] [Hardware Error]:    Register Array Size: 0x0050
> [    1.216329] [Hardware Error]:    MSR Address: 0xc00021b1
> [    1.216334] [Hardware Error]:  Error 2, type: recoverable
> [    1.216336] [Hardware Error]:  fru_text: ProcessorError
> [    1.216338] [Hardware Error]:   section_type: IA32/X64 processor error
> [    1.216340] [Hardware Error]:   Local APIC_ID: 0x30
> [    1.216342] [Hardware Error]:   CPUID Info:
> [    1.216345] [Hardware Error]:   00000000: 00830f10 00000000 30100800 00000000
> [    1.216349] [Hardware Error]:   00000010: 76d8320b 00000000 178bfbff 00000000
> [    1.216352] [Hardware Error]:   00000020: 00000000 00000000 00000000 00000000
> [    1.216355] [Hardware Error]:   Error Information Structure 0:
> [    1.216357] [Hardware Error]:    Error Structure Type: cache error
> [    1.216359] [Hardware Error]:    Check Information: 0x000000001c4d0077
> [    1.216362] [Hardware Error]:     Transaction Type: 1, Data Access
> [    1.216364] [Hardware Error]:     Operation: 3, data read
> [    1.216366] [Hardware Error]:     Level: 1
> [    1.216368] [Hardware Error]:     Uncorrected: true
> [    1.216370] [Hardware Error]:     Precise IP: true
> [    1.216372] [Hardware Error]:     Restartable IP: true
> [    1.216375] [Hardware Error]:    Instruction Pointer: 0x0000000000000000
> [    1.216377] [Hardware Error]:   Context Information Structure 0:
> [    1.216379] [Hardware Error]:    Register Context Type: MSR Registers (Machine Check and other MSRs)
> [    1.216381] [Hardware Error]:    Register Array Size: 0x0050
> [    1.216383] [Hardware Error]:    MSR Address: 0xc0002001
> [    1.216420] mce: [Hardware Error]: Machine check events logged
> [    1.216422] mce: [Hardware Error]: CPU 0: Machine Check: 0 Bank 0: bc002800000c0135
> [    1.216527] mce: [Hardware Error]: TSC 0 ADDR 1000000fed80280 MISC d01c0dff00000000 PPIN 2b497ef4dd64076 IPID b000000000
> [    1.216635] mce: [Hardware Error]: PROCESSOR 2:830f10 TIME 1717694421 SOCKET 0 APIC 0 microcode 830107a
> [    1.216730] mce: [Hardware Error]: Machine check events logged
> [    1.216732] mce: [Hardware Error]: CPU 0: Machine Check: 0 Bank 27: d82010000004080b
> [    1.216821] mce: [Hardware Error]: TSC 0 MISC d01c0dff00000000 PPIN 2b497ef4dd64076 SYND 5b000000 IPID 1002e00000000
> [    1.216923] mce: [Hardware Error]: PROCESSOR 2:830f10 TIME 1717694421 SOCKET 0 APIC 0 microcode 830107a
> [    1.216925] mce: [Hardware Error]: CPU 6: Machine Check: 0 Bank 0: bc002800000c0135
> [    1.216927] mce: [Hardware Error]: TSC 0 ADDR 1000000f5f00480 MISC d01c0dff00000000 PPIN 2b497ef4dd64076 IPID b000000000
> [    1.217210] mce: [Hardware Error]: PROCESSOR 2:830f10 TIME 1717694421 SOCKET 0 APIC 30 microcode 830107a

>
> I recently replaced the motherboard because of crashes like this one,
> and the RAM has also been tested... Could it be the processor?
>
> I would welcome any information at all.
> --
> Jérôme Kieffer, desperate
>



-- 
      |\      _,,,---,,_           Patrice KARATCHENTZEFF
ZZZzz /,`.-'`'    -.  ;-;;,_   mailto:patrice.karatchentzeff@gmail.com
     |,4-  ) )-,_. ,\ (  `'-'
    '---''(_/--'  `-'\_)