[Nagiosplug-help] check_ide_smart problems with SATA disks on LSI controller
Kim Pedersen
lists at kimp.org
Wed Jan 9 14:06:37 CET 2008
Hi again,
I had some problems sending to the list when I first signed up, so
forgive me if this has been posted before. I'll take the chance and repost:
Hi Everyone,
I've been having a problem monitoring some SATA disks attached to an LSI 3442
RAID controller.
I have 4 disks in one computer (identical Seagate 750GB ES drives): 1
(/dev/sda) is attached directly to the mainboard, and the other 3 are attached
to the LSI RAID controller as /dev/sd[bcd].
Smartmontools are able to query all 4 disk drives fine, and the
check_ide_smart plugin is able to query /dev/sda fine too.
But I get error messages when using check_ide_smart on the 3 drives on
the RAID controller.
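Since smartctl can read all four drives, one possible stopgap is a small wrapper that lets Nagios call smartctl instead of check_ide_smart. The following is only a sketch under some assumptions: it assumes `smartctl -H` prints either an "overall-health self-assessment test result" line (ATA) or a "SMART Health Status" line (SCSI), and the function names `classify_health` and `check_device` are made up for illustration. It is not part of nagios-plugins.

```python
import subprocess
import sys

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def classify_health(output):
    """Map the text printed by `smartctl -H` to an (exit_code, message) pair."""
    for line in output.splitlines():
        if ("overall-health self-assessment test result" in line
                or "SMART Health Status" in line):
            verdict = line.split(":", 1)[1].strip()
            if verdict in ("PASSED", "OK"):
                return OK, "OK - " + line.strip()
            return CRITICAL, "CRITICAL - " + line.strip()
    # No recognisable health line: report UNKNOWN rather than guessing.
    return UNKNOWN, "UNKNOWN - no SMART health line in smartctl output"


def check_device(device):
    """Run `smartctl -H <device>` and classify what it prints."""
    try:
        proc = subprocess.run(["smartctl", "-H", device],
                              capture_output=True, text=True)
    except OSError as exc:
        return UNKNOWN, "UNKNOWN - could not run smartctl: %s" % exc
    return classify_health(proc.stdout)


# Only act as a plugin when a device argument is given on the command line.
if __name__ == "__main__" and len(sys.argv) == 2:
    code, message = check_device(sys.argv[1])
    print(message)
    sys.exit(code)
```

Saved under a hypothetical name such as check_smartctl.py, it would be run as `./check_smartctl.py /dev/sdb`. smartctl normally needs root, so the nagios user would probably need a sudo rule for it.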
I am using Mandriva 2008, running nagios-plugins 1.4.9.
Trying to fix the issue, I installed the latest 1.4.11 tarball and
tested the latest version of the check_ide_smart plugin from this
release, with the same results.
My guess is that there is a problem with how check_ide_smart parses the
output from smartctl. (The onboard connected drive reports "SMART
Attributes Data Structure revision number: 16", while the RAID attached
drives say "SMART Attributes Data Structure revision number: 10".)
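To compare the drives without eyeballing the listings, the `smartctl -A` output can be parsed with a few lines of Python. This is a rough sketch, not how check_ide_smart itself works (it may well talk to the drive directly rather than read smartctl output). It assumes each attribute sits on one line, as smartctl prints it on a terminal, and the names `parse_attributes` and `flagged` are invented for this example.

```python
import re

# One attribute row of `smartctl -A`:
# ID# NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
ROW = re.compile(
    r"^\s*(?P<id>\d+)\s+(?P<name>\S+)\s+(?P<flag>0x[0-9a-fA-F]+)\s+"
    r"(?P<value>\d+)\s+(?P<worst>\d+)\s+(?P<thresh>\d+)\s+"
    r"(?P<type>\S+)\s+(?P<updated>\S+)\s+(?P<when_failed>\S+)\s+(?P<raw>.+)$"
)
REVISION = re.compile(r"revision number:\s*(\d+)")


def parse_attributes(text):
    """Return (data structure revision or None, list of attribute dicts)."""
    revision = None
    rows = []
    for line in text.splitlines():
        m = REVISION.search(line)
        if m:
            revision = int(m.group(1))
            continue
        m = ROW.match(line)
        if m:
            d = m.groupdict()
            # Normalised columns are zero-padded decimals such as "086".
            for key in ("id", "value", "worst", "thresh"):
                d[key] = int(d[key])
            d["raw"] = d["raw"].strip()
            rows.append(d)
    return revision, rows


def flagged(rows):
    """Attributes whose WHEN_FAILED column is anything other than '-'."""
    return [r for r in rows if r["when_failed"] != "-"]
```

Feeding it the output of each drive gives the revision number plus a list of attributes, so `flagged(rows)` would single out entries like the Temperature_Celsius line marked In_the_past.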
FYI, the lspci output on the RAID controller is
04:00.0 SCSI storage controller: LSI Logic / Symbios Logic SAS1068E
PCI-Express Fusion-MPT SAS (rev 04)
Any suggestions, or feedback to/from the developers?
Kim
------
[root@juke plugins]# ./check_ide_smart -n -d /dev/sda
OK - Operational (18/18 tests passed)
[root@juke plugins]# ./check_ide_smart -n -d /dev/sdb
CRITICAL - SMART_ENABLE: Invalid argument
CRITICAL - SMART_CMD_ENABLE
[root@juke plugins]# ./check_ide_smart -n -d /dev/sdc
CRITICAL - SMART_ENABLE: Invalid argument
CRITICAL - SMART_CMD_ENABLE
[root@juke plugins]# ./check_ide_smart -n -d /dev/sdd
CRITICAL - SMART_ENABLE: Invalid argument
CRITICAL - SMART_CMD_ENABLE
[root@juke plugins]# smartctl -A /dev/sda
smartctl version 5.37 [x86_64-mandriva-linux-gnu] Copyright (C) 2002-6 Bruce Allen
Home page is http://smartmontools.sourceforge.net/
=== START OF READ SMART DATA SECTION ===
SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   105   086   006    Pre-fail  Always   -           148938236
  3 Spin_Up_Time            0x0003   095   093   000    Pre-fail  Always   -           0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always   -           41
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always   -           2
  7 Seek_Error_Rate         0x000f   076   061   030    Pre-fail  Always   -           44346780
  9 Power_On_Hours          0x0032   099   099   000    Old_age   Always   -           1096
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always   -           0
 12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always   -           41
187 Unknown_Attribute       0x0032   100   100   000    Old_age   Always   -           0
189 Unknown_Attribute       0x003a   100   100   000    Old_age   Always   -           0
190 Temperature_Celsius     0x0022   058   038   045    Old_age   Always   In_the_past 56575262762
194 Temperature_Celsius     0x0022   042   062   000    Old_age   Always   -           42 (Lifetime Min/Max 0/34)
195 Hardware_ECC_Recovered  0x001a   064   057   000    Old_age   Always   -           201881025
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always   -           0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline  -           0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always   -           0
200 Multi_Zone_Error_Rate   0x0000   100   253   000    Old_age   Offline  -           0
202 TA_Increase_Count       0x0032   100   253   000    Old_age   Always   -           0
[root@juke plugins]#
[root@juke plugins]# smartctl -A /dev/sdb
smartctl version 5.37 [x86_64-mandriva-linux-gnu] Copyright (C) 2002-6 Bruce Allen
Home page is http://smartmontools.sourceforge.net/
=== START OF READ SMART DATA SECTION ===
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   200   200   051    Pre-fail  Always   -           0
  3 Spin_Up_Time            0x0003   170   163   021    Pre-fail  Always   -           6458
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always   -           99
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always   -           0
  7 Seek_Error_Rate         0x000e   200   200   051    Old_age   Always   -           0
  9 Power_On_Hours          0x0032   092   092   000    Old_age   Always   -           6034
 10 Spin_Retry_Count        0x0012   100   253   051    Old_age   Always   -           0
 11 Calibration_Retry_Count 0x0012   100   253   051    Old_age   Always   -           0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always   -           98
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always   -           96
193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always   -           99
194 Temperature_Celsius     0x0022   112   094   000    Old_age   Always   -           38
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always   -           0
197 Current_Pending_Sector  0x0012   200   200   000    Old_age   Always   -           0
198 Offline_Uncorrectable   0x0010   200   200   000    Old_age   Offline  -           0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always   -           0
200 Multi_Zone_Error_Rate   0x0008   200   200   051    Old_age   Offline  -           0
[root@juke plugins]#
[root@juke plugins]# smartctl -A /dev/sdc
smartctl version 5.37 [x86_64-mandriva-linux-gnu] Copyright (C) 2002-6 Bruce Allen
Home page is http://smartmontools.sourceforge.net/
=== START OF READ SMART DATA SECTION ===
SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   115   090   006    Pre-fail  Always   -           168678674
  3 Spin_Up_Time            0x0003   089   086   000    Pre-fail  Always   -           0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always   -           101
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always   -           0
  7 Seek_Error_Rate         0x000f   073   060   030    Pre-fail  Always   -           23301042
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always   -           666
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always   -           0
 12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always   -           112
187 Unknown_Attribute       0x0032   100   100   000    Old_age   Always   -           0
189 Unknown_Attribute       0x003a   100   100   000    Old_age   Always   -           0
190 Temperature_Celsius     0x0022   056   034   045    Old_age   Always   In_the_past 4901348900908
194 Temperature_Celsius     0x0022   044   066   000    Old_age   Always   -           44 (Lifetime Min/Max 0/31)
195 Hardware_ECC_Recovered  0x001a   062   052   000    Old_age   Always   -           204577213
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always   -           0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline  -           0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always   -           0
200 Multi_Zone_Error_Rate   0x0000   100   253   000    Old_age   Offline  -           0
202 TA_Increase_Count       0x0032   100   253   000    Old_age   Always   -           0
[root@juke plugins]#
[root@juke plugins]# smartctl -A /dev/sdd
smartctl version 5.37 [x86_64-mandriva-linux-gnu] Copyright (C) 2002-6 Bruce Allen
Home page is http://smartmontools.sourceforge.net/
=== START OF READ SMART DATA SECTION ===
SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   098   094   006    Pre-fail  Always   -           14018321
  3 Spin_Up_Time            0x0003   089   086   000    Pre-fail  Always   -           0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always   -           148
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always   -           0
  7 Seek_Error_Rate         0x000f   077   060   030    Pre-fail  Always   -           51060533
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always   -           666
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always   -           0
 12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always   -           102
187 Unknown_Attribute       0x0032   100   100   000    Old_age   Always   -           0
189 Unknown_Attribute       0x003a   100   100   000    Old_age   Always   -           0
190 Temperature_Celsius     0x0022   060   031   045    Old_age   Always   In_the_past 4871200047144
194 Temperature_Celsius     0x0022   040   069   000    Old_age   Always   -           40 (Lifetime Min/Max 0/28)
195 Hardware_ECC_Recovered  0x001a   058   054   000    Old_age   Always   -           185041322
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always   -           0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline  -           0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always   -           0
200 Multi_Zone_Error_Rate   0x0000   100   253   000    Old_age   Offline  -           0
202 TA_Increase_Count       0x0032   100   253   000    Old_age   Always   -           0