[smartmontools-support] Failed SMART self-check, but only failing attribute is Throughput_Performance?
Markus Koller
2007-01-29 19:00:36 UTC
Hi,

Since yesterday I've been getting messages from smartd about /dev/hda failing
the SMART self-check, and smartctl tells me a drive failure is expected in less
than 24 hours. But looking at the smartctl output, I see only
Throughput_Performance failing, and some manual self-tests haven't turned
up any errors yet, so I think maybe it's just that attribute that is broken.
Though now I also get a warning in the BIOS about the failed self-check,
so I'm not really sure.

The drive is a Hitachi Deskstar 7K250 series and is connected to the first
IDE controller, with nothing else on it. Two CD-ROM drives are on the
second controller, and an S-ATA disk is on a Promise controller card. What
may have caused this is that I left the PC running for 3 days last week,
whereas I usually shut it down overnight.

Here's the full smartctl output:

$ smartctl -a /dev/hda
smartctl version 5.37 [i686-pc-linux-gnu] Copyright (C) 2002-6 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF INFORMATION SECTION ===
Model Family: Hitachi Deskstar 7K250 series
Device Model: HDS722580VLAT20
Serial Number: VNR21LC2SD7S4N
Firmware Version: V32OA60A
User Capacity: 82,348,277,760 bytes
Device is: In smartctl database [for details use: -P show]
ATA Version is: 6
ATA Standard is: ATA/ATAPI-6 T13 1410D revision 3a
Local Time is: Mon Jan 29 19:55:37 2007 CET
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: FAILED!
Drive failure expected in less than 24 hours. SAVE ALL DATA.
See vendor-specific Attribute list for failed Attributes.

General SMART Values:
Offline data collection status:  (0x80) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Enabled.
Self-test execution status:      ( 242) Self-test routine in progress...
                                        20% of test remaining.
Total time to complete Offline
data collection:                 (1828) seconds.
Offline data collection
capabilities:                    (0x1b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        No Conveyance Self-test supported.
                                        No Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   1) minutes.
Extended self-test routine
recommended polling time:        (  31) minutes.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000b   094   094   060    Pre-fail  Always       -       262163
  2 Throughput_Performance  0x0005   001   001   050    Pre-fail  Offline  FAILING_NOW 7373
  3 Spin_Up_Time            0x0007   106   106   024    Pre-fail  Always       -       188 (Average 189)
  4 Start_Stop_Count        0x0012   100   100   000    Old_age   Always       -       1260
  5 Reallocated_Sector_Ct   0x0033   100   100   005    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000b   100   100   067    Pre-fail  Always       -       0
  8 Seek_Time_Performance   0x0005   136   136   020    Pre-fail  Offline      -       31
  9 Power_On_Hours          0x0012   099   099   000    Old_age   Always       -       12432
 10 Spin_Retry_Count        0x0013   100   100   060    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       1244
192 Power-Off_Retract_Count 0x0032   099   099   050    Old_age   Always       -       1732
193 Load_Cycle_Count        0x0012   099   099   050    Old_age   Always       -       1732
194 Temperature_Celsius     0x0002   122   122   000    Old_age   Always       -       45 (Lifetime Min/Max 14/55)
196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0022   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0008   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x000a   200   200   000    Old_age   Always       -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error       00%     12430         -
# 2  Short offline       Completed without error       00%     12425         -

Device does not support Selective Self Tests/Logging
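
For anyone who wants to check an attribute table like the one above with a script, here is a minimal Python sketch. It is not part of smartmontools. It assumes the column layout printed by this smartctl version, and the usual rule that an attribute counts as failed when its normalized VALUE is at or below THRESH. The names ROW, failing_attributes and SAMPLE are made up for illustration, and SAMPLE holds a few rows copied from the output above.

```python
import re

# Assumed row layout, taken from the table above:
# ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
ROW = re.compile(
    r"^\s*(\d+)\s+(\S+)\s+(0x[0-9a-fA-F]+)\s+(\d+)\s+(\d+)\s+(\d+)\s+"
    r"(\S+)\s+(\S+)\s+(\S+)\s+(.*)$"
)


def failing_attributes(text):
    """Return (id, name, value, thresh) for rows with VALUE <= THRESH."""
    failing = []
    for line in text.splitlines():
        m = ROW.match(line)
        if not m:
            continue  # header lines and anything else that is not a table row
        attr_id, name = int(m.group(1)), m.group(2)
        value, thresh = int(m.group(4)), int(m.group(6))
        # Assumption: a threshold of 0 is treated as "cannot fail".
        if thresh > 0 and value <= thresh:
            failing.append((attr_id, name, value, thresh))
    return failing


# A few rows copied from the smartctl output above.
SAMPLE = """\
1 Raw_Read_Error_Rate 0x000b 094 094 060 Pre-fail Always - 262163
2 Throughput_Performance 0x0005 001 001 050 Pre-fail Offline FAILING_NOW 7373
5 Reallocated_Sector_Ct 0x0033 100 100 005 Pre-fail Always - 0
194 Temperature_Celsius 0x0002 122 122 000 Old_age Always - 45 (Lifetime Min/Max 14/55)
"""

print(failing_attributes(SAMPLE))  # prints [(2, 'Throughput_Performance', 1, 50)]
```

On these rows only Throughput_Performance (value 1 against threshold 50) is reported, which matches the single FAILING_NOW entry in the table.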


I'd appreciate it if somebody could give me some help.


Cheers,
Markus
Volker Kuhlmann
2007-01-30 06:22:51 UTC
Post by Markus Koller
Since yesterday I've been getting messages from smartd about /dev/hda failing
the SMART self-check, and smartctl tells me a drive failure is expected in less
than 24 hours. But looking at the smartctl output, I see only
Throughput_Performance failing, and some manual self-tests haven't turned
up any errors yet,
Hm, I don't know, but I thought one can usually trust a disk when it says
that it's had it. If it's still under warranty, get it replaced.

Volker
--
Volker Kuhlmann is list0570 with the domain in header
http://volker.dnsalias.net/ Please do not CC list postings to me.
Eduardo Diaz - Gmail
2007-01-30 19:44:58 UTC
Hi, I would replace this drive if your data is important.

Attribute 02 (0x02), Throughput Performance: Overall (general) throughput
performance of a hard disk drive. If the value of this attribute is
decreasing there is a high probability that there is a problem with
your disk.

See:

http://en.wikipedia.org/wiki/Self-Monitoring%2C_Analysis%2C_and_Reporting_Technology

high probability :-D
Markus Koller
2007-02-04 02:46:21 UTC
(sorry for the broken thread)

Hi,

Thanks for both your answers!

I was curious and left the drive running to see what would happen ;)
Interestingly, the error suddenly disappeared yesterday, but now the drive
sometimes makes strange noises, so I guess I'll replace it next week anyway.
Also, the Raw_Read_Error_Rate has been changing a lot since then, though never
to a critical value.

If anyone's interested, here are the log messages since yesterday:

smartd[2542]: Device: /dev/hda, FAILED SMART self-check. BACK UP DATA NOW!
smartd[2542]: Device: /dev/hda, SMART Prefailure Attribute: 2 Throughput_Performance changed from 1 to 100
smartd[2542]: Device: /dev/hda, SMART Prefailure Attribute: 1 Raw_Read_Error_Rate changed from 74 to 78
smartd[2542]: Device: /dev/hda, SMART Prefailure Attribute: 2 Throughput_Performance changed from 100 to 101
smartd[2542]: Device: /dev/hda, SMART Prefailure Attribute: 1 Raw_Read_Error_Rate changed from 78 to 72
smartd[2542]: Device: /dev/hda, SMART Prefailure Attribute: 1 Raw_Read_Error_Rate changed from 72 to 76
smartd[2542]: Device: /dev/hda, SMART Prefailure Attribute: 1 Raw_Read_Error_Rate changed from 76 to 92
smartd[2542]: Device: /dev/hda, SMART Prefailure Attribute: 1 Raw_Read_Error_Rate changed from 92 to 77
smartd[2542]: Device: /dev/hda, SMART Prefailure Attribute: 1 Raw_Read_Error_Rate changed from 77 to 94

and today:
smartd[2558]: Device: /dev/hda, SMART Prefailure Attribute: 2 Throughput_Performance changed from 101 to 100
smartd[2558]: Device: /dev/hda, SMART Prefailure Attribute: 1 Raw_Read_Error_Rate changed from 94 to 80
smartd[2558]: Device: /dev/hda, SMART Prefailure Attribute: 1 Raw_Read_Error_Rate changed from 80 to 71
smartd[2558]: Device: /dev/hda, SMART Prefailure Attribute: 1 Raw_Read_Error_Rate changed from 71 to 100
smartd[2558]: Device: /dev/hda, SMART Prefailure Attribute: 2 Throughput_Performance changed from 100 to 154
smartd[2558]: Device: /dev/hda, SMART Prefailure Attribute: 1 Raw_Read_Error_Rate changed from 100 to 96
smartd[2558]: Device: /dev/hda, SMART Prefailure Attribute: 1 Raw_Read_Error_Rate changed from 96 to 100
smartd[2558]: Device: /dev/hda, SMART Prefailure Attribute: 1 Raw_Read_Error_Rate changed from 100 to 77
smartd[2558]: Device: /dev/hda, SMART Prefailure Attribute: 1 Raw_Read_Error_Rate changed from 77 to 95
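
To get an overview of swings like these without reading every line, the messages can be summarized per attribute. The following is a rough Python sketch, assuming the "changed from X to Y" wording shown in the log lines above. CHANGE, summarize_changes and SAMPLE_LOG are hypothetical names, and SAMPLE_LOG contains only the first four attribute-change lines from yesterday's log.

```python
import re
from collections import defaultdict

# Matches the attribute-change messages as they appear in the log above.
CHANGE = re.compile(
    r"SMART \w+ Attribute: (\d+) (\S+) changed from (\d+) to (\d+)"
)


def summarize_changes(log_text):
    """Map attribute name -> (lowest, highest, latest) normalized value seen."""
    seen = defaultdict(list)
    for line in log_text.splitlines():
        m = CHANGE.search(line)
        if not m:
            continue  # e.g. the "FAILED SMART self-check" line
        name = m.group(2)
        old, new = int(m.group(3)), int(m.group(4))
        if not seen[name]:
            seen[name].append(old)  # keep the starting value of the first change
        seen[name].append(new)
    return {name: (min(v), max(v), v[-1]) for name, v in seen.items()}


# First four attribute-change lines from yesterday's log above.
SAMPLE_LOG = """\
smartd[2542]: Device: /dev/hda, SMART Prefailure Attribute: 2 Throughput_Performance changed from 1 to 100
smartd[2542]: Device: /dev/hda, SMART Prefailure Attribute: 1 Raw_Read_Error_Rate changed from 74 to 78
smartd[2542]: Device: /dev/hda, SMART Prefailure Attribute: 2 Throughput_Performance changed from 100 to 101
smartd[2542]: Device: /dev/hda, SMART Prefailure Attribute: 1 Raw_Read_Error_Rate changed from 78 to 72
"""

for name, (low, high, last) in summarize_changes(SAMPLE_LOG).items():
    print(f"{name}: min={low} max={high} last={last}")
```

Across all the lines posted above, the lowest Raw_Read_Error_Rate value logged is 71, which is still above the threshold of 060 shown in the earlier smartctl output.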


Cheers,
Markus
