qemu-devel



From: Jamie Lokier
Subject: Re: [Qemu-devel] Re: [PATCH][v2] Align file accesses with cache=off (O_DIRECT)
Date: Wed, 21 May 2008 02:19:15 +0100
User-agent: Mutt/1.5.13 (2006-08-11)

Anthony Liguori wrote:
> >One property of disks is that if you overwrite a sector and there's
> >power loss, when read later that sector might be corrupt.  Even if the
> >new data is the same as the old data with only some bytes changed,
> >some of the _unchanged_ bytes may be corrupted by this.
> 
> I don't think this is true.  What evidence do you have to support such 
> claims?

What do you imagine happens when you pull the power in the middle of
writing a sector to a floppy disk (to pick a more easily imagined
example)?

There is not enough residual power to write the rest of the sector.
That sector's checksum will therefore not match its data, and reading
it back will (hopefully) report a CRC error.  The sector can be written
over again, which clears the CRC error.
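
As a minimal sketch of that failure mode, the following Python models a
sector as 512 bytes of data followed by a checksum.  The 512-byte size,
the use of CRC-32 from zlib (real floppy formats use a 16-bit CRC), and
the helper names are all assumptions made for illustration, not a
description of any particular drive.

```python
import zlib

SECTOR_SIZE = 512  # assumed payload size, for illustration only


def make_sector(data: bytes) -> bytes:
    """Return the payload followed by a 4-byte CRC-32 of the payload."""
    assert len(data) == SECTOR_SIZE
    return data + zlib.crc32(data).to_bytes(4, "big")


def torn_write(old: bytes, new: bytes, bytes_written: int) -> bytes:
    """Model power loss: only the first bytes_written bytes of the new
    sector image reach the media, the rest keeps its old contents."""
    return new[:bytes_written] + old[bytes_written:]


def crc_ok(sector: bytes) -> bool:
    """Check the stored CRC against the payload actually on the media."""
    data, stored = sector[:SECTOR_SIZE], sector[SECTOR_SIZE:]
    return zlib.crc32(data).to_bytes(4, "big") == stored


old = make_sector(b"A" * SECTOR_SIZE)
new = make_sector(b"B" * SECTOR_SIZE)

# Power fails after 4 bytes: new data prefix, old tail, old checksum.
torn = torn_write(old, new, 4)

# Writing the whole sector again replaces data and checksum together.
rewritten = torn_write(torn, new, len(new))
```

Here crc_ok(torn) is false because the stored checksum still describes
the old data, and crc_ok(rewritten) is true again once the sector has
been written in full.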

No sector which wasn't being written will be corrupt: the write head
isn't activated over those.  The drive waits until it senses the start
of sector N, then activates the write head to write data bits.

The CRC error by itself may cause the whole sector to be reported as
corrupt with no data.  However, if you do manage to get back the bits
from the media, some bits of the sector being written whose values
were not intended to change may be different than expected.  This is
because the way data is recorded does not encode each bit separately,
but multiplexes them together for modulation, and also because bit
timing is not exact.
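
To illustrate the encoding point, here is a small sketch assuming MFM,
the modulation used by common double-density floppy formats.  In MFM
each data bit is stored as a clock cell plus a data cell, and the clock
cell is 1 only when both the previous and the current data bits are 0.
The function names and the assumption that the bit before the sequence
is 0 are mine, purely for illustration.

```python
def mfm_encode(bits):
    """Encode data bits as MFM cells: (clock, data) per input bit."""
    cells = []
    prev = 0  # assumption: the data bit preceding this run is 0
    for d in bits:
        # A clock transition is written only between two 0 data bits.
        clock = 1 if (prev == 0 and d == 0) else 0
        cells += [clock, d]
        prev = d
    return cells


def changed_cells(a, b):
    """Indices of the on-media cells that differ between two encodings."""
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]


before = mfm_encode([0, 0, 0, 0])
after = mfm_encode([0, 1, 0, 0])  # only data bit 1 was changed

diff = changed_cells(before, after)
```

Changing one data bit alters three cells on the media here: its own
data cell, its own clock cell, and the clock cell belonging to the next
bit.  So a write that stops part-way can leave cells for neighbouring
bits in a state that matches neither the old nor the new encoding.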

A modern hard disk uses much more complex data encoding, which further
adds to the effect of a truncated write corrupting even data bits not
intended to be changed, in the vicinity of those being changed.

But it should aim to provide the same basic guarantee that writing a
sector cannot corrupt neighbouring sectors on power failure, only the
one(s) being written.  This is because the robustness of journalling
filesystems and databases does rather depend on this property, and
simple old-fashioned disks do provide it.
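
A rough sketch of why that property matters to a journal, assuming one
checksummed record per sector and a recovery pass that stops at the
first record whose checksum fails.  The record layout, CRC-32, and the
function names are hypothetical, not taken from any real filesystem or
database.

```python
import zlib


def seal(payload: bytes) -> bytes:
    """One journal record per sector: payload plus its CRC-32."""
    return payload + zlib.crc32(payload).to_bytes(4, "big")


def unseal(sector: bytes):
    """Return the payload, or None if the checksum does not match."""
    payload, stored = sector[:-4], sector[-4:]
    if zlib.crc32(payload).to_bytes(4, "big") != stored:
        return None
    return payload


def replay(journal):
    """Collect valid records in order, stopping at the first torn one."""
    good = []
    for sector in journal:
        payload = unseal(sector)
        if payload is None:
            break
        good.append(payload)
    return good


# Sector 2 held an old record and was being overwritten at power loss:
# only the first 2 bytes of the new record reached the media.
old_sector = seal(b"old2")
new_sector = seal(b"rec2")
torn = new_sector[:2] + old_sector[2:]

journal = [seal(b"rec0"), seal(b"rec1"), torn]
recovered = replay(journal)
```

Recovery keeps rec0 and rec1 and discards the torn record.  That is
only safe if the interrupted write could not also have damaged the
sectors holding rec0 and rec1, which is exactly the guarantee in
question.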

I am just speculating; I don't know whether modern hard disks provide
this property, or under what circumstances they fail.  But it seems
they could provide it, because they still have physically independent
sectors.

(Interestingly, the journal block size used by Oracle on different
OSes is different, suggesting the "basic unit of corruption"
varies between OSes and is not always a single sector.)

Although it's just speculation, do you think modern hard disks behave
differently from this?

-- Jamie



