lists.zerezo.com
Re: "failed to read delta base object at..."
- Date: Wed, 27 Aug 2008 12:48:00 -0700 (PDT)
- From: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
- Subject: Re: "failed to read delta base object at..."
On Wed, 27 Aug 2008, Nicolas Pitre wrote:
>
> And isn't the bad data block size and alignment a bit odd for a
> filesystem crash corruption?
Yes. If it was a filesystem issue, I'd expect it to be at least disk block
aligned (512 bytes, most of the time) and more likely filesystem block
aligned (ie mostly 4kB).
However, if we were to re-write the file afterwards, it could still get
non-block-aligned corruption - simply because there was a
non-block-aligned rewrite that got lost. But we don't actually ever do
that, except for the header and the SHA1 at the end in some unusual cases.
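As a rough illustration of the alignment argument above, here is a minimal sketch of the check one might run on a corrupted byte range. The helper name is hypothetical and not part of git; the 512 and 4096 block sizes are just the typical values mentioned above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a git function): returns true when the
 * corrupted range [offset, offset + len) is non-empty and both its
 * start and its length are multiples of block_size.
 */
static bool is_block_aligned(uint64_t offset, uint64_t len, uint64_t block_size)
{
	assert(block_size != 0);
	return len != 0 && offset % block_size == 0 && len % block_size == 0;
}
```

A range that fails this check for both 512 and 4096 does not look like a lost disk or filesystem block, which is the reasoning applied to this corruption.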
> However, in the pack-objects case, it is almost impossible to have such
> a corruption since the data is SHA1 summed immediately before being
> written out.
Yes. Anything that uses the "sha1write()" model (which includes the
regular pack-file _and_ the index) should generally be pretty safe.
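To make the "checksum while writing" model concrete, here is a minimal sketch. It is not git's sha1write(): the struct and function names are hypothetical, and FNV-1a is swapped in for SHA1 only to keep the example self-contained. The point is that the checksum is folded over the buffer in memory before it goes to disk, so it never depends on reading the file back.

```c
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

/* Hypothetical stand-in for a checksummed output file. */
struct hashed_fd {
	int fd;
	uint64_t hash;	/* running FNV-1a, standing in for SHA1 */
};

static void hashed_init(struct hashed_fd *f, int fd)
{
	f->fd = fd;
	f->hash = 14695981039346656037ULL;	/* FNV-1a 64-bit offset basis */
}

/*
 * Fold buf into the running checksum, then write it out.
 * Returns 0 on success, -1 on a write error.
 */
static int hashed_write(struct hashed_fd *f, const void *buf, size_t len)
{
	const unsigned char *p = buf;
	size_t i;

	for (i = 0; i < len; i++) {
		f->hash ^= p[i];
		f->hash *= 1099511628211ULL;	/* FNV-1a 64-bit prime */
	}
	while (len > 0) {
		ssize_t n = write(f->fd, p, len);
		if (n < 0)
			return -1;
		p += n;
		len -= (size_t)n;
	}
	return 0;
}
```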
However, we do have this odd case of fixing up the pack after-the-fact
when we receive it from somebody else (because we get a thin pack and
don't know how many objects the final result will have). And that case
seems to be not as safe, because it
 - re-reads the file to recompute the SHA1

   This is understandable, and it's fairly ok, but it does mean that there
   is a bigger chance of the SHA1 matching if something has corrupted the
   file in the meantime!

   (That was not the case of this corruption, obviously, since the SHA1
   didn't match)

 - but it also forgets to fsync the result, because it only did that in
   one path rather than in all cases of fixup.

   Again, this wasn't actually the cause of this corruption, because the
   corruption wasn't near the header or tail, so if it had been due to a
   missed write due to missing an fsync, the pattern would have been
   different.
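The two points above can be sketched as follows. This is an illustration only, not git's fixup_pack_header_footer(): the function name, the file layout (a 4-byte count, then payload, then an 8-byte checksum) and the use of FNV-1a in place of SHA1 are all assumptions made to keep the example short.

```c
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical layout: [4-byte object count][payload][8-byte checksum].
 * payload_end is the offset where the checksum trailer goes.
 * Returns 0 on success, -1 on error.
 */
static int fixup_header_footer(int fd, uint32_t count, off_t payload_end)
{
	unsigned char hdr[4], buf[4096], sum[8];
	uint64_t hash = 14695981039346656037ULL;	/* FNV-1a offset basis */
	off_t pos = 0;
	int i;

	/* Rewrite the header in place now that the final count is known. */
	for (i = 0; i < 4; i++)
		hdr[i] = (unsigned char)(count >> (24 - 8 * i));
	if (pwrite(fd, hdr, 4, 0) != 4)
		return -1;

	/*
	 * Re-read the file to recompute the checksum. Anything that was
	 * corrupted before this point is hashed as-is, so the trailer
	 * would still match the damaged data.
	 */
	while (pos < payload_end) {
		size_t want = sizeof(buf);
		ssize_t n, j;

		if ((off_t)want > payload_end - pos)
			want = (size_t)(payload_end - pos);
		n = pread(fd, buf, want, pos);
		if (n <= 0)
			return -1;
		for (j = 0; j < n; j++) {
			hash ^= buf[j];
			hash *= 1099511628211ULL;	/* FNV-1a prime */
		}
		pos += n;
	}

	/* Write the recomputed checksum as the trailer. */
	for (i = 0; i < 8; i++)
		sum[i] = (unsigned char)(hash >> (56 - 8 * i));
	if (pwrite(fd, sum, 8, payload_end) != 8)
		return -1;

	/* Sync inside the fixup routine so no caller can forget it. */
	return fsync(fd);
}
```

Doing the fsync() in the routine that performs the in-place rewrite, rather than in each caller, is the design choice that matters here: every path that fixes up a file gets the sync.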
Anyway, we should fix the latter problem regardless, even if it's (a) damn
unlikely and (b) definitely not the case in this thing.
The fix is trivial - just move the "fsync_or_die()" into the fixup routine
rather than doing it in one of the callers.
Signed-off-by: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
Linus
---
builtin-pack-objects.c | 1 -
pack-write.c | 1 +
2 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/builtin-pack-objects.c b/builtin-pack-objects.c
index 2dadec1..d394c49 100644
--- a/builtin-pack-objects.c
+++ b/builtin-pack-objects.c
@@ -499,7 +499,6 @@ static void write_pack_file(void)
 		} else {
 			int fd = sha1close(f, NULL, 0);
 			fixup_pack_header_footer(fd, sha1, pack_tmp_name, nr_written);
-			fsync_or_die(fd, pack_tmp_name);
 			close(fd);
 		}
diff --git a/pack-write.c b/pack-write.c
index a8f0269..ddcfd37 100644
--- a/pack-write.c
+++ b/pack-write.c
@@ -179,6 +179,7 @@ void fixup_pack_header_footer(int pack_fd,
 	SHA1_Final(pack_file_sha1, &c);
 	write_or_die(pack_fd, pack_file_sha1, 20);
+	fsync_or_die(pack_fd, pack_name);
 }
 char *index_pack_lockfile(int ip_out)