[afnog] Read/Write Comparison
Phil Regnauld
regnauld at x0.dk
Thu Aug 24 10:33:27 SAST 2006
On Thu, Aug 24, 2006 at 08:45:25AM +0300, Tony Kinyua wrote:
> At the risk of starting a flame war I would like to request the list's opinion
> on how UFS (read freeBSD default) compares to other journalled file systems
> like XFS, Reiserfs or ext3 (read Linux default) when it comes to multiple
> heavy read/write operations on a fairly large raided partition >100GB at the
> rate of thousands per second. What am looking at is
> 1. Is UFS consistent enough in its journalizing to reduce the risk of file
> system errors?
UFS does not journal. There is an ongoing project for this:
http://lists.freebsd.org/pipermail/freebsd-current/2006-June/064043.html
... but it is not metadata-aware, i.e. it is block-based journaling
implemented at the GEOM layer, so in that respect it differs from, say,
ext3 and ReiserFS.
But otherwise, UFS (with softupdates) will guarantee that the filesystem
metadata is consistent after a hard restart, by reordering writes in such
a way that only the necessary data gets written, provided your controller
or your disks don't lie to you when they say they've written the data
(e.g. write-back RAID without a battery, which should be forbidden, or
IDE/SATA disks with write caching enabled; run "sysctl hw.ata.wc" to check).
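
If you want to script the write-cache check mentioned above across a few boxes, here is a minimal sketch in Python. It assumes a FreeBSD host that exposes hw.ata.wc, and the helper names (parse_ata_wc, ata_write_cache_enabled) are mine, purely for illustration:

```python
import subprocess

def parse_ata_wc(sysctl_output):
    """Return True if `sysctl hw.ata.wc` output reports write caching on."""
    # Accept both "hw.ata.wc: 1" and the bare "1" printed by `sysctl -n`.
    value = sysctl_output.strip().split(":")[-1].strip()
    return int(value) != 0

def ata_write_cache_enabled():
    # Only meaningful on a host that actually has the hw.ata.wc sysctl.
    result = subprocess.run(
        ["sysctl", "hw.ata.wc"],
        capture_output=True, text=True, check=True,
    )
    return parse_ata_wc(result.stdout)
```

A True result only tells you the ATA driver has write caching turned on; it says nothing about what a RAID controller does with its own cache.
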
The problem with softupdates is that you can't skip fsck; you can only
delay it or run it in the background. And even though it can be reniced,
it eats quite a bit of I/O. Large filesystems (multiple terabytes) can
take hours to check.
Otherwise, there was a quite thorough Linux FS benchmark posted
somewhere recently; I'll try to find the link again. Overall,
XFS/JFS came out on top, with ReiserFS and ext3 slightly behind,
if I remember correctly.
I tend to stick with whatever the distribution has by default. ext2
bit me a couple of times, ext3 seems more stable in 2.6 (had kernel
panics in 2.4 / RedHat). ReiserFS seems reliable, but behaves VERY
strangely on hardware failure. It has dynamic inode allocation, so
it has the advantage that it cannot run out of inodes like UFS/EXT* can.
Very useful on large Maildir-based mail systems (or Usenet news servers,
for the gray-haired among us). Otherwise, I think Hans Reiser is an
arrogant person, but that's only my opinion :)
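
On the inode point: on a filesystem with a fixed inode table you can watch how close you are to running out. A minimal sketch in Python using os.statvfs (the function name inode_usage is just for illustration):

```python
import os

def inode_usage(path="/"):
    """Return the fraction of inodes in use on the filesystem holding
    `path`, or None if the filesystem reports no fixed inode count."""
    st = os.statvfs(path)
    total = st.f_files   # total inodes on the filesystem
    free = st.f_ffree    # free inodes
    if total == 0:
        # Some filesystems allocate inodes dynamically and report 0 here.
        return None
    return (total - free) / total

if __name__ == "__main__":
    usage = inode_usage("/")
    if usage is None:
        print("no fixed inode limit reported")
    else:
        print("inodes in use: {:.1%}".format(usage))
```

On a Maildir store this number can hit 100% long before the disk is full, which is the failure mode dynamic inode allocation avoids.
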
See http://lwn.net/Articles/190222/ for more info about the future
of Linux FSes.
That said, FreeBSD is rock solid, and I know several large ISPs
that use it, including for storing large email backends.
You might (seriously) consider testing Solaris 10 on x86... ZFS
is very new and doesn't really have a track record yet, but it
looks very interesting. There are porting efforts underway to
FreeBSD, and OS X apparently. But to quote a reader on Slashdot:
"ZFS consumes the block driver, the volume manager,
and the RAID layer into one giant entity. It further
adds things like FS snapshots, compression, and
dynamically resizable partitions [...]"
> 2. Why do a majority of large email services prefer Linux for core email
> services?
I see three reasons:
a) 2.6 is good enough these days, and the admins are younger and have
the perception that Linux is more modern.
b) faster recovery on crash (if you run Linux as your storage backend,
which might not even be the case if you run NetApp for example)
c) corporate policy (ISPs used to have a technology-drives-business
approach; it's turned around now). "You can buy support for Linux".
And indeed, nowadays even Debian is supported by HP.