[afnog] Read/Write Comparison

Phil Regnauld regnauld at x0.dk
Thu Aug 24 10:33:27 SAST 2006


On Thu, Aug 24, 2006 at 08:45:25AM +0300, Tony Kinyua wrote:
> At the risk of starting a flame war I would like to request the list's opinion 
> on how UFS (read freeBSD default) compares to other journalled file systems 
> like XFS, Reiserfs or ext3 (read Linux default) when it comes to multiple 
> heavy read/write operations on a fairly large raided partition >100GB at the 
> rate of thousands per second. What am looking at is
> 1. Is UFS consistent enough in its journalizing to reduce the risk of file 
> system errors?

    UFS does not journal.  There is an ongoing project for this:

    http://lists.freebsd.org/pipermail/freebsd-current/2006-June/064043.html

    ... but it is not metadata-aware, i.e. it is block-based journaling
    implemented at the GEOM layer, so in that respect it's different from,
    say, ext3 and ReiserFS.

    But otherwise, UFS with softupdates will guarantee that the filesystem
    metadata is consistent after a hard restart, by ordering metadata
    writes so that what is on disk is always self-consistent, provided
    your controller or your disks don't lie to you when they say they've
    written the data (e.g. write-back RAID without a battery, which should
    be forbidden, or IDE/SATA disks with write caching enabled; run
    "sysctl hw.ata.wc" to check).
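
    As a rough illustration of that check, here is a minimal Python sketch
    that reads the "name: value" line printed by sysctl and reports whether
    ATA write caching is on.  The function names are made up for this
    example, and it assumes a FreeBSD box with the legacy ata(4) driver,
    where hw.ata.wc is 1 when write caching is enabled.

```python
import subprocess


def parse_sysctl_line(line):
    """Split one 'name: value' line of sysctl output into (name, int)."""
    name, _, value = line.strip().partition(":")
    return name.strip(), int(value.strip())


def ata_write_cache_enabled(line=None):
    """Return True if hw.ata.wc is 1, i.e. the disks may acknowledge
    writes that are still sitting in their volatile cache."""
    if line is None:
        # Assumption: running on FreeBSD with the legacy ata(4) driver.
        result = subprocess.run(
            ["sysctl", "hw.ata.wc"],
            capture_output=True, text=True, check=True,
        )
        line = result.stdout
    _, value = parse_sysctl_line(line)
    return value == 1


# Illustrative input only, not a real measurement.
sample_line = "hw.ata.wc: 1"
print(ata_write_cache_enabled(sample_line))
```

    Passing the line in as an argument keeps the parsing testable on
    machines that do not have that sysctl at all.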

    The problem with softupdates is that you can't skip fsck; you can only
    delay it or run it in the background.  And even though it can be
    reniced, it eats quite a bit of I/O.  Large filesystems (multiple
    terabytes) can take hours to check.

    Otherwise, there was a quite thorough Linux FS benchmark posted
    somewhere recently; I'll try to find the link again.  Overall,
    XFS/JFS came out on top, with ReiserFS and ext3 slightly behind,
    if I remember correctly.
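
    Published benchmarks rarely match your own workload, so it can be
    worth measuring directly on the candidate filesystem.  The following
    is a minimal Python sketch, not a serious benchmark: it creates,
    fsyncs, reads back and deletes many small files (roughly what a
    Maildir delivery does) and returns files per second.  The function
    name, file count and file size are arbitrary choices for illustration.

```python
import os
import tempfile
import time


def small_file_ops_per_second(directory, count=200, size=4096, sync=True):
    """Create, optionally fsync, read back and delete `count` small files
    in `directory`; return completed files per second."""
    payload = b"x" * size
    start = time.perf_counter()
    for i in range(count):
        path = os.path.join(directory, f"msg{i:06d}")
        with open(path, "wb") as fh:
            fh.write(payload)
            if sync:
                # Force the data to stable storage, as a mail server would.
                fh.flush()
                os.fsync(fh.fileno())
        with open(path, "rb") as fh:
            fh.read()
        os.unlink(path)
    elapsed = time.perf_counter() - start
    return count / elapsed


if __name__ == "__main__":
    # Point dir= at a directory on the filesystem you want to test.
    with tempfile.TemporaryDirectory() as scratch:
        print(f"{small_file_ops_per_second(scratch):.0f} files/s")
```

    Run it with and without sync=True: the gap between the two numbers
    shows how much the result depends on write caching rather than on the
    filesystem itself.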

    I tend to stick with whatever the distribution has by default.  ext2
    bit me a couple of times; ext3 seems more stable in 2.6 (I had kernel
    panics in 2.4 / RedHat).  ReiserFS seems reliable, but behaves VERY
    strangely on hardware failure.  It has dynamic inode allocation, so
    it has the advantage that it cannot run out of inodes the way UFS/ext*
    can.  Very useful on large Maildir-based mail systems (or Usenet news
    servers, for the gray-haired among us).  Otherwise, I think Hans Reiser
    is an arrogant person, but that's only my opinion :)
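
    On filesystems with a fixed inode table, it is worth watching inode
    usage as well as free space.  A minimal Python sketch using
    os.statvfs() follows; the function name and the returned keys are
    invented for this example.  Filesystems that allocate inodes
    dynamically may report a total of 0, in which case no percentage is
    computed.

```python
import os


def inode_usage(path="/"):
    """Return inode counts for the filesystem that contains `path`."""
    st = os.statvfs(path)
    total = st.f_files   # total inodes, as reported by the filesystem
    free = st.f_ffree    # free inodes
    used = total - free
    # Some filesystems report 0 total inodes; avoid dividing by zero.
    used_pct = (100.0 * used / total) if total else None
    return {"total": total, "free": free, "used": used, "used_pct": used_pct}


if __name__ == "__main__":
    print(inode_usage("/"))
```

    This reads the same counters that "df -i" displays, so it is easy to
    hook into whatever monitoring you already have for the mail spool.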

    See http://lwn.net/Articles/190222/ for more info about the future
    of Linux FSes.

    That said, FreeBSD is rock solid and I know several large ISPs
    who use it, including for large email storage backends.

    You might (seriously) consider testing Solaris 10 on x86... ZFS
    is very new and doesn't really have a track record yet, but it
    looks very interesting.  Ports to FreeBSD, and apparently OS X, are
    underway.  But to quote a reader on Slashdot:

        "ZFS consumes the block driver, the volume manager,
        and the RAID layer into one giant entity. It further
        adds things like FS snapshots, compression, and
        dynamically resizable partitions [...]"


> 2. Why do a majority of large email services prefer Linux for core email 
> services?

    I see three reasons:

    a) 2.6 is good enough these days, and the admins are younger and have
       the perception that Linux is more modern.

    b) faster recovery on crash (if you run Linux as your storage backend,
       which might not even be the case if you run NetApp for example)

    c) corporate policy (ISPs used to have a "technology drives business"
       approach; now it's the other way around).  "You can buy support for
       Linux".

       And indeed, nowadays even Debian is supported by HP.


