After tinkering with Ext4 I did some research and tested other file systems on my new disk arrays. I’ve concluded that XFS, once tuned, is the best file system for my needs and it could well be the best file system for your needs too.
The remainder of this page explains how I arrived at that decision and how I tune XFS to get optimal, yet safe, performance that can rival Ext4 and JFS.
This post was updated in October 2012
Here are some benchmarks.
- Reiser4 Benchmarked On Linux 3.5 Against EXT4, Btrfs, XFS, ReiserFS
Why I chose XFS
I have not chosen XFS for performance alone; indeed, some benchmarks show that XFS is outperformed for some file operations.
My workstation at home has two 6TB disk arrays and a 1TB root file system. The disk arrays contain photo, music and video libraries which are streamed via UPnP/DLNA and DAAP. The video files can be 2GB to 30GB in size. I also do a good deal of HD video encoding, processing and editing. My root partition contains many virtual machine images of which several are running at any given time.
My work laptop has a 250GB root file system and also contains many virtual machine images of which one is usually running.
XFS is designed with large file systems and large file handling in mind. It seems a sensible choice for those reasons alone, but I also liked the following features:
- XFS has on-line defragmentation tools, while (at the original time of writing in early 2010) Ext3/4 and JFS do not.
- XFS dramatically reduces start-up time by avoiding the fsck delay. Ext3/4 can be very slow to fsck.
- XFS has very fast (a few seconds or less) file system creation. JFS is faster than XFS but Ext4 takes many, many minutes.
- XFS formatted disk capacity is greater than Ext3/4 even after removing the reserved blocks from the Ext3/4 file system. JFS formatted capacity is similar to XFS.
On that last point, XFS gains 400GB over Ext4 on a 6TB array. Ext4 reserves 5% of its blocks by default, which is roughly 300GB on 6TB, so once the Ext4 reserved blocks are removed XFS still gains 100GB over Ext4.
Most of the performance tuning information I found (at the original time of writing in early 2010) is out of date and doesn’t reflect the XFS defaults in modern Linux kernels.
That said, the information on this page is quite old and I no longer feel the need to tweak XFS like I once did.
XFS 3.1.0 and Kernel 2.6.32 or newer
Ubuntu Lucid 10.04 comes with XFS 3.1.0. The defaults used when creating an XFS file system using Ubuntu 10.04 are optimal and do not require any tweaking.
XFS 3.0.2 and Kernel 2.6.31 or older
Ubuntu Karmic 9.10 comes with XFS 3.0.2. If you are running an earlier Ubuntu release and want to use a tuned XFS root file system you can’t simply use the graphical partitioning tool from the Ubuntu LiveCD installer. However, it is very easy to manually create the tuned XFS file systems. Simply boot the Ubuntu Live CD, then start a new shell via Applications -> Accessories -> Terminal.
Now run the following as root.
mkfs.xfs -l lazy-count=1 -L VolumeName <dev>
lazy-count=1 is a default since XFS 3.1.0 but was recommended by the XFS developers before that.
lazy-count is a mkfs option because it changes the on-disk format slightly, and older kernels do not understand this new format. mkfs sets a superblock feature bit to prevent the file system from being mounted on kernels that don’t understand the slightly different disk format. So you must specify lazy-count=0 if you want to disable this feature for older kernels which don’t support it.
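To see whether an existing file system already uses lazy counters, xfs_info reports the flag alongside the log details. The following is a minimal sketch, assuming the lazy-count=N field that xfs_info prints; the helper name lazy_count_of is my own and not part of xfsprogs.

```shell
# Hypothetical helper: extract the lazy-count value from xfs_info output
# read on stdin. Prints 0 or 1, or nothing if the field is absent.
lazy_count_of() {
    sed -n 's/.*lazy-count=\([01]\).*/\1/p' | head -n 1
}

# Usage on a mounted XFS file system (the mount point is an example):
#   xfs_info /srv | lazy_count_of
```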
Forcing a tuned XFS creation
If you are not sure what XFS version you are running, and therefore what the defaults might be on your system, you can fully tune XFS using the following.
For < 1TB XFS file system
mkfs.xfs -l lazy-count=1,version=2,size=128m -i attr=2 -d agcount=4 -L VolumeName <dev>
For > 1TB XFS filesystem
mkfs.xfs -l lazy-count=1,version=2,size=128m -i attr=2 -d agcount=16 -L VolumeName <dev>
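The two cases above differ only in agcount, so the choice can be scripted. This is a sketch under a few assumptions of mine: 1TB is taken as 1024^4 bytes, a device of exactly that size gets the larger value, and the function name is hypothetical. It only prints the command so you can review it before running anything destructive.

```shell
# Sketch: pick agcount from the device size, following the two cases above.
# Assumptions: "1TB" means 1024^4 bytes; exactly 1TB is treated as "large".
# The function prints the mkfs.xfs command instead of running it.
tuned_mkfs_cmd() {
    dev=$1
    label=$2
    bytes=$3
    one_tb=$((1024 * 1024 * 1024 * 1024))
    if [ $((bytes < one_tb)) -eq 1 ]; then
        agcount=4
    else
        agcount=16
    fi
    echo "mkfs.xfs -l lazy-count=1,version=2,size=128m -i attr=2 -d agcount=$agcount -L $label $dev"
}

# Usage (device and label are examples; blockdev needs root):
#   tuned_mkfs_cmd /dev/sdb1 Media "$(blockdev --getsize64 /dev/sdb1)"
```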
Once you have created all your tuned XFS file systems start the Ubuntu installer from the Live CD. When the disk partitioning section comes round choose Specify Partitions Manually.
Edit each XFS file system, telling the partitioner where to mount it, but ensure that you do not tick Format the Partition, thereby preserving your tuned XFS file systems.
When you see this message, just click Continue.
The file system on /dev/sda1 assigned to /boot has not been marked for formatting. Directories containing system files (/etc, /lib, /usr, /var, …) that already exist under any defined mountpoint will be deleted during the install.
Please ensure that you have backed up any critical data before installing.
Further performance optimisations can be gained by specifying some additional mount options for your XFS file systems.
To manually mount an XFS file system with optimal mount options, use the following:
mount -t xfs -o noatime,osyncisosync,logbsize=256k,logbufs=8 <dev> <mtpt>
The ‘/etc/fstab’ entries I use look something like this.
UUID=xxxxxxxxxxx...x <mtpt> xfs noatime,osyncisosync,logbsize=256k,logbufs=8 0 2
The above mount options address the often cited limitation of XFS when handling lots of small files and large numbers of file deletions. The above assumes you don't require atime. Not using atime provides a significant performance benefit.
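If you have several XFS file systems, the fstab entry shown above can be generated rather than typed. A minimal sketch: the helper name xfs_fstab_line is hypothetical, and the options are copied unchanged from the entry above.

```shell
# Hypothetical helper: print an fstab entry using the mount options above.
xfs_fstab_line() {
    uuid=$1
    mtpt=$2
    printf 'UUID=%s %s xfs noatime,osyncisosync,logbsize=256k,logbufs=8 0 2\n' "$uuid" "$mtpt"
}

# Usage (device and mount point are examples):
#   xfs_fstab_line "$(blkid -s UUID -o value /dev/sdb1)" /srv
```

Review the printed line before appending it to /etc/fstab.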
atime, relatime and noatime
Every time a file is accessed (read or write) the default for most file systems is to update the metadata associated with that file with a new access time. Thus, even read operations incur an overhead associated with a write to the file system. This can lead to a significant degradation in performance in some usage scenarios. Adding noatime to the fstab line for any file system stops this action from happening.
One may also specify a relatime option which updates the atime if the previous atime is older than the mtime or ctime. In terms of performance, this will not be as fast as the noatime mount option, but is useful if using applications that need to know when files were last read.
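The relatime rule described above can be modelled as a simple comparison. This sketch only illustrates that comparison: the function name is hypothetical, treating equal timestamps as needing an update is my assumption, and real kernels may apply further conditions that are not modelled here.

```shell
# Sketch of the relatime rule described above; all times are epoch seconds.
# Succeeds (exit 0) when the stored atime would be refreshed on access.
# Assumption: an atime equal to mtime or ctime also counts as stale.
relatime_would_update() {
    atime=$1
    mtime=$2
    ctime=$3
    [ "$atime" -le "$mtime" ] || [ "$atime" -le "$ctime" ]
}

# Usage with GNU stat, which prints atime, mtime and ctime as %X, %Y and %Z:
#   relatime_would_update $(stat -c '%X %Y %Z' somefile) && echo "atime is stale"
```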
As access time is of little importance in most scenarios, this alteration has been widely touted as a fast and easy way to get a performance boost. Even Linus Torvalds seems to be a proponent of this optimization.
Access time is not the same as the last-modified time. Disabling access time will still enable you to see when files were last modified by a write operation.
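To see the two timestamps side by side, GNU stat can print both. A small sketch, assuming GNU coreutils (the %X and %Y format codes); the helper name show_times is mine.

```shell
# Print access and modification times as epoch seconds using GNU stat.
show_times() {
    stat -c 'atime=%X mtime=%Y' "$1"
}

# Usage (the path is an example):
#   show_times /srv/video/holiday.mkv
```

On a file system mounted with noatime the mtime value still changes after a write, while the atime value stays where it was.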
async and nobarrier
If you really want to go for all out performance you can also provide the async and nobarrier mount options. But you really need to understand and accept the potential issues with using these options.
Read the following to understand what write barriers are and if you are prepared to disable them to gain performance.
XFS userspace tools
XFS is available as a kernel module in Ubuntu and also available from the Live CDs. Once Ubuntu is installed you can install the XFS userspace tools as follows.
sudo apt-get install xfsdump xfsprogs
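A quick way to confirm the tools used on this page are installed is to check that each command is on the PATH. A minimal sketch; the helper name missing_tools is hypothetical.

```shell
# Print the name of every given command that is not found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Usage: no output means everything is installed.
#   missing_tools mkfs.xfs xfs_db xfs_fsr xfs_bmap xfsdump
```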
XFS has two utilities for managing fragmentation.
- xfs_db: XFS Debug Information. Used to examine an XFS file system for problems or gather information about the XFS file system.
- xfs_fsr: File System Organiser. Improves the organisation of mounted file systems. The reorganisation algorithm operates on one file at a time, compacting or otherwise improving the layout of the file extents (contiguous blocks of file data).
Defragment a file system
To find the health of an XFS file system use the xfs_db command to gather some information. In the example below /dev/sda1 is mounted as /boot and /dev/sda3 is mounted as /root.
```shell
sudo xfs_db -c frag -r /dev/sda1
actual 162, ideal 162, fragmentation factor 0.00%
sudo xfs_db -c frag -r /dev/sda3
actual 2288833, ideal 254504, fragmentation factor 88.88%
```
The closer the fragmentation factor is to 0% the better. Unsurprisingly /boot is not fragmented. However /root is very fragmented.
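The reported factor appears to be (actual - ideal) / actual expressed as a percentage; the figures in the output above are consistent with that. The sketch below recomputes it from the two extent counts, which is handy if you want to log the value from a script. The helper name frag_factor is my own.

```shell
# Recompute the fragmentation factor from the "actual" and "ideal" extent
# counts in a line of `xfs_db -c frag` output read on stdin.
frag_factor() {
    awk '{
        gsub(/,/, "")                      # drop the commas after the numbers
        for (i = 1; i < NF; i++) {
            if ($i == "actual") actual = $(i + 1)
            if ($i == "ideal")  ideal  = $(i + 1)
        }
        if (actual > 0) printf "%.2f\n", (actual - ideal) * 100 / actual
    }'
}

# Usage:
#   sudo xfs_db -c frag -r /dev/sda3 | frag_factor
```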
Defragmenting XFS file systems can be done on a live running system, but it is a good idea to schedule this for a time where the partition will be used less.
The file system reorganiser for XFS is xfs_fsr. Typically, I instruct xfs_fsr to reorganise /dev/sda3 with a timeout (-t) of 6 hours, which is specified in seconds (60 * 60 * 6 = 21600). But for the purposes of this example I used a timeout of 15 mins (60 * 15 = 900).
sudo xfs_fsr -v -t 900 /dev/sda3
The output will look something like this.
/ start inode=0
ino=145565
extents before:2 after:1 DONE ino=145565
ino=145662
extents before:2 after:1 DONE ino=145662
ino=600148
extents before:2 after:1 DONE ino=600148
ino=1127295
extents before:82794 after:1 DONE ino=1127295
ino=1127243
extents before:2 after:1 DONE ino=1127243
ino=1382852
extents before:50869 after:1 DONE ino=1382852
ino=1422636
When the defrag is finished check how effective the reorganisation was.
sudo xfs_db -c frag -r /dev/sda3
actual 2155648, ideal 254512, fragmentation factor 88.19%
As you can see defragmenting for 15 mins doesn’t improve things greatly, which is why xfs_fsr needs to be run for several hours or more.
Manually defragmenting the file system is simple enough, but a better solution would be to schedule a cron job to run periodically.
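One way to set that up is a file in /etc/cron.d. The sketch below prints a suitable entry; the schedule (Sunday at 01:30), the /usr/sbin/xfs_fsr path and the helper name fsr_cron_line are all assumptions of mine, so adjust them to your system.

```shell
# Hypothetical helper: print an /etc/cron.d entry that runs xfs_fsr weekly.
# Assumptions: Sunday 01:30 is a quiet period, xfs_fsr lives in /usr/sbin.
# The timeout is given in hours and converted to the seconds that -t expects.
fsr_cron_line() {
    hours=$1
    dev=$2
    seconds=$((hours * 60 * 60))
    echo "30 1 * * 0 root /usr/sbin/xfs_fsr -t $seconds $dev"
}

# Usage:
#   fsr_cron_line 6 /dev/sda3 | sudo tee /etc/cron.d/xfs_fsr
```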
Defragment a file
It is also possible to defragment a single file. To determine if a file is in need of defragmenting run the following.
xfs_bmap -v /srv/A320/PGQAR.DAT | wc -l
This outputs a number that approximates the number of extents the file is using; it is slightly high because the count includes the header lines that xfs_bmap -v prints.
The number of extents should be close to 1. The file in this example is very fragmented, so I defragment it with xfs_fsr.
sudo xfs_fsr -v /srv/A320/PGQAR.DAT
This will output something like the following.
/srv/A320/PGQAR.DAT
extents before:95278 after:1 DONE /srv/A320/PGQAR.DAT
The file is now defragmented. I use the method above to target defragmentation where I know files reside that are most likely to be fragmented, rather than defragmenting the whole file system.
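To find which files are worth targeting, the extent count can be taken for every file in a directory. This is a sketch: count_extents is a hypothetical helper that counts only the numbered extent rows of xfs_bmap -v output, skipping the header lines and any holes, and the directory in the usage line is an example.

```shell
# Count extent rows in `xfs_bmap -v` output read on stdin. Rows look like
# "   0: [0..7]: 96..103 ..."; the file name line, the column header and
# rows describing holes are not counted.
count_extents() {
    awk '/^[ \t]*[0-9]+:/ && !/hole/ { n++ } END { print n + 0 }'
}

# Usage: list the ten most fragmented files in a directory.
#   for f in /srv/A320/*; do
#       printf '%s %s\n' "$(xfs_bmap -v "$f" | count_extents)" "$f"
#   done | sort -rn | head
```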
Performance Tuning XFS References
- http://www.opensubscriber.com/message/[email protected]/8198329.html