Performance
Revision as of 14:35, 12 February 2008
Performance of raids with 2 disks
I have tested the performance of different types of RAID with 2 disks involved. I used my own home-grown test methods, which are quite simple, to measure sequential and random reading and writing of 200 files of 40 MB each. The tests were meant to show what performance I could get out of a system mostly oriented towards file serving, such as a mirror site.
My configuration was:
- 1800 MHz AMD Sempron(tm) Processor 3100+
- 1500 MB RAM
- 2 x Hitachi Ultrastar SCSI-II 1 TB
- Linux version 2.6.12-26mdk
Tester: Keld Simonsen, keld@dkuug.dk
Figures are in MB/s, and the file system was ext3. The chunk size was 256 kiB. Throughput was measured with iostat, and an estimate for steady performance was taken. The figures varied quite a lot over the different 10-second intervals; for example, the estimate of 155 MB/s covers readings ranging from 135 MB/s to 163 MB/s. I then looked at the average over the period when a test was running at full scale (that is, all processes started and none stopped).
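The averaging step described above can be sketched as follows. This is only an illustration and not the script used for these tests: the function name steady_state and the sample readings are made up for the example, with the readings chosen to fall in the range quoted above.

```python
from statistics import mean

def steady_state(samples, ramp_up=1, ramp_down=1):
    """Average throughput over the intervals where the test ran at full scale.

    samples   -- MB/s readings, one per 10-second iostat interval
    ramp_up   -- leading intervals to drop (processes still starting)
    ramp_down -- trailing intervals to drop (processes already finishing)
    """
    window = samples[ramp_up:len(samples) - ramp_down]
    if not window:
        raise ValueError("no full-scale intervals left after trimming")
    return mean(window), min(window), max(window)

# Hypothetical readings for illustration only, not measured data.
readings = [90, 135, 158, 163, 160, 159, 70]
avg, low, high = steady_state(readings)
print(f"estimate {avg:.0f} MB/s (range {low}-{high} MB/s)")
```

How many intervals to trim at each end is a judgment call; it depends on how long the processes take to start and finish.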
|RAID type | Sequential read | Random read | Sequential write | Random write |
|Ordinary disk | 82 | 34 | 67 | 56 |
|RAID0 | 155 | 80 | 97 | 80 |
|RAID1 | 80 | 35 | 72 | 55 |
|RAID10 | 79 | 56 | 69 | 48 |
|RAID10,f2 | 150 | 79 | 70 | 55 |
Random reads for RAID1 and RAID10 were quite unbalanced, coming almost entirely from one of the disks.
The results are quite as expected:
RAID0 and RAID10,f2 reads are about twice as fast as an ordinary disk for sequential reads (155 vs 82) and more than twice as fast for random reads (80 vs 35).
Writes (both sequential and random) are roughly the same for ordinary disk, RAID1, RAID10 and RAID10,f2, around 70 MB/s for sequential, and 55 MB/s for random.
Sequential reads are about the same (80 MB/s) for ordinary partition, RAID1 and RAID10.
Random reads for ordinary partition and RAID1 are about the same (35 MB/s) and about 60 % higher for RAID10 (56 MB/s). I am puzzled why RAID10 is faster than RAID1 here.
All in all RAID10,f2 is the fastest mirrored RAID for both sequential and random reading for this test, while it is about equal with the other mirrored RAIDs when writing.
My kernel did not allow me to test RAID10,o2 as this is only supported from kernel 2.6.18.
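To make the comparisons above easier to check, here is a small sketch that expresses every figure as a ratio to the ordinary disk. The numbers are copied from the table in this section; the helper name relative_to is just for the example.

```python
# Figures in MB/s, copied from the table in this section.
results = {
    "Ordinary disk": (82, 34, 67, 56),
    "RAID0":         (155, 80, 97, 80),
    "RAID1":         (80, 35, 72, 55),
    "RAID10":        (79, 56, 69, 48),
    "RAID10,f2":     (150, 79, 70, 55),
}
columns = ("seq read", "rand read", "seq write", "rand write")

def relative_to(baseline, table):
    """Divide every row by the baseline row, column by column."""
    base = table[baseline]
    return {
        name: tuple(round(value / ref, 2) for value, ref in zip(values, base))
        for name, values in table.items()
    }

ratios = relative_to("Ordinary disk", results)
for name, row in ratios.items():
    cells = "  ".join(f"{col}: {r:.2f}x" for col, r in zip(columns, row))
    print(f"{name:<14}{cells}")
```

Given the variation between measurement intervals mentioned earlier, small differences between these ratios should not be read as significant.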
Old performance benchmark
This section contains a number of benchmarks from a real-world system using software RAID. There is some general information about benchmarking software too.
Benchmark samples were done with the bonnie program, and always on files at least twice the size of the physical RAM in the machine.
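The point of the size rule is that a file which fits in RAM can be served largely from the page cache, so the benchmark would measure memory rather than the disks. A trivial sketch of the rule, with a made-up helper name, using the 256 MB of RAM of the machine described in this section:

```python
def min_benchmark_file_mb(ram_mb, factor=2):
    """Smallest test file size (in MB) that is `factor` times the physical RAM."""
    if ram_mb <= 0 or factor < 1:
        raise ValueError("ram_mb must be positive and factor at least 1")
    return ram_mb * factor

# 256 MB of RAM, as in the machine described in this section.
smallest = min_benchmark_file_mb(256)
print(smallest)  # 512
```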
The benchmarks here only measure input and output bandwidth on one large single file. This is a nice thing to know if it is maximum I/O throughput for large reads and writes one is interested in. However, such numbers tell us little about what the performance would be if the array were used for a news spool, a web server, and so on. Always keep in mind that benchmark numbers are the result of running a "synthetic" program. Few real-world programs do what bonnie does, and although these I/O numbers are nice to look at, they are not ultimate real-world-appliance performance indicators. Not even close.
For now, I only have results from my own machine. The setup is:
- Dual Pentium Pro 150 MHz
- 256 MB RAM (60 MHz EDO)
- Three IBM UltraStar 9ES 4.5 GB, SCSI U2W
- Adaptec 2940U2W
- One IBM UltraStar 9ES 4.5 GB, SCSI UW
- Adaptec 2940 UW
- Kernel 2.2.7 with RAID patches
The three U2W disks hang off the U2W controller, and the UW disk off the UW controller.
It seems to be impossible to push much more than 30 MB/s through the SCSI buses on this system, whether using RAID or not. My guess is that, because the system is fairly old, the memory bandwidth sucks and thus limits what can be sent through the SCSI controllers.
RAID-0
Read is Sequential block input, and Write is Sequential block output. File size was 1 GB in all tests. The tests were done in single-user mode. The SCSI driver was configured not to use tagged command queuing.
From this it seems that the RAID chunk size doesn't make that much of a difference. However, the ext2fs block size should be as large as possible, which is 4 kB (i.e. the page size) on IA-32.
|Chunk size | Block size | Read kB/s | Write kB/s |
|4k | 1k | 19712 | 18035 |
|4k | 4k | 34048 | 27061 |
|8k | 1k | 19301 | 18091 |
|8k | 4k | 33920 | 27118 |
|16k | 1k | 19330 | 18179 |
|16k | 2k | 28161 | 23682 |
|16k | 4k | 33990 | 27229 |
|32k | 1k | 19251 | 18194 |
|32k | 4k | 34071 | 26976 |
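The observation that block size matters and chunk size hardly does can be checked by grouping the read figures by block size and looking at the spread inside each group. The sketch below uses the numbers from the RAID-0 table; the helper name spread_by is made up for the example.

```python
from collections import defaultdict

# (chunk size, block size, read kB/s, write kB/s), copied from the RAID-0 table.
raid0 = [
    ("4k", "1k", 19712, 18035), ("4k", "4k", 34048, 27061),
    ("8k", "1k", 19301, 18091), ("8k", "4k", 33920, 27118),
    ("16k", "1k", 19330, 18179), ("16k", "2k", 28161, 23682),
    ("16k", "4k", 33990, 27229),
    ("32k", "1k", 19251, 18194), ("32k", "4k", 34071, 26976),
]

def spread_by(rows, key_index, value_index=2):
    """Group one value column by chunk or block size; return {key: (min, max)}."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key_index]].append(row[value_index])
    return {key: (min(vals), max(vals)) for key, vals in groups.items()}

# Read throughput grouped by block size (column 1): each group spans all
# chunk sizes, so a narrow range means chunk size had little effect.
by_block = spread_by(raid0, key_index=1)
for block, (low, high) in by_block.items():
    print(f"block {block}: read {low}-{high} kB/s")
```

Passing value_index=3 gives the same view for the write figures.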
RAID-0 with TCQ
This time, the SCSI driver was configured to use tagged command queuing, with a queue depth of 8. Otherwise, everything's the same as before.
|Chunk size | Block size | Read kB/s | Write kB/s |
|32k | 4k | 33617 | 27215 |
No more tests were done. TCQ seemed to slightly increase write performance, but there really wasn't much of a difference at all.
RAID-5
The array was configured to run in RAID-5 mode, and similar tests were done.
|Chunk size | Block size | Read kB/s | Write kB/s |
|8k | 1k | 11090 | 6874 |
|8k | 4k | 13474 | 12229 |
|32k | 1k | 11442 | 8291 |
|32k | 2k | 16089 | 10926 |
|32k | 4k | 18724 | 12627 |
Now, both the chunk size and the block size seem to actually make a difference.
RAID-10
RAID-10 is "mirrored stripes", or, a RAID-1 array of two RAID-0 arrays. The chunk size is the chunk size of both the RAID-1 array and the two RAID-0 arrays. I did not run tests where those chunk sizes differ, although that should be a perfectly valid setup.
|Chunk size | Block size | Read kB/s | Write kB/s |
|32k | 1k | 13753 | 11580 |
|32k | 4k | 23432 | 22249 |
No more tests were done. The file size was 900 MB, because the four partitions involved were 500 MB each, which doesn't leave room for a 1 GB file in this setup (RAID-1 on two 1000 MB arrays).
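The size limit follows from simple capacity arithmetic, sketched below. The function names are made up for the example, and the figures ignore filesystem and RAID metadata overhead, which reduce the usable space a little further.

```python
def raid0_capacity(member_sizes_mb):
    """Striping: the capacities of the members add up."""
    return sum(member_sizes_mb)

def raid1_capacity(member_sizes_mb):
    """Mirroring: usable capacity is that of the smallest member."""
    return min(member_sizes_mb)

partitions = [500, 500, 500, 500]            # four 500 MB partitions
stripe_a = raid0_capacity(partitions[:2])    # first RAID-0 array
stripe_b = raid0_capacity(partitions[2:])    # second RAID-0 array
usable = raid1_capacity([stripe_a, stripe_b])
print(usable)  # 1000
```

With 1000 MB before overhead, a 1 GB file (1024 MB) cannot fit, while a 900 MB file can.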
Fresh benchmarking tools
To check out speed and performance of your RAID systems, do NOT use hdparm. It won't do real benchmarking of the arrays.
Instead of hdparm, take a look at the tools described here: IOzone and Bonnie++.
IOzone is a small, versatile and modern tool to use. It benchmarks file I/O performance for read, write, re-read, re-write, read backwards, read strided, fread, fwrite, random read, pread, mmap, aio_read and aio_write operations. Don't worry, it can run on any of the ext2, ext3, reiserfs, JFS, or XFS filesystems in OSDL STP.
You can also use IOzone to show throughput performance as a function of number of processes and number of disks used in a filesystem, something interesting when it's about RAID striping.
Documentation for IOzone is available in Acrobat/PDF, PostScript, nroff, and MS Word formats, so we only cover one simple example of IOzone in action here:
iozone -s 4096
This would run a test using a 4096 KB file size.
And this is an example of the kind of output IOzone gives:
File size set to 4096 KB
Output is in Kbytes/sec
Time Resolution = 0.000001 seconds.
Processor cache size set to 1024 Kbytes.
Processor cache line size set to 32 bytes.
File stride size set to 17 * record size.
                                                  random  random    bkwd  record  stride
      KB  reclen   write rewrite    read  reread    read   write    read rewrite    read  fwrite frewrite   fread freread
    4096       4   99028  194722  285873  298063  265560  170737  398600  436346  380952   91651   127212  288309  292633
Now you just need to know about the feature that makes IOzone useful for RAID benchmarking: the file operation most relevant to RAID is the strided read. The example above shows 380,952 KB/s for the strided read, so you can draw your own conclusions.
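If you collect several runs, it helps to pull the strided read figure out of the report automatically. The sketch below maps the data row of the sample report above onto its column names. The column order is taken from that sample and is an assumption for any other run: other IOzone versions or options may print different columns, so check the header first.

```python
# Column names in the order they appear in the sample report above.
COLUMNS = [
    "KB", "reclen", "write", "rewrite", "read", "reread",
    "random read", "random write", "bkwd read", "record rewrite",
    "stride read", "fwrite", "frewrite", "fread", "freread",
]

def parse_iozone_row(line):
    """Map one whitespace-separated IOzone data row onto the column names."""
    values = [int(field) for field in line.split()]
    if len(values) != len(COLUMNS):
        raise ValueError(f"expected {len(COLUMNS)} fields, got {len(values)}")
    return dict(zip(COLUMNS, values))

# The data row from the sample report above.
row = parse_iozone_row(
    "4096 4 99028 194722 285873 298063 265560 170737 "
    "398600 436346 380952 91651 127212 288309 292633"
)
print(row["stride read"])  # 380952 (KB/s)
```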
Bonnie++ seems to be more targeted at benchmarking single drives than at RAID, but it can test more than 2 GB of storage on 32-bit machines, and it tests file creat, stat and unlink operations.