PixInsight 1.8.3 Ripley: Enhanced Parallel Swap File Storage

Juan Conejero

PixInsight Staff
Sep 2, 2004
8,380
467
57
Valencia, Spain
pixinsight.com
The upcoming version 1.8.3 of PixInsight comes with improved parallel swap file storage management. In previous versions, only one swap storage file could be generated per drive (physical or virtual). This was clearly suboptimal for SSDs and virtual RAM drives.

In the new version, one can select the same swap storage directory multiple times, as shown in the next screenshot.


In this screenshot, the same physical device (an SSD in this case) is being used with four concurrent I/O threads. The new swap file storage management routines implemented in version 1.8.3 are optimized for parallel I/O operations on fast SSD and RAM devices. The performance gain that can be achieved with this new feature is spectacular on all platforms. The following benchmarks have been performed on the Antares reference workstation (Intel Xeon E5-2695 v2 @ 2.40GHz, 64 GiB of RAM, Fedora 20 64-bit Linux) with PixInsight Core 1.8.3.1115:

0TNZ014338A70855X9M9NAP6BLHW41JN - 4 threads on a Samsung SSD EVO 840 1TB
Performance indexes: total=12983 cpu=12133 swap=18342 (3311.650 MiB/s)

JEXKR496O859I3213BIG56MV17G67769 - 4 threads on the system /tmp directory (Linux tmpfs)
Performance indexes: total=13814 cpu=12194 swap=30877 (5574.920 MiB/s)

The new parallel storage subsystem yields spectacular I/O transfer rates above 3.23 GiB/s and 5.44 GiB/s, respectively, for the physical SSD and virtual RAM drives in these benchmarks. The good news is that similar improvements can now be achieved on all platforms, including Windows and Mac OS X, although the Linux kernel remains hard to beat in these tasks.
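The GiB/s figures follow from the MiB/s values printed by the benchmark, divided by 1024. A minimal Python sketch of that arithmetic (the helper name `mib_to_gib` is only for illustration and is not part of PixInsight):

```python
# Transfer rates reported by the two benchmarks above, in MiB/s.
rates_mib_s = {
    "Samsung SSD (4 threads)": 3311.650,
    "Linux tmpfs (4 threads)": 5574.920,
}


def mib_to_gib(rate_mib_s):
    """Convert a rate in MiB/s to GiB/s (1 GiB = 1024 MiB)."""
    return rate_mib_s / 1024.0


for name, rate in rates_mib_s.items():
    print(f"{name}: {mib_to_gib(rate):.3f} GiB/s")
```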

Version 1.8.3 of PixInsight is now undergoing the last testing stages and should be released in a few days for all supported platforms.
 

NGC7789

Well-known member
Aug 13, 2012
391
0
Connecticut
Juan,

I've got a few questions about the upcoming parallel swap feature. Using a combination of RAM disk and SSD swap I already get about 4600 MiB/s swap performance. Can you multi-thread swap on multiple sources? How many threads should one create? Is it related to the number of cores? Is there a tradeoff between swap threads and the cores that are available for other processing tasks? Some guidelines on setting up the new feature will be most appreciated. Of course, I will probably end up experimenting on my own anyway  ;D

-Josh
 

Juan Conejero

Hi Josh,

Can you multi-thread swap on multiple sources?
Yes, this is now possible with 1.8.3. As expected, the best results are always achieved with multithreaded swap storage on RAM disks (Linux tmpfs for example), but any configuration is now possible.

How many threads should one create? Is it related to the number of cores?
I think it's more related to the I/O bandwidth of the device(s) involved. For example, with Linux tmpfs on the Antares workstation we achieve a maximum transfer speed of about 6 GiB/s with 5 to 6 threads. With SSDs connected to motherboard SATA ports, 4 threads seem to be about optimal.

Is there a tradeoff between swap threads and the cores that are available for other processing tasks?
Not at all. The swap I/O routines never work in parallel with other tasks. The whole PixInsight platform is 'frozen' during a swap read or write operation. This may change in a future version, but not in the medium term.

Some guidelines on setting up the new feature will be most appreciated.
As you say, once we release 1.8.3 you'll have to experiment to find the optimal settings for your system. It all depends on your hardware and on the volume of the data you process.
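As a starting point for that kind of experiment outside PixInsight, here is a rough Python sketch that writes to the same directory from several threads at once and reports the aggregate rate. It is not the PixInsight benchmark and does not reproduce its swap routines; the function names and sizes are assumptions chosen for illustration, and small writes are largely absorbed by the operating system's page cache, so the numbers are only useful for comparing thread counts against each other on one machine.

```python
import os
import tempfile
import threading
import time

MIB = 1024 * 1024


def write_chunk(directory, size_bytes):
    """Write size_bytes of zeros to a temporary file in directory, then delete it."""
    block = b"\0" * MIB
    with tempfile.NamedTemporaryFile(dir=directory) as f:
        remaining = size_bytes
        while remaining > 0:
            n = min(remaining, len(block))
            f.write(block[:n])
            remaining -= n
        f.flush()
        os.fsync(f.fileno())  # ask the OS to push the data to the device


def measure_write_rate(directories, mib_per_thread=8):
    """Return the aggregate write rate in MiB/s, using one thread per directory entry."""
    size = mib_per_thread * MIB
    threads = [
        threading.Thread(target=write_chunk, args=(d, size)) for d in directories
    ]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    elapsed = time.perf_counter() - start
    return len(directories) * mib_per_thread / elapsed


if __name__ == "__main__":
    # Listing the same directory n times mimics selecting it n times as swap storage.
    target = tempfile.gettempdir()
    for n in (1, 2, 4):
        rate = measure_write_rate([target] * n)
        print(f"{n} thread(s): {rate:.1f} MiB/s")
```

Replace `target` with the directory you intend to use for swap storage and raise `mib_per_thread` well above the amount of free RAM if you want to see the device rather than the cache.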
 

NGC7789

Juan,

I've been experimenting with swap benchmarks since the new release and I have some observations to share.

1. I've noticed a slight difference depending on the order in which the swap threads are declared (only about 1.5% but very repeatable). If they are ramdisk first (say ramdisk/ramdisk/ssd/ssd or ramdisk/ssd/ramdisk/ssd) it performs a bit better than the other way around. Is it expected that the order matters, albeit only by a tiny amount?

2. Is there any reason to consider an unequal number of threads per source? I experimented with it and it was always slower, so I think not. But if not, I would propose a different interface where you choose sources (only once) and then separately the number of threads per source. This would be both a simpler interface and remove the temptation to give an unequal number of threads.

3. I assume the total available swap space is still the smallest swap source times the number of sources. Threads don't matter here, right?

4. Overall with 1.8.3 I am seeing the CPU score about 1% slower regardless of swap setting.

-Josh

 

slang

Well-known member
Sep 22, 2010
65
1
New Zealand
Hi Juan.

I have 3 temporary directories defined
Code:
/mnt/sdb1
/mnt/sdc1
/mnt/sdd1
and as they're fast drives, I leave them alone.

Default swap speed is 2916 today before config update.

I've gone back to my original post/config (to remember the tweak) - http://pixinsight.com/forum/index.php?topic=7083.msg47979#msg47979 Made those changes (incl. the important sysctl -p)

and with the parallel i/o in 1.8.3 I get
Code:
Execution Times
Total time ............. 01:33.31
CPU time ............... 01:27.54
Swap time .............. 00:05.70
Swap transfer rate ..... 2908.379 MiB/s

Performance Indices
Total performance ......  5041
CPU performance ........  4324
Swap performance ....... 16108
These new stats are within 10% (or so) of my previous test from the link above.

What I deduce from this is that with that kernel tuning in place, the disk i/o is so heavily cached, that the bottleneck is system RAM and o/s calls, and that parallelising i/o doesn't help me here. I can tell the disk isn't being touched during this process too, no lights, no clicky clicky whir whir.

Even if it doesn't appear to help me much, it is great that this exists as it seems to help some platforms significantly.

Cheers -
 

NGC7789

Hi Slang. Just to be clear, did you set up multiple entries for each swap drive (at least 2 per)? That would be necessary to benefit from the new swap thread feature. I had also tuned my Ubuntu along the lines of your other post and definitely saw benefit with the new swap feature.
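A quick way to double-check a directory list is to count how many entries land on each drive. The Python sketch below is only an illustration: it reuses the mount points mentioned earlier in this thread, the subdirectory name is made up, and it assumes each drive is mounted at the first three path components (for example `/mnt/sdb1`), which will not hold on every system.

```python
from collections import Counter
from pathlib import PurePosixPath

# Example swap directory list with two entries per drive (illustrative only).
entries = [
    "/mnt/sdb1/swap",
    "/mnt/sdb1/swap",
    "/mnt/sdc1/swap",
    "/mnt/sdc1/swap",
    "/mnt/sdd1/swap",
    "/mnt/sdd1/swap",
]


def threads_per_drive(paths, depth=3):
    """Count entries per drive, assuming a drive is identified by the first
    `depth` path components, e.g. ('/', 'mnt', 'sdb1')."""
    return Counter(
        str(PurePosixPath(*PurePosixPath(p).parts[:depth])) for p in paths
    )


for drive, count in sorted(threads_per_drive(entries).items()):
    print(f"{drive}: {count} entries")
```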
 

slang

Hiya.

Yes.

I have an SSD boot drive and 3 x WD Raptors as cache drives. I have 6 entries:
Code:
/mnt/sdb1/scratch
/mnt/sdb1/scratch2
/mnt/sdc1/scratch
/mnt/sdc1/scratch2
/mnt/sdd1/scratch
/mnt/sdd1/scratch2
Changing these to;
Code:
/mnt/sdb1/scratch
/mnt/sdb1/scratch
/mnt/sdc1/scratch
/mnt/sdc1/scratch
/mnt/sdd1/scratch
/mnt/sdd1/scratch
made no difference - the results are pretty close to just having the 3 straight entries, one per scratch drive. Guess my bottleneck is somewhere else. Either way, I'm pretty happy with
Code:
Swap transfer rate ..... 3125.498 MiB/s
Total performance ......  5099
CPU performance ........  4355
Swap performance ....... 17311
Cheers -
 

mcgillca

Well-known member
Sep 7, 2012
89
2
www.astrobin.com
Hi - I have 2 SSDs. I tried having multiple swap entries reference the same disk (under Windows 7) and saw some improvement (say 50%), but got the swap speed to treble when I had half the entries referring to one SSD and the other half to the other.

I also realised that when I upgraded my BIOS, the overclocking parameters got wiped, so reinstated these and overall, my benchmark increased significantly (3 times swap, extra 15% for the overclocking).

Thanks, Juan - a real improvement for just a few minutes' work (for me anyway - I'm sure this took you days to code!).

Colin
 

jkmorse

Well-known member
For those of us who are not hardware experts, does it make any sense to add an SSD externally and not through a motherboard connection? I use a laptop for PI and want to see if I can get a significant speed boost by adding an external SSD. If it is worthwhile, what is the minimum size SSD that I should consider adding?

Thanks,

Jim
 

NGC7789

That would depend on the interface of the external drive and how it compares to the internal connection. If it's USB 3.0 or eSATA and you are comparing to an internal hard drive (especially a 5400 rpm one typical for a laptop) then there would be a benefit. Keep in mind that these interfaces top out at about 300 MB/sec, so you don't need to spend money on a top SSD that can go at 500 MB/sec or better. Make sure you pay attention to both read and write speeds too.
 

jkmorse

Well-known member
Thought I would report on effects of subbing out an HDD with an SSD, then adding swap files.  First, using the HDD in basic mode, here were my results:

Total 1672; CPU 4796; Swap 454; MiB/s: 81.885 (pathetic)

Replacing the HDD with the SSD resulted in the following:

Total 4645; CPU 4654; Swap 4617; MiB/s: 833.663 (an amazing tenfold improvement!!  :))

Finally, first try at adding swap directories (simply created a 128GB partition, created a "swap" folder, then pointed to that folder 4 times in the Swap Storage Directories; welcome suggestions on how to do that more efficiently so I get more bang for the buck):

Total 5081; CPU 5038; Swap 5276; MiB/s: 952.628 (more improvement and will surely go higher with some trial and error  :D)
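To put the three transfer rates side by side, the relative gains can be worked out with a few lines of Python. This is just arithmetic on the figures quoted above; the labels and the `speedup` helper are only for illustration.

```python
# Swap transfer rates from the three runs above, in MiB/s.
rates = {
    "HDD, single swap directory": 81.885,
    "SSD, single swap directory": 833.663,
    "SSD, swap folder listed 4 times": 952.628,
}


def speedup(new_rate, old_rate):
    """Return how many times faster new_rate is than old_rate."""
    return new_rate / old_rate


hdd, ssd, ssd_x4 = rates.values()
print(f"HDD -> SSD:            {speedup(ssd, hdd):.2f}x")
print(f"SSD -> SSD, 4 entries: {speedup(ssd_x4, ssd):.2f}x")
print(f"HDD -> SSD, 4 entries: {speedup(ssd_x4, hdd):.2f}x")
```

Most of the gain here comes from replacing the HDD with the SSD; listing the swap folder four times adds roughly another 14% on top of that.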

For reference, this is from a Samsung NP880z5e laptop with a Core i7 and 16 GB of RAM. The SSD is a Samsung 840 EVO 1TB.

The speed increase is magical and I can't wait to put all that speed to use in my image processing.

Thanks Juan!!

Jim
 

lmamakos

Member
Apr 28, 2011
6
0
Just a stupid question, but if you have enough memory in your system for a tmpfs RAM disk, and you have a 64-bit user process environment, why bother to swap at all? Even just copying the data to another chunk of RAM in a file system container is going to evict all of your "hot" data from the CPU caches on that core and the common L3 cache.