Topic: A PixInsight system; what are the performance bottlenecks?

Offline chrisvdberge

Just wanted to check in and ask for your insights into PI performance.
From what I understand, PI is really demanding in all three areas: CPU, RAM and storage (for swapping to disk).
I noticed a tremendous improvement in performance when I added an SSD to my MacBook.
However, I was wondering whether this mainly benefits the registration/integration processes, or whether other (main) processes also swap a lot?

If that is the case, I want to go for two M.2 SSDs (Samsung 950 Pro) in RAID 0 for the fastest possible read/write to disk.
I figured it could make sense to spend a little extra in this area and go for fewer cores on the CPU (an i7 6700K).
Any thoughts on this?

It's too bad the benchmarks don't record the type of SSDs or even HDs used ;)
« Last Edit: 2016 January 17 12:30:39 by chrisvdberge »

Offline georg.viehoever

  • PTeam Member
Re: A PixInsight system; what are the performance bottlenecks?
« Reply #1 on: 2016 January 18 05:29:28 »
Your question is similar to: "What is the performance bottleneck in my car?". It really depends on what you are doing. If you want to race, it is the engine. If you want to transport goods, it is the load capacity. If you just want to travel in comfort, it may be the air conditioning...

Having said this:
  • A major bottleneck for PI is disk I/O, especially for image registration/integration and for the undo functionality. An SSD and lots of RAM are therefore the first priority.
  • Compute speed is rarely a bottleneck these days, so more than 4 cores usually doesn't gain you much.
  • It is rather the memory bandwidth that counts. So if you have the money to buy an expensive Xeon CPU with many memory channels, that will give you another performance boost.
Georg
Georg (6 inch Newton, unmodified Canon EOS40D+80D, unguided EQ5 mount)

Offline chrisvdberge

« Reply #2 on: 2016 January 18 07:59:26 »
Thx!

Is there any way to estimate the amount of data that is swapped to disk?
If this is not too much, I understand you can also assign RAM to be used for swapping.
Buying an extra 32GB of RAM for swapping is cheaper than buying the fastest M.2 card (Samsung 950 Pro 512GB), so that could be the best performance booster?
(so having 64GB of RAM in total)

It depends of course on the data, but let's assume working on a mosaic of 4 panels of pictures shot with a 24MP DSLR...
:D
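
As a rough way to put numbers on this example, the sketch below estimates the in-memory size of one 24MP colour frame and of a 4-panel mosaic. It assumes 3 channels stored as 32-bit floating point samples (4 bytes each) and ignores panel overlap. The function name and its defaults are illustrative and not taken from PI itself. It says nothing about how many such copies actually end up in swap, which depends on the processing history.

```python
def image_bytes(pixels, channels=3, bytes_per_sample=4):
    """Uncompressed size of one image in bytes.

    Assumption: 3 colour channels, 32-bit float samples (4 bytes each).
    """
    return pixels * channels * bytes_per_sample


frame = image_bytes(24_000_000)  # one 24MP DSLR frame
mosaic = 4 * frame               # 4 panels, overlap ignored (an upper bound)

print(f"single frame: {frame / 1e6:.0f} MB")
print(f"4-panel mosaic: {mosaic / 1e9:.2f} GB")
```

Every stored processing step on the mosaic would add roughly one more copy of this size, so the estimate is only a starting point for sizing swap.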


Offline NGC7789

« Reply #3 on: 2016 January 18 14:53:16 »
I believe 64GB is the target swap capacity, although I'm sure there are image sets that benefit from more (even much more). A 32GB RAM disk (or /tmp in RAM on Linux) coupled with a standard high-performance SSD will get you this (total swap capacity = number of sources × size of the smallest source). Personally I use 16GB RAM + SSD (32GB swap); I have not experienced an issue and am quite satisfied with the resulting performance. I'm sure M.2 or other cutting-edge SSDs would improve swap performance. How much is hard to measure, and whether it's worth the cost only you can tell.
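
The rule of thumb above (number of sources times the smallest source) can be written as a minimal sketch. The function name is made up for illustration, and the sizes are simply the ones mentioned in this post.

```python
def total_swap_capacity_gb(source_sizes_gb):
    """Rule of thumb from the post: number of sources x smallest source."""
    if not source_sizes_gb:
        return 0
    return len(source_sizes_gb) * min(source_sizes_gb)


# 32GB RAM disk plus 32GB set aside on an SSD
print(total_swap_capacity_gb([32, 32]))

# 16GB in RAM plus an SSD with far more free space: the smaller source limits the total
print(total_swap_capacity_gb([16, 200]))
```

By this rule, adding a much larger second source does not help beyond the size of the smallest one.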

Offline chrisvdberge

« Reply #4 on: 2016 January 19 00:01:50 »
Thx, that's quite helpful!
Will go for only one M.2 card for now then, but get a motherboard that supports two so I could do RAID 0 in the future ;)

Offline jerryyyyy

« Reply #5 on: 2016 January 19 06:57:36 »
Hi, can someone please tell me what the relevant setting to tweak might be? I want to make sure I am on the same page.
Takahashi 180ED
Astrophysics Mach1
SBIG STT-8300M and Nikon D800
PixInsight Maxim DL 6 CCDComander TheSkyX FocusMax

Offline chrisvdberge

« Reply #6 on: 2016 January 19 10:37:48 »
You mean the disk/SSD/RAM/... swapping?
Basically we're talking about this:
https://pixinsight.com/forum/index.php?topic=7644.0

Offline chrisvdberge

« Reply #7 on: 2016 February 03 05:06:50 »
Very interesting article about a test with 3x M.2 ssd in RAID 0 configuration...
http://www.pcper.com/reviews/Storage/Triple-M2-Samsung-950-Pro-Z170-PCIe-NVMe-RAID-Tested-Why-So-Snappy

Would be very nice for PI I presume, so I'm thinking about using this motherboard in my configuration so that I'm future-proof and can run 3x M.2 whenever I have the $$ to spend on them :P

Offline oldwexi

« Reply #8 on: 2016 February 03 12:38:04 »
The Performance bottleneck is usually between Keyboard
and Chair - in my case...

Gerald

Offline chrisvdberge

« Reply #9 on: 2016 February 04 00:46:49 »
Lol, that's a whole other type of performance we are talking about..... I presume  ;D ;)

Offline james423896

« Reply #10 on: 2016 February 08 05:16:29 »
I recently uploaded some benchmark results from Azure machines. The more powerful ones make cheap compute very accessible for short bursts of activity. I used OneDrive to sync the data, which makes the whole process very slick.

On one image I was playing with, TGVDenoise was taking 35+ mins on my local machine (Q6600). This dropped to 100s in Azure, so it could be a viable alternative for certain tasks.

Offline georg.viehoever

« Reply #11 on: 2016 February 08 08:43:36 »
Absolutely. I did the same a while ago with EC2 instances, and they look quite good.
Where can I find your results?
Can you publish a guide on how to work with Azure and PI?

Georg
« Last Edit: 2016 February 10 02:37:42 by georg.viehoever »

Offline james423896

« Reply #12 on: 2016 February 08 13:33:29 »
Some results published tonight:
http://www.pixinsight.com/benchmark/benchmark-report.php?sn=63T88J0I5CE02MVTQ62TV4P89QD9100S
http://www.pixinsight.com/benchmark/benchmark-report.php?sn=08600H3X777068V2KW1VQLR0V979M2J3

In truth, the default OS disk is pretty slow even on the beefy machines. The temporary storage D drive on the larger machines is much faster, but not persistent between reboots. I need to optimise the RAM disk configuration as this is the current bottleneck, although my TGVDenoise time comparison was just PI running with the swap on D.

This is all based on a "Standard D5_v2 (16 Cores, 56 GB memory)" machine.

Good idea re guide; will post a short guide in the next few days.

Offline james423896

« Reply #13 on: 2016 February 10 02:43:36 »
First attempt at an Azure guide...

Some brief notes on using Azure for PI:
Azure makes an excellent platform for certain PI activities because you can rent large amounts of compute and memory cheaply, and for as long as you need it. A few use cases stand out to me:
1.   Creation of master/super bias files, where you need to integrate potentially hundreds of files. On my Q6600 with only 4GB RAM this was impossible because I would run out of memory.
2.   Image preprocessing and integration. For the reason noted above, this can be challenging if you have large numbers of files.
3.   Certain computationally intensive operations such as TGVDenoise. I tested this on an image from my 100D (18MP). On my Q6600 it took approx. 40 mins for TGVDenoise to run. On an Azure machine (Standard D5_v2 (16 Cores, 56 GB memory)) it took 100s!
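
To put the TGVDenoise timing from point 3 in perspective, here is a small sketch of the speedup arithmetic. It uses only the two figures quoted above (approx. 40 mins locally, 100s on the D5_v2); the function name is illustrative.

```python
def speedup(local_seconds, cloud_seconds):
    """How many times faster the cloud run was than the local run."""
    return local_seconds / cloud_seconds


local_s = 40 * 60  # approx. 40 mins on the Q6600
azure_s = 100      # 100s on the Standard D5_v2

print(f"speedup: about {speedup(local_s, azure_s):.0f}x")
```

Both timings are approximate single runs, so the ratio is only a ballpark figure.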

How I use Azure:
•   I have an MSDN subscription which includes a monthly Azure allowance. I can therefore rent large amounts of compute without actually spending any money! You can of course get your own subscription on a PAYG basis if you don’t have a monthly allowance.
•   I also have an O365 subscription which includes 1TB of OneDrive capacity. This plays an important role in being able to hop between my local machine and Azure. You could of course use any similar service; you just need a way of getting your data into the cloud.

Getting started:
•   Within OneDrive I created an AP folder which holds all of my AP data (raw through to processed). I capture data straight into this folder structure so it is uploaded to OneDrive in near real time.
•   Within the Azure portal (http://portal.azure.com) create a virtual machine. For the OS I picked Windows 10 Enterprise N x64, and for the size I picked D2_V2 Standard. At this stage you don’t want to pay for large amounts of compute, so keep the machine small. You could pick a smaller machine, but they are architecturally different (not using the latest Xeons) and they can be quite slow!
•   Once the machine is accessible, sign in, install PI, apply patches etc. Sign into OneDrive and select your AP folder as a sync folder. By default this will go to the C drive, which works fine for me although the OS partition can be somewhat slow. At this point you will notice the Azure internet pipe is so large that the download of all your AP data is disk-limited!
•   Once your data is fully in sync, check that PI is working as you would expect. Power the machine off.
•   Back in the Azure portal, resize the machine; I choose D5_v2 or D14_v2.
•   Power it on and connect. You now have a 16 core PC at your disposal, billed hourly!
•   For preprocessing and image integration I use the temporary D disk (volatile) or a RAM disk (volatile) as the destination (not my OneDrive folder). I then copy the integrated images into my AP folder.
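
The last step (copying the integrated images from the volatile scratch disk into the synced AP folder) could be scripted along these lines. This is only a sketch: the function name and the list of file suffixes are assumptions, and the demo uses temporary directories as stand-ins for the D drive and the OneDrive folder so that it runs anywhere.

```python
import shutil
import tempfile
from pathlib import Path


def copy_results(scratch_dir, synced_dir, suffixes=(".xisf", ".fit", ".fits")):
    """Copy finished images from volatile scratch storage into the synced folder.

    Only files whose suffix is in `suffixes` are copied, so intermediate
    working files stay out of the synced folder.
    """
    scratch = Path(scratch_dir)
    target = Path(synced_dir)
    target.mkdir(parents=True, exist_ok=True)
    copied = []
    for path in sorted(scratch.iterdir()):
        if path.is_file() and path.suffix.lower() in suffixes:
            shutil.copy2(path, target / path.name)
            copied.append(path.name)
    return copied


# Demo with stand-in directories (hypothetical file names)
with tempfile.TemporaryDirectory() as scratch, tempfile.TemporaryDirectory() as synced:
    (Path(scratch) / "integration.xisf").write_bytes(b"image data")
    (Path(scratch) / "working_file.tmp").write_bytes(b"scratch data")
    print(copy_results(scratch, synced))
```

Filtering by suffix keeps the large intermediate files off the synced folder, which matters because outbound data is billed.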

Benefits/Tips:
•   You can have a 16 core machine billed hourly. A D5_v2 machine costs approx. $2 per hour. Building and running a 16 core machine at home would be a very expensive exercise.
•   By leveraging the elasticity of Azure you can scale the machine up and down to minimise cost. There is no point paying for 16 cores unless you want to use them, and you don’t have to.
•   Remember to stop the virtual machine as soon as you are finished with it. Otherwise you could be incurring a significant charge. If the machine shows “Stopped (deallocated)” it will have no cost (apart from storage).
•   Inbound data has no cost, outbound data does. You tend to take large amounts of incoming data and integrate it into very few, relatively small files, so you might ingest 10GB of data and output 250MB. This works brilliantly, but remember to keep the working directories out of OneDrive, otherwise you will start paying for significant amounts of outbound data. At this point the huge internet pipe will hurt you because it will upload vast amounts of data in no time.
•   Normal storage is very cheap in Azure, but beware of premium storage. I tried this and whilst very fast (1.5GB/s), it is expensive: 30GB was costing me approx. $2.50 per day! Storage costs accrue whether the machine is running or not. The temp drive is almost as fast, but remember it doesn’t persist between sessions. In many ways it is easier to use multiple RAM drives and then copy the data you want to keep.
•   If you want to minimise storage costs you can selectively sync directories (e.g. by target) so you only use storage as you need it. Because the Azure internet pipe is so fast this doesn’t introduce significant delays, and I guess you could try syncing your OneDrive folder to volatile memory if you were prepared to resync every time you powered the machine on.
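
The cost figures in the tips above can be combined into a quick estimate. This sketch uses only the approximate prices quoted in this post (about $2 per hour for a D5_v2, about $2.50 per day for 30GB of premium storage). It assumes cost scales linearly with hours and with GB-days, which is a simplification, and the function names are illustrative.

```python
HOURLY_RATE_USD = 2.0                  # approx. D5_v2 price quoted above
PREMIUM_USD_PER_GB_DAY = 2.50 / 30     # derived from "30GB ... approx. $2.50 per day"


def compute_cost_usd(hours, hourly_rate=HOURLY_RATE_USD):
    """Compute charge for the time the machine is running (not deallocated)."""
    return hours * hourly_rate


def premium_storage_cost_usd(days, gb, rate=PREMIUM_USD_PER_GB_DAY):
    """Premium storage charge, which accrues whether the machine runs or not."""
    return days * gb * rate


# Example: a 3 hour processing session, versus leaving 30GB of premium storage for 30 days
print(f"compute: ${compute_cost_usd(3):.2f}")
print(f"premium storage: ${premium_storage_cost_usd(30, 30):.2f}")
```

At these rates, storage left in place for a month costs far more than an occasional processing session, which is why the temp drive or a RAM drive is the cheaper choice.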

PI Benchmark Analysis:
•   In general the Azure Dn_V2 machines are very capable, especially compared to a normal-spec desktop/laptop. However, I suspect ECC memory limits the performance compared to the fastest desktop results. The CPU speeds are also much lower than in the fastest desktop results.
http://pixinsight.com/benchmark/benchmark-report.php?sn=KOQG911N6EE0QT031HMCBNF7CD2511CZ
http://pixinsight.com/benchmark/benchmark-report.php?sn=G06X582RBL69239KF28JR12YEL7MJ4RF

•   The cheaper An machines are slow; your normal PC is probably just as capable.
http://pixinsight.com/benchmark/benchmark-report.php?sn=EE98891LK5NXVK39J12E28U4I4LF7PCX
http://pixinsight.com/benchmark/benchmark-report.php?sn=L0G7I6GX4G2JDX542OQ5BBK330159I33



Feedback welcome...

Offline chrisvdberge

« Reply #14 on: 2016 February 18 02:49:53 »
Just wanted to comment to say that I think it's awesome that you did this test and posted a guide + your results here!
Truly helpful!