Author Topic: Benchmark meanings  (Read 7765 times)

Offline geomcd1949

  • Newcomer
  • Posts: 16
Benchmark meanings
« on: 2014 July 16 14:18:05 »
I'm designing a computer for my brother and wondering how closely the benchmark results track the real-world speed of a computer running PixInsight. That is, for any given PixInsight task, is a machine with a score of 9000 exactly twice as fast as one with a score of 4500? Or does the law of diminishing returns apply?

In building the machine, will it be faster if:
   
it has 24 cores as opposed to 12?

it has 64 GB of RAM as opposed to 32?

it has four SSDs as opposed to two?

Does size matter in deciding on an SSD?

Any advice on other issues in building a PixInsight computer is greatly appreciated!

We also have a question about whether a GPU is necessary. But I guess if a 4K monitor is used, the computer must have a GPU, no?

Thanks!

~George

Offline NGC7789

  • PixInsight Old Hand
  • ****
  • Posts: 391
Re: Benchmark meanings
« Reply #1 on: 2014 July 16 15:42:23 »
Here's my two cents.

It does depend on your operating system. You will notice better swap performance from Linux, and based on my experiments much of that advantage is real.

I would say twice the benchmark is roughly, but not exactly, twice as fast.

More cores CAN be better, but if you examine the published benchmarks you will see machines with fewer cores scoring higher on CPU than machines with more cores, so core count isn't the whole story. Same with more RAM: up to about 16GB you will definitely benefit. Beyond that you may be better off devoting the extra RAM to a RAM disk to improve swap, or devoting those resources to other system components.

SSDs are definitely an advantage for swap. Two are better than one IF your bus can support the bandwidth. Usually a larger SSD will outperform a smaller one, all other things being equal.

The GPU is not used by PixInsight at this time. I don't know much about 4K, but depending on your CPU I think you can run 4K off the integrated graphics.
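To put some rough numbers on the RAM point, here is a back-of-the-envelope sketch in Python (it assumes 32-bit floating point data, which I believe is PixInsight's default working format, and the 24-megapixel frame is only an illustrative size):

# Rough working-set arithmetic; 32-bit float per sample is assumed,
# and the frame size is only an example.
BYTES_PER_SAMPLE = 4  # 32-bit floating point

def image_mib(width, height, channels=3):
    return width * height * channels * BYTES_PER_SAMPLE / 2**20

frame = image_mib(6000, 4000)  # ~24-megapixel RGB frame
print(f"One 24 MP RGB frame: {frame:.0f} MiB")
print(f"Ten such frames open at once: {10 * frame / 1024:.1f} GiB")

So even a fairly heavy working set of open images fits comfortably below 16GB, which is one reason extra RAM beyond that may pay off more as swap or RAM disk than as application memory.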

-Josh
« Last Edit: 2014 July 16 16:17:38 by NGC7789 »

Offline geomcd1949

  • Newcomer
  • Posts: 16
Re: Benchmark meanings
« Reply #2 on: 2014 July 24 11:04:33 »
Josh,

Thanks very much!

I'm using Linux (Ubuntu 14.04). Here's what I've configured, and would appreciate comment on whether there are any drawbacks to doing it this way.

With 32GB of total RAM, I made a 28GB ramdisk and also moved /tmp into RAM. Now I get swap rates consistently over 4000 MiB/s. I've also experimented with two additional swap folders, each on a dedicated HDD, running the Benchmark sometimes with those added and sometimes with them removed. The two additional swap locations don't seem to have a consistent effect on the overall benchmark score - sometimes scores are higher with them, sometimes lower. The highest Benchmark score (7275) was achieved using only /tmp in RAM and the ramdisk.
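For comparing directories, a crude sequential-write test like the following can serve as a rough sanity check (it is not PixInsight's benchmark, and the 1 GiB test size and 8 MiB block size are arbitrary):

# Crude sequential-write check for a candidate swap directory.
import os, time

def write_rate_mib_s(directory, size_mib=1024, block_mib=8):
    path = os.path.join(directory, "throughput_test.tmp")
    block = os.urandom(block_mib * 2**20)
    start = time.time()
    with open(path, "wb") as f:
        for _ in range(size_mib // block_mib):
            f.write(block)
        f.flush()
        os.fsync(f.fileno())   # force the data out before timing stops
    elapsed = time.time() - start
    os.remove(path)
    return size_mib / elapsed

print(write_rate_mib_s("/tmp"))   # point this at any candidate swap directory

Pointing it at the ramdisk, the tmpfs-backed /tmp and the HDDs should show roughly where each location's raw write ceiling sits.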

~George

P.S. I'm a little confused by the 'MiB/sec.' Is this 4 gigabytes/second or 4 gigabits per second?


Offline NGC7789

  • PixInsight Old Hand
  • ****
  • Posts: 391
Re: Benchmark meanings
« Reply #3 on: 2014 July 24 11:49:09 »
As I understand it, the extreme benchmark figures from Linux come from swap residing in RAM rather than on disk. The only concern is what happens when you run out of RAM.

I think if you have 32GB total, devoting 28GB to a ramdisk is excessive. In fact I would think you'd be better off letting Linux manage the swap (i.e. no RAM disk at all), assuming that management is tuned (like putting /tmp in RAM). One thing that is a little misleading about the benchmark is that it doesn't tax total swap space. As I understand it, even 4GB of total swap space will not be exhausted by the benchmark. In practice you may need many times that much (10-20 times!).

MB = 1000 kB and kB = 1000 bytes, while MiB = 1024 KiB and KiB = 1024 bytes. Basically, MiB is the proper base-2 counterpart of a megabyte, so the figure is bytes, not bits.
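To answer the P.S. directly with plain arithmetic (using your ~4000 MiB/s figure):

# Unit check for the benchmark's "MiB/s" figure.
rate_mib_s = 4000
rate_gb_s = rate_mib_s * 2**20 / 1e9    # decimal gigabytes per second
rate_gbit_s = rate_gb_s * 8             # gigabits per second
print(f"{rate_mib_s} MiB/s = {rate_gb_s:.2f} GB/s = {rate_gbit_s:.1f} Gbit/s")

So it is gigabytes per second, not gigabits: about 4.2 GB/s.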

Offline NGC7789

  • PixInsight Old Hand
  • ****
  • Posts: 391
Re: Benchmark meanings
« Reply #4 on: 2014 July 24 19:59:15 »
After my previous post I kept thinking about this business of moving /tmp to RAM. Since PI puts its swap in /tmp (at least by default), putting /tmp in RAM is about the same as putting swap on a RAM disk. It has the same tradeoff: improved speed in exchange for smaller swap space. In practice I don't believe this is a good trade unless you have plenty of RAM. I think I read somewhere that you want 40-60 GB of swap, maybe more when performing operations involving a large number of files.
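To get a feel for where figures like 40-60 GB come from, here is a rough estimate in Python (it assumes each swap file holds one full copy of the image per history state, in 32-bit floating point; the image size and number of states are only examples):

def state_gib(width, height, channels=3, bytes_per_sample=4):
    # One history state = one full copy of the image, as I understand it.
    return width * height * channels * bytes_per_sample / 2**30

state = state_gib(12000, 8000)   # e.g. a large mosaic or 2x drizzled image
states = 20                      # processing steps kept in the history
print(f"One state: {state:.1f} GiB; {states} states: {states * state:.0f} GiB")

With a couple of large images open at once the total climbs quickly, which is consistent with the 40-60 GB figure.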

Based on this I'm back to default settings for my Ubuntu.

Offline geomcd1949

  • Newcomer
  • Posts: 16
Re: Benchmark meanings
« Reply #5 on: 2014 July 24 20:47:54 »
Josh,

Interesting discussion, and thanks for your guidance. My purpose in asking these questions is to find the best configuration for a computer to run PixInsight tasks. My assumption is that the Benchmark was designed so that the higher the score, the faster the tested computer will accomplish those tasks, and thus the less time the operator has to wait for each successive step in a project.

It appears I've stumbled on a configuration that minimizes swap time and maximizes transfer rate, and this with 32GB of RAM. Anything done to further reduce swap time would have little effect on the Benchmark, because lowering the swap time from about four seconds to three isn't worth much. I'm thinking the remaining issue is to lower the CPU time, which looks like a function of the number of cores. Do you think this is a reasonable assumption?

Thanks!

~George

Offline NGC7789

  • PixInsight Old Hand
  • ****
  • Posts: 391
Re: Benchmark meanings
« Reply #6 on: 2014 July 24 21:39:01 »
There is certainly going to be a diminishing return as you improve performance in any one area. Not only will you have to spend more to gain less, but as you mention, even a large percentage gain amounts to little absolute time saved when the system is already quite fast.

While the benchmark is a useful tool for comparing systems and configurations, it must be understood on its own terms and within its limitations. One limitation, as I have mentioned, is that it does not stress the size of the swap. I think a good way to check real-world performance is to run a large BPP session with your settings and see how it goes. Each process will output its swap performance to the console. If you see consistent performance through the run, I would say your performance is real.

As far as CPU cores are concerned, for a perfectly multi-threaded application clock speed and core count should both contribute: a second core at the same clock would be equivalent to doubling the clock on a single core. Of course no application is perfectly multi-threaded, and more significantly, in most multicore processors the clock speed of all cores drops as more cores are engaged. An interesting discussion of this can be found here: http://macperformanceguide.com/MacPro2013-CPU-GPU-choice.html. While that reference is discussing Mac Pros and OS X, I would guess much of it still applies to Linux. Even if Linux is more efficient in its use of the cores, it can't get away from the clock speed dropping as core usage goes up.
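Here is a toy model of that trade-off in Python (the parallel fraction and the per-core clock drop are made-up numbers, just to show the shape of the curve, not measured PixInsight figures):

def speedup(cores, parallel_fraction=0.9, clock_drop_per_core=0.02):
    # Amdahl-style scaling with a simple linear all-core clock penalty.
    clock = 1.0 - clock_drop_per_core * (cores - 1)
    return clock / ((1 - parallel_fraction) + parallel_fraction / cores)

for n in (4, 6, 12):
    print(f"{n} cores: {speedup(n):.2f}x a single core at full clock")

Even with a 90% parallel workload, doubling from six to twelve cores gives well under a 2x gain in this model.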

But the short story is that a six-core processor should outperform a four-core one if the base clock speed and underlying chip technology are similar. On the other hand, six-core machines are going to be much more expensive, since all the components go up in price: processor, motherboard, a discrete graphics card because there's no on-board graphics, etc.

Offline NGC7789

  • PixInsight Old Hand
  • ****
  • Posts: 391
Re: Benchmark meanings
« Reply #7 on: 2014 July 25 12:10:42 »
Turns out BPP is not a good stresser of swap. In fact my test seems to indicate that it doesn't use swap at all. I ran my largest possible BPP run, which took just over 31 minutes. The difference with and without /tmp in RAM was 5 seconds, which is essentially no difference. So now I need to find a process that does stress swap in order to have a better real-world test of /tmp in RAM.

Offline geomcd1949

  • Newcomer
  • Posts: 16
Re: Benchmark meanings
« Reply #8 on: 2014 July 25 20:35:40 »
This may be interesting. I created a 28GB ramdisk within 32GB of total RAM. In one series of three Benchmark runs, I enabled a swap folder on each of two HDDs as well as on the ramdisk. Here are the results:
Score:
Total performance ......  7135
CPU performance ........  6223
Swap performance ....... 18121 (3271.809 MiB/s)

In the next series, I disabled the swap files on the HDDs, and used only the ramdisk for swap. Here are the results of those tests:
Total performance ......  7331
CPU performance ........  6239
Swap performance ....... 26533 (4790.555 MiB/s)

It appears that forcing PixInsight to swap to slower disks as well as to RAM actually slows the process.

Offline Juan Conejero

  • PTeam Member
  • PixInsight Jedi Grand Master
  • ********
  • Posts: 7111
    • http://pixinsight.com/
Re: Benchmark meanings
« Reply #9 on: 2014 July 26 02:12:32 »
Quote
It appears that forcing PixInsight to swap to slower disks as well as to RAM actually slows the process.

It seems this depends on the speed of your RAM. See for example this benchmark on our main workstation with 64 GB of DDR3 1600 MHz ECC unbuffered:

1 x 16 GiB RAM disk with Linux tmpfs + 1 x Samsung SSD EVO 840 1TB + 1 x HGST Ultrastar 7K4000 4TB: 3391.480 MiB/s

And this benchmark replaces the RAM disk with a second SSD:

2 x Samsung SSD EVO 840 1TB + 1 x HGST Ultrastar 7K4000 4TB: 3085.350 MiB/s

In both cases the HDDs are using ext4 filesystems. As you can see, the transfer rates are not very different; the RAM disk configuration is only about 300 MiB/s faster.

What's the type and speed of your RAM modules?
Juan Conejero
PixInsight Development Team
http://pixinsight.com/

Offline NGC7789

  • PixInsight Old Hand
  • ****
  • Posts: 391
Re: Benchmark meanings
« Reply #10 on: 2014 July 26 06:49:52 »
Juan,

My experience was very similar to George's. When I was trying to optimize OS X I measured ~1500 MiB/s with the RAM disk alone, ~1200 with RAM disk and SSD, ~550 with the SSD alone and ~400 with RAM disk and HDD. It seemed that multiple swap sources were achieving some kind of average: better than the slower drive but worse than the faster one. Based on one of your other posts I thought maybe the SATA bus was being saturated, so I added a PCI SATA card and a second SSD. If this setup had improved OS X I would have stayed single boot; my plan B was to use the new SSD as my OS X boot and to boot from the faster SSD into Linux for PixInsight. The new SSD achieved ~400 alone, and in conjunction with the existing one I got ~490 - again not as good as the one SSD alone.

So now I am booting into Ubuntu for PI. I haven't tried parallel swap; I have only been experimenting with RAM-based options like moving /tmp to RAM. But because the benchmark doesn't seem to stress the amount of swap, I can't tell whether the speeds I'm seeing come at the cost of adequate swap space.

Can you suggest a process that would push the swap space to its limit? My system has 32GB of DDR3 1600 MHz non-ECC RAM. The SSD is a Samsung 840 EVO 250GB.

Thanks for your attention!
-Josh

Offline geomcd1949

  • Newcomer
  • Posts: 16
Re: Benchmark meanings
« Reply #11 on: 2014 July 26 09:27:33 »
Juan and Josh,

RAM is 4x8GB G.Skill Ripjaws X DDR3-1600 (PC3-12800) ECC unbuffered on an ASRock Z77 Extreme4 board.
Both HDDs used for swap are 1TB 6Gb/s with ext4 filesystems.

I'm wondering what the PI main workstation would score if you disabled the swap files on the SSD and HDD and used only a (say) 56GB ramdisk?

My goal is to build the best PI machine for my brother. It's nice to have high benchmark scores, but the concern is whether those high scores will translate into faster performance on real PI projects and processes. I note that raising the swap rate from 3271 to 4790 MiB/s reduced the total time by only about 1.5 seconds. Granted, this is a 32% reduction in swap time, but it is only a 2.5% reduction in total time. I'm wondering whether this sort of improvement in the benchmark will result in commensurately faster times for the various processes involved in PI image work.

I'm also wondering if there is any downside to devoting such a large share of RAM to the ramdisk.

Thanks!

~George

P.S. I don't do astrophotography or image-processing work, but I would like to comment on PixInsight as a program. It has a beautiful interface, and all its features are placed where one would expect. The instructions are clear and concise. I think you should charge my brother and other users more for the program. :)

Offline NGC7789

  • PixInsight Old Hand
  • ****
  • Posts: 391
Re: Benchmark meanings
« Reply #12 on: 2014 July 26 16:42:14 »
I am certain that you are being too aggressive with your RAM disk. You are only leaving 4GB for the OS and applications. Modern OSes, especially Linux, are very good at utilizing RAM, and making a RAM disk deprives the OS of that RAM. If you really have an excess of RAM then maybe it makes sense, but even 32GB is not an extreme amount. Remember that directing Ubuntu to put /tmp in RAM (via tmpfs) does not involve carving out a fixed-size RAM disk; tmpfs only consumes memory as files are actually written to it.

Swap performance only benefits the parts of a process that read or write the swap area. I have seen processes where loading a file from swap is virtually instantaneous just from an SSD; then the CPU takes over. If swap access accounts for 20% of a process's time, then even doubling swap speed only gets you about 10% overall. Each additional enhancement yields less benefit, because you keep driving down the share that swap contributes to the total time.
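The arithmetic behind that, as a small sketch (the 20% share of time spent in swap is just an assumed example):

def overall_gain(swap_fraction, swap_speedup):
    # New total time when only the swap portion gets faster.
    new_total = (1 - swap_fraction) + swap_fraction / swap_speedup
    return 1 / new_total

for s in (2, 3, 10):
    print(f"Swap {s}x faster: {(overall_gain(0.20, s) - 1) * 100:.0f}% faster overall")

Doubling swap speed buys about 11%, and even a 10x improvement tops out near 22% when swap is only a fifth of the total time.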

At this point I am more concerned about ending up with a scheme that doesn't have enough swap space than about eking out every last MiB/s.

Offline NGC7789

  • PixInsight Old Hand
  • ****
  • Posts: 391
Re: Benchmark meanings
« Reply #13 on: 2014 July 30 06:56:36 »
George,

I've given a little more thought to this. As I understand it the largest size swap is to be attained by simply putting PI swap on my SSD and allowing default Ubuntu configuration do whatever it does. This results in the slowest swap performance (swap performance of about 1500 MiB/s). This is about the same if I put swap in /tmp or somewhere else in the files system.

If I put swap in /tmp and move /tmp to RAM by mounting it as tmpfs, then performance improves to 3900 MiB/s. However, I believe that with this configuration swap is limited to half of RAM (in my case 16GB), which is the default size of tmpfs (a quick way to check this is sketched below). After that, my understanding is that Ubuntu will begin to use its own swap, which in my case is on the SSD. I don't believe the benchmark would ever cause this to happen, and I have not been able to create something that does. I also don't know what would happen if swap in this configuration exceeded tmpfs plus Ubuntu's swap (48GB in my case). Would there be errors, or would PI know it is out of swap space and take other (slower) options?

If I put swap on a RAM disk (16GB), performance is slightly better at about 4200 MiB/s. Of course this also limits swap size, and I'm not sure what happens if the 16GB is exceeded.

The best performance came from a 16GB RAM disk along with PI swap on the SSD, at 4400 MiB/s. This provides 32GB of swap space before any OS swapping needs to occur, so perhaps this is the best option. Note that unlike my experience in OS X, under Ubuntu combining a RAM disk with SSD swap does perform better than the RAM disk alone.

Also worth noting: the difference between the slowest RAM-based option and the fastest is only about 12%, and its impact on the total performance benchmark is only about 2%. And compared to the SSD alone, the total performance of the fastest RAM-based option is only 13% better even though the swap performance is almost three times higher! This raises the question of whether we are tweaking too much.
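Related to the tmpfs limit mentioned above, here is a quick Linux-only check of how big /tmp actually is and whether it is RAM-backed (paths and defaults vary by distribution, so treat it as a sketch):

import os

def describe(path="/tmp"):
    # Look up the filesystem type of the mount, if /tmp is its own mount.
    fstype = "not a separate mount"
    with open("/proc/mounts") as mounts:
        for line in mounts:
            device, mountpoint, kind = line.split()[:3]
            if mountpoint == path:
                fstype = kind
    st = os.statvfs(path)
    size_gib = st.f_blocks * st.f_frsize / 2**30
    print(f"{path}: {fstype}, {size_gib:.1f} GiB capacity")

describe()

If it reports tmpfs at roughly half your RAM, you are seeing the default sizing.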

Maybe Juan will drop back in and give his thoughts.

Offline Juan Conejero

  • PTeam Member
  • PixInsight Jedi Grand Master
  • ********
  • Posts: 7111
    • http://pixinsight.com/
Re: Benchmark meanings
« Reply #14 on: 2014 August 05 04:52:13 »
Sorry guys for taking so long to get back to you. I'm supposed to be on vacation so please bear with me!

Quote
The best performance came from a 16GB RAM disk along with PI swap on the SSD, at 4400 MiB/s. This provides 32GB of swap space before any OS swapping needs to occur, so perhaps this is the best option. Note that unlike my experience in OS X, under Ubuntu combining a RAM disk with SSD swap does perform better than the RAM disk alone.

PixInsight splits each swap file into equal chunks across the different swap directories specified via Preferences. Assuming that you set each swap directory on a different physical unit, the total amount of swap space that can be used is always the size of the smallest swap drive multiplied by the number of swap directories. So in the case you have described above you indeed have 32 GB of total swap space. This is a reasonable limit for many processing jobs of moderate complexity. For big images, such as drizzled images and large mosaics, and/or very complex processing work, 32 GB may easily become insufficient though. Once the swap disk space is exhausted you'll get multiple file access errors each time you attempt to process an image, but the PI Core application should remain stable. You can change or add swap directories dynamically, and they will be used the next time PI needs to write a new swap file. Of course, you should always try to avoid these situations by providing enough swap space in advance.
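In code form, that capacity rule looks like this (the drive sizes are just example figures):

def usable_swap_gib(*swap_dir_sizes_gib):
    # Each swap file is split into equal chunks across the directories,
    # so the smallest unit limits how much of the others can be used.
    return min(swap_dir_sizes_gib) * len(swap_dir_sizes_gib)

print(usable_swap_gib(16, 250))       # 16 GiB RAM disk + 250 GB SSD -> 32
print(usable_swap_gib(250, 250, 16))  # small RAM disk added to two SSDs -> 48

One consequence of this rule is that adding a small RAM disk to two large SSDs caps the usable space at 48 GB, well below what the SSDs alone could provide, so faster is not always roomier.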

Quote
In building the machine, will it be faster if:
it has 24 cores as opposed to 12?

Definitely yes. The more cores, the better performance, especially for highly parallelized processes such as CurvesTransformation (and in general, all point-level processing algorithms), Convolution, all multiscale processing tools, ImageIntegration, DrizzleIntegration, etc.

Quote
it has 64 GB of RAM as opposed to 32?

Yes. With large RAM you can use RAM disks for swap file storage very efficiently.

Quote
it has four SSDs as opposed to two?

Two SSDs are sufficient to achieve *very* fast swap I/O transfer rates, especially if you use Linux.

Quote
Does size matter in deciding on an SSD?

Larger SSDs are usually more durable and sometimes faster. See for example this review of the Samsung SSD 840 EVO in 120 GB, 250 GB, 500 GB, 750 GB and 1 TB models.

Quote
P.S. I don't do astrophotography or image-processing work, but I would like to comment on PixInsight as a program. It has a beautiful interface, and all its features are placed where one would expect.

Thank you very much, I really appreciate you saying this. If only most of our users had a similar vision. Many of them hate our user interface without reflection, just because it doesn't look like what they expect it to be (that is, like Adobe Photoshop) and because things are called by their real names, such as MorphologicalTransformation instead of "dust and scratches", or HistogramTransformation instead of "develop" or "give life", and so on. Of course, we have never been guided by popular acceptance or conventional wisdom.
Juan Conejero
PixInsight Development Team
http://pixinsight.com/