Author Topic: New PC: biggest bang for the buck?  (Read 123662 times)

Offline georg.viehoever

  • PTeam Member
  • PixInsight Jedi Master
  • ******
  • Posts: 2132
Re: New PC: biggest bang for the buck?
« Reply #135 on: 2011 April 19 14:04:15 »
Wade,

Quote
Do you use ECC memory? If yes, try to switch off ECC.

Yes, I do use ECC memory.  I'll see if there is a way to turn this feature off.


It appears unlikely that you have thermal problems. I mentioned ECC only because this costs a few percent performance compared to non-ECC (desktop class) systems. With 24 GBytes, it is not a bad idea to have ECC, so you should keep it switched on for normal operation.

Georg
Georg (6 inch Newton, unmodified Canon EOS40D+80D, unguided EQ5 mount)

Offline twade

  • PTeam Member
  • PixInsight Old Hand
  • ****
  • Posts: 445
    • http://www.northwest-landscapes.com
Re: New PC: biggest bang for the buck?
« Reply #136 on: 2011 April 19 20:43:25 »
To all,

Here are the results after changing the settings Juan suggested in an earlier reply.

Threads    Non-Parallel    Parallel
  24T          67.70         10.72
  12T          39.50         10.20
   6T          39.88         11.80
   4T          44.17         13.96

These values were measured with the Enable Thread CPU Affinity option switched on, which saved about 1 second.  It appears the sweet spot is 12T.  I'm not sure why this is, but it seems to be the case for this version of PixInsight and my processors.  What's really interesting is the severe penalty for using all the threads in the non-parallel test.
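
The timings in the table above can be turned into a quick comparison with a few lines of Python. This is only a sketch: the numbers are copied from the table, they are assumed to be seconds, and the variable names are made up for illustration.

```python
# Timings copied from the table above (assumed to be seconds).
# Keys are thread counts; values are (non_parallel, parallel).
timings = {
    24: (67.70, 10.72),
    12: (39.50, 10.20),
    6: (39.88, 11.80),
    4: (44.17, 13.96),
}

# Thread count with the lowest time for each benchmark.
best_non_parallel = min(timings, key=lambda t: timings[t][0])
best_parallel = min(timings, key=lambda t: timings[t][1])

# How much slower the 24-thread non-parallel run is than the 12-thread run.
penalty_24_vs_12 = timings[24][0] / timings[12][0]

for threads, (non_par, par) in timings.items():
    print(f"{threads:>2}T  non-parallel {non_par:6.2f}  parallel {par:6.2f}")
print(f"best: {best_non_parallel}T (non-parallel), {best_parallel}T (parallel)")
print(f"24T vs 12T, non-parallel: {penalty_24_vs_12:.2f}x slower")
```

With these numbers, 12T comes out fastest in both columns, and the 24T non-parallel run takes roughly 1.7 times as long as the 12T run.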

I appreciate everybody's suggestions during this adventure.  I look forward to the next version of PixInsight so I can "play" more.

Wade

Offline Catanonia

  • Newcomer
  • Posts: 26
Re: New PC: biggest bang for the buck?
« Reply #137 on: 2011 April 27 04:00:40 »
I am running a water-cooled i7 920 (4 cores, 8 threads) clocked to 4.2 GHz.

Also a fast RAID 0 (striped) HD setup.

A real boost for PI and makes life much easier.


Offline Louis Mamakos

  • Newcomer
  • Posts: 6
Re: New PC: biggest bang for the buck?
« Reply #138 on: 2011 April 29 09:45:54 »
Quote
What's really interesting is the severe penalty for using all the threads in the non-parallel test.

With too many threads, you're probably thrashing the L1 cache and perhaps L2 cache in the CPUs as the threads each get scheduled for execution.  The local data for each thread will tend to evict the local data for another thread that's probably doing the same work.  You take a big hit when your execution thread gets a cache miss and has to go to the next level of cache or worse, all the way to main memory to retrieve the memory the program is referencing.

With fewer threads, closer to the actual number of execution cores, there will tend to be more locality of reference and higher cache hit rates as those same threads stay resident on the CPU and its cache.  You can see a little evidence of this in the effect the thread CPU affinity switch has on the timing.

Just a guess.
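
The guess above can be illustrated with a toy model that asks whether the combined working sets of all threads sharing one core still fit in that core's cache. Everything numeric here is an assumption for illustration only: 12 physical cores, 32 KiB of L1 data cache per core, and a made-up 20 KiB working set per thread. It says nothing about what PixInsight actually does.

```python
import math


def threads_per_core(total_threads: int, physical_cores: int) -> int:
    """Worst-case number of threads that end up sharing one physical core."""
    return math.ceil(total_threads / physical_cores)


def working_set_fits(total_threads: int, physical_cores: int,
                     working_set_kib: float, cache_kib: float) -> bool:
    """True if all threads sharing a core fit their data into that core's cache."""
    shared = threads_per_core(total_threads, physical_cores)
    return shared * working_set_kib <= cache_kib


PHYSICAL_CORES = 12    # assumption: two six-core CPUs
L1D_KIB = 32           # assumption: L1 data cache per core
WORKING_SET_KIB = 20   # hypothetical per-thread working set

for n in (4, 6, 12, 24):
    fits = working_set_fits(n, PHYSICAL_CORES, WORKING_SET_KIB, L1D_KIB)
    print(f"{n:>2} threads: {threads_per_core(n, PHYSICAL_CORES)} per core, fits in L1: {fits}")
```

In this model only the 24-thread case puts two threads on one core and overflows the cache, which is the kind of cliff the timings suggest. Real caches and schedulers are far more complicated than this.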

Offline Louis Mamakos

  • Newcomer
  • Posts: 6
Re: New PC: biggest bang for the buck?
« Reply #139 on: 2011 April 29 09:50:55 »
Quote
Quote
Do you use ECC memory? If yes, try to switch off ECC.

Yes, I do use ECC memory.  I'll see if there is a way to turn this feature off.

I don't get this; why would you ever turn ECC off and trade reliability for just a few percent increase in performance?  We spend all this money on cameras and image acquisition and processing software to extract as much signal as possible out of the data we acquire from photons that come from halfway across the universe.  And then risk undetected corruption by turning off ECC?

That's the one thing that really annoys me about my otherwise wonderful iMac computer; I can't populate it with ECC memory and I wonder how many bits (out of 12GB of DRAM) have rotted away, mostly unnoticed.  I mean, we surely see evidence of a similar phenomenon in our raw CCD images due to "cosmic rays" and other decay events dropping a few extra electrons here and there.
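
For anyone wondering how memory can repair a flipped bit at all, here is a minimal sketch of the textbook Hamming(7,4) code, which corrects any single-bit error in a 7-bit codeword. This is an illustration of the principle only: as far as I know, real ECC modules use wider codes over 64-bit words, and the function names below are made up.

```python
def encode(data_bits):
    """Encode 4 data bits into a 7-bit Hamming codeword (positions 1..7)."""
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4   # parity over positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4   # parity over positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4   # parity over positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def correct(codeword):
    """Return (corrected codeword, 1-based error position, or 0 if no error was seen)."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    position = s1 + 2 * s2 + 4 * s4   # the syndrome is the position of the bad bit
    if position:
        c[position - 1] ^= 1          # flip the bad bit back
    return c, position


stored = encode([1, 0, 1, 1])
damaged = list(stored)
damaged[4] ^= 1                       # simulate one flipped bit (position 5)
repaired, where = correct(damaged)
print("stored:  ", stored)
print("damaged: ", damaged)
print("repaired:", repaired, "error at position", where)
```

Without the parity bits, the flipped bit in `damaged` would simply be read back as valid data, which is the "undetected corruption" mentioned above.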
  

Offline georg.viehoever

  • PTeam Member
  • PixInsight Jedi Master
  • ******
  • Posts: 2132
Re: New PC: biggest bang for the buck?
« Reply #140 on: 2011 April 29 10:05:45 »
Quote
Quote
Do you use ECC memory? If yes, try to switch off ECC.

Yes, I do use ECC memory.  I'll see if there is a way to turn this feature off.

I don't get this; ...  

See http://pixinsight.com/forum/index.php?topic=1052.msg20629#msg20629 . This was intended as an attempt to drill down to the cause of the issue.

Georg
Georg (6 inch Newton, unmodified Canon EOS40D+80D, unguided EQ5 mount)

Offline georg.viehoever

  • PTeam Member
  • PixInsight Jedi Master
  • ******
  • Posts: 2132
Re: New PC: biggest bang for the buck?
« Reply #141 on: 2011 May 02 01:19:45 »
Quote
Quote
What's really interesting is the severe penalty for using all the threads in the non-parallel test.

With too many threads, you're probably thrashing the L1 cache and perhaps L2 cache in the CPUs as the threads each get scheduled for execution.  ...

You are right in saying that more cores sometimes mean *less* performance - anybody who has ever experienced the slowness of a large discussion group compared to an effective task force with few people will probably agree. However, a clever OS would keep the tasks on already used cores as much as possible - and apparently Linux does a better job than Windows in this respect. Having said that: sometimes you need to help the OS in doing this, by using suitable parallel programming constructs or specific annotations. This, however, is tedious work, and it is rarely successful for all possible combinations of CPUs, sockets, RAM channels, ...
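
One concrete way of "helping the OS" is to set CPU affinity explicitly. The sketch below uses Python's os.sched_getaffinity and os.sched_setaffinity, which exist only on some platforms (notably Linux), so the code checks for them first. The helper name pin_to_cpu is made up, and this is not a description of how PixInsight implements its thread affinity option.

```python
import os


def pin_to_cpu(cpu_index: int) -> set:
    """Try to restrict the calling process to a single CPU.

    Returns the set of CPUs the process may run on afterwards.
    Returns an empty set if the platform has no affinity API.
    """
    if not hasattr(os, "sched_setaffinity"):
        return set()                      # e.g. Windows or Mac OS X
    allowed = os.sched_getaffinity(0)     # 0 means "the calling process"
    if cpu_index not in allowed:
        return set(allowed)               # leave the affinity untouched
    os.sched_setaffinity(0, {cpu_index})
    return set(os.sched_getaffinity(0))


if hasattr(os, "sched_getaffinity"):
    original = os.sched_getaffinity(0)
    first_cpu = min(original)
    print("pinned to:", pin_to_cpu(first_cpu))
    os.sched_setaffinity(0, original)     # restore the original CPU set
else:
    print("CPU affinity API not available on this platform")
```

A pinned worker keeps reusing the same core's caches, at the price of the scheduler no longer being able to move it to an idle core. That trade-off is part of why this rarely works well on every machine.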

Georg
Georg (6 inch Newton, unmodified Canon EOS40D+80D, unguided EQ5 mount)

Offline Yuriy Toropin

  • PixInsight Addict
  • ***
  • Posts: 209
Re: New PC: biggest bang for the buck?
« Reply #142 on: 2011 May 13 05:41:31 »
Intel Core 2 Quad Q9650 (@ 3 GHz), 8 GB RAM (@ 1066 MHz), 1 TB Samsung HDD,
Win 7 Ultimate (64-bit),
PI 1.05.09.0561 eng (x86-64)

It takes 13.9 sec (console open) to process - old set.
New BenchmarkParallel PSM
7.2 sec w/console opened, on average, individual runs took 6.8...7.8 sec,
6.9 sec on average with console closed

Guys, need your opinions,

Back then, my Core 2 Quad Q9650 performed just fine.
Now that I'm trying to process 2x2 or 3x2 mosaics or "overlaps" combining data from 2 astroimaging setups working in parallel, I feel the "Need for Speed".

An Intel Core i7-970 with 12 threads looks like a good choice for the "calculation engine" of the new rig, promising a ~2 to 2.5x speed improvement, but...

Are there other options? A GeForce 570 is useless until CUDA support is added to PixInsight (Juan...  ::))
Are there any "dedicated calculation units" that PixInsight will support under Win 7?

Or is a multithreaded i7 the only option?

PS: below is an example of the "IFN overlap" I'm working on right now. Alignment and integration of ~50 16 Mpix pictures is... well, slow.

Offline twade

  • PTeam Member
  • PixInsight Old Hand
  • ****
  • Posts: 445
    • http://www.northwest-landscapes.com
Re: New PC: biggest bang for the buck?
« Reply #143 on: 2011 May 30 18:47:18 »
To all,

I don't know what Juan did in version 1.7, but I'm seeing a significant improvement.  I still have to limit the threads to 12, but I can live with that.  :)

Here are the results:

Dual processor Xeon X5650 2.66 GHz:
   Benchmark_M74:           12.48
   Benchmark_M74_parallel:   6.035

What's really impressive is the lack of variability between runs.  For example, the non-parallel execution times varied by only 0.03 seconds from the mean.  I had a lot more variability with v 1.6.
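
For anyone who wants to report the same kind of figure, the mean and the largest deviation from it are easy to compute. The run times in this sketch are hypothetical placeholders, not the actual measurements, and the function name is made up.

```python
import statistics


def summarize_runs(times):
    """Return (mean, largest absolute deviation from the mean) for a list of run times."""
    mean = statistics.mean(times)
    max_deviation = max(abs(t - mean) for t in times)
    return mean, max_deviation


# Hypothetical run times in seconds, for illustration only.
runs = [12.45, 12.48, 12.51]
mean, max_deviation = summarize_runs(runs)
print(f"mean {mean:.2f} s, largest deviation {max_deviation:.2f} s")
```

Reporting both numbers makes results from different machines easier to compare than a single best time.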

Awesome job Juan!!!

I do get the following error when loading the non-parallel process icon.  Hopefully, this is not having an impact on the overall results.

Reading PSM resource:
I:/Benchmark/Benchmark_M74.psm
*** Error: Deconvolution: Unknown parameter: linear
*** Error: HDRWaveletTransform: Unknown parameter: scalingFunctionKernelSize
*** Error: ATrousWaveletTransform: Unknown parameter: scalingFunctionKernelSize
*** Error: ATrousWaveletTransform: Unknown parameter: scalingFunctionNoiseLayers
0 icon(s) loaded.

Wade

Offline Juan Conejero

  • PTeam Member
  • PixInsight Jedi Grand Master
  • ********
  • Posts: 7111
    • http://pixinsight.com/
Re: New PC: biggest bang for the buck?
« Reply #144 on: 2011 May 31 00:18:02 »
Quote
I don't know what Juan did in version 1.7

Magic ...  :o

Seriously, I've rewritten a lot of low-level routines, especially the parallel convolution and FFT routines which are at the heart of wavelets and other transformations widely used in PixInsight. The use of Visual C++ 2010 has also had a significant impact on performance for Windows versions of PI. VC++ 2010 seems to produce better optimized code than VC++ 2008. Nevertheless, I must say that your results are awesome. I didn't expect such a big performance improvement, so I'm very happy to see those numbers :)
Juan Conejero
PixInsight Development Team
http://pixinsight.com/

Offline Harry page

  • PTeam Member
  • PixInsight Jedi Knight
  • *****
  • Posts: 1458
    • http://www.harrysastroshed.com
Re: New PC: biggest bang for the buck?
« Reply #145 on: 2011 May 31 12:47:33 »
Hi

I got my new i7 laptop this week with 8 gig of mem windoz 7

I concur 1.7 is so much faster

1.6 parallel set = 9.4 sec

1.7 parallel set = 6.1 sec

cool  8)

Harry
Harry Page

Offline Andres.Pozo

  • PTeam Member
  • PixInsight Padawan
  • ****
  • Posts: 927
Re: New PC: biggest bang for the buck?
« Reply #146 on: 2011 June 01 12:12:39 »
I can confirm that the version 1.7 is MUCH faster than 1.6

               Benchmark_M74    Benchmark_M74_parallel
Version 1.6        25.28 s             8.55 s
Version 1.7         8.61 s             5.13 s

Second best time in four runs
Intel Core i7 2600K @ 3.4GHz
8GB RAM
Win 7 64-bit
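
Expressed as speedup factors, the 1.6 and 1.7 times reported in this post work out as follows. The sketch only divides the numbers given above; the variable names are made up.

```python
# (version 1.6 time, version 1.7 time) in seconds, copied from this post.
results = {
    "Benchmark_M74": (25.28, 8.61),
    "Benchmark_M74_parallel": (8.55, 5.13),
}

# Speedup = old time / new time; values above 1 mean version 1.7 is faster.
speedups = {name: old / new for name, (old, new) in results.items()}

for name, factor in speedups.items():
    print(f"{name}: {factor:.2f}x faster in 1.7")
```

So on this machine the non-parallel set gains much more (about 2.9x) than the parallel set (about 1.7x).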


Offline Carlos Milovic

  • PTeam Member
  • PixInsight Jedi Master
  • ******
  • Posts: 2172
  • Join the dark side... we have cookies
    • http://www.astrophoto.cl
Re: New PC: biggest bang for the buck?
« Reply #147 on: 2011 June 01 12:37:24 »
Huge improvement here too :) At work, these were the previous numbers:

Intel Core 2 Quad 2.40 GHz - 4 GB RAM
Results
- Benchmark: 43.77s
- Parallel B. : 26.63s
(average of three) std console behaviour

Now:
- Benchmark: 28.53
- Parallel B. : 17.65
console always opened.

Congrats Juan :)
Regards,

Carlos Milovic F.
--------------------------------
PixInsight Project Developer
http://www.pixinsight.com

Offline Carlos Milovic

  • PTeam Member
  • PixInsight Jedi Master
  • ******
  • Posts: 2172
  • Join the dark side... we have cookies
    • http://www.astrophoto.cl
Re: New PC: biggest bang for the buck?
« Reply #148 on: 2011 June 05 16:37:49 »
Juan, are you using Intel-specific optimizations? I ask because I saw no gain at home, with the AMD Phenom II X6 CPU. The 1.6 and 1.7 tests show almost the same values (in fact, 1.7 is a bit slower now, by nearly 1 sec for both benchmarks).
Regards,

Carlos Milovic F.
--------------------------------
PixInsight Project Developer
http://www.pixinsight.com

Offline Juan Conejero

  • PTeam Member
  • PixInsight Jedi Grand Master
  • ********
  • Posts: 7111
    • http://pixinsight.com/
Re: New PC: biggest bang for the buck?
« Reply #149 on: 2011 June 05 17:02:14 »
Hi Carlos,

No, I am not using Intel-specific optimizations, at least not in any way that I can control directly. For UNIX/Linux the whole of PI has been built with GCC 4.4.5 (Linux and FreeBSD) and GCC 4.2.1 (Mac OS X) with '-mtune=generic' as part of the optimization compiler flags. On Windows PI has been built with Visual C++ 2010, which, unlike GCC, does not allow control over machine-specific optimizations.

So the results you're getting with your AMD processor are very strange. PI 1.7 should be approximately as much faster there as it is on Intel machines. If it is slower, then something is going wrong and I definitely want to know what it is.

Can anybody else report benchmarks on AMD-based machines? (We have none!)
Juan Conejero
PixInsight Development Team
http://pixinsight.com/