New Tool Released: SpectrophotometricColorCalibration (SPCC)

Yes. The GDR3 files are needed for plate solving, to acquire the coordinates of your image; the GDR3SP files are for colour calibration. You only need to download enough of the GDR3 files to ensure that your images have adequate star coverage; with wide-field images the first catalogue may be sufficient.
 
Is there somewhere else that I can download Gaia from? I have been trying to download it from the PixInsight website for days. I'm only doing the small one and I just can't get file 03 to download at all. It keeps coming up with a network error. Sometimes I only get to 0.1 GB downloaded before it fails, and other times I have got as far as 1.6 GB (after a couple of hours). It's got to be easier than this. I've had friends try to download it on their computers and none have been able to get it either.
 
I am getting the error pasted below. SPCC can't find the Gaia database. Both the Gaia process and the ImageSolver script can find it and use it. I did start by setting up the Gaia process and pointing it to the new Gaia database (small version). Can you please advise how to show SPCC where the database is?

SpectrophotometricColorCalibration: Processing view: M42_RGB_Combo_Image74_ABE
Writing swap files...
284.761 MiB/s

* Retrieving Gaia DR3/SP database information

run -x "C:/Users/der94/AppData/Local/Temp/SPCC_EVW41DRGF3Z0.js"

Processing script file: C:/Users/der94/AppData/Local/Temp/SPCC_EVW41DRGF3Z0.js
*** Error: No database files have been selected.

*** Error: Database files not available for the Gaia DR3/SP catalog, or the XPSD server is not properly configured.
Reading swap files...
1806.499 MiB/s
<* failed *>
 
As they used to say on Saturday Night Live: "Never mind"
I found (above) the following: Just follow the instructions in the first post; they are very clear about the steps to follow. And in Gaia make sure you select Data Release > Gaia DR3/SP (if you select only DR3 it will not work).
It turns out I had set up Gaia to look at DR3, not DR3/SP. Yikes!
SPCC now ran to completion.
On to the next challenge: learning how to use it correctly!

Pieter
 
While honestly appreciative of all the work expended to get to this latest release, I also think it's fair to question the decision to have thousands of people trying to download 60GB of data simultaneously. I've experienced nothing but trouble getting these files downloaded (and I'm now at multiple days of effort trying). In the future, I would hope that PI developers could consider the pain they're inflicting on users as one of the factors before eliminating backward compatibility and/or releasing significantly "different" operations models that folks must adjust to. From what has been posted to YouTube so far (admittedly only an early look and minimal opinions), the boost doesn't seem to measure up to the pain inflicted.... Let's hope all this cleans itself up soon and then hopefully PI developers can be a bit more aware of this kind of decision pain going forward. Thanks for considering these comments.
 
I've had no problems downloading the files. Even now I have tried downloading the whole complete set again and it was about 10 minutes per file. It probably depends on where you are downloading from; it does not seem to be overloaded.
 
Can someone please reply to this? I am not able to download all of Gaia; I keep getting network errors multiple times before the download completely stops and I have to start again. I have checked my internet speed and it's not my end. This is crazy; an update shouldn't cause this much grief. Next question: can I get the old PCC back? It worked fine.
 
Well... clearly there are (from other posts) other people experiencing the same problems. I experience "server errors", "network errors", and "unknown errors" in Windows 10. I live in a remote area of the US (and for those excellent skies I pay the price of lower download speeds), but somewhere in the network between the server and my house, throttling is occurring. I've run a network speed test and see poor (7-10 Mb/s, typical DSL) download speeds, but what I have been receiving from the download server is measured at 100-900 KB/s (average 400-500 KB/s). At that speed, it takes 1-3 hours to download 3 GB. The best I've done on some of the files is ~1 hour. I suspect there's also a timeout value on the server that's catching as well. Why would we see "server errors"? It's a mystery, but in any case, for me and others with low download speeds, this idea of downloading that amount of data hasn't worked out too well.
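
For reference, here is the arithmetic behind that "1-3 hours per 3 GB" figure; the file size and speeds are the ones I measured above, and the little Python snippet is only illustrative:

# Rough time-to-download estimate at the throughput reported above.
file_size_gb = 3.0                    # approximate size of one catalog file
for kb_per_s in (100, 450, 900):      # observed throughput range, in KB/s
    seconds = file_size_gb * 1024 * 1024 / kb_per_s
    print(f"{kb_per_s:3d} KB/s -> {seconds / 3600:.1f} hours per file")
# 100 KB/s -> ~8.7 h, 450 KB/s -> ~1.9 h, 900 KB/s -> ~1.0 h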
 
and FWIW, the last two days have been particularly bad. We were getting 10x the download speed two days ago. Today and yesterday have been essentially show-stopper events: I can't get past the server error / network error messages and must start over from scratch. The server should support resumable (partial) downloads, so a transfer can continue from where it left off after an error.
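
A client can often resume a partial file itself if the server honours HTTP Range requests; whether this CDN does so is an assumption on my part, and the URL and file name below are placeholders. A minimal Python sketch:

import os
import requests

def resume_download(url, dest, chunk_size=1 << 20):
    # Download url to dest, resuming from an existing partial file if possible.
    # Assumes the server honours HTTP Range requests (replies with 206).
    done = os.path.getsize(dest) if os.path.exists(dest) else 0
    headers = {"Range": f"bytes={done}-"} if done else {}
    with requests.get(url, headers=headers, stream=True, timeout=60) as r:
        r.raise_for_status()
        if done and r.status_code != 206:
            done = 0  # server sent the whole file again; start over
        with open(dest, "ab" if done else "wb") as f:
            for chunk in r.iter_content(chunk_size):
                f.write(chunk)

# Placeholder URL -- not the real catalog location.
# resume_download("https://example.com/GaiaDR3SP-small-03.xpsd", "GaiaDR3SP-small-03.xpsd")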
 
I understand your frustration, but what can we do? We cannot distribute these files from our corporate server (pixinsight.com) because the required bandwidth exceeds its capabilities by many orders of magnitude. This is true even though we recently migrated our server to a much more capable machine. So we are using a content delivery network (CDN) to distribute these large databases because we have no other option. Our CDN service is KeyCDN because it is our best option in cost/performance terms. Of course, KeyCDN could be better. There can be some locations where KeyCDN cannot provide an optimal local file server. What can we do? We could provide these files on physical media, such as pen drives, by ordinary mail. Unfortunately, we don't have the required infrastructure to offer this service.

For your information, our CDN has transferred more than 110 TB since the release of SPCC. This is costing us money and will cost us much more during the coming months, but we are happy to be able to provide these state-of-the-art astronomical databases to our users at no cost.
 
Hi Juan, I understand the dilemma. Considering your comment about 110 TB transferred, this amounts to ~1800 people with full databases now. That doesn't factor in all of us who have been transferring files that aren't completing and thus having to restart transfers (reducing the overall number by ??). So, only the PI team knows how many PI licenses are active, but I'm guessing that if most of the licensees try to upgrade, that 110 TB number is going to have to go MUCH, MUCH higher. An educated guess on the slow transfers is that the CDN must be the bottleneck and must be throttling due to all the traffic. Unfortunately, when combined with a network/server timeout, the throttling works against itself in that files fail before completing, thus exacerbating the situation. If the CDN is charging PI per TB transferred, then a fair question might be how much of that transfer cost is actually for failing transfers (and should you have to pay for it!??).
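
(The ~1800 figure follows directly from the numbers quoted above; the one-line Python check below is only a sanity check.)

transferred_tb = 110      # CDN transfer reported since the SPCC release
full_set_gb = 60          # approximate size of the complete database set
print(round(transferred_tb * 1000 / full_set_gb))   # ~1833 complete-set downloads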
 
Make any download that is bigger than some arbitrary size (e.g. 500 MB) a torrent instead of relying solely on a CDN. Using torrents to distribute the catalogs/databases would relieve a lot of stress on the network backbone once a few folks have downloaded and begun seeding them.

EDIT: To clarify, what I mean is using torrents in addition to the CDN. Torrents will relieve network stress when big updates take place. Torrents also allow download speeds to accelerate, getting the data to the end user faster. The CDN option for downloading would still exist in addition to the torrents. In the end it may be cheaper to only use torrents, but I would say stick with CDN-only for things like the actual software and any unique files that PI owns. Since the database files are the real hogs in the scenario, make those available via CDN or torrents.
 
Perhaps in this situation you should have given people a couple of weeks' warning, letting them know of the pending change and the large database download required. This would have given folks some time to get the files downloaded before the update.
 
An educated guess on the slow transfers is that CDN must be the bottleneck and must be throttling due to all the traffic. Unfortunately, when combined with a network/server timeout, the throttling works against itself in that files fail before completing, thus exacerbating the situation.

A service like KeyCDN has no problem at all managing the traffic that we are generating with catalog downloads. So this problem is not being caused by high traffic. The total bandwidth that we need is more like a drop in a pool compared to the requirements of large companies using their services.

If CDN is charging PI for transfer TBs, then a fair question might be how much of that transfer bandwidth cost is actually for failing transfers (and should you have to pay for it!??).

All bytes transferred count, of course, and we must pay for every one of them. Your question is impossible to answer precisely, but from our interaction with user support, the number of failed transfers should be relatively small, probably at most 2%.
 
Thank you for your suggestion. Unfortunately, we cannot rely on peer-to-peer file sharing to distribute any files. As noted before, we are not going to take the risk of spreading corrupted files or files with uncontrolled contents. We are responsible for the data we distribute to our users and take this responsibility very seriously.

As I have explained above, the problems some users are experiencing downloading these large files are not caused by excessive traffic since our CDN service has much more than the required capability to transfer them. For example, I've just verified that I can download many of these database files at 30 GB/s. The slowest download in my test has been 8 GB/s.

We are now considering several options to solve this problem. We'll inform you when we decide in this regard.
 
I understand and appreciate the concern over data integrity. Torrent clients offer many features to help verify the integrity of the files being distributed, but of course a bad actor determined to cause issues still could.

One alternative suggestion I have is to strike up an agreement with perhaps 2-3 major universities globally that are willing to host the data in addition to PI's own server. Obviously the original files would come from PI, with server rights restricted so that only PI staff or the host's server administrator can make any changes; that would at least maintain a tight chain of custody for the files. If successful, the university hosts could take some of the traffic off the PI server; the added bonus would be the potential for faster transfer speeds.
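
On the integrity point, one mitigation that works with any distribution channel (CDN, mirror, or torrent) is publishing a checksum next to each file so users can verify what they downloaded. Whether checksums are currently published for these catalogs is an assumption on my part, and the file name and digest below are placeholders; a Python sketch of the idea:

import hashlib

def sha256_of(path, block_size=1 << 20):
    # Hash the file in 1 MiB blocks so huge catalog files don't need to fit in RAM.
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(block_size), b""):
            h.update(block)
    return h.hexdigest()

# Placeholder values -- substitute the digest published by the distributor.
expected = "<published sha256 digest>"
actual = sha256_of("GaiaDR3SP-small-03.xpsd")
print("OK" if actual == expected else f"MISMATCH: {actual}")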
 
As I have explained above, the problems some users are experiencing downloading these large files are not caused by excessive traffic since our CDN service has much more than the required capability to transfer them. For example, I've just verified that I can download many of these database files at 30 GB/s. The slowest download in my test has been 8 GB/s.
Are these stats correct? 30 gigabytes per second roughly translates to 240 Gbps. That is 6x as fast as Thunderbolt 4 and more than 4x as fast as the fastest PCIe SSD currently.
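
The unit conversion, for anyone checking along (the 30 GB/s figure is from the quoted post; Thunderbolt 4's 40 Gbps ceiling is its published spec):

gb_per_s = 30
gbps = gb_per_s * 8      # gigabytes/s -> gigabits/s
print(gbps)              # 240 Gbps
print(gbps / 40)         # 6.0x a 40 Gbps Thunderbolt 4 link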
 
Not sure I am posting this in the correct place but it does have to do with my experience to date with SPCC.

First off, I experienced no difficulty implementing it on my PC, nor downloading the complete 60GB database to do so.
Initially, I was delighted as results seemed satisfactory and were not difficult to obtain.

But, when I started experimenting with adding Ha data into RGB galaxy images . . . it seemed the wheels fell off SPCC.

As I was using PM expressions developed by others, I at first blamed my procedure as I admittedly didn't really know what I was doing.

Several attempts adding Ha to the R channel of my M31 data, using different PM expressions, kept turning up a greenish turquoise hue when doing the final combine.

I finally began to suspect it was actually SPCC causing this, so I resorted to a simple use of the BackgroundNeutralization tool and, bam!, the problem went away and the colours were more as they should be. Further experimenting confirmed this had nothing to do with the PM expressions I was using. The greenish cast seems to originate with SPCC in this instance. A simple application of BN to the RGB combination shows more appropriate colours than does colour calibration with SPCC.
I attach screenshots. The top panel is the SPCC result. The bottom a simple BN application to the RGB combine. All images are linear with STF.

No idea as to the source of this behaviour. My filters are entered correctly in the SPCC pane and the selection of Average Spiral Galaxy for white reference is surely appropriate in this instance?

Have there been any others experiencing "wonky" colours after using SPCC?

PS: I redid SPCC using G2V for the white reference and was much more satisfied with the results. But it is surprising that the choice of white reference should be so critical. I had hoped SPCC would be more, for want of a better word, "objective." It appears, at least in this case, that choosing the white reference is critical to good results. I do not recall PCC ever delivering a "bad" colour correction.
 

Attachments: SPCC.JPG