1. why did you change the file format to XISF?
2. what are the advantages (in simple language, please)?
Because we need to replace FITS with an efficient, open and free format. Here is a brief, non-comprehensive list of important problems that we are addressing with XISF:
* Lack of formal resources to define how image data have to be interpreted and represented. For example, application A stores floating point images in the range from -12345.123 to +6789.001, but this range is not declared publicly. No problem, as this can be done legally with FITS. Application B uses the range from 0 to 123456, also undeclared. B has no way to know how to interpret floating point pixel data written as FITS files by A, and vice-versa, so despite the fact that both applications implement the FITS standard, they have a serious interoperability problem. The developers of B are smart and, after wasting precious time that they should invest in creative development tasks, manage to discover by reverse engineering the black and white points of images written by A. That's great because now B can read FITS files written by A so B users using A data are happy. One day, the developers of A decide, for strictly technical reasons, to change image ranges in their next version, and again they don't publicize them. Suddenly, all B users who need to import images written by A are lost. This cannot happen---and will never happen---with XISF.
* Undefined physical disposition of pixel data. For example, a two-dimensional RGB color image can be stored as a three-dimensional image in a single header-data unit (HDU), with three contiguous blocks to store the red, green and blue channels. It can also be stored as three one-dimensional images in three HDUs, one for each channel. Or as four HDUs, one with all the metadata and no data and three with minimal metadata and one-dimensional channel data. It is unclear (to me at least) if it can also be stored as a one-dimensional sequence of groups of contiguous red, green and blue pixel samples in a single HDU. This "flexibility" can be nice, but FITS has no formal resources to define the organization of multichannel or vector-valued images. In XISF we have two pixel storage models (planar and normal models) formally defined without ambiguity.
* No color spaces. If an application stores a three-dimensional image (also known as a data cube), what is it? An RGB color image? A grayscale image with two alpha channels? HSV? CIE XYZ? CIE L*a*b*? We need colorimetrically defined color spaces to perform brightness/chromaticity separations and other essential tasks. The word "color" appears only once in the FITS Standard version 3.0 document to say this (section G.2.1):
"FITS data arrays contain elements which typically represent the values of a physical quantity at some coordinate location. Consequently they need not contain any pixel rendering information in the form of transfer functions, and there is no mechanism for color look-up tables."
* Lack of essential metadata and auxiliary data structures such as ICC color profiles, CFA patterns, image resolution parameters, image thumbnails, optimal visualization parameters, etc.
* Obsolete metadata and lack of Unicode support. Punched cards are cool, but as of 2015, we really need more than 7-bit ASCII, 8-character property names, and 80-byte metadata rows. We need full Unicode support, structured property identifiers, unlimited length names and values, and many more fundamental data types and data structures than 'logical', 'numeric', and 'character'. For example, if my name were "Iván Martínez Añejo", I would have to store it as something like this in FITS:
name="AUTHOR" value="'Ivan Martinez Anyejo'" comment="Not my real name!"
while in XISF:
<Metadata>
<Property id="XISF:Author" type="String">Iván Martínez Añejo</Property>
</Metadata>
* Rigid header-data sequential organization. Magnetic tapes also are very cool, but definitely not contemporary devices. For example, imagine a FITS file storing three images as three consecutive HDUs, as required by the FITS standard. If the file is available on a local hard disk this is not a practical problem: to access the headers of the second and third HDUs we need two file seek operations, but hard disks are very fast so we don't notice. However, what happens if the file is in a remote server, for example, being accessed as "http://www.example.com/foobar.fits"? We have to download the first HDU, including the whole first image (which we perhaps don't need), before starting to download the header of the second HDU, and the same for the third one. Special network interface applications can be created to improve this situation, but this can't be an optimal solution. With a monolithic XISF file, an application can download the entire XISF header, which contains all of the metadata including all image and property descriptions, in a single operation, without needing to read a single pixel. With a distributed XISF unit, we can download the header and then download just the required image data.
* Lack of a distributed storage model. Distributed XISF units store the header (metadata) and the data as separate files, including local and remote resources. This allows for flexible storage configurations that are not possible with FITS. Furthermore, an XISF data blocks file allows indexed access to images and data objects through symbolic identifiers. This means that XISF data blocks files can be reorganized and extended freely without invalidating existing XISF units that depend on them. Distributed XISF units are also relocatable, so one can transport them between file systems and machines.
* No effective data integrity and authentication protection. Digital signatures based on XML signatures, X.509 certificates and cryptographic checksums are formalized in the XISF specification. A digitally signed XISF unit is effectively sealed: it cannot be modified without invalidating its signature. (There is a registered FITS convention for checksum keywords, but it does not protect authenticity because checksums can be tampered with.)
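To make the two pixel storage models mentioned in the list above a bit more concrete, here is a small Python sketch. It is only an illustration, not code from PixInsight or from the XISF specification: the function names are made up for this example, and it assumes that "planar" means all samples of one channel stored together (all red, then all green, then all blue), while "normal" means the samples of each pixel stored together (R, G, B of the first pixel, then R, G, B of the second, and so on).

```python
def planar_to_normal(samples, channels):
    """Reorder planar samples (channel after channel) into
    normal order (pixel after pixel). Hypothetical helper."""
    if len(samples) % channels != 0:
        raise ValueError("sample count must be a multiple of the channel count")
    pixels = len(samples) // channels
    # For each pixel, pick its sample from every channel block in turn.
    return [samples[c * pixels + p] for p in range(pixels) for c in range(channels)]


def normal_to_planar(samples, channels):
    """Reorder normal samples (pixel after pixel) into
    planar order (channel after channel). Hypothetical helper."""
    if len(samples) % channels != 0:
        raise ValueError("sample count must be a multiple of the channel count")
    pixels = len(samples) // channels
    # For each channel, pick its sample from every pixel group in turn.
    return [samples[p * channels + c] for c in range(channels) for p in range(pixels)]


# A tiny RGB image with two pixels, written in planar order.
planar = ["R0", "R1", "G0", "G1", "B0", "B1"]
normal = planar_to_normal(planar, channels=3)
round_trip = normal_to_planar(normal, channels=3)
```

The data are the same in both cases; only the order on disk changes. The point is that a reader has to know which order was used, and that is what a formal definition guarantees.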
3. should I change (I am a real newcomer, remember)
If you are a PixInsight user, you'll be using XISF to store your images in a natural way because it is PixInsight's native file format. You should always keep your original XISF files because XISF provides resources that don't exist in other formats. Unfortunately, you'll have to export them in different formats (FITS, TIFF, etc.) to use them with other applications. Hopefully, more applications will adopt XISF in the near future, and we'll work hard for this to happen.
4. when I go to save a file there are seven options of file format to choose from (from 8-bit unsigned integer to 64-bit IEEE 754 floating point). I always choose the default (32-bit IEEE 754 floating point). Is this the correct one to use and if so why all the other options?
The 32-bit floating point format is the most logical option in most cases. The 32-bit unsigned integer and 64-bit floating point formats can be necessary to encode images with very large dynamic ranges, such as very deep linear HDR images generated with our HDRComposition tool. For all other images, 32-bit floating point should normally be used.
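As a rough illustration of why the larger formats can matter for very large dynamic ranges, the following Python sketch shows a general property of IEEE 754 numbers, not anything specific to PixInsight: a 32-bit float has a 24-bit significand, so above 2^24 it can no longer tell neighboring integer levels apart, while a 64-bit float still can. The helper name to_float32 is invented for this example.

```python
import struct


def to_float32(value):
    """Round a Python float (64-bit) to the nearest 32-bit IEEE 754 value.
    Hypothetical helper for this illustration."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


LIMIT = 2 ** 24  # 16777216: integers up to here are exact in 32-bit floating point

level_a = float(LIMIT)
level_b = float(LIMIT + 1)

# In 64-bit floating point the two levels stay distinct...
distinct_in_float64 = level_a != level_b

# ...but in 32-bit floating point they collapse to the same value.
distinct_in_float32 = to_float32(level_a) != to_float32(level_b)
```

For ordinary images this limit is far beyond what is needed, which is consistent with 32-bit floating point being the default choice.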