XISF
Extensible Image Serialization Format (XISF, pronounced [ɛksˈɪsf]) is the native file format of PixInsight. It is a free, open format for storage, management and interchange of digital images and associated data.
What Is XISF?
Although XISF was originally conceived and implemented as the native file format of PixInsight, it is intended to be much more than that. Our hope is that XISF serves as an efficient tool for the development of modern imaging software, including not only software specialized in astronomy, but also image processing software in a wide range of technical and general fields.
Two key elements in the design of XISF can be found in its title: extensible and serialization. Extensibility is crucial to adapt the format easily and efficiently to the requirements of present and future software applications. The architecture of XISF has to facilitate the development of extensions to the core format specification, and for this purpose XISF headers are standard XML documents. Serialization denotes the ability of XISF to store not just image data, but also data structures associated with the environments where the images evolve as living objects. These data structures can be deserialized to recreate the images along with their working contexts. We formalize the resources to store data structures and objects as properties of a variety of predefined data types. XISF properties can be directly associated with images, with entire XISF units, or be defined as standalone components.
XISF is a free format open to the contributions of anyone interested, including users of PixInsight and other applications, as well as individuals and groups from other development teams, institutions and companies involved or interested in image processing software.
XISF Resources
The XISF version 1.0 specification document has been released officially after three years of working drafts, and is now available on this website:
The reference XISF implementation in the C++ programming language is part of the PixInsight Class Library (PCL) distribution, which can be used freely to include XISF support in any application:
For discussion on the XISF format, its specification and implementations, feel free to join us at PixInsight Forum:
Implementing XISF Support in Your Application
Supporting monolithic XISF files is easy with our reference C++ implementation: for any C++ application, it only requires linking statically to our PixInsight Class Library. With other languages, such as C#, creating a wrapper DLL is relatively simple.
The PixInsight Class Library (PCL) has been released under a liberal BSD-like license, so using our reference XISF implementation does not compromise your source code or the way you distribute your application at all.
The PCL XISF support classes are powerful and easy to use. They do all of the complex work for you, and are readily available on FreeBSD, Linux, macOS and Windows platforms.
For example, reading an image in 32-bit floating point format can be as simple as:
    #include <pcl/XISF.h>

    using namespace pcl;

    FImage image;
    XISFReader xisf;
    xisf.SetLogHandler( new MyXISFLogHandler );
    xisf.Open( "/path/to/file.xisf" );
    xisf.ReadImage( image );
    xisf.Close();
where MyXISFLogHandler is a simple class that you can define to receive informative, warning and recoverable error messages:
    #include <iostream> // std::cout, std::cerr

    class MyXISFLogHandler : public XISFLogHandler
    {
    public:

       // Route each message to stdout or stderr according to its severity.
       virtual void Log( const String& text, message_type type )
       {
          switch ( type )
          {
          default:
          case XISFMessageType::Informative:
             std::cout << text;
             break;
          case XISFMessageType::Note:
             std::cout << "* " << text;
             break;
          case XISFMessageType::Warning:
             std::cerr << "** " << text;
             break;
          case XISFMessageType::RecoverableError:
             std::cerr << "*** " << text;
             break;
          }
       }
    };
Our reference implementation is fully documented here:
And here is a command-line utility program for management and inspection of XISF files:
If you are interested in implementing XISF support in your application, let us know and we'll be glad to help.
Why XISF?
Because we need a contemporary, efficient, interoperable, really flexible and extensible format to store images and their associated data structures.
Astronomical image processing applications, including PixInsight before XISF, have been using mainly the Flexible Image Transport System (FITS) format for decades. Unfortunately, FITS does not have any of the properties enumerated in the preceding paragraph. It is an obsolete, loosely defined format, has a rigid architecture, causes too many interoperability problems, and does not provide the required support and features to manage images efficiently.
Here is a brief, non-comprehensive list of important problems that we have been experiencing with FITS:
- Lack of formal resources to define how images are to be interpreted and represented. For example, application A stores floating point images in an arbitrary range, say from -12345.123 to +6789.001. This can be done legally with FITS. Application B uses the range from 0 to 123456. B has no way to know how to interpret floating point pixel data written as FITS files by A, and vice versa. Even if the developers of B manage to work out the ranges used by A after a lot of trial-and-error work, nothing stops A from changing its data ranges at any moment. So despite the fact that both applications implement the FITS standard strictly, they have a serious interoperability problem caused by a loosely defined format. Interoperability issues can be purely accidental, or intentional, as an easy way to block data migration among applications. This cannot happen, and will never happen, with XISF.
- Undefined physical disposition of pixel data. For example, a two-dimensional RGB color image can be stored as a three-dimensional image in a single header-data unit (HDU), with three contiguous blocks to store the red, green and blue channels. It can also be stored as three one-dimensional images in three HDUs, one for each channel. Or as four HDUs, one with all the metadata and no data and three with minimal metadata and one-dimensional channel data. It is unclear if it can also be stored as a one-dimensional sequence of groups of contiguous red, green and blue pixel samples in a single HDU. This "flexibility" can look like a nice feature at first sight, but in a real software application it definitely is not. FITS has no formal resources to define the organization of multichannel or vector-valued images. In XISF we have two pixel storage models (planar and normal models) formally defined without ambiguity.
- No color spaces. If an application stores a three-dimensional image (also known as a data cube), what is it? An RGB color image? A grayscale image with two alpha channels? HSV? CIE XYZ? CIE L*a*b*? A 3-D grayscale image? We need colorimetrically defined color spaces to perform brightness/chromaticity separations and other essential tasks. The word "color" appears only once in the FITS Standard version 3.0 document to say the following (section G.2.1):
FITS data arrays contain elements which typically represent the values of a physical quantity at some coordinate location. Consequently they need not contain any pixel rendering information in the form of transfer functions, and there is no mechanism for color look-up tables.
- Lack of essential metadata and auxiliary data structures such as ICC color profiles, CFA patterns, image resolution parameters, image thumbnails, optimal visualization and orientation parameters, functional roles, etc. All of these ancillary objects and more have been formalized in XISF.
- Rigid big-endian encoding. The vast majority of server, desktop and laptop computer systems currently use little-endian architectures. However, FITS forces the big-endian encoding. Did you know that each time you load a 16-bit, 32-bit or 64-bit image stored in a FITS file, your software has to reverse the order of bytes for each pixel sample? The same happens when you have to write the image back to a FITS file on disk. That is a lot of time and energy, wasted just to satisfy the requirements of a rigid format that is unable to adapt to different hardware architectures. Data stored in XISF units can be encoded in little-endian or big-endian byte order, and the little-endian encoding is used by default.
- Obsolete metadata and lack of Unicode support. Punched cards are cool in museums, but as of 2017, we really need more than 7-bit ASCII, 8-character uppercase property names, and 80-byte metadata rows. We need full Unicode support, structured property identifiers, unlimited lengths for names and values, and many more data types and data structures than 'logical', 'numeric', and 'character'.
- Rigid header-data sequential organization. Magnetic tapes are also very interesting, but definitely not contemporary devices. With a monolithic XISF file, an application can download the entire XISF header, which contains all of the metadata describing all stored images and properties, in a single operation. There is no need to seek throughout large files; just a single sequential read operation to know everything about the entire XISF unit. With a distributed XISF unit, we can download the header and then download just the required data blocks, including arbitrary combinations of local and remote resources.
- Lack of a distributed storage model. Distributed XISF units store the header (metadata) and the data as separate files, including local and remote resources. This allows for flexible storage configurations that are not remotely possible with FITS. Furthermore, an XISF data blocks file allows indexed access to images and data objects through symbolic identifiers instead of file offsets. This means that XISF data blocks files can be reorganized, modified and extended freely without invalidating existing XISF units that depend on them. XISF data blocks files allow for unlimited data reutilization, that is, existing data objects can be referenced from multiple XISF units without duplicating data. Distributed XISF units are also relocatable, so one can transport them across file systems, machines, and networks.
- No effective data integrity and authentication protection. Digital signatures based on XML signatures, X.509 certificates and cryptographic checksums are formalized in the XISF specification. A digitally signed XISF unit is effectively sealed and cannot be modified. There are proposed checksum keywords in the current draft of the FITS standard 4.0, but they are not cryptographic and don't protect authenticity because checksums can be tampered with. SHA-1, SHA-256, SHA-512 and SHA-3 checksums have been formalized in the XISF 1.0 specification.
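The cost described in the big-endian item of the list above can be made concrete with a minimal C++ sketch of the byte reversal that a little-endian machine has to apply to every 16-bit or 32-bit sample read from, or written to, a big-endian file. The function names SwapBytes16, SwapBytes32 and SwapBuffer16 are hypothetical and are not part of PCL; this is an illustration under those assumptions, not the reference implementation.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper (not a PCL function): reverse the two bytes of a
// 16-bit sample, e.g. 0x1234 becomes 0x3412.
inline std::uint16_t SwapBytes16( std::uint16_t x )
{
   return static_cast<std::uint16_t>( (x << 8) | (x >> 8) );
}

// Hypothetical helper (not a PCL function): reverse the four bytes of a
// 32-bit sample, e.g. 0x12345678 becomes 0x78563412.
inline std::uint32_t SwapBytes32( std::uint32_t x )
{
   return  (x >> 24)
        | ((x >>  8) & 0x0000FF00u)
        | ((x <<  8) & 0x00FF0000u)
        |  (x << 24);
}

// Convert a whole buffer of 16-bit samples in place. With a big-endian-only
// format, a little-endian machine runs a loop like this over every pixel
// sample on each load and again on each save.
void SwapBuffer16( std::vector<std::uint16_t>& samples )
{
   for ( std::uint16_t& s : samples )
      s = SwapBytes16( s );
}
```

When the byte order of the stored data already matches the host, as with little-endian XISF data on a little-endian machine, none of this work is needed and the samples can be used as read.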
FITS Compatibility
For all of the important reasons enumerated in the preceding section and some others, the FITS format has been declared deprecated in PixInsight since the beginning of 2015.
However, a large number of tools and processing pipelines still depend on metadata acquired from FITS header keywords. To facilitate the transition to XISF and the coexistence of data stored in both formats, XISF is fully compatible with FITS metadata. FITS header keywords can be stored in XISF units and retrieved transparently, as if the reader were working with actual FITS files and a standard FITS implementation. There is absolutely no difference at a high level, other than a much higher efficiency and flexibility.
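To illustrate what carrying FITS metadata over involves, here is a minimal C++ sketch that splits one FITS header row (an 80-byte "card", with the keyword name in its first 8 bytes) into the name, value and comment that a keyword consists of. The FITSCard structure and the ParseFITSCard and Trimmed functions are hypothetical names introduced for this example; they are not part of PCL. The parsing is deliberately simplified and ignores long-string continuation and other conventions.

```cpp
#include <cstddef>
#include <string>

// Hypothetical container for the three parts of a FITS header keyword.
struct FITSCard
{
   std::string name;
   std::string value;
   std::string comment;
};

// Remove leading and trailing spaces and tabs.
static std::string Trimmed( const std::string& s )
{
   const char* ws = " \t";
   std::size_t a = s.find_first_not_of( ws );
   if ( a == std::string::npos )
      return std::string();
   std::size_t b = s.find_last_not_of( ws );
   return s.substr( a, b - a + 1 );
}

// Simplified parser (an assumption-laden sketch, not a full FITS reader).
FITSCard ParseFITSCard( const std::string& card )
{
   FITSCard result;
   result.name = Trimmed( card.substr( 0, 8 ) );

   // Without the "= " value indicator in bytes 9-10, the rest of the card
   // is free commentary (for example COMMENT or HISTORY rows).
   if ( card.size() < 10 || card[8] != '=' || card[9] != ' ' )
   {
      if ( card.size() > 8 )
         result.comment = Trimmed( card.substr( 8 ) );
      return result;
   }

   // Find the first '/' that is not inside a single-quoted string value.
   const std::string rest = card.substr( 10 );
   bool inString = false;
   std::size_t slash = std::string::npos;
   for ( std::size_t i = 0; i < rest.size(); ++i )
   {
      if ( rest[i] == '\'' )
         inString = !inString;
      else if ( rest[i] == '/' && !inString )
      {
         slash = i;
         break;
      }
   }

   result.value = Trimmed( rest.substr( 0, slash ) );
   if ( slash != std::string::npos )
      result.comment = Trimmed( rest.substr( slash + 1 ) );
   return result;
}
```

Once a keyword is reduced to these three strings, it no longer depends on the 80-byte row layout, which is the kind of representation an application would need to keep FITS keywords alongside other metadata.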