Questions on XISF format

ColinThomas

Hi,

To understand the XISF format a little better, I am looking at writing a Python XISF parser.

I have a small RGB XISF file (from a preview) to test on.

I am having an issue with the XISF Block format, and was wondering what I have missed.

The parser can see:
XISF signature 'XISF0100'
Located the header length
Identify the reserved location
Extract the XISF XML header
And then the series of zero-byte blocks.
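
The steps listed above can be sketched roughly as follows. This is only a minimal sketch, assuming the monolithic layout from the spec (8-byte signature, 4-byte little-endian header length, 4 reserved bytes, then the XML header); `read_xisf_header` is a hypothetical name used for illustration, not part of any library.

```python
import struct

XISF_SIGNATURE = b"XISF0100"

def read_xisf_header(data: bytes) -> str:
    """Return the XML header text of a monolithic XISF file held in `data`."""
    if data[:8] != XISF_SIGNATURE:
        raise ValueError("not a monolithic XISF file")
    # Header length: unsigned 32-bit integer, assumed little-endian.
    (header_length,) = struct.unpack_from("<I", data, 8)
    # Bytes 12..15 are reserved, so the XML header starts at byte 16.
    xml_bytes = data[16:16 + header_length]
    if len(xml_bytes) != header_length:
        raise ValueError("truncated header")
    return xml_bytes.decode("utf-8")
```

Anything after the XML header up to the first attachment position is unused space, so the sketch does not try to interpret it.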

Both the end of the zero-byte blocks and the XML location="attachment:..." value point to byte
4096

But the 8 bytes at this location are as follows:

  INFO: XISF Signature located... :)
  INFO: Header Length located: 1181
  INFO: XISF Reserved located... :)
  INFO: Extract Image Data frm XML
  INFO: geometry  -> 555:483:3
  INFO: sampleFormat  -> Float32
  INFO: bounds  -> 0:1
  INFO: colorSpace  -> RGB
  INFO: location  -> attachment:4096:3216780
  INFO: Found end to Unused Space.... :)
  ######################
  DEBUG: Read from byte No: 4096  8  bytes...
  b'\xad\x8dL>\x12t\x1f>'

I have checked every 8 bytes from 4096 to the end and cannot locate the XISB signature 'XISB0100' (whereas
I can locate the initial XISF signature 'XISF0100'), hence the confusion...

Both signature checks are made using the same Python:
        # Open the file in binary mode and read it in 8-byte chunks
        self.infile = open(self.xisfFile, 'rb')
        self.numChars = 8
        while True:
            self.buf = self.infile.read(self.numChars)
            if not self.buf:
                break  # end of file reached
            if (self.DEBUG):
                print("DEBUG: Read from byte No:", self.xisfByteLocation, " " , self.numChars , " bytes...")

I have double-checked, and the XISF file is happily being read into PixInsight.

The spec I am looking at is :

https://pixinsight.com/doc/docs/XISF-1.0-spec/XISF-1.0-spec.html


Any pointers/help welcomed :)

Clear skies

Colin
 
An Update..

Just used linux/mac "strings" on a couple of .xisf files. And again the XISB0100 string does not seem to be present.

i.e.

strings Preview01.xisf | grep XIS
XISF0100
Extensible Image Serialization Format - XISF version 1.0
--><xisf version="1.0" xmlns="http://www.pixinsight.com/xisf" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.pixinsight.com/xisf http://pixinsight.com/xisf/xisf-1.0.xsd"><Image geometry="555:483:3" sampleFormat="Float32" bounds="0:1" colorSpace="RGB" location="attachment:4096:3216780"><Resolution horizontal="72" vertical="72" unit="inch"/><ICCProfile location="attachment:3223552:5964"/><Thumbnail geometry="400:348:3" sampleFormat="UInt8" colorSpace="RGB" location="attachment:3231744:417600"/></Image><Metadata><Property id="XISF:CreationTime" type="String">2020-02-27T14:29:20Z</Property><Property id="XISF:CreatorApplication" type="String">PixInsight 01.08.06.1457</Property><Property id="XISF:CreatorModule" type="String">XISF module version 01.00.09.0187</Property><Property id="XISF:CreatorOS" type="String">macOS</Property><Property id="XISF:BlockAlignmentSize" type="UInt16" value="4096"/><Property id="XISF:MaxInlineBlockSize" type="UInt16" value="3072"/></Metadata></xisf>

So my python code looks okay.
Has the XISB0100 Signature been removed or altered wrt the spec ?

many thanks again.

Colin
 
Hi Colin,

In the first place, thank you for your interest in the XISF format.

> can not locate XISB signature of 'XISB0100'

Because the file you are working with is a monolithic XISF file. "XISB0100" is the signature of an XISF data blocks file, which, along with an XISF header file, is part of a distributed XISF unit. So you cannot find the "XISB0100" signature in this file.

Distributed XISF units are still not implemented in current versions of PixInsight, which can only work with monolithic XISF files. So unless somebody has implemented the distributed XISF model (which is something I seriously doubt), there are no XISF data blocks files, and hence no "XISB0100" signature exists outside of the formal specification. Of course, support for distributed XISF units will be implemented in a future version of PixInsight, probably during the 1.9 version cycle.
 
Hi Juan,

Many thanks for getting back, really appreciated :)

I have reread the XISF-1.0 spec from:
https://pixinsight.com/doc/docs/XISF-1.0-spec/XISF-1.0-spec.html

Section 9.2 is Monolithic XISF File, which is what I always thought the Preview XISF
created from pixinsight was (which is what I'm parsing).

In Figure 8, "Structure of a monolithic XISF File", the "Attached XISF data block" comes
after the XISF Header and Unused Space. Both the end of the unused space and the XISF header
place this at byte 4096. So I am after the XISF data block format, which seems to be
discussed two sections further on.

Section 9.4, "XISF Data Blocks File", then goes on to say "An XISF data blocks file
shall have the following structure": the Signature (a sequence of eight contiguous bytes
whose values must form the set 'XISB0100'). It states that block files can be local or remote.

Section 10.1, "Attached XISF Data Block", has:

> An attached XISF data block is an XISF data block stored in a monolithic XISF file.
>
> The position of an attached data block from the beginning of the monolithic XISF file
> where it is stored, as well as its length in bytes, must be completely and unambiguously
> defined by the XISF header of the XISF unit.
>
> Attached XISF data blocks shall not occur in a distributed XISF unit.

So attached XISF data blocks are in monolithic files and not in distributed units. The XISF data block
hyperlink goes back to section 10.


In the XISF header, the Image is described as:

<Image
geometry="555:483:3"
sampleFormat="Float32"
bounds="0:1"
colorSpace="RGB"
location="attachment:4096:3216780"
>

So my file's location is attachment :)

Section 10.3 has a bullet '"location="attachment:position:size"' and states "Defines an
attached XISF data block in a monolithic XISF file": which is what I have.
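
For illustration, the `attachment:position:size` form can be split like this. A minimal sketch only: `parse_location` is a hypothetical helper name, and it deliberately rejects any other kind of location value rather than guessing at its format.

```python
def parse_location(value: str):
    """Split 'attachment:position:size' into (position, size) integers."""
    kind, *fields = value.split(":")
    if kind != "attachment" or len(fields) != 2:
        raise ValueError(f"unsupported location: {value!r}")
    position, size = (int(field) for field in fields)
    return position, size

# The Image element from my header:
position, size = parse_location("attachment:4096:3216780")
```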

So the monolithic file description states that I should have an attached XISF data
block (Figure 8), and the description of the data block in 9.4 states that it should have the
XISB0100 signature.

It looks like the spec on the website therefore has some ambiguity ;-)
Where else, apart from Section 10.1, should I get the format of the XISF data block used in a monolithic XISF file?
Do I just assume that a data block is as described in Figure 9, from the start of the block index (as the data block in this diagram states 'Position and length defined by a block index element', so it needs the preamble from a block index)?

=>an attached XISF block definition is Figure 9 (minus the XISB signature and 8 bytes reserved ??)

I hope my explanation above is useful, as a newbie reading the spec. I hope you can point
me to where I've slipped up ?

best regards


Colin Thomas
 
Hi Juan,

Okay another read...
I'm more used to parsing microelectronics GDS binary files so this is new, please bear with me.. :)

<Image
geometry="555:483:3"
sampleFormat="Float32"
bounds="0:1"
colorSpace="RGB"
location="attachment:4096:3216780"
>

So the Image data starts at 4096, and is 3216780 bytes long.

The image is 555x483 = 268065 pixels (in each of the 3 channels)
The image is RGB and Float32, so each pixel takes
<32bits R/ 32bits G/32bits B> = 96 bits = 12 bytes

12 bytes x 268065 = 3216780 == attachment byte length :)

the bounds state the range of values is from 0->1
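
As a quick sanity check, the eight bytes read at offset 4096 earlier in the thread can be decoded as two Float32 samples. This is a sketch that assumes little-endian byte order; the values come out at roughly 0.20 and 0.16, which sits inside the 0:1 bounds and looks like raw pixel data rather than a signature.

```python
import struct

raw = b'\xad\x8dL>\x12t\x1f>'        # the eight bytes read at offset 4096
samples = struct.unpack("<2f", raw)  # two Float32 values, assumed little-endian
print(samples)
```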

The ICC data then starts after the Image data:

<ICCProfile
  location="attachment:3223552:5964"
/>
ICC starts at byte 3223552 (the image data ended at 4096+3216780 = 3220876, so I expect some zero bytes).

Then there is a thumbnail
<Thumbnail
geometry="400:348:3"
sampleFormat="UInt8"
colorSpace="RGB"
location="attachment:3231744:417600"
/>

Thumbnail starts at byte 3231744 (the ICC ended at 3223552+5964 = 3229516, so again expect some zero bytes)
There are 400x348 pixels = 139200 (in each of the 3 channels)
The image is RGB and UInt8, so each pixel takes
<8bits R/8bits G/8bits B> = 24 bits = 3 bytes

3bytes x 139200 = 417600 == attachment byte length :)
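
The same arithmetic for both the image and the thumbnail, as a small sketch. `expected_block_size` and the `BYTES_PER_SAMPLE` table are hypothetical names for illustration, and the table only lists the sample formats whose sizes follow directly from their names.

```python
# Bytes per sample, taken from the bit width in each format name.
BYTES_PER_SAMPLE = {"UInt8": 1, "UInt16": 2, "UInt32": 4, "Float32": 4, "Float64": 8}

def expected_block_size(geometry: str, sample_format: str) -> int:
    """Uncompressed size in bytes for a 'width:height:channels' geometry."""
    width, height, channels = (int(n) for n in geometry.split(":"))
    return width * height * channels * BYTES_PER_SAMPLE[sample_format]

image_size = expected_block_size("555:483:3", "Float32")
thumbnail_size = expected_block_size("400:348:3", "UInt8")
```

The total only depends on the number of samples, so it is the same however the samples are ordered inside the block.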

One question: if the geometry is width:height:channels, are pixels stored row by row,
i.e.
height0/width0, height0/width1, height0/width2 .... height0/widthN
...
heightN/width0 .................................... heightN/widthN

where we start at the bottom left pixel, row after row to finish at top right pixel

So everything is specified from the header file, no need for an index section.

If this is it, then I can start coding again..

Many thanks

Colin
 
Hi Colin,

Sorry for the delay in getting back to you. I've been (and will be) extremely busy with important ongoing projects.

Your analysis is correct. There are small blocks of zero bytes between data blocks because our XISF support module stores uncompressed blocks aligned at integer multiples of 4096 bytes by default. This tends to improve I/O performance on all platforms, although not much on modern solid state storage devices, so this may change in a future version, or at least alignment will probably be optional. The XISF format specification allows for arbitrary unused blocks, which must be filled with zero byte values.
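
The 4096-byte alignment reproduces the attachment positions in the header discussed above. A minimal sketch, where `align_up` is a hypothetical helper and the 16 accounts for the signature, header length and reserved bytes that precede the 1181-byte XML header:

```python
def align_up(offset: int, alignment: int = 4096) -> int:
    """Round `offset` up to the next integer multiple of `alignment`."""
    return -(-offset // alignment) * alignment

image_start = align_up(16 + 1181)                  # end of the XML header
icc_start = align_up(image_start + 3216780)        # end of the image data
thumbnail_start = align_up(icc_start + 5964)       # end of the ICC profile
print(image_start, icc_start, thumbnail_start)
```

The gaps between each block end and the next aligned position are the zero-byte runs seen in the file.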

> So everything is specified from the header file, no need for an index section.

Yes. This is one of the distinctive characteristics of XISF. Unlike FITS, you get the entire information about an XISF unit, including all stored images, data blocks and metadata, from its XML header. If you read the header and parse it (for example, our PCL framework includes exhaustive XML support), then you know absolutely everything about the XISF unit and all of the objects it contains.
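
As a rough illustration of reading everything from the header, the attached blocks can be listed with Python's standard `xml.etree.ElementTree` module. This is a sketch only: `list_attachments` is a hypothetical name, it looks at `attachment:` locations and nothing else, and the XML passed to it is a trimmed copy of the header shown earlier in the thread.

```python
import xml.etree.ElementTree as ET

def list_attachments(xml_text: str):
    """Return (element name, position, size) for every attached block."""
    root = ET.fromstring(xml_text)
    found = []
    for element in root.iter():  # document order
        location = element.get("location", "")
        if location.startswith("attachment:"):
            _, position, size = location.split(":")
            name = element.tag.split("}")[-1]  # drop the XML namespace prefix
            found.append((name, int(position), int(size)))
    return found

header = (
    '<xisf version="1.0" xmlns="http://www.pixinsight.com/xisf">'
    '<Image geometry="555:483:3" sampleFormat="Float32" '
    'location="attachment:4096:3216780">'
    '<ICCProfile location="attachment:3223552:5964"/>'
    '<Thumbnail geometry="400:348:3" sampleFormat="UInt8" '
    'location="attachment:3231744:417600"/>'
    '</Image></xisf>'
)
for name, position, size in list_attachments(header):
    print(name, position, size)
```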

Let me know if you need further information or help with your projects.
 