PZX (Perfect ZX Tape) file format version 1.0
=============================================

The PZX file format consists of a sequence of blocks. Each block has the
following uniform structure:

offset type     name   meaning
0      u32      tag    unique identifier for the block type.
4      u32      size   size of the block in bytes, excluding the tag and size fields themselves.
8      u8[size] data   arbitrary amount of block data.

The block tags are four ASCII letter identifiers indicating how to interpret the
block data. The first letter is stored first in the file (i.e., at
offset 0 of the block), the last letter is stored last (i.e., at offset 3).
This means the tag may be internally conveniently represented either as 32bit
multicharacter constants stored in big endian order or 32bit reversed
multicharacter constant stored in little endian order, whichever way an
implementation prefers.

All integral values specified by this format are stored in little endian
format, that is, the least significant byte at lower offsets.
All values specified are in decimal, unless prefixed with 0x,
in which case they are hexadecimal. Any bits specified as unused should be
set to zero by encoding implementations and ignored by decoding implementations.

All strings specified by this format are stored in UTF-8 encoding. However
implementations are not required to render Unicode data at all. Any
implementation may choose to render only the characters it understands and
replace the rest with either a question mark character, or distinctive
hexadecimal representation of the character, whatever the implementor prefers.
For example an implementation which understands ASCII characters only may
render only those ASCII characters in the 32-126 range it understands. Also
note that the minimal compliant implementation is not required to render any
strings at all.

The sole purpose of the format is to encode sequence of pulses of certain duration.
Pulse level is defined to be either low or high. When loading, low level
corresponds to the bit 6 of port 0xFE reset (EAR off) and high level set (EAR on).
Similarly, when saving, low level corresponds to the bit 3 of port 0xFE reset
(MIC off) and high level to set (MIC on). (Note that O'Hara's ROM Disassembly
book got these EAR/MIC on/off terms incorrectly swapped). The block descriptions below
define the initial level at start of each block (and would do so even for any future
block which might be ever introduced, so an implementation may take this for
granted), and assume the level changes after each pulse generated (even if
its duration is zero). However note that implementations may as well decide
to invert that initial level and invert the level before each pulse, if it
makes them easier to implement. All that matters is that the level of the
pulses themselves remain the same.

Any durations are expressed in T cycles of standard 48k Spectrum CPU. This means
one T cycle equals 1/3500000 second. It's up to an implementation to convert
the durations appropriately if it may not or does not want to use the T cycles directly.
However note that this is not necessary in case of 128k emulation. In that
case the CPU speed is not substantially different and the ROM I/O routines
use the same timing constants anyway, so an implementation may safely use
the T cycles directly as well.

Overall regarding duration precision, note that the pulses generated by the
original Spectrum ROM saving routine itself vary by +1, -3, and -1 T cycles
in case of first pulse of first bit of leader, regular, and checksum bytes,
respectively. And memory contention can increase this by another 6 T cycles.
This means that an encoding implementation may choose to consider pulses
which do not vary by more than about 2% as reasonably equal for comparison
purposes when encoding pulse durations.

Blocks
------

There are four mandatory block types which any implementation must implement in order
to claim PZX compatibility, namely PZXT, PULS, DATA and PAUS blocks. All
other block types may be safely ignored, although implementors are welcome
to implement them as they see fit. However whenever an implementation
encounters a block it doesn't understand, it must skip it and continue as if
the block was not there at all.

In case the specified block size is smaller than the minimum size possible
for given block type, it is an error and an application may either ignore the
block and try to process the next one or to stop processing the file
entirely, whatever an implementor finds more appropriate. Similarly, if some
data inside the block indicate that the block should have at least some
minimum size but the specified block size is smaller than this, this is an
error as well and the above applies as well.

To allow custom extensions, anyone is free to define his own custom block,
however to prevent clashes with the standard, such blocks may use only
lowercase tag names, while any standardized blocks use uppercase tag names.


The core blocks which must be supported are:

PZXT - PZX header block
-----------------------

offset type     name   meaning
0      u8       major  major version number (currently 1).
1      u8       minor  minor version number (currently 0).
2      u8[?]    info   tape info, see below.

This block distinguishes the PZX files from other files and may provide
additional info about the file as well. This block must be always present as
the first block of any PZX file.

Any implementation should check the version number before processing the
rest of the PZX file or even the rest of this block itself. Any
implementation should accept only files whose major version it implements
and reject anything else. However an implementation may report a warning in
case it encounters minor greater than it implements for given major, if the
implementor finds it convenient to do so.

Note that this block also allows for simple concatenation of PZX files. Any
implementation should thus check the version number not only in case of the
first block, but anytime it encounters this block anywhere in the file. This
in fact suggests that an implementation might decide not to treat the first
block specially in any way, except checking the file starts with this block
type, which is usually done because of file type recognition anyway.

The rest of the block data may provide additional info about the tape. Note
that an implementation is not required to process any of this information in
any way and may as well safely ignore it.

The additional information consists of sequence of strings, each terminated
either by character 0x00 or end of the block, whichever comes first.
This means the last string in the sequence may or may not be terminated.

The first string (if there is any) in the sequence is always the title of
the tape. The following strings (if there are any) form key and value pairs,
each value providing particular type of information according to the key.
In case the last value is missing, it should be treated as empty string.
The following keys are defined (for reference, the value in brackets is the
corresponding type byte as specified by the TZX standard):

Publisher  [0x01] - Software house/publisher
Author     [0x02] - Author(s)
Year       [0x03] - Year of publication
Language   [0x04] - Language
Type       [0x05] - Game/utility type
Price      [0x06] - Original price
Protection [0x07] - Protection scheme/loader
Origin     [0x08] - Origin
Comment    [0xFF] - Comment(s)

Note that some keys (like Author or Comment) may be used more than once.

Any encoding implementation must use any of the key names as they are listed
above, including the case. For any type of information not covered above, it
should use either the generic Comment field or invent new sensible key name
following the style used above. This allows any decoding implementation to
classify and/or localize any of the key names it understands, and use any
others verbatim.

Overall, same rules as for use in TZX files apply, for example it is not
necessary to specify the Language field in case all texts are in English.

PULS - Pulse sequence
---------------------

offset type   name      meaning
0      u16    count     bits 0-14 optional repeat count (see bit 15), always greater than zero
                        bit 15 repeat count present: 0 not present 1 present
2      u16    duration1 bits 0-14 low/high (see bit 15) pulse duration bits
                        bit 15 duration encoding: 0 duration1 1 ((duration1<<16)+duration2)
4      u16    duration2 optional low bits of pulse duration (see bit 15 of duration1) 
6      ...    ...       ditto repeated until the end of the block

This block is used to represent arbitrary sequence of pulses. The sequence
consists of pulses of given duration, with each pulse optionally repeated
given number of times. The duration may be up to 0x7FFFFFFF T cycles,
however longer durations may be achieved by concatenating the pulses by use
of zero pulses. The repeat count may be up to 0x7FFF times, however more
repetitions may be achieved by simply storing the same pulse again together
with another repeat count.

The optional repeat count is stored first. When present, it is stored as
16 bit value with bit 15 set. When not present, the repeat count is considered to be 1.
Note that the stored repeat count must be always greater than zero, so when
decoding, a value 0x8000 is not a zero repeat count, but prefix indicating the
presence of extended duration, see below.

The pulse duration itself is stored next. When it fits within 15 bits, it is
stored as 16 bit value as it is, with bit 15 not set. Otherwise the 15 high
bits are stored as 16 bit value with bit 15 set, followed by 16 bit value
containing the low 16 bits. Note that in the latter case the repeat count
must be present unless the duration fits within 16 bits, otherwise the
decoding implementation would treat the high bits as a repeat count.

The above can be summarized with the following pseudocode for decoding:

    count = 1 ;
    duration = fetch_u16() ;
    if ( duration > 0x8000 ) {
        count = duration & 0x7FFF ;
        duration = fetch_u16() ;
    }
    if ( duration >= 0x8000 ) {
        duration &= 0x7FFF ;
        duration <<= 16 ;
        duration |= fetch_u16() ;
    }

The pulse level is low at start of the block by default. However initial
pulse of zero duration may be easily used to make it high. Similarly, pulse
of zero duration may be used to achieve pulses lasting longer than
0x7FFFFFFF T cycles. Note that if the repeat count is present in case of
zero pulse for some reason, any decoding implementation must consistently
behave as if there was one zero pulse if the repeat count is odd and as if
there was no such pulse at all if it is even.

For example, the standard pilot tone of Spectrum header block (leader < 128)
may be represented by following sequence:

0x8000+8063,2168,667,735

The standard pilot tone of Spectrum data block (leader >= 128) would be:

0x8000+3223,2168,667,735

For the record, the standard ROM save routines create the pilot tone in such
a way that the level of the first sync pulse is high and the level of the
second sync pulse is low. The bit pulses then follow, each bit starting with
high pulse. The creators of the PZX files should use this information to
determine if they got the polarity of their files right. Note that although
most loaders are not polarity sensitive and would work even if the polarity
is inverted, there are some loaders which won't, so it is better to always
stick to this scheme.

DATA - Data block
-----------------

offset      type             name  meaning
0           u32              count bits 0-30 number of bits in the data stream
                                   bit 31 initial pulse level: 0 low 1 high
4           u16              tail  duration of extra pulse after last bit of the block
6           u8               p0    number of pulses encoding bit equal to 0.
7           u8               p1    number of pulses encoding bit equal to 1.
8           u16[p0]          s0    sequence of pulse durations encoding bit equal to 0.
8+2*p0      u16[p1]          s1    sequence of pulse durations encoding bit equal to 1.
8+2*(p0+p1) u8[ceil(bits/8)] data  data stream, see below.

This block is used to represent binary data using specified sequences of
pulses. The data bytes are processed bit by bit, most significant bits first.
Each bit of the data is represented by one of the sequences, s0 if its value
is 0 and s1 if its value is 1, respectively. Each sequence consists of
pulses of specified durations, p0 pulses for sequence s0 and p1 pulses for
sequence s1, respectively.

The initial pulse level is specified by bit 31 of the count field. For data
saved with standard ROM routines, it should be always set to high, as
mentioned in PULS description above. Also note that pulse of zero duration
may be used to invert the pulse level at start and/or the end of the
sequence. It would be also possible to use it for pulses longer than 65535 T
cycles in the middle of the sequence, if it was ever necessary.

For example, the sequences for standard ZX Spectrum bit encoding are:
(initial pulse level is high):

bit 0: 855,855
bit 1: 1710,1710

The sequences for ZX81 encoding would be (initial pulse level is high):

bit 0: 530, 520, 530, 520, 530, 520, 530, 4689
bit 1: 530, 520, 530, 520, 530, 520, 530, 520, 530, 520, 530, 520, 530, 520, 530, 520, 530, 4689

The sequence for direct recording at 44100kHz would be (assuming initial pulse level is low):

bit 0: 79,0
bit 1: 0,79

The sequence for direct recording resampled to match the common denominator
of standard pulse width would be (assuming initial pulse level is low):

bit 0: 855,0
bit 1: 0,855

After the very last pulse of the last bit of the data stream is output, one
last tail pulse of specified duration is output. Non zero duration is
usually necessary to terminate the last bit of the block properly, for
example for data block saved with standard ROM routine the duration of the
tail pulse is 945 T cycles and only then goes the level low again. Of course
the specified duration may be zero as well, in which case this pulse has no
effect on the output. This is often the case when multiple data blocks are
used to represent continuous stream of pulses.

PAUS - Pause
------------

offset type   name      meaning
0      u32    duration  bits 0-30 duration of the pause
                        bit 31 initial pulse level: 0 low 1 high

This block may be used to produce pauses during which the pulse level is not
particularly important. The pause consists of pulse of given duration and
given level. However note that some emulators may choose to simulate random
noise during this period, occasionally toggling the pulse level. In such case
the level must not be toggled during the first 70000 T cycles and then more
than once during each 70000 T cycles.

For example, in case of ZX Spectrum program saved by standard SAVE command
there is a low pulse of about one second (49-50 frames) between the header
and data blocks. On the other hand, there is usually no pause between the
blocks necessary at all, as long as the tail pulse of the preceding block was
used properly.


Additional blocks which should be supported:

BRWS - Browse point
-------------------

offset type   name   meaning
0      u8[?]  text   text describing this browse point

This block may be used when it is desirable to mark some point on the tape
as target for tape browsing. The text is the single line describing this
target.

If an implementation supports tape browsing, it should implement a mode in
which it allows jumping to PZXT or BRWS blocks only. These blocks are the
reasonable points to which jumping makes sense, and both provide descriptive
name of that point. In case of the PZXT block it is the tape title (or "Tape
start" if no name is specified) and in case of BRWS block it is the the
description of that browse point. An implementation may choose to also
implement a mode in which it allows jumping to arbitrary blocks as well. In
such case the blocks may be referred to by their tag names as well, and an
implementation may choose to attempt to extract any additional information
from the block itself as it sees fit.

Creators of the PZX files are urged to put the BRWS blocks to wherever it makes
sense. They are often not needed, but they should be inserted prior any
level data block to which a user might eventually need to fast forward or
rewind to, together with appropriately descriptive name.

STOP - Stop tape command
------------------------

offset type   name   meaning
0      u16    flags  when exactly to stop the tape (1 48k only, other always).

This block is used to instruct an emulator to stop the virtual tape deck.
The flags field is used to specify under which conditions this should
happen. In case the value is 0, the tape should be always stopped, in case
the value is 1, it should be stopped only if 48k Spectrum is being emulated.
Other values are not defined yet, however for future compatibility, any
implementation should treat any value it doesn't understand as 0.

Creators of the PZX files are urged to put these blocks to whichever position the program
asks the user to stop the tape at, and use the flags field appropriately to
the situation as well.





Issues
======

Q: Some pulse sequences are more common than the others. Shall we include
some of them in the format definition and provide a way of simply specifying
them in the blocks? For example, PULS block of zero size might refer to the
standard pilot sequence. Similarly, in DATA block, setting p0 to 0 might
indicate that p1 specifies one of the default sets of sequences, say 0 for
standard ZX Spectrum and 1 for ZX81. Do we want to do either or both of
this or is it just a feature creep?

A: It is better not to hard code any specific knowledge of this type into an
implementation. This way an implementation doesn't have to deal with special
cases and it has a better chance of understanding any future revisions of the
format. The size savings alone are not worth it.



F.A.Q.
======

Q: How do I recognize that a DATA block is suitable for flashloading?

A: By testing the pulse sequences. What exactly you need to test may depend
on what loaders your implementation can flashload. In the common case of the
standard loaders you would simply test that each sequence consists of two
non-zero pulses, and that the total duration of the sequence s0 is less than
the total duration of sequence s1.



Mapping of TZX blocks to PZX blocks:
====================================

Here is a brief roadmap how each of the TZX blocks may be mapped to
corresponding PZX block(s):

ID 10 - Standard speed data block

Trivially encoded with PULS and DATA blocks.

ID 11 - Turbo speed data block

Trivially encoded with PULS and DATA blocks.

ID 12 - Pure tone

Trivially encoded with PULS block.

ID 13 - Sequence of pulses of various lengths

Trivially encoded with PULS block.

ID 14 - Pure data block

Trivially encoded with DATA block.

ID 15 - Direct recording block

Encoded with DATA block, following the example sequences in DATA block
description.

ID 18 - CSW recording block

Encoded either as PULS block or DATA block, using the DRB like encoding with
appropriate denominator in the latter case. In either case, it would help if
the pulse durations are cleaned from the noise caused by interfering sampling frequency.

ID 19 - Generalized data block

Encoded as PULS block and DATA block for alphabets with no more than 2
symbols. In case of 3+ symbols for the data the pilot would be encoded as PULS
block, and for data an encoding same as for CSW above would be used.

ID 20 - Pause (silence) or 'Stop the tape' command

Encoded either as PAUS block or STOP block.

ID 21 - Group start

Encoded as BRWS block.

ID 22 - Group end

Not needed.

ID 23 - Jump to block
ID 24 - Loop start
ID 25 - Loop end
ID 26 - Call sequence
ID 27 - Return from sequence
ID 28 - Select block

All these were dropped as not really needed.

ID 2A - Stop the tape if in 48K mode

Encoded as STOP block.

ID 2B - Set signal level

Not needed, every block defines pulse levels regardless of other blocks.

ID 30 - Text description

Encoded as BRWS block.

ID 32 - Archive info

Included as part of PZXT block.

ID 31 - Message block
ID 33 - Hardware type
ID 35 - Custom info block

All these were dropped as not needed, although they might be easily included
via custom blocks if anyone really has some use for them.

ID 5A - "Glue" block (90 dec, ASCII Letter 'Z')

Not exactly needed, although PZXT block may be used in the same way as well.



History
=======

1.0 (28.6.2007)

* Changed ZXTP tag to PZXT to eliminate any possible problems with draft implementations.

0.3 draft (20.5.2007)

* Some values are now intentionally limited to 31 bits to prevent overflow issues while decoding.
* Number of pulses of DATA block bit sequences is now stored as 8 bits only.

0.2 draft (12.5.2007)

* The ZXTP block now stores the info types verbosely (suggested by Kio and AndyC).
* The encoding of PULS block now allows 32 bit durations and optional repeat count.
* The DATA block and PAUS blocks now contain explicit initial pulse level.
* The PAUS block now consists of single pulse, and specifies how random noise should be used.

0.1 draft (28.4.2007)

+ Patrik Rak presents the initial draft release.
