The Playstation 1 Video File (STR) Format
v0.43 draft, March 2008

http://code.google.com/p/jpsxdec/
http://jpsxdec.blogspot.com/

--------------------------------------------------------------------------------
This document, copyright (c) 2008 Michael Sabin, is licensed under the MIT
License. Permission has been obtained
(http://sourceforge.net/mailarchive/message.php?msg_name=474C4DEB.5080304%40multimedia.cx)
to also include source code from the xine media player (in chapters 1.1, 2.1
and chapter 3), copyright (c) the xine project, under the MIT License.
The text of this licence is at the end of this file.

Note that the related jPSXdec program is NOT licensed under the MIT license,
but is licensed under the GPL v2.
--------------------------------------------------------------------------------

History

v0.2 draft
  - Draft. Initial public release.
v0.21 draft
  - Corrected the Playstation default quantization matrix, which in turn
    fixed the mysterious divide-by-four in the dequantization step.
v0.22 draft
  - Finished documenting FF8 movie format
v0.30 draft
  - Obtained permission to use xine source code. This entire document now
    under modified MIT License.
  - ch 1.1: subcode.form is NOT unimportant
  - ch 2.2: Added what DC and AC stand for
v0.40 draft
  - Changed license to just use the standard (unmodified) MIT License.
  - ch 2.3.6: Corrected YUV -> RGB conversion to use PSX equations.
  - ch 3.2: Checked and fixed FF8 audio decoding.
  - ch 3.3: Added Final Fantasy 9 video format (untested).
v0.41 draft
  - ch 3.2: Added note about FF8 audio-only 'movie'.
  - ch 3.3: Checked and fixed FF9 decoding.
v0.42 draft
  - ch 3.3: Corrected FF9 audio decoding.
  - ch 3.4: Added note that Lain DC Coefficients are handled in the normal
    version 2 method.
v0.43 draft
  - ch 3.2: Fleshed out more of the FF8 audio header
  - ch 3.3: Added some audio variations found on FF9 disc 4.
  - ch 3.4: Added Chrono Cross audio sector format.
--------------------------------------------------------------------------------
##
## Introduction
##
## 1. The disc
##   1.1. How data is stored on the disc
##   1.2. How the Playstation reads data from the disc
##   1.3. Getting the data off the disc
## 2. Decoding a Playstation 1 video frame
##   2.1. Demultiplex the frame
##   2.2. Uncompress the data
##     2.2.1. Read the DC Coefficient
##     2.2.2. Read all AC Coefficients
##     2.2.3. Convert to MDEC format
##   2.3. MDEC emulation
##     2.3.1. Translate the DC and run length codes into a 64 value list
##     2.3.2. Un-zig-zag the list into a matrix
##     2.3.3. Dequantization of the matrix
##     2.3.4. Apply Inverse Discrete Cosine Transform to the matrix
##     2.3.5. Combine the blocks into (Y, Cb, Cr) pixels
##     2.3.6. Convert the (Y, Cb, Cr) pixels into RGB pixels
## 3. Variations by some PSX games
##   3.1. Final Fantasy VII
##   3.2. Final Fantasy VIII
##   3.3. Final Fantasy IX
##   3.4. Serial Experiments Lain
## 4. Credits, Thanks, etc.
##
################################################################################
## Introduction
################################################################################

Playstation 1 videos, usually with the extension STR, MOV, or BIN, contain
compressed video data similar to an MPEG1 movie. They also contain interleaved
audio. This document attempts to explain the decoding process of a single video
frame. Audio is not covered in this document, but has already been documented
by Jonathan Atkins (http://freshmeat.net/projects/cdxa/, mirrored at
http://code.google.com/p/jpsxdec/downloads/list).

Like MPEG1 streams, the decoding process is long and rather complicated.
Specifically, Chapters 2.2 and 2.3 closely resemble two aspects of MPEG1
decoding: translation of variable length codes, and macro-block decoding.
I have tried to keep the descriptions as simple as possible, and to explain
some of the details and terminology of MPEG1 decoding.
However, this document doesn't contain everything, and I've been immersed in
this stuff for so long that I can no longer see where my explanations fall
short. Therefore, you may need other sources of information to fully grasp
these steps.

The most helpful source would be the MPEG1 specification (ISO/IEC 11172). It is
available to purchase from the ISO web site for a small fortune. Alternatively,
if you prefer to spend much less money, there are many good books that cover
the MPEG1 format.

There are some free alternatives that will help, but they don't apply as well
as the MPEG1 spec. H.261, the first specification using MPEG-like encoding, is
available for free from ITU-T. Also available from ITU-T is H.262, which
(according to Wikipedia) is "completely identical in all aspects" to the MPEG2
specification. Finally, you could also search for information about JPEG
encoding, which can be found in many places on the web.

################################################################################
## 1. The disc
################################################################################

################################################################################
## 1.1. How data is stored on the disc
################################################################################

Each raw compact-disc sector is 2352 bytes long. For a normal "Mode 1" sector,
304 of those bytes contain information about the sector and error correction
data. This leaves 2048 bytes per sector for data. "Mode 1" sectors are what
nearly all software is designed to work with.

Playstation video frames are stored on the disc in "Mode 2 Form 1" sectors,
which are nearly identical to "Mode 1" sectors (they have small header
differences that won't be covered here). Modern computer operating systems are
able to read both sector types without problems.

          A "Mode 1" or "Mode 2 Form 1" compact-disc sector
+-24 bytes--+-2048 bytes-------------------------------------+-280 bytes--+
|  CD-XA    | Normal sector data                             | other      |
|  Header   |                                                | sector     |
|           |                                                | data       |
+-----------+------------------------------------------------+------------+

Audio on Playstation discs is stored in "Mode 2 Form 2" sectors. These sectors
only use 28 bytes for the sector header and error correction. This leaves 2324
bytes for data. These "Form 2" sectors are intermingled with "Form 1" sectors.
Modern operating systems don't like "Mode 2 Form 2" sectors, so you need
special programs to get these sectors off the disc.

Knowing the raw sector data on the disc is mostly important for understanding
Playstation 1 audio, but it also helps with the video. A CD-XA Sector Header
contains information about the sector: specifically whether it contains audio,
video, or data. For audio sectors, it also contains the audio format used
(channels, sample rate, and bits-per-sample).

CD-XA Header: [mostly from xine media player source code: demux_str.c]
 - 12 bytes: sync header (00 FF FF FF FF FF FF FF FF FF FF 00)
 -  4 bytes: timecode relative to start of disc (mm ss ff 02; BCD, not decimal)
 -  4 bytes: sector parameters
     - 0x10 file_num
     - 0x11 channel_num
     - 0x12 subcode
     - 0x13 coding_info
 -  4 bytes: copy of parameters (should be identical)

sector parameters:
 - file_num is purely to distinguish where a 'file' ends and a new 'file'
   begins among the sectors. It's usually 1.
 - channel_num is a sub-channel in this 'file'. Video, audio and data sectors
   can be mixed into the same channel or can be on separate channels. Usually
   used for multiple audio tracks (e.g. 5 different songs in the same 'file',
   on channels 0, 1, 2, 3 and 4)
 - subcode is a set of bits
     - bit 7: eof_marker -- 0, or 1 if this sector is the end of the 'file'
     - bit 6: real_time  -- unimportant (always set in PSX STR streams)
     - bit 5: form       -- 0 = Form 1 (2048 data), 1 = Form 2 (2324 data)
     - bit 4: trigger    -- for use by reader application (unimportant)
     - bit 3: DATA       -- set to 1 to indicate DATA sector, otherwise 0
     - bit 2: AUDIO      -- set to 1 to indicate AUDIO sector, otherwise 0
     - bit 1: VIDEO      -- set to 1 to indicate VIDEO sector, otherwise 0
     - bit 0: end_audio  -- end of audio frame (never set in PSX STR streams)
     - bits 1, 2 and 3 are mutually exclusive
 - coding_info is a set of bits; its interpretation depends on the
   DATA/AUDIO/VIDEO bits of subcode.
     - For AUDIO:
         - bit 7: reserved -- should always be 0
         - bit 6: emphasis -- boost audio volume (ignored by us)
         - bit 5: bitssamp -- must always be 0
         - bit 4: bitssamp -- 0 for mode B/C (4 bits/sample, 8 sound sectors)
                              1 for mode A (8 bits/sample, 4 sound sectors)
         - bit 3: samprate -- must always be 0
         - bit 2: samprate -- 0 for 37.8kHz playback, 1 for 18.9kHz playback
         - bit 1: stereo   -- must always be 0
         - bit 0: stereo   -- 0 for mono sound, 1 for stereo sound
     - For DATA or VIDEO:
         - always seems to be 0 in PSX STR files

################################################################################
## 1.2. How the Playstation reads data from the disc
################################################################################

Data is read from the disc one sector at a time at either 75 sectors per second
(single speed) or 150 sectors per second (double speed). The video and audio
are spaced out over these sectors so they can be delivered at the appropriate
times.

Example: A movie in the game runs 15 frames per second.
If the Playstation is set to read the data at 75 sectors per second (single
speed), each frame needs to be spaced over 5 disc sectors
(75 sectors per second / 15 frames per second = 5 sectors per frame).

Audio is also intermixed every so many sectors (usually 8 or 32). Since video
frame data doesn't (usually? ever?) need all the sectors allocated to it, an
audio sector can quickly be squeezed in.

Each audio sector generates 4032 samples of decoded audio. If the audio is in
stereo, then the samples are split between the left and right channels, 2016
each. As shown above, the raw CD-XA Sector Header explains how the data is
stored, the sample rate, and whether it is mono or stereo.

Example: A movie in the game has mono audio running at 37800 samples per
second. If the Playstation is set to read at 75 sectors per second, an audio
sector needs to appear every 8 sectors
(4032 samples per sector * 75 sectors per second / 37800 samples per second
 = 8 sectors between audio sectors).

Sector 0:  Video frame 1, chunk #0 (of 5)
Sector 1:  Video frame 1, chunk #1 (of 5)
Sector 2:  Video frame 1, chunk #2 (of 5)
Sector 3:  Video frame 1, chunk #3 (of 5)
Sector 4:  Video frame 1, chunk #4 (of 5)
Sector 5:  Video frame 2, chunk #0 (of 4)
Sector 6:  Video frame 2, chunk #1 (of 4)
Sector 7:  Video frame 2, chunk #2 (of 4)
Sector 8:  4032 samples of audio at 37800 samples/second
Sector 9:  Video frame 2, chunk #3 (of 4)
Sector 10: Video frame 3, chunk #0 (of 5)
...

If you are interested in more details of how audio is decoded, you could check
the Green Book (the CD-i specification), or Jonathan Atkins's cdxa program. He
has done a good job of including documentation.

################################################################################
## 1.3. Getting the data off the disc
################################################################################

The most common and easily accessible way to read the full raw sectors off the
disc is to copy the entire disc to a raw image file.
This disc image format is commonly referred to as BIN/CUE. An ISO disc image
does NOT copy the raw sector data (which is necessary for audio decoding).

Alternatively, you may find tools to copy just the raw sectors that contain
movie data (such as the popular PSmplay tool). There is no standard on how to
store these raw sectors from CDs. Depending on the tool used, the specifics of
the resulting file may vary slightly. Some programs add a RIFF header at the
start of the file.

Finally, you might sometimes get the data off the disc using the normal copying
of files by your operating system. However, like ISO image files, your
operating system will only copy the normal 2048 bytes per sector, missing the
raw sector information.

################################################################################
## 2. Decoding a Playstation 1 video frame
################################################################################

There are three major steps the Playstation goes through to decode one frame
from an STR file.

1) Read all the video sectors that contain the frame 'chunks' from the disc
   and "demux" them into a solid stream
   (the Playstation hardware/libraries and the game do this...I think)

2) Decompress that disc data into MDEC compatible run length codes
   (it is entirely the game's responsibility to do this)

3) Translate all those run length codes into actual image data,
   in 24 or 15 bit RGB format
   (what the MDEC chip does)

The following sub-sections attempt to emulate these 3 steps.

################################################################################
## 2.1. Demultiplex the frame
################################################################################

Each frame 'chunk' sector begins with 32 bytes of information, followed by
2016 bytes of 'chunk data' (the 2048 data bytes of the sector minus the
32-byte header).

          How a frame chunk fits into a "Mode 2 Form 1" sector
+-24 bytes--+-32 bytes-+-2016 bytes-----------------------------+-280 bytes--+
|  CD-XA    |  Chunk   | chunk data                             | cd sector  |
|  Header   |  Header  |                                        | stuff      |
+-----------+----------+----------------------------------------+------------+

Frame Chunk Header: [mostly from xine media player source code: demux_str.c]
* all values are little-endian
 - 4 bytes; unknown -- usually 0x80010160 for a video frame.
            according to PSX hardware guide, this value is written to
            mdec0 register:
             - bit 27: 1 for 16-bit colour, 0 for 24-bit colour depth
             - bit 24: if 16-bit colour, 1/0=set/clear transparency bit
             - all other bits unknown
 - 2 bytes; 'chunk number' of this video frame (0 to numchunks-1)
 - 2 bytes; number of chunks in this frame
 - 4 bytes; frame number (starts at 1)
 - 4 bytes; seemingly random number. frame duration?
 - 2 bytes; width of frame in pixels
 - 2 bytes; height of frame in pixels
 - 2 bytes; the number of run length codes in the frame?
            or size of data (in bytes) following this header?
 - 2 bytes; always 0x3800
 - 2 bytes; frame's quantization scale
 - 2 bytes; version of the video frame (see next section)
 - 4 bytes; always 0x00000000
Size of Frame Chunk Header: 32 bytes

The video frame 'chunk data' from all the sectors related to the frame needs
to be appended together to form a solid stream.

+-2016 bytes----+-2016 bytes----+-- --+-2016 bytes-----+
| chunk 0 data  | chunk 1 data  | ... | chunk n-1 data |
+---------------+---------------+-- --+----------------+

That was the easy part. It gets harder from here.

################################################################################
## 2.2. Uncompress the data
################################################################################

There are two common and understood video frame types found on Playstation
game discs: version 2, and version 3 (I don't know what happened to
version 1). These two versions I assume cover the majority of video frame
formats.
My guess is they were part of the standard development tools given to game
developers.

It would be wondrous if every movie found in every game used these two
formats. However, since it is the game's responsibility to decompress the data
off the disc, some studious game developers used their own method. Alas, the
only way one could ever hope to understand the decoding scheme used by some
games would be to reverse engineer the game's code.

So let us decode a version 2 or 3 frame.

At the highest level, a demultiplexed frame consists of:

Frame Data Header & Macro-blocks:
* all values are little-endian
 - 2 bytes; the number of run length codes in the frame?
            or size of data (in bytes) following this header? (again)
 - 2 bytes; always 0x3800 (again)
 - 2 bytes; the frame's quantization scale (again)
 - 2 bytes; the version of the video frame (again)
Size of Frame Data Header: 8 bytes
 - Compressed macro block 1
 - Compressed macro block 2
   ...
 - Compressed macro block (width+15)/16 * (height+15)/16

These "macro blocks" will eventually turn into 16 x 16 pixel squares. They
start at the top left of the image, then work their way down in a column,
then continue at the top of the next column, and so on.

Example 64 x 64 image:
+----+----+----+----+
|  1 |  5 |  9 | 13 |
+----+----+----+----+
|  2 |  6 | 10 | 14 |
+----+----+----+----+
|  3 |  7 | 11 | 15 |
+----+----+----+----+
|  4 |  8 | 12 | 16 |
+----+----+----+----+

If the frame dimensions are not divisible by 16, you must round up the width
and/or height to be a multiple of 16. The extra data in the final decoded
frame can simply be cropped off.
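
The column-by-column ordering and the round-up rule can be sketched in a few
lines of Python. This is only an illustration: the function names are made up
for this example, and the macro block index is assumed to be zero-based (so
macro block "5" in the diagram above is index 4).

```python
def macro_block_grid(width, height):
    """Return (columns, rows) of 16x16 macro blocks, rounding up."""
    return (width + 15) // 16, (height + 15) // 16


def macro_block_origin(index, height):
    """Return the (x, y) pixel of the top-left corner of a macro block.

    'index' is zero-based (an assumption of this sketch). Macro blocks
    run down a column first, then continue at the top of the next column.
    """
    rows = (height + 15) // 16
    column, row = divmod(index, rows)
    return column * 16, row * 16
```

For a 320 x 240 frame this gives a grid of 20 columns by 15 rows, so the frame
holds 300 macro blocks and macro block index 15 starts the second column.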

Each 'macro block' consists of 6 'blocks' (in this order!):

Macro-block:
 - Chrominance Red (Cr) block
 - Chrominance Blue (Cb) block
 - Top-Left Luminance (Y1) block
 - Top-Right Luminance (Y2) block
 - Bottom-Left Luminance (Y3) block
 - Bottom-Right Luminance (Y4) block

Yes, thanks to smf (MAME developer) we now know that Cr comes before Cb,
contrary to what you may find in some other documentation and source code.

Here is what each of those 6 blocks consists of:

Block:
 - One "Discrete Cosine Transform Direct Current Coefficient"
 - Zero or more "Discrete Cosine Transform Alternating Current Coefficients"
 - One "End of Block" code

At the start of every block is what is called the "Discrete Cosine Transform
Direct Current Coefficient". Most often it is simply referred to as DC. It is
the most important value of the block. Following the DC Coefficient are
compressed "Discrete Cosine Transform Alternating Current Coefficients",
usually referred to as simply AC.

      **!!          Note that the block bit stream data          !!**
      **!! is read 16-bits at a time in *little-endian* order    !!**

################################################################################
## 2.2.1. Read the DC Coefficient
################################################################################

For version 2 frames, the DC Coefficients of all 6 blocks are encoded the same
way: 10-bits, signed. Very simple.

For version 3 frames, each Chrominance Red (Cr) DC Coefficient is relative to
the previous Cr DC Coefficient, and each Chrominance Blue (Cb) DC Coefficient
is relative to the previous Cb DC Coefficient. They are also encoded using a
tricky arrangement of variable length codes.

Variable      Number of bits used to   Negative        Positive
Length Code   store DC Coefficient     Differential    Differential
11111110      8                        -255 to -128    128 to 255
1111110       7                        -127 to -64     64 to 127
111110        6                        -63 to -32      32 to 63
11110         5                        -31 to -16      16 to 31
1110          4                        -15 to -8       8 to 15
110           3                        -7 to -4        4 to 7
10            2                        -3 to -2        2 to 3
01            1                        -1              1
00            0                        0               0

After the variable length code, there is the corresponding number of bits for
the DC Coefficient. The first of these bits is the sign bit. If it's 0, then
use the 'Negative Differential' on the remaining bits. If it's 1, use the
'Positive Differential' on the remaining bits. Once that value is determined,
it is then multiplied by 4 (for some reason).

-- Pseudocode to decode version 3 DC Coefficient for Cr or Cb -----------------
/* At the start of the frame, initialize Previous_DC_Coefficient = 0 */
If Peek_Next_Bits() == "11111110" Then
    Skip_Bits(8)
    If Read_Bits(1) == "0" Then
        DC_Coefficient = Read_UnsignedBits(7) - 255
    Else
        DC_Coefficient = Read_UnsignedBits(7) + 128
    End If
Else If Peek_Next_Bits() == "1111110" Then
    Skip_Bits(7)
    If Read_Bits(1) == "0" Then
        DC_Coefficient = Read_UnsignedBits(6) - 127
    Else
        DC_Coefficient = Read_UnsignedBits(6) + 64
    End If
Else If Peek_Next_Bits() == "111110" Then
    Skip_Bits(6)
    If Read_Bits(1) == "0" Then
        DC_Coefficient = Read_UnsignedBits(5) - 63
    Else
        DC_Coefficient = Read_UnsignedBits(5) + 32
    End If
/* ...and so on... */
Else If Peek_Next_Bits() == "01" Then
    Skip_Bits(2)
    If Read_Bits(1) == "0" Then
        DC_Coefficient = -1
    Else
        DC_Coefficient = 1
    End If
Else If Peek_Next_Bits() == "00" Then
    Skip_Bits(2)
    DC_Coefficient = 0
End If

DC_Coefficient *= 4   /* for some reason we multiply by 4 */
/* If Cr, use previous Cr. If Cb, use previous Cb */
DC_Coefficient += Previous_DC_Coefficient
Previous_DC_Coefficient = DC_Coefficient
------------------------------------------------------------------------------

The DC Coefficients for the Luminance blocks (Y1, Y2, Y3, Y4) are all stored
relative to the previous Luminance block (e.g. the Y2 value is stored relative
to Y1, etc.). They use a similar arrangement of variable length codes.

Variable      Number of bits used to   Negative        Positive
Length Code   store DC Coefficient     Differential    Differential
1111110       8                        -255 to -128    128 to 255
111110        7                        -127 to -64     64 to 127
11110         6                        -63 to -32      32 to 63
1110          5                        -31 to -16      16 to 31
110           4                        -15 to -8       8 to 15
101           3                        -7 to -4        4 to 7
01            2                        -3 to -2        2 to 3
00            1                        -1              1
100           0                        0               0

Use pseudocode similar to that of the Chrominance DC.

################################################################################
## 2.2.2. Read all AC Coefficients
################################################################################

The AC Coefficients are stored the same way for both version 2 and 3 frames.
They are each encoded using the standard MPEG1 AC Coefficient variable length
codes.

 - variable length code
 - variable length code
   ...
 - variable length code
 - END_OF_BLOCK code

Here are all 111 variable length codes, and their equivalent run of zeros and
AC Coefficient. These decoded values are often referred to as
'zero run-length codes'.

Variable length code      # of zero-value     Non-zero
                          AC Coefficients     AC Coefficient value
11s                        0                   1
011s                       1                   1
0100 s                     0                   2
0101 s                     2                   1
0010 1s                    0                   3
0011 0s                    4                   1
0011 1s                    3                   1
0001 00s                   7                   1
0001 01s                   6                   1
0001 10s                   1                   2
0001 11s                   5                   1
0000 100s                  2                   2
0000 101s                  9                   1
0000 110s                  0                   4
0000 111s                  8                   1
0010 0000 s               13                   1
0010 0001 s                0                   6
0010 0010 s               12                   1
0010 0011 s               11                   1
0010 0100 s                3                   2
0010 0101 s                1                   3
0010 0110 s                0                   5
0010 0111 s               10                   1
0000 0010 00 s            16                   1
0000 0010 01 s             5                   2
0000 0010 10 s             0                   7
0000 0010 11 s             2                   3
0000 0011 00 s             1                   4
0000 0011 01 s            15                   1
0000 0011 10 s            14                   1
0000 0011 11 s             4                   2
0000 0001 0000 s           0                  11
0000 0001 0001 s           8                   2
0000 0001 0010 s           4                   3
0000 0001 0011 s           0                  10
0000 0001 0100 s           2                   4
0000 0001 0101 s           7                   2
0000 0001 0110 s          21                   1
0000 0001 0111 s          20                   1
0000 0001 1000 s           0                   9
0000 0001 1001 s          19                   1
0000 0001 1010 s          18                   1
0000 0001 1011 s           1                   5
0000 0001 1100 s           3                   3
0000 0001 1101 s           0                   8
0000 0001 1110 s           6                   2
0000 0001 1111 s          17                   1
0000 0000 1000 0s         10                   2
0000 0000 1000 1s          9                   2
0000 0000 1001 0s          5                   3
0000 0000 1001 1s          3                   4
0000 0000 1010 0s          2                   5
0000 0000 1010 1s          1                   7
0000 0000 1011 0s          1                   6
0000 0000 1011 1s          0                  15
0000 0000 1100 0s          0                  14
0000 0000 1100 1s          0                  13
0000 0000 1101 0s          0                  12
0000 0000 1101 1s         26                   1
0000 0000 1110 0s         25                   1
0000 0000 1110 1s         24                   1
0000 0000 1111 0s         23                   1
0000 0000 1111 1s         22                   1
0000 0000 0100 00s         0                  31
0000 0000 0100 01s         0                  30
0000 0000 0100 10s         0                  29
0000 0000 0100 11s         0                  28
0000 0000 0101 00s         0                  27
0000 0000 0101 01s         0                  26
0000 0000 0101 10s         0                  25
0000 0000 0101 11s         0                  24
0000 0000 0110 00s         0                  23
0000 0000 0110 01s         0                  22
0000 0000 0110 10s         0                  21
0000 0000 0110 11s         0                  20
0000 0000 0111 00s         0                  19
0000 0000 0111 01s         0                  18
0000 0000 0111 10s         0                  17
0000 0000 0111 11s         0                  16
0000 0000 0010 000s        0                  40
0000 0000 0010 001s        0                  39
0000 0000 0010 010s        0                  38
0000 0000 0010 011s        0                  37
0000 0000 0010 100s        0                  36
0000 0000 0010 101s        0                  35
0000 0000 0010 110s        0                  34
0000 0000 0010 111s        0                  33
0000 0000 0011 000s        0                  32
0000 0000 0011 001s        1                  14
0000 0000 0011 010s        1                  13
0000 0000 0011 011s        1                  12
0000 0000 0011 100s        1                  11
0000 0000 0011 101s        1                  10
0000 0000 0011 110s        1                   9
0000 0000 0011 111s        1                   8
0000 0000 0001 0000 s      1                  18
0000 0000 0001 0001 s      1                  17
0000 0000 0001 0010 s      1                  16
0000 0000 0001 0011 s      1                  15
0000 0000 0001 0100 s      6                   3
0000 0000 0001 0101 s     16                   2
0000 0000 0001 0110 s     15                   2
0000 0000 0001 0111 s     14                   2
0000 0000 0001 1000 s     13                   2
0000 0000 0001 1001 s     12                   2
0000 0000 0001 1010 s     11                   2
0000 0000 0001 1011 s     31                   1
0000 0000 0001 1100 s     30                   1
0000 0000 0001 1101 s     29                   1
0000 0000 0001 1110 s     28                   1
0000 0000 0001 1111 s     27                   1

These strings of bits are mutually exclusive. The 's' at the end of every bit
string is the 'sign bit'. If that bit is set, then the AC Coefficient should
instead be negative.

Simply walk the bits of data until a match is found, then record the
corresponding # of zero AC coefficients, and the non-zero AC Coefficient.

The table above doesn't cover all possible combinations, so an escape code is
provided for all other values.

000001                    Escape code

Following the "000001" bits will be 16 bits: 6-bits unsigned for the # of zero
AC coefficients, and 10-bits signed for the AC Coefficient.

Finally, every block must be terminated by the END_OF_BLOCK code.

10                        END_OF_BLOCK

Note that unlike MPEG1, blocks may consist of only an END_OF_BLOCK code.
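
As a concrete illustration of the escape code layout, here is a minimal Python
sketch. It is not taken from any decoder: the function name is invented for
this example, the input is assumed to be a string of '0' and '1' characters
already in bit-stream order, and the 10-bit signed value is assumed to be
two's complement.

```python
ESCAPE_CODE = "000001"


def read_escape(bits, pos):
    """Decode one escape sequence that starts at bits[pos].

    Returns (number_of_zeros, ac_coefficient, position_after_the_code).
    """
    if not bits.startswith(ESCAPE_CODE, pos):
        raise ValueError("no escape code at this position")
    pos += len(ESCAPE_CODE)
    # 6 bits, unsigned: the run of zero-value AC coefficients
    number_of_zeros = int(bits[pos:pos + 6], 2)
    # 10 bits, signed (assumed two's complement): the AC coefficient
    raw = int(bits[pos + 6:pos + 16], 2)
    ac_coefficient = raw - 1024 if raw >= 512 else raw
    return number_of_zeros, ac_coefficient, pos + 16
```

For example, the bits 000001 000011 1111111110 decode to a run of 3 zeros
followed by an AC Coefficient of -2, consuming 22 bits in total.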

-- Pseudocode to decode AC Coefficients in one block --------------------------
While Peek_Next_Bits() != END_OF_BLOCK

    /* 11s -> 0 , 1 */
    If Peek_Next_Bits() == "110" Then
        Print "Num of Zeros = 0, AC Coefficient = 1"
        Skip_Bits(3)
        Continue While
    End If
    If Peek_Next_Bits() == "111" Then
        Print "Num of Zeros = 0, AC Coefficient = -1"
        Skip_Bits(3)
        Continue While
    End If

    /* 011s -> 1 , 1 */
    If Peek_Next_Bits() == "0110" Then
        Print "Num of Zeros = 1, AC Coefficient = 1"
        Skip_Bits(4)
        Continue While
    End If
    If Peek_Next_Bits() == "0111" Then
        Print "Num of Zeros = 1, AC Coefficient = -1"
        Skip_Bits(4)
        Continue While
    End If

    /* 0100s -> 0 , 2 */
    If Peek_Next_Bits() == "01000" Then
        Print "Num of Zeros = 0, AC Coefficient = 2"
        Skip_Bits(5)
        Continue While
    End If
    If Peek_Next_Bits() == "01001" Then
        Print "Num of Zeros = 0, AC Coefficient = -2"
        Skip_Bits(5)
        Continue While
    End If

    /* ... and so on ... */

    If Peek_Next_Bits() == "000001" Then   /* escape code */
        Skip_Bits(6)
        Num_of_0 = Read_Unsigned_Bits(6)
        AC_Coeff = Read_Signed_Bits(10)
        Print "Num of Zeros = " Num_of_0 ", AC Coefficient = " AC_Coeff
    End If

End While
Skip_Bits(2)   /* consume the END_OF_BLOCK code */
------------------------------------------------------------------------------

Once you've reached the END_OF_BLOCK code, the sum of all the zero value AC
coefficients, plus the number of AC Coefficients read, should be less than or
equal to 63.

################################################################################
## 2.2.3. Convert to MDEC format
################################################################################

Now we will pack all this data into the format the Playstation MDEC chip
understands.

First we start with the frame's quantization scale (found in the Frame Chunk
Header, or in the Frame Data Header), and the block's DC coefficient.
Pack the frame's Quantization Scale into 6 bits by chopping off the top 10
bits. Then combine it with the DC Coefficient.

    ((Frame_Quantization_Scale & 0x3F) << 10) | (DC_Coefficient & 0x3FF)

The # of zeros and AC Coefficient are packed similarly. You take the 6 bits
from the # of zeros, and the 10 bits from the AC coefficient to form a 16 bit
value.

    ((Num_Of_Zeros & 0x3F) << 10) | (AC_Coefficient & 0x3FF)

Finally, the binary '10' END_OF_BLOCK is converted to the MDEC END_OF_BLOCK
code 0xFE00.

-- Pseudocode to generate a macro block readable by the MDEC ------------------
// TODO: Is there a better way to pseudocode this?
For 6 times   /* once per block: Cr, Cb, Y1, Y2, Y3, Y4 */
    /* every block starts with the quantization scale and its own DC */
    Print ((Frame_Quantization_Scale & 0x3F) << 10) | (DC_Coefficient & 0x3FF)
    For Each Run and AC_Coefficient, While Not END_OF_BLOCK
        Print ((Run & 0x3F) << 10) | (AC_Coefficient & 0x3FF)
    Next
    Print 0xFE00
Next
------------------------------------------------------------------------------

Now you have a long list of 16 bit values ready to be sent to the MDEC. Note
that since the MDEC reads data as little-endian, if these 16 bit values are
stored as a stream, they should be stored as little-endian.

################################################################################
## 2.3. MDEC emulation
################################################################################

The MDEC chip simply works on macro blocks. It has no concept of frames. So all
that a simple MDEC emulator needs to do is take in one macro-block, and spit
out a 16x16 image (either 24 or 15 bit RGB).

The 6 blocks in each macro block are decoded using the same steps that MPEG1
I-frames use. If you know how MPEG1 decodes macro blocks, then you can pretty
much guess how the rest of this will go.

It takes 6 steps to decode a macro-block to an RGB 16x16 pixel square.

For each block (Cr, Cb, Y1, Y2, Y3, Y4):
1) Expand the zero run-length codes into a 64 value list.
2) Wind the list into an 8x8 matrix of values using the normal MPEG/JPEG
   zig-zag order.
3) Dequantize the values using the normal MPEG1 quantization table,
   and the macro-block's quantization scale.
4) Perform the complicated inverse discrete cosine transform on the 8x8 matrix
5) Once that has been done for all 6 blocks, then merge the Cr and Cb values
   together with the Y1, Y2, Y3, Y4 values.
6) Convert every YCbCr pixel into an RGB pixel

################################################################################
## 2.3.1. Translate the DC and run length codes into a 64 value list
################################################################################

As we saw in the previous section, the first 16 bits hold the Quantization
Scale, and the DC Coefficient. We decode those values the same way we encoded
them:

    Quantization_Scale = (First_16_Bits() >> 10)
    DC_Coefficient     = (First_16_Bits() & 0x3FF)

The remaining 16 bit values hold a run of zero-value AC coefficients, and a
non-zero AC coefficient. These 16 bit values continue until the MDEC
END_OF_BLOCK (0xFE00) code is encountered. Both 10-bit coefficient fields are
signed, so the masked value must be sign-extended before it is used.

Here's some pseudocode that would print the full 64 values of the list.

------------------------------------------------------------------------------
Print DC_coefficient
Length = 1
Run_Length_Code = Next_16_Bits()   /* the value after the scale/DC word */
While Run_Length_Code != END_OF_BLOCK   /* 0xFE00 */
    For 1 To (Run_Length_Code >> 10)
        Print "0"
        Length += 1
    Next
    Print (Run_Length_Code & 0x3FF)
    Length += 1
    Run_Length_Code = Next_16_Bits()
End While
For 1 To (64 - Length)   /* fill the rest with zeros */
    Print "0"
Next
------------------------------------------------------------------------------

Alternatively, here is some code that would fill an array of 64 values.
------------------------------------------------------------------------------
Define Coefficient_List[64]
For i = 0 to 63   /* start by filling the array with zeros */
    Coefficient_List[i] = 0
Next
Coefficient_List[0] = DC_coefficient
i = 0
Run_Length_Code = Next_16_Bits()   /* the value after the scale/DC word */
While Run_Length_Code != END_OF_BLOCK
    i += 1 + (Run_Length_Code >> 10)
    /* the low 10 bits are a signed value */
    Coefficient_List[i] = (Run_Length_Code & 0x3FF)
    Run_Length_Code = Next_16_Bits()
End While
------------------------------------------------------------------------------

The resulting list will be one DC coefficient, and 63 AC coefficients (most of
which will be zero).

[DC,   AC1,  AC2,  AC3,  AC4,  AC5,  AC6,  AC7,
 AC8,  AC9,  AC10, AC11, AC12, AC13, AC14, AC15,
 AC16, AC17, AC18, AC19, AC20, AC21, AC22, AC23,
 AC24, AC25, AC26, AC27, AC28, AC29, AC30, AC31,
 AC32, AC33, AC34, AC35, AC36, AC37, AC38, AC39,
 AC40, AC41, AC42, AC43, AC44, AC45, AC46, AC47,
 AC48, AC49, AC50, AC51, AC52, AC53, AC54, AC55,
 AC56, AC57, AC58, AC59, AC60, AC61, AC62, AC63]

################################################################################
## 2.3.2. Un-zig-zag the list into a matrix
################################################################################

Wind the list into an 8x8 matrix of values using the normal MPEG/JPEG zig-zag
order. Here is the standard MPEG1 zig-zag order:

ZIG_ZAG_MATRIX[x,y]
     x=0   1   2   3   4   5   6   7
     --------------------------------
y=0 |  0,  1,  5,  6, 14, 15, 27, 28 |
  1 |  2,  4,  7, 13, 16, 26, 29, 42 |
  2 |  3,  8, 12, 17, 25, 30, 41, 43 |
  3 |  9, 11, 18, 24, 31, 40, 44, 53 |
  4 | 10, 19, 23, 32, 39, 45, 52, 54 |
  5 | 20, 22, 33, 38, 46, 51, 55, 60 |
  6 | 21, 34, 37, 47, 50, 56, 59, 61 |
  7 | 35, 36, 48, 49, 57, 58, 62, 63 |
     --------------------------------

Each value in that matrix represents an index in the list.
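
The same lookup can be tried out in Python. This is only a sketch: the
function name is invented for this example, and the matrix is stored as a
list of rows, so an entry is addressed as matrix[y][x] rather than [x, y].

```python
# Zig-zag index table from this section: one inner list per row (y),
# one entry per column (x).
ZIG_ZAG_MATRIX = [
    [0, 1, 5, 6, 14, 15, 27, 28],
    [2, 4, 7, 13, 16, 26, 29, 42],
    [3, 8, 12, 17, 25, 30, 41, 43],
    [9, 11, 18, 24, 31, 40, 44, 53],
    [10, 19, 23, 32, 39, 45, 52, 54],
    [20, 22, 33, 38, 46, 51, 55, 60],
    [21, 34, 37, 47, 50, 56, 59, 61],
    [35, 36, 48, 49, 57, 58, 62, 63],
]


def un_zig_zag(coefficient_list):
    """Return an 8x8 matrix where matrix[y][x] is the list entry whose
    index is given by ZIG_ZAG_MATRIX at the same position."""
    if len(coefficient_list) != 64:
        raise ValueError("expected exactly 64 coefficients")
    return [[coefficient_list[index] for index in row]
            for row in ZIG_ZAG_MATRIX]
```

Feeding it the list 0, 1, 2, ... 63 returns the zig-zag table itself, which is
a handy self-check when typing the table in by hand.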

-- Pseudocode to un-zig-zag the list into a matrix ---------------------------
Define Coefficient_Matrix[8, 8]
For x = 0 to 7
    For y = 0 to 7
        Coefficient_Matrix[x, y] = Coefficient_List[ ZIG_ZAG_MATRIX[x, y] ]
    Next
Next
------------------------------------------------------------------------------

Now you have a matrix with the DC Coefficient and AC Coefficients.

Coefficient_Matrix[x, y]
     x=0    1     2     3     4     5     6     7
     ------------------------------------------------
y=0 | DC  , AC1 , AC5 , AC6 , AC14, AC15, AC27, AC28 |
  1 | AC2 , AC4 , AC7 , AC13, AC16, AC26, AC29, AC42 |
  2 | AC3 , AC8 , AC12, AC17, AC25, AC30, AC41, AC43 |
  3 | AC9 , AC11, AC18, AC24, AC31, AC40, AC44, AC53 |
  4 | AC10, AC19, AC23, AC32, AC39, AC45, AC52, AC54 |
  5 | AC20, AC22, AC33, AC38, AC46, AC51, AC55, AC60 |
  6 | AC21, AC34, AC37, AC47, AC50, AC56, AC59, AC61 |
  7 | AC35, AC36, AC48, AC49, AC57, AC58, AC62, AC63 |
     ------------------------------------------------

################################################################################
## 2.3.3. Dequantization of the matrix
################################################################################

To quantize basically means to divide a value by some number to make it
smaller. Dequantization is just the opposite: we multiply the number back to
(approximately) its original value.

Here is the default MDEC quantization table. It is identical to the MPEG-1
intra quantization matrix, except the first value is 2 instead of 8.

PSX_QUANTIZATION_TABLE[x,y]
       x=0   1   2   3   4   5   6   7
      --------------------------------
 y=0 |   2, 16, 19, 22, 26, 27, 29, 34 |
   1 |  16, 16, 22, 24, 27, 29, 34, 37 |
   2 |  19, 22, 26, 27, 29, 34, 34, 38 |
   3 |  22, 22, 26, 27, 29, 34, 37, 40 |
   4 |  22, 26, 27, 29, 32, 35, 40, 48 |
   5 |  26, 27, 29, 32, 35, 40, 48, 58 |
   6 |  26, 27, 29, 34, 38, 46, 56, 69 |
   7 |  27, 29, 35, 38, 46, 56, 69, 83 |
      --------------------------------

------------------------------------------------------------------------------
Define Dequantized_Matrix[8, 8]

For x = 0 to 7
    For y = 0 to 7
        If x == 0 And y == 0 Then
            /* The DC coefficient is not multiplied by the quantization scale,
               only by its table entry (2) */
            Dequantized_Matrix[x, y] =
                    Coefficient_Matrix[x, y]
                    * PSX_QUANTIZATION_TABLE[x, y]
        Else
            /* AC coefficients: table entry * quantization scale / 8 */
            Dequantized_Matrix[x, y] =
                    2 * Coefficient_Matrix[x, y]
                    * PSX_QUANTIZATION_TABLE[x, y]
                    * Quantization_Scale / 16
        End If
    Next
Next
------------------------------------------------------------------------------

This leaves us with values between ??? and ??? for each coefficient.


################################################################################
## 2.3.4. Apply Inverse Discrete Cosine Transform to the matrix
################################################################################

In mathematical terms, the inverse discrete cosine transform used by the PSX
(and MPEG1) looks like this:

          7   7                          2*x+1               2*y+1
f(x,y) = sum sum c(u)*c(v)*F(u,v)* cos (------- *u*PI)* cos (------- *v*PI)
         u=0 v=0                         2 * 8               2 * 8

x,y=0,1,...,7

F(,) is the input matrix
f(,) is the output matrix

c(u) = { sqrt(1/8)  when u=0
       { sqrt(2/8)  otherwise

c(v) = { sqrt(1/8)  when v=0
       { sqrt(2/8)  otherwise

Egad, what the heck does that mean???
Here it is in pseudocode:

-- Pseudocode for the inverse discrete cosine transform ----------------------
Define block[8, 8]

For Block_x = 0 to 7
    For Block_y = 0 to 7

        Total = 0

        For DCT_x = 0 to 7
            For DCT_y = 0 to 7

                Sub_Total = Dequantized_Matrix[DCT_x, DCT_y]

                If DCT_x == 0
                    Sub_Total *= Sqrt(1 / 8)
                Else
                    Sub_Total *= Sqrt(2 / 8)
                End If

                If DCT_y == 0
                    Sub_Total *= Sqrt(1 / 8)
                Else
                    Sub_Total *= Sqrt(2 / 8)
                End If

                Sub_Total *= Cos( DCT_x * PI * (2 * Block_x + 1) / (2 * 8) )
                Sub_Total *= Cos( DCT_y * PI * (2 * Block_y + 1) / (2 * 8) )

                Total += Sub_Total

            Next
        Next

        block[Block_x, Block_y] = Total

    Next
Next
------------------------------------------------------------------------------


################################################################################
## 2.3.5. Combine the blocks into (Y, Cb, Cr) pixels
################################################################################

Now you have 6 block matrices:
Cr_block, Cb_block, Y1_block, Y2_block, Y3_block, and Y4_block

The four Luminance blocks (Y1, Y2, Y3, Y4) are arranged in a square: top-left,
top-right, bottom-left, bottom-right. Then there is one Cb and one Cr value for
every 2x2 square of Luminance values (this is the standard 4:2:0 sampling
method used in JPEG and MPEG1).

+----+----+
| Y1 | Y2 |   +----+  +----+
+----+----+   | Cb |  | Cr |
| Y3 | Y4 |   +----+  +----+
+----+----+

Pseudocode to convert the Y1, Y2, Y3, Y4, Cb and Cr blocks into a 16x16 array
of (Y, Cb, Cr) pixels.
------------------------------------------------------------------------------
Define Macroblock_YCbCr[16, 16] of structure {Y, Cb, Cr}

For x = 0 to 7
    For y = 0 to 7

        Macroblock_YCbCr[x,     y    ].Y = Y1_block[x, y]
        Macroblock_YCbCr[x + 8, y    ].Y = Y2_block[x, y]
        Macroblock_YCbCr[x,     y + 8].Y = Y3_block[x, y]
        Macroblock_YCbCr[x + 8, y + 8].Y = Y4_block[x, y]

        Macroblock_YCbCr[x * 2    , y * 2    ].Cb = Cb_block[x, y]
        Macroblock_YCbCr[x * 2 + 1, y * 2    ].Cb = Cb_block[x, y]
        Macroblock_YCbCr[x * 2    , y * 2 + 1].Cb = Cb_block[x, y]
        Macroblock_YCbCr[x * 2 + 1, y * 2 + 1].Cb = Cb_block[x, y]

        Macroblock_YCbCr[x * 2    , y * 2    ].Cr = Cr_block[x, y]
        Macroblock_YCbCr[x * 2 + 1, y * 2    ].Cr = Cr_block[x, y]
        Macroblock_YCbCr[x * 2    , y * 2 + 1].Cr = Cr_block[x, y]
        Macroblock_YCbCr[x * 2 + 1, y * 2 + 1].Cr = Cr_block[x, y]

    Next
Next
------------------------------------------------------------------------------

The resulting YCbCr "color space" is:
Y  (Luminance)       : -128 to +127
Cr (Chrominance Red) : -128 to +127
Cb (Chrominance Blue): -128 to +127


################################################################################
## 2.3.6. Convert the (Y, Cb, Cr) pixels into RGB pixels
################################################################################

The equations the MDEC uses to convert YCbCr to RGB are:

Red   = Y                + 1.402  * Cr
Green = Y - 0.3437  * Cb - 0.7143 * Cr
Blue  = Y + 1.772   * Cb

But these equations expect a "color space" of:
Y :    0 to 255
Cr: -128 to +127
Cb: -128 to +127

So to convert from the YCbCr "color space" above to RGB, first shift Y up by
128 and then use the same equations:

Red   = (Y + 128)                + 1.402  * Cr
Green = (Y + 128) - 0.3437  * Cb - 0.7143 * Cr
Blue  = (Y + 128) + 1.772   * Cb

Because this can result in RGB values below 0 or above 255, you should also
"clamp" the Red, Green, and Blue values to the range 0 to 255.

If Red   > 255 Then Red   = 255 Else If Red   < 0 Then Red   = 0
If Green > 255 Then Green = 255 Else If Green < 0 Then Green = 0
If Blue  > 255 Then Blue  = 255 Else If Blue  < 0 Then Blue  = 0

-- Pseudocode to convert from YCbCr to RGB -----------------------------------
Define Macroblock_RGB[16, 16] of structure {Red, Green, Blue}

For x = 0 to 15
    For y = 0 to 15

        r = (Macroblock_YCbCr[x, y].Y + 128)
                + 1.402  * Macroblock_YCbCr[x, y].Cr
        g = (Macroblock_YCbCr[x, y].Y + 128)
                - 0.3437 * Macroblock_YCbCr[x, y].Cb
                - 0.7143 * Macroblock_YCbCr[x, y].Cr
        b = (Macroblock_YCbCr[x, y].Y + 128)
                + 1.772  * Macroblock_YCbCr[x, y].Cb

        Macroblock_RGB[x, y].Red   = Max( Min(r, 255), 0)
        Macroblock_RGB[x, y].Green = Max( Min(g, 255), 0)
        Macroblock_RGB[x, y].Blue  = Max( Min(b, 255), 0)

    Next
Next
------------------------------------------------------------------------------


################################################################################
## 3. Variations by some PSX games
################################################################################

################################################################################
## 3.1. Final Fantasy VII
################################################################################

Frame Chunk Header:
    * all values are little-endian
    - 32 bits; always 0x80010160
    - 16 bits; 'chunk number' of this video frame (0 to numchunks-1)
    - 16 bits; number of chunks in this frame
    - 32 bits; frame number (starts at 1)
    - 32 bits; seemingly random number. frame duration?
    - 16 bits; width of frame in pixels
    - 16 bits; height of frame in pixels
    - 16 bits; unknown
    - 16 bits; unknown
    - 16 bits; unknown
    - 16 bits; unknown
    - 32 bits; always 0x00000000
    //TODO: Confirm these

At the start of some demultiplexed frames is an additional 40 bytes of what is
believed to be camera angle data.

Frame Data Header:
    - 40 bytes; camera data?
    - 16 bits; the number of run length codes in the frame?
               or size of data (in bytes) following this header?
    - 16 bits; always 0x3800
    - 16 bits; the frame's quantization scale
    - 16 bits; 1, the version of the video frame
    - Compressed macro blocks...

The frame version claims to be 1, but decodes just like version 2 frames,
except for one difference. The variable-length-code escape codes will sometimes
decode to some number of zeros, followed by an AC Coefficient of zero
(e.g. (6, 0) ). This never seems to happen in version 2 or version 3 frames.
This makes me think they changed the frame's quantization scale to make it
smaller, but didn't combine the empty run-length codes.


################################################################################
## 3.2. Final Fantasy VIII
################################################################################

FF8 makes a large departure from how the data is stored in each sector.

Each frame consists of 10 sectors. The first sector contains the left audio
channel, the second contains the right audio channel. The remaining 8 sectors
hold the video data for the frame. 10 sectors running at 2x speed
(150 sectors/second) means 15 frames-per-second.

There is at least one exception found on disc 1: a movie with no video. Each
'frame' consists of two sectors: the first is the left audio channel, the
second is the right audio channel.

===============
==== Audio ====
===============

Audio sectors, like the video sectors, are "Mode 2 Form 1".

FF8 Audio Chunk Header:
    * non character/string values are little-endian
    - 4 bytes; 'S' 'M' ? '\1'
        - ? = 'N' for left audio channel
        - ? = 'R' for right audio channel
    - 1 byte; chunk number (from 0 to numchunks)
    - 1 byte; number of chunks (9 or 1)
    - 2 bytes; frame number (starts at 0)
    - 232 bytes; unknown (camera data?)
    - 6 bytes; usually 'MORIYA', sometimes 'SHUN.M'
    - 10 bytes; unknown
    - 4 bytes; 'AKAO'
    - 4 bytes; frame number (again)
    - 20 bytes; unknown
    - 4 bytes; unknown constant 0x00001000
    - 4 bytes; number of bytes of audio data (1680)
    - 76 bytes; unknown
    - 1680 bytes; audio data

There are 105 Sound Groups per sector, 14 Sound Units per Sound Group, and
2 samples per Sound Unit. The Sound Data is not interleaved, so the decoding
process is much more linear than the normal PSX audio sector format.

FF8/FF9/Chrono Cross Sound Group:
    - 1 byte; sound parameter
    - 1 byte; unknown
    - 14 Sound Units
        - 1 byte; ADPCM sound data, 4 bits-per-sample (2 samples)

Size of FF8/FF9 Sound Group: 16 bytes

Each sound group generates 28 samples of audio.

FF8/FF9/Chrono Cross use filter tables with one extra item:

K0[5] = { 0.0, 0.9375,  1.796875,  1.53125,   1.90625 }
K1[5] = { 0.0, 0.0,    -0.8125,   -0.859375, -0.9375  }

-- Pseudocode to decode Square's unique ADPCM audio sector data ---------------
PreviousSample1 = 0
PreviousSample2 = 0

For SoundGroup = 1 to 105

    SoundParameter = InputStream.ReadByte()
    InputStream.SkipByte() /* odd that this byte is skipped */

    Range   = SoundParameter & 0x0F
    Filter1 = K0[SoundParameter >> 4]
    Filter2 = K1[SoundParameter >> 4]

    For SoundUnit = 1 to 14

        ADPCMSample1 = InputStream.ReadSignedBits(4)
        ADPCMSample2 = InputStream.ReadSignedBits(4)

        PCMSample = ADPCMSampleToPCMSample(ADPCMSample1,
                                           Range, Filter1, Filter2,
                                           byref PreviousSample1,
                                           byref PreviousSample2)
        OutputStream.Write(PCMSample)

        PCMSample = ADPCMSampleToPCMSample(ADPCMSample2,
                                           Range, Filter1, Filter2,
                                           byref PreviousSample1,
                                           byref PreviousSample2)
        OutputStream.Write(PCMSample)

    Next
Next
------------------------------------------------------------------------------

Audio is played back at 44100 samples-per-second.

In total:
14 Sound Units with 2 samples per unit = 28 samples per Sound Group.
28 samples * 105 Sound Groups = 2940 samples per sector (for left & right).
At 44100 samples per second, each frame generates 0.0667 seconds of audio,
which is exactly how long it takes for the PSX to spin the disc through
10 sectors at 2x speed (150 sectors/second).

44100 samples/second
15 frames/second
150 sectors/second
10 sectors/frame
(14 * 2 * 105) = 2940 samples/frame (for each channel)
0.0667 seconds/frame

===============
==== Video ====
===============

FF8 Frame Chunk Header:
    - 4 bytes; 'S' 'M' 'J' '\1'
    * remaining values are little-endian
    - 1 byte; chunk number (from 0 to numchunks)
    - 1 byte; number of chunks (always 9?)
    - 2 bytes; frame number (starts at 0)
    - 2040 bytes; Frame Chunk Data

FF8 Frame Data Header & Macro-blocks (pretty much the same as normal):
    * all values are little-endian
    - 2 bytes; the number of run length codes in the frame?
               or size of data (in bytes) following this header?
    - 2 bytes; always 0x3800
    - 2 bytes; the frame's quantization scale
    - 2 bytes; the version of the video frame (always 2)
    - Compressed macro block 1
    - Compressed macro block 2
    ...
    - Compressed macro block 320/16 * 224/16

Video frames are always 320 x 224.


################################################################################
## 3.3. Final Fantasy IX
################################################################################

FF9 makes an even larger departure from how the data is stored in each sector.

Like FF8, each frame consists of 10 sectors. The first sector contains the left
audio channel, the second contains the right audio channel. The remaining
8 sectors hold the video data for the frame. 10 sectors running at 2x speed
(150 sectors/second) means 15 frames-per-second.

===============
==== Audio ====
===============

The two audio chunks are in *Mode 2 Form 1* sectors.

FF9 Audio Chunk Header:
    * all values are little-endian
    - 4 bytes; 0x00080160
    - 2 bytes; chunk number (0 to numchunks - 1)
    - 2 bytes; number of chunks (always 10)
    - 4 bytes; frame number (starts at 1)
    - 116 bytes; unknown
    - 4 bytes; has audio: "AKAO" (big-endian)
               no audio:  0x00000000
    - 4 bytes; has audio: frame number - 1
               no audio:  0x00000000
    - 20 bytes; unknown
    - 4 bytes; has audio: 0x0000116a
               no audio:  0x00000000
    - 4 bytes; number of bytes of audio data:
               0, 1824, 1840, or 1680 for the final movie
    - 44 bytes; unknown
    - 1840 bytes; audio data and/or leftovers

There is an exception to this for the last frame of a movie on disc 4.

Strange FF9 Audio Chunk Header:
    * all values are little-endian
    - 4 bytes; 0x00080160
    - 2 bytes; chunk number (0 to numchunks - 1)
    - 2 bytes; number of chunks (always 10)
    - 4 bytes; frame number (starts at 1)
    - 116 bytes; unknown
    - 1920 bytes; 1920 bytes of 0xab

I believe this can just be considered a frame with no audio.

FF9 audio is essentially the same as FF8 audio, except that most movies use a
different sample rate. See the FF8 chapter for details on how to decode the
data.

The playback rate for all but the final movie is 48000 samples/second, and the
number of sound groups per sector varies depending on how much audio data there
is.

1824 bytes / 16 bytes/sound group = 114 sound groups
which generate (114 sound groups * 28 samples/sound group) = 3192 samples

1840 bytes / 16 bytes/sound group = 115 sound groups
which generate (115 sound groups * 28 samples/sound group) = 3220 samples

The size of audio data follows a 7 frame sequence:
1840, 1824, 1824, 1840, 1824, 1824, 1824

Over 7 frames, that is (1840*2+1824*5) = 12800 bytes of ADPCM audio data.
12800 bytes / (16 bytes/sound group) * (28 samples/sound group) = 22400 samples.
22400 samples / 7 frames = 3200 samples/frame, which is exactly what we need
for 48000 samples/second.
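
The arithmetic above can be double-checked with a few lines of Python. This is
only a sketch: the constant and function names are invented for the example,
and the 16 byte / 28 sample sound group size and the 15 frames/second rate are
taken from the FF8 chapter.

```python
BYTES_PER_SOUND_GROUP = 16    # 1 parameter byte + 1 unknown byte + 14 units
SAMPLES_PER_SOUND_GROUP = 28  # 14 sound units * 2 samples each
FRAMES_PER_SECOND = 15        # 150 sectors/second / 10 sectors per frame


def samples_in(audio_bytes):
    """Samples (per channel) produced by one audio sector's ADPCM data."""
    return audio_bytes // BYTES_PER_SOUND_GROUP * SAMPLES_PER_SOUND_GROUP


def playback_rate(audio_sizes):
    """Average samples/second for a repeating sequence of audio data sizes."""
    total_samples = sum(samples_in(size) for size in audio_sizes)
    return total_samples * FRAMES_PER_SECOND / len(audio_sizes)


# The 7 frame sequence of audio data sizes seen in most FF9 movies.
FF9_SEQUENCE = [1840, 1824, 1824, 1840, 1824, 1824, 1824]
```

With these definitions, playback_rate(FF9_SEQUENCE) works out to 48000, and
playback_rate([1680]) gives the 44100 samples/second of FF8 and the final FF9
movie.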

22400 samples (12800 bytes) for every 7 frames (for each channel)
3200 samples/frame (average)
10 sectors/frame
150 sectors/second
15 frames/second
0.0667 seconds/frame (average)
48000 samples/second

The final movie is different because every frame has 1680 bytes of audio data
(like FF8), so it must be played back at 44100 samples/second.

Final movie:
1680 bytes per frame
2940 samples/frame
10 sectors/frame
150 sectors/second
15 frames/second
0.0667 seconds/frame
44100 samples/second

===============
==== Video ====
===============

The eight video frame chunks are in *Mode 2 Form 2*, and the chunks are
*in reverse order*. So you order them from chunk 9 down to chunk 2.

FF9 Frame Chunk Header:
    * all values are little-endian
    - 4 bytes; 0x00040160
    - 2 bytes; chunk number (0 to numchunks - 1)
    - 2 bytes; number of chunks (always 10?)
    - 4 bytes; frame number (starts at 1)
    - 4 bytes; seemingly random number. frame duration?
    - 2 bytes; width (always 320)
    - 2 bytes; height (always 224)
    - 2 bytes; the number of run length codes in the frame?
               or size of data (in bytes) following this header?
    - 2 bytes; 0x3800
    - 2 bytes; frame's quantization scale
    - 2 bytes; version of the video frame (always 2)
    - 4 bytes; Usually 0x00000000.
               In some movies Chunk #2 has an unknown value here.
    - 2292 bytes; Frame Chunk Data


################################################################################
## 3.4. Chrono Cross
################################################################################

Like FF8 and FF9, Chrono Cross frames are 10 sectors long, starting with
2 sectors for audio, followed by 8 sectors of video. It uses FF9 style audio
sectors, but standard STR video sectors.

All movie sectors are "Mode 2 Form 1".

===============
==== Audio ====
===============

Chrono Cross Audio Chunk Header:
    * all values are little-endian
    - 4 bytes; 0x00000160 or 0x00010160
    - 2 bytes; chunk number (0 to numchunks - 1)
    - 2 bytes; number of chunks (always 2)
    - 2 bytes; frame number (starts at 1)
    - 118 bytes; unknown
    - 4 bytes; always "AKAO" (big-endian)
    - 4 bytes; always frame number - 1
    - 20 bytes; unknown
    - 4 bytes; always 0x00001000
    - 4 bytes; number of bytes of audio data: always 1680
    - 44 bytes; unknown
    - 1680 bytes; audio data
    - 160 bytes; unknown

Like the final FF9 movie, with 1680 bytes of audio data, the audio would play
back at 44100 samples/second.


################################################################################
## 3.5. Serial Experiments Lain
################################################################################

Frame Chunk Header:
    * only header values are little-endian
    - 4 bytes; 0x80010160
    - 2 bytes; 'chunk number' of this video frame (0 to numchunks-1)
    - 2 bytes; number of chunks in this frame
    - 4 bytes; frame number (starts at 1)
    - 4 bytes; seemingly random number. frame duration?
    - 2 bytes; width of frame in pixels
    - 2 bytes; height of frame in pixels
    - 1 byte; quantization scale for luminance blocks
    - 1 byte; quantization scale for chrominance blocks
    - 2 bytes; all but the last movie: 0x3800
               the last movie: frame number (again)
    - 2 bytes; number of run length codes in the frame
    - 2 bytes; 0, version of the video frame
    - 4 bytes; always 0x00000000

Frame Data Header:
    * only header values are little-endian
    - 1 byte; quantization scale for luminance blocks (again)
    - 1 byte; quantization scale for chrominance blocks (again)
    - 2 bytes; all but the last movie: 0x3800 (again)
               the last movie: frame number (again again)
    - 2 bytes; number of run length codes in the frame (again)
    - 2 bytes; 0, the version of the video frame (again)

The video frame version is always 0.
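
As an illustration of the Frame Chunk Header layout listed above, here is a
minimal Python sketch that unpacks its 32 bytes with the standard struct
module. The field names, LainChunkHeader and parse_lain_chunk_header are all
invented for this example, and it assumes the header sits at the very start of
the sector's user data.

```python
import struct
from collections import namedtuple

# "<" = little-endian with no padding; I = 4 bytes, H = 2 bytes, B = 1 byte.
LAIN_CHUNK_HEADER = struct.Struct("<IHHIIHHBBHHHI")

# Hypothetical field names, in the same order as the list above.
LainChunkHeader = namedtuple("LainChunkHeader", [
    "magic",                  # 0x80010160
    "chunk_number",           # 0 to numchunks-1
    "chunk_count",            # number of chunks in this frame
    "frame_number",           # starts at 1
    "unknown",                # seemingly random number
    "width",
    "height",
    "qscale_luma",
    "qscale_chroma",
    "marker_or_frame",        # 0x3800, or the frame number in the last movie
    "run_length_code_count",
    "version",                # always 0
    "zero",                   # always 0x00000000
])


def parse_lain_chunk_header(user_data):
    """Unpack the first 32 bytes of a Lain video sector's user data."""
    return LainChunkHeader._make(LAIN_CHUNK_HEADER.unpack_from(user_data, 0))
```

The format string adds up to 32 bytes, the same as the field sizes in the list,
which is a quick sanity check that no field was missed.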

The reason why the last movie doesn't have 0x3800 in the headers is because it
needs to know what frame it is showing, since it blacks-out video frames you
have not seen yet.

The bit stream data following the header is read in *BIG-ENDIAN* order.

The DC coefficient is stored in the standard version 2 style.

A unique set of variable-length-codes are used:

11s                        (0, 1)
011s                       (0, 2)
0100 s                     (1, 1)
0101 s                     (0, 3)
0010 1s                    (0, 4)
0011 0s                    (2, 1)
0011 1s                    (0, 5)
0001 00s                   (0, 6)
0001 01s                   (3, 1)
0001 10s                   (1, 2)
0001 11s                   (0, 7)
0000 100s                  (0, 8)
0000 101s                  (4, 1)
0000 110s                  (0, 9)
0000 111s                  (5, 1)
0010 0000 s                (0, 10)
0010 0001 s                (0, 11)
0010 0010 s                (1, 3)
0010 0011 s                (6, 1)
0010 0100 s                (0, 12)
0010 0101 s                (0, 13)
0010 0110 s                (7, 1)
0010 0111 s                (0, 14)
0000 0010 00s              (0, 15)
0000 0010 01s              (2, 2)
0000 0010 10s              (8, 1)
0000 0010 11s              (1, 4)
0000 0011 00s              (0, 16)
0000 0011 01s              (0, 17)
0000 0011 10s              (9, 1)
0000 0011 11s              (0, 18)
0000 0001 0000 s           (0, 19)
0000 0001 0001 s           (1, 5)
0000 0001 0010 s           (0, 20)
0000 0001 0011 s           (10, 1)
0000 0001 0100 s           (0, 21)
0000 0001 0101 s           (3, 2)
0000 0001 0110 s           (12, 1)
0000 0001 0111 s           (0, 23)
0000 0001 1000 s           (0, 22)
0000 0001 1001 s           (11, 1)
0000 0001 1010 s           (0, 24)
0000 0001 1011 s           (0, 28)
0000 0001 1100 s           (0, 25)
0000 0001 1101 s           (1, 6)
0000 0001 1110 s           (2, 3)
0000 0001 1111 s           (0, 27)
0000 0000 1000 0s          (0, 26)
0000 0000 1000 1s          (13, 1)
0000 0000 1001 0s          (0, 29)
0000 0000 1001 1s          (1, 7)
0000 0000 1010 0s          (4, 2)
0000 0000 1010 1s          (0, 31)
0000 0000 1011 0s          (0, 30)
0000 0000 1011 1s          (14, 1)
0000 0000 1100 0s          (0, 32)
0000 0000 1100 1s          (0, 33)
0000 0000 1101 0s          (1, 8)
0000 0000 1101 1s          (0, 35)
0000 0000 1110 0s          (0, 34)
0000 0000 1110 1s          (5, 2)
0000 0000 1111 0s          (0, 36)
0000 0000 1111 1s          (0, 37)
0000 0000 0100 00s         (2, 4)
0000 0000 0100 01s         (1, 9)
0000 0000 0100 10s         (1, 24)
0000 0000 0100 11s         (0, 38)
0000 0000 0101 00s         (15, 1)
0000 0000 0101 01s         (0, 39)
0000 0000 0101 10s         (3, 3)
0000 0000 0101 11s         (7, 3)
0000 0000 0110 00s         (0, 40)
0000 0000 0110 01s         (0, 41)
0000 0000 0110 10s         (0, 42)
0000 0000 0110 11s         (0, 43)
0000 0000 0111 00s         (1, 10)
0000 0000 0111 01s         (0, 44)
0000 0000 0111 10s         (6, 2)
0000 0000 0111 11s         (0, 45)
0000 0000 0010 000s        (0, 47)
0000 0000 0010 001s        (0, 46)
0000 0000 0010 010s        (16, 1)
0000 0000 0010 011s        (2, 5)
0000 0000 0010 100s        (0, 48)
0000 0000 0010 101s        (1, 11)
0000 0000 0010 110s        (0, 49)
0000 0000 0010 111s        (0, 51)
0000 0000 0011 000s        (0, 50)
0000 0000 0011 001s        (7, 2)
0000 0000 0011 010s        (0, 52)
0000 0000 0011 011s        (4, 3)
0000 0000 0011 100s        (0, 53)
0000 0000 0011 101s        (17, 1)
0000 0000 0011 110s        (1, 12)
0000 0000 0011 111s        (0, 55)
0000 0000 0001 0000 s      (0, 54)
0000 0000 0001 0001 s      (0, 56)
0000 0000 0001 0010 s      (0, 57)
0000 0000 0001 0011 s      (21, 1)
0000 0000 0001 0100 s      (0, 58)
0000 0000 0001 0101 s      (3, 4)
0000 0000 0001 0110 s      (1, 13)
0000 0000 0001 0111 s      (23, 1)
0000 0000 0001 1000 s      (8, 2)
0000 0000 0001 1001 s      (0, 59)
0000 0000 0001 1010 s      (2, 6)
0000 0000 0001 1011 s      (19, 1)
0000 0000 0001 1100 s      (0, 60)
0000 0000 0001 1101 s      (9, 2)
0000 0000 0001 1110 s      (24, 1)
0000 0000 0001 1111 s      (18, 1)
0000 01                    escape
10                         EOB

The escape code is handled in the MPEG1 fashion: 6 bits for the run, then
either 8 or 16 bits for the level according to this table:

Fixed Length Code          Level
forbidden                  -256
1000 0000 0000 0001        -255
1000 0000 0000 0010        -254
...
1000 0000 0111 1111        -129
1000 0000 1000 0000        -128
1000 0001                  -127
1000 0010                  -126
...
1111 1110                  -2
1111 1111                  -1
forbidden                  0
0000 0001                  1
0000 0010                  2
...
0111 1110                  126
0111 1111                  127
0000 0000 1000 0000        128
0000 0000 1000 0001        129
...
0000 0000 1111 1110        254
0000 0000 1111 1111        255


################################################################################
## 4. Thanks, credits, etc.
################################################################################

Mike Melanson and Stuart Caie for adding STR decoding support to xine,
including the documentation in the source.
(http://osdir.com/ml/video.xine.devel/2003-02/msg00179.html) Also for archiving some example STR files (http://osdir.com/ml/video.xine.devel/2003-02/msg00186.html). The q-gears development team for their forums, source code, and documentation (http://forums.qhimm.com/index.php?topic=6473.msg81373). Their STR decoding source code PSXMDECDecoder.cpp was invaluable (http://q-gears.svn.sourceforge.net/viewvc/q-gears/trunk/src/common/movie/decoders/). "Everything You Have Always Wanted to Know about the Playstation But Were Afraid to Ask." compiled / edited by Joshua Walker. Even if it has the CrCb reversal error, it will always be a great reference for PSX hacking. smf, developer for MAME, for figuring out that everyone was getting the order of CrCb wrong (http://smf.mameworld.info/?p=12). Jonathan Atkins for his open source cdxa code and documentation (http://freshmeat.net/projects/cdxa/ http://jcatki.no-ip.org:8080/cdxa/ http://jonatkins.org:8080/cdxa/). The PCSX Team, creators of one of the two open source Playstation emulators (http://www.pcsx.net/). This version of their mdec.c file is particularly useful http://www.koders.com/c/fidF23C1EAFCEAF84CA539927A01093D37D9695722A.aspx Developers of the pSX emulator. While not open source, at least it is still under active development, and provides a very nice debugger for reverse engineering games (http://psxemulator.gazaxian.com/). "Fyiro", the Japanese fellow that wrote the source code for the PsxMC FF8 plugin. (http://homepage2.nifty.com/~mkb/PsxMC/). T_chan for sharing a bit of his knowledge about the FF9 format (http://www.network54.com/Forum/119865/thread/1196268797/last-1197023290/Final+Fantasy+9+Format). The most excellent folks at IRCNet #lain :D Finally, a very special thanks to all the Playstation hackers who thought it was a good idea to keep their decoders/emulators/hacking tools closed source, then completely stop working on them. Extra thanks to those who now provide a 404 page for a web site. 
You sirs are real men of genius. -------------------------------------------------------------------------------- Copyright (c) 2008 Michael Sabin Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.