dwSuggestedBufferSize
Specifies the maximum buffer size for reading the stream. Typically,
this field contains a value corresponding to the largest chunk present
in the stream. Using the correct buffer size makes playback more
efficient. Use zero if you do not know the correct buffer size.
dwQuality
Specifies an indicator of the quality of the data in the stream.
Quality is represented as a number from 0 to 10,000. For compressed
data, this typically represents the value of the quality parameter
passed to the compression software. If set to -1, drivers use the
default quality value.
dwSampleSize
Specifies the size of a single sample of data. This is set to zero if
the samples can vary in size. If this number is nonzero, then multiple
samples of data can be grouped into a single chunk within the file. If
it is zero, each sample of data, such as a video frame, must be in a
separate chunk.
For video streams, this number is typically zero. It can be nonzero if all video frames are the same size.
For audio streams, this number is the same as the nBlockAlign field in the WAVEFORMAT data structure describing the audio data.
rcFrame
Specifies the destination rectangle for a text or video stream within
the movie rectangle specified by the dwWidth and
dwHeight fields of the MainAVIHeader
data structure. The rcFrame field is typically used in
support of multiple video streams. Set this rectangle to the
coordinates corresponding to the movie rectangle to update the whole
movie rectangle. Units for this field are pixels. The upper-left corner
of the destination rectangle is relative to the upper-left corner of
the movie rectangle.
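The dwSampleSize rule described above can be sketched as a small helper that reports how many samples one data chunk holds. This is only an illustration: the function name `samples_in_chunk` and its parameters are hypothetical and do not come from any AVI header file.

```c
#include <stdint.h>

/* Hypothetical helper: number of samples held by one data chunk.
 * A dwSampleSize of zero means the samples vary in size, so each
 * chunk holds exactly one sample (for example, one video frame).
 * Otherwise the chunk holds as many fixed-size samples as fit. */
uint32_t samples_in_chunk(uint32_t dwSampleSize, uint32_t chunk_length)
{
    if (dwSampleSize == 0)
        return 1;
    return chunk_length / dwSampleSize;
}
```

For an audio stream whose nBlockAlign is 4, a 4096-byte chunk would hold 1024 samples under this reading.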
Some of the fields in the stream header data structure are also present in the main header data structure. The data in the main header structure applies to the whole file, and the data in the stream header structure applies only to a stream.
A stream format 'strf' chunk must follow a stream header 'strh' chunk. The stream format chunk describes the format of the data in the stream. For video streams, the information in this chunk is a BITMAPINFOHEADER data structure (including palette information, if appropriate). See Section 7.2 for more information about the BITMAPINFOHEADER data structure and specifying palette information.
For audio streams, the information in this chunk is a PCMWAVEFORMAT data structure. The PCMWAVEFORMAT data structure is an extended version of the WAVEFORMAT data structure. See Section 3.6.1 and Section 3.6.2 for more information about the PCMWAVEFORMAT and WAVEFORMAT data structures, respectively.
The 'strl' chunk might also contain a stream data 'strd' chunk. If used, this chunk follows the stream format chunk. The format and content of this chunk are defined by installable compression or decompression drivers. Typically, drivers use this information for configuration. Applications that read and write RIFF files do not need to decode this information. They transfer this data to and from a driver as a memory block.
An AVI player associates the stream headers in the LIST 'hdrl' chunk with the stream data in the LIST 'movi' chunk by using the order of the 'strl' chunks. The first 'strl' chunk applies to stream 0, the second applies to stream 1, and so forth. For example, if the first 'strl' chunk describes the waveform audio data, the waveform audio data is contained in stream 0. Similarly, if the second 'strl' chunk describes video data, then the video data is contained in stream 1.
8.6.3 LIST 'movi' Chunk
The LIST 'movi' chunk follows the header information. The LIST 'movi' chunk contains chunks of the actual data in the streams; that is, the pictures and sounds themselves. The data chunks can reside directly in the LIST 'movi' chunk, or they can be grouped into 'rec' chunks. The 'rec' grouping implies that the grouped chunks must be read from disk all at once. The 'rec' chunk is used for interleaved files.
Like any RIFF chunk, the data chunks contain a four-character code to identify the chunk type. The four-character code that identifies each chunk consists of the stream number and a two-character code that defines the type of information encapsulated in the chunk. For example, a waveform chunk is identified by the two-character code 'wb'. If a waveform chunk corresponds to the second LIST 'hdrl' stream description, it has a four-character code of '01wb'.
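A minimal sketch of building such a chunk identifier follows. It assumes the stream number is written as two decimal digits, which matches the '01wb' example but is not stated explicitly in the text. The function name `make_chunk_id` is hypothetical.

```c
#include <stddef.h>

/* Hypothetical helper: write the chunk ID for a stream into out[5]
 * as a NUL-terminated string.
 * stream is the zero-based stream number (assumed 0..99, decimal).
 * type is a two-character code such as "wb", "db", or "dc".
 * Returns 0 on success, -1 if an argument is out of range. */
int make_chunk_id(char out[5], unsigned stream, const char *type)
{
    if (stream > 99 || type == NULL)
        return -1;
    if (type[0] == '\0' || type[1] == '\0' || type[2] != '\0')
        return -1;

    out[0] = (char)('0' + stream / 10);
    out[1] = (char)('0' + stream % 10);
    out[2] = type[0];
    out[3] = type[1];
    out[4] = '\0';
    return 0;
}
```

Because stream numbers are zero-based, the second 'strl' chunk corresponds to `stream == 1`.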
Because all the format information is in the header, the audio data contained in these data chunks does not contain any information about the format of the data. Example 8-11 shows the format of an audio data chunk. The number signs in the format represent the stream identifier.
Example 8-11 Audio Data Chunk

    WAVE Bytes '##wb'
               BYTE abBytes[];
Video data can be compressed or uncompressed DIBs. An uncompressed DIB has the BI_RGB flag specified as the value of the biCompression field in its associated BITMAPINFOHEADER data structure. An uncompressed DIB can be BI_RGB, BICOMP_DECXIMAGEDIB, or BICOMP_DECYUVDIB. A compressed DIB can be JPEG_DIB, MJPG_DIB, BI_RLE8, or BI_RLE4.
The BITMAPINFOHEADER data structure should be the proper BITMAPINFOHEADER for the data contained in the file. This includes the extended bitmap information header and the data-specific bitmap information headers, if appropriate. See Chapter 7 for more details.
A data chunk for an uncompressed DIB contains RGB video data. These chunks are identified by the two-character code 'db'. The db code is an abbreviation for DIB bits. Data chunks for a compressed DIB are identified by the two-character code 'dc'. The dc code is an abbreviation for DIB compressed. Neither data chunk contains any header information about the DIBs.
Example 8-12 shows the data chunk for an uncompressed DIB.
Example 8-12 Uncompressed DIB Data Chunk

    DIB Bits '##db'
             BYTE abBits[];
The number signs in the format represent the stream identifier.
Example 8-13 shows the data chunk for a compressed DIB.
Example 8-13 Compressed DIB Data Chunk

    Compressed DIB '##dc'
                   BYTE abBits[];
The number signs in the format represent the stream identifier.
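Going the other way, a reader has to split a chunk identifier into its stream number and data type. The following sketch covers only the three two-character codes named in this section ('wb', 'db', and 'dc') and again assumes two decimal digits for the stream number. The enumeration and the function name `classify_chunk` are hypothetical.

```c
#include <string.h>

typedef enum {
    CHUNK_UNKNOWN,
    CHUNK_AUDIO,          /* '##wb' waveform bytes  */
    CHUNK_DIB_BITS,       /* '##db' uncompressed DIB */
    CHUNK_DIB_COMPRESSED  /* '##dc' compressed DIB   */
} ChunkKind;

/* Hypothetical helper: classify a 4-byte chunk ID.
 * On success the stream number is stored in *stream.
 * *stream is left untouched when CHUNK_UNKNOWN is returned. */
ChunkKind classify_chunk(const char id[4], unsigned *stream)
{
    ChunkKind kind;

    if (id[0] < '0' || id[0] > '9' || id[1] < '0' || id[1] > '9')
        return CHUNK_UNKNOWN;

    if (memcmp(id + 2, "wb", 2) == 0)
        kind = CHUNK_AUDIO;
    else if (memcmp(id + 2, "db", 2) == 0)
        kind = CHUNK_DIB_BITS;
    else if (memcmp(id + 2, "dc", 2) == 0)
        kind = CHUNK_DIB_COMPRESSED;
    else
        return CHUNK_UNKNOWN;

    *stream = (unsigned)(id[0] - '0') * 10u + (unsigned)(id[1] - '0');
    return kind;
}
```

Chunks such as 'JUNK' fall through to CHUNK_UNKNOWN, which lets a reader skip them.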
8.6.4 AVIPALCHANGE Data Structure
The AVIPALCHANGE data structure is used in video streams containing palettized data to indicate that the palette should change for subsequent video data.
Example 8-14 shows the AVIPALCHANGE data structure definition.
Example 8-14 AVIPALCHANGE Data Structure Definition

    typedef struct {
        BYTE         bFirstEntry;  /* first palette entry to change */
        BYTE         bNumEntries;  /* number of palette entries to change */
        WORD         wFlags;       /* reserved field - set to 0 */
        PALETTEENTRY peNew[];      /* array of new palette entries */
    } AVIPALCHANGE;
The AVIPALCHANGE data structure has the following fields:
bFirstEntry
Specifies the first palette entry to change.
bNumEntries
Specifies the number of palette entries to change.
wFlags
Reserved field. Set to 0.
peNew
Specifies an array of new palette entries.
When including palette changes in a video stream, set the
AVITF_VIDEO_PALCHANGES flag in the dwFlags field of
the stream header. This flag indicates that this video stream contains
palette changes and warns the playback software that it needs to
animate the palette.
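As an illustration of how playback software might apply such a change, the sketch below copies the new entries into a 256-entry palette. It defines its own `PaletteEntry` type so that it compiles without platform headers. The member layout mirrors the Windows PALETTEENTRY structure, and the function name `apply_pal_change` is hypothetical. A bNumEntries value of 0 is treated as a no-op here, because the text above does not assign it any special meaning.

```c
#include <stdint.h>
#include <string.h>

/* Stand-in for PALETTEENTRY (assumed layout: red, green, blue, flags). */
typedef struct {
    uint8_t peRed, peGreen, peBlue, peFlags;
} PaletteEntry;

/* Hypothetical helper: apply one palette change to a 256-entry palette.
 * entries points to bNumEntries new palette entries.
 * Returns 0 on success, -1 if the range runs past the last entry. */
int apply_pal_change(PaletteEntry palette[256],
                     uint8_t bFirstEntry, uint8_t bNumEntries,
                     const PaletteEntry *entries)
{
    unsigned first = bFirstEntry;
    unsigned count = bNumEntries;

    if (count == 0)
        return 0;                 /* nothing to change */
    if (first + count > 256)
        return -1;                /* would write past the palette */

    memcpy(&palette[first], entries, count * sizeof(PaletteEntry));
    return 0;
}
```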
8.6.5 AVIINDEXENTRY Data Structure
The AVI file index consists of an array of AVIINDEXENTRY data structures contained in an 'idx1' chunk at the end of the AVI file. The index chunk contains a list of the data chunks and their locations in the file, and follows the main LIST 'movi' chunk.
Example 8-15 shows the AVIINDEXENTRY data structure definition.
Example 8-15 AVIINDEXENTRY Data Structure Definition

    typedef struct {
        DWORD ckid;           /* chunk ID of data chunk */
        DWORD dwFlags;        /* information about the data chunk */
        DWORD dwChunkOffset;  /* file position of the data chunk */
        DWORD dwChunkLength;  /* length of the data chunk */
    } AVIINDEXENTRY;
The AVIINDEXENTRY data structure has the following fields:
ckid
Specifies a four-character code corresponding to the chunk ID of a data
chunk in the file.
dwFlags
Specifies any applicable flags. The flags in the low-order word are
reserved for AVI, and those in the high-order word can be used for
stream information and information specific to a compressor or
decompressor. The following flags are defined:
AVIIF_LIST
Indicates that the specified chunk is a LIST chunk and that the ckid field contains the list type of the chunk.
AVIIF_KEYFRAME
Indicates that the chunk is a key frame and does not require additional preceding chunks to be properly decoded.
AVIIF_FIRSTPART
Indicates that the chunk needs the frames following it to be used; it cannot stand alone.
AVIIF_LASTPART
Indicates that the chunk needs the frames preceding it to be used; it cannot stand alone.
AVIIF_NOTIME
Indicates that the chunk has no effect on timing or calculating time values based on the number of chunks. For example, set this flag for palette change chunks in a video stream so that they are not counted as taking up an entire frame's time.
dwChunkOffset
Specifies the position in the file of the specified chunk. The position
value includes the 8-byte RIFF header.
dwChunkLength
Specifies the length of the specified chunk. The length value does not
include the 8-byte RIFF header.
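One common use of the index is seeking: to display an arbitrary frame, a player first locates the nearest preceding key frame for that stream. The sketch below shows that search over an in-memory copy of the index. The numeric value of AVIIF_KEYFRAME is not given in this section, so the value used here (0x10, as commonly defined in Microsoft's AVI headers) is an assumption. The type `AviIndexEntry` and the function name `find_prev_keyframe` are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Local mirror of AVIINDEXENTRY using fixed-width types. */
typedef struct {
    uint32_t ckid;
    uint32_t dwFlags;
    uint32_t dwChunkOffset;
    uint32_t dwChunkLength;
} AviIndexEntry;

/* Assumed flag value; not stated in the text above. */
#define AVIIF_KEYFRAME 0x00000010u

/* Hypothetical helper: return the position in idx[] of the last
 * key-frame entry with the given chunk ID at or before idx[target],
 * or -1 if there is none. */
long find_prev_keyframe(const AviIndexEntry *idx, size_t count,
                        uint32_t ckid, size_t target)
{
    size_t i;

    if (count == 0)
        return -1;
    if (target >= count)
        target = count - 1;

    /* Walk backward from target down to entry 0. */
    for (i = target + 1; i-- > 0; ) {
        if (idx[i].ckid == ckid && (idx[i].dwFlags & AVIIF_KEYFRAME) != 0)
            return (long)i;
    }
    return -1;
}
```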
8.6.6 Other Data Chunks
Add a 'JUNK' chunk to align data in the AVI file. This chunk is a standard RIFF type. Applications reading these chunks ignore their contents. Files played from CD-ROM use these chunks to align data so they can be read more efficiently. For example, use a 'JUNK' chunk to align data for the 2-kB CD-ROM boundaries.
Example 8-16 shows the 'JUNK' chunk format.
Example 8-16 'JUNK' Chunk Format

    AVI Padding 'JUNK'
                Byte data[]
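The size of the padding needed for the 2-kB alignment mentioned above can be worked out from the current file position. The sketch assumes the standard 8-byte RIFF chunk header (a four-character code plus a 4-byte size) and computes how many data bytes the 'JUNK' chunk must carry so that whatever follows it starts on the boundary. The function name `junk_data_size` is hypothetical.

```c
#include <stdint.h>

#define RIFF_HEADER_SIZE 8u   /* four-character code + 4-byte size */

/* Hypothetical helper: number of data bytes a 'JUNK' chunk starting
 * at file offset pos must contain so that the byte following the
 * chunk lies on a multiple of boundary (for example, 2048).
 * boundary must be nonzero. Call this only when pos is not already
 * aligned; otherwise no 'JUNK' chunk is needed at all. */
uint32_t junk_data_size(uint32_t pos, uint32_t boundary)
{
    uint32_t after_header = (pos + RIFF_HEADER_SIZE) % boundary;
    return (boundary - after_header) % boundary;
}
```

For example, a 'JUNK' chunk written at offset 100 needs 1940 data bytes, since 100 + 8 + 1940 = 2048.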
As with any other RIFF files, an application that reads AVI files must
ignore the non-AVI chunks that it does not recognize. An application
that reads and writes AVI files must preserve the non-AVI chunks when
it saves files it has loaded.
8.7 Special Information for Interleaved Files
Files that are interleaved for playback from CD-ROM need some special handling. They can be read like any other AVI files but require special care when produced.
The audio data has to be separated into single-frame pieces, and the audio and video for each frame need to be grouped into record ('rec') chunks. The record chunks must be padded so that their sizes are multiples of 2 kB and so that the beginning of the actual data in the LIST chunk lies on a 2-kB boundary in the file.
To give the audio driver enough audio to work with, the audio data has to be skewed ahead of the video data. Typically, the audio data is moved forward enough frames to allow approximately 0.75 seconds of audio data to be preloaded. Set the dwInitialFrames field in the main header (MainAVIHeader data structure) and the dwInitialFrames field in the audio stream header (AVIStreamHeader data structure) to the number of frames the audio is skewed ahead of the video.
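The number of frames to skew can be estimated from the frame duration. The sketch below rounds up so that at least the requested amount of audio is preloaded. Rounding up is a choice made for this example, since the text only says "approximately 0.75 seconds". The function name `initial_frames` and its parameters are hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: number of frames needed to cover preload_us
 * microseconds of audio, given the duration of one frame in
 * microseconds. Rounds up. Returns 0 if us_per_frame is 0. */
uint32_t initial_frames(uint32_t us_per_frame, uint32_t preload_us)
{
    if (us_per_frame == 0)
        return 0;
    return (preload_us + us_per_frame - 1) / us_per_frame;
}
```

At 15 frames per second (about 66,667 microseconds per frame), 0.75 seconds corresponds to 11.25 frames, so this sketch yields 12 for both dwInitialFrames fields.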
Ensure that the CD-ROM drive is capable of reading the data fast enough to support the AVI sequence. CD-ROM drives that do not conform to the Multimedia PC (MPC) specification can have a data rate of less than 150 kB per second.
8.8 JPEG Data in AVI Files
This section provides specific information about the format of the JPEG data (JPEG_DIB or MJPG_DIB) that goes into the AVI '##dc' JPEG data chunks.
8.8.1 JPEG AVI RIFF Form
JPEG AVI files use the standard AVI RIFF form. The JPEG AVI file format has the same mandatory LIST chunks as any other AVI file. Example 8-17 shows the JPEG AVI RIFF form expanded with the chunks needed to complete the LIST 'hdrl' and LIST 'movi' chunks.
As defined in the AVI file format, key frames have the key frame bit set in the index flags. Because all JPEG frames are key frames, the key frame flag will always be set for all the frames in a motion JPEG AVI file.
Example 8-17 Expanded JPEG AVI RIFF Form

    RIFF ('AVI'
          LIST ('hdrl'
                'avih' (<Main AVI header>)
                LIST ('strl'
                      'strh' (<Stream header>)
                      'strf' (<Stream format>)
                      'strd' (<additional header data>)
                      . . .
                     )
                . . .
               )
          LIST ('movi'
                { '##dc' <DIB compressed>
                    Byte abJPEGdata[ ];   <JPEG image data>
                }
                . . .
                <or>
                LIST ('rec'
                      '##dc' <DIB compressed>
                      Byte abJPEGdata[ ];   <JPEG image data>
                      . . .
                     )
                . . .
               )
          ['idx1' <AVI Index>]
         )
The 'strh' chunk contains the stream header that describes the type of data the stream contains. The 'strf' chunk describes the format of the data in the stream. For the JPEG AVI case, the information in this chunk is a BITMAPINFOHEADER for JPEG data, which includes the extended bitmap information header and the JPEG-specific information header.
The 'strd' chunk contains the FOURCC ID and associated state structure containing any specific state data for initializing the identified compressor or decompressor. This is optional data and is specific to the compressor or decompressor.
All frames in the AVI file are keyframes and have a form similar to that defined for JPEG "abbreviated format for compressed image data" as specified in ISO 10918, paragraph B.4.
Following the header information is a LIST 'movi' chunk that contains chunks of the actual data in the streams; that is, the pictures and sounds themselves. The data chunks can reside directly in the LIST 'movi' chunk, or they might be grouped into 'rec' chunks as described in the AVI file format technical note. As in any RIFF chunk, a four-character code is used to identify the chunk.
8.8.2 JPEG Data
As in the JPEG DIB format, the JPEG stream syntax is used for the image data with the constraints listed in the following paragraphs. The JPEG marker codes SOI, DRI, DQT, SOF0, SOS, and EOI are mandatory in the image data chunk, and the constrained values shown in Example 8-18 are mandatory for the image data within the AVI stream.
Any parameters in the SOF0 (frame) and SOS (start of scan) headers that are duplicated in the BITMAPINFOHEADER data structure for JPEG must be the same. This includes sample precision, subsampling, number of components (as implied by the JPEGColorSpaceID field), and so on. The number of lines and samples per line in the SOF0 segment and the width and height defined in the format chunk must match the main AVI header width and height values. All of these values are expected to remain the same for every image data chunk in the AVI sequence.
Within the image data chunk, two JPEG segments beginning with the SOI marker and ending with the EOI marker may accommodate field-interleaved streams. There is an APP0 marker immediately following the SOI marker that contains information about the video image. Specifically, this allows the identification of the ODD and EVEN fields of an image for images stored in field-interleaved fashion. This APP0 marker is expected to have the first four bytes following the length bytes set to the characters A, V, I, and 1. The next byte indicates which field the JPEG data was compressed from and has an expected value of one for the first JPEG data segment and two for the second segment, indicating the ODD and EVEN fields, respectively. If the stream is not field interleaved, then this value will be 0 and there will only be one JPEG segment. The remaining seven bytes are expected to be set to 0 and will be ignored by the compressor or decompressor.
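A decoder can use this marker to tell which field a JPEG segment carries. The following sketch checks the first bytes of one segment for the SOI marker (X'FFD8'), the APP0 marker (X'FFE0'), and the "AVI1" tag, then returns the field byte. It does not validate the APP0 length field, because writers differ on that value. The function name `avi1_field_id` is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: inspect the start of one JPEG segment.
 * Returns 0 (not field interleaved), 1 (ODD field), or 2 (EVEN field),
 * or -1 if the segment does not begin with SOI followed by an APP0
 * marker tagged "AVI1" with a field byte in the range 0..2. */
int avi1_field_id(const uint8_t *seg, size_t len)
{
    /* SOI (2) + APP0 marker (2) + length (2) + "AVI1" (4) + field (1) */
    if (len < 11)
        return -1;
    if (seg[0] != 0xFF || seg[1] != 0xD8)      /* SOI  */
        return -1;
    if (seg[2] != 0xFF || seg[3] != 0xE0)      /* APP0 */
        return -1;
    if (memcmp(seg + 6, "AVI1", 4) != 0)       /* tag after length bytes */
        return -1;
    if (seg[10] > 2)
        return -1;
    return (int)seg[10];
}
```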
If a compressor or decompressor cannot handle the interleaved fields, the compressor or decompressor will use only the first (ODD) field and will replicate the lines as necessary to provide an image that conforms to the image size defined in the main AVI header. Conversely, if a capture system accesses just a single field of each source frame, only a single (ODD) field image may be present in a JPEG stream. This implies that the single (ODD) field data should be used as the source of both fields by a decompressor that wants to process full interlaced data.
It is an advantage to keep the interlace structure of all the frames in a particular motion JPEG AVI file consistent. To accomplish this, the following convention can be followed concerning the relationship of interlace structure to the biHeight value of each motion JPEG image, and hence the entire AVI sequence:
biHeight | Suggested Interlace Structure |
---|---|
<= 240 | Single JPEG data block describing entire frame. |
> 240 | A pair of half-height JPEG data blocks describing ODD and EVEN fields of the frame. (EVEN field data is optional if these blocks are identical.) |

The interlace structure and individual fields of data should be treated as an internal feature of the image data representation. The entire frame remains an indivisible unit on which editors should operate.
Example 8-18 shows what the image data chunk would look like for a noninterleaved stream.
Example 8-18 Image Data Chunk for Noninterleaved Stream

    X'FF', SOI
    X'FF', APP0, 14, "AVI1", 0, 0, 0, 0, 0, 0, 0, 0
    X'FF', DRI, length, restart interval
    X'FF', DQT, length        Lq = 67 for JPEG_Y or
                              Lq = 132 for JPEG_RGB or JPEG_YCbCr
        Precision, Table ID,  Pq = 0, Tq = 0
        DQT data [64]
        [If 3 Components
        Precision, Table ID,  Pq = 0, Tq = 1
        DQT data [64]
        ]
    X'FF', SOF0, length,
        Sample Precision      P = 8
        Number of lines       Y = biHeight
        Sample per line       X = biWidth
        Number of components  Nc = 1 or 3 (must match information from JPEGColorSpaceID)
                                         YCbCr    RGB
        1st Component parameters  C1 =   1 = Y    4 = R
        2nd Component parameters  C2 =   2 = Cb   5 = G
        3rd Component parameters  C3 =   3 = Cr   6 = B
        *
        *
    X'FF', SOS, length,
        Number of components  Ns = 1 or 3 (must match information from JPEGColorSpaceID)
                                         YCbCr    RGB
        1st Component parameters  C1 =   1 = Y    4 = R
        2nd Component parameters  C2 =   2 = Cb   5 = G
        3rd Component parameters  C3 =   3 = Cr   6 = B
        *
        *
        *
    X'FF', EOI
The Microsoft specification uses a different length value in the DQT segment. It uses 65 and 130 instead of 67 and 132, respectively. Microsoft also uses the length of 12 in the APP0 segment instead of 14. Currently, Compaq products will use both sets of values until clarification is received from Microsoft about the values.
The order in which the internal JPEG data segments (other than APP0) shown in Example 8-18 can actually occur is not restricted by this definition. See ISO 10918 for any ordering restrictions that are defined.
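The two sets of length values differ by exactly 2 in every case. One reading that reproduces both sets, offered here as an observation and not as a confirmed explanation, is that one set counts the two length bytes themselves (as the JPEG marker segment syntax does) and the other set does not. The sketch below shows the arithmetic for 8-bit quantization tables. The function names are hypothetical.

```c
/* Hypothetical helpers illustrating the segment length arithmetic.
 * Each 8-bit quantization table contributes 1 byte (Pq/Tq) plus
 * 64 table entries. The APP0 payload described in this section is
 * "AVI1" (4 bytes), a field byte (1), and 7 bytes set to 0. */
unsigned dqt_length(unsigned tables, int include_length_bytes)
{
    unsigned payload = tables * (1u + 64u);
    return payload + (include_length_bytes ? 2u : 0u);
}

unsigned app0_avi1_length(int include_length_bytes)
{
    unsigned payload = 4u + 1u + 7u;
    return payload + (include_length_bytes ? 2u : 0u);
}
```

Counting the length bytes gives 67, 132, and 14. Leaving them out gives 65, 130, and 12.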
To identify motion JPEG frames in an AVI 'movi' segment, the stream ID plus the two-character code for a compressed DIB are used, in the following format:

    DIB Bits '##dc'
             BYTE abJPEGImageData[ ];