|
@@ -0,0 +1,98 @@
|
|
|
+ The official guide to swscale for confused developers.
|
|
|
+ ========================================================
|
|
|
+
|
|
|
+Current (simplified) Architecture:
|
|
|
+---------------------------------
|
|
|
+                                 Input
+                                   v
+                           _______OR_________
+                         /                    \
+                       /                        \
+      special converter                      [Input to YUV converter]
+              |                                          |
+              |                      (8bit YUV 4:4:4 / 4:2:2 / 4:2:0 / 4:0:0 )
+              |                                          |
+              |                                          v
+              |                                  Horizontal scaler
+              |                                          |
+              |                 (15bit YUV 4:4:4 / 4:2:2 / 4:2:0 / 4:1:1 / 4:0:0 )
+              |                                          |
+              |                                          v
+              |                        Vertical scaler and output converter
+              |                                          |
+              v                                          v
+                                 output
|
|
|
+
|
|
|
+
|
|
|
+Swscale has 2 scaler paths. Each side must be capable of handling
+slices, that is, consecutive non-overlapping rectangles of dimension
+(0,slice_top) - (picture_width, slice_bottom).
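
The slice constraint can be sketched as a small check. This is only an illustration: `Slice` and `slices_cover_picture` are hypothetical names, not swscale types or functions, and the sketch assumes slices are passed in top-to-bottom order.

```c
#include <stddef.h>

/* Hypothetical description of one slice: rows [top, bottom) of the picture.
 * A slice always spans the full picture width. Not a swscale type. */
typedef struct Slice {
    int top;
    int bottom;
} Slice;

/* Returns 1 if the slices are consecutive, non-overlapping and together
 * cover rows 0..picture_height, 0 otherwise. */
int slices_cover_picture(const Slice *s, size_t n, int picture_height)
{
    int next_row = 0;

    for (size_t i = 0; i < n; i++) {
        /* Each slice must start where the previous one ended
         * and must contain at least one row. */
        if (s[i].top != next_row || s[i].bottom <= s[i].top)
            return 0;
        next_row = s[i].bottom;
    }
    return next_row == picture_height;
}
```

A 40-row picture split as rows 0-16, 16-32 and 32-40 passes this check; a split that leaves a gap or overlaps does not.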
|
|
|
+
|
|
|
+special converter
|
|
|
+ These are generally unscaled converters of common
+ formats, like YUV 4:2:0/4:2:2 -> RGB15/16/24/32. In principle, though,
+ it could also contain scalers optimized for specific common cases.
|
|
|
+
|
|
|
+Main path
|
|
|
+ The main path is used when no special converter can be used. The code
+ is designed as a destination line pull architecture. That is, for each
+ output line the vertical scaler pulls lines from a ring buffer. When a
+ line is unavailable, the ring buffer pulls it from the horizontal scaler
+ and input converter of the current slice.
+ When no more output can be generated because lines from the next slice
+ would be needed, all remaining lines in the current slice are converted,
+ horizontally scaled and put in the ring buffer.
+ [This is done for luma and chroma, each with possibly different numbers
+ of lines per picture.]
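
A toy model of the pull design may help here. It is a sketch under simplifying assumptions, not the swscale code: the "input converter + horizontal scaler" is a plain copy, the vertical filter is a fixed 2-tap average, there is a single plane, and all names (`Scaler`, `scaler_init`, `scale_slice`, `buffer_line`) are made up for the example.

```c
#include <assert.h>

#define WIDTH     4  /* samples per line in this toy example */
#define RING_SIZE 4  /* must be at least the number of vertical taps (2 here) */

/* Hypothetical scaler state; none of these names exist in swscale. */
typedef struct Scaler {
    int ring[RING_SIZE][WIDTH]; /* converted, horizontally scaled lines */
    int last_buffered;          /* newest source line in the ring, -1 if none */
    int next_out;               /* next output line to generate */
    int src_height;             /* total number of source lines */
} Scaler;

void scaler_init(Scaler *s, int src_height)
{
    s->last_buffered = -1;
    s->next_out      = 0;
    s->src_height    = src_height;
}

/* Stand-in for "input converter + horizontal scaler": a plain copy. */
static void buffer_line(Scaler *s, const unsigned char *slice,
                        int slice_top, int y)
{
    const unsigned char *in = slice + (y - slice_top) * WIDTH;

    for (int x = 0; x < WIDTH; x++)
        s->ring[y % RING_SIZE][x] = in[x];
    s->last_buffered = y;
}

/* Feed rows [slice_top, slice_bottom). Returns the number of output lines
 * written into dst, which holds the whole output picture. */
int scale_slice(Scaler *s, const unsigned char *slice,
                int slice_top, int slice_bottom, unsigned char *dst)
{
    int written = 0;

    assert(slice_top == s->last_buffered + 1); /* slices must be consecutive */

    while (s->next_out < s->src_height) {
        int y0 = s->next_out;
        int y1 = y0 + 1 < s->src_height ? y0 + 1 : y0; /* clamp at the bottom */

        if (y1 >= slice_bottom)
            break; /* a line from the next slice would be needed */

        /* Pull: only lines missing from the ring are converted. */
        while (s->last_buffered < y1)
            buffer_line(s, slice, slice_top, s->last_buffered + 1);

        /* 2-tap vertical filter (rounded average of two lines). */
        for (int x = 0; x < WIDTH; x++)
            dst[y0 * WIDTH + x] = (unsigned char)
                ((s->ring[y0 % RING_SIZE][x] +
                  s->ring[y1 % RING_SIZE][x] + 1) >> 1);

        s->next_out++;
        written++;
    }

    /* No more output possible: buffer the rest of the current slice. */
    while (s->last_buffered < slice_bottom - 1)
        buffer_line(s, slice, slice_top, s->last_buffered + 1);

    return written;
}
```

In this model, feeding a 6-line picture as one slice or as the slices 0-2, 2-3 and 3-6 produces the same output; only the number of lines emitted per call differs.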
|
|
|
+
|
|
|
+Input to YUV Converter
|
|
|
+ When the input to the main path is not planar 8bit per component YUV or
+ 8bit gray, it is converted to planar 8bit YUV. Two sets of converters
+ currently exist for this: one performs horizontal downscaling by 2
+ before the conversion, the other keeps the full chroma resolution
+ but is slightly slower. The scaler will try to preserve full chroma
+ when the output uses it. It is possible to force full chroma with
+ SWS_FULL_CHR_H_INP, even in cases where the scaler thinks it is
+ useless.
|
|
|
+
|
|
|
+Horizontal scaler
|
|
|
+ There are several horizontal scalers. A special case worth mentioning is
+ the fast bilinear scaler, which is made of runtime-generated MMX2 code
+ using specially tuned pshufw instructions.
+ The remaining scalers are specially tuned for various filter lengths.
+ They scale 8bit unsigned planar data to 16bit signed planar data.
+ Future >8bit per component inputs will need a new scaler here
+ that preserves the input precision.
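
The generic (non-fast-bilinear) case can be sketched in plain C as follows. This is an illustration, not the actual scaler: the function name and parameter layout are assumptions, and the shift by 7 is derived from the bit depths in this guide (8bit input times a coefficient with its 1.0 point at 1<<14 gives 22 bits, and 22 - 7 = 15 bits for the intermediate).

```c
#include <stdint.h>

/* Horizontal FIR scaling sketch (hypothetical name and signature).
 *
 * dst[i] = sum over j of src[filter_pos[i] + j] * filter[i * filter_size + j]
 *
 * Coefficients use 1.0 == 1 << 14. An 8bit sample times such a coefficient
 * needs 22 bits; shifting right by 7 leaves the 15bit intermediate format. */
void hscale_sketch(int16_t *dst, int dst_w, const uint8_t *src,
                   const int16_t *filter, const int *filter_pos,
                   int filter_size)
{
    for (int i = 0; i < dst_w; i++) {
        int val = 0;

        for (int j = 0; j < filter_size; j++)
            val += src[filter_pos[i] + j] * filter[i * filter_size + j];

        /* Note: >> on a negative int is implementation-defined in C;
         * common compilers perform an arithmetic shift. */
        val >>= 7;

        /* Filters with negative lobes can overshoot; keep it in int16_t. */
        if (val > 32767)
            val = 32767;
        dst[i] = (int16_t)val;
    }
}
```

With 2-tap bilinear coefficients, the source samples 0 and 255 map to 0 and 32640 (255 << 7), and the midpoint between them to 16320.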
|
|
|
+
|
|
|
+Vertical scaler and output converter
|
|
|
+ There is a large number of combined vertical scalers + output converters.
+ Some are:
|
|
|
+ * unscaled output converters
|
|
|
+ * unscaled output converters that average 2 chroma lines
|
|
|
+ * bilinear converters (C, MMX and accurate MMX)
|
|
|
+ * arbitrary filter length converters (C, MMX and accurate MMX)
|
|
|
+ And
|
|
|
+ * Plain C 8bit 4:2:2 YUV -> RGB converters using LUTs
|
|
|
+ * Plain C 17bit 4:4:4 YUV -> RGB converters using multiplies
|
|
|
+ * MMX 11bit 4:2:2 YUV -> RGB converters
|
|
|
+ * Plain C 16bit Y -> 16bit gray
|
|
|
+ ...
|
|
|
+
|
|
|
+ RGB with less than 8bit per component uses dither to improve the
|
|
|
+ subjective quality and low frequency accuracy.
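
To illustrate why dither helps low frequency accuracy, here is a minimal ordered-dither sketch for reducing an 8bit component to 5 bits. The 2x2 offset table and the function names are made up for the example; the dither tables actually used by swscale are different.

```c
#include <stdint.h>

/* 2x2 ordered-dither offsets for dropping 3 bits (8bit -> 5bit).
 * Illustrative values only; swscale uses its own dither tables. */
static const uint8_t dither_2x2[2][2] = {
    { 0, 4 },
    { 6, 2 },
};

/* Quantize one 8bit component at pixel (x, y) to 5 bits with dither. */
uint8_t quantize5_dithered(uint8_t v, int x, int y)
{
    int t = v + dither_2x2[y & 1][x & 1];

    if (t > 255)
        t = 255; /* avoid overflowing into a 6th bit */
    return (uint8_t)(t >> 3);
}

/* Same quantization without dither, for comparison. */
uint8_t quantize5_plain(uint8_t v)
{
    return (uint8_t)(v >> 3);
}
```

For a flat area of value 100 (12.5 in 5bit units), the four pixels of a 2x2 block come out as 12, 13, 13 and 12, so the block average is 12.5. Plain truncation gives 12 everywhere.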
|
|
|
+
|
|
|
+
|
|
|
+Filter coefficients:
|
|
|
+--------------------
|
|
|
+There are several different scalers (bilinear, bicubic, lanczos, area, sinc, ...).
+Their coefficients are calculated in initFilter().
+Horizontal filter coefficients have a 1.0 point at 1<<14, vertical ones at 1<<12.
+The 1.0 points have been chosen to maximize precision while leaving a
+little headroom for convolutional filters like sharpening filters and
+minimizing the SIMD instructions needed to apply them.
+It would be trivial to use a different 1.0 point if some specific scaler
+would benefit from it.
+Also, as already hinted at, initFilter() accepts an optional convolutional
+filter as input that can be used for contrast, saturation, blur, sharpening,
+shift, chroma vs. luma shift, ...
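
The vertical 1.0 point can be made concrete with a short sketch of a vertical filter writing 8bit planar output. The name and signature are assumptions for illustration. The shift follows from the numbers in this guide: a 15bit intermediate is the 8bit value shifted left by 7, a coefficient of 1.0 is 1<<12, so the product is the 8bit value shifted left by 19.

```c
#include <stdint.h>

/* Vertical FIR sketch producing 8bit output (hypothetical name/signature).
 *
 * lines[j] is the j-th 15bit intermediate line covered by the filter,
 * filter[j] its coefficient with 1.0 == 1 << 12. */
void vscale_sketch(uint8_t *dst, int width, const int16_t *const *lines,
                   const int16_t *filter, int filter_size)
{
    for (int x = 0; x < width; x++) {
        int val = 1 << 18; /* rounding term: half of 1 << 19 */

        for (int j = 0; j < filter_size; j++)
            val += lines[j][x] * filter[j];

        /* Clip to 0..255; negative sums are clipped before shifting. */
        if (val < 0)
            val = 0;
        val >>= 19;
        dst[x] = (uint8_t)(val > 255 ? 255 : val);
    }
}
```

Averaging the intermediate values 0 and 32640 with two coefficients of 2048 (0.5 each) yields 128, and averaging 32640 with itself yields 255.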
|
|
|
+
|