Lynne cyanreg

Video encoding and decoding with Vulkan Compute shaders

Video on the internet has largely become a solved problem. Most devices with a video output ship with decoding and encoding accelerator chips, and APIs such as the Vulkan® Video set of APIs allow direct access to them. Newer codecs are royalty-free with open specifications, or become royalty-free over time, making support for standards accessible to more people.

Few remember how heavily decoding 720p H.264 taxed the majority of CPUs 18 years ago, or the optimizations, the competition between software implementations, and the long road to real-time decoding before decoding hardware arrived and APIs were written.

ANSI Escape Sequences

Standard escape codes are prefixed with Escape:

Ctrl-Key: ^[
Octal: \033
Unicode: \u001b
Hexadecimal: \x1b
Decimal: 27
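As a minimal sketch of how the Escape prefix is used in practice, the C snippet below wraps text in a color sequence. The helper name `ansi_colorize` is hypothetical, and the `ESC [ <code> m` form (SGR, where 31 selects a red foreground and 0 resets attributes) is an assumption brought in for illustration; it is not part of the list above.

```c
#include <stddef.h>
#include <stdio.h>

/* The Escape byte is decimal 27. In C string literals it can be spelled
 * "\033" (octal) or "\x1b" (hexadecimal); both produce the same byte. */
#define ESC "\x1b"

/* Hypothetical helper (not from the original text): writes `text` into
 * `dst`, wrapped in an SGR sequence that sets attribute `sgr_code` and a
 * second sequence that resets attributes afterwards.
 * Returns what snprintf returns: the length the full string needs,
 * excluding the terminating NUL. */
int ansi_colorize(char *dst, size_t dst_size, int sgr_code, const char *text)
{
    return snprintf(dst, dst_size, ESC "[%dm%s" ESC "[0m", sgr_code, text);
}
```

Calling `ansi_colorize(buf, sizeof(buf), 31, "hi")` and printing `buf` to a terminal that understands these sequences should show the text in red. Keeping `ESC` as its own string literal avoids a hexadecimal escape accidentally absorbing following characters.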

	/* How to use any randomly retrieved VkImages in ffmpeg-related code. */

	/* If you don't have a device initialized, but you have a VkDevice,
	* you have to import it. */
	{
	    AVBufferRef *ctx_ref = av_hwdevice_ctx_alloc(AV_HWDEVICE_TYPE_VULKAN);
	    AVHWDeviceContext *ctx = (AVHWDeviceContext *)ctx_ref->data;
	    AVVulkanDeviceContext *hwctx = ctx->hwctx;

	    /* Mandatory. */

	/*
	* This file is part of FFmpeg.
	*
	* FFmpeg is free software; you can redistribute it and/or
	* modify it under the terms of the GNU Lesser General Public
	* License as published by the Free Software Foundation; either
	* version 2.1 of the License, or (at your option) any later version.
	*
	* FFmpeg is distributed in the hope that it will be useful,
	* but WITHOUT ANY WARRANTY; without even the implied warranty of

	;******************************************************************************
	;* Copyright (c) Lynne
	;*
	;* This file is part of FFmpeg.
	;*
	;* FFmpeg is free software; you can redistribute it and/or
	;* modify it under the terms of the GNU Lesser General Public
	;* License as published by the Free Software Foundation; either
	;* version 2.1 of the License, or (at your option) any later version.
	;*

	void forward_qmf(float *out_low, float *out_high, float *delay, const float *in,
	                 int samples, int delay_samples)
	{
	    memcpy(delay, delay + samples, delay_samples * sizeof(float));
	    memcpy(delay + delay_samples, in, samples * sizeof(float));

	    for (int i = 0; i < samples; i += 2) {
	        float low = 0.0f, high = 0.0f;

	        /* Can be done via float_dsp */

	#define BF(x, y, a, b) \
	    do { \
	        x = (a) - (b); \
	        y = (a) + (b); \
	    } while (0)

	#define BUTTERFLIES_MIX(a0,a1,a2,a3, P1, P2, P5, P6) \
	    do { \
	        r0=a0.re; \
	        i0=a0.im; \
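To illustrate what the `BF` macro above computes, here is a minimal sketch that uses it for a 2-point transform on real inputs. The helper `dft2` is hypothetical and exists only for this example; the macro body is copied from the snippet.

```c
/* Butterfly macro as in the snippet above: x receives the difference and
 * y the sum. The outputs should not alias the inputs, because x is
 * written before y is computed. */
#define BF(x, y, a, b) \
    do {               \
        x = (a) - (b); \
        y = (a) + (b); \
    } while (0)

/* Hypothetical helper (not from the original code): a 2-point DFT of two
 * real samples is a single butterfly.
 * out[0] = in[0] + in[1], out[1] = in[0] - in[1]. */
void dft2(float out[2], const float in[2])
{
    float diff, sum;

    BF(diff, sum, in[0], in[1]);
    out[0] = sum;
    out[1] = diff;
}
```

Larger power-of-two transforms are commonly built by repeating this sum/difference step on pairs of intermediate values, which is why it is convenient to keep as a macro.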

	static void fft8(void *s, FFTComplex *z, FFTComplex *temp)
	{
	    FFTSample r1 = z[0].re - z[4].re;
	    FFTSample r2 = z[0].im - z[4].im;
	    FFTSample r3 = z[1].re - z[5].re;
	    FFTSample r4 = z[1].im - z[5].im;

	    FFTSample j1 = z[2].re - z[6].re;
	    FFTSample j2 = z[2].im - z[6].im;
	    FFTSample j3 = z[3].re - z[7].re;

	// from https://gist.github.com/cyanreg/665b9c79cbe51df9296a969257f2a16c
	static void fft4(FFTComplex *z)
	{
	    FFTSample r1 = z[0].re - z[4].re;
	    FFTSample r2 = z[0].im - z[4].im;
	    FFTSample r3 = z[1].re - z[5].re;
	    FFTSample r4 = z[1].im - z[5].im;
	    /* r5-r8 second transform */

	    FFTSample t1 = z[0].re + z[4].re;

	static void fft4(void *s, FFTComplex *z, FFTComplex *temp)
	{
	    FFTSample r1 = z[0].re - z[2].re;
	    FFTSample r2 = z[0].im - z[2].im;
	    FFTSample r3 = z[1].re - z[3].re;
	    FFTSample r4 = z[1].im - z[3].im;
	    /* r5-r8 second transform */

	    FFTSample t1 = z[0].re + z[2].re;
	    FFTSample t2 = z[0].im + z[2].im;