Topic: Addressing video in wiki page "howto:transcode and convert"

Posted under Tag/Wiki Projects and Questions

I'm noticing that the wiki page howto:transcode and convert (https://e621.net/wiki_pages/34804#Video) has conflicting and missing information about the AV1 encoder it instructs you to use, SVT-AV1.

In the context of ffmpeg:

In the context of AV1:

[3] SVT AV1's CRF values, coupled with a high quality preset, should produce higher quality video than identical CRF values used with libvpx's VP9 at deadline best.

I believe that comparing CRF values across encoders is invalid and that this claim needs to be challenged. I have reason to believe improper practical tests were used to arrive at it. Deliberation below:

Why SVT-AV1 is less efficient than libaom

libaom is designed to reach the highest efficiency the AV1 codec allows, whereas SVT-AV1 is designed for parallelism and encoding at scale. SVT-AV1 trades away per-frame search (GOP placed on a timed interval rather than a scene-based GOP) to achieve faster encoding times, which can result in lower-quality frames in practice. This can be suppressed by manually setting the GOP to a lower number of frames, which in turn increases the file size but minimizes (without eliminating) the tradeoff.
That aside, "measuring CRF" between two encoders does not reflect how QP control works in these encoders AT ALL. The internal mapping from CRF to quantizer / rate control behavior differs between implementations (scale, default rate control mode, default CRF value, lookahead and TPL usage). SVT has historically used a high default CRF (35 is the default in every standard ffmpeg build), and the FFmpeg/SVT wrappers have also changed defaults over time. libaom's CRF mapping and defaults are different. You cannot treat the same numeric CRF in both encoders as the same quality.

To explain further:
libaom’s RDO and rate control aim for different tradeoffs (libaom’s slow presets push better BD-rate for many scenes). SVT-AV1’s CRF mode is implemented with SVT-AV1 rate control internals (and SVT provides CQP/VBR/CBR modes too). Because the encoders apply lookahead, TPL, and filtering differently, identical CRF produces different bit allocations and visual results.

A CRF value of 30 in libaom will not produce the same visual quality or file size as a CRF value of 30 in SVT-AV1. Do benchmark CRF points per encoder.
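For example, a minimal benchmarking sketch (assuming a short test clip test.mkv and an ffmpeg build compiled with libvmaf; the CRF value is illustrative):

ffmpeg -i test.mkv -an -c:v libaom-av1 -b:v 0 -crf 30 aom_crf30.mp4
ffmpeg -i test.mkv -an -c:v libsvtav1 -crf 30 svt_crf30.mp4
ffmpeg -i aom_crf30.mp4 -i test.mkv -lavfi libvmaf -f null -
ffmpeg -i svt_crf30.mp4 -i test.mkv -lavfi libvmaf -f null -

Compare the reported VMAF scores and the resulting file sizes rather than assuming the CRF number means the same thing in both encoders.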

If e621 intends to host "forever copies" in a space-efficient manner, why not direct users to the reference AV1 encoder (libaom-av1), the same as is done for VP9? It eliminates the quality variance SVT-AV1 produces, especially when an encoding preset is used.

It is also worth noting that novice users should have every parameter explained to them - in layman's terms or otherwise - in the context of the codec and of ffmpeg. The line

- output.mp4 -> the finished file's name. ffmpeg will notice the .mp4 in the output file name and adjust accordingly.

"adjusting accordingly" doesn't explain anything. You could elaborate by saying "ffmpeg will mux the output streams into an MP4 container" or simply "ffmpeg will save the video encoded by SVT-AV1 into your final .mp4 file".

N.B., macOS instructions are missing. If you need a starting point, here is an example:

Installation

Use of a package manager, like Homebrew, is recommended. Open a Terminal instance (Command + Space > 'Terminal.app') and paste the string

/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"

followed by Enter.
After installation, follow the instructions on the ffmpeg Homebrew Formulae

  • Open a Terminal instance (if not already open). You may do this by pressing Command + Space, then typing in Terminal.app followed by Enter.
  • Type or paste
brew install ffmpeg

followed by Enter. Wait for Homebrew to install ffmpeg.

Preparation for usage
  • Open a Terminal instance. (Command + Space > Terminal.app).
  • Optionally change the working directory. For instance, type
cd

followed by a space, then drag and drop your preferred working folder, followed by pressing Enter. You can verify your working directory by typing in

pwd

followed by Enter.
You may also drag and drop files into the terminal to insert them as a string, which may make picking input files easier.

If staff would like help re-structuring this wiki page, I'd be happy to offer suggestions or changes. The way it is written right now isn't particularly helpful or accurate, at least to me personally.
That's my two cents.
(edit: incorrect term used fixed, link fixes)


Aacafah

Moderator

Me, abadbird, & Mairo are like halfway through writing it, it's currently an incomplete rough draft. If you'd like, you can give feedback here. If you really want to go above & beyond, feel free to write out sections & say where you'd want them to go; we can take snippets & add them. We'll be working on it when we find time.

Though, if you're on our Discord server, you can DM me from there; we're a lot more active there & there's less of a worry about DText screwing with your plaintext formatting.

I don't intend to join the Discord server, much less make this a private topic, but I would be able to provide a draft wiki page as a guideline if that is satisfactory. I'll get to it at a later time, as it's too far into the night for me right now.

I'd be interested in hearing the opinions of others, too.

The main problem has been that I write stuff down as a notebook on my profile bio, then suddenly fucking everyone is linking to my profile for instructions, and then the stuff is outdated and I don't have time to update it.
That happened with howto:sites_and_sources; it moved away from my profile back in 2017 and luckily people took control of that whole thing, so I don't need to worry about it.
Same with howto:transcode and convert, but I literally just threw all the information there and haven't had time to check on it.

If you are the author and don't need to figure out the file size for a given thing and just need consistent instructions, the general rule of thumb with h264 has been 10 Mbps for 1080p 24 FPS.
However, on e621 we do not transcode the uploaded file like video hosting sites do, so generally we want as much quality as possible while keeping the video under the 100MB filesize limit. For the majority of animations, which are a few seconds long, you want to use constant quality mode, hence why with VP9 you do want -b:v 0! Also, modern codecs do not even have a true constant bitrate mode; it's all constrained or variable, and to get constant bitrate you basically have to brute-force the bitrate to be exact, and it still fluctuates.
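As a concrete sketch of that constant quality mode (filenames and the CRF value are illustrative):

ffmpeg -i input.mp4 -c:v libvpx-vp9 -b:v 0 -crf 16 -row-mt 1 -c:a libopus output.webm

With -b:v 0 the CRF value alone decides the quality and the encoder spends whatever bitrate each scene needs.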

That article and the things I have said are still mostly about WebM and especially VP9, because AV1 support took at least 5 years or so of me pestering about it.
However, the main benefit of AV1 is that creators and artists have it in their software already, whereas with WebM and VP9 you generally need to first export as something like h264 and then transcode to VP9, which is not ideal.
As for AV1 encoders, holy fucking shit libaom is actually unusably slow! libvpx is already so extremely slow that I sometimes take a whole day on longer videos, where similar settings on libaom take more than several days; it's straight up unfeasible to use.

Also, I haven't fully moved to AV1 because I did my testing with VP8 initially, then VP9, and came to my own conclusions with manual inspections; I haven't had that much time with the AV1 encoders, especially because of how slow they are and because an MP4 file cannot be viewed while it's still encoding. But with videos needing to fit into the 100MB filesize limit, SVT is slightly better than libvpx VP9 for sure. Overall, the improvement with newer stuff is that you get higher quality per bitrate, so if the filesize limit is not hit, VP9 is still sufficient, it just makes a slightly larger file.
If you google this stuff, many sources tell you to use CRF 20-ish for h264 and CRF 30-ish for AV1, but that is generally for real-life footage taken with something like a phone camera, where it's impossible to see the difference, and it doesn't take into account that fully computer-drawn or rendered material could show a visual impact. Taking this into account and taking cues from quality-focused weeb groups, a CRF as low as 12 could be warranted with h264; from my own testing, around 16 is fine in the majority of scenarios with VP9.

I cannot comment on macOS, because I think I only know of like one single person who has an Apple computer.
For Windows, I absolutely hate winget and chocolatey, they suck; I use scoop. But this is yet another reason why I hate writing these kinds of instructions or tutorials: if you are biased toward winget, then you instruct everyone to use that, even when they don't want to or even need to.

As long as the instructions aim for visually lossless, i.e. 99% of people cannot distinguish a difference in quality between the original and the transcode, then the instructions should be fine. But obviously there's other stuff still missing, like -movflags +faststart on MP4 files to make them streaming-friendly, so...
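For reference, it is just appended to whatever MP4 command is being used; a sketch (everything besides -movflags +faststart is illustrative):

ffmpeg -i input.mov -c:v libsvtav1 -crf 30 -c:a libopus -movflags +faststart output.mp4

faststart moves the moov atom to the front of the file so playback can begin before the whole file has downloaded.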

Mairo, I'm not entirely sure we're on the same page when it comes to codec features. My interest is in trying to educate users on how to optimize AV1 encoding specifically for the sake of efficiency. Let me break it down:

On the royalty-free codecs VP9 and AV1, you have the option of lossless encoding. This disables QP entirely and stores the video exactly as it is originally presented. This naturally takes up a lot of space.
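For example (a sketch; the VP9 flag is part of ffmpeg's libvpx wrapper, while the libaom variant passes the encoder's own lossless switch through -aom-params and may depend on your build):

ffmpeg -i input.mov -c:v libvpx-vp9 -lossless 1 output.webm
ffmpeg -i input.mov -c:v libaom-av1 -aom-params lossless=1 output.mp4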

Then you've got QP-based encoding. In relatively modern video codecs, this is a lossy approach to handling pixel data in a data-efficient manner.
The idea is that you'd want to optimize this type of coding as efficiently as possible without major quality loss, hence why I was so insistent in explaining that the reference encoder does not trade off quality for speed and should be considered.

The codec features I'm interested in having users optimize for are for the sake of bandwidth. This will help people with weaker internet connections, or people who load data all the way from the other side of the world (that's me!).

I understand your concern about encoding time, especially on older architectures, so I understand wanting to use SVT over AOM. I'm considering researching how to tune SVT to behave more like AOM, so that there is less concern about quality and sizes.
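As a sketch of the direction I mean (a starting point, not a validated recipe), lowering the preset and using the VQ tune brings SVT-AV1 closer to libaom's behavior at the cost of encoding speed:

ffmpeg -i input.mov -c:v libsvtav1 -preset 2 -crf 30 -svtav1-params tune=0 -pix_fmt yuv420p -c:a libopus output.mp4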

I'll write up a proposal for a different wiki page, but this will take some time. I intend to write explanations in layman's terms, so I'll need to structure it as such too. I'll post again once I have a draft for it.

Here is a proposal draft to change out the current transcoding instructions and guidance.

Video (and audio)

e621 accepts video uploads in the following formats:

.webm (VP9)
  • 8 or 10 bit color depth
  • YUV limited range (tv, progressive)
  • 4:2:0 pixel format
  • Optional audio in Opus (48KHz sample rate, floating point)
.mp4 (AV1)

  • 8 or 10 bit color depth
  • YUV limited range (tv, progressive)
  • 4:2:0 pixel format
  • Optional audio in Opus (48KHz sample rate, floating point)
  • Optional audio in MP3 (44.1KHz or 48KHz sample rate, floating point)

Due to these limitations, you may need to transcode your video files before you can upload them.

What are codecs?

In the context of media, a codec is a method of storing picture (or audio) data. To write picture data, you encode it to save it. To read picture data, you decode it to view it. The same applies to audio.

If you want to upload a video in a codec or container that e621 does not accept (H264, H265, RealVideo; .mkv, .avi, .raw, ProRes), you will need to transcode it. Transcoding is the process of decoding a video to raw data and then encoding it with another method (VP9, AV1; Opus, MP3).
To transcode, you will need to use a utility.

FFmpeg

FFmpeg is a cross-platform, open-source tool designed to decode, encode, and transcode video, and it is the tool we recommend for preparing site uploads. It is used from the command line, or terminal.

Installation

To get the most up-to-date version of FFmpeg, refer to the FFmpeg downloads page. If you prefer to install an older stable build, refer to these platform instructions:

Windows
  • Open a command prompt (Press Win + R > type cmd > Press Enter)
  • Install using WinGet (paste winget install --id=Gyan.FFmpeg -f > Press Enter)
  • Wait for the installation to finish; progress is shown in the command line. Verify by typing ffmpeg -version and pressing Enter.
macOS
  • Open a Terminal instance (Press Command + Spacebar > type Terminal.app > press Enter)
  • If not already installed, install Homebrew (paste /bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)" > press Enter)
  • Wait for brew to finish installing; progress is shown in the terminal instance.
  • Install ffmpeg (type/paste brew install ffmpeg > Press Enter)
  • You may be prompted to install FFmpeg dependencies. Press Y to accept this or N to abort installation
  • Wait for brew to finish installing FFmpeg. Verify the installation by typing ffmpeg -version and pressing Enter.
Linux

Please refer to the package manager in your Linux distribution. For common distros:

  • Ubuntu: sudo apt install ffmpeg
  • Fedora, CentOS, or Rocky Linux: sudo dnf install ffmpeg (yum on older releases; a full ffmpeg build may require the RPM Fusion repository)
  • Arch Linux, Manjaro: pacman -S ffmpeg as root

Getting started

You may need to familiarize yourself with the command syntax of your command prompt or terminal.
For Windows, Dell provides a beginner's guide to the Command Prompt.
For macOS, Academind has a comprehensive guide to the Terminal.

The most basic syntax of ffmpeg is as follows:

ffmpeg -i input.avi output.webm

By running this command, ffmpeg converts your input.avi to output.webm by transcoding the video codec, producing a playable .webm file.
What's important with ffmpeg is providing parameters for how exactly you'd like to encode the video. In this guide, reference commands for high-quality transcoding are given through ffmpeg parameters.
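Before choosing parameters, it helps to know what is actually in the input file. A minimal sketch using ffprobe, which ships alongside ffmpeg (input.avi is a placeholder):

ffprobe -hide_banner input.avi

This prints each stream's codec, resolution, frame rate, and audio sample rate, which tells you what needs to be transcoded.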

Transcoding to .mp4 (AV1)

libaom-av1 - Slowest transcode, highest efficiency

Frame-by-frame

ffmpeg -i [INPUT] -c:v libaom-av1 -row-mt 1 -threads N -pix_fmt yuv420p -b:v 0  -crf 15 -g 1 -keyint_min 1 -c:a libopus -b:a 128k output.mp4
  • -i - The input file to use
  • -c:v - The video codec to use
  • libaom-av1 specific parameter -b:v 0 - set the target bitrate to 0, which disables bitrate-constrained encoding so the CRF value alone controls quality
  • libaom-av1 specific parameter -row-mt 1 - enable row-based multithreading (to use multiple CPU cores on your system for video encoding)
  • libaom-av1 specific parameter -threads N - use N CPU cores on your system for row-based multithreading (use the amount available to your system)
  • -pix_fmt yuv420p - set the color space to YUV with 4:2:0 chroma subsampling for playback in web browsers
  • -crf 15 - set the constant rate factor, an abstract value that dictates constant visual quality. Decrease this value to retain more visual data (quality) at the expense of more data used. The inverse applies when this value increases
  • -g 1 - set the group-of-picture size to one frame. This disables inter-frame prediction, which is important for quality in frame-by-frame animations
  • -keyint_min 1 - Sets the interval between I-frames to 1 frame. This disables frame referencing, which is important for quality in frame-by-frame animations
  • container specific parameter -c:a libopus - the audio codec to use. Must be specified, otherwise ffmpeg defaults to AAC, which e621 does not accept

Non frame-by-frame animations

ffmpeg -i [INPUT] -c:v libaom-av1 -row-mt 1 -threads N -pix_fmt yuv420p -b:v 0 -crf 15 -c:a libopus output.mp4
  • -i - The input file to use
  • -c:v - The video codec to use
  • libaom-av1 specific parameter -b:v 0 - set the target bitrate to 0, which disables bitrate-constrained encoding so the CRF value alone controls quality
  • libaom-av1 specific parameter -row-mt 1 - enable row-based multithreading (to use multiple CPU cores on your system for video encoding)
  • libaom-av1 specific parameter -threads N - use N CPU cores on your system for row-based multithreading (use the amount available to your system)
  • -pix_fmt yuv420p - set the color space to YUV with 4:2:0 chroma subsampling for playback in web browsers
  • -crf 15 - set the constant rate factor, an abstract value that dictates constant visual quality. Decrease this value to retain more visual data (quality) at the expense of more data used. The inverse applies when this value increases
  • container specific parameter -c:a libopus - the audio codec to use. Must be specified, otherwise ffmpeg defaults to AAC, which e621 does not accept

Note: by omitting -g 1 -keyint_min 1, libaom-av1 will encode using inter-frame prediction. This saves bandwidth for videos that show movement between frames

libsvtav1 — Faster transcode, lower efficiency

Frame-by-frame

ffmpeg -i [INPUT] -c:v libsvtav1 -pix_fmt yuv420p -svtav1-params tune=0 -crf 15 -preset 4 -g 1 -c:a libopus output.mp4
  • -i - The input file to use
  • -c:v - The video codec to use
  • -pix_fmt yuv420p - set the color space to YUV with 4:2:0 chroma subsampling for playback in web browsers
  • libsvtav1 specific parameter -svtav1-params tune=0 - set the tune to VQ (visual quality), which is intended to enhance visual fidelity in SVT-AV1
  • -crf 15 - set the constant rate factor, an abstract value that dictates constant visual quality. Decrease this value to retain more visual data (quality) at the expense of more data used. The inverse applies when this value increases
  • libsvtav1 specific parameter -preset 4 - selects which set of SVT-AV1 speed/quality settings to use. It affects encoding time and encoding quality and is independent of the constant rate factor
  • -g 1 - set the group-of-picture size to one frame. This disables inter-frame prediction, which is important for quality in frame-by-frame animations
  • container specific parameter -c:a libopus - the audio codec to use. Must be specified, otherwise ffmpeg defaults to AAC, which e621 does not accept

Non frame-by-frame animations

ffmpeg -i [INPUT] -c:v libsvtav1 -pix_fmt yuv420p -svtav1-params tune=0 -crf 15 -preset 4 -c:a libopus output.mp4
  • -i - The input file to use
  • -c:v - The video codec to use
  • libsvtav1 specific parameter -svtav1-params tune=0 - set the tune to VQ (visual quality), which is intended to enhance visual fidelity in SVT-AV1
  • -pix_fmt yuv420p - set the color space to YUV with 4:2:0 chroma subsampling for playback in web browsers
  • -crf 15 - set the constant rate factor, an abstract value that dictates constant visual quality. Decrease this value to retain more visual data (quality) at the expense of more data used. The inverse applies when this value increases
  • libsvtav1 specific parameter -preset 4 - selects which set of SVT-AV1 speed/quality settings to use. It affects encoding time and encoding quality and is independent of the constant rate factor
  • container specific parameter -c:a libopus - the audio codec to use. Must be specified, otherwise ffmpeg defaults to AAC, which e621 does not accept

Note: by omitting -g 1, libsvtav1 will encode using inter-frame prediction. This saves bandwidth for videos that show movement between frames

Hardware encoders — Fastest transcode, worst efficiency

Note: Hardware encoders ignore the -g 1 parameter, meaning these encoders aren't suited for quality frame-by-frame animations.
Use of any of these encoders is best suited for motion content where speed matters more than size.

av1_vulkan

ffmpeg -i [INPUT] -c:v av1_vulkan -pix_fmt yuv420p -cq 15 -c:a libopus output.mp4

av1_amf (AMD)

ffmpeg -i [INPUT] -c:v av1_amf -pix_fmt yuv420p -rc cqp -qp_i 15 -qp_p 15 -c:a libopus output.mp4

av1_nvenc (NVIDIA Ada+)

ffmpeg -i [INPUT] -c:v av1_nvenc -pix_fmt yuv420p -rc constqp -qp 15 -c:a libopus output.mp4

av1_qsv (Intel Arc / 13th gen+ CPUs)

ffmpeg -i [INPUT] -c:v av1_qsv -pix_fmt yuv420p -global_quality 15 -c:a libopus output.mp4

Transcoding to .webm (VP9)

libvpx-vp9 — Slowest transcode, best efficiency

Frame-by-frame

ffmpeg -i [INPUT] -c:v libvpx-vp9 -row-mt 1 -threads N -pix_fmt yuv420p -b:v 0 -crf 13 -g 1 -keyint_min 1 -pass 1 -an -f null NUL && ^
ffmpeg -i [INPUT] -c:v libvpx-vp9 -row-mt 1 -threads N -pix_fmt yuv420p -b:v 0 -crf 13 -g 1 -keyint_min 1 -pass 2 output.webm
  • -i - The input file to use
  • -c:v - The video codec to use
  • libvpx-vp9 specific parameter -b:v 0 - set the target bitrate to 0, which disables bitrate-constrained encoding so the CRF value alone controls quality
  • libvpx-vp9 specific parameter -row-mt 1 - enable row-based multithreading (to use multiple CPU cores on your system for video encoding)
  • libvpx-vp9 specific parameter -threads N - use N CPU cores on your system for row-based multithreading (use the amount available to your system)
  • -pix_fmt yuv420p - set the color space to YUV with 4:2:0 chroma subsampling for playback in web browsers
  • -crf 13 - set the constant rate factor, an abstract value that dictates constant visual quality. Decrease this value to retain more visual data (quality) at the expense of more data used. The inverse applies when this value increases
  • -g 1 - set the group-of-picture size to one frame. This disables inter-frame prediction, which is important for quality in frame-by-frame animations
  • -keyint_min 1 - Sets the interval between I-frames to 1 frame. This disables frame referencing, which is important for quality in frame-by-frame animations

The above reference is a two-pass encode. Two-pass encoding enables libvpx-vp9 to drastically reduce data without sacrificing visual fidelity; should speed be preferred, you can truncate it into a single pass:

ffmpeg -i [INPUT] -c:v libvpx-vp9 -row-mt 1 -threads N -pix_fmt yuv420p -b:v 0 -crf 13 -g 1 -keyint_min 1 output.webm
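The two-pass example above uses Windows cmd syntax (NUL as the discard target, ^ for line continuation); on macOS or Linux the equivalent would be (a sketch):

ffmpeg -i [INPUT] -c:v libvpx-vp9 -row-mt 1 -threads N -pix_fmt yuv420p -b:v 0 -crf 13 -g 1 -keyint_min 1 -pass 1 -an -f null /dev/null && \
ffmpeg -i [INPUT] -c:v libvpx-vp9 -row-mt 1 -threads N -pix_fmt yuv420p -b:v 0 -crf 13 -g 1 -keyint_min 1 -pass 2 output.webm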

Non frame-by-frame animations

ffmpeg -i [INPUT] -c:v libvpx-vp9 -row-mt 1 -threads N -pix_fmt yuv420p -b:v 0 -crf 13  -pass 1 -an -f null NUL && ^
ffmpeg -i [INPUT] -c:v libvpx-vp9 -row-mt 1 -threads N -pix_fmt yuv420p -b:v 0 -crf 13 -pass 2 output.webm
  • -i - The input file to use
  • -c:v - The video codec to use
  • libvpx-vp9 specific parameter -b:v 0 - set the target bitrate to 0, which disables bitrate-constrained encoding so the CRF value alone controls quality
  • libvpx-vp9 specific parameter -row-mt 1 - enable row-based multithreading (to use multiple CPU cores on your system for video encoding)
  • libvpx-vp9 specific parameter -threads N - use N CPU cores on your system for row-based multithreading (use the amount available to your system)
  • -pix_fmt yuv420p - set the color space to YUV with 4:2:0 chroma subsampling for playback in web browsers
  • -crf 13 - set the constant rate factor, an abstract value that dictates constant visual quality. Decrease this value to retain more visual data (quality) at the expense of more data used. The inverse applies when this value increases

The above reference is a two-pass encode. Two-pass encoding enables libvpx-vp9 to drastically reduce data without sacrificing visual fidelity; should speed be preferred, you can truncate it into a single pass:

ffmpeg -i [INPUT] -c:v libvpx-vp9 -row-mt 1 -threads N -pix_fmt yuv420p -b:v 0 -crf 13 output.webm

vp9_qsv (Intel Arc/7th gen+ CPUs) — Fastest transcode, worst efficiency

Note: Hardware encoders ignore the -g 1 parameter, meaning this encoder isn't suited for quality frame-by-frame animations.
Use of this encoder is best suited for motion content where speed matters more than size.

ffmpeg -i [INPUT] -c:v vp9_qsv -pix_fmt yuv420p -global_quality 13 output.webm

Do let me know if I should change or restructure anything in this. Questions and feedback are welcome.
edit 1: added implicit -pix_fmt yuv420p parameters to all references (of course i'm the one to forget that lol); added -svtav1-params tune=0 to SVT-AV1 references


10-bit is not supported, as it still causes major viewing issues for many (see e621:supported_filetypes), although this could be changed if the situation has improved to the point that half of the comments aren't complaining that the video doesn't play for them.
I still would not instruct anyone to use winget on Windows, especially if the only thing they want or need is ffmpeg.
Also, this kinda highlights the reason why this has not been established properly: you do have to take into account what kind of input file you have. I have seen many try to make things like Windows batch files, but it falls apart because of feature creep, and demanding this level of knowledge and care from all uploading users is a bit unfeasible.

mairo said:
10-bit is not supported, as it still causes major viewing issues for many (see e621:supported_filetypes), although this could be changed if the situation has improved to the point that half of the comments aren't complaining that the video doesn't play for them.
I still would not instruct anyone to use winget on Windows, especially if the only thing they want or need is ffmpeg.
Also, this kinda highlights the reason why this has not been established properly: you do have to take into account what kind of input file you have. I have seen many try to make things like Windows batch files, but it falls apart because of feature creep, and demanding this level of knowledge and care from all uploading users is a bit unfeasible.

I will update the post accordingly to specify that only 8-bit color depth is supported. If you have other instructions for installing specifically on Windows, please post them or post suggestions for them, as this isn't really in my scope (I just run the latest publicly available GPL binary builds).

I understand your reasoning for not establishing the wiki(?) properly, hence why I hope a more refined article, like this draft, would help right this. I'm just doing as Aacafah suggested and I hope it is useful for the scope of e621 staff. If you want me to account for more codecs, I could give examples of how to do bit depth conversions or more advanced video filters for converting MJPEG-encoded files to web media with browser support. I'm just not entirely sure what scope is relevant right now, so please elaborate on that if possible.
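For instance, a 10-bit to 8-bit conversion is just the -pix_fmt flag already used in the draft; a sketch (filenames and CRF illustrative):

ffmpeg -i input_10bit.mp4 -c:v libvpx-vp9 -pix_fmt yuv420p -b:v 0 -crf 16 -c:a libopus output.webm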

Aacafah

Moderator

As for simple feedback, I'd recommend specifying what you mean by efficiency; efficiency is by definition optimizing 2 or more factors, & there's more than 2 in play:

  • Time
  • File size
  • File quality

As for scope, I'd say I'm not particularly worried about it when done right; clever usage of formatting (with links, collapsed sections, & a table of contents) can make a tangled rat's nest seem simple & straightforward (although making a simple guide that links to a more in-depth one is also a valid strategy). However, it might be wise to start by tailoring to the simplest cases & hiding unneeded info from them while referencing additional details when appropriate that users can look at as they need. One way to do that is the footnote system, though you could link those anywhere. Another is just starting simple & recopying the simpler sections with more info as you go down (& vice versa). If you'd prefer, you can try to break down sections & supplemental info as much as you'd like & I can rearrange them & add the fancy formatting to stitch them together.

I'd say the simplest cases would be:
1. A user making an audio edit of an already uploaded post: just need to know the options & the codecs to use
2. Repackaging an .mkv to the appropriate containers (no transcoding; see the sketch below).
3. Transcoding a silent video w/ no concern for file size.
4. A tie between transcoding a silent video while balancing file size & quality & transcoding a video with sound.
5. Transcoding a video with sound while balancing file size & quality.

The required knowledge gets bigger at each step, so targeting them 1 at a time might be best.
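For case 2, a remux without re-encoding would be something like this (a sketch; it only works when the streams inside the .mkv are already codecs the target container and the site accept):

ffmpeg -i input.mkv -c copy output.webm
ffmpeg -i input.mkv -c copy output.mp4

If ffmpeg refuses the copy, the streams need an actual transcode (cases 3-5).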

Thank you so much for the work you've already done, & remember you can tap out at any time; please don't feel pressured to work on this if you lose motivation, this is already very helpful.

No pressure! I just need to find time to write, and in cases where I don't fully understand something (like a lot of FFmpeg filters) I do my research.
I'll definitely get started with the simpler cases and restructure in terms of compatibility first, technicality last, as I do seem to forget that approach is much more user-friendly. I'll be sure to write conversion parameters for your sample use cases, too.
I'll flesh out what I mean by codec efficiency under a separate header. The gist of it is that certain codecs do a better job at compressing elements of pixel data (framing, grain, fidelity, effects) than others, but more on that when I find enough time to write.
Thanks a lot for your feedback, Aacafah; your insight helps break through my rather naïve approach to explaining technicality in video codecs. I will get back to this at a later time.

Something I want to interject with is that while I'm sure plenty of people would be fine with using raw FFmpeg, for each of those there are 20 more people who see a command line and tune it out for a myriad of reasons.

I think Handbrake instructions would help a LOT, both for posters that don't have access to the "master" file (either because they are downloading the video off social media or a gallery site, or they are the commissioner that got an h264 MP4 and the artist isn't willing to re-export the video) and for artists that might be willing to put in a bit more effort when exporting.

I very much agree; Handbrake is a much more user-friendly program simply because it has a GUI. For this reason I'm thinking of making Handbrake presets specifically for e621 video uploads.

Aacafah

Moderator

We plan to eventually include instructions for both; we started w/ ffmpeg b/c:

  • It's what Mairo, myself, & most other staff who handle video encoding (from what I know) use, so
    • we're most familiar with it & can write authoritatively on it
    • we can help users who get stuck halfway through far better
  • Once they get the CLI of choice open, instructions are just "put this text in (maybe changing these parts in these ways for your situation) & press enter"; dead simple & virtually impossible to screw up (if we explain things properly)
    • This is also way easier to format
  • We can't embed external images in wiki pages, so reference images would be a pain for us & users
  • Even ignoring that, GUIs change far more often & drastically than CLIs, so upkeep becomes a problem
  • We're figuring out the best way to format this as we go, & managing all of the aforementioned while doing that is far from desirable
  • The simple task of making & sharing a preset list of options for specific common tasks (which is what we're aiming for most users to be able to use in most cases) goes from "copy & paste this string" to sharing a preset file that's harder to tweak & more opaque

For svtav1, you have it on preset 4. While it's far faster to encode that way, it also gives noticeably worse results (both on size and output quality) than preset 0.
Also, for audio, when transcoding to AV1 you can just use -c:a copy when the source is also an mp4, which is most of the time.
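A sketch combining both suggestions (filenames illustrative; preset 0 is very slow):

ffmpeg -i input.mp4 -c:v libsvtav1 -preset 0 -crf 15 -pix_fmt yuv420p -svtav1-params tune=0 -c:a copy output.mp4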


mairo said:
However, the main benefit of AV1 is that creators and artists have it in their software already, whereas with WebM and VP9 you generally need to first export as something like h264 and then transcode to VP9, which is not ideal.

WTH? They don't have the ability to output a lossless format to convert from, or to use external encoders on the raw frames? It feels like that's a pretty big oversight by Adobe and others.

As for AV1 encoders, holy fucking shit libaom is actually unusably slow! libvpx is already so extremely slow that I sometimes take a whole day on longer videos, where similar settings on libaom take more than several days; it's straight up unfeasible to use.

Examples: [1] [2] [3] "In fact some parts are single-threaded only." Ouch! Well that probably explains it. People with Threadrippers getting lulzy speeds like 0.25FPS or hours for a 2-minute 1080P video IOW.

Doom9 is actually a great resource.

I cannot comment on macOS, because I think I only know of like one single person who has an Apple computer.

It's literally a BSD-family OS, but with all its special Apple quirks. I have not used one in years, nor do I need or want to, but from what I've heard and read they don't seem awfully different from Linux machines for this sort of work on the command line. I've used non-proprietary BSDs but I don't know how similar they are to Apple's.

@Mothership:
I tried to help by explaining how to narrow down the documentation shown for the options used in examples: topic #58882. And yes, Discord is where the effort of trying to document something goes to die. It's like taking knowledge behind the woodshed and shooting it. XD

TBH, I've always just used the provided Windows binary for FFmpeg on my laptop and never used an *insert package manager name* version. I've had mixed results with Python programs, where sometimes the native binary is the better option and sometimes you want to just use a provided environment. The former is kind of bulletproof assuming CPU compatibility, while the latter can be a huge PITA but is more flexible. Java is... fun. Cygwin can byte me.

Using [ section ] tags might help reduce the overwhelming effect of documentation details.

alphamule said:
WTH? They don't have the ability to output a lossless format to convert from, or to use external encoders on the raw frames? It feels like that's a pretty big oversight by Adobe and others.

The majority of creators will not export to a lossless format, but to a format they can share immediately. Hence why the extra step of exporting lossless and converting to WebM has been extremely cumbersome here, when artists could just upload their file directly to 𝕏 instead.
Most video hosting sites have much more robust systems for handling and serving videos, which requires a lot of expertise, code, and hosting expense, hence why your choices for uploading pornographic furry videos are, and have always been, extremely limited. Even though there are sites like Inkbunny and Itaku that artists could use, they also aren't video hosting sites and require specifically formatted files to fit on their site, whereas 𝕏 just takes whatever, basically.

mairo said:
The majority of creators will not export to a lossless format, but to a format they can share immediately. Hence why the extra step of exporting lossless and converting to WebM has been extremely cumbersome here, when artists could just upload their file directly to 𝕏 instead.
Most video hosting sites have much more robust systems for handling and serving videos, which requires a lot of expertise, code, and hosting expense, hence why your choices for uploading pornographic furry videos are, and have always been, extremely limited. Even though there are sites like Inkbunny and Itaku that artists could use, they also aren't video hosting sites and require specifically formatted files to fit on their site, whereas 𝕏 just takes whatever, basically.

So FFmpeg is NIH, and you have to use built-in tools or additional steps. :( I guess it makes sense not to want to waste GBs of space on unoptimal files, only to almost immediately delete them.

Original page: https://e621.net/forum_topics/59189