4. Reference

This document explains some general aspects and terms.

For the API reference, see this document.

4.1. Format spec

The desired MPEG-DASH representations, which refer to media segments of a specific format, can be selected with conditional expressions (format specs). One format spec can match one or more representations.

4.1.1. Grammar

Format specs have the following top-level grammar (defined in Lark syntax):

query: expression

?expression: ALL -> all
           | NONE -> none
           | conditional_expression
           | fallback_expression
           | piped_expression
           | "(" expression ")" -> group

conditional_expression: condition
piped_expression: expression ("|" (expression | function))+ -> pipe
fallback_expression: expression ("?:" expression)+

condition: CONDITION_STRING

function.2: FUNCTION_NAME

ALL: "all"
NONE: "none" | "''" | "\"\""
CONDITION_STRING: /[@a-z0-9_\.,<>=!:\[\]'"\s]+/i
FUNCTION_NAME: /[a-z0-9_\-]+/i

The parsing of conditional expressions is done using the pycond package.

Queries

Used to filter audio and/or video streams (MPEG-DASH representations).

Expressions

The simplest expressions are conditional ones. They can be chained, along with query functions, using the pipe operator (|) to form a piped expression. To group expressions, use parentheses (()).

Conditions

Composed of atomic conditions (atoms), which can be chained with boolean operators (and, or, …). Atoms, in turn, look up attributes (MPEG-DASH representation attributes) and compare them using conditional operators. Conditions can be grouped with brackets to create complex queries, e.g., codecs eq vp9 and [height eq 720 or height eq 1080].

Operators

Text or symbolic operators that map to Python’s standard rich-comparison methods.
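
To illustrate the mapping to rich comparisons, here is a minimal Python sketch of how atoms could be evaluated. The operator table and the evaluate_atom helper are assumptions made for illustration; they are not the pycond or ytpb implementation, and the set of operator names that pycond accepts may differ.

```python
import operator

# Hypothetical operator table: text and symbolic spellings map to the same
# rich-comparison function from the standard library.
OPERATORS = {
    "eq": operator.eq, "=": operator.eq,
    "lt": operator.lt, "<": operator.lt,
    "le": operator.le, "<=": operator.le,
    "gt": operator.gt, ">": operator.gt,
    "ge": operator.ge, ">=": operator.ge,
    "contains": operator.contains,
}


def evaluate_atom(attributes, name, op, value):
    """Look up an attribute by name and compare it with the given value."""
    return OPERATORS[op](attributes[name], value)


representation = {"codecs": "vp9", "height": 1080}

# Equivalent of: codecs eq vp9 and [height eq 720 or height eq 1080]
matched = evaluate_atom(representation, "codecs", "eq", "vp9") and (
    evaluate_atom(representation, "height", "eq", 720)
    or evaluate_atom(representation, "height", "eq", 1080)
)
print(matched)  # True
```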

Functions

Applies to the result of a query and therefore should be placed after a query expression. The available functions are best and worst. For example, quality >= 720p | best.

Fallback expressions

Several expressions can be chained with the ?: operator. Each expression is evaluated separately, and the first (leftmost) truthy expression is used. This is useful for providing one or more fallback expressions in addition to the main one. For example, (quality >= 720p and format = mp4 ?: quality >= 720p and format = webm) | best.

4.1.2. Attributes

The attributes of audio and video streams (MPEG-DASH representations) available for use in conditions are listed below.

Common

class ytpb.representations.RepresentationInfo(itag: str, mime_type: str, codecs: str, base_url: str)

Bases: object

Represents common attributes of audio and video representations.

itag: str

itag value, e.g. ‘140’, ‘247’.

mime_type: str

MIME type, e.g. ‘audio/mp4’, ‘video/webm’.

codecs: str

Codec name, e.g. ‘mp4’, ‘vp9’.

base_url: str

Segment base URL.

property type: str

An alias for a MIME type, e.g. ‘audio’, ‘video’.

property format: str

An alias for a MIME subtype, e.g. ‘mp4’, ‘webm’.
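
As an illustration of how the type and format aliases relate to mime_type, here is a minimal sketch. The RepresentationSketch class is a hypothetical stand-in written for this example, not the ytpb class itself.

```python
from dataclasses import dataclass


@dataclass
class RepresentationSketch:
    """Hypothetical stand-in for RepresentationInfo (not the ytpb class)."""

    itag: str
    mime_type: str

    @property
    def type(self) -> str:
        # Part before the slash, e.g. 'video' in 'video/webm'.
        return self.mime_type.split("/")[0]

    @property
    def format(self) -> str:
        # Part after the slash, e.g. 'webm' in 'video/webm'.
        return self.mime_type.split("/")[1]


rep = RepresentationSketch(itag="247", mime_type="video/webm")
print(rep.type, rep.format)  # video webm
```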

Audio only

class ytpb.representations.AudioRepresentationInfo(itag: str, mime_type: str, codecs: str, base_url: str, audio_sampling_rate: int)

Bases: RepresentationInfo

Represents attributes of audio representations.

audio_sampling_rate: int

Sampling rate (in Hz).

Video only

class ytpb.representations.VideoRepresentationInfo(itag: str, mime_type: str, codecs: str, base_url: str, width: int, height: int, frame_rate: int)

Bases: RepresentationInfo

Represents attributes of video representations.

width: int

Width of frame.

height: int

Height of frame.

frame_rate: int

Frames per second (FPS).

property quality: VideoQuality

Quality string (resolution and FPS), e.g. ‘720p’, ‘1080p60’.

property fps: int

An alias for frame_rate.

4.1.3. Aliases

Format spec expressions can be simplified with aliases (@alias). There are built-in aliases as well as custom, user-defined ones. Aliases also fall into two kinds: (a) those that are defined explicitly and (b) pattern aliases, which are described by Python regular expressions.

Built-in aliases

itags
  • (\d+)\b → itag eq \1

    Example: @140 → itag = 140

Formats
  • mp4 → format = mp4

  • webm → format = webm

Codecs
  • mp4a → codecs contains mp4a

  • avc1 → codecs contains avc1

  • vp9 → codecs = vp9

Qualities
  • (\d+)p\b → height = \1

    Example: @720p → height = 720

  • (\d+)p(\d+)\b → [height = \1 and frame_rate = \2]

    Example: @1080p60 → [height = 1080 and frame_rate = 60]

Qualities with operators
  • ([<>=]=?)(\d+)p\b → height \1 \2

    Example: @>=720p → height >= 720

Available operators: <, >, =, <=, >=. Note that the frame_rate part is not included.

Frame rate
  • (\d+)fps\b → frame_rate = \1

    Example: @30fps → frame_rate = 30

Named qualities
  • low → height = 144

  • medium → height = 480

  • high → height = 720

  • FHD → height = 1080

  • 2K → height = 1440

  • 4K → height = 2160

Custom aliases

Custom aliases can extend and override the built-in ones. The corresponding configuration field is general.aliases.

Here is an example of how to define aliases in config.toml:

[general.aliases]
fast = "codecs = vp9 and height = 480"

Aliases can be nested, allowing them to be reused:

preferred = "@<=1080p and @30fps"
for-mpd = "@vp9 and @preferred | best"

To define pattern aliases, use single quotes to surround alias names and values:

'(\d+)x(\d+)\b' = 'width eq \1 and height eq \2'
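
The expansion of pattern and nested aliases can be sketched with Python’s re module. This is only an illustration under stated assumptions: the token regular expression, the full-match strategy, the first-match ordering, and the expand helper are all hypothetical and may differ from how ytpb resolves aliases.

```python
import re

# A few aliases from this section: alias name or pattern -> replacement.
ALIASES = {
    r"(\d+)x(\d+)\b": r"width eq \1 and height eq \2",  # custom pattern alias
    r"(\d+)p\b": r"height = \1",
    r"(\d+)fps\b": r"frame_rate = \1",
    r"([<>=]=?)(\d+)p\b": r"height \1 \2",
    "preferred": "@<=1080p and @30fps",  # nested alias
}

# Assumption: an alias token is '@' followed by anything up to whitespace,
# a bracket, a parenthesis, or a pipe.
ALIAS_TOKEN = re.compile(r"@([^\s\[\]()|]+)")


def expand(spec, aliases=ALIASES, depth=0):
    """Replace every @alias in a format spec, resolving nested aliases."""
    if depth > 10:
        raise RecursionError("aliases are nested too deeply")

    def substitute(token):
        name = token.group(1)
        for pattern, replacement in aliases.items():
            match = re.fullmatch(pattern, name)
            if match:
                # Fill in \1, \2, ... and expand any aliases in the result.
                return expand(match.expand(replacement), aliases, depth + 1)
        return token.group(0)  # unknown aliases are left untouched

    return ALIAS_TOKEN.sub(substitute, spec)


print(expand("@1920x1080"))  # width eq 1920 and height eq 1080
print(expand("@preferred"))  # height <= 1080 and frame_rate = 30
```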

4.1.4. Practical examples

Defining fallback conditions

Let’s assume that we, as users, prefer videos based on the following criteria: (a) MP4 format in 720p or 1080p quality at 30 fps, @mp4 and [@720p or @1080p] and @30fps | best, and (b) if that condition is not met, a broader one: @mp4 and @<=1080p | best.

To demonstrate the example, we’ll look at two live streams, Stream I and Stream II, and the output of the yt-dlp --live-from-start -F command for each (the representation that satisfies the first criterion is highlighted).

For Stream I, the first format spec will match itag = 136, the best available stream, since there are no streams in higher quality:

Stream I
ID  EXT  RESOLUTION FPS │   TBR PROTO  │ VCODEC        VBR ACODEC     ABR ASR MORE INFO
...
135 mp4  854x480     30 │ 1350k dashG  │ avc1.4d401f 1350k video only         DASH video, mp4_dash
244 webm 854x480     30 │  528k dashG  │ vp9          528k video only         DASH video, webm_dash
136 mp4  1280x720    30 │ 2684k dashG  │ avc1.4d401f 2684k video only         DASH video, mp4_dash
247 webm 1280x720    30 │  733k dashG  │ vp9          733k video only         DASH video, webm_dash

For Stream II, the first condition will not be fulfilled because there are no 720p or 1080p MP4 streams at 30 fps:

Stream II
...
135 mp4  854x480     30 │ 1350k dashG │ avc1.4d401f 1350k video only          DASH video, mp4_dash
244 webm 854x480     30 │  528k dashG │ vp9          528k video only          DASH video, webm_dash
298 mp4  1280x720    60 │ 4018k dashG │ avc1.4d4020 4018k video only          DASH video, mp4_dash
302 webm 1280x720    60 │ 1276k dashG │ vp9         1276k video only          DASH video, webm_dash
299 mp4  1920x1080   60 │ 6686k dashG │ avc1.64002a 6686k video only          DASH video, mp4_dash
303 webm 1920x1080   60 │ 4816k dashG │ vp9         4816k video only          DASH video, webm_dash
308 webm 2560x1440   60 │ 9016k dashG │ vp9         9016k video only          DASH video, webm_dash

To match the second condition (itag = 299), we can make an addition to our first query by combining two conditions with the fallback operator:

(@mp4 and [@720p or @1080p] and @30fps ?: @mp4 and @<=1080p) | best

This allows us to have one query expression that works for different cases.
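
The selection logic of this example can be sketched in plain Python. The Video class, the condition functions, and in particular the ranking used for best (taller frame first, then higher frame rate) are assumptions made for illustration; they are not taken from the ytpb implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Video:
    itag: str
    format: str
    height: int
    frame_rate: int


# Video representations from the two tables above.
STREAM_I = [
    Video("135", "mp4", 480, 30), Video("244", "webm", 480, 30),
    Video("136", "mp4", 720, 30), Video("247", "webm", 720, 30),
]
STREAM_II = [
    Video("135", "mp4", 480, 30), Video("244", "webm", 480, 30),
    Video("298", "mp4", 720, 60), Video("302", "webm", 720, 60),
    Video("299", "mp4", 1080, 60), Video("303", "webm", 1080, 60),
    Video("308", "webm", 1440, 60),
]


def main_condition(v):
    # @mp4 and [@720p or @1080p] and @30fps
    return v.format == "mp4" and v.height in (720, 1080) and v.frame_rate == 30


def fallback_condition(v):
    # @mp4 and @<=1080p
    return v.format == "mp4" and v.height <= 1080


def select_best(streams, *conditions):
    """Return the best match of the first condition that matches anything."""
    for condition in conditions:
        matched = [v for v in streams if condition(v)]
        if matched:
            # Assumed ranking for `best`: height first, then frame rate.
            return max(matched, key=lambda v: (v.height, v.frame_rate))
    return None


print(select_best(STREAM_I, main_condition, fallback_condition).itag)   # 136
print(select_best(STREAM_II, main_condition, fallback_condition).itag)  # 299
```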

4.1.5. Default option values

The choice of the default --audio-format(s) and --video-format(s) option values depends on the user’s preferences (formats, qualities, etc.), needs (fast skimming or high-quality archiving), and format availability. The default, built-in values are listed below. They can be overridden as part of Configuring.

Please note that these values don’t correspond to the best quality videos and are focused on a smooth experience balancing between video quality and download speed.

[options.download]
audio_format = "itag = 140"
video_format = """\
(@avc1 and [@720p or @1080p] and @30fps ?: \
 @avc1 and @<=1080p ?: \
 @<=1080p) | best"""

[options.capture.frame]
video_format = "(@>=1080p and @30fps ?: all) | best"

[options.capture.timelapse]
video_format = "(@>=1080p and @30fps ?: all) | best"

[options.mpd.compose]
audio_formats = "itag = 140"
video_formats = """\
@vp9 and [@720p or @1080p] and @30fps ?: \
@vp9 and [@720p or @1080p]"""

4.2. Templating and context variables

Output paths can be provided as templates. We chose Jinja as the template engine: it’s versatile and expressive, and it allows users to produce very flexible outputs.

4.2.1. Quick intro

Jinja has its own detailed reference for template designers. For our needs we only need the basic features: to output variables, format values, and run some simple expressions.

Using variables

The simplest way to display template variables is to place them between the {{ }} expression delimiters:

{{ variable }}
"A variable's value"

Multiple variables can be formatted together by using (a) several expressions, (b) the standard str.format() method or the related filter, or (c) the ~ (tilde) operator.

{{ A }} and {{ B }}
{{ '{} and {}'.format(A, B) }}
{{ A ~ ' and ' ~ B }}
"Alpha and Beta"

Each command has its own context: a set of variables, such as YouTube video ID, title, start and end dates, etc. See Context variables for the list of all available variables.

Processing with filters

In most cases, you will need to format the values of variables. With filters, you can process them and change their string representation.

For example, let’s change the case of a title and strip whitespace:

{{ 'Stream title'|title|replace(' ', '') }}
"StreamTitle"

As you can see, filters can be combined and called without brackets (if there are no required arguments or no need to redefine default values).

Jinja comes with a lot of useful built-in filters. We also provide our custom filters.

Running expressions

Expressions let you work with templates in a way that is very similar to regular Python. In fact, you’re already familiar with expressions: literals are their simplest form, and the pipe (|) symbol is the operator that applies a filter.

For example, let’s keep only some part of a title with Python methods and make it titlecase again with a filter:

{{ 'Stream title - Bla bla'.split(' - ')[0]|title }}
"Stream Title"

4.2.2. Custom filters

In addition to Jinja’s built-in filters, the following custom filters are available. Each can be applied to variables of the listed types:

* * *

ytpb.cli.templating.adjust(value: str, chars: str = 'posix', length: int = 30, separator: str = ' ', break_words: bool = False) → str

Adjusts a string for use as a platform-independent filename.

The default allowed character set is POSIX-compliant (chars='posix') with ‘-’ used as the fallback symbol for separator. For other character sets ('ascii' or 'unicode'), a whitespace separator will be used.

The filter does the following:

  • Sanitize a string by removing invalid characters

  • Translate characters to allowed ones (ASCII-only or POSIX-compliant [1]) or keep them as is (chars='unicode')

  • Reduce the length to the provided value. By default, the string is truncated at word boundaries.

References

  1. https://www.gnu.org/software/automake/manual/html_node/Limitations-on-File-Names.html

Examples

  1. Allow only POSIX-compliant characters (default):

    {{ "Vidéo en direct – 24/7"|adjust }}
    "Video-en-direct--24-7"
    
  2. Truncate to a shorter length and break words:

    {{ "Vidéo en direct – 24/7"|adjust(length=12, break_words=True) }}
    "Video-en-dir"
    
  3. Use a different separator between words:

    {{ "Vidéo en direct – 24/7"|adjust(separator='_') }}
    "Video_en_direct--24-7"
    
  4. Allow only ASCII characters:

    {{ "Vidéo en direct – 24/7"|adjust('ascii') }}
    "Video en direct -- 24-7"
    
  5. Keep original (sanitized) characters and length:

    {{ "Vidéo en direct – 24/7"|adjust('unicode', length=255) }}
    "Vidéo en direct – 24-7"
    
ytpb.cli.templating.isodate(value: datetime, styles: str = 'basic,complete,hh') → str

Formats a date according to the ISO 8601 standard.

Supports complete and reduced date and time representations in basic and extended formats. Available styles: basic or extended, complete or reduced, hh or hhmm, z. Styles can be combined, separated by commas.

If no styles are provided, or not all styles are specified, the default ones will be applied separately: basic, complete, hh.
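
The style handling described above can be approximated in plain Python. This is a simplified sketch, not the ytpb implementation: the isodate_sketch name and the way an absent style falls back to its default are assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone


def isodate_sketch(value, styles=""):
    """Format an aware datetime using comma-separated ISO 8601 styles."""
    chosen = {s.strip() for s in styles.split(",") if s.strip()}
    extended = "extended" in chosen  # default: basic
    reduced = "reduced" in chosen    # default: complete
    sep = ":" if extended else ""

    date_part = value.strftime("%Y-%m-%d" if extended else "%Y%m%d")
    time_fields = ["%H", "%M"] if reduced else ["%H", "%M", "%S"]
    time_part = value.strftime(sep.join(time_fields))

    offset = value.utcoffset()
    if offset is None:
        raise ValueError("an aware datetime is required")
    if "z" in chosen and offset == timedelta(0):
        offset_part = "Z"
    else:
        sign = "-" if offset < timedelta(0) else "+"
        hours, minutes = divmod(abs(offset) // timedelta(minutes=1), 60)
        offset_part = f"{sign}{hours:02d}"  # default: hh
        if "hhmm" in chosen:
            offset_part += f"{sep}{minutes:02d}"
    return f"{date_part}T{time_part}{offset_part}"


date = datetime(2024, 1, 2, 10, 20, 0, tzinfo=timezone.utc)
print(isodate_sketch(date))                # 20240102T102000+00
print(isodate_sketch(date, "extended,z"))  # 2024-01-02T10:20:00Z
print(isodate_sketch(date, "reduced"))     # 20240102T1020+00
```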

Examples

{# Complete representation, basic format #}
{{ input_start_date|isodate }}
"20240102T102000+00"

{# Complete representation of the Zulu time, extended format #}
{{ input_start_date|isodate('extended,z') }}
"2024-01-02T10:20:00Z"

{# Reduced representation, basic format #}
{{ input_start_date|isodate('reduced') }}
"20240102T1020+00"

{# Complete representation, basic format, HHMM offset #}
{{ input_start_date|isodate('hhmm') }}
"20240102T102000+0000"

ytpb.cli.templating.utc(value: datetime) → str

Converts a date to the UTC timezone.

Example

{{ input_start_date|utc }}
"2024-01-02 10:20:30.123456+00:00"

ytpb.cli.templating.timestamp(value: datetime) → int

Converts a date to a Unix timestamp in seconds.

Example

{{ input_start_date|timestamp }}
1704190830
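
For comparison, the standard library equivalents of the utc and timestamp conversions look as follows. The input date in a UTC+01:00 timezone is a hypothetical value chosen to reproduce the outputs above.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical input date in a UTC+01:00 timezone.
local = datetime(
    2024, 1, 2, 11, 20, 30, 123456, tzinfo=timezone(timedelta(hours=1))
)

as_utc = local.astimezone(timezone.utc)
print(str(as_utc))             # 2024-01-02 10:20:30.123456+00:00
print(int(local.timestamp()))  # 1704190830
```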

ytpb.cli.templating.duration(value: timedelta, style: str = 'iso') → str

Formats a timedelta to a duration string.

Available styles: hms, iso (default), numeric.
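
The three styles can be sketched in plain Python as follows. The format_duration helper is hypothetical, and how the real filter treats zero-valued components or durations longer than a day is not covered here.

```python
from datetime import timedelta


def format_duration(value, style="iso"):
    """Format a timedelta as hours, minutes, and seconds in a given style."""
    total = int(value.total_seconds())
    hours, rest = divmod(total, 3600)
    minutes, seconds = divmod(rest, 60)
    if style == "iso":
        return f"PT{hours}H{minutes}M{seconds}S"
    if style == "hms":
        return f"{hours}h{minutes}m{seconds}s"
    if style == "numeric":
        return f"{hours:02d}:{minutes:02d}:{seconds:02d}"
    raise ValueError(f"unknown style: {style}")


value = timedelta(hours=1, minutes=20, seconds=30)
print(format_duration(value))             # PT1H20M30S
print(format_duration(value, "hms"))      # 1h20m30s
print(format_duration(value, "numeric"))  # 01:20:30
```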

Examples

{{ duration|duration }}
"PT1H20M30S"

{{ duration|duration('hms') }}
"1h20m30s"

{{ duration|duration('numeric') }}
"01:20:30"

4.2.3. Context variables

Here are the available variables that you can use in your templates. The variables are defined by contexts of the (sub-)commands:

Base contexts

class ytpb.cli.templating.MinimalOutputPathContext

Bases: TypedDict

id: str

YouTube video ID.

title: str

Video’s title.

class ytpb.cli.templating.AudioStreamOutputPathContext

Bases: TypedDict

audio_stream: AudioRepresentationInfo | None

Audio stream (representation).

class ytpb.cli.templating.VideoStreamOutputPathContext

Bases: TypedDict

video_stream: VideoRepresentationInfo | None

Video stream (representation).

class ytpb.cli.templating.IntervalOutputPathContext

Bases: TypedDict

input_start_date: datetime

Input start date.

input_end_date: datetime

Input end date.

actual_start_date: datetime

Actual start date.

actual_end_date: datetime

Actual end date.

duration: timedelta

Actual duration.

Command contexts

ytpb download
class ytpb.cli.commands.download.DownloadOutputPathContext

Bases: MinimalOutputPathContext, AudioStreamOutputPathContext, VideoStreamOutputPathContext, IntervalOutputPathContext

id: str
title: str
audio_stream: AudioRepresentationInfo | None
video_stream: VideoRepresentationInfo | None
input_start_date: datetime
input_end_date: datetime
actual_start_date: datetime
actual_end_date: datetime
duration: timedelta

ytpb capture frame
class ytpb.cli.commands.capture.CaptureOutputPathContext

Bases: MinimalOutputPathContext, VideoStreamOutputPathContext

moment_date: datetime

Date the frame was captured.

id: str
title: str
video_stream: VideoRepresentationInfo | None

ytpb capture timelapse
class ytpb.cli.commands.capture.TimelapseOutputPathContext

Bases: MinimalOutputPathContext, VideoStreamOutputPathContext, IntervalOutputPathContext

every: timedelta

Interval at which frames are captured.

id: str
title: str
video_stream: VideoRepresentationInfo | None
input_start_date: datetime
input_end_date: datetime
actual_start_date: datetime
actual_end_date: datetime
duration: timedelta

ytpb mpd compose
class ytpb.cli.commands.mpd.MPDOutputPathContext

Bases: MinimalOutputPathContext, IntervalOutputPathContext

id: str
title: str
input_start_date: datetime
input_end_date: datetime
actual_start_date: datetime
actual_end_date: datetime
duration: timedelta
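
The way command contexts combine the base contexts can be illustrated with a small TypedDict sketch. The class names below are hypothetical stand-ins, not the ytpb classes.

```python
from datetime import datetime, timedelta
from typing import TypedDict


class MinimalContext(TypedDict):
    id: str
    title: str


class IntervalContext(TypedDict):
    input_start_date: datetime
    input_end_date: datetime
    actual_start_date: datetime
    actual_end_date: datetime
    duration: timedelta


class ComposedContext(MinimalContext, IntervalContext):
    # Inherits every key from both bases, similar to MPDOutputPathContext.
    pass


# All inherited keys are available as template variables.
print(sorted(ComposedContext.__required_keys__))
```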

4.2.4. Practical examples

Let’s practice with some showcase examples.

Sanitize file paths

The default approach to output file paths is to render portable paths, i.e. paths that work across platforms. It consists of the following steps:

  • Prepare variables: Replace unsafe delimiter characters (/, |) with -, sanitize some variables (title, author, etc.)

  • Render output: Render a template string to an output file path

  • Post-process: Sanitize a rendered file path

The first and last steps are carried out automatically.

Let’s imagine that we have a live stream with the following title:

Original title
"Panorama Pointe Percée | 24/7 en direct !"

As you can see, it contains a letter with a diacritical mark as well as some unsafe characters, such as | (not portable) and / (used to separate directories). The latter are replaced with - by default:

Sanitized title
{{ title }}
"Panorama Pointe Percée - 24-7 en direct !"

Now this variable can be safely used in output file paths, but let’s adjust it further. For example, let’s get rid of whitespace (to use a path without quoting) and accept only POSIX-compliant filename characters with the adjust() filter:

{{ title|adjust }}
"Panorama-Pointe-Percee-24-7-en-direct"

And with another separator between words:

{{ title|adjust(separator='_') }}
"Panorama_Pointe_Percee_24_7_en_direct"

Of course, other ASCII characters (such as !) can be kept with chars='ascii':

{{ title|adjust('ascii', separator='_') }}
"Panorama_Pointe_Percee_24_7_en_direct_!"

In the rendered output path, any non-portable characters are removed automatically, as are trailing separators:

Sanitized file path
{{ title|adjust('ascii', separator='_') }}_?
"Panorama_Pointe_Percee_24_7_en_direct_!"

Custom format dates

While the custom isodate() filter is available, dates can also be formatted with the standard strftime() method.

Let’s take a date, convert it to UTC with the utc() filter, and then format it with a custom pattern:

{{ (input_start_date|utc).strftime('%Y%m%d_%H%M%S') }}
"20240102_102030"

Set and reuse variables

Sometimes it would be useful to set new variables. You can define a variable with the {% set ... %} statement, and use new variables later:

{% set a_title = title|adjust %}
{% set destination = '{}/{}'.format(a_title, input_start_date.strftime('%Y/%m')) %}
{{ destination ~ '/' ~ a_title ~ '_' ~ input_start_date|isodate }}
"Stream-title/2024/01/Stream-title_20240102T102030+00"

Conditionally print variables

What if you want to include some information based on a condition? Let’s try to print the ‘HD’ suffix only for HD quality representations in this example.

Set a new variable based on the result of the inline if-else expression by accessing an attribute of the video stream (VideoRepresentationInfo) object:

{% set hd_suffix = 'HD' if video_stream.height >= 720 else None %}

The output string can be composed in several ways:

{# Using multiple expressions and string concatenation #}
{{ title|adjust }}_{{ video_stream.quality }}{{ '_' ~ hd_suffix if hd_suffix }}

{# Using *string* list elements joined by the delimiter #}
{{ [title|adjust, video_stream.quality, hd_suffix]|select('string')|join('_') }}

And rendered outputs will be:

{# Some representation #}
"Stream-title_480p"

{# Another representation #}
"Stream-title_1080p60_HD"

However, as you may have noticed, the example will fail for audio-only downloads. While you can use the inline condition, in the next example we’ll see another approach based on statements.

Use condition statements

The idea is to differentiate between audio and video template strings with the help of the if statement, which makes the template much cleaner. Here’s a slightly simplified example:

{% if video_stream %}
    {{ title|adjust }}/{{ video_stream.quality }}/...
{% else %}
    {{ title|adjust }}/audio/...
{% endif %}
"Stream-title/1080p/..." or
"Stream-title/audio/..."