Discussion: Broken regex matching: links that should not match do match in my youtube-dl and gallery-dl automation script

heavyblack1:8.11.2021 13:55

Hi, I have my own automation script for batch downloading. It uses gallery-dl and youtube-dl to download links from a list, and later I want to add support for you-get. I need the links that youtube-dl supports to be passed only to youtube-dl. I looked into the youtube-dl source code, because the website lists only site names, not URLs. There I found the script that extracts the regex patterns, and I modified it to produce the function find_yt_dl_sites(lines: list) shown below. The problem is that it returns nothing, and with a debugger I found that, for some mysterious reason, I get matches for https://www.pixiv.net/ and https://www.deviantart.com/, links that youtube-dl does not support.
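The filtering described above can be sketched roughly as follows. This is a minimal illustration, not the poster's code: the function name `filter_supported` and the sample URLs are hypothetical, and the two patterns are copied from the listing in this post to stand in for the full list. It assumes each input line holds one URL. It uses `re.match`, which anchors at the start of the string, and keeps the whole URL. `re.findall`, by contrast, returns only the captured groups and returns an empty list when nothing matches.

```python
import re

# Two patterns copied from the listing in this post; a real list would hold all of them.
PATTERNS = [
    r"""https?://(?:www\.)?9gag\.com/gag/(?P<id>[^/?&#]+)""",
    r"""https?://(?:www\.)?archive\.org/(?:details|embed)/(?P<id>[^/?#&]+)""",
]

# Compile each pattern once instead of on every call.
COMPILED = [re.compile(p) for p in PATTERNS]


def filter_supported(lines):
    """Return the URLs from `lines` that match at least one pattern."""
    supported = []
    for line in lines:
        url = line.strip()  # drop the trailing newline left by file reads
        # match() only succeeds if the pattern matches from the first character.
        if url and any(rx.match(url) for rx in COMPILED):
            supported.append(url)
    return supported


urls = [
    "https://www.pixiv.net/",
    "https://www.deviantart.com/",
    "https://9gag.com/gag/abc123",
    "https://archive.org/details/some-item\n",
]
print(filter_supported(urls))
```

With these two patterns the pixiv and DeviantArt URLs are dropped and the other two are kept. Note that `re.findall(PATTERNS[0], "https://9gag.com/gag/abc123")` returns only the captured id, `['abc123']`, not the URL.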

I tried:

def find_yt_dl_sites(lines: list):
    """
    Return list of sites supported by youtube-dl
    """

    #lines = "".join(lines)

    s = []

    # Supported sites
    [s.append(re.findall(r"""https?://(?:www\.)?1tv\.ru/(?:[^/]+/)+(?P<id>[^/?#]+)""",li)) for li in lines]
    [s.append(re.findall(r"""(?x)
                        https?://
                            (?:www\.)?20min\.ch/
                            (?:
                                videotv/*\?.*?\bvid=|
                                videoplayer/videoplayer\.html\?.*?\bvideoId@
                            )
                            (?P<id>\d+)
                        """,li)) for li in lines]
    [s.append(re.findall(r"""(?x)(?:https?://)?(?:www\.)?220\.ro/(?P<category>[^/]+)/(?P<shorttitle>[^/]+)/(?P<id>[^/]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?P<domain>[^.]+\.(?:twentythree\.net|23video\.com|filmweb\.no))/v\.ihtml/player\.html\?(?P<query>.*?\bphoto(?:_|%5f)id=(?P<id>\d+).*)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?247sports\.com/Video/(?:[^/?#&]+-)?(?P<id>\d+)""",li)) for li in lines]
    [s.append(re.findall(r"""(?x)
                        https?://
                            (?P<host>
                                (?:(?:www|porno?)\.)?24video\.
                                (?:net|me|xxx|sexy?|tube|adult|site|vip)
                            )/
                            (?:
                                video/(?:(?:view|xml)/)?|
                                player/new24_play\.swf\?id=
                            )
                            (?P<id>\d+)
                        """,li)) for li in lines]
    [s.append(re.findall(r"""https?://playout\.3qsdn\.com/(?P<id>[\da-f]{8}-[\da-f]{4}-[\da-f]{4}-[\da-f]{4}-[\da-f]{12})""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?3sat\.de/(?:[^/]+/)*(?P<id>[^/?#&]+)\.html""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:(?P<kind>www|m)\.)?4tube\.com/(?:videos|embed)/(?P<id>\d+)(?:/(?P<display_id>[^/?#&]+))?""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:(?:www|player)\.)?56\.com/(?:.+?/)?(?:v_|(?:play_album.+-))(?P<textid>.+?)\.(?:html|swf)""",li)) for li in lines]
    [s.append(re.findall(r"""(?:5min:|https?://(?:[^/]*?5min\.com/|delivery\.vidible\.tv/aol)(?:(?:Scripts/PlayerSeed\.js|playerseed/?)?\?.*?playList=)?)(?P<id>\d+)""",li)) for li in lines]
    [s.append(re.findall(r"""(?:6play:|https?://(?:www\.)?(?P<domain>6play\.fr|rtlplay\.be|play\.rtl\.hr|rtlmost\.hu)/.+?-c_)(?P<id>[0-9]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?7plus\.com\.au/(?P<path>[^?]+\?.*?\bepisode-id=(?P<id>[^&#]+))""",li)) for li in lines]
    [s.append(re.findall(r"""https?://8tracks\.com/(?P<user>[^/]+)/(?P<id>[^/#]+)(?:#.*)?$""",li)) for li in lines]
    [s.append(re.findall(r"""(?:https?://)(?:www\.|)91porn\.com/.+?\?viewkey=(?P<id>[\w\d]+)""",li)) for li in lines]
    [s.append(re.findall(r"""9c9media:(?P<destination_code>[^:]+):(?P<id>\d+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?9gag\.com/gag/(?P<id>[^/?&#]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?9now\.com\.au/(?:[^/]+/){2}(?P<id>[^/?#]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?abc\.net\.au/news/(?:[^/]+/){1,2}(?P<id>\d+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://iview\.abc\.net\.au/(?:[^/]+/)*video/(?P<id>[^/?#]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://abcnews\.go\.com/(?:[^/]+/)+(?P<display_id>[0-9a-z-]+)/story\?id=(?P<id>\d+)""",li)) for li in lines]
    [s.append(re.findall(r"""(?x)
                        https?://
                            (?:
                                abcnews\.go\.com/
                                (?:
                                    (?:[^/]+/)*video/(?P<display_id>[0-9a-z-]+)-|
                                    video/(?:embed|itemfeed)\?.*?\bid=
                                )|
                                fivethirtyeight\.abcnews\.go\.com/video/embed/\d+/
                            )
                            (?P<id>\d+)
                        """,li)) for li in lines]
    [s.append(re.findall(r"""https?://(?P<site>abc(?:7(?:news|ny|chicago)?|11|13|30)|6abc)\.com(?:(?:/[^/]+)*/(?P<display_id>[^/]+))?/(?P<id>\d+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://clips\.abcotvs\.com/(?:[^/]+/)*video/(?P<id>\d+)""",li)) for li in lines]
    [s.append(re.findall(r"""^https?://(?:www\.)?academicearth\.org/playlists/(?P<id>[^?#/]+)""",li)) for li in lines]
    [s.append(re.findall(r"""(?x)
                        https?://
                            (?:
                                (?:(?:embed|www)\.)?acast\.com/|
                                play\.acast\.com/s/
                            )
                            (?P<channel>[^/]+)/(?P<id>[^/#?]+)
                        """,li)) for li in lines]
    [s.append(re.findall(r"""(?x)
                        https?://
                            (?:
                                (?:www\.)?acast\.com/|
                                play\.acast\.com/s/
                            )
                            (?P<id>[^/#?]+)
                        """,li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?animedigitalnetwork\.fr/video/[^/]+/(?P<id>\d+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://\w+\.adobeconnect\.com/(?P<id>[\w-]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://tv\.adobe\.com/(?:(?P<language>fr|de|es|jp)/)?watch/(?P<show_urlname>[^/]+)/(?P<id>[^/]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://tv\.adobe\.com/(?:(?P<language>fr|de|es|jp)/)?channel/(?P<id>[^/]+)(?:/(?P<category_urlname>[^/]+))?""",li)) for li in lines]
    [s.append(re.findall(r"""https?://tv\.adobe\.com/embed/\d+/(?P<id>\d+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://tv\.adobe\.com/(?:(?P<language>fr|de|es|jp)/)?show/(?P<id>[^/]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://video\.tv\.adobe\.com/v/(?P<id>\d+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?adultswim\.com/videos/(?P<show_path>[^/?#]+)(?:/(?P<episode_path>[^/?#]+))?""",li)) for li in lines]
    [s.append(re.findall(r"""(?x)https?://
            (?:(?:www|play|watch)\.)?
            (?P<domain>
                (?:history(?:vault)?|aetv|mylifetime|lifetimemovieclub)\.com|
                fyi\.tv
            )/(?P<id>
            shows/[^/]+/season-\d+/episode-\d+|
            (?:
                (?:movie|special)s/[^/]+|
                (?:shows/[^/]+/)?videos
            )/[^/?#&]+
        )""",li)) for li in lines]
    [s.append(re.findall(r"""(?x)https?://
            (?:(?:www|play|watch)\.)?
            (?P<domain>
                (?:history(?:vault)?|aetv|mylifetime|lifetimemovieclub)\.com|
                fyi\.tv
            )/(?:[^/]+/)*(?:list|collections)/(?P<id>[^/?#&]+)/?(?:[?#&]|$)""",li)) for li in lines]
    [s.append(re.findall(r"""(?x)https?://
            (?:(?:www|play|watch)\.)?
            (?P<domain>
                (?:history(?:vault)?|aetv|mylifetime|lifetimemovieclub)\.com|
                fyi\.tv
            )/shows/(?P<id>[^/?#&]+)/?(?:[?#&]|$)""",li)) for li in lines]
    [s.append(re.findall(r"""(?x)
                        https?://
                            (?:
                                (?:(?:live|afbbs|www)\.)?afreeca(?:tv)?\.com(?::\d+)?
                                (?:
                                    /app/(?:index|read_ucc_bbs)\.cgi|
                                    /player/[Pp]layer\.(?:swf|html)
                                )\?.*?\bnTitleNo=|
                                vod\.afreecatv\.com/PLAYER/STATION/
                            )
                            (?P<id>\d+)
                        """,li)) for li in lines]
    [s.append(re.findall(r"""https?://air\.mozilla\.org/(?P<id>[0-9a-z-]+)/?""",li)) for li in lines]
    [s.append(re.findall(r"""https?://live\.aliexpress\.com/live/(?P<id>\d+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?aljazeera\.com/(?P<type>program/[^/]+|(?:feature|video)s)/\d{4}/\d{1,2}/\d{1,2}/(?P<id>[^/?&#]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?allocine\.fr/(?:article|video|film)/(?:fichearticle_gen_carticle=|player_gen_cmedia=|fichefilm_gen_cfilm=|video-)(?P<id>[0-9]+)(?:\.html)?""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?alphaporno\.com/videos/(?P<id>[^/]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?amara\.org/(?:\w+/)?videos/(?P<id>\w+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?(?P<site>amc|bbcamerica|ifc|(?:we|sundance)tv)\.com/(?P<id>(?:movies|shows(?:/[^/]+)+)/[^/?#&]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?(?:americastestkitchen|cooks(?:country|illustrated))\.com/(?P<resource_type>episode|videos)/(?P<id>\d+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?(?P<show>americastestkitchen|cookscountry)\.com/episodes/browse/season_(?P<id>\d+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?anderetijden\.nl/programma/(?:[^/]+/)+(?P<id>[^/?#&]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?anime-on-demand\.de/anime/(?P<id>\d+)""",li)) for li in lines]
    [s.append(re.findall(r"""anvato:(?P<access_key_or_mcp>[^:]+):(?P<id>\d+)""",li)) for li in lines]
    [s.append(re.findall(r"""(?:aol-video:|https?://(?:www\.)?aol\.(?:com|ca|co\.uk|de|jp)/video/(?:[^/]+/)*)(?P<id>\d{9}|[0-9a-f]{24}|[0-9a-f]{8}-(?:[0-9a-f]{4}-){3}[0-9a-f]{12})""",li)) for li in lines]
    [s.append(re.findall(r"""(?P<base_url>https?://[^/]+\.apa\.at)/embed/(?P<id>[\da-f]{8}-[\da-f]{4}-[\da-f]{4}-[\da-f]{4}-[\da-f]{12})""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?aparat\.com/(?:v/|video/video/embed/videohash/)(?P<id>[a-zA-Z0-9]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://itunes\.apple\.com/\w{0,2}/?post/(?:id)?sa\.(?P<id>[\w-]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(www|ent)\.appledaily\.com\.tw/[^/]+/[^/]+/[^/]+/(?P<date>\d+)/(?P<id>\d+)(/.*)?""",li)) for li in lines]
    [s.append(re.findall(r"""https?://podcasts\.apple\.com/(?:[^/]+/)?podcast(?:/[^/]+){1,2}.*?\bi=(?P<id>\d+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.|movie)?trailers\.apple\.com/(?:trailers|ca)/(?P<company>[^/]+)/(?P<movie>[^/]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?trailers\.apple\.com/#section=(?P<id>justadded|exclusive|justhd|mostpopular|moviestudios)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?archive\.org/(?:details|embed)/(?P<id>[^/?#&]+)""",li)) for li in lines]
    [s.append(re.findall(r"""arcpublishing:(?P<org>[a-z]+):(?P<id>[\da-f]{8}-(?:[\da-f]{4}-){3}[\da-f]{12})""",li)) for li in lines]
    [s.append(re.findall(r"""(?P<mainurl>https?://(?:www\.)?daserste\.de/(?:[^/?#&]+/)+(?P<id>[^/?#&]+))\.html""",li)) for li in lines]
    [s.append(re.findall(r"""^https?://(?:(?:(?:www|classic)\.)?ardmediathek\.de|mediathek\.(?:daserste|rbb-online)\.de|one\.ard\.de)/(?:.*/)(?P<video_id>[0-9]+|[^0-9][^/\?]+)[^/\?]*(?:\?.*)?""",li)) for li in lines]
    [s.append(re.findall(r"""https://(?:(?:beta|www)\.)?ardmediathek\.de/(?:[^/]+/)?(?:player|live|video)/(?:[^/]+/)*(?P<id>Y3JpZDovL[a-zA-Z0-9]+)""",li)) for li in lines]
    [s.append(re.findall(r"""(?x)
                            https?://
                                (?:
                                    video\.(?:arkena|qbrick)\.com/play2/embed/player\?|
                                    play\.arkena\.com/(?:config|embed)/avp/v\d/player/media/(?P<id>[^/]+)/[^/]+/(?P<account_id>\d+)
                                )
                            """,li)) for li in lines]
    [s.append(re.findall(r"""https?://arte\.sky\.it/video/(?P<id>[^/?&#]+)""",li)) for li in lines]
    [s.append(re.findall(r"""(?x)
                        https?://
                            (?:
                                (?:www\.)?arte\.tv/(?P<lang>fr|de|en|es|it|pl)/videos|
                                api\.arte\.tv/api/player/v\d+/config/(?P<lang_2>fr|de|en|es|it|pl)
                            )
                            /(?P<id>\d{6}-\d{3}-[AF])
                        """,li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?arte\.tv/player/v\d+/index\.php\?.*?\bjson_url=.+""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?arte\.tv/(?P<lang>fr|de|en|es|it|pl)/videos/(?P<id>RC-\d{6})""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?(?P<host>(?:(?:asiancrush|yuyutv|midnightpulp)\.com|(?:cocoro|retrocrush)\.tv))/video/(?:[^/]+/)?0+(?P<id>\d+)v\b""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?(?P<host>(?:(?:asiancrush|yuyutv|midnightpulp)\.com|(?:cocoro|retrocrush)\.tv))/series/0+(?P<id>\d+)s\b""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?atresplayer\.com/[^/]+/[^/]+/[^/]+/[^/]+/(?P<display_id>.+?)_(?P<id>[0-9a-f]{24})""",li)) for li in lines]
    [s.append(re.findall(r"""https?://techchannel\.att\.com/play-video\.cfm/([^/]+/)*(?P<id>.+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?atv\.at/(?:[^/]+/){2}(?P<id>[dv]\d+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?audi-mediacenter\.com/(?:en|de)/audimediatv/(?:video/)?(?P<id>[^/?#]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?audioboom\.com/(?:boos|posts)/(?P<id>[0-9]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?audiomack\.com/song/(?P<id>[\w/-]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?audiomack\.com/album/(?P<id>[\w/-]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?(?:awaan|dcndigital)\.ae/(?:#/)?show/(?P<show_id>\d+)/[^/]+(?:/(?P<video_id>\d+)/(?P<season_id>\d+))?""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?(?:awaan|dcndigital)\.ae/(?:#/)?live/(?P<id>\d+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?(?:awaan|dcndigital)\.ae/(?:#/)?program/(?:(?P<show_id>\d+)|season/(?P<season_id>\d+))""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?(?:awaan|dcndigital)\.ae/(?:#/)?(?:video(?:/[^/]+)?|media|catchup/[^/]+/[^/]+)/(?P<id>\d+)""",li)) for li in lines]
    [s.append(re.findall(r"""(?x)
                        https?://
                            (?:www\.)?
                            (?P<host>
                                telezueri\.ch|
                                telebaern\.tv|
                                telem1\.ch
                            )/
                            [^/]+/
                            (?P<id>
                                [^/]+-(?P<article_id>\d+)
                            )
                            (?:
                                \#video=
                                (?P<kaltura_id>
                                    [_0-9a-z]+
                                )
                            )?
                        """,li)) for li in lines]
    [s.append(re.findall(r"""https?://v\.baidu\.com/(?P<type>[a-z]+)/(?P<id>\d+)\.htm""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?b-ch\.com/titles/(?P<id>\d+/\d+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://[^/]+\.bandcamp\.com/track/(?P<id>[^/?#&]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:(?P<subdomain>[^.]+)\.)?bandcamp\.com(?:/album/(?P<id>[^/?#&]+))?""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?bandcamp\.com/?\?(?:.*?&)?show=(?P<id>\d+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://bangumi\.bilibili\.com/anime/(?P<id>\d+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?bbc\.(?:com|co\.uk)/(?:[^/]+/)+(?P<id>[^/#?]+)""",li)) for li in lines]
    [s.append(re.findall(r"""(?x)
                        https?://
                            (?:www\.)?bbc\.co\.uk/
                            (?:
                                programmes/(?!articles/)|
                                iplayer(?:/[^/]+)?/(?:episode/|playlist/)|
                                music/(?:clips|audiovideo/popular)[/#]|
                                radio/player/|
                                sounds/play/|
                                events/[^/]+/play/[^/]+/
                            )
                            (?P<id>(?:[pbm][\da-z]{7}|w[\da-z]{7,14}))(?!/(?:episodes|broadcasts|clips))
                        """,li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?bbc\.co\.uk/programmes/articles/(?P<id>[a-zA-Z0-9]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?bbc\.co\.uk/iplayer/episodes/(?P<id>(?:[pbm][\da-z]{7}|w[\da-z]{7,14}))""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?bbc\.co\.uk/iplayer/group/(?P<id>(?:[pbm][\da-z]{7}|w[\da-z]{7,14}))""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?bbc\.co\.uk/programmes/(?P<id>(?:[pbm][\da-z]{7}|w[\da-z]{7,14}))/(?:episodes|broadcasts|clips)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?bbv\-tv\.net/watch/(?P<channel>[^/]+?)/(?P<id>[0-9]+)[^/]+(?:/(?P<recid>[0-9]+))?""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.|pro\.)?beatport\.com/track/(?P<display_id>[^/]+)/(?P<id>[0-9]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?beeg\.(?:com|porn(?:/video)?)/(?P<id>\d+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?behindkink\.com/(?P<year>[0-9]{4})/(?P<month>[0-9]{2})/(?P<day>[0-9]{2})/(?P<id>[^/#?_]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?bellator\.com/[^/]+/[\da-z]{6}(?:[/?#&]|$)""",li)) for li in lines]
    [s.append(re.findall(r"""(?x)https?://(?:www\.)?
            (?P<domain>
                (?:
                    ctv|
                    tsn|
                    bnn(?:bloomberg)?|
                    thecomedynetwork|
                    discovery|
                    discoveryvelocity|
                    sciencechannel|
                    investigationdiscovery|
                    animalplanet|
                    bravo|
                    mtv|
                    space|
                    etalk|
                    marilyn
                )\.ca|
                (?:much|cp24)\.com
            )/.*?(?:\b(?:vid(?:eoid)?|clipId)=|-vid|~|%7E|/(?:episode)?)(?P<id>[0-9]{6,})""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?bet\.com/(?:[^/]+/)+(?P<id>.+?)\.html""",li)) for li in lines]
    [s.append(re.findall(r"""https?://player\.bfi\.org\.uk/[^/]+/film/watch-(?P<id>[\w-]+)-online""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?bfmtv\.com/(?:[^/]+/)*[^/?&#]+_V[A-Z]-(?P<id>\d{12})\.html""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?bfmtv\.com/(?:[^/]+/)*[^/?&#]+_A[A-Z]-(?P<id>\d{12})\.html""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?bfmtv\.com/(?P<id>(?:[^/]+/)?en-direct)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?bibeltv\.de/mediathek/videos/(?:crn/)?(?P<id>\d+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?bigflix\.com/.+/(?P<id>[0-9]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?bild\.de/(?:[^/]+/)+(?P<display_id>[^/]+)-(?P<id>\d+)(?:,auto=true)?\.bild\.html""",li)) for li in lines]
    [s.append(re.findall(r"""(?x)
                        https?://
                            (?:(?:www|bangumi)\.)?
                            bilibili\.(?:tv|com)/
                            (?:
                                (?:
                                    video/[aA][vV]|
                                    anime/(?P<anime_id>\d+)/play\#
                                )(?P<id_bv>\d+)|
                                video/[bB][vV](?P<id>[^/?#&]+)
                            )
                        """,li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?bilibili\.com/audio/au(?P<id>\d+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?bilibili\.com/audio/am(?P<id>\d+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://player\.bilibili\.com/player\.html\?.*?\baid=(?P<id>\d+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:tv|www)\.biobiochile\.cl/(?:notas|noticias)/(?:[^/]+/)+(?P<id>[^/]+)\.shtml""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?biography\.com/video/(?P<id>[^/?#&]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?biqle\.(?:com|org|ru)/watch/(?P<id>-?\d+_\d+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?bitchute\.com/(?:video|embed|torrent/[^/]+)/(?P<id>[^/?#&]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?bitchute\.com/channel/(?P<id>[^/?#&]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?bleacherreport\.com/articles/(?P<id>\d+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?bleacherreport\.com/video_embed\?id=(?P<id>[0-9a-f-]{36}|\d{5})""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?bloomberg\.com/(?:[^/]+/)*(?P<id>[^/?#]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://union\.bokecc\.com/playvideo\.bo\?(?P<query>.*)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?P<host>(?:[^/]+\.)?bongacams\d*\.com)/(?P<id>[^/?&#]+)""",li)) for li in lines]
    [s.append(re.findall(r"""(?i)https?://(?:www\.)?bostonglobe\.com/.*/(?P<id>[^/]+)/\w+(?:\.html)?""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:[^.]+\.)?app\.box\.com/s/(?P<shared_name>[^/]+)/file/(?P<id>\d+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?bpb\.de/mediathek/(?P<id>[0-9]+)/""",li)) for li in lines]
    [s.append(re.findall(r"""(?P<base_url>https?://(?:www\.)?br(?:-klassik)?\.de)/(?:[a-z0-9\-_]+/)+(?P<id>[a-z0-9\-_]+)\.html""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?(?P<req_id>bravotv|oxygen)\.com/(?:[^/]+/)+(?P<id>[^/?#]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?break\.com/video/(?P<display_id>[^/]+?)(?:-(?P<id>\d+))?(?:[/?#&]|$)""",li)) for li in lines]
    [s.append(re.findall(r"""(?:https?://.*brightcove\.com/(services|viewer).*?\?|brightcove:)(?P<query>.*)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://players\.brightcove\.net/(?P<account_id>\d+)/(?P<player_id>[^/]+)_(?P<embed>[^/]+)/index\.html\?.*(?P<content_type>video|playlist)Id=(?P<video_id>\d+|ref:[^&]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?br\.de/mediathek/video/[^/?&#]*?-(?P<id>av:[0-9a-f]{24})""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?bt\.no/(?:[^/]+/)+(?P<id>[^/]+)-\d+\.html""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?bt\.no/spesial/vestlendingen/#!/(?P<id>\d+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:[^/]+\.)?businessinsider\.(?:com|nl)/(?:[^/]+/)*(?P<id>[^/?#&]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?buzzfeed\.com/[^?#]*?/(?P<id>[^?#]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?byutv\.org/(?:watch|player)/(?!event/)(?P<id>[0-9a-f-]+)(?:/(?P<display_id>[^/?#&]+))?""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?camdemy\.com/media/(?P<id>\d+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?camdemy\.com/folder/(?P<id>\d+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?cammodels\.com/cam/(?P<id>[^/?#&]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:(?:www|api)\.)?camtube\.co/recordings?/(?P<id>[^/?#&]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?camwithher\.tv/view_video\.php\?.*\bviewkey=(?P<id>\w+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:(?:www\.)?canalc2\.tv/video/|archives-canalc2\.u-strasbg\.fr/video\.asp\?.*\bidVideo=)(?P<id>\d+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?(?P<site>mycanal|piwiplus)\.fr/(?:[^/]+/)*(?P<display_id>[^?/]+)(?:\.html\?.*\bvid=|/p/)(?P<id>\d+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://mediazone\.vrt\.be/api/v1/(?P<site_id>canvas|een|ketnet|vrt(?:video|nieuws)|sporza|dako)/assets/(?P<id>[^/?#&]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?(?P<site_id>canvas|een)\.be/(?:[^/]+/)*(?P<id>[^/?#&]+)""",li)) for li in lines]
    [s.append(re.findall(r"""(?:carambatv:|https?://video1\.carambatv\.ru/v/)(?P<id>\d+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://carambatv\.ru/(?:[^/]+/)+(?P<id>[^/?#&]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?cartoonnetwork\.com/video/(?:[^/]+/)+(?P<id>[^/?#]+)-(?:clip|episode)\.html""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?cbc\.ca/(?!player/)(?:[^/]+/)+(?P<id>[^/?#]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://olympics\.cbc\.ca/video/[^/]+/(?P<id>[^/?#]+)""",li)) for li in lines]
    [s.append(re.findall(r"""(?:cbcplayer:|https?://(?:www\.)?cbc\.ca/(?:player/play/|i/caffeine/syndicate/\?mediaId=))(?P<id>\d+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:gem|watch)\.cbc\.ca/(?:[^/]+/)+(?P<id>[0-9a-f-]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://api-cbc\.cloud\.clearleap\.com/cloffice/client/web/play/?\?.*?\bcontentId=(?P<id>[\da-f]{8}-[\da-f]{4}-[\da-f]{4}-[\da-f]{4}-[\da-f]{12})""",li)) for li in lines]
    [s.append(re.findall(r"""(?:cbs:|https?://(?:www\.)?(?:(?:cbs|paramountplus)\.com/shows/[^/]+/video|colbertlateshow\.com/(?:video|podcasts))/)(?P<id>[\w-]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?(?P<site>cnet|zdnet)\.com/(?:videos|video(?:/share)?)/(?P<id>[^/?]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://[a-z]+\.cbslocal\.com/video/(?P<id>\d+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://[a-z]+\.cbslocal\.com/\d+/\d+/\d+/(?P<id>[0-9a-z-]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?cbsnews\.com/(?:news|video)/(?P<id>[\da-z_-]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?cbsnews\.com/embed/video[^#]*#(?P<id>.+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?cbsnews\.com/live/video/(?P<id>[^/?#]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?cbssports\.com/[^/]+/video/(?P<id>[^/?#&]+)""",li)) for li in lines]
    [s.append(re.findall(r"""(?ix)https?://(?:(?:www\.)?cbs|embed\.247)sports\.com/player/embed.+?
            (?:
                ids%3D(?P<id>[\da-f]{8}-(?:[\da-f]{4}-){3}[\da-f]{12})|
                pcid%3D(?P<pcid>\d+)
            )""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?ccma\.cat/(?:[^/]+/)*?(?P<type>video|audio)/(?P<id>\d+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:(?:[^/]+)\.(?:cntv|cctv)\.(?:com|cn)|(?:www\.)?ncpa-classic\.com)/(?:[^/]+/)*?(?P<id>[^/?#&]+?)(?:/index)?(?:\.s?html|[?#&]|$)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:(?:www\.)?cda\.pl/video|ebd\.cda\.pl/[0-9]+x[0-9]+)/(?P<id>[0-9a-z]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?ceskatelevize\.cz/ivysilani/(?:[^/?#&]+/)*(?P<id>[^/#?]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?ceskatelevize\.cz/porady/(?:[^/?#&]+/)*(?P<id>[^/#?]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?(?:channel9\.msdn\.com|s\.ch9\.ms)/(?P<contentpath>.+?)(?P<rss>/RSS)?/?(?:[?#&]|$)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?charlierose\.com/(?:video|episode)(?:s|/player)/(?P<id>\d+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:[^/]+\.)?chaturbate\.com/(?:fullvideo/?\?.*?\bb=)?(?P<id>[^/?&#]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?chilloutzone\.net/video/(?P<id>[\w|-]+)\.html""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?chirb\.it/(?:(?:wp|pl)/|fb_chirbit_player\.swf\?key=)?(?P<id>[\da-zA-Z]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?chirbit\.com/(?:rss/)?(?P<id>[^/]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?cielotv\.it/video/(?P<id>[^.]+)\.html""",li)) for li in lines]
    [s.append(re.findall(r"""https?://player\.cinchcast\.com/.*?(?:assetId|show_id)=(?P<id>[0-9]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?cinemax\.com/(?P<path>[^/]+/video/[0-9a-z-]+-(?P<id>\d+))""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?ciscolive(?:\.cisco)?\.com/(?:global/)?on-demand-library(?:\.html|/)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?ciscolive(?:\.cisco)?\.com/[^#]*#/session/(?P<id>[^/?&]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?cjsw\.com/program/(?P<program>[^/]+)/episode/(?P<id>\d+)""",li)) for li in lines]
    [s.append(re.findall(r"""(?x)https?://(?:www\.)?cliphunter\.com/w/
            (?P<id>[0-9]+)/
            (?P<seo>.+?)(?:$|[#\?])
        """,li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?clippituser\.tv/c/(?P<id>[a-z]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?clip\.rs/(?P<id>[^/]+)/\d+""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:chic|www)\.clipsyndicate\.com/video/play(list/\d+)?/(?P<id>\d+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?closertotruth\.com/(?:[^/]+/)*(?P<id>[^/?#&]+)""",li)) for li in lines]
    [s.append(re.findall(r"""(?x)
                        https?://
                            (?:
                                (?:watch\.)?(?:cloudflarestream\.com|(?:videodelivery|bytehighway)\.net)/|
                                embed\.(?:cloudflarestream\.com|(?:videodelivery|bytehighway)\.net)/embed/[^/]+\.js\?.*?\bvideo=
                            )
                            (?P<id>[\da-f]{32}|[\w-]+\.[\w-]+\.[\w-]+)
                        """,li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?cloudy\.ec/(?:v/|embed\.php\?.*?\bid=)(?P<id>[A-Za-z0-9]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?clubic\.com/video/(?:[^/]+/)*video.*-(?P<id>[0-9]+)\.html""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?clyp\.it/(?P<id>[a-z0-9]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?cmt\.com/(?:videos|shows|(?:full-)?episodes|video-clips)/(?P<id>[^/]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://video\.cnbc\.com/gallery/\?video=(?P<id>[0-9]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?cnbc\.com(?P<path>/video/(?:[^/]+/)+(?P<id>[^./?#&]+)\.html)""",li)) for li in lines]
    [s.append(re.findall(r"""(?x)https?://(?:(?P<sub_domain>edition|www|money)\.)?cnn\.com/(?:video/(?:data/.+?|\?)/)?videos?/
            (?P<path>.+?/(?P<title>[^/]+?)(?:\.(?:[a-z\-]+)|(?=&)))""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:(?:edition|www)\.)?cnn\.com/(?!videos?/)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://[^\.]+\.blogs\.cnn\.com/.+""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?cc\.com/(?:episodes|video(?:-clips)?)/(?P<id>[0-9a-z]{6})""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?comedycentral\.tv/folgen/(?P<id>[0-9a-z]{6})""",li)) for li in lines]
    [s.append(re.findall(r"""(?x)https?://(?:video|www|player(?:-backend)?)\.(?:allure|architecturaldigest|arstechnica|bonappetit|brides|cnevids|cntraveler|details|epicurious|glamour|golfdigest|gq|newyorker|self|teenvogue|vanityfair|vogue|wired|wmagazine)\.com/
            (?:
                (?:
                    embed(?:js)?|
                    (?:script|inline)/video
                )/(?P<id>[0-9a-f]{24})(?:/(?P<player_id>[0-9a-f]{24}))?(?:.+?\btarget=(?P<target>[^&]+))?|
                (?P<type>watch|series|video)/(?P<display_id>[^/?#]+)
            )""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?contv\.com/details-movie/(?P<id>[^/]+)""",li)) for li in lines]
    [s.append(re.findall(r"""(?x)
                        https?://
                            (?:www\.)?
                            (?P<domain>
                                (?:
                                    globaltv|
                                    etcanada|
                                    seriesplus|
                                    wnetwork|
                                    ytv
                                )\.com|
                                (?:
                                    hgtv|
                                    foodnetwork|
                                    slice|
                                    history|
                                    showcase|
                                    bigbrothercanada|
                                    abcspark|
                                    disney(?:channel|lachaine)
                                )\.ca
                            )
                            /(?:[^/]+/)*
                            (?:
                                video\.html\?.*?\bv=|
                                videos?/(?:[^/]+/)*(?:[a-z0-9-]+-)?
                            )
                            (?P<id>
                                [\da-f]{8}-[\da-f]{4}-[\da-f]{4}-[\da-f]{4}-[\da-f]{12}|
                                (?:[A-Z]{4})?\d{12,20}
                            )
                        """,li)) for li in lines]
    [s.append(re.findall(r"""(?:coub:|https?://(?:coub\.com/(?:view|embed|coubs)/|c-cdn\.coub\.com/fb-player\.swf\?.*\bcoub(?:ID|id)=))(?P<id>[\da-z]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?cracked\.com/video_(?P<id>\d+)_[\da-z-]+\.html""",li)) for li in lines]
    [s.append(re.findall(r"""(?:crackle:|https?://(?:(?:www|m)\.)?(?:sony)?crackle\.com/(?:playlist/\d+/|(?:[^/]+/)+))(?P<id>\d+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://embed\.crooksandliars\.com/(?:embed|v)/(?P<id>[A-Za-z0-9]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:(?P<prefix>www|m)\.)?(?P<url>crunchyroll\.(?:com|fr)/(?:media(?:-|/\?id=)|(?:[^/]*/){1,2}[^/?&]*?)(?P<video_id>[0-9]+))(?:[/?&]|$)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:(?P<prefix>www|m)\.)?(?P<url>crunchyroll\.com/(?!(?:news|anime-news|library|forum|launchcalendar|lineup|store|comics|freetrial|login|media-\d+))(?P<id>[\w\-]+))/?(?:\?|$)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?c-span\.org/video/\?(?P<id>[0-9a-f]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://news\.cts\.com\.tw/[a-z]+/[a-z]+/\d+/(?P<id>\d+)\.html""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?ctv\.ca/(?P<id>(?:show|movie)s/[^/]+/[^/?#&]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:.+?\.)?ctvnews\.ca/(?:video\?(?:clip|playlist|bin)Id=|.*?)(?P<id>[0-9.]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://cu\.ntv\.co\.jp/(?!program)(?P<id>[^/?&#]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:m\.)?culturebox\.francetvinfo\.fr/(?:[^/]+/)*(?P<id>[^/?#&]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?cultureunplugged\.com/documentary/watch-online/play/(?P<id>\d+)(?:/(?P<display_id>[^/]+))?""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:app\.)?curiositystream\.com/video/(?P<id>\d+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:app\.)?curiositystream\.com/(?:collections?|series)/(?P<id>\d+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?cw(?:tv(?:pr)?|seed)\.com/(?:shows/)?(?:[^/]+/)+[^?]*\?.*\b(?:play|watch)=(?P<id>[a-z0-9]{8}-[a-z0-9]{4}-[a-z0-9]{4}-[a-z0-9]{4}-[a-z0-9]{12})""",li)) for li in lines]
    [s.append(re.findall(r"""https?://dagelijksekost\.een\.be/gerechten/(?P<id>[^/?#&]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?dailymail\.co\.uk/(?:video/[^/]+/video-|embed/video/)(?P<id>[0-9]+)""",li)) for li in lines]
    [s.append(re.findall(r"""(?ix)
                        https?://
                            (?:
                                (?:(?:www|touch)\.)?dailymotion\.[a-z]{2,3}/(?:(?:(?:embed|swf|\#)/)?video|swf)|
                                (?:www\.)?lequipe\.fr/video
                            )
                            /(?P<id>[^/?_]+)(?:.+?\bplaylist=(?P<playlist_id>x[0-9a-z]+))?
                        """,li)) for li in lines]
    [s.append(re.findall(r"""(?:https?://)?(?:www\.)?dailymotion\.[a-z]{2,3}/playlist/(?P<id>x[0-9a-z]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?dailymotion\.[a-z]{2,3}/(?!(?:embed|swf|#|video|playlist)/)(?:(?:old/)?user/)?(?P<id>[^/]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:(?:m\.)?tvpot\.daum\.net/v/|videofarm\.daum\.net/controller/player/VodPlayer\.swf\?vid=)(?P<id>[^?#&]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:m\.)?tvpot\.daum\.net/(?:clip/ClipView.(?:do|tv)|mypot/View.do)\?.*?clipid=(?P<id>\d+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:m\.)?tvpot\.daum\.net/mypot/(?:View\.do|Top\.tv)\?.*?playlistid=(?P<id>[0-9]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:m\.)?tvpot\.daum\.net/mypot/(?:View|Top)\.(?:do|tv)\?.*?ownerid=(?P<id>[0-9a-zA-Z]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?dagbladet\.no/video/(?:(?:embed|(?P<display_id>[^/]+))/)?(?P<id>[0-9A-Za-z_-]{11}|[a-zA-Z0-9]{8})""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?dctp\.tv/(?:#/)?filme/(?P<id>[^/?#&]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?deezer\.com/playlist/(?P<id>[0-9]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://.*?\.defense\.gouv\.fr/layout/set/ligthboxvideo/base-de-medias/webtv/(?P<id>[^/?#]*)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?democracynow\.org/(?P<id>[^\?]*)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?dhm\.de/filmarchiv/(?:[^/]+/)+(?P<id>[^/]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?digg\.com/video/(?P<id>[^/?#&]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:s?evt\.dispeak|events\.digitallyspeaking)\.com/(?:[^/]+/)+xml/(?P<id>[^.]+)\.xml""",li)) for li in lines]
    [s.append(re.findall(r"""(?x)
            https?://(?:www\.)?(?:digiteka\.net|ultimedia\.com)/
            (?:
                deliver/
                (?P<embed_type>
                    generic|
                    musique
                )
                (?:/[^/]+)*/
                (?:
                    src|
                    article
                )|
                default/index/video
                (?P<site_type>
                    generic|
                    music
                )
                /id
            )/(?P<id>[\d+a-z]+)""",li)) for li in lines]
    [s.append(re.findall(r"""(?x)https?://
            (?P<site>
                go\.discovery|
                www\.
                    (?:
                        investigationdiscovery|
                        discoverylife|
                        animalplanet|
                        ahctv|
                        destinationamerica|
                        sciencechannel|
                        tlc
                    )|
                watch\.
                    (?:
                        hgtv|
                        foodnetwork|
                        travelchannel|
                        diynetwork|
                        cookingchanneltv|
                        motortrend
                    )
            )\.com/tv-shows/(?P<show_slug>[^/]+)/(?:video|full-episode)s/(?P<id>[^./?#]+)""",li)) for li in lines]
    [s.append(re.findall(r"""(?x)https?://(?:www\.)?(?:
                discovery|
                investigationdiscovery|
                discoverylife|
                animalplanet|
                ahctv|
                destinationamerica|
                sciencechannel|
                tlc|
                velocitychannel
            )go\.com/(?:[^/]+/)+(?P<id>[^/?#&]+)""",li)) for li in lines]
    [s.append(re.findall(r"""(?x)https?://(?:www\.)?(?:
                discovery|
                investigationdiscovery|
                discoverylife|
                animalplanet|
                ahctv|
                destinationamerica|
                sciencechannel|
                tlc|
                velocitychannel
            )go\.com/(?P<id>[^/?#&]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?(?P<domain>(?:tlc|dmax)\.de|dplay\.co\.uk)/(?:programme|show|sendungen)/(?P<programme>[^/]+)/(?:video/)?(?P<alternate_id>[^/]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?discoveryplus\.com/video/(?P<id>[^/]+/[^/?#]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?discoveryvr\.com/watch/(?P<id>[^/?#]+)""",li)) for li in lines]
    [s.append(re.findall(r"""(?x)
            https?://(?P<domain>(?:[^/]+\.)?(?:disney\.[a-z]{2,3}(?:\.[a-z]{2})?|disney(?:(?:me|latino)\.com|turkiye\.com\.tr|channel\.de)|(?:starwars|marvelkids)\.com))/(?:(?:embed/|(?:[^/]+/)+[\w-]+-)(?P<id>[a-z0-9]{24})|(?:[^/]+/)?(?P<display_id>[^/?#]+))""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?dlive\.tv/(?!p/)(?P<id>[\w.-]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?dlive\.tv/p/(?P<uploader_id>.+?)\+(?P<id>[^/?#&]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?dotsub\.com/view/(?P<id>[^/]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://v(?:mobile)?\.douyu\.com/show/(?P<id>[0-9a-zA-Z]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?douyu(?:tv)?\.com/(?:[^/]+/)*(?P<id>[A-Za-z0-9]+)""",li)) for li in lines]
    [s.append(re.findall(r"""(?x)https?://
            (?P<domain>
                (?:www\.)?(?P<host>d
                    (?:
                        play\.(?P<country>dk|fi|jp|se|no)|
                        iscoveryplus\.(?P<plus_country>dk|es|fi|it|se|no)
                    )
                )|
                (?P<subdomain_country>es|it)\.dplay\.com
            )/[^/]+/(?P<id>[^/]+/[^/?#]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?dr\.dk/bonanza/[^/]+/\d+/[^/]+/(?P<id>\d+)/(?P<display_id>[^/?#&]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?dropbox[.]com/sh?/(?P<id>[a-zA-Z0-9]{15})/.*""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:(?:www|m)\.)?drtuber\.com/(?:video|embed)/(?P<id>\d+)(?:/(?P<display_id>[\w-]+))?""",li)) for li in lines]
    [s.append(re.findall(r"""(?x)
                        https?://
                            (?:
                                (?:www\.)?dr\.dk/(?:tv/se|nyheder|radio(?:/ondemand)?)/(?:[^/]+/)*|
                                (?:www\.)?(?:dr\.dk|dr-massive\.com)/drtv/(?:se|episode|program)/
                            )
                            (?P<id>[\da-z_-]+)
                        """,li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?dr\.dk/(?:tv|TV)/live/(?P<id>[\da-z-]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?d\.tube/(?:#!/)?v/(?P<uploader_id>[0-9a-z.-]+)/(?P<id>[0-9a-z]{8})""",li)) for li in lines]
    [s.append(re.findall(r"""(?P<protocol>https?)://(?:(?:www|legacy)\.)?dumpert\.nl/(?:mediabase|embed|item)/(?P<id>[0-9]+[/_][0-9a-zA-Z]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://video\.aktualne\.cz/(?:[^/]+/)+r~(?P<id>[0-9a-f]{32})""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?dw\.com/(?:[^/]+/)+(?:av|e)-(?P<id>\d+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?dw\.com/(?:[^/]+/)+a-(?P<id>\d+)""",li)) for li in lines]
    [s.append(re.findall(r"""(?x)
                        (?:
                            eagleplatform:(?P<custom_host>[^/]+):|
                            https?://(?P<host>.+?\.media\.eagleplatform\.com)/index/player\?.*\brecord_id=
                        )
                        (?P<id>\d+)
                    """,li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?ebaumsworld\.com/videos/[^/]+/(?P<id>\d+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?echo\.msk\.ru/sounds/(?P<id>\d+)""",li)) for li in lines]
    [s.append(re.findall(r"""https://(?:app\.)?egghead\.io/(?:course|playlist)s/(?P<id>[^/?#&]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https://(?:app\.)?egghead\.io/(?:api/v1/)?lessons/(?P<id>[^/?#&]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?ehftv\.com/[a-z]+(?:-[a-z]+)?/[^/]+/(?P<id>[^/?#&]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?ehow\.com/[^/_?]*_(?P<id>[0-9]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?1und1\.tv/watch/(?P<channel>[^/]+?)/(?P<id>[0-9]+)[^/]+(?:/(?P<recid>[0-9]+))?""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?P<host>einthusan\.(?:tv|com|ca))/movie/watch/(?P<id>[^/?#&]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?eitb\.tv/(?:eu/bideoa|es/video)/[^/]+/\d+/(?P<id>\d+)""",li)) for li in lines]
    [s.append(re.findall(r"""(?x)
                            (?:
                                ellentube:|
                                https://api-prod\.ellentube\.com/ellenapi/api/item/
                            )
                            (?P<id>[\da-f]{8}-[\da-f]{4}-[\da-f]{4}-[\da-f]{4}-[\da-f]{12})
                        """,li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?ellentube\.com/(?:episode|studios)/(?P<id>.+?)\.html""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?ellentube\.com/video/(?P<id>.+?)\.html""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:[^.]+\.)?elpais\.com/.*/(?P<id>[^/#?]+)\.html(?:$|[?#])""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www|cdn\.)?embedly\.com/widgets/media\.html\?(?:[^#]*?&)?url=(?P<id>[^#&]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?empflix\.com/(?:videos/(?P<display_id>.+?)-|[^/]+/(?P<display_id_2>[^/]+)/video)(?P<id>[0-9]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?engadget\.com/video/(?P<id>[^/?#]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?eporner\.com/(?:(?:hd-porn|embed)/|video-)(?P<id>\w+)(?:/(?P<display_id>[\w-]+))?""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?eroprofile\.com/m/videos/view/(?P<id>[^/]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://?(?:(?:www|v1)\.)?escapistmagazine\.com/videos/view/[^/]+/(?P<id>[0-9]+)""",li)) for li in lines]
    [s.append(re.findall(r"""(?x)
                        https?://
                            (?:
                                (?:
                                    (?:
                                        (?:(?:\w+\.)+)?espn\.go|
                                        (?:www\.)?espn
                                    )\.com/
                                    (?:
                                        (?:
                                            video/(?:clip|iframe/twitter)|
                                            watch/player
                                        )
                                        (?:
                                            .*?\?.*?\bid=|
                                            /_/id/
                                        )|
                                        [^/]+/video/
                                    )
                                )|
                                (?:www\.)espnfc\.(?:com|us)/(?:video/)?[^/]+/\d+/video/
                            )
                            (?P<id>\d+)
                        """,li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:espn\.go|(?:www\.)?espn)\.com/(?:[^/]+/)*(?P<id>[^/]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://video\.esri\.com/watch/(?P<id>[0-9]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://ec\.europa\.eu/avservices/(?:video/player|audio/audioDetails)\.cfm\?.*?\bref=(?P<id>[A-Za-z0-9-]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?tvonline\.ewe\.de/watch/(?P<channel>[^/]+?)/(?P<id>[0-9]+)[^/]+(?:/(?P<recid>[0-9]+))?""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?expotv\.com/videos/[^?#]*/(?P<id>[0-9]+)($|[?#])""",li)) for li in lines]
    [s.append(re.findall(r"""(?x)
                        https?://
                            (?:www\.)?(?:expressen|di)\.se/
                            (?:(?:tvspelare/video|videoplayer/embed)/)?
                            tv/(?:[^/]+/)*
                            (?P<id>[^/?#&]+)
                        """,li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?extremetube\.com/(?:[^/]+/)?video/(?P<id>[^/#?&]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?eyedo\.tv/[^/]+/(?:#!/)?Live/Detail/(?P<id>[0-9]+)""",li)) for li in lines]
    [s.append(re.findall(r"""(?x)
                    (?:
                        https?://
                            (?:[\w-]+\.)?(?:facebook\.com|facebookcorewwwi\.onion)/
                            (?:[^#]*?\#!/)?
                            (?:
                                (?:
                                    video/video\.php|
                                    photo\.php|
                                    video\.php|
                                    video/embed|
                                    story\.php|
                                    watch(?:/live)?/?
                                )\?(?:.*?)(?:v|video_id|story_fbid)=|
                                [^/]+/videos/(?:[^/]+/)?|
                                [^/]+/posts/|
                                groups/[^/]+/permalink/|
                                watchparty/
                            )|
                        facebook:
                    )
                    (?P<id>[0-9]+)
                    """,li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:[\w-]+\.)?facebook\.com/plugins/video\.php\?.*?\bhref=(?P<id>https.+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?faz\.net/(?:[^/]+/)*.*?-(?P<id>\d+)\.html""",li)) for li in lines]
    [s.append(re.findall(r"""^(?:https?://video\.fc2\.com/(?:[^/]+/)*content/|fc2:)(?P<id>[^/]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://video\.fc2\.com/flv2\.swf\?(?P<query>.+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?fc-zenit\.ru/video/(?P<id>[0-9]+)""",li)) for li in lines]
    [s.append(re.findall(r"""(?:https?://(?:www\.)?filmon\.com/vod/view/|filmon:)(?P<id>\d+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?filmon\.com/(?:tv|channel)/(?P<id>[a-z0-9-]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?filmweb\.no/(?P<type>trailere|filmnytt)/article(?P<id>\d+)\.ece""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?fivethirtyeight\.com/features/(?P<id>[^/?#]+)""",li)) for li in lines]
    [s.append(re.findall(r"""(?x)
                        https?://
                            (?:www\.)?5-tv\.ru/
                            (?:
                                (?:[^/]+/)+(?P<id>\d+)|
                                (?P<path>[^/?#]+)(?:[/?#])?
                            )
                        """,li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.|secure\.)?flickr\.com/photos/[\w\-_@]+/(?P<id>\d+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?ft\.dk/webtv/video/[^?#]*?\.(?P<id>[0-9]+)\.aspx""",li)) for li in lines]
    [s.append(re.findall(r"""https?://footyroom\.com/matches/(?P<id>\d+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?formula1\.com/en/latest/video\.[^.]+\.(?P<id>\d+)\.html""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?fox\.com/watch/(?P<id>[\da-fA-F]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?fox9\.com/video/(?P<id>\d+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?fox9\.com/news/(?P<id>[^/?&#]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?foxgay\.com/videos/(?:\S+-)?(?P<id>\d+)\.shtml""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?P<host>video\.(?:insider\.)?fox(?:news|business)\.com)/v/(?:video-embed\.html\?video_id=)?(?P<id>\d+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?(?:insider\.)?foxnews\.com/(?!v)([^/]+/)+(?P<id>[a-z-]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?foxsports\.com/(?:[^/]+/)*video/(?P<id>\d+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://generation-what\.francetv\.fr/[^/]+/video/(?P<id>[^/?#&]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?franceculture\.fr/emissions/(?:[^/]+/)*(?P<id>[^/?#&]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?franceinter\.fr/emissions/(?P<id>[^?#]+)""",li)) for li in lines]
    [s.append(re.findall(r"""(?x)
                        (?:
                            https?://
                                sivideo\.webservices\.francetelevisions\.fr/tools/getInfosOeuvre/v2/\?
                                .*?\bidDiffusion=[^&]+|
                            (?:
                                https?://videos\.francetv\.fr/video/|
                                francetv:
                            )
                            (?P<id>[^@]+)(?:@(?P<catalog>.+))?
                        )
                        """,li)) for li in lines]
    [s.append(re.findall(r"""https?://embed\.francetv\.fr/*\?.*?\bue=(?P<id>[^&]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www|mobile|france3-regions)\.francetvinfo\.fr/(?:[^/]+/)*(?P<id>[^/?#&.]+)""",li)) for li in lines]
    [s.append(re.findall(r"""(?P<url>https?://(?:www\.)?(?:zouzous|ludo)\.fr/heros/(?P<id>[^/?#&]+))""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:(?:www\.)?france\.tv|mobile\.france\.tv)/(?:[^/]+/)*(?P<id>[^/]+)\.html""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?freesound\.org/people/[^/]+/sounds/(?P<id>[^/]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?freespeech\.org/stories/(?P<id>.+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://freshlive\.tv/[^/]+/(?P<id>\d+)""",li)) for li in lines]
    [s.append(re.findall(r"""(?:frontendmasters:|https?://api\.frontendmasters\.com/v\d+/kabuki/video/)(?P<id>[^/]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?frontendmasters\.com/courses/(?P<id>[^/]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?frontendmasters\.com/courses/(?P<course_name>[^/]+)/(?P<lesson_name>[^/]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://i\.fod\.fujitv\.co\.jp/plus7/web/[0-9a-z]{4}/(?P<id>[0-9a-z]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?funimation(?:\.com|now\.uk)/(?:[^/]+/)?shows/[^/]+/(?P<id>[^/?#&]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?funk\.net/(?:channel|playlist)/[^/]+/(?P<display_id>[0-9a-z-]+)-(?P<id>\d+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?fusion\.(?:net|tv)/(?:video/|show/.+?\bvideo=)(?P<id>\d+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:(?P<kind>www|m)\.)?fux\.com/(?:video|embed)/(?P<id>\d+)(?:/(?P<display_id>[^/?#&]+))?""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?gaia\.com/video/(?P<id>[^/?]+).*?\bfullplayer=(?P<type>feature|preview)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?gameinformer\.com/(?:[^/]+/)*(?P<id>[^.?&#]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?gamespot\.com/(?:video|article|review)s/(?:[^/]+/\d+-|embed/)(?P<id>\d+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?game(?P<site>pro|star)\.de/videos/.*,(?P<id>[0-9]+)\.html""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?gaskrank\.tv/tv/(?P<categories>[^/]+)/(?P<id>[^/]+)\.htm""",li)) for li in lines]
    [s.append(re.findall(r"""(?P<url>https?://(?:www\.)?gazeta\.ru/(?:[^/]+/)?video/(?:main/)*(?:\d{4}/\d{2}/\d{2}/)?(?P<id>[A-Za-z0-9-_.]+)\.s?html)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?gdcvault\.com/play/(?P<id>\d+)(?:/(?P<name>[\w-]+))?""",li)) for li in lines]
    [s.append(re.findall(r"""(?x)https?://video\.
            (?:
                (?:
                    (?:espresso\.)?repubblica
                    |lastampa
                    |ilsecoloxix
                )|
                (?:
                    iltirreno
                    |messaggeroveneto
                    |ilpiccolo
                    |gazzettadimantova
                    |mattinopadova
                    |laprovinciapavese
                    |tribunatreviso
                    |nuovavenezia
                    |gazzettadimodena
                    |lanuovaferrara
                    |corrierealpi
                    |lasentinella
                )\.gelocal
            )\.it(?:/[^/]+){2,3}?/(?P<id>\d+)(?:[/?&#]|$)""",li)) for li in lines]
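    # NOTE: this is the catch-all pattern of youtube-dl's generic fallback extractor.
    # It matches every line, including pixiv.net and deviantart.com links, so it is disabled here.
    # [s.append(re.findall(r""".*""",li)) for li in lines]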
    [s.append(re.findall(r"""https?://(?:(?:www|giant|thumbs)\.)?gfycat\.com/(?:ru/|ifr/|gifs/detail/)?(?P<id>[^-/?#\.]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?giantbomb\.com/(?:videos|shows)/(?P<display_id>[^/]+)/(?P<id>\d+-\d+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?giga\.de/(?:[^/]+/)*(?P<id>[^/]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?iptv\.glattvision\.ch/watch/(?P<channel>[^/]+?)/(?P<id>[0-9]+)[^/]+(?:/(?P<recid>[0-9]+))?""",li)) for li in lines]
    [s.append(re.findall(r"""https?://share\.glide\.me/(?P<id>[A-Za-z0-9\-=_+]+)""",li)) for li in lines]
    [s.append(re.findall(r"""(?:globo:|https?://.+?\.globo\.com/(?:[^/]+/)*(?:v/(?:[^/]+/)?|videos/))(?P<id>\d{7,})""",li)) for li in lines]
    [s.append(re.findall(r"""https?://.+?\.globo\.com/(?:[^/]+/)*(?P<id>[^/.]+)(?:\.html)?""",li)) for li in lines]
    [s.append(re.findall(r"""(?x)
                        https?://
                            (?:
                                (?:(?P<sub_domain>abc|freeform|watchdisneychannel|watchdisneyjunior|watchdisneyxd|disneynow|fxnow.fxnetworks)\.)?go|
                                (?P<sub_domain_2>abc|freeform|disneynow|fxnow\.fxnetworks)
                            )\.com/
                            (?:
                                (?:[^/]+/)*(?P<id>[Vv][Dd][Kk][Aa]\w+)|
                                (?:[^/]+/)*(?P<display_id>[^/?\#]+)
                            )
                        """,li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?godtube\.com/watch/\?v=(?P<id>[\da-zA-Z]+)""",li)) for li in lines]
    [s.append(re.findall(r"""^https?://video\.golem\.de/.+?/(?P<id>.+?)/""",li)) for li in lines]
    [s.append(re.findall(r"""https?://podcasts\.google\.com/feed/(?P<feed_url>[^/]+)/episode/(?P<id>[^/?&#]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://podcasts\.google\.com/feed/(?P<id>[^/?&#]+)/?(?:[?#&]|$)""",li)) for li in lines]
    [s.append(re.findall(r"""(?x)
                            https?://
                                (?:
                                    (?:docs|drive)\.google\.com/
                                    (?:
                                        (?:uc|open)\?.*?id=|
                                        file/d/
                                    )|
                                    video\.google\.com/get_player\?.*?docid=
                                )
                                (?P<id>[a-zA-Z0-9_-]{28,})
                        """,li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?goshgay\.com/video(?P<id>\d+?)($|/)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://on-demand\.gputechconf\.com/gtc/2015/video/S(?P<id>\d+)\.html""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?groupon\.com/deals/(?P<id>[^/?#&]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?hbo\.com/(?:video|embed)(?:/[^/]+)*/(?P<id>[^/?#]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?hearthis\.at/(?P<artist>[^/]+)/(?P<title>[A-Za-z0-9\-]+)/?$""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?heise\.de/(?:[^/]+/)+[^/]+-(?P<id>[0-9]+)\.html""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?hellporno\.(?:com/videos|net/v)/(?P<id>[^/]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://video\.helsinki\.fi/Arkisto/flash\.php\?id=(?P<id>\d+)""",li)) for li in lines]
    [s.append(re.findall(r"""^https?://hentai\.animestigma\.com/(?P<id>[^/]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?hetklokhuis\.nl/[^/]+/\d+/(?P<id>[^/?#&]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?hgtv\.com/shows/[^/]+/(?P<id>[^/?#&]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://de\.hgtv\.com/sendungen/(?P<id>[^/]+/[^/?#]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?hidive\.com/stream/(?P<title>[^/]+)/(?P<key>[^/?#&]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?historicfilms\.com/(?:tapes/|play)(?P<id>\d+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?(?P<domain>(?:history|biography)\.com)/player/(?P<id>\d+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?history\.com/topics/[^/]+/(?P<id>[\w+-]+?)-video""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?(?:hitbox|smashcast)\.tv/(?:[^/]+/)*videos?/(?P<id>[0-9]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?(?:hitbox|smashcast)\.tv/(?P<id>[^/?#&]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?hitrecord\.org/records/(?P<id>\d+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?hkedcity\.net/etv/resource/(?P<id>[0-9]+)""",li)) for li in lines]
    [s.append(re.findall(r"""http?://(?:www\.)?hornbunny\.com/videos/(?P<title_dash>[a-z-]+)-(?P<id>\d+)\.html""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?hotnewhiphop\.com/.*\.(?P<id>.*)\.html""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?hotstar\.com
 
heavyblack1:8.11.2021 14:21

Hi, I have my own automation script for batch downloading. I use gallery-dl and youtube-dl to download links from a list, and later I want to add support for you-get. I need the links that youtube-dl supports to go to youtube-dl only. I looked into the youtube-dl source, because their website lists only site names, not URLs. There I found the script that extracts the regex patterns and modified it to get the function find_yt_dl_sites(lines: list) shown below. The problem is that it returns nothing, and with the debugger I found that, for some mysterious reason, I get matches on https://www.pixiv.net/ and https://www.deviantart.com/ links, which yt-dl does not support.

What I tried:

import re

def find_yt_dl_sites(lines: list):
    """
    Return list of sites supported by youtube-dl
    """

    #lines = "".join(lines)

    s = []

    # Supported sites
    [s.append(re.findall(r"""https?://(?:www\.)?1tv\.ru/(?:[^/]+/)+(?P<id>[^/?#]+)""",li)) for li in lines]
    [s.append(re.findall(r"""(?x)
                        https?://
                            (?:www\.)?20min\.ch/
                            (?:
                                videotv/*\?.*?\bvid=|
                                videoplayer/videoplayer\.html\?.*?\bvideoId@
                            )
                            (?P<id>\d+)
                        """,li)) for li in lines]
    [s.append(re.findall(r"""(?x)(?:https?://)?(?:www\.)?220\.ro/(?P<category>[^/]+)/(?P<shorttitle>[^/]+)/(?P<id>[^/]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?P<domain>[^.]+\.(?:twentythree\.net|23video\.com|filmweb\.no))/v\.ihtml/player\.html\?(?P<query>.*?\bphoto(?:_|%5f)id=(?P<id>\d+).*)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?247sports\.com/Video/(?:[^/?#&]+-)?(?P<id>\d+)""",li)) for li in lines]
    [s.append(re.findall(r"""(?x)
                        https?://
                            (?P<host>
                                (?:(?:www|porno?)\.)?24video\.
                                (?:net|me|xxx|sexy?|tube|adult|site|vip)
                            )/
                            (?:
                                video/(?:(?:view|xml)/)?|
                                player/new24_play\.swf\?id=
                            )
                            (?P<id>\d+)
                        """,li)) for li in lines]
    [s.append(re.findall(r"""https?://playout\.3qsdn\.com/(?P<id>[\da-f]{8}-[\da-f]{4}-[\da-f]{4}-[\da-f]{4}-[\da-f]{12})""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?3sat\.de/(?:[^/]+/)*(?P<id>[^/?#&]+)\.html""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:(?P<kind>www|m)\.)?4tube\.com/(?:videos|embed)/(?P<id>\d+)(?:/(?P<display_id>[^/?#&]+))?""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:(?:www|player)\.)?56\.com/(?:.+?/)?(?:v_|(?:play_album.+-))(?P<textid>.+?)\.(?:html|swf)""",li)) for li in lines]
    [s.append(re.findall(r"""(?:5min:|https?://(?:[^/]*?5min\.com/|delivery\.vidible\.tv/aol)(?:(?:Scripts/PlayerSeed\.js|playerseed/?)?\?.*?playList=)?)(?P<id>\d+)""",li)) for li in lines]
    [s.append(re.findall(r"""(?:6play:|https?://(?:www\.)?(?P<domain>6play\.fr|rtlplay\.be|play\.rtl\.hr|rtlmost\.hu)/.+?-c_)(?P<id>[0-9]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?7plus\.com\.au/(?P<path>[^?]+\?.*?\bepisode-id=(?P<id>[^&#]+))""",li)) for li in lines]
    [s.append(re.findall(r"""https?://8tracks\.com/(?P<user>[^/]+)/(?P<id>[^/#]+)(?:#.*)?$""",li)) for li in lines]
    [s.append(re.findall(r"""(?:https?://)(?:www\.|)91porn\.com/.+?\?viewkey=(?P<id>[\w\d]+)""",li)) for li in lines]
    [s.append(re.findall(r"""9c9media:(?P<destination_code>[^:]+):(?P<id>\d+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?9gag\.com/gag/(?P<id>[^/?&#]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?9now\.com\.au/(?:[^/]+/){2}(?P<id>[^/?#]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?abc\.net\.au/news/(?:[^/]+/){1,2}(?P<id>\d+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://iview\.abc\.net\.au/(?:[^/]+/)*video/(?P<id>[^/?#]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://abcnews\.go\.com/(?:[^/]+/)+(?P<display_id>[0-9a-z-]+)/story\?id=(?P<id>\d+)""",li)) for li in lines]
    [s.append(re.findall(r"""(?x)
                        https?://
                            (?:
                                abcnews\.go\.com/
                                (?:
                                    (?:[^/]+/)*video/(?P<display_id>[0-9a-z-]+)-|
                                    video/(?:embed|itemfeed)\?.*?\bid=
                                )|
                                fivethirtyeight\.abcnews\.go\.com/video/embed/\d+/
                            )
                            (?P<id>\d+)
                        """,li)) for li in lines]
    [s.append(re.findall(r"""https?://(?P<site>abc(?:7(?:news|ny|chicago)?|11|13|30)|6abc)\.com(?:(?:/[^/]+)*/(?P<display_id>[^/]+))?/(?P<id>\d+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://clips\.abcotvs\.com/(?:[^/]+/)*video/(?P<id>\d+)""",li)) for li in lines]
    [s.append(re.findall(r"""^https?://(?:www\.)?academicearth\.org/playlists/(?P<id>[^?#/]+)""",li)) for li in lines]
    [s.append(re.findall(r"""(?x)
                        https?://
                            (?:
                                (?:(?:embed|www)\.)?acast\.com/|
                                play\.acast\.com/s/
                            )
                            (?P<channel>[^/]+)/(?P<id>[^/#?]+)
                        """,li)) for li in lines]
    [s.append(re.findall(r"""(?x)
                        https?://
                            (?:
                                (?:www\.)?acast\.com/|
                                play\.acast\.com/s/
                            )
                            (?P<id>[^/#?]+)
                        """,li)) for li in lines]
    [s.append(re.findall(r"""https?://(?:www\.)?animedigitalnetwork\.fr/video/[^/]+/(?P<id>\d+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://\w+\.adobeconnect\.com/(?P<id>[\w-]+)""",li)) for li in lines]
    [s.append(re.findall(r"""https?://tv\.adobe\.com/(?:(?P<language>fr|de|es|jp)/)?watch/(?P<show_urlname>[^/]+)/(?P<id>[^/]+)""",li)) for li in lines]
    # and so on
    print(s)
    cleaner_list = [it for it in s if it != []] # remove empty list

    clean_list = []
    for list_item in cleaner_list:
        for st in list_item:
            if isinstance(st,str):
                if st != [] and st != '' and "http" in st:
                    clean_list.append(st)
            else:
                for v in st:
                    if v != [] and v != '' and "http" in v:
                        clean_list.append(v)
    # TODO: bug remove youtube link without video id

    return clean_list

What I need:
I would like the function to return only the links meant for yt-dl, so that I can hand them to the download functions and download the yt-dl and gallery-dl links in parallel without getting errors that a given link is not supported. By the way, the complete code is here: full function code
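
One possible contributor to the empty result, offered as an observation about `re.findall` rather than a diagnosis confirmed in this thread: when a pattern contains capturing groups, `re.findall` returns the captured groups, not the whole matched URL, so a later check such as `"http" in v` would discard them. The sketch below uses the 9gag pattern from the function above; the sample URL is made up for illustration.

```python
import re

# Pattern copied from the function above (the 9gag entry).
PATTERN = re.compile(r"""https?://(?:www\.)?9gag\.com/gag/(?P<id>[^/?&#]+)""")

# Made-up sample line, for illustration only.
line = "https://9gag.com/gag/abc123"

# With one capturing group, findall returns only that group, not the URL.
groups_only = PATTERN.findall(line)

# finditer yields match objects; group(0) is the whole matched text.
full_urls = [m.group(0) for m in PATTERN.finditer(line)]

print(groups_only)  # ['abc123']
print(full_urls)    # ['https://9gag.com/gag/abc123']
```

If that is what happens here, collecting `m.group(0)` instead of the `findall` result would keep the full links. It does not by itself explain the unexpected pixiv and deviantart matches.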

 
Peter Mlich:8.11.2021 14:39

Is this spam?
If not: wouldn't it be about 10x faster to extract the domain name and compare it against plain strings than to go through regular expressions?
Search Google for: python regular domain-name from url

import re

domainlist = ['m.google.com',
              'm.docs.google.com',
              'www.someisotericdomain.innersite.mall.co.uk',
              'www.ouruniversity.department.mit.ac.us',
              'www.somestrangeurl.shops.relevantdomain.net',
              'www.example.info']

for l in domainlist:
    # Capture the label right before the public suffix (co.uk, ac.us, or the last label).
    res = re.findall(r'(?<=\.)([^.]+)(?:\.(?:co\.uk|ac\.us|[^.]+(?:$|\n)))', l)
    print(l, "|", res[0])

m.google.com | google
m.docs.google.com | google
www.someisotericdomain.innersite.mall.co.uk | mall
www.ouruniversity.department.mit.ac.us | mit
www.somestrangeurl.shops.relevantdomain.net | relevantdomain
www.example.info | example
 
Peter Mlich:8.11.2021 14:47

https://github.com/…rom-a-URL.py

import re
def domain_name(url):
    # Raw string, so that \d, \. and \w reach the regex engine unchanged.
    return re.search(r'(https?://)?(www\d?\.)?(?P<name>[\w-]+)\.', url).group('name')
 
Peter Mlich:8.11.2021 14:52

Basically, your code already contains several filters like that...

[s.append(re.findall(r"""(?x)https?://(?:www\.)?(?:
            discovery|
            investigationdiscovery|
            discoverylife|
            animalplanet|
            ahctv|
            destinationamerica|
            sciencechannel|
            tlc|
            velocitychannel
        )go\.com/(?P<id>[^/?#&]+)""",li)) for li in lines]
 
Replying to Peter Mlich
heavyblack1:8.11.2021 22:25

No, it's not spam. I ran out of editing time, then the edit button disappeared, so I had to post it again.
Also, you misunderstood me. I know about "google = python regular domain-name from url", but I modified make_supportedsites.py to get the regex patterns, because I can't use that list to pick out the links that yt-dl supports so that gallery-dl doesn't download them. The site names are listed in supportedsites.md, but I need URLs. gallery-dl, for example, publishes a list of URLs, so I copied those from the gallery-dl website.
I'm looking for the simplest way to filter the yt-dl links out of a list of links.

 
Peter Mlich:9.11.2021 8:10

You don't write periods, commas, or line breaks, so your reply is unreadable. I don't work with YouTube, so "links that yt-dl supports so that gallery-dl doesn't download them" is complete gibberish that makes no sense to me. Who is not supposed to download what? Surely URLs are filtered via htaccess, not in Python. To me, the words yt-dl and gallery-dl might as well be written in Japanese or Arabic.

I would go by the domain name.

# Original: [s.append(re.findall(r"""(?x)https?://(?:www\.)?(?:
[s.append(re.findall(r"""(?x)(?:https|http)://(?:www\.)?(?:
            discovery|
            investigationdiscovery|
            discoverylife|
            animalplanet|
            ahctv|
            destinationamerica|
            sciencechannel|
            tlc|
            velocitychannel
        )go\.com/(?P<id>[^/?#&]+)""",li)) for li in lines]  # rewrite this part somehow: list the endings of all the sites here

Or I would go the way a hardware filter does, for example for e-mail: a list of all sites, a list of strings.
sites = []
sites.append("https://youtube.com")
sites.append("https://yahoo.com")
From that I would build some kind of index, bucketing by the first 10 characters: you look up the first 10 characters and then search a smaller list of addresses instead of the whole list. Or rather, sort it by text length. I don't know how it is in Python, I don't work with it, but regular expressions are usually about 10x slower and are not well suited to filtering addresses.
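
A minimal sketch of the bucketing idea above, in Python. The site list, the name `is_known`, and the use of a prefix check for the final comparison are assumptions made for illustration; nothing here was measured, so the speed claim stays untested.

```python
from collections import defaultdict

PREFIX_LEN = 10  # bucket key length, taken from the "first 10 characters" idea

# Hypothetical site list; a real one would have to be collected separately.
sites = ["https://youtube.com", "https://yahoo.com", "https://vimeo.com"]

# Index the sites by their first PREFIX_LEN characters.
buckets = defaultdict(list)
for site in sites:
    buckets[site[:PREFIX_LEN]].append(site)

def is_known(url: str) -> bool:
    """Return True if url starts with a site stored in the matching bucket."""
    candidates = buckets.get(url[:PREFIX_LEN], [])
    return any(url.startswith(site) for site in candidates)

print(is_known("https://youtube.com/watch?v=x"))  # True
print(is_known("https://www.pixiv.net/"))         # False
```

Only the bucket that shares the URL's first 10 characters is scanned. Variants such as a leading "www." or "http://" would need their own entries or a normalisation step first.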

 
Peter Mlich:9.11.2021 8:17

Or I would do it slightly differently still: delete the first, say, 10 characters entirely (depending on the shortest address) and keep only the rest of the address, which I would then sort by its first character.
The point is that every address has some beginning anyway. This way you should end up with more groups, each holding fewer addresses.

https://youtube.com
https://yahoo.com

tube.com
oo.com
Accepted solution
 
Replying to Peter Mlich
heavyblack1:12.11.2021 22:53

In the end I decided to extract the domain from each URL with the urlparse module and compare it with the domains from supportedsites.md.
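
The approach described above could look roughly like this. It is only a sketch: the actual code was not posted, `YT_DL_DOMAINS` is a hypothetical hand-written set standing in for whatever was derived from supportedsites.md, and the helper names `host_of` and `split_links` are made up.

```python
from urllib.parse import urlparse

# Hypothetical domain set; the thread takes these from supportedsites.md,
# and how that file was turned into domains is not shown.
YT_DL_DOMAINS = {"youtube.com", "vimeo.com"}

def host_of(url: str) -> str:
    """Return the lower-cased host name without a leading 'www.'."""
    host = urlparse(url).hostname or ""
    return host[4:] if host.startswith("www.") else host

def split_links(urls):
    """Split urls into (links for youtube-dl, all remaining links)."""
    yt_dl, other = [], []
    for url in urls:
        (yt_dl if host_of(url) in YT_DL_DOMAINS else other).append(url)
    return yt_dl, other

yt_links, rest = split_links([
    "https://www.youtube.com/watch?v=abc",
    "https://www.pixiv.net/",
    "https://www.deviantart.com/",
])
print(yt_links)  # ['https://www.youtube.com/watch?v=abc']
print(rest)      # ['https://www.pixiv.net/', 'https://www.deviantart.com/']
```

The second list can then go to gallery-dl, so neither tool is handed a link it does not support.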

 