BeautifulSoup: How to get the publish datetime of a YouTube video in datetime format?


Problem description


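In one part of my crawler, I need to scrape the publish date and time of a YouTube video in datetime format. I am using bs4, and so far I can get the publish time in the form the YouTube GUI shows it, i.e. "Published on May 6, 2017". But I cannot retrieve the actual datetime. How can I do this?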

My code:

    video_obj["date_published"] = video_soup.find("strong", attrs={"class": "watch-time-text"}).text
    return video_obj["date_published"]

Output:

Published on Feb 8, 2020

What I want:

YYYY-MM-DD HH:MM:SS

Reference solutions

Method 1:

Once you get:

Published on Feb 8, 2020

You can then remove the leading "Published on " label:

date_string = soup_string.replace("Published on ", "", 1).strip()
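One caveat when removing the label: `str.strip` treats its argument as a set of characters to trim from both ends, not as a literal prefix. It happens to give the right result for "Published on Feb 8, 2020", but it can eat into the date for other labels. A minimal sketch of the difference, where `drop_prefix` is a hypothetical helper and the "Streamed live on" label is only an illustrative input:

```python
def drop_prefix(text: str, prefix: str) -> str:
    """Remove a literal prefix if present; otherwise return text unchanged."""
    if text.startswith(prefix):
        return text[len(prefix):]
    return text


label = "Streamed live on Sep 8, 2020"  # illustrative input, not from the question

# strip() trims every leading/trailing character found in the argument,
# so the "S" and "e" of "Sep" are removed as well.
stripped = label.strip("Streamed live on ")

# Removing the literal prefix leaves the date intact.
prefixed = drop_prefix(label, "Streamed live on ")

print(repr(stripped))
print(repr(prefixed))
```

On Python 3.9 and later, the built-in `str.removeprefix` does the same job as the helper.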

To get this in the format YYYY-MM-DD HH:MM:SS, you can use the python-dateutil library. Install it with:

pip install python-dateutil

Code:

from dateutil import parser
formatted_date = parser.parse("Published on Feb 8, 2020", fuzzy=True)

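parser.parse returns a datetime object; printing it shows the date as YYYY-MM-DD HH:MM:SS (here 2020-02-08 00:00:00, because the scraped text carries no time of day).

You can read more in the python-dateutil parser documentation.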
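To control the output format explicitly instead of relying on how the datetime object prints, the parsed value can be passed through `strftime`. A short sketch, assuming python-dateutil is installed; the variable names are illustrative:

```python
from dateutil import parser  # third-party: pip install python-dateutil

raw = "Published on Feb 8, 2020"

# fuzzy=True lets the parser skip tokens it cannot read as part of a date,
# so the "Published on" label does not need to be removed first.
published = parser.parse(raw, fuzzy=True)

# Spell out the target format from the question.
formatted = published.strftime("%Y-%m-%d %H:%M:%S")
print(formatted)
```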

Method 2:

You could use Python's datetime module to parse the string and format the output.

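import datetime

pubstring = video_obj["date_published"]  # "Published on Feb 8, 2020"
# pubstring[13:] drops the first 13 characters ("Published on ")
dt = datetime.datetime.strptime(pubstring[13:], "%b %d, %Y")
# "%F" is not supported on every platform, so the format is spelled out
return dt.strftime("%Y-%m-%d %H:%M:%S")  # Format as needed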
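The same approach can be wrapped in a small standalone function that does not depend on a hard-coded slice length. This is only a sketch: `parse_published` and `PREFIX` are hypothetical names, and `%b` assumes English month abbreviations (the default C locale).

```python
from datetime import datetime

PREFIX = "Published on "  # label text as it appears in the question's output


def parse_published(text: str) -> datetime:
    """Parse a label such as 'Published on Feb 8, 2020' into a datetime."""
    date_part = text.strip()
    if date_part.startswith(PREFIX):
        date_part = date_part[len(PREFIX):]
    # The label holds a date only, so the time fields default to 00:00:00.
    return datetime.strptime(date_part, "%b %d, %Y")


dt = parse_published("Published on Feb 8, 2020")
formatted = dt.strftime("%Y-%m-%d %H:%M:%S")
print(formatted)
```

`strptime` raises `ValueError` if the text does not match the expected pattern, which makes layout changes on the page easy to spot.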

(by Proteeti Prova, Chinmay Atrawalkar, Ben)

Reference

  1. BeautifulSoup: How to get publish datetime of a youtube video in datwtime format? (CC BY-SA 2.5/3.0/4.0)

#datetime #datetime-format #Python #youtube #beautifulsoup





