Javascript: Getting Coursera video download links programmatically


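I want to extract the Coursera video download links behind these links with a program (mainly Python).

After reading a lot of articles on the subject, I still can't find a way to extract the video download links programmatically. Can anyone provide a step-by-step solution for extracting them? Thanks.

I know about this project, but its code is too complex, so I gave up on it.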


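Thanks for your answer. I have successfully made a Chrome extension to download the videos.

I would use requests and BeautifulSoup to request the pages, log in, and so on, and to parse the resulting data. The general idea is: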

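import requests
from bs4 import BeautifulSoup

r = requests.get(url_you_want)  # url_you_want: placeholder for the page URL
domTree = BeautifulSoup(r.text, "html.parser")
link = domTree.find(id="WhateverIDTheLinkHasInTheDownloadPage")
# [...etc...]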

If you want someone to do the whole job for you, I can't help you, though...
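To make the "find the element by its ID" step concrete, here is a minimal offline sketch. It swaps BeautifulSoup for the standard library's `html.parser` so that it runs without extra packages. The sample HTML, the `download-link` ID, and the helper names are all hypothetical; a real Coursera page will be structured differently.

```python
from html.parser import HTMLParser


class LinkByIdParser(HTMLParser):
    """Remember the href of the first <a> tag whose id equals target_id."""

    def __init__(self, target_id):
        super().__init__()
        self.target_id = target_id
        self.href = None

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        if tag == "a" and self.href is None and attr_map.get("id") == self.target_id:
            self.href = attr_map.get("href")


def find_link_by_id(html, target_id):
    """Return the href of the matching anchor, or None if there is none."""
    parser = LinkByIdParser(target_id)
    parser.feed(html)
    return parser.href


# Hypothetical page fragment, used only to exercise the helper.
SAMPLE_HTML = """
<div>
  <a id="other" href="/help">Help</a>
  <a id="download-link" href="https://example.com/videos/lecture1.mp4">Download</a>
</div>
"""

print(find_link_by_id(SAMPLE_HTML, "download-link"))
```

With the real page you would feed `r.text` to the helper instead of `SAMPLE_HTML`, provided the link is present in the static HTML rather than injected later by JavaScript.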

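Today I made a Coursera downloader for myself using Python, the selenium package, and chromedriver. (These three tools are required for the rest of this answer.)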

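First, find the course you want on Coursera and enroll in it. Then fill in the variables in the code below and run it. It takes a while, but the result (all the video links) is written to a text file: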

from selenium import webdriver
from selenium.webdriver.common.keys import Keys

# ########################### #
# ####-fill these vars-###### #
# ########################### #

# coursera login information:

username = "~"  # e.g. : username = "john@doe.com"
password = "~"  # e.g. : password = "12345asdfg"

# course details to download IMPORTANT: you should be enrolled in the course

path_course = "https://www.coursera.org/learn/programming-languages/home/week/1"  # link to the course first week e.g. : path_course = "https://www.coursera.org/learn/game-theory-1/home/week/1"
num_of_weeks = 5  # number of course weeks(or desired weeks to download) e.g. : num_of_weeks = 5
path_to_save = "E:\\programming-languages-coursera-links.txt"  # path to the file in which the links will be saved e.g. : path_to_save = "E:\\interactive-python-links.txt"
#############################
#############################
#############################
print_each_link = False


# defining functions :
def get_links_of_week(week_add):
    """
    this function gets the download links from the course.
    :param week_add: address to the specific week in order to get links
    :return: a list containing all download links regarding the specific week.
    """
    driver.get(week_add)
    print("going for " + week_add)
    driver.implicitly_wait(5)
    # string locator; the find_elements_by_* helpers were removed in newer Selenium 4 releases
    elems = driver.find_elements("xpath", "//a[@href]")
    links = []
    for elem in elems:
        sublink = elem.get_attribute("href")
        # print(sublink)
        if sublink.find("lecture") != -1 and sublink not in links:
            links.append(sublink)
    # print("---------------")
    # print(links)
    inner_links = []

    for link in links:
        driver.get(link)
        driver.implicitly_wait(5)
        # string locator; the find_elements_by_* helpers were removed in newer Selenium 4 releases
        inner_elems = driver.find_elements("xpath", "//a[@href]")
        for inelem in inner_elems:
            sub_elem = inelem.get_attribute("href")

            # print(sub_elem)
            if sub_elem.find("mp4") != -1:
                print("the link : " + sub_elem[37:77] + "... fetched")
                inner_links.append(sub_elem)

    return inner_links


def get_week_list():
    """
    this function gets the URL address from predefined variables from the top
    :return: a list containing each week main page.
    """
    weeks = []
    print('list of weeks are : ')
    for x in range(1, num_of_weeks + 1):
        weeks.append(path_course[:-1] + str(x))
        print(path_course[:-1] + str(x))
    return weeks


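# loading chrome driver
# (Selenium 4.10+ no longer accepts the path positionally; there, use
#  webdriver.Chrome(service=Service("E:\\chromedriver.exe")) with Service
#  imported from selenium.webdriver.chrome.service)
driver = webdriver.Chrome("E:\\chromedriver.exe")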
# login to Coursera
driver.get(path_course)
driver.implicitly_wait(10)
# locator strategies are passed as plain strings ("name", "xpath"); the
# find_element_by_* helpers were removed in newer Selenium 4 releases
email = driver.find_element("name", "email")
email.click()
email.send_keys(username)
pas = driver.find_element("name", "password")
pas.click()
pas.send_keys(password)
driver.find_element("xpath", "//*[@id=\"rendered-content\"]/div/div/div/div[3]/div/div/div/form/button").send_keys(
    Keys.RETURN)

# fetching links from each week web page
weeks_link = get_week_list()
all_links = []
for week in weeks_link:
    all_links += get_links_of_week(week)
driver.quit()  # ends the whole browser session, not just the current window

# write to file
print("now writing to file ...")
with open(path_to_save, "w") as text_file:  # the file is closed automatically
    for each_link in all_links:
        if print_each_link:
            print(each_link + "\n")
        text_file.write(each_link + "\n")
print("---------------------------------")
print("all Links are fetched successfully")

If you run into any problems, leave a comment here.
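One detail worth checking if you adapt the script: `get_week_list` builds each week's URL by dropping the last character of `path_course`, which works as long as the configured URL ends in a single digit. The sketch below is an assumed alternative, not part of the original script, that splits on the last slash instead. The name `build_week_urls` is hypothetical.

```python
def build_week_urls(first_week_url, num_of_weeks):
    """Return the home-page URLs for weeks 1..num_of_weeks.

    Assumes the URL ends in "/week/<number>", as path_course does in the script.
    """
    # Everything before the final "/" is kept; the trailing week number is replaced.
    base, _, _ = first_week_url.rstrip("/").rpartition("/")
    return [f"{base}/{week}" for week in range(1, num_of_weeks + 1)]


for url in build_week_urls(
    "https://www.coursera.org/learn/programming-languages/home/week/1", 3
):
    print(url)
```

Because only the part after the last slash is replaced, a starting URL such as `.../home/week/10` also yields correct addresses.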

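Yes, I know these two packages, and I can also log in to Coursera from Python. The difficulty is how to find the video download link in the page.

Does the link have a specific class or ID in the page source? Hopefully one of those two things is true. Another approach is to check whether it shows up at the same position in the DOM tree every time.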
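If the link turns out to have no stable class or ID, a third option is to collect every `href` on the page and filter by the URL itself, which is roughly what the Selenium script does with its `"mp4"` substring test. The sketch below is a slightly stricter variant under the assumption that the video URL's path ends in `.mp4`. The function name and the sample URLs are hypothetical.

```python
from urllib.parse import urlparse


def pick_video_links(hrefs, extension=".mp4"):
    """Return the unique hrefs whose URL path ends with extension, in first-seen order."""
    seen = set()
    videos = []
    for href in hrefs:
        if not href:
            continue  # skip anchors without a usable href
        # Compare against the path only, so query strings such as ?token=... are ignored.
        path = urlparse(href).path.lower()
        if path.endswith(extension) and href not in seen:
            seen.add(href)
            videos.append(href)
    return videos


# Hypothetical hrefs, as they might be collected from a lecture page.
sample_hrefs = [
    "https://example.com/videos/lecture1.mp4?token=abc",
    "https://example.com/notes.pdf",
    None,
    "https://example.com/mp4-guide",
    "https://example.com/videos/lecture1.mp4?token=abc",
]
print(pick_video_links(sample_hrefs))
```

Matching on the path rather than on the whole string avoids false positives such as a help page whose address merely contains "mp4".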