JavaScript: AppleScript or shell script | Get the "src" attribute of an image by its class name and download it
Struggling with this problem:
Loop through the variable AllUrls (which contains the URLs as strings)
Get all src attributes from images with the class "image_stack__image js-default-img"
Download all images into a folder and use the URL of the source page as the image name
This is all I have so far, and I can't find a solution (apart from an action in Automator) that works as expected.
tell application "Finder"
set myPath to container of (path to me) as text -- SET MAIN PATH
end tell
set AllUrls to {"https://teespring.com/shop/CLASSIC-DODGE-CHARGER-MOP?aid=marketplace&tsmac=marketplace&tsmic=search#pid=212&cid=5819&sid=front", "https://teespring.com/shop/greaser-mechanics-t-shirt?aid=marketplace&tsmac=marketplace&tsmic=campaign#pid=2&cid=2397&sid=front"}
--set ImageSrc to (script to get the src attribute from the class "image_stack__image js-default-img"
--set IMGname to the Page URL where the image is
set dFolder to myPath & "thumbnails"
set fName to IMGname & ".jpg" as string
do shell script ("mkdir -p " & dFolder & "; curl -A/--user-agent " & AllUrls & " >> " & (dFolder & fName))
Any help is much appreciated. Thanks.
Update:
tell application "Finder" -- get filepath to file container/folder
set myPath to container of (path to me) as text -- SET MAIN PATH
end tell
set allURLs to {"https://teespring.com/shop/CLASSIC-DODGE-CHARGER-MOP?aid=marketplace&tsmac=marketplace&tsmic=search#pid=212&cid=5819&sid=front", "https://teespring.com/shop/dodge-mopar-m?aid=marketplace&tsmac=marketplace&tsmic=search#pid=2&cid=2397&sid=front"}
set JS to "document.querySelector('.image_stack__image').src"
set sh to {"cd ~/desktop/thumbnails;", "curl --remote-name-all ", {}} -- need to set the location to the home folder of the script and the filename to 1.jpg , 2.jpg ..
set the text item delimiters to space
tell application "Safari" to repeat with www in allURLs
    set D to (make new document with properties {URL:www})
    # Wait until webpage has loaded
    tell D to repeat until not (exists)
        delay 0.5
    end repeat
    set the last item of sh to do JavaScript JS in the front document
    close the front document
    do shell script (sh as text)
end repeat
To get all the image URLs from elements with the class image_stack__image (assuming elements of this class have a src attribute):
Array.from(document.querySelectorAll('.image_stack__image'), e=>e.src)
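When you run this with the do JavaScript command in Safari, AppleScript automatically converts the result into a list.
To cURL all the URLs into the directory "thumbnails" in your home folder, saving each image under the same name as the remote file, first cd into the directory, then use cURL's --remote-name-all option: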
cd ~/thumbnails; curl --remote-name-all %url1% %url2% ...
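Caveat: images with unusual URLs may not download, for example images generated dynamically through a CGI request, or images whose src attribute contains base64-encoded data. In fact, the presence of such images in the curl request may break the entire request.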
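One way to sidestep this caveat is to drop anything that is not a plain http(s) URL before the list reaches curl. The following is only a sketch: filter_http_urls is a hypothetical helper name, and it looks at the URL scheme only, so it would not catch dynamically generated images.

```shell
#!/bin/sh
# Hypothetical helper (not part of curl): print only the arguments that
# start with http:// or https://, one per line.
filter_http_urls() {
    for url in "$@"; do
        case "$url" in
            http://*|https://*) printf '%s\n' "$url" ;;
            *) ;;  # skip data: URIs and anything else
        esac
    done
}
```

The output is one URL per line, so it could, for instance, be piped to xargs curl --remote-name-all.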
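To concatenate the list of URLs returned by the JavaScript method so that it can be inserted directly into the cURL command, simply coerce the AppleScript list to text using a space as the delimiter: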
set JS to "Array.from(document.querySelectorAll('.image_stack__image'), e=>e.src);"
set sh to {"cd ~/thumbnails;", "curl --remote-name-all"}
set the text item delimiters to space
tell application "Safari" to tell ¬
the front document to set ¬
the end of sh to ¬
do JavaScript JS
do shell script (sh as text)
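Note that URLs joined with plain spaces reach the shell unquoted, so characters such as & or ? in an image URL could be interpreted by the shell. In AppleScript, quoted form of handles this for a single string; the shell sketch below shows the same idea. shell_quote is a hypothetical helper name introduced only for illustration.

```shell
#!/bin/sh
# Hypothetical helper: wrap one string in single quotes so that sh treats
# it as a single literal word. An embedded ' becomes '\'' .
shell_quote() {
    printf "'%s'" "$(printf '%s' "$1" | sed "s/'/'\\\\''/g")"
}
```

For example, shell_quote "https://example.com/a.jpg?x=1&y=2" prints the URL wrapped in single quotes, which is safe to paste into a command line.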
Then repeat exactly the same process for each web page URL by wrapping the appropriate lines of code in a repeat loop:
set allURLs to {%your list of URLs%}
set JS to "Array.from(document.querySelectorAll('.image_stack__image'),e=>e.src);"
set sh to {"cd ~/thumbnails;", "curl --remote-name-all", {}}
set the text item delimiters to space
tell application "Safari" to repeat with www in allURLs
    set D to (make new document with properties {URL:www})
    # Wait until webpage has loaded
    tell D to repeat until not (exists)
        delay 0.5
    end repeat
    set the last item of sh to do JavaScript JS in the front document
    close the front document
    do shell script (sh as text)
end repeat
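This is the bare bones. You will need to take care of error handling for cases such as malformed URLs or web pages that fail to load, but you now have all the tools to accomplish the steps you asked about.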
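As a rough illustration of what such error handling might look like on the shell side, the sketch below downloads each URL on its own, so that one bad URL does not abort the rest. fetch_one and download_each are hypothetical names; the curl options used (--fail, --silent, --show-error, --remote-name) are standard ones.

```shell
#!/bin/sh
# Hypothetical wrapper around curl: --fail makes curl exit non-zero on
# HTTP errors, --remote-name saves under the remote file name.
fetch_one() {
    curl --fail --silent --show-error --remote-name "$1"
}

# Hypothetical helper: try every URL, report the ones that failed on
# stderr, and return non-zero if at least one download failed.
download_each() {
    failed=0
    for url in "$@"; do
        if ! fetch_one "$url"; then
            printf 'failed: %s\n' "$url" >&2
            failed=$((failed + 1))
        fi
    done
    [ "$failed" -eq 0 ]
}
```

Downloading one URL per curl call is slower than a single --remote-name-all request, but it makes it clear which URL caused a problem.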
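Also, I recommend reading the manual page for curl (type man curl in Terminal), in particular the entry for the --remote-name-all option, where you will also discover many other options you may find useful.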
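One of those other options is --output (-o), which sets the saved file name explicitly and could be used to name each image after the page it came from. The sketch below assumes that the last path segment of the page URL is an acceptable file name; slug_from_url is a hypothetical helper, not something curl provides.

```shell
#!/bin/sh
# Hypothetical helper: derive a file-name stem from a page URL by taking
# its last path segment, without query string or fragment.
slug_from_url() {
    path=${1%%[?#]*}   # drop everything from the first ? or # onward
    path=${path%/}     # drop a trailing slash, if any
    printf '%s\n' "${path##*/}"
}
```

A download could then look like curl --fail -o "$(slug_from_url "$page_url").jpg" "$image_url", where page_url and image_url are placeholder variables.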
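I'll do my best to help you with any small roadblocks you run into or any questions about what I've written.
The URL doesn't contain a class mainImage.
@vadian That was just an example. The class on the provided URLs will be "image_stack__image js-default-img". Thanks
HTML parsing depends heavily on the specific data. In AppleScript you can use text item delimiters to do the parsing.
@vadian: Updated the script and the parsing part to get the image src using JS. I just can't solve Nr. 3 and Nr. 4 in the list. Grüess uf St.Galle :)
Both the tell application "Finder" code block and the set myPath to … line yield "Macintosh HD:Applications:Utilities:" in Script Editor. Why are you targeting that location? Anyway, SO is not a code-writing service; if you need help with code, you really need to provide a working example of what you want. For example, for #3, pick an actual file from one of the URLs and show what should be saved, and under what name! As for #4, that is what the repeat command is for, e.g. looping through a list.
Hi @CJK: Thanks a lot for your help. I've been playing around with your code for a while but haven't managed to make it work as expected yet. I've read the man page but couldn't find a description of --remote-name-all, and Google didn't provide any details either. Anyway, I'll add my edited code to my original post, with my questions above it. Thanks again, your help is very welcome.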