Trying to avoid detection using Selenium ChromeDriver (Java)
I am trying to automate browsing (a flight search), but the website always detects me when I try to submit the form. I first tried slowing down the form filling with Thread.sleep, but that did not work either. I also tried the answers suggested in other posts and removed the "Chrome is being controlled by automated test software" notification, but I am still detected when I submit the form and am asked to verify that I am human. The website is the one loaded by driver.get in the code.
Here is my automated browsing code. It fills in the form, but I am asked for verification on submit:
System.setProperty("webdriver.chrome.driver", "path\\chromedriver.exe");
ChromeOptions options = new ChromeOptions();
//options.addArguments("--headless");
//My attempts to avoid detection
options.addArguments("user-agent=Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.116 Safari/537.36"); //specify user agent
options.setExperimentalOption("excludeSwitches", new String [] {"enable-automation"});
options.setExperimentalOption("useAutomationExtension", false);
ChromeDriver driver = new ChromeDriver(options); //create the browser (not headless: the --headless argument is commented out above)
Map<String, Object> javascriptCodes = Map.of(
"source", "Object.defineProperty(navigator, 'webdriver', {get: () => undefined }); Object.defineProperty(navigator, 'languages', {get: function() { return ['en-US', 'en']; }, }); Object.defineProperty(navigator, 'plugins', { get: function() { return [1, 2, 3, 4, 5]; }, });"
);
driver.executeCdpCommand("Page.addScriptToEvaluateOnNewDocument", javascriptCodes);
JavascriptExecutor js = driver;
driver.get("https://www.americanairlines.ie/intl/ie/index.jsp");
try {
//AUTOMATED BROWSING
//Wait till button is clickable
WebDriverWait wait = new WebDriverWait(driver, Duration.ofSeconds(10));
String waitCondition = "bookingModule-submit";
wait.until(ExpectedConditions.elementToBeClickable(By.id(waitCondition)));
WaitFor(5);
//if cookie popup alert, close it
if (driver.findElements(By.xpath("//button[@class='optoutmulti_button']")).size() != 0) {
//close popup
WebElement popUp = driver.findElement(By.xpath("//button[@class='optoutmulti_button']"));
popUp.click();
WaitFor(1);
}
//Scroll to form
//Find the booking form by id and store it in the variable "Element"
WebElement Element = driver.findElement(By.id("bookingModule"));
//Scroll the page until the element is in view
js.executeScript("arguments[0].scrollIntoView();", Element);
//Fill in origin
WebElement origin = driver.findElement(By.xpath("//input[@name='origin']"));
origin.sendKeys("MIA");
WaitFor(2);
//Fill in destination
WebElement destination = driver.findElement(By.xpath("//input[@name='destination']"));
destination.sendKeys("LAX");
WaitFor(2);
//Clear depart date
driver.findElement(By.id("aa-leavingOn")).clear();
WaitFor(1);
//Fill in depart date
driver.findElement(By.id("aa-leavingOn")).sendKeys("15/08/2020");
WaitFor(2);
//Clear return date
driver.findElement(By.id("aa-returningFrom")).clear();
WaitFor(1);
//Fill in return date
driver.findElement(By.id("aa-returningFrom")).sendKeys("30/08/2020");
WaitFor(2);
//Submit form
//class = "btn btn-fullWidth"
WebElement submitBtn = driver.findElement(By.xpath("//input[@id='bookingModule-submit']"));
submitBtn.click();
//UPON CLICKING HERE IS WHERE I GET ASKED TO VERIFY MYSELF :(
//Set the web driver wait to 20 seconds. Waits until the element with class owd-calendar-slide-previous is clickable
wait = new WebDriverWait(driver, Duration.ofSeconds(20));
waitCondition = "owd-calendar-slide-previous";
wait.until(ExpectedConditions.elementToBeClickable(By.className(waitCondition)));
//By.className rejects compound class names, so use a CSS selector that matches both classes
String innerHTML = driver.findElement(By.cssSelector(".row.upsell-bound")).getAttribute("innerHTML");
System.out.println(innerHTML);
}
catch (Exception ex){
ex.printStackTrace();
}
And my sleep method:
public static void WaitFor(int seconds) throws InterruptedException {
seconds = seconds * 1000;
Thread.sleep(seconds);
}
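As a side note on the helper above, here is a minimal sketch of a slightly more idiomatic variant. The names Waits and waitFor are hypothetical and not part of the original code. It takes a java.time.Duration instead of reassigning an int parameter, and it restores the interrupt flag rather than forcing every caller to declare InterruptedException. This only tidies the helper. It is not expected to change whether the site detects the automation.

```java
import java.time.Duration;

// Hypothetical replacement for the WaitFor(int) helper in the question.
final class Waits {
    private Waits() {}

    // Sleeps for the given duration; if interrupted, restores the
    // thread's interrupt flag and returns early instead of throwing.
    static void waitFor(Duration duration) {
        try {
            Thread.sleep(duration.toMillis());
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        waitFor(Duration.ofMillis(200));
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println("waited about " + elapsedMs + " ms");
    }
}
```

In the question's code, a call such as WaitFor(2) would then become Waits.waitFor(Duration.ofSeconds(2)).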
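My Chrome browser version is 83. Selenium is 3.141.x; I also tried Selenium 4.0.0-alpha-4, with no effect.
Many websites have anti-crawler mechanisms. From Selenium's side there is no way to avoid this detection. You have to ask the site administrator to allow your specific crawler (Selenium) or to provide a bypass (for example, a specific request header and some credentials).
How did you try options.addArguments("--disable-blink-features=AutomationControlled")? It is not mentioned in your code. driver.executeCdpCommand("Page.addScriptToEvaluateOnNewDocument", ...) does not help if the value is read before the new document is invoked. I tested with a simple HTML+JS page that looks in navigator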
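For reference, here is a minimal sketch of where the switch from that comment would go. The class ChromeSwitches and its build method are hypothetical names used only for illustration. The sketch just collects the Chrome arguments in one list so it runs without Selenium on the classpath. With Selenium available, the list could be passed to ChromeOptions.addArguments, which accepts a List<String>. Whether this stops the site from asking for verification is not established here.

```java
import java.util.List;

// Hypothetical helper that gathers the Chrome command-line switches in one place.
final class ChromeSwitches {
    private ChromeSwitches() {}

    // The switch named in the comment; note the hyphens in the flag name.
    static final String DISABLE_AUTOMATION_CONTROLLED =
            "--disable-blink-features=AutomationControlled";

    // Returns the user-agent override used in the question plus the switch above.
    static List<String> build(String userAgent) {
        return List.of("user-agent=" + userAgent, DISABLE_AUTOMATION_CONTROLLED);
    }

    public static void main(String[] args) {
        String ua = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
                + "(KHTML, like Gecko) Chrome/83.0.4103.116 Safari/537.36";
        // With Selenium on the classpath, the equivalent would be:
        //   ChromeOptions options = new ChromeOptions();
        //   options.addArguments(ChromeSwitches.build(ua));
        build(ua).forEach(System.out::println);
    }
}
```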