Scala: Do my Scala program and my web browser get different sources from the same link?


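I'm currently trying to fetch some data from a few web pages using Scala and Eclipse. My problem is this: when I look at the page source in my browser, reading the content with Scala's xml package looks very simple: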

<!doctype html>
  <html lang="de">
  <head>
  <meta charset="utf-8">
<title>some text</title>

<meta name="keywords" content="some text" />
<meta name="description" content="some text" />
<meta name="robots" content="noodp"/>
<meta name="page-topic" content="some text" />

<meta http-equiv="x-ua-compatible" content="ie=edge"/>
...
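So why do I only get a different, mobile version of the page instead of the one I can view in my browser, and why do I get an error message like this? (What I actually receive, followed by the stack trace, is shown further below.)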


Thanks

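It looks like either the server decides from the User-Agent header that your scraper is a mobile browser and serves you the wrong page, or your scraper isn't logged in and is therefore sent a 403 Forbidden. Can your scraper handle cookies?

No, it doesn't handle cookies.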
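To test the User-Agent theory from the comment above, the request can be sent with a desktop-style User-Agent header (by default, the JDK's `HttpURLConnection` identifies itself as `Java/<version>`). The following is only a minimal sketch: the object name `DesktopRequestSketch`, the User-Agent string, and the UTF-8 assumption are illustrative choices, not something taken from the question, and whether the server then returns the desktop page depends entirely on the site.

```scala
import java.net.{HttpURLConnection, URL}
import scala.io.Source

object DesktopRequestSketch {
  // Hypothetical desktop-style User-Agent; any string the server treats
  // as a desktop browser would serve the same purpose.
  val DesktopUserAgent: String =
    "Mozilla/5.0 (X11; Linux x86_64; rv:115.0) Gecko/20100101 Firefox/115.0"

  // Prepare a GET request carrying the given User-Agent.
  // Nothing is sent until the response is actually read.
  def open(url: String, userAgent: String = DesktopUserAgent): HttpURLConnection = {
    val conn = new URL(url).openConnection().asInstanceOf[HttpURLConnection]
    conn.setRequestProperty("User-Agent", userAgent)
    conn
  }

  // Send the request and return the response body as a string
  // (assumes the page is UTF-8 encoded).
  def fetchBody(url: String): String = {
    val in = open(url).getInputStream
    try Source.fromInputStream(in, "UTF-8").mkString
    finally in.close()
  }
}
```

Printing the first few lines of `DesktopRequestSketch.fetchBody(...)` for the page in question should show whether the doctype changes from the XHTML Mobile one to `<!doctype html>`.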
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//WAPFORUM//DTD XHTML Mobile 1.0//EN" "http://www.wapforum.org/DTD/xhtml-mobile10.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="de">
<head>
  <meta http-equiv="Content-Type" content="text/html;charset=utf-8" />
  <meta name="viewport" content="width=device-width,initial-scale=1,maximum-scale=1" />
  <title>some text</title>
  <link rel="shortcut icon" type="image/ico" href="/favicon.ico" />
  <link href="/res/im.min.css" media="all, handheld" rel="stylesheet" type="text/css" />
</head>
...
Exception in thread "main" java.io.IOException: Server returned HTTP response code: 403 for URL: http://www.wapforum.org/DTD/xhtml-mobile10.dtd
    at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1625)
    at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.setupCurrentEntity(XMLEntityManager.java:633)
    at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.startEntity(XMLEntityManager.java:1271)
    at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.startDTDEntity(XMLEntityManager.java:1238)
    at com.sun.org.apache.xerces.internal.impl.XMLDTDScannerImpl.setInputSource(XMLDTDScannerImpl.java:260)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$DTDDriver.dispatch(XMLDocumentScannerImpl.java:1153)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$DTDDriver.next(XMLDocumentScannerImpl.java:1049)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$PrologDriver.next(XMLDocumentScannerImpl.java:962)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(XMLDocumentScannerImpl.java:607)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:489)
    ...
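
Note that the 403 in this stack trace is reported for http://www.wapforum.org/DTD/xhtml-mobile10.dtd, not for the scraped page: the frames under `XMLEntityManager.startDTDEntity` show the XML parser trying to download the DTD named in the DOCTYPE. One way to avoid that download is to give Scala's XML loader a SAX parser that does not load external DTDs. This is a sketch under assumptions: it relies on the scala-xml module and on the Xerces-specific `load-external-dtd` feature, which the JDK's bundled parser accepts but other SAX implementations may reject; `NoDtdXmlSketch` is a hypothetical name.

```scala
import java.io.StringReader
import javax.xml.parsers.{SAXParser, SAXParserFactory}
import org.xml.sax.InputSource
import scala.xml.{Elem, XML}

object NoDtdXmlSketch {
  // Xerces feature controlling whether the external DTD referenced in the
  // DOCTYPE is fetched when the parser is not validating.
  private val LoadExternalDtd =
    "http://apache.org/xml/features/nonvalidating/load-external-dtd"

  // Build a non-validating SAX parser that skips the external DTD.
  def newParser(): SAXParser = {
    val factory = SAXParserFactory.newInstance()
    factory.setValidating(false)
    factory.setFeature(LoadExternalDtd, false)
    factory.newSAXParser()
  }

  // Parse any input source (for example one wrapping an HTTP response stream).
  def parse(source: InputSource): Elem =
    XML.withSAXParser(newParser()).load(source)

  // Convenience wrapper for markup that is already held in a string.
  def parseString(xml: String): Elem =
    parse(new InputSource(new StringReader(xml)))
}
```

With such a parser, something like `(NoDtdXmlSketch.parseString(markup) \\ "title").text` can be evaluated on the mobile XHTML without contacting wapforum.org. The desktop page shown in the question is a separate matter: it is HTML5 with an unclosed `<meta charset="utf-8">`, so it is not well-formed XML and a strict XML parser would likely reject it even once the server returns it.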