Scala: How to make parallel HTTP requests with Akka HTTP?

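I am new to Scala and am trying to implement a library that will be given thousands of URLs. My job is to download the content from those URLs. I would have gone with the simple `scalaj-http` library, but it does not serve my purpose. The code I came up with is: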

    class ProxyHttpClient {
      def get(url: String, proxy: ProxySettings, urlDownloaderConfig: UrlDownloaderConfig)
             (implicit ec: ExecutionContext): Either[HttpError, HttpSuccessResponse] = {
        implicit val system: ActorSystem = ActorSystem()
        implicit val materializer: ActorMaterializer = ActorMaterializer()

        val auth = headers.BasicHttpCredentials(proxy.userName, proxy.secret)
        val httpsProxyTransport = ClientTransport.httpsProxy(
          InetSocketAddress.createUnresolved(proxy.host, proxy.port), auth)
        val settings = ConnectionPoolSettings(system).withTransport(httpsProxyTransport)
        val response: Future[HttpResponse] =
          Http().singleRequest(HttpRequest().withMethod(HttpMethods.GET).withUri(url), settings = settings)

        val data: Future[Either[HttpError, HttpSuccessResponse]] = response.map {
          case response @ HttpResponse(StatusCodes.OK, _, _, _) => {
            val content: Future[String] = Unmarshal(response.entity).to[String]
            val finalContent = Await.ready(content, timeToWaitForContent).value.get.get.getBytes
            Right(HttpSuccessResponse(url, response.status.intValue(), finalContent))
          }
          case errorResponse @ HttpResponse(StatusCodes.GatewayTimeout, _, _, _) =>
            Left(HttpError(url, errorResponse.status.intValue(), errorResponse.entity.toString))
        }
        val result: Try[Either[HttpError, HttpSuccessResponse]] =
          Await.ready(data, timeToWaitForResponse).value.get
        val pop: Either[HttpError, HttpSuccessResponse] = try {
          result.get
        } catch {
          case e: Exception => Left(HttpError(url, HttpStatus.SC_INTERNAL_SERVER_ERROR, e.getMessage))
        }
        pop
      }
    }
To call the `get` method I use:

    val forkJoinPool = new scala.concurrent.forkjoin.ForkJoinPool(8)
    picList.par.tasksupport = new ForkJoinTaskSupport(forkJoinPool)
    picList.par.map(testUrl => {
      val resp = get(url, Option(proxy))
    })

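It ran smoothly a few times, but when I tried to call the method for 1000 URLs, fetching images in batches of 100, it threw the error below. After that, I get the same error even for a single URL.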

**java.lang.OutOfMemoryError: unable to create new native thread**
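  • Should I use actors here instead of an actor system, and assign them a separate dispatcher?

  • Since I am holding the binary content of images, do I have to take care of removing it from memory once it has served its purpose?

  • A code snippet would be more helpful. Thanks in advance.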

I tried to follow the advice that people suggest online:

    val blockingExecutionContext = system.dispatchers.lookup("blocking-dispatcher")

But when I try it, `system.dispatchers.lookup` returns a value of type `MessageDispatcher`:

    implicit val system: ActorSystem = ActorSystem()
    val ex: MessageDispatcher = system.dispatchers.lookup("io-blocking-dispatcher")

Am I missing a library or an import?
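
On the `MessageDispatcher` question: no extra import should be needed, because `MessageDispatcher` extends `ExecutionContextExecutor` and can therefore be used wherever an `ExecutionContext` is expected. The lookup does require the dispatcher id to be defined in the configuration; an unconfigured id makes `lookup` throw. The following is a minimal sketch, assuming a dispatcher named `blocking-dispatcher` with a pool size of 16 (both chosen for illustration) and an inline config that would normally live in `application.conf`. The object name `BlockingDispatcherExample` is hypothetical.

```scala
import akka.actor.ActorSystem
import com.typesafe.config.ConfigFactory

import scala.concurrent.{ExecutionContext, Future}

object BlockingDispatcherExample {
  // Assumed dispatcher definition; the id and pool size are illustrative.
  private val config = ConfigFactory.parseString(
    """
      |blocking-dispatcher {
      |  type = Dispatcher
      |  executor = "thread-pool-executor"
      |  thread-pool-executor { fixed-pool-size = 16 }
      |  throughput = 1
      |}
    """.stripMargin).withFallback(ConfigFactory.load())

  // Creates the single, application-wide actor system and looks up the dispatcher.
  def start(): (ActorSystem, ExecutionContext) = {
    val system = ActorSystem("downloader", config)
    // MessageDispatcher is an ExecutionContextExecutor, so this ascription compiles.
    val blockingEc: ExecutionContext = system.dispatchers.lookup("blocking-dispatcher")
    (system, blockingEc)
  }

  // Runs a blocking piece of work on the given execution context.
  def runBlocking[A](work: => A)(implicit ec: ExecutionContext): Future[A] = Future(work)
}
```

Blocking calls such as `Await` can then be wrapped in `runBlocking` with the looked-up context in implicit scope, which keeps them off the default dispatcher.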

Your problem is most likely caused by creating an actor system for every HTTP call. Each actor system starts its own thread pools, and there is usually one actor system per application.

Do a small refactoring and try this:

    class ProxyHttpClient() {
      private implicit val system: ActorSystem = ActorSystem()
      private implicit val materializer: ActorMaterializer = ActorMaterializer()

      def get(url: String, proxy: ProxySettings, urlDownloaderConfig: UrlDownloaderConfig)
             (implicit ec: ExecutionContext): Either[HttpError, HttpSuccessResponse] = {???}
    }

Or extract the actor system and pass it in as an implicit parameter:

    class ProxyHttpClient() {

      def get(url: String, proxy: ProxySettings, urlDownloaderConfig: UrlDownloaderConfig)
             (implicit ec: ExecutionContext, system: ActorSystem, materializer: ActorMaterializer): Either[HttpError, HttpSuccessResponse] = {???}
    }
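
With a single shared actor system in place, the parallel collection and the `Await` calls could also be replaced by a non-blocking stream. The following is a rough sketch rather than a drop-in replacement. It assumes Akka 2.6+ and Akka HTTP 10.2+, where an implicit `ActorSystem` also supplies the materializer. `Failed`, `Downloaded` and `ParallelDownloader` are hypothetical names standing in for the question's `HttpError` and `HttpSuccessResponse`, and the proxy settings are left out for brevity.

```scala
import akka.actor.ActorSystem
import akka.http.scaladsl.Http
import akka.http.scaladsl.model.{HttpRequest, StatusCodes}
import akka.http.scaladsl.unmarshalling.Unmarshal
import akka.stream.scaladsl.{Sink, Source}

import scala.concurrent.{ExecutionContext, Future}

// Hypothetical result types used only for this illustration.
final case class Failed(url: String, status: Int, message: String)
final case class Downloaded(url: String, status: Int, body: Array[Byte])

object ParallelDownloader {
  // Fetches one URL without blocking a thread; errors become Left values.
  def fetch(url: String)(implicit system: ActorSystem,
                         ec: ExecutionContext): Future[Either[Failed, Downloaded]] =
    Http().singleRequest(HttpRequest(uri = url)).flatMap { response =>
      if (response.status == StatusCodes.OK)
        Unmarshal(response.entity).to[Array[Byte]]
          .map(bytes => Right(Downloaded(url, response.status.intValue(), bytes)))
      else {
        // Unread entities must be discarded so the pooled connection is released.
        response.discardEntityBytes()
        Future.successful(
          Left(Failed(url, response.status.intValue(), s"unexpected status ${response.status}")))
      }
    }.recover { case e: Exception => Left(Failed(url, 500, String.valueOf(e.getMessage))) }

  // Runs at most `parallelism` fetches at once; results keep the input order.
  def fetchAll(urls: List[String], parallelism: Int)(
      fetchOne: String => Future[Either[Failed, Downloaded]])(
      implicit system: ActorSystem): Future[Seq[Either[Failed, Downloaded]]] =
    Source(urls).mapAsync(parallelism)(fetchOne).runWith(Sink.seq)
}
```

A call might look like `ParallelDownloader.fetchAll(picList, 8)(url => ParallelDownloader.fetch(url))`, with one implicit `ActorSystem` and `ExecutionContext` in scope for the whole application. Passing the fetch function as a parameter keeps `fetchAll` easy to test with a stub. The host connection pool limits in the Akka HTTP configuration still apply, so the parallelism value may need tuning alongside them.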
    

Do all the URLs point to the same host and port?

Yes, they are routed through the same proxy.

Yes, I suspected the same. I created an object, and the system is now created only once. Your comment confirmed my suspicion. Thanks a lot.