Go: How to process a growing queue without blocking (Go, Channel, Goroutine)


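I'm trying to understand how to process a queue in Go when the queue can grow from within the processing function itself. See the code below.

In this pseudocode I want to limit the number of handlers to 10, so I create 10 handlers that process the queue. Then I seed the queue with a URL.

My problem is that, according to the documentation, a sender on a channel blocks until a receiver receives the data. In the code below, every process is a receiver that handles a new URL. However, it is easy to see that if one process sends 11 links to the queue, it blocks until receivers have taken all of those new links. If each of those receivers has even one link of its own, it also blocks when it sends that one new link to the queue. Since everyone is blocked, nothing ever finishes.

I'd like to know what the general solution in Go is for processing a queue that can grow from within the process itself. Note that I think I could do this with a lock on an array named queue, but I'm trying to understand how it can be done with channels.
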
var queue = make(chan string)

func process() {
    for currentURL := range queue {
        links, _ := ... // some http call that gets links from a url
        for _, link := range links {
            queue <- link // blocks until some other worker receives
        }
    }
}

func main() {
    for i := 0; i < 10; i++ {
        go process()
    }

    queue <- "https://stackoverflow.com"
    ...
    // block until receive some quit message
    <-quit
}

There are two things you can do. One is to use a buffered channel, so that a send does not block even when nobody is receiving on the other end, as long as the buffer still has room. That way you can push values into the channel immediately.
A more efficient approach is to check whether a value is available on the channel or whether the channel has been closed. Closing is the sender's job, once all values have been sent.
A receiver can test whether a channel has been closed by assigning a second result from the receive expression.

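The two ideas in this answer can be sketched as follows. This is a minimal illustration, not the crawler itself: the capacity of 3 and the helper name `drain` are assumptions made for the example. Note that a buffer only postpones blocking; in a crawler whose queue can grow without bound, any fixed capacity can still fill up.

```go
package main

import "fmt"

// drain receives until ch is closed and empty, using the two-value
// receive form to detect closure. (Hypothetical helper for this sketch.)
func drain(ch <-chan string) []string {
	var got []string
	for {
		v, ok := <-ch
		if !ok { // channel is closed and no buffered values remain
			return got
		}
		got = append(got, v)
	}
}

func main() {
	// Capacity 3: the first three sends complete with no receiver running.
	queue := make(chan string, 3)
	queue <- "a"
	queue <- "b"
	queue <- "c"
	// A fourth send here would block, because the buffer is full.
	fmt.Println(len(queue), cap(queue)) // prints "3 3"

	close(queue)              // the sender closes once it has nothing more to send
	fmt.Println(drain(queue)) // prints "[a b c]"
}
```

Buffered values remain receivable after `close`, which is why `drain` still returns all three. In the crawler every worker is also a sender, so deciding who closes the channel needs some extra bookkeeping.
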
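A simple approach is to move the code that adds links to the channel into its own goroutine.
That way your main processing can continue, while the blocking channel write blocks a separate goroutine.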
func process() {
    for currentURL := range queue {
        links, _ := ... // some http call that gets links from a url
        for _, link := range links {
            l := link // this is important! ...
            // before Go 1.22 the loop re-sets the value of link before the
            // go routine is started, so each goroutine needs its own copy

            go func(l string) {
                queue <- l // we'll be blocked here...
                // but the "parent" routine can still iterate through the channel
                // which in turn un-blocks the write
            }(l)
        }
    }
}
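
For reference, here is a self-contained sketch of the same idea that can actually be run. Several parts are assumptions added for the example and are not in the answer: the in-memory `fakeLinks` map stands in for the HTTP call, the `visited` set stops the crawl from looping on cycles, and the `sync.WaitGroup` is one possible way to notice that the queue has drained so the channel can be closed.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// fakeLinks is a hypothetical stand-in for "some http call that gets links".
var fakeLinks = map[string][]string{
	"a": {"b", "c"},
	"b": {"c", "d"},
	"c": {"a"},
	"d": nil,
}

// crawl visits every URL reachable from start with a fixed number of
// receiving workers; each send runs in its own goroutine.
func crawl(start string, workers int) []string {
	queue := make(chan string)
	var (
		mu      sync.Mutex
		visited = map[string]bool{start: true}
		pending sync.WaitGroup // URLs that are queued or being processed
	)

	enqueue := func(u string) {
		// The send may block, but only this throwaway goroutine waits.
		go func() { queue <- u }()
	}

	for i := 0; i < workers; i++ {
		go func() {
			for u := range queue {
				for _, link := range fakeLinks[u] {
					mu.Lock()
					seen := visited[link]
					visited[link] = true
					mu.Unlock()
					if !seen {
						pending.Add(1) // counted before the parent URL is marked done
						enqueue(link)
					}
				}
				pending.Done()
			}
		}()
	}

	pending.Add(1)
	enqueue(start)
	pending.Wait() // every queued URL has been received and processed
	close(queue)   // lets the workers' range loops end

	out := make([]string, 0, len(visited))
	for u := range visited {
		out = append(out, u)
	}
	sort.Strings(out)
	return out
}

func main() {
	fmt.Println(crawl("a", 3)) // prints "[a b c d]"
}
```

The number of sender goroutines here is bounded only by the number of links discovered, so a large site can create a great many of them.
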

Damn, I had thought about using a defer statement, but for some reason it didn't occur to me to just run another goroutine. Thanks!

I initially accepted this, but I ran into another problem: this solution can lead to an unbounded number of goroutines. Running my web crawler on a site with many links (a product site like Amazon), I quickly exceeded the 8192 goroutine limit. Any suggestions on how to solve this? For now I create one goroutine for the whole list of links rather than one per link, but in theory the limit can still be reached.

You can use a semaphore to limit the number of goroutines your script is allowed to start. I'll edit my answer with an example...

import (
    "context"

    "golang.org/x/sync/semaphore"
)

const maxWorkers = 5000

// sem caps how many sender goroutines may exist at once. It is declared at
// package level so that process can use it.
var sem = semaphore.NewWeighted(maxWorkers)

func main() {
    ctx := context.TODO()
    for i := 0; i < 10; i++ {
        go process(ctx)
    }

    queue <- "https://stackoverflow.com"
    // block until receive some quit message
    <-quit
}

func process(ctx context.Context) {
    for currentURL := range queue {
        links, _ := ... // some http call that gets links from a url
        for _, link := range links {
            l := link // this is important! ...
            // before Go 1.22 the loop re-sets the value of link before the
            // go routine is started, so each goroutine needs its own copy

            // acquire a go routine...
            // if we are at the routine limit, this line will block until one becomes available
            if err := sem.Acquire(ctx, 1); err != nil {
                return // the context was cancelled while waiting
            }
            go func(l string) {
                defer sem.Release(1)
                queue <- l // we'll be blocked here...
                // but the "parent" routine can still iterate through the channel
                // which in turn un-blocks the write
            }(l)
        }
    }
}
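
One caveat with the semaphore version: if every permit is held by a goroutine blocked on its send while all ten workers are waiting in `Acquire`, nothing is left to receive from the queue, and the program stalls again. A different technique, not the one proposed in this thread, is to let a single dispatcher loop own an ordinary slice of pending URLs and hand them to a fixed pool of workers. The sketch below assumes an in-memory `fakeLinks` map in place of the HTTP call; the names `jobs`, `results`, and `inFlight` are illustrative.

```go
package main

import (
	"fmt"
	"sort"
)

// fakeLinks is a hypothetical stand-in for "some http call that gets links".
var fakeLinks = map[string][]string{
	"a": {"b", "c"},
	"b": {"c", "d"},
	"c": {"a"},
	"d": nil,
}

// crawl runs a fixed pool of workers fed by one dispatcher loop that keeps
// the growing queue in a slice, so no worker ever sends to the job channel.
func crawl(start string, workers int) []string {
	jobs := make(chan string)      // dispatcher -> workers
	results := make(chan []string) // workers -> dispatcher

	for i := 0; i < workers; i++ {
		go func() {
			for u := range jobs {
				results <- fakeLinks[u]
			}
		}()
	}

	visited := map[string]bool{start: true}
	pending := []string{start} // grows by append, which never blocks
	inFlight := 0              // URLs handed out but not yet reported back

	for len(pending) > 0 || inFlight > 0 {
		// A nil channel disables its select case, so a job is only
		// offered when there is one to offer.
		var out chan string
		var next string
		if len(pending) > 0 {
			out, next = jobs, pending[0]
		}
		select {
		case out <- next:
			pending = pending[1:]
			inFlight++
		case links := <-results:
			inFlight--
			for _, l := range links {
				if !visited[l] {
					visited[l] = true
					pending = append(pending, l)
				}
			}
		}
	}
	close(jobs) // all work is done; let the workers exit

	urls := make([]string, 0, len(visited))
	for u := range visited {
		urls = append(urls, u)
	}
	sort.Strings(urls)
	return urls
}

func main() {
	fmt.Println(crawl("a", 3)) // prints "[a b c d]"
}
```

Because only the dispatcher touches `visited` and `pending`, no lock is needed, and the goroutine count stays at the number of workers regardless of how many links a page contains. The trade-off is that the pending slice itself can grow as large as the set of undiscovered links.
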