Groovy: splitCsv then map a list of URLs in Nextflow


I'm trying to take a GIAB data index file (CSV) and download each file it lists using Nextflow. I think my overall structure is right, but when I run nextflow run file.nf, nothing happens:

Channel.fromPath(file('https://raw.githubusercontent.com/genome-in-a-bottle/giab_data_indexes/master/NA12878/sequence.index.NA12878_Illumina_HiSeq_Exome_Garvan_trimmed_fastq_09252015'))
    .splitCsv(header: true)
    .map { it.FASTQ }
    .set { giab_urls }


process download_giab {
    storeDir 'giab'

    input:
        file giab_url from giab_urls

    output:
        file '*.fastq' into giab_fastqs

    script:
        """
        lftp -c 'get $giab_url'
        """
}
The resulting log file looks like this:

Nov-13 18:18:43.537 [main] DEBUG nextflow.cli.Launcher - $> /opt/miniconda3/bin/nextflow run main.nf
Nov-13 18:18:43.653 [main] INFO  nextflow.cli.CmdRun - N E X T F L O W  ~  version 18.10.1
Nov-13 18:18:43.661 [main] INFO  nextflow.cli.CmdRun - Launching `main.nf` [agitated_cori] - revision: 5cf3310536
Nov-13 18:18:43.757 [main] DEBUG nextflow.Session - Session uuid: c19f86b4-0eff-43de-8ad4-cb7936701490
Nov-13 18:18:43.758 [main] DEBUG nextflow.Session - Run name: agitated_cori
Nov-13 18:18:43.759 [main] DEBUG nextflow.Session - Executor pool size: 4
Nov-13 18:18:43.769 [main] DEBUG nextflow.cli.CmdRun - 
  Version: 18.10.1 build 5003
  Modified: 24-10-2018 14:03 UTC (25-10-2018 01:03 AEDT)
  System: Linux 4.15.0-38-generic
  Runtime: Groovy 2.5.3 on OpenJDK 64-Bit Server VM 1.8.0_181-8u181-b13-1ubuntu0.18.04.1-b13
  Encoding: UTF-8 (UTF-8)
  Process: 8747@michael-Latitude-7480 [127.0.1.1]
  CPUs: 4 - Mem: 23.4 GB (1.9 GB) - Swap: 2 GB (2 GB)
Nov-13 18:18:43.832 [main] DEBUG nextflow.Session - Work-dir: /home/michael/Programming/CromwellValidation/work [ext2/ext3]
Nov-13 18:18:43.832 [main] DEBUG nextflow.Session - Script base path does not exist or is not a directory: /home/michael/Programming/CromwellValidation/bin
Nov-13 18:18:43.904 [main] DEBUG nextflow.Session - Session start invoked
Nov-13 18:18:43.911 [main] DEBUG nextflow.processor.TaskDispatcher - Dispatcher > start
Nov-13 18:18:43.911 [main] DEBUG nextflow.script.ScriptRunner - > Script parsing
Nov-13 18:18:44.244 [main] DEBUG nextflow.script.ScriptRunner - > Launching execution
Nov-13 18:18:44.586 [main] DEBUG nextflow.processor.ProcessFactory - << taskConfig executor: null
Nov-13 18:18:44.586 [main] DEBUG nextflow.processor.ProcessFactory - >> processorType: 'local'
Nov-13 18:18:44.593 [main] DEBUG nextflow.executor.Executor - Initializing executor: local
Nov-13 18:18:44.596 [main] INFO  nextflow.executor.Executor - [warm up] executor > local
Nov-13 18:18:44.600 [main] DEBUG n.processor.LocalPollingMonitor - Creating local task monitor for executor 'local' > cpus=4; memory=23.4 GB; capacity=4; pollInterval=100ms; dumpInterval=5m
Nov-13 18:18:44.604 [main] DEBUG nextflow.processor.TaskDispatcher - Starting monitor: LocalPollingMonitor
Nov-13 18:18:44.605 [main] DEBUG n.processor.TaskPollingMonitor - >>> barrier register (monitor: local)
Nov-13 18:18:44.616 [main] DEBUG nextflow.executor.Executor - Invoke register for executor: local
Nov-13 18:18:44.672 [main] DEBUG nextflow.Session - >>> barrier register (process: download_giab)
Nov-13 18:18:44.676 [main] DEBUG nextflow.processor.TaskProcessor - Creating operator > download_giab -- maxForks: 4
Nov-13 18:18:44.736 [main] DEBUG nextflow.script.ScriptRunner - > Await termination 
Nov-13 18:18:44.736 [main] DEBUG nextflow.Session - Session await
Nov-13 18:18:44.758 [Actor Thread 3] DEBUG nextflow.Session - <<< barrier arrive (process: download_giab)
Nov-13 18:18:44.759 [main] DEBUG nextflow.Session - Session await > all process finished
Nov-13 18:18:44.813 [Task monitor] DEBUG n.processor.TaskPollingMonitor - <<< barrier arrives (monitor: local)
Nov-13 18:18:44.813 [main] DEBUG nextflow.Session - Session await > all barriers passed
Nov-13 18:18:44.818 [main] DEBUG nextflow.trace.StatsObserver - Workflow completed > WorkflowStats[succeedCount=0; failedCount=0; ignoredCount=0; cachedCount=0; succeedDuration=0ms; failedDuration=0ms; cachedDuration=0ms]
Nov-13 18:18:44.826 [main] DEBUG nextflow.CacheDB - Closing CacheDB done
Nov-13 18:18:44.842 [main] DEBUG nextflow.script.ScriptRunner - > Execution complete -- Goodbye

Any idea what I'm doing wrong? None of the Nextflow output is very illuminating.

You need to use the file() function to map each FASTQ path string to a file object, for example:

Channel.fromPath('https://raw.githubusercontent.com/genome-in-a-bottle/giab_data_indexes/master/NA12878/sequence.index.NA12878_Illumina_HiSeq_Exome_Garvan_trimmed_fastq_09252015')
    .splitCsv(header: true, sep:'\t')
    .map { file(it.FASTQ) }
    .set { giab_urls }
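Also note that you need to specify the sep option to handle the tab-separated file, and that you don't need the file() function when passing the URL to the fromPath method.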
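To illustrate why the separator matters, here is a minimal plain-Groovy sketch (not Nextflow's actual splitCsv implementation). The two-line index, the example.org URL, and the parseRow helper are all made up for illustration. It shows that splitting a tab-separated header on commas yields a single column, so looking up FASTQ returns null.

```groovy
// Hypothetical tab-separated index: a header line and one data row.
def header = "FASTQ\tFASTQ_MD5"
def row    = "ftp://example.org/sample_R1.fastq.gz\tabc123"

// Hypothetical helper: build a header -> value map for a given separator.
def parseRow = { String sep ->
    def keys = header.tokenize(sep)
    def vals = row.tokenize(sep)
    def result = [:]
    keys.eachWithIndex { k, i -> result[k] = vals[i] }
    result
}

def byComma = parseRow(',')   // wrong separator: the whole line is one column
def byTab   = parseRow('\t')  // correct separator for this file

assert byComma.size() == 1
assert byComma.FASTQ == null
assert byTab.FASTQ == 'ftp://example.org/sample_R1.fastq.gz'
```

With the comma separator the only key is the entire header line, which is consistent with the original pipeline having no usable FASTQ values to pass to the process.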


A description of this use case is also available.

Thanks, that's really helpful! I'm not sure what you mean by your first sentence, though.