How is Apache Apex different from Apache Storm?

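They look very similar:

  • On both platforms, users build applications/topologies as a directed acyclic graph (DAG). Apex uses operators/streams, and Storm uses spouts/streams/bolts.
  • Both process data in real time rather than in batches.
  • Both seem to offer high throughput and low latency.

So at first glance the two look alike, and I don't quite see the difference. Could someone explain the key differences? In other words, when should I use one rather than the other?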

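There are fundamental architectural differences that make the two platforms very different in terms of latency, scaling, and state management.

At the most basic level:

  • Apache Storm uses record acknowledgement to guarantee message delivery
  • Apache Apex uses checkpointing to guarantee message delivery

You can read more about the differences in the blog below, which also covers other mainstream stream-processing platforms.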
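
To make the contrast between record acknowledgement and checkpointing more concrete, here is a minimal toy sketch in plain Python. It does not use the Storm or Apex APIs; the class names `AckingSource` and `CheckpointedSum`, the checkpoint interval, and the sample records are all hypothetical and chosen only for illustration.

```python
# Toy model only: these classes do not use the Storm or Apex APIs.

class AckingSource:
    """Per-record acknowledgement: each emitted record stays pending until acked."""

    def __init__(self, records):
        self.records = list(records)
        self.pending = {}   # message id -> record still waiting for an ack
        self.next_id = 0

    def emit(self):
        msg_id = self.next_id
        self.pending[msg_id] = self.records[msg_id]
        self.next_id += 1
        return msg_id, self.records[msg_id]

    def ack(self, msg_id):
        self.pending.pop(msg_id, None)

    def to_replay(self):
        # After a failure, every record that was never acked is sent again.
        return sorted(self.pending.items())


class CheckpointedSum:
    """Checkpointing: operator state is snapshotted every `interval` records."""

    def __init__(self, interval):
        self.interval = interval
        self.offset = 0          # number of input records consumed so far
        self.total = 0           # operator state
        self.snapshot = (0, 0)   # last checkpoint: (offset, total)

    def process(self, value):
        self.total += value
        self.offset += 1
        if self.offset % self.interval == 0:
            self.snapshot = (self.offset, self.total)

    def recover(self):
        # Roll state back to the last checkpoint; input is replayed from there.
        self.offset, self.total = self.snapshot
        return self.offset


records = [1, 2, 3, 4, 5]

# Acking: three records emitted, only the first two acked before a failure.
source = AckingSource(records)
ids = [source.emit()[0] for _ in range(3)]
source.ack(ids[0])
source.ack(ids[1])
replayed = source.to_replay()

# Checkpointing: failure after three records, with a checkpoint every two.
op = CheckpointedSum(interval=2)
for value in records[:3]:
    op.process(value)
resume_from = op.recover()
for value in records[resume_from:]:
    op.process(value)
```

In this sketch the acking source keeps one bookkeeping entry per in-flight record, while the checkpointed operator keeps only its last snapshot and relies on the input being replayable from the checkpointed offset. The real systems are considerably more involved; this is only meant to show the shape of the two approaches.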


    Architecture and Features

    +-------------------+---------------------------+---------------------+
    |                   |           Storm           |         Apex        |
    +-------------------+---------------------------+---------------------+
    | Model             | Native Streaming          | Native Streaming    |
    |                   | Micro batch (Trident)     |                     |
    +-------------------+---------------------------+---------------------+
    | Language          | Java                      | Java (Scala)        |
    |                   | Non-JVM languages         |                     |
    |                   | supported                 |                     |
    +-------------------+---------------------------+---------------------+
    | API               | Compositional             | Compositional (DAG) |
    |                   | Declarative (Trident)     | Declarative         |
    |                   | Limited SQL               |                     |
    |                   | support (Trident)         |                     |
    +-------------------+---------------------------+---------------------+
    | Locality          | Data Locality             | Advanced Processing |
    +-------------------+---------------------------+---------------------+
    | Latency           | Low                       | Very Low            |
    |                   | High (Trident)            |                     |
    +-------------------+---------------------------+---------------------+
    | Throughput        | Limited in Ack mode       | Very high           |
    +-------------------+---------------------------+---------------------+
    | Scalability       | Limited due to Ack        | Horizontal          |
    +-------------------+---------------------------+---------------------+
    | Partitioning      | Standard                  | Advanced            |
    |                   | Set parallelism at        | Parallel pipes,     |
    |                   | worker, executor and      | unifiers            |
    |                   | task level                |                     |
    +-------------------+---------------------------+---------------------+
    | Connector Library | Limited (certification)   | Rich library of     |
    |                   |                           | connectors in       |
    |                   |                           | Apex Malhar         |
    +-------------------+---------------------------+---------------------+
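
The "parallel pipes, unifiers" entry in the Partitioning row can be pictured with a small sketch: input is split across several parallel instances of an operator, each produces a partial result, and a unifier merges the partials into one. This is a plain-Python illustration under that assumption, not Apex code; `partition`, `count_words`, and `unify` are hypothetical helper names, and round-robin splitting is just one possible partitioning scheme.

```python
from collections import Counter


def partition(records, n):
    """Split records round-robin across n parallel operator instances."""
    parts = [[] for _ in range(n)]
    for i, record in enumerate(records):
        parts[i % n].append(record)
    return parts


def count_words(part):
    """Work done independently by each partition: a partial word count."""
    return Counter(part)


def unify(partials):
    """Unifier: merge the partial results from all partitions."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


words = ["a", "b", "a", "c", "b", "a"]
partials = [count_words(p) for p in partition(words, 3)]
merged = unify(partials)
```

The merged result is the same as counting the whole input in one place, which is the property a unifier has to preserve regardless of how many partitions are used.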
    
    Operability

    +------------+--------------------------+---------------------+
    |            |           Storm          |         Apex        |
    +------------+--------------------------+---------------------+
    | State      | External store           | Checkpointing       |
    | Management | Limited checkpointing    | Local checkpointing |
    |            | Difficult to exploit     |                     |
    |            | local state              |                     |
    +------------+--------------------------+---------------------+
    | Recovery   | Cumbersome API to        | Incremental         |
    |            | store and retrieve state | (buffer server)     |
    |            | Requires user code       |                     |
    +------------+--------------------------+---------------------+
    | Processing | At least once            | At least once       |
    | Semantics  | Exactly once requires    | End-to-end          |
    |            | user code and affects    | exactly once        |
    |            | latency                  |                     |
    +------------+--------------------------+---------------------+
    | Back       | Watermark on queue       | Automatic           |
    | Pressure   | size for spout and bolt  | Buffer server       |
    |            | Does not scale           | memory and disk     |
    +------------+--------------------------+---------------------+
    | Elasticity | Through CLI only         | Yes w/ full user    |
    |            |                          | control             |
    +------------+--------------------------+---------------------+
    | Dynamic    | No                       | Yes                 |
    | topology   |                          |                     |
    +------------+--------------------------+---------------------+
    | Security   | Kerberos                 | Kerberos, RBAC,     |
    |            |                          | LDAP                |
    +------------+--------------------------+---------------------+
    | Multi      | Mesos, RAS - memory,     | YARN                |
    | Tenancy    | CPU, YARN                | full isolation      |
    +------------+--------------------------+---------------------+
    | DevOps     | REST API                 | REST API            |
    | Tools      | Basic UI                 | DataTorrent RTS     |
    +------------+--------------------------+---------------------+
    
    Source:
    Webinar: Apache Apex (Next Gen Hadoop) vs. Storm - Comparison and Migration Outline
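
The Back Pressure row describes Storm as using a watermark on queue size. A generic high/low watermark scheme can be sketched as follows: once the queue fills to the high mark, the producer is asked to pause, and it is released again only after the queue drains to the low mark. This is a toy illustration, not Storm's implementation; the class name `WatermarkQueue` and the threshold values are hypothetical.

```python
from collections import deque


class WatermarkQueue:
    """Queue that raises a throttle flag between a high and a low watermark."""

    def __init__(self, high, low):
        self.items = deque()
        self.high = high          # queue length at which throttling starts
        self.low = low            # queue length at which throttling stops
        self.throttled = False    # producers should pause while this is True

    def put(self, item):
        self.items.append(item)
        if len(self.items) >= self.high:
            self.throttled = True

    def get(self):
        item = self.items.popleft()
        if len(self.items) <= self.low:
            self.throttled = False
        return item


queue = WatermarkQueue(high=8, low=3)
for i in range(8):
    queue.put(i)
throttled_when_full = queue.throttled

for _ in range(4):
    queue.get()
throttled_while_draining = queue.throttled   # length 4, still above the low mark

queue.get()
throttled_after_drain = queue.throttled      # length 3, at the low mark
```

Using two thresholds instead of one avoids flipping the throttle on and off for every record when the queue length hovers around a single limit.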

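    Please add Apache Flink and Apache Beam as well, along with the other DAG processors. Please also add use cases; I would like to see which use cases suit each one.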