{"id":7690,"date":"2018-07-18T16:29:39","date_gmt":"2018-07-18T19:29:39","guid":{"rendered":"http:\/\/blog.plataformatec.com.br\/?p=7690"},"modified":"2018-10-17T16:47:20","modified_gmt":"2018-10-17T19:47:20","slug":"whats-new-in-flow-v0-14","status":"publish","type":"post","link":"http:\/\/blog.plataformatec.com.br\/2018\/07\/whats-new-in-flow-v0-14\/","title":{"rendered":"What’s new in Flow v0.14"},"content":{"rendered":"

Flow v0.14 was recently released with finer-grained control over data emission and tighter integration with GenStage.<\/p>\n

In this blog post, we start with a brief recap of Flow and then go over the new changes. We end with a description of the new Elixir Development Subscription service by Plataformatec and how it has helped us bring those improvements to Flow.<\/p>\n

Quick introduction to Flow<\/h2>\n

Flow<\/a> is a library for computational parallel flows in Elixir. It is built on top of GenStage<\/a>, which specifies how Elixir processes should communicate with back-pressure.<\/p>\n

Flow is inspired by the MapReduce and Apache Spark models but focuses on single node performance<\/a>. It aims to use all the cores of your machine efficiently.<\/p>\n

The “hello world” of data processing is a word counter. Here is how we would count the words in a file with Flow<\/code>:<\/p>\n

File.stream!(\"path\/to\/some\/file\")\n|> Flow.from_enumerable()\n|> Flow.flat_map(&String.split(&1, \" \"))\n|> Flow.partition()\n|> Flow.reduce(fn -> %{} end, fn word, acc ->\n  Map.update(acc, word, 1, & &1 + 1)\nend)\n|> Enum.to_list()\n<\/code><\/pre>\n

If you have a machine with 4 cores, the example above will create 9 lightweight Elixir processes that run concurrently:<\/p>\n