Apache Flume is an efficient, reliable, distributed service for collecting, aggregating, and transferring large quantities of data as streaming data flows. The unit of data flow in Flume is called an event. The main components in the Flume architecture are the Flume source, the Flume channel, and the Flume sink, all of which are hosted by a Flume agent. A Flume source consumes events from an external origin such as a log file or a web server and stores them in a passive staging area called a Flume channel; examples of channel types are the JDBC channel, the file channel, and the memory channel. The Flume sink removes events from the channel and writes them to an external store such as HDFS. A sink can also forward events to another Flume source for processing by another Flume agent. The Flume architecture for a single-hop data flow is shown in Figure 6-1.
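The source-channel-sink wiring described above is declared in an agent's properties file. The following is a minimal sketch of a single-hop configuration; the agent name (agent1), component names (src1, ch1, sink1), log path, and HDFS URL are illustrative assumptions, not values from the text.

```
# Name the components hosted by this agent (agent1 is a hypothetical name)
agent1.sources = src1
agent1.channels = ch1
agent1.sinks = sink1

# Flume source: consume events by tailing a log file (path is illustrative)
agent1.sources.src1.type = exec
agent1.sources.src1.command = tail -F /var/log/app.log
agent1.sources.src1.channels = ch1

# Flume channel: a memory channel buffering events between source and sink
agent1.channels.ch1.type = memory
agent1.channels.ch1.capacity = 1000

# Flume sink: remove events from the channel and write them to HDFS
agent1.sinks.sink1.type = hdfs
agent1.sinks.sink1.hdfs.path = hdfs://namenode:8020/flume/events
agent1.sinks.sink1.channel = ch1
```

Such a configuration would typically be started with the flume-ng launcher, for example: flume-ng agent --conf conf --conf-file example.conf --name agent1.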
Vohra, D. (2016). Apache Flume. In Practical Hadoop Ecosystem (pp. 287–300). Apress. https://doi.org/10.1007/978-1-4842-2199-0_6