Geoinsyssoft Hadoop Training : Flume installation
Download the Flume binary release
http://archive.apache.org/dist/flume/stable/
Copy the archive to your home directory
Extract it
tar -xvf apache-flume-1.4.0-bin.tar.gz
Go to the extracted directory and create the configuration files from the templates
cd apache-flume-1.4.0-bin
sudo cp conf/flume-conf.properties.template conf/flume.conf
sudo cp conf/flume-env.sh.template conf/flume-env.sh
Copying a local file to HDFS through Flume: put the following configuration in conf/flume.conf
# Define a memory channel called ch1 on agent1
agent1.channels.ch1.type = memory
# Here exec1 is source name.
agent1.sources.exec1.channels = ch1
agent1.sources.exec1.type = exec
agent1.sources.exec1.command = tail -F /home/geoinsys/test/
# /home/geoinsys/test/ is the source path; the command must point at the text file to read.
# Define an HDFS sink that writes all events it receives to HDFS
# and connect it to the other end of the same channel.
# Here HDFS is the sink name.
agent1.sinks.HDFS.channel = ch1
agent1.sinks.HDFS.type = hdfs
agent1.sinks.HDFS.hdfs.path = hdfs://localhost:54310/anand
agent1.sinks.HDFS.hdfs.fileType = DataStream
# Finally, now that we've defined all of our components, tell
# agent1 which ones we want to activate.
agent1.channels = ch1
# The source name can be anything (exec1 is used here).
agent1.sources = exec1
# The sink name can be anything (HDFS is used here).
agent1.sinks = HDFS
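With only the settings above, the HDFS sink runs on its defaults and rolls to a new file frequently, which tends to leave many small files under the target path. The fragment below is a sketch of optional tuning that could be appended to the same file. It assumes the agent1, ch1 and HDFS names used above; the numeric values are illustrative assumptions, not figures from this training.

```properties
# Optional tuning (all values are illustrative assumptions).

# Memory channel: maximum number of events held in the channel,
# and maximum number of events moved per transaction.
agent1.channels.ch1.capacity = 1000
agent1.channels.ch1.transactionCapacity = 100

# HDFS sink: write events as plain text.
agent1.sinks.HDFS.hdfs.writeFormat = Text
# Roll to a new file every 300 seconds.
agent1.sinks.HDFS.hdfs.rollInterval = 300
# A value of 0 disables rolling by file size and by event count.
agent1.sinks.HDFS.hdfs.rollSize = 0
agent1.sinks.HDFS.hdfs.rollCount = 0
```

Comments are kept on their own lines because a properties file treats everything after the equals sign as part of the value.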
Run in a terminal, from the Flume directory
bin/flume-ng agent -n agent1 -c conf -f conf/flume.conf
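Because the source, channel and sink names are free-form, a mistyped channel name is an easy mistake to make in the configuration above. The following is a minimal sketch of a check that can be run before starting the agent. It is not part of Flume: check_flume_conf is a hypothetical helper name, and it only verifies that every channel referenced by a source or a sink also appears on the "agent1.channels" line.

```sh
# Hypothetical helper (not shipped with Flume).
# Usage: check_flume_conf <config-file> [agent-name]
# Prints one line per channel that a source or sink references but that
# is missing from "<agent>.channels", and returns non-zero if any is found.
check_flume_conf() {
    conf="$1"
    agent="${2:-agent1}"

    # Channels listed on the "<agent>.channels = ..." line.
    declared=$(sed -n "s/^[[:space:]]*${agent}\.channels[[:space:]]*=[[:space:]]*//p" "$conf")

    # Channels referenced by "<agent>.sources.<name>.channels"
    # and by "<agent>.sinks.<name>.channel".
    referenced=$(sed -n \
        -e "s/^[[:space:]]*${agent}\.sources\.[^.]*\.channels[[:space:]]*=[[:space:]]*//p" \
        -e "s/^[[:space:]]*${agent}\.sinks\.[^.]*\.channel[[:space:]]*=[[:space:]]*//p" \
        "$conf")

    status=0
    for ch in $referenced; do
        # "echo $declared" (unquoted) collapses the list to single spaces.
        case " $(echo $declared) " in
            *" $ch "*) ;;
            *) echo "undeclared channel: $ch"; status=1 ;;
        esac
    done
    return "$status"
}
```

After defining the function in the current shell, call it as: check_flume_conf conf/flume.conf agent1. No output and a zero exit status mean the channel names are consistent.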
A flow in Flume NG describes the whole transport path from a source to a sink.
A sink can also feed the source of another agent, so that different streams
are collected into one sink. The process that Flume starts is called an agent.
A setup could look like this example:
source -                     -> source => channel => sink
        \                   /
source - => channel => sink
        /                   \
source -                     -> channel => source => channel => sink
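One common way to chain agents so that a sink feeds another agent's source is an Avro sink on the first agent paired with an Avro source on the next. The fragment below is a sketch of that pairing, not a setup taken from this training: the agent name collector, the component names avroOut and avroIn, the host localhost and the port 41414 are all assumptions chosen for illustration.

```properties
# Forwarding agent (agent1): send events on to another agent
# instead of writing them to HDFS directly.
agent1.sinks = avroOut
agent1.sinks.avroOut.type = avro
agent1.sinks.avroOut.channel = ch1
agent1.sinks.avroOut.hostname = localhost
agent1.sinks.avroOut.port = 41414

# Collecting agent (hypothetical name: collector): receive events
# from one or more forwarding agents and write them to HDFS.
collector.channels = ch1
collector.sources = avroIn
collector.sinks = HDFS

collector.channels.ch1.type = memory

collector.sources.avroIn.type = avro
collector.sources.avroIn.bind = 0.0.0.0
collector.sources.avroIn.port = 41414
collector.sources.avroIn.channels = ch1

collector.sinks.HDFS.type = hdfs
collector.sinks.HDFS.channel = ch1
collector.sinks.HDFS.hdfs.path = hdfs://localhost:54310/anand
collector.sinks.HDFS.hdfs.fileType = DataStream
```

Each agent is started as its own process with its own agent name, and several forwarding agents can point at the same collector port to merge their streams.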