06 - Acquire Data Using CLI and Flume
Module 4: Data Unification and Analysis Lesson 08: Primary Administrative Tasks
for Oracle NoSQL Database
Objectives
Viewing File System Contents Using the CLI
Loading Data Using the CLI
What is Flume?
Flume: Architecture
Diagram: Web Server -> Agent -> HDFS
Flume Channels (Hold Events)
• Memory channel
• JDBC channel
• File channel
• Custom channel
Diagram: Web Server -> Agent (Source -> Channel -> Sink) -> HDFS
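The role of a channel, holding events between the source and the sink inside an agent, can be illustrated with a small toy model. This is only a sketch in plain Python, not Flume's API: the class MemoryChannel, the functions run_source and run_sink, the capacity value, and the list standing in for HDFS are all hypothetical names chosen for illustration.

```python
from collections import deque


class MemoryChannel:
    """Toy stand-in for a memory channel: a bounded in-process buffer of events."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._events = deque()

    def put(self, event):
        # Refuse new events when full (a real channel may behave differently).
        if len(self._events) >= self.capacity:
            raise OverflowError("channel is full")
        self._events.append(event)

    def take(self):
        # Return the oldest held event, or None when the channel is empty.
        return self._events.popleft() if self._events else None


def run_source(lines, channel):
    """Source side: wrap each incoming line as an event and hand it to the channel."""
    for line in lines:
        channel.put({"headers": {}, "body": line})


def run_sink(channel, destination):
    """Sink side: drain held events from the channel into the destination."""
    while (event := channel.take()) is not None:
        destination.append(event["body"])


web_server_lines = ["GET /index.html", "GET /about.html"]
hdfs_stand_in = []  # a plain list standing in for HDFS

channel = MemoryChannel(capacity=10)
run_source(web_server_lines, channel)
run_sink(channel, hdfs_stand_in)
print(hdfs_stand_in)
```

In this model the source and the sink never call each other directly; they only share the channel, which is why the channel type (memory, JDBC, file, or custom) can be swapped without changing either side.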
Flume: Data Flows
Diagram: HTTPD -> agent node -> processor node -> collector node -> HDFS
• Agent node (upstream): tail Apache HTTPD logs
• Processor node (downstream of the agent, upstream of the collector): extract browser name from log string and attach it to event
• Collector node (downstream): write to HDFS://namenode/weblogs/%(browser)/
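The processing step in this flow (extract the browser name from the log string, attach it to the event, and use it in the target path under /weblogs/) can be sketched in plain Python. This is an illustration only, not Flume code. It assumes the user agent is the last double-quoted field of the log line, as in Apache's combined log format; the function names, the short browser list, and the sample log line are hypothetical.

```python
import re

# Assumption: the user agent is the last double-quoted field of the log line,
# as in Apache's "combined" log format.
USER_AGENT_RE = re.compile(r'"([^"]*)"\s*$')

# Hypothetical, deliberately short list of browser tokens to look for.
# Order matters: Chrome user agents also contain the token "Safari".
KNOWN_BROWSERS = ["Firefox", "Chrome", "Safari"]


def extract_browser(log_line):
    """Return the first known browser token found in the user agent, else 'unknown'."""
    match = USER_AGENT_RE.search(log_line)
    if not match:
        return "unknown"
    user_agent = match.group(1)
    for name in KNOWN_BROWSERS:
        if name in user_agent:
            return name
    return "unknown"


def process(log_line):
    """Processor step: attach the browser name to the event as a header."""
    return {"headers": {"browser": extract_browser(log_line)}, "body": log_line}


def target_directory(event):
    """Collector step: fill the attached browser name into the weblogs path."""
    return "hdfs://namenode/weblogs/{}/".format(event["headers"]["browser"])


line = ('127.0.0.1 - - [10/Oct/2000:13:55:36 -0700] "GET /index.html HTTP/1.0" '
        '200 2326 "-" "Mozilla/5.0 (X11; Linux x86_64; rv:109.0) '
        'Gecko/20100101 Firefox/115.0"')
event = process(line)
print(target_directory(event))  # prints hdfs://namenode/weblogs/Firefox/
```

Attaching the browser name to the event once, at the processor, means the collector only has to read that attribute to choose a directory and never needs to parse the log string itself.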
Configuring Flume
Exploring a flume*.conf File
Additional Resources
• http://flume.apache.org/index.html
Summary