Showing posts with label centos 6. Show all posts
Showing posts with label centos 6. Show all posts

Monday, July 6, 2015

Kafka Implementation on Centos


Hi Folks , here i am going to show how to implement the kafka on linux host and to make use of it, kafka installation is pretty simple and can be done in matter of few mins. 

Lets first understand what is kafka and why we are using it, some of basic stuff about kafka.

Kafka:- It is a distributed messaging system, where we have different components to produce / public and consume the messages stream. It is fault tolerant, consistent high performance throughput , very helpful even if you have to process live stream of TB's data. So lets see its architecture.






Zookeeper:- As we all know that it is used for maintain the state of the process and jobs, in this case it is used for maintaining and updating the consumed message offset/storing the broker address etc. Zookeeper is required to run kafka on machine.

Producer:- Producer create the topics and sent the message of that topics to broker for further processing.

Broker:-  Broker stores the data written by producers, its can store multiple read and write a time.

Consumer:- Consumer polls the messages from broker and use it.

Installation of kafka

1. Download the kafka from their official site or click here.

2. Untar it some place like i did on my user's home.

$  tar -xvzf kafka_2.10-0.8.2.0.tgz
$  mv kafka_2.10-0.8.2.0   kafka-0.8.2

3. Now you have start the zookeeper which is very important for running kafka producer , so just run the below commands

$ cd kafka-0.8.2
$ bin/zookeeper-server-start.sh /home/kafka/kafka-0.8.2/config/zookeeper.properties

note:- you need to edit your zookeeper.properties to define the data directory.

4. Now the zookeeper is started you can start your kafka server , which is pretty simple to execute

$ bin/kafka-server-start.sh /home/kafka/kafka-0.8.2/config/server.properties

5. Now the server is start , we can create a sample topic and public it to broker so lets create a sample topic

$ bin/kafka-topics.sh --create --zookeeper kafka:2181 --replication-factor 1 --partitions 1 --topic test

where kafka:2181 is zookeeper host and port , replication factor/partition  is 1 and topic name is test

6. Now you can check the topic is created or not by below command

$ bin/kafka-topics.sh --list --zookeeper kafka:2181

7. Now there are many ways to feed the data like command line and from Automated live feed

// you can feed the data from command line by below method.

$  bin/kafka-console-producer.sh --broker-list kafka:9092 --topic test

8. Now you can read the data whatever you have given in above commands

$ bin/kafka-console-consumer.sh --zookeeper kafka:2181 --topic test --from-beginning

output will the same words or data given while commands 7.

9. You can see the details of the topic you have created by below commands

$  bin/kafka-topics.sh --describe --zookeeper kafka:2181 --topic my-replicated-topic


That is all for now, please comment if you have any queries and doubts.