Apache Kafka is a distributed message broker designed to handle large volumes of real-time data efficiently. Unlike traditional brokers such as ActiveMQ and RabbitMQ, Kafka runs as a cluster of one or more servers. This distributed design makes it highly scalable, gives it built-in fault tolerance, and lets it deliver higher throughput than its counterparts.
3. Implementation Steps (Single Node)
3.1. Installing a Single Node Kafka
3.1.1. Installing Java
sudo apt-get update
sudo apt-get install default-jre
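Before moving on, it is worth confirming that the JRE is actually on the PATH; the exact version string will vary with the Ubuntu release:
java -version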
3.1.2. Installing ZooKeeper
sudo apt-get install zookeeperd
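The zookeeperd package starts ZooKeeper immediately and on every boot, listening on port 2181. A quick health check is ZooKeeper's ruok four-letter command, which a healthy server answers with imok (this assumes netcat is installed):
echo ruok | nc localhost 2181
# expected reply: imok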
3.1.3. Create a service user for Kafka
sudo adduser --system --no-create-home --disabled-password --disabled-login kafka
3.1.4. Installing Kafka
3.1.5. Download Kafka
cd ~
curl https://kafka.apache.org/KEYS | gpg --import
wget https://archive.apache.org/dist/kafka/1.0.1/kafka_2.12-1.0.1.tgz
wget https://dist.apache.org/repos/dist/release/kafka/1.0.1/kafka_2.12-1.0.1.tgz.asc
gpg --verify kafka_2.12-1.0.1.tgz.asc kafka_2.12-1.0.1.tgz
3.1.6. Create a directory for extracting Kafka
sudo mkdir /opt/kafka
sudo tar -xvzf kafka_2.12-1.0.1.tgz --directory /opt/kafka --strip-components 1
3.1.7. Delete Kafka tarball and .asc file
rm -rf kafka_2.12-1.0.1.tgz kafka_2.12-1.0.1.tgz.asc
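The service user from step 3.1.3 can now take ownership of the installation. A minimal sketch, assuming the adduser defaults used above (system users are placed in the nogroup group):
sudo chown -R kafka:nogroup /opt/kafka
Note that the init script below still launches the broker as root; ownership by kafka only takes effect if you adapt the script to drop privileges (for example via su or runuser).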
3.1.8. Configuring Kafka Server
3.1.8.1. Set up Kafka to start automatically on boot
Copy the following init script to /etc/init.d/kafka:
#!/bin/bash
DAEMON_PATH=/opt/kafka/bin
DAEMON_NAME=kafka
# Check that networking is up.
#[ ${NETWORKING} = "no" ] && exit 0
PATH=$PATH:$DAEMON_PATH
# See how we were called.
case "$1" in
start)
# Start daemon.
echo "Starting $DAEMON_NAME";
nohup $DAEMON_PATH/kafka-server-start.sh -daemon /opt/kafka/config/server.properties
;;
stop)
# Stop daemons.
echo "Shutting down $DAEMON_NAME";
pid=`ps ax | grep -i 'kafka.Kafka' | grep -v grep | awk '{print $1}'`
if [ -n "$pid" ]
then
kill $pid   # SIGTERM lets the broker shut down cleanly; avoid kill -9
else
echo "Kafka was not Running"
fi
;;
restart)
$0 stop
sleep 2
$0 start
;;
status)
pid=`ps ax | grep -i 'kafka.Kafka' | grep -v grep | awk '{print $1}'`
if [ -n "$pid" ]
then
echo "Kafka is Running as PID: $pid"
else
echo "Kafka is not Running"
fi
;;
*)
echo "Usage: $0 {start|stop|restart|status}"
exit 1
esac
exit 0
3.1.8.2. Make the Kafka init script executable and register the service
sudo chmod 755 /etc/init.d/kafka
sudo update-rc.d kafka defaults
3.1.8.3. Start/Stop the Kafka service
sudo service kafka start
sudo service kafka status
sudo service kafka stop
3.1.9. Testing Kafka topics
3.1.9.1. Starting Kafka
sudo service kafka start
sudo service kafka status
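A freshly started broker listens on port 9092 by default, so a quick connectivity check (again assuming netcat is installed) is:
nc -z localhost 9092 && echo "Kafka broker is accepting connections"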
3.1.9.2. Topic creation
/opt/kafka/bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic test
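To confirm the topic exists and see its partition assignment, list or describe it:
/opt/kafka/bin/kafka-topics.sh --list --zookeeper localhost:2181
/opt/kafka/bin/kafka-topics.sh --describe --zookeeper localhost:2181 --topic test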
3.1.9.3. Publish messages to the test topic
/opt/kafka/bin/kafka-console-producer.sh --broker-list localhost:9092 --topic test
This opens a prompt; every line you type is published as a message to the topic.
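Because the producer reads from stdin, a message can also be published non-interactively, for example:
echo "Hello, Kafka" | /opt/kafka/bin/kafka-console-producer.sh --broker-list localhost:9092 --topic test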
3.1.9.4. Consume messages from the topic
/opt/kafka/bin/kafka-console-consumer.sh --zookeeper localhost:2181 --topic test --from-beginning
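The --zookeeper flag selects the old consumer, which is deprecated in Kafka 1.0.x; the console consumer can instead talk to the broker directly:
/opt/kafka/bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic test --from-beginning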
3.2. Making Kafka Scalable
Requirements:
Cluster ZooKeeper across all the servers
Cluster Kafka across all the servers
Install ZooKeeper on every server and edit /etc/zookeeper/conf/zoo.cfg on each node so that it lists all the ZooKeeper nodes:
server.0=10.0.0.1:2888:3888
server.1=10.0.0.2:2888:3888
server.2=10.0.0.3:2888:3888
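Each node must also declare which server.N entry it is. With the zookeeperd package the id is read from /etc/zookeeper/conf/myid (this path follows the Debian/Ubuntu packaging; adjust if your layout differs):
echo "0" | sudo tee /etc/zookeeper/conf/myid   # use 1 on 10.0.0.2 and 2 on 10.0.0.3
sudo service zookeeper restart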
Once Kafka is installed on all the servers, edit /opt/kafka/config/server.properties on each node and change the following settings.
broker.id must be unique for each node in the cluster (node-1 can keep the default broker.id=0):
for node-2, broker.id=1
for node-3, broker.id=2
Change the zookeeper.connect value so that it lists all the ZooKeeper hosts with their ports:
zookeeper.connect=10.0.0.1:2181,10.0.0.2:2181,10.0.0.3:2181
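Once all three brokers are up, replication can be verified by creating a topic that spans the cluster and describing it; the partition leaders should be spread across broker ids 0, 1 and 2 (the topic name here is just an example):
/opt/kafka/bin/kafka-topics.sh --create --zookeeper 10.0.0.1:2181 --replication-factor 3 --partitions 3 --topic replicated-test
/opt/kafka/bin/kafka-topics.sh --describe --zookeeper 10.0.0.1:2181 --topic replicated-test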