Sunday, July 22, 2018

Deploying Kafka on Ubuntu

Apache Kafka is a distributed message broker designed to handle large volumes of real-time data efficiently. Unlike traditional brokers such as ActiveMQ and RabbitMQ, Kafka runs as a cluster of one or more servers, which makes it highly scalable, and due to this distributed nature it has built-in fault tolerance while delivering higher throughput than its counterparts.

3. Implementation Steps - Single Node

3.1. Installing a Single Node Kafka

3.1.1. Installing Java

sudo apt-get update
sudo apt-get install default-jre

3.1.2. Installing Zookeeper

sudo apt-get install zookeeperd

3.1.3. Create a service user for Kafka

sudo adduser --system --no-create-home --disabled-password --disabled-login kafka

3.1.4. Installing Kafka

3.1.5. Download Kafka

cd ~

# Download the release and its signature, import the Apache Kafka signing keys,
# and verify the tarball (archive.apache.org URLs assumed for the 1.0.1 release)
curl -O https://archive.apache.org/dist/kafka/1.0.1/kafka_2.12-1.0.1.tgz
curl -O https://archive.apache.org/dist/kafka/1.0.1/kafka_2.12-1.0.1.tgz.asc
curl https://archive.apache.org/dist/kafka/KEYS | gpg --import
gpg --verify kafka_2.12-1.0.1.tgz.asc kafka_2.12-1.0.1.tgz

3.1.6. Create a directory for extracting Kafka

sudo mkdir /opt/kafka
sudo tar -xvzf kafka_2.12-1.0.1.tgz --directory /opt/kafka --strip-components 1

3.1.7. Delete Kafka tarball and .asc file

rm -rf kafka_2.12-1.0.1.tgz kafka_2.12-1.0.1.tgz.asc

3.1.8. Configuring Kafka Server

Set up Kafka to start automatically on boot.

Copy the following init script to /etc/init.d/kafka:
#!/bin/bash
DAEMON_PATH=/opt/kafka/bin
DAEMON_NAME=kafka
# Check that networking is up.
#[ ${NETWORKING} = "no" ] && exit 0

# See how we were called.
case "$1" in
  start)
        # Start daemon.
        echo "Starting $DAEMON_NAME"
        nohup $DAEMON_PATH/kafka-server-start.sh -daemon /opt/kafka/config/server.properties
        ;;
  stop)
        # Stop daemons.
        echo "Shutting down $DAEMON_NAME"
        pid=`ps ax | grep -i 'kafka.Kafka' | grep -v grep | awk '{print $1}'`
        if [ -n "$pid" ]; then
          kill -9 $pid
        else
          echo "Kafka was not Running"
        fi
        ;;
  restart)
        $0 stop
        sleep 2
        $0 start
        ;;
  status)
        pid=`ps ax | grep -i 'kafka.Kafka' | grep -v grep | awk '{print $1}'`
        if [ -n "$pid" ]; then
          echo "Kafka is Running as PID: $pid"
        else
          echo "Kafka is not Running"
        fi
        ;;
  *)
        echo "Usage: $0 {start|stop|restart|status}"
        exit 1
        ;;
esac

exit 0
Make the Kafka service

sudo chmod 755 /etc/init.d/kafka
sudo update-rc.d kafka defaults

Start/Stop the Kafka service

sudo service kafka start
sudo service kafka status
sudo service kafka stop

3.1.9. Testing Kafka topics

Starting Kafka

sudo service kafka start
sudo service kafka status

Topic creation

/opt/kafka/bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic test

Publish messages to the test topic

/opt/kafka/bin/kafka-console-producer.sh --broker-list localhost:9092 --topic test

This will prompt for messages; we can enter a test message.

Consume messages from the topic

/opt/kafka/bin/kafka-console-consumer.sh --zookeeper localhost:2181 --topic test --from-beginning

3.2. Making Kafka Scalable

Clustering Zookeeper on all the servers
Clustering Kafka on all the servers

Install Zookeeper on all the servers and configure /etc/zookeeper/conf/zoo.cfg on each node to mention all the nodes of the Zookeeper ensemble.

Once Kafka is installed on all the servers, we will change the following settings in /opt/kafka/config/server.properties. broker.id should be unique for each node in the cluster:

broker.id=2 for node-2
broker.id=3 for node-3

Change the zookeeper.connect value so that it lists all Zookeeper hosts with their port.
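Concretely, assuming three nodes named node-1, node-2, and node-3 on the default ports (the host names here are illustrative), the ensemble and connection settings might look like this sketch:

```
# /etc/zookeeper/conf/zoo.cfg -- one server.N line per Zookeeper node
server.1=node-1:2888:3888
server.2=node-2:2888:3888
server.3=node-3:2888:3888

# /opt/kafka/config/server.properties -- same value on every broker
zookeeper.connect=node-1:2181,node-2:2181,node-3:2181
```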


Sunday, April 15, 2018

Enabling Hive Authorization in Qubole

Once Hive authorization is enabled in Qubole, we need to manage users and permissions through Hive authorization; the following are some of the commands used for this.

1. Listing the Current Roles

Set role admin;
show roles;

2. Create the roles

CREATE ROLE <role_name>;
Creates a new role. Only the admin role has privilege for this.

Set role admin;
Create role sysadmin;

3. Grant Role to users

GRANT ROLE <role_name> TO USER <user_name>;
Set role admin;
Grant Role sysadmin to user rahul ;

4. Revoke a role from user

REVOKE ROLE <role_name> FROM USER <user_name>;

Set role admin;
REVOKE Role sysadmin from user rahul;

5. List  Roles attached to a user


Set role admin;
show role grant user `rahul`;

6. List Users under a role

Set role admin;
show principals sysadmin;

7. Assign Role access to tables

Sample Permissions
SELECT privilege: provides read access to an object (table).
INSERT privilege: provides the ability to add data to an object (table).
UPDATE privilege: provides the ability to run UPDATE queries on an object (table).
DELETE privilege: provides the ability to delete data from an object (table).
ALL privilege: provides all of the above privileges.

GRANT <Permission> ON <table_name> TO ROLE <role_name>;

Grant all on default.testtable to role sysadmin;

8. View Role/user Permissions on tables

Check all users who have been granted with a specific role

SHOW GRANT USER <user_name> ON <table_name|ALL>;
SHOW GRANT ROLE <role_name> ON <table_name|ALL>;

SHOW GRANT user analytics on all;

Saturday, March 31, 2018

Parsing a Value from a JSON Field in Qubole

When the data in one of the fields in the Hive environment is in JSON format and we need to extract a value out of the JSON, we can use the following command:

get_json_object(column_name, '$.keyvalue')

The column name is jdata and the JSON in the column is as follows:

{
    "Foo": "ABC",
    "Bar": "20090101100000",
    "Quux": {
        "QuuxId": 1234,
        "QuuxName": "Sam"
    }
}
If we have to extract ABC: get_json_object(jdata, '$.Foo')
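As a rough illustration of what the Hive function does, here is a minimal Python sketch for simple dot paths (the function name is reused for clarity; this is not Hive's implementation, and array indexing is not handled):

```python
import json

def get_json_object(jdata, path):
    """Walk a JSON string following a '$.a.b' style dot path."""
    node = json.loads(jdata)
    for key in path.lstrip("$.").split("."):
        node = node[key]
    return node

jdata = '{"Foo": "ABC", "Bar": "20090101100000", "Quux": {"QuuxId": 1234, "QuuxName": "Sam"}}'
get_json_object(jdata, "$.Foo")            # "ABC"
get_json_object(jdata, "$.Quux.QuuxName")  # nested lookup: "Sam"
```

Nested keys are reached the same way in Hive, e.g. get_json_object(jdata, '$.Quux.QuuxName').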

Friday, February 16, 2018

Azure VPN Gateway with Cisco ASA using Routing

When we configure the Azure VPN Gateway with a Cisco ASA, there can be an issue related to the routing type, so we need to enable UsePolicyBasedTrafficSelectors on the Azure connection to solve it.

$RG1          = "****************"
$Connection16 = "****************"

$connection6  = Get-AzureRmVirtualNetworkGatewayConnection -Name $Connection16 -ResourceGroupName $RG1

$newpolicy6   = New-AzureRmIpsecPolicy -IkeEncryption AES256 -IkeIntegrity SHA384 -DhGroup DHGroup24 -IpsecEncryption AES256 -IpsecIntegrity SHA1 -PfsGroup PFS24 -SALifeTimeSeconds 28800 -SADataSizeKilobytes 4608000

Set-AzureRmVirtualNetworkGatewayConnection -VirtualNetworkGatewayConnection $connection6 -IpsecPolicies $newpolicy6

Set-AzureRmVirtualNetworkGatewayConnection -VirtualNetworkGatewayConnection $connection6 -IpsecPolicies $newpolicy6 -UsePolicyBasedTrafficSelectors $True

PS Azure:\> $connection6.UsePolicyBasedTrafficSelectors



PS Azure:\> $connection6.IpsecPolicies

Docker Management using Portainer

mkdir -p /mnt/docker
yum install -y rsync

* * * * * rsync -avzh /mnt/docker/ root@dm01:/mnt/docker/
* * * * * rsync -avzh /mnt/docker/ root@dm02:/mnt/docker/
* * * * * rsync -avzh /mnt/docker/ root@dm03:/mnt/docker/
* * * * * rsync -avzh /mnt/docker/ root@dm04:/mnt/docker/

Install Portainer with a persistent container
mkdir -p /mnt/docker/portainer/data

docker pull portainer/portainer
docker service create \
    --name portainer \
    --publish 9090:9000 \
    --constraint 'node.role == manager' \
    --mount type=bind,src=/mnt/docker/portainer/data,dst=/data \
    --mount type=bind,src=/var/run/docker.sock,dst=/var/run/docker.sock \
    portainer/portainer \
   -H unix:///var/run/docker.sock


Sunday, December 3, 2017

Qubole: load CSV with Spark

val df = sqlContext.read.format("com.databricks.spark.csv")
                    .option("delimiter", "|")
                    .option("header", "true")
                    .option("inferSchema", "true")
                    .load("s3://bucket/path/file.csv") // example path

df.registerTempTable("temp_table")

create table database.table as
select * from temp_table
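As a plain-Python sketch (not Spark) of what the reader options above mean: "delimiter" splits fields on pipes, "header" uses the first row as column names, and "inferSchema" turns numeric strings into numbers. Note that Spark infers one type per column, while this toy version converts each value independently:

```python
import csv
import io

# Sample pipe-delimited data with a header row (made up for illustration)
raw = "id|name|score\n1|alice|9.5\n2|bob|8.0\n"
rows = list(csv.DictReader(io.StringIO(raw), delimiter="|"))
# rows[0] == {"id": "1", "name": "alice", "score": "9.5"}

def infer(value):
    """Crude stand-in for inferSchema: try int, then float, else keep the string."""
    for cast in (int, float):
        try:
            return cast(value)
        except ValueError:
            pass
    return value

typed = [{k: infer(v) for k, v in row.items()} for row in rows]
# typed[0] == {"id": 1, "name": "alice", "score": 9.5}
```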

Tuesday, November 28, 2017

Increasing swap on an Azure Linux machine

To create a swap file in the directory that's defined by the ResourceDisk.MountPoint parameter, you can update the /etc/waagent.conf file by setting the following three parameters:

ResourceDisk.Format=y
ResourceDisk.EnableSwap=y
ResourceDisk.SwapSizeMB=xx

Note: The xx placeholder represents the desired number of megabytes (MB) for the swap file.

Restart the WALinuxAgent service by running one of the following commands, depending on the system in question:

Ubuntu: service walinuxagent restart
Red Hat/CentOS: service waagent restart

Run one of the following commands to show the new swap space that's being used after the restart:

dmesg | grep swap
swapon -s
cat /proc/swaps
file /mnt/resource/swapfile
free| grep -i swap

If the swap file isn't created, you can restart the virtual machine by using one of the following commands:

shutdown -r now
init 6