Integration of LVM with Hadoop-Cluster using AWS Cloud

What is LVM? Why do we use it?

🔗Logical Volume Management enables the combining of multiple individual hard drives or disk partitions into a single volume group (VG). That volume group can then be subdivided into logical volumes (LV) or used as a single large volume. Regular file systems, such as EXT3 or EXT4, can then be created on a logical volume.

🔗The EXT2, 3, and 4 filesystems all allow both offline (unmounted) and online (mounted) resizing when increasing the size of a filesystem, and offline resizing when reducing the size.

🔗LVM provides elasticity to a storage device; it is an advanced form of partitioning.

🔗In the figure below, two complete physical hard drives and one partition from a third hard drive have been combined into a single volume group. Two logical volumes have been created from the space in the volume group, and a filesystem, such as EXT3 or EXT4, has been created on each of the two logical volumes.

What is Elasticity?

👉🏻The property of a substance that enables it to change its length, volume, or shape in direct response to a force effecting such a change, and to recover its original form upon the removal of the force, is called Elasticity. For storage, it means a volume can grow or shrink on demand.

Difference between static and dynamic partition

🔗There are two memory management techniques: contiguous and non-contiguous. In the contiguous technique, an executing process must be loaded entirely into main memory. The contiguous technique can be divided into:

1. Fixed (or static) partitioning

2. Variable (or dynamic) partitioning

I) Fixed Partitioning:
This is the oldest and simplest technique used to put more than one process in main memory. In this scheme, the number of (non-overlapping) partitions in RAM is fixed, but the size of each partition may or may not be the same. As it is a contiguous allocation, no spanning is allowed. Here the partitions are made before execution or during system configuration.

II) Variable (Dynamic) Partitioning:
It is also part of the contiguous allocation technique and is used to alleviate the problems faced by fixed partitioning. In contrast with fixed partitioning, partitions are not made before execution or during system configuration; they are created at run time according to the needs of each process.

Task Description 📄

🌀7.1: Elasticity Task

🔅Integrating LVM with Hadoop and providing Elasticity to Data Node Storage

🔅Increase or Decrease the Size of a Static partition in Linux using LVM storage.

First, we need to launch the Data Node from the AWS portal.

While launching the Data Node, I attach two more hard disks for LVM.

Here we can see that both of my instances, for the data node and the master node, have launched successfully.

Now we need to configure the HDFS cluster.

Inside Master Node

We need the Hadoop and Java software to set up a Hadoop cluster.

Now we will configure the core-site.xml file inside the /etc/hadoop folder.

On the master node we use the neutral IP 0.0.0.0. It is not a gateway; it tells the master to listen on all of its network interfaces, so other systems can connect to it through both its private and its public IP.

In my case I am using port 9001. You can check which ports are already in use on your system with #netstat -tnlp and pick one that is not listed.
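
Since the screenshot of the file is not reproduced here, below is a minimal sketch of what such a master-node core-site.xml might look like. The property name fs.default.name is an assumption for a Hadoop 1.x style setup (suggested by the hadoop-daemon.sh commands used in this article), and CONF_DIR is a hypothetical variable so the sketch writes into a scratch folder instead of /etc/hadoop.

```shell
# Sketch only: writes a candidate core-site.xml for the master node.
# CONF_DIR is a hypothetical variable; point it at /etc/hadoop on a real node.
CONF_DIR="${CONF_DIR:-./hadoop-conf-master}"
mkdir -p "$CONF_DIR"

cat > "$CONF_DIR/core-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <!-- Assumed Hadoop 1.x property name -->
    <name>fs.default.name</name>
    <!-- 0.0.0.0 = listen on every interface; 9001 = port used in this article -->
    <value>hdfs://0.0.0.0:9001</value>
  </property>
</configuration>
EOF

# Show what was written.
cat "$CONF_DIR/core-site.xml"
```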

Now let’s configure hdfs-site.xml file
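
The article does not show the contents of this file, so here is a hedged sketch of a master-node hdfs-site.xml. The property name dfs.name.dir is an assumption for Hadoop 1.x, and the /nn folder and the CONF_DIR and NAME_DIR variables are hypothetical names chosen only for illustration.

```shell
# Sketch only: writes a candidate hdfs-site.xml for the master node.
# CONF_DIR and NAME_DIR are hypothetical; /nn is not a path from the article.
CONF_DIR="${CONF_DIR:-./hadoop-conf-master}"
NAME_DIR="${NAME_DIR:-/nn}"
mkdir -p "$CONF_DIR"

# Unquoted EOF so that ${NAME_DIR} is expanded inside the file.
cat > "$CONF_DIR/hdfs-site.xml" <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <!-- Assumed Hadoop 1.x property: where the NameNode keeps its metadata -->
    <name>dfs.name.dir</name>
    <value>${NAME_DIR}</value>
  </property>
</configuration>
EOF

cat "$CONF_DIR/hdfs-site.xml"
```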

Before connecting any data node or using any storage, we need to format the master node directory using #hadoop namenode -format

Now, to start the master node service we use #hadoop-daemon.sh start namenode and we can verify it with the #jps command.

Now the master node is configured, so let's configure the data node. On this node, core-site.xml is configured exactly as on the name node; only the IP changes, because it must be the master node's IP.

Inside Data Node

Here too, first install both software packages and then configure core-site.xml inside the /etc/hadoop folder.

Let us configure hdfs-site.xml file
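
A rough sketch of the two data-node files described above. The property names fs.default.name and dfs.data.dir are assumptions for a Hadoop 1.x setup. MASTER_IP defaults to 203.0.113.10, which is only a placeholder from the reserved documentation range and must be replaced with your master node's IP. /dn1 is the data folder used later in this article, and CONF_DIR is a hypothetical scratch folder.

```shell
# Sketch only: writes candidate data-node config files into a scratch folder.
CONF_DIR="${CONF_DIR:-./hadoop-conf-datanode}"
MASTER_IP="${MASTER_IP:-203.0.113.10}"   # placeholder, NOT a real master IP
DATA_DIR="${DATA_DIR:-/dn1}"             # data folder used later in the article
mkdir -p "$CONF_DIR"

# Same file as on the master, except it points at the master's IP.
cat > "$CONF_DIR/core-site.xml" <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://${MASTER_IP}:9001</value>
  </property>
</configuration>
EOF

# Folder in which the DataNode stores HDFS blocks.
cat > "$CONF_DIR/hdfs-site.xml" <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>dfs.data.dir</name>
    <value>${DATA_DIR}</value>
  </property>
</configuration>
EOF

cat "$CONF_DIR/core-site.xml" "$CONF_DIR/hdfs-site.xml"
```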

Now, to start the data node service we use #hadoop-daemon.sh start datanode and we can verify it with the #jps command.

Hence, HDFS, i.e. the Hadoop cluster, is configured successfully👍🏻 We can verify this with the command : #hadoop dfsadmin -report

Now let’s move to our main task of LVM

Here we can check the total number of hard disks we have attached by using the command : #fdisk -l

LVM architecture

Now we need to convert each physical hard disk into a Physical Volume (PV), because a Volume Group (VG) can only be built from PVs.

To create the PVs we use the commands : #pvcreate /dev/xvdf and #pvcreate /dev/xvdg. To confirm and display a PV we use the command : #pvdisplay /dev/xvdf or #pvdisplay /dev/xvdg

Now let's combine both PVs into one VG of 16GiB using this command : #vgcreate avivg /dev/xvdf /dev/xvdg

Here we got new storage of nearly 16GiB, not the full 16GiB, because a small part of each PV is reserved for LVM's own metadata. Here metadata means data about data, i.e. data about our storage.
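
To see where "nearly 16GiB" can come from, here is a rough calculation. It assumes two disks of 8GiB each, LVM's default extent size of 4MiB, and about 1MiB reserved at the start of each PV. These are assumed values, not figures read from the screenshots; the output of #vgdisplay on your system is the authority.

```shell
# Assumed values (not from the article): 8 GiB per disk, 4 MiB extents,
# 1 MiB reserved at the start of each PV for LVM metadata and alignment.
DISK_MIB=8192
RESERVED_MIB=1
EXTENT_MIB=4
DISKS=2

# Only whole extents are usable, so integer division drops the remainder.
extents_per_pv=$(( (DISK_MIB - RESERVED_MIB) / EXTENT_MIB ))
total_extents=$(( extents_per_pv * DISKS ))
usable_mib=$(( total_extents * EXTENT_MIB ))

echo "extents per PV: $extents_per_pv"
echo "total extents:  $total_extents"
echo "usable MiB:     $usable_mib of $(( DISK_MIB * DISKS ))"
```

Under these assumptions each PV holds 2047 extents, so the VG offers 16376MiB of the raw 16384MiB, which is just under 16GiB.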

Let's carve a partition, i.e. a Logical Volume (LV), out of this new storage.

To create a partition of, say, 1GiB we use the command :

#lvcreate --size 1GB --name partition-name vg-name

Now, to confirm whether the partition was created, we use the command :

#lvdisplay vg-name/partition-name

Now let's format it using the command : #mkfs.ext4 /dev/avivg/as1

To mount it, we first create a directory (folder), because a user needs a folder to interact with a storage device. We create it with : #mkdir /foldername

We mount it using the command : #mount /dev/vg-name/lv-name /foldername and with the #df -h command we can confirm that /dev/mapper/avivg-as1 is mounted on the folder /avadhut
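
The steps so far, from raw disks to a mounted filesystem, can be sketched as one script. By default it only prints the commands; the run helper and the DRY_RUN variable are illustrative additions, not part of LVM. Setting DRY_RUN=0 as root would really run the commands and erase whatever is on /dev/xvdf and /dev/xvdg.

```shell
# Print each command unless DRY_RUN=0; set DRY_RUN=0 (as root) to really run them.
run() { if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; else echo "+ $*"; fi; }

# Names used in this article.
VG=avivg
LV=as1
MNT=/avadhut

build_lv() {
  run pvcreate /dev/xvdf /dev/xvdg             # mark both disks as physical volumes
  run vgcreate "$VG" /dev/xvdf /dev/xvdg       # pool them into one volume group
  run lvcreate --size 1G --name "$LV" "$VG"    # carve out a 1 GiB logical volume
  run mkfs.ext4 "/dev/$VG/$LV"                 # create the ext4 filesystem
  run mkdir -p "$MNT"                          # folder used as the mount point
  run mount "/dev/$VG/$LV" "$MNT"              # attach the filesystem to the folder
  run df -h "$MNT"                             # confirm the size and mount point
}

build_lv
```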

Technically, all of these are names (link names, nicknames) for the same LV:

>> /dev/avivg/as1

>> /dev/mapper/avivg-as1

>> /dev/dm-0

Now the main benefit of LVM: it is a dynamic partition, so we can increase/extend the size of the partition on the fly, while it is in use.

Step 2 : Increase or Decrease the Size of a dynamic partition in Linux using LVM storage.

We can extend the size of the partition using the command : #lvextend --size +2GB /dev/avivg/as1 Here the volume is extended successfully; we can confirm this by using #lvdisplay /dev/avivg/as1

Here, as we can see, the LV size is now 3GiB, but #df -h still shows 1GiB??🧐

That is because the first time we created the partition we formatted and mounted only 1GiB of storage, so the filesystem still covers only that 1GiB. To use the new space we must not run mkfs.ext4 again, because that command would also remove our important data. Instead we resize the existing filesystem with the command : #resize2fs /dev/avivg/as1 There is no need to mount again; the size increases automatically and we can confirm it with the #df -h command, where -h means human-readable format.
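
Both directions of resizing can be sketched as two small functions. Growing works while the filesystem is mounted. Shrinking an EXT4 filesystem has to be done offline, and the filesystem must be shrunk before the LV, otherwise data at the end of the volume is cut off. The run helper, the DRY_RUN variable and the 2G shrink target are illustrative assumptions; by default the functions only print the commands.

```shell
# Print each command unless DRY_RUN=0; set DRY_RUN=0 (as root) to really run them.
run() { if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; else echo "+ $*"; fi; }

DEV=/dev/avivg/as1
MNT=/avadhut

# Grow: enlarge the LV first, then the filesystem (works while mounted).
grow_lv() {
  run lvextend --size +2G "$DEV"   # add 2 GiB to the logical volume
  run resize2fs "$DEV"             # grow ext4 to fill the LV, data is kept
  run df -h "$MNT"                 # confirm the new size
}

# Shrink: unmount, check, shrink the filesystem, then shrink the LV.
# The 2G target is only an illustrative value.
shrink_lv() {
  run umount "$MNT"
  run e2fsck -f "$DEV"             # resize2fs asks for a forced check before shrinking
  run resize2fs "$DEV" 2G          # filesystem first ...
  run lvreduce --size 2G "$DEV"    # ... then the logical volume
  run mount "$DEV" "$MNT"
}
```

Calling grow_lv or shrink_lv after loading the functions prints the plan for that direction. As a shortcut for growing, lvextend also has a --resizefs option that resizes the filesystem in the same step.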

Hence, the LVM architecture is successfully created.

Now it's time to contribute this LVM storage to the master node via the data node.

From the master node we can see that the data node is taking storage from the root folder (default), but we are going to provide storage using LVM.

So let's mount this LV on /dn1 using the command : #mount /dev/avivg/as1 /dn1
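
A hedged sketch of this last step on the data node. Stopping and restarting the DataNode around the mount is an assumption added here, so that the service picks up the new storage behind /dn1; the article itself only shows the mount command. The run helper and the DRY_RUN variable are illustrative, and by default the commands are only printed.

```shell
# Print each command unless DRY_RUN=0; set DRY_RUN=0 (as root) to really run them.
run() { if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; else echo "+ $*"; fi; }

DEV=/dev/avivg/as1
DN_DIR=/dn1

attach_to_datanode() {
  run hadoop-daemon.sh stop datanode    # assumption: stop before changing the folder's storage
  run mkdir -p "$DN_DIR"                # make sure the data folder exists
  run mount "$DEV" "$DN_DIR"            # put the LV behind the data folder
  run hadoop-daemon.sh start datanode   # start the DataNode again
  run hadoop dfsadmin -report           # reported capacity should follow the LV size
}

attach_to_datanode
```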

We can also confirm this from the master node using #hadoop dfsadmin -report

Finally, our task, i.e. TASK 7.1.A, is successfully accomplished🥳🥳

Thank you !
