To add new nodes to the cluster:
1. Add the network addresses of the new nodes to the include file.
hdfs-site.xml
<property>
  <name>dfs.hosts</name>
  <value>/<>/hadoop/conf/includes</value>
  <final>true</final>
</property>
mapred-site.xml
<property>
  <name>mapred.hosts</name>
  <value>/<>/hadoop/conf/includes</value>
  <final>true</final>
</property>
Datanodes that are permitted to connect to the namenode are specified in a file whose
name is specified by the dfs.hosts property.
The includes file resides on the namenode’s local filesystem, and it contains a line
for each datanode, specified by network address (as reported by the
datanode; you can see what this is by looking at the namenode’s web UI).
If you need to specify multiple network addresses for a datanode, put
them on one line, separated by whitespace. For example:
slave01
slave02
slave03
.....
Similarly, tasktrackers that may connect to the jobtracker are specified in a file whose
name is specified by the mapred.hosts property. In most cases, there is one shared file,
referred to as the include file, that both dfs.hosts and mapred.hosts refer to, since nodes
in the cluster run both datanode and tasktracker daemons.
2. Update the namenode with the new set of permitted datanodes using this command:
% hadoop dfsadmin -refreshNodes
3. Update the jobtracker with the new set of permitted tasktrackers using:
% hadoop mradmin -refreshNodes
4. Update the slaves file with the new nodes, so that they are included in future operations performed by the Hadoop control scripts; see the sketch below.
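For example, appending a hypothetical new node (slave04) to the slaves file in the same configuration directory used above:
% echo "slave04" >> /<>/hadoop/conf/slaves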
5. Start the new datanodes and tasktrackers.
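On each new node, the daemons can be started with Hadoop's hadoop-daemon.sh control script; a minimal sketch, run on the new node itself:
% hadoop-daemon.sh start datanode
% hadoop-daemon.sh start tasktracker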
6. Check that the new datanodes and tasktrackers appear in the web UI.
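As a command-line alternative to the web UI, hadoop dfsadmin -report prints the set of live datanodes, so the new nodes should show up in its output:
% hadoop dfsadmin -report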
To remove nodes from the cluster:
1. Add the network addresses of the nodes to be decommissioned to the exclude file. Do not update the include file at this point.
hdfs-site.xml
<property>
  <name>dfs.hosts.exclude</name>
  <value>/<>/hadoop/conf/excludes</value>
  <final>true</final>
</property>
mapred-site.xml
<property>
  <name>mapred.hosts.exclude</name>
  <value>/<>/hadoop/conf/excludes</value>
  <final>true</final>
</property>
The decommissioning process is controlled by an exclude file, which for HDFS is set
by the dfs.hosts.exclude property and for MapReduce by the mapred.hosts.exclude
property. It is often the case that these properties refer to the same file. The exclude file
lists the nodes that are not permitted to connect to the cluster.
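Like the includes file, the excludes file contains one node per line; for example (hypothetical hostnames):
slave02
slave03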
2. Update the namenode with the new set of permitted datanodes, using this command:
% hadoop dfsadmin -refreshNodes
3. Update the jobtracker with the new set of permitted tasktrackers using:
% hadoop mradmin -refreshNodes
4. Go to the web UI and check whether the admin state has changed to “Decommission In Progress” for the datanodes being decommissioned. They will start copying their blocks to other datanodes in the cluster.
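The same state is also visible from the command line; in the output of hadoop dfsadmin -report, each decommissioning datanode's Decommission Status field reads "Decommission in progress":
% hadoop dfsadmin -report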
5. When all the datanodes report their state as “Decommissioned,” all the blocks have been replicated. Shut down the decommissioned nodes.
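A sketch of shutting down the daemons on each decommissioned node (run on the node itself), using the same hadoop-daemon.sh script as above:
% hadoop-daemon.sh stop tasktracker
% hadoop-daemon.sh stop datanode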
6. Remove the nodes from the include file, and run:
% hadoop dfsadmin -refreshNodes
% hadoop mradmin -refreshNodes
7. Remove the nodes from the slaves file.