Clear Dead Region Servers in HBase

HBase may sometimes keep listing a decommissioned regionserver as dead. This happens 
because the WAL (Write-Ahead Log) directory of that regionserver is still in HDFS in the 
“splitting” state, so from HBase's perspective the server is not completely gone.
The solution is to go to the WALs directory in HDFS (usually /hbase/WALs; the exact path 
depends on hbase.rootdir, and on this cluster it is /apps/hbase/data/WALs) and remove the 
files of the old regionserver.
hdfs dfs -ls /apps/hbase/data/WALs/

drwxrwx--- - hbase hdfs 0 2015-11-08 00:33 /apps/hbase/data/WALs/dn17.test.fr,60020,1446939183416
drwxrwx--- - hbase hdfs 0 2015-11-08 00:33 /apps/hbase/data/WALs/dn18.test.fr,60020,1446939179122
drwxrwx--- - hbase hdfs 0 2015-11-08 00:33 /apps/hbase/data/WALs/dn19.test.fr,60020,1446939182213
drwxrwx--- - hbase hdfs 0 2015-11-08 00:33 /apps/hbase/data/WALs/dn20.test.fr,60020,1446939182925
drwxrwx--- - hbase hdfs 0 2015-11-08 00:33 /apps/hbase/data/WALs/dn21.test.fr,60020,1446939185744
drwxrwx--- - hbase hdfs 0 2015-11-08 00:33 /apps/hbase/data/WALs/dn22.test.fr,60020,1446939173931
drwxrwx--- - hbase hdfs 0 2015-11-08 00:33 /apps/hbase/data/WALs/dn24.test.fr,60020,1409665198801-splitting
In this listing, the dn24 directory still carries the “-splitting” suffix, and its start 
timestamp is far older than the others, so HBase keeps reporting that regionserver as dead.
I removed the dn24 WAL directory in HDFS and restarted the HBase Master (restarting the 
Master causes no downtime for HBase), and the dead regionserver entry went away.
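
The check and cleanup above can be sketched as a small shell helper. This is only a sketch under assumptions: the function name list_splitting_wals is made up for illustration, and the WAL root is the /apps/hbase/data/WALs path from the listing above, which may differ on another cluster. The helper only filters `hdfs dfs -ls` output; the removal is left as a manual step, to be run only once you are sure the server is really decommissioned and its WAL is no longer needed.

```shell
# Hypothetical helper: print the paths of WAL directories whose name ends
# in "-splitting". It reads `hdfs dfs -ls` output on stdin; the path is the
# last whitespace-separated field of each line.
list_splitting_wals() {
  awk '$NF ~ /-splitting$/ { print $NF }'
}

# Usage (WAL root taken from the listing above; adjust for your cluster):
#   hdfs dfs -ls /apps/hbase/data/WALs/ | list_splitting_wals
#
# After checking that the directory belongs to the decommissioned server,
# remove it and then restart the HBase Master:
#   hdfs dfs -rm -r '/apps/hbase/data/WALs/dn24.test.fr,60020,1409665198801-splitting'
```

The path is quoted in the removal command because it contains commas, and the filter matches only on the trailing “-splitting” suffix so that live regionserver directories are never printed.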