distcp commands with options

For distcp we need the addresses of the source and destination namenodes.

Here the source namenode is written as "source" and the destination namenode as "dest".
# hadoop distcp -pb -delete -update -m 20 -bandwidth 20 -log hdfs://{source}:8020/tmp/copy_log hdfs://{source}:8020/${dir} hdfs://{dest}:8020/${dir}
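-pb - preserve block size
-delete - delete files that exist in the destination but not in the source (used together with -update or -overwrite)
-overwrite - overwrite files at the destination unconditionally (an alternative to -update, not used in the command above)
-update - copy only the files that are missing at the destination or differ from the source, for example in size
-m - maximum number of mappers to run simultaneously
-bandwidth - bandwidth limit for a single mapper, in MB/second
-log - location in HDFS to store the log files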
To run distcp in a specific queue, add:
-Dmapreduce.job.queuename={queue name}
# hadoop distcp -Dmapreduce.job.queuename=migration -pb -delete -update -m 20 -bandwidth 20 -log hdfs://{source}:8020/tmp/copy_log hdfs://{source}:8020/${dir} hdfs://{dest}:8020/${dir}
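
Since the two commands differ only in the queue option, it can help to assemble the command in a small script and print it for review before running it. The following is a minimal sketch, not part of Hadoop: the function name build_distcp_cmd, its argument order, and the sample values (user/data, migration) are assumptions made for illustration. It only prints the command and never calls hadoop itself.

```shell
#!/usr/bin/env bash
# Sketch: build the distcp command from this post as a string.
# build_distcp_cmd and its arguments are hypothetical helper names.

build_distcp_cmd() {
  local source_nn="$1"   # source namenode host
  local dest_nn="$2"     # destination namenode host
  local dir="$3"         # directory to copy, without a leading slash
  local queue="${4:-}"   # optional queue name

  local cmd="hadoop distcp"
  # Add the queue option only when a queue name was given.
  if [ -n "$queue" ]; then
    cmd="$cmd -Dmapreduce.job.queuename=$queue"
  fi
  # Same options and values as in the commands above.
  cmd="$cmd -pb -delete -update -m 20 -bandwidth 20"
  cmd="$cmd -log hdfs://$source_nn:8020/tmp/copy_log"
  cmd="$cmd hdfs://$source_nn:8020/$dir hdfs://$dest_nn:8020/$dir"
  printf '%s\n' "$cmd"
}

# Print the command for review; nothing is executed here.
build_distcp_cmd source dest user/data migration
```

Leaving out the fourth argument gives the first form of the command, without the queue option. Once the printed command looks right, it can be pasted into the shell and run as usual.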