
Tuesday, April 22, 2014

Linux & Hadoop Uniq Commands

Hi Folks,

Today I am going to show you some important commands that you can use for different purposes.

1. Data read and written by a particular process, given the pid of the process
cat /proc/$pid/io | grep -wE "read_bytes|write_bytes" | awk -F':' '{print $1 " " $2/(1024*1024) " MB"}'
2. Delete a large number of files (here, every *.gc file under the current directory)
find . -name "*.gc" -print0 | xargs -0 rm
3. Generate random data for test cases, e.g. one 50 MB file written as 10 blocks of 5 MB
dd if=/dev/urandom of=a.log bs=5M count=10
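4. Replace spaces in file names with underscores (deepest paths first, so renamed directories do not break the paths of their contents)
find . -depth -name '*[[:blank:]]*' | while IFS= read -r f; do new="$(dirname "$f")/$(basename "$f" | tr '[:blank:]' '_')"; [ -e "$new" ] || mv "$f" "$new"; done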
5. Difference between fileA and fileB (prints the lines of fileA that are not in fileB)
awk 'BEGIN { while ((getline line < "fileB") > 0) arr[line]++ } !($0 in arr)' fileA
6. Print the hostnames of the datanodes from the command line (handy when you have a large number of nodes)
for a in `hadoop dfsadmin -report | grep -i name | awk -F ':' '{print $2}'`; do host $a | awk '{print $5}' | sed 's/\.$//'; done
7. DFS used % for each Hadoop datanode
hadoop dfsadmin -report | grep -A6 Name |  tr '\n' ' ' | tr '-' '\n' | awk '{print substr($2,0,13)" "$29}'
8. Read an XML file (format:- [hdfs|core|mapred]-site.xml) from the line matching A to the line matching B
sed -n "/A/,/B/p" "$fil"
9. Convert XML (format:- [hdfs|core|mapred]-site.xml) to YAML
grep -e "<name>" -e "<value>" hdfs-site.xml | sed 's/^[[:space:]]*//;s/<name>//g;s/<value>//g;s/<\/value>//g;s/<\/name>/:/g' | perl -p -e 's/:\n/: /'
10. Get the value of a particular parameter from an XML file (format:- [hdfs|core|mapred]-site.xml), assuming <value> is on the line right after <name>
awk -F"[<>]" '/mapred.local.dir/ {getline;print $3;exit}' mapred-site.xml
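
If you want to try items 5, 8 and 10 without a cluster, here is a small sandbox sketch. Everything in it is made up for illustration: the contents of fileA and fileB, the sample mapred-site.xml, the host jt.example.com and the /data/mapred/local path are assumptions, not values from a real setup. The item 5 command is written with a guarded getline loop so that a missing fileB cannot make awk spin forever.

```shell
#!/usr/bin/env bash
# Sandbox for items 5, 8 and 10. All sample data below is hypothetical.
set -eu

work=$(mktemp -d)      # throwaway directory, nothing outside it is touched
cd "$work"

# Two small text files for the "difference" command (item 5).
printf 'alpha\nbeta\ngamma\n' > fileA
printf 'beta\n' > fileB

# A tiny site file in the usual <name>/<value> layout (items 8 and 10).
cat > mapred-site.xml <<'EOF'
<configuration>
  <property>
    <name>mapred.local.dir</name>
    <value>/data/mapred/local</value>
  </property>
  <property>
    <name>mapred.job.tracker</name>
    <value>jt.example.com:9001</value>
  </property>
</configuration>
EOF

# Item 5: lines of fileA that do not appear in fileB.
only_in_a=$(awk 'BEGIN { while ((getline line < "fileB") > 0) seen[line] } !($0 in seen)' fileA)

# Item 8: print from the line matching A up to the next line matching B.
# Here A is the property name and B is the closing </property> tag.
block=$(sed -n '/mapred.job.tracker/,/<\/property>/p' mapred-site.xml)

# Item 10: value of one parameter, read from the line after its <name>.
local_dir=$(awk -F'[<>]' '/mapred.local.dir/ {getline; print $3; exit}' mapred-site.xml)

echo "only in fileA:"
echo "$only_in_a"
echo "job tracker block:"
echo "$block"
echo "mapred.local.dir = $local_dir"
```

With this sample data the first command prints alpha and gamma, the second prints the three lines of the mapred.job.tracker property, and the last one prints /data/mapred/local.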

Hope these are helpful to you :)