Friday, 16 October 2009

Copying compressed file via SSH to Hadoop HDFS

I have probably reinvented the wheel, but here is a way to copy a file to Hadoop's HDFS via SSH, gzip-compressed in transit:



gzip -c file.txt | ssh hadoop.gateway.host 'gunzip -cf - | hdfs -put - input/file.txt'


And all the way back, from HDFS to the local machine:


ssh hadoop.gateway.host 'hdfs -cat output/result.txt/* | gzip -c' | gunzip -c - > result.txt
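
For repeated use, the two one-liners could be wrapped in small shell functions. This is only a sketch under some assumptions: the function names hdfs_put_gz and hdfs_get_gz and the variables REMOTE_RUN and HDFS_CMD are made up for illustration, and `hadoop fs` is assumed to be the HDFS client on the gateway (replace it with whatever `hdfs` resolves to on your system).

```shell
# Hypothetical helpers wrapping the two pipelines; all names are illustrative.
# REMOTE_RUN: command that runs a shell command string on the gateway host.
# HDFS_CMD:   HDFS client available on the gateway (assumed to be `hadoop fs`).
: "${REMOTE_RUN:=ssh hadoop.gateway.host}"
: "${HDFS_CMD:=hadoop fs}"

# Upload: compress locally, decompress on the gateway, stream into HDFS.
# $1 = local file, $2 = HDFS destination path
hdfs_put_gz() {
    gzip -c "$1" | $REMOTE_RUN "gunzip -c | $HDFS_CMD -put - '$2'"
}

# Download: read from HDFS on the gateway, compress there, decompress locally.
# $1 = HDFS source path or glob (left unquoted remotely, as in the one-liner)
# $2 = local destination file
hdfs_get_gz() {
    $REMOTE_RUN "$HDFS_CMD -cat $1 | gzip -c" | gunzip -c > "$2"
}
```

With these defined, `hdfs_put_gz file.txt input/file.txt` and `hdfs_get_gz 'output/result.txt/*' result.txt` should behave like the commands above. Note that a pipeline's exit status is that of its last command, so a failure on the remote side may go unnoticed unless you check for it separately.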