User:Lindenb/Notebook/UMR915/20110610

=Hadoop=
Download and unzip hadoop-0.20.203.0rc1.tar.gz

Single node setup
http://hadoop.apache.org/common/docs/current/single_node_setup.html

export JAVA_HOME=/usr/local/package/jdk1.6.0_26
cd hadoop-0.20.203.0
mkdir input
cp conf/*.xml input
bin/hadoop jar hadoop-examples-*.jar grep input output 'dfs[a-z.]+'
11/06/10 10:59:36 INFO mapred.JobClient: Cleaning up the staging area file:/tmp/hadoop-lindenb/mapred/staging/lindenb-1423012718/.staging/job_local_0001
java.net.UnknownHostException: srv-clc-04.u915.irt.univ-nantes.prive3: srv-clc-04.u915.irt.univ-nantes.prive3
    at java.net.InetAddress.getLocalHost(InetAddress.java:1354)
    at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:815)
    at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:791)
    at java.security.AccessController.doPrivileged(Native Method)
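The `UnknownHostException` raised from `InetAddress.getLocalHost` usually means that the machine's own hostname cannot be resolved. A minimal sketch for checking this is given below; the helper names `host_resolves` and `check_hostname` are hypothetical, and the sketch assumes a Linux system where `getent` is available.

```shell
# Hypothetical helper: succeed if NAME resolves through the system resolver
# (getent consults /etc/hosts and DNS as configured on the machine).
host_resolves() {
  getent hosts "$1" > /dev/null 2>&1
}

# Hypothetical helper: report whether a hostname resolves.
# Defaults to the local hostname, which is what Java looks up.
check_hostname() {
  name="${1:-$(hostname)}"
  if host_resolves "$name"; then
    echo "OK: $name resolves"
  else
    echo "WARNING: $name does not resolve"
    return 1
  fi
}
```

Calling `check_hostname` without arguments tests the local hostname. If it prints the warning, a common remedy is to add a line for that hostname to /etc/hosts (this requires root access and was not done in this notebook).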

Change the config
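conf/core-site.xml:

<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>

conf/hdfs-site.xml:

<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>

conf/mapred-site.xml:

<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:9001</value>
  </property>
</configuration>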

Set up ssh to log in without a password
$ ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
$ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys


Important: change the permissions (chmod) of the ssh directory and of authorized_keys

$ chmod 700 ~/.ssh/
$ chmod 640 ~/.ssh/authorized_keys
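To confirm that the ssh setup took effect, the permissions and the password-free login can be checked before starting Hadoop. This is only a sketch: the helper names `check_ssh_perms` and `can_ssh_localhost` are hypothetical, and `stat -c` assumes GNU coreutils (Linux).

```shell
# Hypothetical helper: succeed only if DIR and DIR/authorized_keys carry the
# modes used in this notebook (700 and 640). Assumes GNU stat (`stat -c`).
check_ssh_perms() {
  dir="${1:-$HOME/.ssh}"
  [ "$(stat -c '%a' "$dir")" = "700" ] &&
  [ "$(stat -c '%a' "$dir/authorized_keys")" = "640" ]
}

# Hypothetical helper: succeed only if ssh can log in to localhost without
# prompting. BatchMode=yes makes ssh fail instead of asking for a password.
can_ssh_localhost() {
  ssh -o BatchMode=yes localhost true
}
```

For example, `check_ssh_perms && can_ssh_localhost && echo "ssh ready"` prints the message only when both checks pass.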

Format a new distributed-filesystem
[lindenb@srv-clc-04 hadoop-0.20.203.0]$ bin/hadoop namenode -format
11/06/10 12:19:03 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:  host = java.net.UnknownHostException: srv-clc-04.u915.irt.univ-nantes.prive3: srv-clc-04.u915.irt.univ-nantes.prive3
STARTUP_MSG:  args = [-format]
STARTUP_MSG:  version = 0.20.203.0
STARTUP_MSG:  build = http://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20-security-203 -r 1099333; compiled by 'oom' on Wed May  4 07:57:50 PDT 2011
Re-format filesystem in /tmp/hadoop-lindenb/dfs/name ? (Y or N) Y
11/06/10 12:19:07 INFO util.GSet: VM type      = 64-bit
11/06/10 12:19:07 INFO util.GSet: 2% max memory = 19.1675 MB
11/06/10 12:19:07 INFO util.GSet: capacity     = 2^21 = 2097152 entries
11/06/10 12:19:07 INFO util.GSet: recommended=2097152, actual=2097152
11/06/10 12:19:07 INFO namenode.FSNamesystem: fsOwner=lindenb
11/06/10 12:19:07 INFO namenode.FSNamesystem: supergroup=supergroup
11/06/10 12:19:07 INFO namenode.FSNamesystem: isPermissionEnabled=true
11/06/10 12:19:07 INFO namenode.FSNamesystem: dfs.block.invalidate.limit=100
11/06/10 12:19:07 INFO namenode.FSNamesystem: isAccessTokenEnabled=false accessKeyUpdateInterval=0 min(s), accessTokenLifetime=0 min(s)
11/06/10 12:19:07 INFO namenode.NameNode: Caching file names occuring more than 10 times
11/06/10 12:19:07 INFO common.Storage: Image file of size 113 saved in 0 seconds.
11/06/10 12:19:08 INFO common.Storage: Storage directory /tmp/hadoop-lindenb/dfs/name has been successfully formatted.
11/06/10 12:19:08 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at java.net.UnknownHostException: srv-clc-04.u915.irt.univ-nantes.prive3: srv-clc-04.u915.irt.univ-nantes.prive3

Start the server
[lindenb@srv-clc-04 hadoop-0.20.203.0]$ ./bin/start-all.sh
namenode running as process 23788. Stop it first.
localhost: starting datanode, logging to /home/lindenb/package/hadoop-0.20.203.0/bin/../logs/hadoop-lindenb-datanode-srv-clc-04.u915.irt.univ-nantes.prive3.out
localhost: starting secondarynamenode, logging to /home/lindenb/package/hadoop-0.20.203.0/bin/../logs/hadoop-lindenb-secondarynamenode-srv-clc-04.u915.irt.univ-nantes.prive3.out
starting jobtracker, logging to /home/lindenb/package/hadoop-0.20.203.0/bin/../logs/hadoop-lindenb-jobtracker-srv-clc-04.u915.irt.univ-nantes.prive3.out
localhost: starting tasktracker, logging to /home/lindenb/package/hadoop-0.20.203.0/bin/../logs/hadoop-lindenb-tasktracker-srv-clc-04.u915.irt.univ-nantes.prive3.out
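The message "namenode running as process 23788. Stop it first." indicates that a NameNode was already running when start-all.sh was called. One way to see which of the five daemons are actually up is the JDK's `jps` tool, which prints one `<pid> <main class>` pair per running JVM. The sketch below filters that output; the helper name `missing_daemons` is hypothetical, and the class names are assumed to be the ones `jps` reports for this Hadoop version.

```shell
# Hypothetical helper: read `jps` output ("<pid> <MainClass>") on stdin and
# print the name of every expected Hadoop daemon that is NOT listed.
missing_daemons() {
  awk '
    { seen[$2] = 1 }
    END {
      n = split("NameNode DataNode SecondaryNameNode JobTracker TaskTracker", want, " ")
      for (i = 1; i <= n; i++)
        if (!(want[i] in seen)) print want[i]
    }'
}
```

Typical use is `jps | missing_daemons`; empty output means that all five daemons were found.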

Copy cdina's data
server1: scp Axiom_GW_Hu_SNP.r2.na31.annot.csv lindenb@172.18.254.164:
lindenb@172.18.254.164's password:
Axiom_GW_Hu_SNP.r2.na31.annot.csv            100%  765MB  40.3MB/s   00:19

Create a directory on HDFS
bin/hadoop fs -mkdir myfolder
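A plausible next step, not recorded in this notebook, is to load the copied file into that HDFS directory and list it. The sketch below uses the `fs -put` and `fs -ls` subcommands of the Hadoop shell; the helper name `hdfs_upload` and the `HADOOP` variable are hypothetical, and the default path assumes the commands are run from the Hadoop install directory.

```shell
# Assumption: run from the Hadoop install directory, so bin/hadoop exists.
HADOOP="${HADOOP:-bin/hadoop}"

# Hypothetical helper: copy a local file into an existing HDFS directory,
# then list the directory to confirm that the file arrived.
hdfs_upload() {
  local_file="$1"
  hdfs_dir="$2"
  "$HADOOP" fs -put "$local_file" "$hdfs_dir/" &&
  "$HADOOP" fs -ls "$hdfs_dir"
}
```

With the file copied above, this would be called as `hdfs_upload Axiom_GW_Hu_SNP.r2.na31.annot.csv myfolder`.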