User:Timothee Flutre/Notebook/Postdoc/2012/08/14

From OpenWetWare

| colspan="2"|
<!-- ##### DO NOT edit above this line unless you know what you are doing. ##### -->
==About Git==
* '''Motivation''': nowadays, it is common to work on a computer project for which one wants to keep a history of the changes, access it from different machines running different operating systems, share the work with someone else, etc. In such cases, a distributed version control system such as [http://en.wikipedia.org/wiki/Git_%28software%29 Git] is very useful.

* '''Documentation''':
** try it [http://try.github.io/levels/1/challenges/1 online]
** if you liked it, [http://git-scm.com/downloads download] it
** official [http://git-scm.com/doc book]
** [http://gitref.org/ quick ref], [http://www.ndpsoftware.com/git-cheatsheet.html cheatsheet]
** tutorial for [http://nyuccl.org/pages/GitTutorial/ scientists] (another by a [http://kbroman.github.io/github_tutorial/ geneticist])
** [http://gitready.com/ resources] depending on your level
** make your repositories freely available online via [https://github.com/ github] (see its [https://help.github.com/ help pages] too) or [https://bitbucket.org/ bitbucket]
** ask questions on [http://stackoverflow.com/ stackoverflow]
** manage your code, papers, talks, courses and even [http://www.wired.com/wiredenterprise/2013/06/cades-witty-headline-here/ books] with it!

* '''Conflicts''': when updating one branch with the content of another one (<code>git checkout branch1; git merge branch2</code>), conflicts can happen, and it is often hard to know how to resolve them properly (but see a concrete example [http://git-scm.com/book/en/Git-Branching-Basic-Branching-and-Merging here]). In the following, branch1 can be master and branch2 can be origin/master, or branch1 can be master and branch2 can be dev.
** The first solution is to edit each conflicted file by hand, then run <code>git add fileX.txt</code> (staging tells git that the conflict is resolved) and finally conclude the merge with <code>git commit -m "merge branch2 and solve conflicts"</code>. Do not name the file on the commit line: git refuses a partial commit during a merge.
** The second solution is to ignore the conflicts and overwrite the files of branch1 with the content of branch2, one file at a time: <code>git checkout branch2 -- fileX.txt</code>, then conclude the merge with <code>git commit</code>. With the <code>--patch</code> option, git instead asks hunk by hunk what to take from branch2.
** The third solution, even more radical, is to "overwrite" all of branch1 with the content of branch2, all files at once: <code>git reset --hard branch2</code>. Beware that this makes branch1 point to the same commit as branch2, and discards both the uncommitted changes and the commits that exist only on branch1.

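The first solution above can be rehearsed safely in a throwaway repository. The following is only a sketch: the temporary directory, the demo identity and the file contents are made up for the illustration, and the "hand edit" is replaced by simply writing the final content of the file.

```shell
set -eu
# Throwaway repository; all names and contents below are hypothetical.
demo=$(mktemp -d)
cd "$demo"
git init -q
git symbolic-ref HEAD refs/heads/branch1   # name the initial (still empty) branch
git config user.name "Demo User"           # local identity, not --global
git config user.email "demo@example.org"

echo "original line" > fileX.txt
git add fileX.txt
git commit -q -m "base"

git checkout -q -b branch2
echo "line from branch2" > fileX.txt
git commit -q -am "change on branch2"

git checkout -q branch1
echo "line from branch1" > fileX.txt
git commit -q -am "change on branch1"

# Both branches changed the same line, so the merge must stop with a conflict.
if git merge branch2 >/dev/null 2>&1; then
    echo "unexpected: the merge succeeded" >&2
    exit 1
fi
git status --short   # the conflicted file is flagged as unmerged

# Resolve "by hand" (here: write the final content), stage, conclude the merge.
echo "line from branch1 and branch2" > fileX.txt
git add fileX.txt
git commit -q -m "merge branch2 and solve conflicts"
git log --oneline --graph
```

Note that the final <code>git commit</code> takes no file name, since the whole merge is concluded at once.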
* '''Tips''':
** undo the uncommitted changes made to a file: <code>git checkout -- myfile.txt</code> (these changes are lost for good)
** split a large set of changes into several smaller commits: <code>git add -p myfile.txt</code> stages the hunks you select, then <code>git commit</code> records only what was staged
** usual config: <code>git config --global user.name 'Timothée Flutre'; git config --global user.email 'timflutre@gmail.com'; git config --global core.editor emacs; git config --global i18n.commitEncoding 'utf8'; git config --global i18n.logOutputEncoding 'utf8'</code>
** remote via ssh tunnel: first open the tunnel <code>ssh gateway.foo.bar -l tflutre -Nf -L 20400:maincluster:22</code>, then add the remote <code>git remote add mcl ssh://tflutre@localhost:20400/home/tflutre/myproject/.git</code>

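As a small illustration of the first tip, here is a sketch run in a temporary repository. The directory, the identity and the file content are assumptions made for the demo, and the identity is set per repository so that no <code>--global</code> setting is touched.

```shell
set -eu
# Throwaway repository; names and contents are hypothetical.
demo=$(mktemp -d)
cd "$demo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.org"

printf 'first line\n' > myfile.txt
git add myfile.txt
git commit -q -m "add myfile.txt"

# An edit we later regret.
printf 'unwanted line\n' >> myfile.txt
git diff --stat

# Restore the committed version; the edit above is lost for good.
git checkout -- myfile.txt
git diff --quiet && echo "working tree is clean again"
```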
* '''Writing a paper''': in this example, I am writing a paper with two colleagues. We decide to use a [http://git-scm.com/book/en/Distributed-Git-Distributed-Workflows centralized workflow], the shared repository being hosted on [https://github.com/ github].
** Setting up the infrastructure:
*** Each of us needs to [https://github.com/signup/free create a free account].
*** I need to upgrade my account in order to have the right to manage private repositories ([https://github.com/plans/ $7/month] at the time of writing).
*** I create a private repository named "paper" and add my colleagues as collaborators.
*** I retrieve the repository on my local machine: <code>git clone https://github.com/timflutre/paper.git</code> (a private repository requires an authenticated protocol such as https or ssh; the anonymous <code>git://</code> protocol cannot access it).
*** I create my first file, for instance "paper_main.tex", and add it to git in my local repository: <code>git add paper_main.tex</code> followed by <code>git commit -m "first commit" paper_main.tex</code>.
*** I create one branch per collaborator (the default branch being "master"): <code>git branch tim</code>, then <code>git branch colleague1</code> and finally <code>git branch colleague2</code>. I can list the local branches with <code>git branch</code> and switch to my branch with <code>git checkout tim</code>, for instance.
*** I push my local branches to github: <code>git push origin master</code>, and likewise for each branch I created (<code>git push origin tim</code>, etc.).
*** I send an email to my colleagues telling them that they can retrieve the content of the repository from github onto their local machine(s): <code>git clone https://github.com/timflutre/paper.git</code>.
** Typical working cycle:
*** Each of us makes modifications on his or her own branch and pushes them to github so that the others can access the changes: <code>git push origin colleague1</code> for instance.
*** From time to time, one of us is responsible for merging the changes and updating the "master" branch with the latest version.
*** Once this is done, the others retrieve the new content of "master" into their local repo: <code>git checkout master</code>, <code>git fetch origin</code>, <code>git diff master origin/master</code>, <code>git merge origin/master</code>.
*** Then they update their local branch with the new content of "master": <code>git checkout colleague1</code>, <code>git diff --name-status colleague1..master</code>. This lists the files that differ between their local branch and the new content of "master".
*** One can look at the differences file by file: <code>git diff --color-words colleague1:paper_main.tex master:paper_main.tex</code>. The option <code>--color-words</code> is especially useful for LaTeX, as it highlights the changed words rather than whole lines.
*** To merge the content of the freshly updated local "master" into one's own local branch: <code>git merge master</code>.
** Tips: don't version the output pdf in the repository because, as it is binary, git can't merge it properly. Instead, add a Makefile (see below) and compile the pdf whenever you need it by entering <code>make main -i</code> on the command line.
<nowiki>
# Recipe lines must start with a tab character.
.PHONY: all main supp clean

all: main supp

# latex, bibtex, then latex twice to resolve citations and cross-references
main:
	latex paper_main.tex
	bibtex paper_main
	latex paper_main.tex
	latex paper_main.tex
	pdflatex paper_main

supp:
	latex paper_supplements.tex
	bibtex paper_supplements
	latex paper_supplements.tex
	latex paper_supplements.tex
	pdflatex paper_supplements

# caution: this also removes every pdf in the directory
clean:
	rm -f *~ *.aux *.dvi *.log *.pdf *.bbl *.blg *.toc
</nowiki>

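The working cycle described in "Writing a paper" can be rehearsed without github. The sketch below is only an illustration under stated assumptions: a local bare repository named <code>hub.git</code> stands in for the shared github repository, the two clones live in a temporary directory, and the identities and file contents are made up.

```shell
set -eu
work=$(mktemp -d)
cd "$work"

# Stand-in for the shared repository on github (hypothetical local path).
git init -q --bare hub.git
git --git-dir=hub.git symbolic-ref HEAD refs/heads/master

# My side: clone, first commit, one branch per collaborator, push everything.
git clone -q hub.git tim   # git may warn that the repository is empty
cd tim
git symbolic-ref HEAD refs/heads/master
git config user.name "Tim"
git config user.email "tim@example.org"
printf '\\documentclass{article}\n' > paper_main.tex
git add paper_main.tex
git commit -q -m "first commit"
git branch tim
git branch colleague1
git branch colleague2
git push -q origin master tim colleague1 colleague2
cd ..

# Colleague 1: clone, work on his or her own branch, push it.
git clone -q hub.git colleague1
cd colleague1
git config user.name "Colleague One"
git config user.email "colleague1@example.org"
git checkout -q -b colleague1 origin/colleague1
printf '%% intro drafted by colleague1\n' >> paper_main.tex
git commit -q -am "draft the introduction"
git push -q origin colleague1
cd ..

# My side again: fetch, merge the colleague's branch into master, publish it.
cd tim
git fetch -q origin
git checkout -q master
git merge -q origin/colleague1
git push -q origin master

# Finally, update my own branch with the new content of master.
git checkout -q tim
git diff --name-status tim..master   # lists paper_main.tex as modified
git merge -q master
```

Here every merge is a fast-forward because only one person edited the file; with concurrent edits, the conflict handling described earlier applies.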
* '''Two remotes''': let's imagine that I have two branches, "master" and "dev", on cluster1, only "master" on github, and that I want to work on "dev" from cluster2.
** first I log in to cluster2 and clone the repo from github: <code>git clone https://github.com/timflutre/myproject.git</code>
** then I add my repo on cluster1 as a remote: <code>cd myproject/; git remote add cluster1 ssh://tflutre@cluster1/home/tflutre/myproject/.git</code>
** finally I fetch the remotes and create a "dev" branch which tracks the one on cluster1: <code>git remote update; git checkout -b dev cluster1/dev</code>
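
The two-remote setup can also be tried locally before touching real machines. In this sketch, which rests on assumptions, plain directories in a temporary folder play the roles of cluster1, github and cluster2, so local paths replace the https and ssh URLs, and the identity and file contents are made up.

```shell
set -eu
work=$(mktemp -d)
cd "$work"

# "cluster1": a repository with two branches, master and dev.
git init -q cluster1
cd cluster1
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"
git config user.email "demo@example.org"
echo "v1" > notes.txt
git add notes.txt
git commit -q -m "initial commit"
git checkout -q -b dev
echo "experimental" >> notes.txt
git commit -q -am "work in progress on dev"
git checkout -q master
cd ..

# "github": a bare repository that only receives master.
git init -q --bare github.git
git --git-dir=github.git symbolic-ref HEAD refs/heads/master
git -C cluster1 push -q "$work/github.git" master

# "cluster2": clone from github, add cluster1 as a second remote, track its dev.
git clone -q github.git cluster2
cd cluster2
git remote add cluster1 "$work/cluster1"
git remote update >/dev/null 2>&1
git checkout -q -b dev cluster1/dev
git branch -vv   # shows dev tracking cluster1/dev
```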
<!-- ##### DO NOT edit below this line unless you know what you are doing. ##### -->

Revision as of 00:53, 9 November 2013


