About Git
- Motivation: it is now common to work on a computer project for which one wants to keep a history of the changes, access it from different machines with different operating systems, share the work with someone else, etc. In such cases, a distributed version control system such as Git is very useful.
- Conflicts: when updating one branch with the content of another one (
git checkout branch1; git merge branch2 ), conflicts can happen, and it is often hard to know how to resolve them properly (but see a concrete example here). In the following, branch1 can be master and branch2 can be origin/master, or branch1 can be master and branch2 can be dev.
- The first solution is to edit each conflicted file by hand, then run
git add fileX.txt (staging indicates to git that the conflict is resolved) and finally run git commit -m "merge branch2 and solve conflicts" (without any file name, as git refuses a partial commit during a merge).
- The second solution is to ignore the conflicts and overwrite the files of branch1 with the content of branch2, one file at a time:
git checkout branch2 -- fileX.txt (with --patch, git instead asks interactively which hunks to take from branch2).
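- The third solution, even more radical, is to "overwrite" all of branch1 with the content of branch2, all files at once (beware: this makes branch1 point to the same commit as branch2, and discards the uncommitted changes as well as the commits that exist only on branch1):
git reset --hard branch2 .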
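To try the first solution safely, here is a minimal sketch that provokes a conflict in a throwaway repository and resolves it by hand. The temporary directory, the demo identity and the file contents are assumptions made for the example; only the branch and file names follow the text above.

```shell
#!/bin/sh
# Sketch: provoke a merge conflict in a throwaway repository, then resolve it by hand.
set -eu

# Identity used for the demo commits only (hypothetical values).
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.org"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.org"

demo=$(mktemp -d)
cd "$demo"
git init -q .
git symbolic-ref HEAD refs/heads/branch1   # name the initial branch "branch1"

echo "common line" > fileX.txt
git add fileX.txt
git commit -q -m "initial commit"

# branch2 changes the line one way...
git checkout -q -b branch2
echo "line from branch2" > fileX.txt
git commit -q -m "edit on branch2" fileX.txt

# ...and branch1 changes it another way.
git checkout -q branch1
echo "line from branch1" > fileX.txt
git commit -q -m "edit on branch1" fileX.txt

# The merge stops on the conflict, so git exits with a non-zero status.
git merge branch2 > /dev/null 2>&1 || echo "conflict detected"
git status --short        # "UU fileX.txt" marks the unmerged file

# First solution: write the resolved content by hand, stage it, then commit.
# No file name is given to "git commit": partial commits are refused during a merge.
printf 'line from branch1\nline from branch2\n' > fileX.txt
git add fileX.txt
git commit -q -m "merge branch2 and solve conflicts"

git log --oneline --graph
```

In real life the hand edit is done in an editor, by removing the conflict markers (<<<<<<<, =======, >>>>>>>) that git writes into the file.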
- Tips:
- undo uncommitted changes to a file (they are lost for good):
git checkout -- myfile.txt
- split a big change into several smaller commits, by interactively staging only some hunks before each commit:
git add -p myfile.txt
- usual config:
git config --global user.name 'Timothée Flutre'; git config --global user.email 'timflutre@gmail.com'; git config --global color.diff auto; git config --global color.status auto; git config --global color.branch auto; git config --global core.editor emacs; git config --global i18n.commitEncoding 'utf8'; git config --global i18n.logOutputEncoding 'utf8'
- remote via ssh tunnel: first open the tunnel
ssh gateway.foo.bar -l tflutre -Nf -L 20400:maincluster:22 , then add the remote git remote add mcl ssh://tflutre@localhost:20400/home/tflutre/myproject/.git
- create release on github: first create tags, and then create the release (automatic via tag name)
curl --user "timflutre" --data '{"tag_name":"v1.0","target_commitish":"master","name":"v1.0","body":"first release"}' https://api.github.com/repos/timflutre/eqtlbma/releases
- get download count of release:
curl -u "timflutre" -i https://api.github.com/repos/timflutre/eqtlbma/releases/:id/assets where the release id can be obtained via curl -u "timflutre" -i https://api.github.com/repos/timflutre/eqtlbma/releases/
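The first two tips can be tried in a throwaway repository, as in the sketch below. Since git add -p is interactive, the sketch splits a commit file by file instead (a simpler variant, not the hunk-by-hunk staging itself); the temporary directory, the demo identity and the file other.txt are assumptions made for the example.

```shell
#!/bin/sh
# Sketch: undo an uncommitted change, then split one big commit into two.
set -eu

# Identity used for the demo commits only (hypothetical values).
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.org"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.org"

demo=$(mktemp -d)
cd "$demo"
git init -q .

echo "v1" > myfile.txt
echo "v1" > other.txt
git add myfile.txt other.txt
git commit -q -m "initial commit"

# Undo an uncommitted change: the file goes back to its last committed state.
echo "mistake" >> myfile.txt
git checkout -- myfile.txt
cat myfile.txt

# Make one big commit touching two files...
echo "v2" >> myfile.txt
echo "v2" >> other.txt
git commit -q -a -m "big commit"

# ...then split it: move the branch back by one commit while keeping the edits
# in the working tree, and commit the files one by one.
git reset -q HEAD~1
git add myfile.txt        # here "git add -p myfile.txt" would stage selected hunks only
git commit -q -m "update myfile.txt"
git add other.txt
git commit -q -m "update other.txt"

git log --oneline
```

This only rewrites local history, so it is best done before the big commit has been pushed anywhere.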
- Writing a paper: in this example, I am writing a paper with two colleagues. We decide to do it as a centralized workflow, the shared repository being hosted by github.
- Setting up the infrastructure:
- Each of us needs to create a free account.
- I need to upgrade my account in order to have the right to manage private repositories ($7/month).
- I create a private repository named "paper" and add my colleagues as collaborators to it.
- I retrieve the repository on my local machine:
git clone https://github.com/timflutre/paper.git
- I create my first file, for instance "paper_main.tex", and add it to git in my local repository:
git add paper_main.tex followed by git commit -m "first commit" paper_main.tex .
- I create one branch per collaborator (the default branch being "master"):
git branch tim , then git branch colleague1 and finally git branch colleague2 . I can list the local branches with git branch and I can switch to my branch with git checkout tim for instance.
- I push the changes I made from my local repo onto GitHub:
git push origin master , and likewise for each branch I created (or git push origin --all to push all local branches at once).
- I send an email to my colleagues telling them that they can retrieve the content of the repository from github into their local machine(s):
git clone https://github.com/timflutre/paper.git .
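The git part of this setup can be rehearsed offline, as sketched below. A local bare repository stands in for the private GitHub repository, which is an assumption of the example, as are the directory names tim_repo and colleague1_repo, the demo identity and the file content.

```shell
#!/bin/sh
# Sketch: set up a shared repository with one branch per collaborator.
set -eu

# Identity used for the demo commits only (hypothetical values).
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.org"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.org"

work=$(mktemp -d)
cd "$work"

# A bare repository plays the role of the shared repository on GitHub (assumption).
git init -q --bare paper.git
git --git-dir=paper.git symbolic-ref HEAD refs/heads/master

# I retrieve the (still empty) repository and make the first commit.
git clone -q paper.git tim_repo 2>/dev/null
cd tim_repo
git symbolic-ref HEAD refs/heads/master
printf '%s\n' '\documentclass{article}' > paper_main.tex
git add paper_main.tex
git commit -q -m "first commit"

# One branch per collaborator, all starting from "master".
git branch tim
git branch colleague1
git branch colleague2

# Publish every local branch at once.
git push -q origin --all

# A colleague then clones the shared repository and lists its branches.
cd "$work"
git clone -q paper.git colleague1_repo
cd colleague1_repo
git branch -r
```

With the real repository, only the URL changes: the clone commands take the GitHub address instead of the local path.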
- Typical working cycle:
- Each of us can make modifications on their own branch, and push them to GitHub in order to allow the others to access the changes:
git push origin colleague1 for instance.
- From time to time, one of us has the responsibility to merge the changes and update the "master" branch with the latest version.
- Once this is done, the others need to retrieve the new content of "master" in their local repo:
git checkout master , git fetch origin , git diff master origin/master , git merge origin/master .
- Then, they need to update their local branch with the new content of "master":
git checkout colleague1 , git diff --name-status colleague1..master . This will list the files having differences between their local branch and the new content of "master".
- One can look at the differences file by file:
git diff --color-words colleague1:paper_main.tex master:paper_main.tex . The option "--color-words" is especially useful in LaTeX, as it highlights the changed words rather than whole lines.
- To merge the content of the recently-updated local "master" into their own local branch, they run:
git merge master .
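The whole cycle can be rehearsed offline, as in the sketch below. As before, a local bare repository stands in for GitHub; the directory names, the demo identity, the file notes.txt and the file contents are assumptions made for the example.

```shell
#!/bin/sh
# Sketch: one turn of the working cycle, seen from a colleague's side.
set -eu

# Identity used for the demo commits only (hypothetical values).
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.org"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.org"

work=$(mktemp -d)
cd "$work"

# Setup (assumption): a bare repository stands in for GitHub.
git init -q --bare paper.git
git --git-dir=paper.git symbolic-ref HEAD refs/heads/master
git clone -q paper.git tim_repo 2>/dev/null
cd tim_repo
git symbolic-ref HEAD refs/heads/master
printf '%s\n' 'Introduction' > paper_main.tex
git add paper_main.tex
git commit -q -m "first commit"
git branch colleague1
git push -q origin --all
cd "$work"
git clone -q paper.git colleague1_repo

# The colleague works on their own branch and pushes it.
cd "$work/colleague1_repo"
git checkout -q colleague1
printf '%s\n' 'some notes' > notes.txt
git add notes.txt
git commit -q -m "add notes"
git push -q origin colleague1

# Meanwhile, "master" gets updated on the shared repository.
cd "$work/tim_repo"
printf '%s\n' 'Methods' >> paper_main.tex
git commit -q -m "add methods" paper_main.tex
git push -q origin master

# The colleague retrieves the new content of "master"...
cd "$work/colleague1_repo"
git checkout -q master
git fetch -q origin
git diff --stat master origin/master
git merge -q origin/master

# ...then inspects it and merges it into their own branch.
git checkout -q colleague1
git diff --name-status colleague1..master
git merge -q --no-edit master
git log --oneline --graph
```

The --no-edit option only keeps the script non-interactive; at the command-line, git merge master opens the editor for the merge message.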
- Tips: don't version the output pdf in the repository because, as it is binary, git can't merge it properly. Instead, add a Makefile (see below) and enter
make main -i on the command-line to compile the pdf document when needed (the -i option tells make to keep going when a command returns an error):
# Recipe lines must start with a tab character.
.PHONY: all main supp clean

all: main supp

main:
	latex paper_main.tex
	bibtex paper_main
	latex paper_main.tex
	latex paper_main.tex
	pdflatex paper_main

supp:
	latex paper_supplements.tex
	bibtex paper_supplements
	latex paper_supplements.tex
	latex paper_supplements.tex
	pdflatex paper_supplements

clean:
	rm -f *~ *.aux *.dvi *.log *.pdf *.bbl *.blg *.toc
- Two remotes: let's imagine that on cluster1 I have 2 branches, "master" and "dev", on github I only have "master", and I want to work with "dev" on cluster2.