Introduction (Git)

From De Vliegende Brigade

Probably the simplest way to describe Git/GitHub:

It's like Dropbox for programmers

or slightly more elaborate:

It's like Dropbox for programmers who work on the same project

So far so good, but then things get very abstract for me, really fast. See Sources at the end of this article for some basic material that I found really good. The material that I found really, really good is quoted directly below :)

Stokely

I quite like this introduction from Stokely:

In Git, you always create code on your local computer first and then save your code into Git's "local repository" (repo) on your computer. You then upload your changes to Git's shared "remote repository" when you are done so others can access your code changes. You also download changes from the "remote repository" to your "local repository" so your code stays up-to-date with other developers' changes. You then start the process all over again.

In this way, Git allows you to share your local project code with others remotely while saving versions of those code changes in case something goes wrong and you have to redo some bad code.

More Git details

The first step is always writing code on your local computer. Git is not involved in saving or testing that code in any way. When you save your code on your computer, it is not saved in Git by default, as you might expect. You have to do a second step called a "commit". (Strictly speaking, saved changes that Git has not been told about yet are "unstaged". They become "staged" once you mark them for the next commit with git add, and only the commit itself stores them in the repository.)
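A minimal sketch of those two kinds of "save", run in a throwaway repository. The file name notes.txt and the developer identity are made up for illustration. In Git's own terms, a change counts as "staged" once it has been added with git add.

```shell
#!/bin/sh
set -e

# Throwaway repository (hypothetical names throughout)
work=$(mktemp -d)
cd "$work"
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"

# 1. Save the file on disk: Git sees it, but does not track it yet
echo "hello" > notes.txt
before_add=$(git status --porcelain)

# 2. Stage it: mark the change for the next commit
git add notes.txt
after_add=$(git status --porcelain)

# 3. Commit it: the "Git save" into the local repository
git commit -q -m "Add notes"
after_commit=$(git status --porcelain)
commit_count=$(git rev-list --count HEAD)

echo "after saving on disk: $before_add"
echo "after git add:        $after_add"
echo "after git commit:     ${after_commit:-(clean)}"
echo "commits in repo:      $commit_count"
```

The status codes tell the story: "??" means untracked, "A" means staged as a new file, and an empty status means everything is committed.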

A commit is the same as saving your local code changes but in the 'Git world'. This confuses people. But when I see the word "commit" I think of it as a "Git Save". It's an extra step, because you already saved your code changes once, and now have to save them a second time in the Git system as a commit or they do not become part of your local Git repository system. I think "commits" are one reason some think Git is poorly designed. It just is not intuitive.

A push is done after you have finished all your code saves and committed your code to your local Git repo. The push command sends your local repository changes (commits only) up to a remote repository so that it is updated. Afterwards the remote branch contains the same commits as your local branch, so the two are in sync and the code matches 100% between the two. Think of this as a "Remote Git Save". At first it looked to me as if a push simply writes over the code on the remote repo with what you have locally on your computer, and that made no sense to me. Would that not erase changes by other developers on the remote? What if the remote conflicts with your changes, or contains changes that you need in your local repo first? This confused me, as no one online could explain whether this was the same as a "merge", a "commit", a "pull request", etc. It turns out a push ONLY works under ONE CONDITION. Otherwise Git blocks your push and it fails!

A "push" only works if nobody else has changed the remote branch since you last synced with it, so that the two code bases are the same apart from the commits you added locally. Otherwise, any change made by another user on that remote branch will cause your push to be rejected. So think of push as a "Private Remote Write Over" of the same copy as in your local repo. But we know many developers will be pushing changes to the remote copy, same as you, right? So if this were the whole story, pushes would keep failing and everyone's local copy would be constantly out of sync with the remote copy.

As I mentioned, this push is ONLY allowed (not blocked by the remote repo) if the remote repository is in the exact state your local repository was in BEFORE you made changes. In other words, you can only push your local changes to the remote project if the remote branch has not been modified by any other developer in the meantime. That is one weird aspect of Git that confuses people. If your local copy of the code is out of sync in any way with the remote repo because the remote has been changed, the push will fail and you will be forced to update your local repo from the remote copy first, with a pull (or with a "rebase", which replays your local commits on top of the updated remote copy). If your push is blocked and you then do a pull, it will copy down the remote code and "merge" its changes into your local copy. Once in sync again, you can push your commits up, as they are still present after the pull or merge.
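The blocked push and the pull that unblocks it can be reproduced on one machine. This is only a sketch: a bare repository in a temp directory stands in for the remote, and the two developers (alice and bob) are hypothetical. The pull uses --no-rebase so that it merges, as described above.

```shell
#!/bin/sh
set -e

root=$(mktemp -d)
git init -q --bare "$root/remote.git"      # stands in for the shared remote

# Alice clones the (still empty) remote; Git warns about the empty clone, which is expected
git clone -q "$root/remote.git" "$root/alice"
git -C "$root/alice" config user.name "alice"
git -C "$root/alice" config user.email "alice@example.com"
echo "base" > "$root/alice/base.txt"
git -C "$root/alice" add base.txt
git -C "$root/alice" commit -q -m "Base commit"
git -C "$root/alice" push -q origin HEAD

# Bob clones the same remote, so both start from the same state
git clone -q "$root/remote.git" "$root/bob"
git -C "$root/bob" config user.name "bob"
git -C "$root/bob" config user.email "bob@example.com"

# Alice pushes a new commit first
echo "from alice" > "$root/alice/alice.txt"
git -C "$root/alice" add alice.txt
git -C "$root/alice" commit -q -m "Alice's change"
git -C "$root/alice" push -q origin HEAD

# Bob commits locally and tries to push: the remote has moved on, so Git rejects it
echo "from bob" > "$root/bob/bob.txt"
git -C "$root/bob" add bob.txt
git -C "$root/bob" commit -q -m "Bob's change"
if git -C "$root/bob" push -q origin HEAD 2>/dev/null; then
  first_push=accepted
else
  first_push=rejected
fi

# Bob pulls (fetch + merge), then pushes again
git -C "$root/bob" pull -q --no-rebase --no-edit
if git -C "$root/bob" push -q origin HEAD; then
  second_push=accepted
else
  second_push=rejected
fi

branch=$(git -C "$root/bob" rev-parse --abbrev-ref HEAD)
bob_tip=$(git -C "$root/bob" rev-parse HEAD)
remote_tip=$(git -C "$root/remote.git" rev-parse "$branch")

echo "first push:  $first_push"
echo "second push: $second_push"
```

After the second push, the remote branch and Bob's local branch point at the same commit, and Bob's working copy contains Alice's file as well as his own.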

This works perfectly in most cases, unless a change you made conflicts with a change another developer made in the same area of code. In that rare case you have to resolve the conflict locally before you can proceed with anything else, as you could otherwise erase another developer's changes with your own by accident. That is where pull requests (see below) are helpful instead of a push, as they force major code changes to be reviewed and resolved by the code owners or admins before any code is allowed to change the remote repo.
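What such a conflict looks like, and how it gets resolved locally, can be sketched in a single throwaway repository. The branch other-dev stands in for the other developer's work and settings.txt is a made-up file. A plain git merge is used here because a pull ends with that same merge step.

```shell
#!/bin/sh
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo "color = blue" > settings.txt
git add settings.txt
git commit -q -m "Base settings"
main_branch=$(git rev-parse --abbrev-ref HEAD)

# The other developer changes the line one way...
git checkout -q -b other-dev
echo "color = red" > settings.txt
git commit -q -am "Use red"

# ...and you change the same line another way
git checkout -q "$main_branch"
echo "color = green" > settings.txt
git commit -q -am "Use green"

# Merging now stops with a conflict instead of guessing who is right
if git merge --no-edit other-dev >/dev/null 2>&1; then
  merge_result=clean
else
  merge_result=conflict
fi
conflicted=$(git diff --name-only --diff-filter=U)

# Resolve by hand: decide on the final content, stage it, and conclude the merge
echo "color = green" > settings.txt
git add settings.txt
git commit -q -m "Merge other-dev, keeping green"

echo "merge result:     $merge_result"
echo "conflicted files: $conflicted"
```

In a real conflict you would open the file, look at the conflict markers Git wrote into it, and pick or combine the two versions rather than blindly overwriting one of them.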

Interestingly, a "pull" is the mirror image of a "push": it fetches a copy of the latest remote project down to your local Git system and then "merges" those changes into your own copy, rather than writing over it. Of course, this syncs your local copy with the remote again, minus the new commits you are about to send to the remote repo with your next "push".
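Under the hood a pull is two steps, a fetch followed by a merge, and running them separately shows what each one does. A small sketch with a bare temp repository as the remote and two hypothetical clones (alice and bob):

```shell
#!/bin/sh
set -e

root=$(mktemp -d)
git init -q --bare "$root/remote.git"

# Alice creates the first commit and pushes it (the empty-clone warning is expected)
git clone -q "$root/remote.git" "$root/alice"
cd "$root/alice"
git config user.name "alice"
git config user.email "alice@example.com"
echo "one" > log.txt
git add log.txt
git commit -q -m "First"
git push -q origin HEAD

# Bob clones, then Alice pushes a second commit
git clone -q "$root/remote.git" "$root/bob"
echo "two" >> log.txt
git commit -q -am "Second"
git push -q origin HEAD

cd "$root/bob"
branch=$(git rev-parse --abbrev-ref HEAD)

# Step 1: fetch downloads the new commits but leaves Bob's own branch alone
git fetch -q origin
local_after_fetch=$(git rev-list --count HEAD)
remote_after_fetch=$(git rev-list --count "origin/$branch")

# Step 2: merge brings the downloaded commits into Bob's branch
git merge -q "origin/$branch"
local_after_merge=$(git rev-list --count HEAD)

echo "local commits after fetch: $local_after_fetch"
echo "known remote commits:      $remote_after_fetch"
echo "local commits after merge: $local_after_merge"
```

Because Bob had no commits of his own, the merge here is a simple fast-forward. With local commits on both sides, the same step would create a merge commit.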

Once you have synced your local copy with the remote via a pull, you can do a push and send your commits or changes back up to the remote copy, safely knowing you have merged your changes with the ones made by all the other developers.

After the push updates the remote copy with your local copy's commits or changes, the remote matches your local copy exactly. Because they both match, any additional commits or saves you make locally can be pushed again without pulls, as long as no other developer has changed the remote in the meantime. Git will always alert you on pushes if that is the case. You cannot screw it up. You can do "forced" pushes, "rebasing", and other tricks, but that's not important. But as soon as other developers push their changes, you are out of sync again and have to do a pull first before pushing.

This commit-pull-push is the real rhythm of Git development nobody tells you about and assumes you understand. Most do not. It just is not intuitive or logical.

Of course you can "force" a push and write over everything anyway. But a normal push is rejected first, so you will be alerted before you try that. This system of pulls and pushes always works best when one developer is updating one branch in one remote repository they own. It gets harder as soon as another person adds or changes anything in the remote. That is what causes push and pull alerts and failures until everyone syncs again with the remote. When that is done, the developer can push changes again, as their code matches the remote. But it's better to use a pull request (a feature of hosting platforms such as GitHub, not a Git command) when there are a lot of changes or merges of branches and code going to remote repos.

Lastly, it is important to note that people developing in Git are almost always encouraged to create a new branch first, both locally and on the remote repo, before making changes to software. In that case push and pull make perfect sense, as code changes are then almost always done on an isolated branch of the software by a single developer, who rarely conflicts with other developers' changes. This also explains why it's common for one developer to work on their own branch. In that case push and pull work perfectly: changes go up and down quickly, code conflicts are rare, and the developer can store copies of their final local changes on a remote repo branch, to be merged into the main branches later using the pull request system described below.
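A sketch of that one-developer-one-branch routine. The branch name feature/login and the file site.txt are made up, and a bare temp repository stands in for the remote. The -u flag on the first push links the local branch to its remote counterpart, so later pushes need no arguments.

```shell
#!/bin/sh
set -e

root=$(mktemp -d)
git init -q --bare "$root/remote.git"

# Set up a clone with one commit on the main branch (the empty-clone warning is expected)
git clone -q "$root/remote.git" "$root/dev"
cd "$root/dev"
git config user.name "Example Dev"
git config user.email "dev@example.com"
echo "home page" > site.txt
git add site.txt
git commit -q -m "Initial commit"
git push -q origin HEAD

# Create a private branch and publish it to the remote
git checkout -q -b feature/login
echo "login form" >> site.txt
git commit -q -am "Add login form"
git push -q -u origin feature/login

# Nobody else touches this branch, so further pushes go through without a pull
echo "logout link" >> site.txt
git commit -q -am "Add logout link"
git push -q

upstream=$(git for-each-ref --format='%(upstream:short)' refs/heads/feature/login)
local_tip=$(git rev-parse HEAD)
remote_tip=$(git -C "$root/remote.git" rev-parse refs/heads/feature/login)

echo "feature/login tracks: $upstream"
```

The main branch on the remote is untouched by all of this, which is exactly why working on your own branch keeps pushes simple.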

The Weird Pull Request

Last part of the Git puzzle. A pull request is a "pull" from the perspective of the remote repository, which is asked to pull your code into it. But it is a request and initially does NOT physically pull or change anything, nor push or merge any code. It is simply a request you send to the owners of a remote repo (through the hosting platform, such as GitHub) to review your code and, if they approve, pull your code into their remote copy.

In a pull request, you are asking the remote repo admins or owners to review the commits on a branch you have pushed, and then merge those changes into the remote repo upon their approval. When they approve of your reviewed commits or changes, they pull your branch and merge it into a remote repo branch.

The remote repo has admins or owners that control critical top branches of code in the remote repo that are being prepared for production. They do not like to merge code changes into these larger branches without some control over the quality of the code. When you do a pull request you are alerting the admins or remote repo owners that you, as a developer, have a finished branch of code and want them to pull it into the main code of the remote repo. The end result is similar to a push from a local to a remote repo, but in this case it is completed from the opposite direction, by the remote repo owners. It also requires a manual code review and approval of the changes by the remote repo owners first.

Note: It is more common for a pull request to merge code between two remote repo branches, and usually affects finished branches of code on the remote server or device that must be merged into a main remote branch, not just local commit changes. Again, not intuitive, but there you go!
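Git itself has no command that opens a pull request; that part happens on the hosting platform. What can be sketched locally is everything around it: the contributor pushes a branch, and the maintainer reviews and merges it. The sketch below assumes the maintainer merges with a merge commit, which is only one of the strategies a platform may offer. The names maintainer, contributor, fix-typo and app.txt are all hypothetical.

```shell
#!/bin/sh
set -e

root=$(mktemp -d)
git init -q --bare "$root/remote.git"

# Maintainer publishes the main branch (the empty-clone warning is expected)
git clone -q "$root/remote.git" "$root/maintainer"
cd "$root/maintainer"
git config user.name "maintainer"
git config user.email "maintainer@example.com"
echo "v1" > app.txt
git add app.txt
git commit -q -m "Initial release"
git push -q origin HEAD
main_branch=$(git rev-parse --abbrev-ref HEAD)

# Contributor works on a branch and pushes that branch, not the main branch
git clone -q "$root/remote.git" "$root/contributor"
cd "$root/contributor"
git config user.name "contributor"
git config user.email "contributor@example.com"
git checkout -q -b fix-typo
echo "v1 fixed" > app.txt
git commit -q -am "Fix typo"
git push -q origin fix-typo
# On a hosting platform, the pull request would be opened at this point.

# Maintainer reviews what the branch adds, then merges and publishes it
cd "$root/maintainer"
git fetch -q origin
commits_to_review=$(git log --oneline "$main_branch..origin/fix-typo" | wc -l | tr -d ' ')
git merge -q --no-ff --no-edit origin/fix-typo
git push -q origin HEAD

merged_content=$(git -C "$root/remote.git" show "$main_branch:app.txt")
echo "commits reviewed:       $commits_to_review"
echo "main branch app.txt is: $merged_content"
```

Note that the contributor never pushes to the main branch. Only the maintainer's merge changes it, which is the control the pull request is meant to provide.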

Why would you need a pull request if you have push and pull Git commands synchronizing and merging code? Pull requests work better than pushes in scenarios where lots and lots of people are pulling and pushing huge changes to finished branches that get merged into the main remote repo branches. In those scenarios lots of code could conflict, or code is added that must be tested or reviewed before it goes into a major release or code base update. Push and pull work better for isolated, smaller branches that only one or two developers are working on and sharing. It is much easier to pull and push code between local and remote repos than to merge massive branches of complex changes into the master branches of a remote repo.

So remember...

Use 'Push' to update small branches you control both locally and remotely.

Use 'Pull Request' to have remote repository people merge your smaller branches into their larger ones on the remote servers.

So I like to think of pull requests as a Master Push and a push as a Local Push. I wish the Git guys had made the names for these processes more logical and easier to understand. It just is not intuitive at all!

It turns out a pull request is also a way of adding a layer of code security: it asks permission from admins or team members first when merging large stacks of code into critical top-level remote branches, or when merging branches in projects. As such it asks the team members to approve large batches of code changes and commits before they pull them into important remote repo branches.

This serves to protect code updates to critical branches with code reviews and approvals first. But it also allows remote repos that are being updated by lots of teams to pause code changes on more important branches and merges until they are tested, approved, etc. This is why small branches of code that are private to a developer are simply updated with pull and push, while larger merges of those branches are typically not pushed directly but go through pull requests instead. This is typical of finished branches that were built up with pushes and are then merged into larger branches up the Git tree.

Figuring this out took hours and hours of research, even after I had been using Git for years, since I could not find documentation online that explains this difference.

So... always use a commit-pull-push routine when working on your own code changes or on sections of projects assigned to you. Pull down the repo first to make sure your local repo has all the latest changes made to it by other developers. Commit or save all your local changes before and after the pull if you like... it doesn't matter. If there are conflicts, try to resolve them locally. Then, and only then, do your push to update the remote copy with your local code. Your local and remote repositories are then 100% in sync in the Git World!

Lastly, when your local and remote branches are done, send a pull request to your git administrator and have them handle the merge of your completed remote repository branch.

Sources