POV-Ray: Newsgroups: povray.off-topic: Git tutorial

From: Invisible
Subject: Re: Git tutorial
Date: 21 Apr 2011 04:06:44
Message: <4dafe594$1@news.povray.org>

On 20/04/2011 18:11, Darren New wrote:

> Or, as an alternate example, say you've been working and every day you
> commit before lunch and you commit before you go home, even if it's not
> working, just so it gets backed up. And you implement two functions, and
> you write code on that, and then realize you should have put that first
> function elsewhere, and you don't need the second function at all, and
> the other code should be in separate objects, and etc etc etc.
>
> And at the end of the week, you have 50 messy changes committed.
>
> With git, you can say "OK, go diff the current version against where I
> branched, and give me exactly one commit with all the changes I need."
> It's trivial to do that in git and then say "now commit *that* change
> for everyone else to see, and abandon all the intermediate changes."
>
> I don't know how you'd do something like that in mercurial or darcs that
> store *changes* in the repository.

Assuming that your working copy matches everything Darcs has in its 
history, you'd do this:

1. You "unrecord" the 50 messy commits. That doesn't do anything to your 
working copy, just the history Darcs keeps.

2. You "record" a single commit. When you do this, Darcs diffs the whole 
working copy against what it has in its history, and records that.

(In other words, if you add 500 lines, commit, delete 450 of those lines 
commit, and then you unrecord the two commits and record a new commit, 
the 450 lines that you added then deleted don't show up any more.)
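
To make that concrete, here is a rough Python sketch. It is not Darcs itself; `net_change` is a made-up helper, and the "commits" are plain lists of lines. It only compares the starting state with the final state, so anything added and later removed never shows up.

```python
import difflib

def net_change(base, final):
    """Return only the lines added or removed between two snapshots."""
    return [line for line in difflib.ndiff(base, final)
            if line.startswith(("+ ", "- "))]

base = ["original line"]
# First messy commit: add 500 lines.
after_first = base + [f"new line {i}" for i in range(500)]
# Second messy commit: delete 450 of them again (keep the first 50).
final = after_first[:51]

# Diff the end state against the start, ignoring the intermediate state.
changes = net_change(base, final)
print(len(changes))  # 50
```

The single recorded change contains the 50 surviving lines and nothing about the other 450.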

Needless to say, you do *not* want to be unrecording any history which 
other people have copies of. But if it's only your local repo, it's fine.

From: Invisible
Subject: Re: Git tutorial
Date: 21 Apr 2011 04:17:13
Message: <4dafe809$1@news.povray.org>

>> The problem Git seems to have is that it uses heads to keep track of
>> things.
>> Delete the head and the corresponding commit drops off the face of the
>> Earth.
>
> Yes. That's why you shouldn't do that.

It's also why having sub-repos might be tricky. Much simpler if you 
don't need to keep updating pointers.

> On the other hand, if you work on something and decide it wasn't a good
> idea, you can delete the branch and no harm done. Darcs apparently
> requires you to copy the entire repository before you even *start*
> making changes if you want to recover.

What craziness are you speaking? If you want to go back to an older 
version, you just say "take me back to an older version please". If you 
don't want changes you've made, you either record commits reverting 
them, or you just delete them from the history outright. That's kind of 
the whole point of version control, distributed or not.

>> Darcs manages a set [as in set theory] of changes. You don't need to keep
>> updating a "pointer" to point to the latest one or anything. I'd be
> surprised if no other VCS has thought of this.
>
> But that's exactly why you need to start a new repository if you want a
> new branch. If you clone a repository in Darcs, make a bunch of changes,
> then accidentally delete the repository, you're in even worse shape than
> if you delete a branch in git.

Well, yes, if you delete all your work, you have a problem. This isn't 
unique to Darcs. I'm not seeing what your point is...

From: Le Forgeron
Subject: Re: Git tutorial
Date: 21 Apr 2011 05:13:05
Message: <4daff521@news.povray.org>

On 20/04/2011 19:11, Darren New wrote:
> On 4/20/2011 9:03, Invisible wrote:
>> Fundamentally, VCS are about tracking changes. Git might *implement*
>> that by
>> storing the entire file, but *logically* what you're trying to do is keep
>> track of what you changed.
> 
> Or, as an alternate example, say you've been working and every day you
> commit before lunch and you commit before you go home, even if it's not
> working, just so it gets backed up. And you implement two functions, and
> you write code on that, and then realize you should have put that first
> function elsewhere, and you don't need the second function at all, and
> the other code should be in separate objects, and etc etc etc.
> 
> And at the end of the week, you have 50 messy changes committed.

Yes, but that is in your messy repository only.
"Commit often, Push when working" is a good approach with DVCS.

> 
> With git, you can say "OK, go diff the current version against where I
> branched, and give me exactly one commit with all the changes I need."
> It's trivial to do that in git and then say "now commit *that* change
> for everyone else to see, and abandon all the intermediate changes."
> 
> I don't know how you'd do something like that in mercurial or darcs that
> store *changes* in the repository.
> 

For Mercurial, there is an extension that folds a line of changesets (or
even a cloud of them) into a single one: collapse.

As long as the set of commits has not been published to another repository,
it's OK (you just lose the finer steps).

From: Invisible
Subject: Re: Git tutorial
Date: 21 Apr 2011 05:26:12
Message: <4daff834$1@news.povray.org>

On 20/04/2011 17:55, Darren New wrote:
> On 4/20/2011 9:03, Invisible wrote:
>> Doesn't appear to me that that's what happens, from what little Mercurial
>> documentation I've read.
>
> I don't know. All the mercurial documentation I've read talks about
> change sets.

The documentation I saw talks about a linear series of file versions, 
just like Git and RCS and CVS and...

>> Fundamentally, VCS are about tracking changes.
>
> Fundamentally, they're about controlling versions. :-)

Well, that's a valid way to look at it I guess.

> I think it depends. If I want version 1.0 that was released, I don't
> really care what changed to get there. I want that version.

Yes, clearly.

On the other hand, if somebody sends you some stuff and says "add this 
to your repo, it fixes bug #23482", you probably want to know what changed.

So yes, it does depend.

> The advantage of storing what's actually there is you can write all
> kinds of better tools to tell you the differences.

You can still apply whatever tools you want to your files, no matter 
which way you store them. Although I will admit, having a diff algorithm 
built right into the version control software is quite nice. (Although 
sometimes I wish Darcs did this better.)

> Basically, you're storing absolutes and deducing differences, rather
> than storing differences and deducing absolutes. That means when you
> want to know what changed between release candidate 3.5RC2 and the
> version 4.2 that Fred compiled over on *his* machine, you can just
> compare the two. You don't have to reconstruct anything first.

If you just write "darcs diff", you can see the changes between any two 
versions of your repo. The fact that Darcs has to do lots of work behind 
the scenes to do this is of little consequence to me. Darcs has to apply 
an algorithm that generates the two versions and then diffs them. Git 
would have to apply an algorithm that unpacks the two commits and diffs 
them. I don't really care, so long as I get my answers.

>> OK, wow. I thought having to tell Darcs when I rename stuff was
>> inconvenient, but this just sounds insane...
>
> Why? You don't have to tell git you renamed something.

Which means that it tries to guess when you rename something, so it is 
100% guaranteed to guess wrong sometimes.

Still, I suppose if it's sufficiently rare, it doesn't matter too much...

>> It sounds simple enough. If this change affects line X and that change
>> affects line Y, they are independent.
>
> Yeah, until you get binary objects in there. :-)

Yeah, it's unclear how you can hope to version control a binary file, 
other than just keeping a linear sequence of versions (which is what 
Darcs apparently does). Personally I've never needed to try, but I guess 
someday I might.

>> And if you merge the comments branch into the main branch, and then
>> somebody
>> adds more stuff to the comments branch, then what?
>
> Then you get a merge and then more changes on the comments branch. And
> if you merge the comments branch *again*, *that* is when git uses the
> parent pointers in the commit objects to figure out which files to diff
> in order to get the patches to the parent.
>
> A--B--C--D--E--F--G--H--I
>     \    /        /
>      Q--R--S-----T
>
> So you started with A, changed to B, branched B and made a change to
> create Q, then R. In the mean time, I changed B to be C. Now I merge
> your R back to my C. This looks back, sees B is the common ancestor, so
> diffs R against B and applies it to C, then creates D with C and R as
> parent commits. (Each letter is a commit, which includes the state of
> the entire repository.)
>
> Now you keep working on R without incorporating my B->C change, creating
> S and T. I change D to include E and F. Now I merge your work again.
>
> Git looks at F, follows it back to D, to C and R, and sees that R is a
> common ancestor of both F and T. So it diffs T against R, applies those
> diffs to F, and creates G. You can then delete the branch that points to
> T safely without losing anything.
>
> It's *super* straightforward to understand what merges do in git.

I don't know, man, that all looks very, very complicated to me.

If I want to fix a bug in (say) GHC [which uses Darcs], I find the files 
in question, edit them, record the changes, and email the file to the 
GHC developers. I don't need to care about branches or whether the 
development tree has changed since I got my copy of it. They don't need 
to care whether my repo is in sync with theirs. They just apply the 
change, and it's done. Simple.

> And if someone comes up with a better diff algorithm, no problem. The
> algorithm to do the diff during a merge isn't built into the repository.

This is only an issue for Darcs. I don't have to care how Darcs stores 
my stuff. I can apply any diff algorithm I want to my files.

>> Git doesn't support recording half a file modification,
>
> Yes it does. Indeed, you can even go back and retroactively say "oh,
> those two commits? The second one should have come first, and the first
> one should be broken up into these three commits."
>
> As I said, I do this all the time.

I don't see how that's possible.

>> and doesn't even figure out which files changed.
>
> Yes it does.

Then why do you have to manually tell it which files to commit?

> It's just a two-step process. You can build up the thing you want to
> commit, and then finally commit it. It sounds like Darcs needs you to do
> that all in one step.
>
> Git does it the other way around. First it asks you what modified lines
> you want to put in the commit (and puts them in the staging area), then
> it creates the commit (based on the staging area).

I'm not sure I'm understanding what Git does. What Darcs does is show 
you each change and say "do you want to put this into the commit?" If 
you say yes, it records that change. If you say no, the change stays as 
"new". My usual workflow when I edit stuff is to periodically run Darcs, 
gather up all the changes related to one thing into a commit, run Darcs 
again, gather up all the changes related to another thing into another 
commit, and so on. I'm not sure what you mean by "Darcs needs you to do 
that all in one step".

>>> This is trivial with GIT. I do it all the time. I'll be adding a new
>>> function, and while testing, realize there's a bug in some other
>>> function. So when everything works again, I'll do two commits, staging
>>> just particular hunks (in the diff sense of the word) and do two
>>> commits, one for the bugfix and one for the new change.
>>
>> Given that Git can only record the new file or the old one, how is that
>> possible?
>
> The staging area lies between the repository and the working directory.

So, wait, there's a third file storage area?

> So I check out some branch, and that copies it to the WD and maybe
> clears the staging area. The staging area is basically a commit that's
> not yet in the repository.
>
> Now I make changes to the WD.
>
> Then I use something like "git add" to add all the changes from the WD
> to the staging directory. Or I use "git add -i" (or, more likely, the
> GUI) to diff the WD against the staging area (or the repository), pick
> (say) three of the five diff hunks, and then create a new temp file that
> holds the repository with those three diff hunks applied, which I then
> put in the staging area. When I have everything the way I like, I commit
> the change, which copies the staging area into the repository and then
> adds a commit object pointing to it.

Damn that sounds complicated.

>> This boggles my mind. Apparently I /don't/ understand how Git works at
>> all, because the way it seems to work precludes two people touching the
>> same file at the same time...
>
> Sure. But you're thinking git tracks diffs. That's exactly the point.

I know Git doesn't track diffs - I just can't comprehend how that can 
actually work properly.

> If
> I change the file, and you change the file, then now there's three
> files. The original, the new one I have, and the new one you have. When
> we go to merge it, we create number four, which is your new one with the
> differences between my version and the original applied.

This just seems a very strange way to look at things. Generally you 
don't care about versions, you care about alterations. "Does this draft 
have the corrections to chapter 4 in it or not?"

It seems to me that with the Git model, any time anybody edits any file, 
you create a new version of the entire repo that then has to be 
laboriously merged back into everybody else's repos. (Assuming no other 
edits have happened in the meantime.) What a clunky way to work.

>> And the "minor detail" that if 200 people edit the same file, that's 200
>> separate branches which have to be manually merged back together again.
>
> And this differs from any other VCS how?

With a centralised system, usually it's a check-in / check-out model, so 
only one person can edit a file at once.

With something like Darcs, there are now 200 change-sets, each of which 
is only in some repos. Copy the change-sets around and everything is in 
sync again. No need for complex "merge" operations or tangled file 
histories.
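
Roughly what I have in mind, as a toy Python model: a repo is nothing more than a set of named patches, and `pull` is a made-up helper. It deliberately ignores patch dependencies and conflicts, which the real Darcs has to deal with.

```python
# Toy model: a repository is just a set of named patches.
alice = {"init", "fix-typo"}
bob = {"init", "add-readme"}

def pull(dest, source):
    """Copy over every patch the destination lacks; return what was new."""
    missing = source - dest
    dest |= missing  # in-place union, so the caller's set is updated
    return missing

pull(alice, bob)
pull(bob, alice)
print(alice == bob)  # True
```

No ordering or "latest" pointer is involved; each side simply ends up holding the union.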

> Note that if you're trying to *push* changes to a remote repository, you
> have to do it to a branch where nobody else has branched off since you
> did.

And what the hell are the chances of that ever happening? If every time 
anybody touches any file it generates a new branch, then there's no 
chance of ever being able to push changes back.

> In other words, if I say "update my repository to the DEV branch on
> the company's central repository", and I make changes, and someone else
> changes the DEV branch to point to a later version, I can no longer push
> my changes into the DEV branch. Instead, I have to fetch down the new
> DEV branch, merge my changes, then push the newly merged commit back up.

And hope that the DEV branch doesn't change while you're busy trying to 
catch up. Still, I suppose if you repeat this cycle enough times, 
eventually you might get lucky and be able to perform the push.

>>> If you want to merge someone's repository into yours, you simply copy
>>> from them any files or names that they have that you don't, and you're
>>> done. You're merged.
>>
>> It would be nice if Darcs worked that way.
>
> Right. In Darcs, you have to merge all the changes. In git, you have to
> merge all the changes.

No, I meant it would be nice if the Darcs repo format allowed you to 
update a repo just by copying some files. Unfortunately there's 
cross-references and stuff which also have to be updated, so it's not 
that simple. You actually have to run Darcs to import a new patch.

Darcs also doesn't explicitly support the "bare" format that Git does, 
despite it being obviously useful.

>>> Now if you want to incorporate their changes into
>>> your work, you generate a diff between their latest version and some
>>> earlier version, and apply that diff to your latest version, and you're
>>> merged.
>>
>> What a backwards way to look at it.
>
> Only if you're used to looking at source control as a series of diffs to
> start with. But that's (A) exactly what makes git hard to understand and
> (B) exactly what makes git brilliant. :-)

So doing things the hard way is brilliant?

From: Invisible
Subject: Re: Git tutorial
Date: 21 Apr 2011 05:35:51
Message: <4daffa77$1@news.povray.org>

>> Given that it's the repository format used by Linux developers, I think
>> it's safe to say it works adequately for multiple people editing the
>> file in parallel.
>
> This boggles my mind. Apparently I /don't/ understand how Git works at
> all...

Perhaps I can summarise:

The Darcs workflow. I download the source code, make a small edit to it, 
ask Darcs to record that, and send the changes to the developers. They 
add it to the central repo, which checks whether the bit I just edited 
has changed since I got my copy. If not [which is quite likely], the 
change is added to the central repo. Done.

The Git workflow. I download the source code, make a small edit to it, 
and ask Git to record that. Git takes a complete record of every file in 
the entire repo. I send that to the developers, and they try to add it 
to the central repo, which makes Git check whether any unrelated changes 
have happened anywhere in the entire repo since I made my change. Since 
it is 100% guaranteed that this will have happened, the merge fails. I 
now have to download the latest version of the source code, do a bunch 
of work on my end to incorporate my 3-line edit into the newly updated 
source tree, record a completely new commit object, and send that plus 
the previous one back to the developers. They try to merge, find the 
exact same problem, and I have to repeat all the above steps. We repeat 
this endlessly until, by some fluke, I manage to execute the entire 
download / merge / commit / send / have the developers merge cycle 
without anybody else successfully merging to the central repo in 
between. When this happens, instead of the 3-line change being added to 
the central repo, we get a 25-mile string of merge commits plus the 
actual 3-line edit at the beginning.

WTF, people?

Obviously this model cannot possibly work, so there must be something 
I'm misunderstanding about how Git works.

From: Warp
Subject: Re: Git tutorial
Date: 21 Apr 2011 10:46:10
Message: <4db04332@news.povray.org>

Darren New wrote:
> Having spoken with half a dozen people who said "I hate git, it's so 
> confusing" and then after showing them this they go "wow, that's really 
> easy", I figured it might be worthwhile to show people this. :-)

  I hate SVN. SVN projects are so easy to break accidentally, and if you
don't know the exact reason, you could be fighting to fix it for a long
time.

  SVN projects are extremely fragile. SVN has the totally braindead idea
of putting a .svn project directory in every single subdirectory of the
project. (This is very unlike git, which keeps one single project directory
under the main directory where the project resides.)

  For example, duplicate a directory with your favorite file manager.
Oops. SVN doesn't like that new directory at all. It refuses to add it
to the project, or do anything at all with it. If you don't know why this
happens, you are stuck. SVN refuses to do anything with it. (Solution:
Remove the .svn subdirectories from the entire offending directory hierarchy.
Most graphical file managers have no support for doing this recursively, of
course, and it's aggravated by . files being hidden in unix systems.)

  Copy the contents of a directory (and its possible subdirectories) from
somewhere else (eg. update the contents of a third-party library, or copy
the work you have been doing on another platform to the project). Oops,
you just broke SVN once again. SVN will once again refuse to do anything
with this directory. You can't commit, and even an update won't fix the
problem. (Solution: Remove the entire offending directory structure, then
update, then copy the individual files from the other directory, rather
than the entire directory structure.)

  Sometimes renaming/moving things from an SVN client itself can break
things, even though it shouldn't.

  On a Mac the file system adds an additional layer of annoyance. Try
changing just the case of a file name (for example change "settings.hh"
to "Settings.hh") and try to figure out how to make SVN work after that.
It can be a pretty fun evening. (Not.)

-- 
                                                          - Warp

From: Darren New
Subject: Re: Git tutorial
Date: 21 Apr 2011 11:34:29
Message: <4db04e85@news.povray.org>

On 4/21/2011 1:17, Invisible wrote:
> If you don't want
> changes you've made, you either record commits reverting them, or you just
> delete them from the history outright.

OK. It wasn't obvious from the bits I read that it was easy to delete 
changes from the repository. I guess Darcs probably does that better than 
mercurial or something.

> Well, yes, if you delete all your work, you have a problem. This isn't
> unique to Darcs. I'm not seeing what your point is...

That the solution to deleting branches you're in the middle of working on is 
the same in both cases: "Duh, don't do that. Or make backups."

-- 
Darren New, San Diego CA, USA (PST)
   "Coding without comments is like
    driving without turn signals."

From: Darren New
Subject: Re: Git tutorial
Date: 21 Apr 2011 12:13:05
Message: <4db05791$1@news.povray.org>

On 4/21/2011 2:26, Invisible wrote:
> You can still apply whatever tools you want to your files, no matter which
> way you store them. Although I will admit, having a diff algorithm built
> right into the version control software is quite nice. (Although sometimes I
> wish Darcs did this better.)

That's exactly what I mean. In order to have Darcs do this better, you have 
to actually fix your repo to use the better algorithm. In git, the algorithm 
isn't part of the repo. It's only part of the tools. Your very statement 
"it's built in, but I wish Darcs did it better" is exactly my point.

> If you just write "darcs diff", you can see the changes between any two
> versions of your repo.

What if I want to use kdiff3 or gvimdiff or some other diff visualization tool?

>>> OK, wow. I thought having to tell Darcs when I rename stuff was
>>> inconvenient, but this just sounds insane...
>>
>> Why? You don't have to tell git you renamed something.
>
> Which means that it tries to guess when you rename something, so it is 100%
> guaranteed to guess wrong sometimes.

If a file disappears from one place and at the same time reappears somewhere 
else with exactly the same contents, byte for byte, does it really matter 
whether you renamed it, whether you cut and pasted, or whether you typed it 
back in?

Remember, git is storing snapshots of the repository, not changes. The fact 
that something got renamed is sort of irrelevant.
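
A sketch of that idea in Python, assuming snapshots are plain dicts of path to bytes (`content_id` and `exact_renames` are made-up names, not git functions). It only covers the byte-identical case; git's own detection also scores files that are merely similar, which this does not attempt.

```python
import hashlib

def content_id(data: bytes) -> str:
    """Identify a file purely by a hash of its bytes."""
    return hashlib.sha1(data).hexdigest()

def exact_renames(old, new):
    """Pair paths that vanished with paths that appeared carrying identical bytes."""
    gone = {content_id(d): p for p, d in old.items() if p not in new}
    renames = {}
    for path, data in new.items():
        if path not in old and content_id(data) in gone:
            renames[gone[content_id(data)]] = path
    return renames

before = {"settings.hh": b"int x;\n", "main.cc": b"int main() {}\n"}
after = {"Settings.hh": b"int x;\n", "main.cc": b"int main() {}\n"}
print(exact_renames(before, after))  # {'settings.hh': 'Settings.hh'}
```

Nothing was recorded at rename time; the rename is deduced afterwards by comparing two snapshots.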

Darcs is also storing snapshots, except snapshots of changes. If I rename 
file A to B, and then from B to C, and *then* I commit, Darcs isn't going to 
have those changes either.  If I delete five lines, then another ten, then 
commit, Darcs is going to lose the history showing that that was actually two changes.

> Yeah, it's unclear how you can hope to version control a binary file, other
> than just keeping a linear sequence of versions (which is what Darcs
> apparently does). Personally I've never needed to try, but I guess someday
> I might.

Word documents. Images. Audio. Video game resources. People want version 
control for all of that.

Keeping all the old versions is the way to do it. The problem comes when you 
have a distributed repo, and you have to store locally every old version of 
all the binary files that you almost never are going to want.

Imagine if you were a Linux developer and people stored installation CD ISO 
images in the repository. Do you really want to check out every copy of 
every install CD just so you can fix bugs in one file system?

> I don't know, man, that all looks very, very complicated to me.

You asked what happens in that case. That's what happens.

> If I want to fix a bug in (say) GHC [which uses Darcs], I find the files in
> question, edit them, record the changes, and email the file to the GHC
> developers. I don't need to care about branches or whether the development
> tree has changed since I got my copy of it. They don't need to care whether
> my repo is in sync with theirs. They just apply the change, and it's done.
> Simple.

That's how it works with git also. Indeed, there's a git command that says 
"generate an email with the patch in it that I need to update someone else's 
repository."

You are making a branch. Your whole repository is a branch of the other 
guy's repository. If you look at the up-pointing lines as "you mail me a 
patch", then you get the same answer.

You asked what happens when someone keeps working on a branch that someone 
else already incorporated. I showed you how git decides which diffs to apply 
and which not to apply. Darcs does the same thing when building the working 
directory. It's going to apply the new patches, but how does it know what 
the new patches are? Right, it goes back until it finds the patches it 
already applied, then applies the newer ones.
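
For the curious, here is a small Python sketch of the "look back for the common ancestor" step, using the parent links of the commit graph I drew before. It is an illustration only, with made-up helper names, not how git implements it internally.

```python
# Parent links for the commit graph drawn earlier in the thread.
PARENTS = {
    "A": [], "B": ["A"], "C": ["B"], "D": ["C", "R"], "E": ["D"],
    "F": ["E"], "G": ["F", "T"], "H": ["G"], "I": ["H"],
    "Q": ["B"], "R": ["Q"], "S": ["R"], "T": ["S"],
}

def ancestors(commit):
    """Return the commit itself plus everything reachable via parent links."""
    seen, stack = set(), [commit]
    while stack:
        c = stack.pop()
        if c not in seen:
            seen.add(c)
            stack.extend(PARENTS[c])
    return seen

def merge_bases(x, y):
    """Common ancestors that are not ancestors of another common ancestor."""
    common = ancestors(x) & ancestors(y)
    return {c for c in common
            if not any(c != other and c in ancestors(other) for other in common)}

print(merge_bases("C", "R"))  # {'B'}
print(merge_bases("F", "T"))  # {'R'}
```

The first merge diffs R against B; the second diffs T against R, because D already records R as a parent.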

>>> Git doesn't support recording half a file modification,
>>
>> Yes it does. Indeed, you can even go back and retroactively say "oh,
>> those two commits? The second one should have come first, and the first
>> one should be broken up into these three commits."
>>
>> As I said, I do this all the time.
>
> I don't see how that's possible.

Here's how to do it without the GUI:

http://book.git-scm.com/4_interactive_adding.html

With a GUI, you look at the patch list, right-click a hunk, and say "stage 
this to be committed".

If you want to change old commits, you do this:

http://book.git-scm.com/4_interactive_rebasing.html

Again, that's the text-based way of doing it without a gui.

>>> and doesn't even figure out which files changed.
>>
>> Yes it does.
>
> Then why do you have to manually tell it which files to commit?

Because maybe you don't want to commit all your changes in one step.

> I'm not sure I'm understanding what Git does. What Darcs does is show you
> each change and say "do you want to put this into the commit?" If you say
> yes, it records that change. If you say no, the change stays as "new".

In git, say you start with a working directory that matches the latest thing 
in the repository. You change files AA, BB, and CCC, and you add file DD. 
The changes to the first two fix a bug, and the other two add a 
configuration option.

git add AA BB
git commit -m "fix bug"
git add CCC DD
git commit -m "add configuration option"

Let's say you then change 173 files, converting all single quotes to double 
quotes. You can then say

git add -A
git commit -m "change quote style"

> not sure what you mean by "Darcs needs you to do that all in one step".

I mean that gathering up the changes and committing them sounds like a 
single step in Darcs. In git, I can say

wings3d my_model.wings
git add my_model.wings
gimp my_image.jpg
git add my_image.jpg
vi configuration.ini
git add configuration.ini
git commit -m "add a textured model with some configuration"

>> The staging area lies between the repository and the working directory.
> So, wait, there's a third file storage area?

Yes. That's where you build up the next commit. It's called the staging 
area, or the index.
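
If it helps, here is a toy Python model of the three areas, replaying the AA/BB/CCC/DD example from above. `ToyRepo` is invented for illustration and stores whole files as strings; the point is only that a commit snapshots the index, not the working directory.

```python
class ToyRepo:
    def __init__(self, files):
        self.commits = [("initial", dict(files))]  # list of (message, snapshot)
        self.index = dict(files)    # staging area: contents of the next commit
        self.workdir = dict(files)  # files as they sit on disk

    def add(self, *paths):
        """Stage the working-directory version of the given paths."""
        for p in paths:
            self.index[p] = self.workdir[p]

    def commit(self, message):
        """Record a snapshot of the index, not of the working directory."""
        self.commits.append((message, dict(self.index)))

repo = ToyRepo({"AA": "v1", "BB": "v1", "CCC": "v1"})
# Edit three files and create a fourth, all before committing anything.
repo.workdir.update({"AA": "v2", "BB": "v2", "CCC": "v2", "DD": "v1"})

repo.add("AA", "BB")
repo.commit("fix bug")
repo.add("CCC", "DD")
repo.commit("add configuration option")

print(repo.commits[1][1])  # {'AA': 'v2', 'BB': 'v2', 'CCC': 'v1'}
print(repo.commits[2][1])  # {'AA': 'v2', 'BB': 'v2', 'CCC': 'v2', 'DD': 'v1'}
```

The "fix bug" snapshot still has the old CCC and no DD, even though both were already changed on disk.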

>> Then I use something like "git add" to add all the changes from the WD
>> to the staging directory. Or I use "git add -i" (or, more likely, the
>> GUI) to diff the WD against the staging area (or the repository), pick
>> (say) three of the five diff hunks, and then create a new temp file that
>> holds the repository with those three diff hunks applied, which I then
>> put in the staging area. When I have everything the way I like, I commit
>> the change, which copies the staging area into the repository and then
>> adds a commit object pointing to it.
>
> Damn that sounds complicated.

It's very simple with the gui. You start up the gui, and the top-left pane 
shows files that have changed that you haven't decided to commit yet. 
The bottom-left pane shows files that'll be in the commit. The right side is 
the listing of the diffs for whatever file you've highlighted.

If you want to put half the changes from configuration.ini in your commit, 
you click on that, go over to the list of diffs, click on each one you want 
in the commit, then click the commit button.

Pretty trivial.

>>> This boggles my mind. Apparently I /don't/ understand how Git works at
>>> all, because the way it seems to work precludes two people touching the
>>> same file at the same time...
>>
>> Sure. But you're thinking git tracks diffs. That's exactly the point.
>
> I know Git doesn't track diffs - I just can't comprehend how that can
> actually work properly.

Because files are actually named by their SHA-1, so nobody ever touches two 
different copies of the same file at the same time.
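
A short Python sketch of how a file's name is derived from its contents. The "blob", size, NUL-byte header is the scheme git uses for file contents; the function name `git_blob_id` is mine.

```python
import hashlib

def git_blob_id(data: bytes) -> str:
    """Hash a 'blob <size>' header, a NUL byte, then the file's bytes."""
    header = b"blob " + str(len(data)).encode() + b"\0"
    return hashlib.sha1(header + data).hexdigest()

mine = git_blob_id(b"hello world\n")
yours = git_blob_id(b"hello world!\n")

print(git_blob_id(b""))  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
print(mine != yours)     # True
```

Two different edits of the same path get two different names, so neither can overwrite the other inside the repository.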

>> If
>> I change the file, and you change the file, then now there's three
>> files. The original, the new one I have, and the new one you have. When
>> we go to merge it, we create number four, which is your new one with the
>> differences between my version and the original applied.
>
> This just seems a very strange way to look at things. Generally you don't
> care about versions, you care about alterations. "Does this draft have the
> corrections to chapter 4 in it or not?"

And you can trivially tell that in git, not by looking at the files, but by 
looking at the commits.

> It seems to me that with the Git model, any time anybody edits any file, you
> create a new version of the entire repo that then has to be laboriously
> merged back into everybody else's repos. (Assuming no other edits have
> happened in the meantime.) What a clunky way to work.

It's not laborious to merge it in, any more than it's laborious to merge 
changes in Darcs into your repository and working directory.

If I clone a repository from you, your repository's URL is stored in my 
repository and called "origin" (by default). If I want to fetch all your 
changes, I say "git pull origin", which connects to your repository, gets 
the list of objects you have that I don't, and pulls them down. It also 
updates any branch names that you changed since last time I did that, so if 
you have a branch called "bugfix", I'll have a branch called 
"origin/bugfix". If Sally also cloned your repository and created a branch 
called bugfix, then I pulled from Sally also, I'll have a branch called 
"origin/bugfix" and one called "sally/bugfix".
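
The copying part of that can be sketched as a toy Python model, with repositories as plain dicts and a made-up `fetch` helper. It only shows objects being copied and branch names being recorded under the remote's name; it does no merging.

```python
def fetch(local, remote_name, remote):
    """Copy missing objects and mirror the remote's branches under a prefix."""
    for obj_id, obj in remote["objects"].items():
        local["objects"].setdefault(obj_id, obj)
    for branch, tip in remote["branches"].items():
        local["remote_branches"][f"{remote_name}/{branch}"] = tip

me = {"objects": {"c1": "base"}, "remote_branches": {}}
you = {"objects": {"c1": "base", "c2": "your fix"}, "branches": {"bugfix": "c2"}}
sally = {"objects": {"c1": "base", "c3": "her fix"}, "branches": {"bugfix": "c3"}}

fetch(me, "origin", you)
fetch(me, "sally", sally)
print(me["remote_branches"])  # {'origin/bugfix': 'c2', 'sally/bugfix': 'c3'}
```

Both "bugfix" branches coexist because each is filed under the name of the repository it came from.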

>>> And the "minor detail" that if 200 people edit the same file, that's 200
>>> separate branches which have to be manually merged back together again.
>>
>> And this differs from any other VCS how?
>
> With a centralised system, usually it's a check-in / check-out model, so
> only one person can edit a file at once.

Um, no, not for the last 15 years or so. Not even CVS did things that way, 
let alone SVN.  Some systems work like that, yes, but they work really, 
really poorly when you have 200 people working on the same files, which is 
why people moved to CVS in the first place.

> With something like Darcs, there are now 200 change-sets, each of which is
> only in some repos. Copy the change-sets around and everything is in sync
> again. No need for complex "merge" operations or tangled file histories.

Of course you need to merge them, and of course you'll have tangled file 
histories. If all 200 people change the same part of the file, you'll have 
200 merge conflicts. If everyone is passing around partial change sets and 
making more changes that are dependent on those changes, you'll have a 
tangled file history.

>> Note that if you're trying to *push* changes to a remote repository, you
>> have to do it to a branch where nobody else has branched off since you
>> did.
>
> And what the hell are the chances of that ever happening? If every time
> anybody touches any file it generates a new branch, then there's no chance
> of ever being able to push changes back.

http://www.kernel.org/pub/software/scm/git/docs/git-rebase.html

Basically, you say "go look at the changes I made since I branched off the 
upstream repository, then apply those same changes to the new head of the 
upstream repository, and submit *that* as the new commit."
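
The idea can be sketched in a few lines of Python. This is a toy model 
under loud assumptions: each commit just sets one file's content, the ids 
"c1", "c2", ... are invented rather than real hashes, and conflicts are 
ignored entirely. Real git replays diffs and stops when one doesn't apply.

```python
# Toy model only: each commit is (parent_id, change), where a change sets
# one file's content. Conflict handling is deliberately left out.

commits = {}

def commit(parent, path, content):
    cid = f"c{len(commits) + 1}"   # hypothetical ids, not real hashes
    commits[cid] = (parent, (path, content))
    return cid

def ancestry(head):
    """Commit ids from the root up to head, oldest first."""
    chain = []
    while head is not None:
        chain.append(head)
        head = commits[head][0]
    return chain[::-1]

def checkout(head):
    """Build the file tree by applying every change from the root to head."""
    tree = {}
    for cid in ancestry(head):
        path, content = commits[cid][1]
        tree[path] = content
    return tree

def rebase(my_head, upstream_head):
    """Replay my commits that upstream lacks on top of upstream_head."""
    upstream = set(ancestry(upstream_head))
    new_head = upstream_head
    for cid in ancestry(my_head):
        if cid not in upstream:
            path, content = commits[cid][1]
            new_head = commit(new_head, path, content)
    return new_head

base = commit(None, "README", "v1")
mine = commit(base, "feature.c", "my work")         # my local change
upstream = commit(base, "driver.c", "their work")   # landed in the meantime
rebased = rebase(mine, upstream)
print(checkout(rebased))
```

After the rebase, my change sits directly on top of the current upstream 
head, so pushing it no longer requires anyone else to merge anything.
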

> And hope that the DEV branch doesn't change while you're busy trying to
> catch up. Still, I suppose if you repeat this cycle enough times, eventually
> you might get lucky and be able to perform the push.

If the DEV branch is changing that quickly, it means *someone* is going 
through this cycle successfully.   You're complaining that nobody goes to 
that restaurant any more because it's always too crowded.

The rebase is a single step, unless there are merge conflicts, so it's 
basically bound by network and CPU.

>>>> If you want to merge someone's repository into yours, you simply copy
>>>> from them any files or names that they have that you don't, and you're
>>>> done. You're merged.
>>>
>>> It would be nice if Darcs worked that way.
>>
>> Right. In Darcs, you have to merge all the changes. In git, you have to
>> merge all the changes.
>
> No, I meant it would be nice if the Darcs repo format allowed you to update
> a repo just by copying some files.

Well, git *does* store stuff in files, so technically you could copy the 
files. But by "copy files" I mean "use git to copy the new files."  As in, 
"you don't have to run any diffs or patches or anything".

> Darcs also doesn't explicitly support the "bare" format that Git does,
> despite it being obviously useful.

Which is why you don't have trouble with the rebasing. You only need to 
rebase when you're pushing to a bare repository without human 
intervention. Basically, if you're sending changes to a bare repository 
with nobody on the other end to merge them, you have to prove you've 
already resolved any merge conflicts your changes would otherwise impose 
on someone who later updates from that bare repository.

>>>> Now if you want to incorporate their changes into
>>>> your work, you generate a diff between their latest version and some
>>>> earlier version, and apply that diff to your latest version, and you're
>>>> merged.
>>>
>>> What a backwards way to look at it.
>>
>> Only if you're used to looking at source control as a series of diffs to
>> start with. But that's (A) exactly what makes git hard to understand and
>> (B) exactly what makes git brilliant. :-)
>
> So doing things the hard way is brilliant?

Building an entire type theory to discuss isomorphic idempotent change sets 
is the easy way?

-- 
Darren New, San Diego CA, USA (PST)
   "Coding without comments is like
    driving without turn signals."

From: Darren New
Subject: Re: Git tutorial
Date: 21 Apr 2011 12:23:11
Message: <4db059ef$1@news.povray.org>

On 4/21/2011 2:35, Invisible wrote:
>>> Given that it's the repository format used by Linux developers, I think
>>> it's safe to say it works adequately for multiple people editing the
>>> file in parallel.
>>
>> This boggles my mind. Apparently I /don't/ understand how Git works at
>> all...
>
> Perhaps I can summarise:
>
> The Darcs workflow. I download the source code, make a small edit to it, ask
> Darcs to record that, and send the changes to the developers. They add it to
> the central repo, which checks whether the bit I just edited has changed
> since I got my copy. If not [which is quite likely], the change is added to
> the central repo. Done.

That's exactly how git works, workflow-wise, if you want.

http://book.git-scm.com/5_git_and_email.html

You *also* have the option of telling the other guy the URL of your 
repository and having him suck it in, or of pushing your changes to a 
(possibly bare and unattended) repository.

> The Git workflow. I download the source code, make a small edit to it, and
> ask Git to record that. Git takes a complete record of every file in the
> entire repo.

Uh, no. Not if you're doing it manually. If you're automating it by cloning 
the repository or pulling and pushing changes over git protocols without 
human intervention, then yes. But since your repository tracks what you got 
from other people, you already know about changes you made that they don't have.

> I send that to the developers, and they try to add it to the
> central repo, which makes Git check whether any unrelated changes have
> happened anywhere in the entire repo since I made my change. Since it is
> 100% guaranteed that this will have happened, the merge fails.

Wow. Advice: Don't talk to anyone about how git works, since even if you 
think you've figured it out, you haven't.

> Obviously this model cannot possibly work, so there must be something I'm
> misunderstanding about how Git works.

Ah, thank you.

Yes. You're thinking that you can't add a commit to a repository without 
merging it in. You're still thinking there's one authoritative set of files. 
There isn't. There are bunches of snapshots.

So when you send your change to the maintainers, you're sending them new 
files. There are no merge conflicts, because there is no merge. Every 
changed file has a new sha-1, so it's a new file. git can look at the 
repository and know where in your working directory each file belongs 
(because some of the files in the repository are directories), but in the 
repository itself it's all just a big flat bag of sha-1 files.
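
To make the "flat bag of sha-1 files" concrete, here is a small Python 
sketch of a content-addressed store. The blob_id function follows the way 
git names a blob (sha-1 over a "blob <size>" header, a NUL byte, then the 
content). The store dict and the put and receive helpers are hypothetical 
stand-ins for the object database, not git's actual code.

```python
import hashlib

def blob_id(data: bytes) -> str:
    # git names a blob as sha1(b"blob <size>\0" + content)
    header = b"blob " + str(len(data)).encode() + b"\0"
    return hashlib.sha1(header + data).hexdigest()

store = {}   # hypothetical stand-in for the repository's object database

def put(data: bytes) -> str:
    """Add content to the bag; identical content always gets the same id."""
    oid = blob_id(data)
    store[oid] = data
    return oid

def receive(objects: dict) -> list:
    """Accept objects sent by someone else; return the ids that were new."""
    new = [oid for oid in objects if oid not in store]
    for oid in new:
        store[oid] = objects[oid]
    return new

v1 = put(b"hello world\n")
v2 = put(b"hello world, edited\n")   # changed content, so a new id
same = put(b"hello world\n")         # unchanged content, same id as v1

# Someone sends one object I already have and one I don't.
theirs = b"their version\n"
incoming = {v1: b"hello world\n", blob_id(theirs): theirs}
added = receive(incoming)
print(len(store), added)
```

Receiving objects only ever adds entries under new ids, which is why 
sending changes to someone cannot by itself produce a conflict. Merging 
only comes up later, when two snapshots are combined into one set of 
working files.
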

So sending your changes to someone else doesn't cause merge conflicts, and 
there's no need to back them out.

When I want to put your changes into *my* version of the files, I have to 
merge it, just like Darcs.

The equivalent description would be that if you changed something in Darcs 
and I made a change in my working directory, you could no longer send me 
patches until I threw away all my working-directory changes, in case there 
was a conflict.

-- 
Darren New, San Diego CA, USA (PST)
   "Coding without comments is like
    driving without turn signals."

From: Darren New
Subject: Re: Git tutorial
Date: 21 Apr 2011 12:27:48
Message: <4db05b04$1@news.povray.org>

On 4/21/2011 7:46, Warp wrote:
>    I hate SVN. SVN projects are so easy to break accidentally, and if you
> don't know the exact reason, you could be fighting to fix it for a long
> time.

I agree. I've *never* gotten an svn branch to merge back. I have *always* 
built a patch, and then applied the patch to the head.

>    SVN projects are extremely fragile. SVN has the totally braindead idea
> of putting a .svn project directory on each single subdirectory in the
> project. (This is very unlike git, which keeps one single project directory
> under the main directory where the project resides.)

While it lets you check out only parts of the project, it's a PITA to deal 
with. Especially when you want to recursively grep your code for the last 
vestiges of variable abcxyz, and it's all over inside the files under .svn.

git doesn't do that because a git repository is basically a big flat bag of 
files with no possibility of conflicting file names, so there's no 
subdirectories possible. :-)

But yah, all that stuff is a pain. I wound up having to toss and check out 
the entire local copy at least once a month at work, just because crap would 
break and I couldn't fix it locally.

-- 
Darren New, San Diego CA, USA (PST)
   "Coding without comments is like
    driving without turn signals."
