  Re: Git tutorial  
From: Invisible
Date: 20 Apr 2011 12:03:49
Message: <4daf03e5$1@news.povray.org>
>> I had assumed that all DVCSs were the same, but I now see that at
>> least Git
>> and Darcs use fundamentally different models.
>
> GIT stores files, and deduces change sets from those files.
>
> Mercurial stores change sets, and deduces files from those changesets.

Doesn't appear to me that that's what happens, from what little 
Mercurial documentation I've read.

>> Fundamentally, any version control system tracks changes to files.
>
> No, git actually tracks entire files.

Fundamentally, VCSs are about tracking changes. Git might *implement* 
that by storing the entire file, but *logically* what you're trying to 
do is keep track of what you changed.

> And, technically, *every* file in the entire repository is stored.

Yes, I gradually came to that realisation. Git is managing the entire 
repo as a strictly linear sequence of unnumbered versions. (Until you 
explicitly create branches, anyway.)

>> (Presumably as a diff relative to the previous commit,
>
> Nope. It stores the entire repository. Now, if you don't change a file,
> it hashes to the same value, and hence doesn't need to get stored again.
> But the entire file is put into the repository.

How odd... Still, if you're not worried about the internal 
implementation, logically Git is versioning the whole repo as one unit, 
and that's all you need to know.
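
For what it's worth, here is how I picture the "store everything, but identical content only once" idea, as a toy Python sketch. The class and method names are made up for illustration, and real Git's object format is more involved than a plain hash of the bytes, so treat this as a model of the concept rather than of Git itself.

```python
import hashlib


class SnapshotStore:
    """Toy content-addressed store: each distinct file content is kept once."""

    def __init__(self):
        self.blobs = {}    # hash -> file content (bytes)
        self.commits = []  # each commit maps path -> hash for the whole tree

    def commit(self, files):
        """Record a snapshot of every file and return the commit's index."""
        tree = {}
        for path, content in files.items():
            digest = hashlib.sha1(content).hexdigest()
            # Content that hashes to a known value is not stored a second time.
            self.blobs.setdefault(digest, content)
            tree[path] = digest
        self.commits.append(tree)
        return len(self.commits) - 1

    def checkout(self, index):
        """Rebuild the full set of files for a given commit."""
        return {path: self.blobs[h] for path, h in self.commits[index].items()}


store = SnapshotStore()
first = store.commit({"a.txt": b"hello\n", "b.txt": b"world\n"})
second = store.commit({"a.txt": b"hello\n", "b.txt": b"world!\n"})
print(len(store.blobs))  # 3 blobs: a.txt is shared by both commits
```

Each commit still describes the whole tree, but only the one file that changed costs any extra storage.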

> That's why git doesn't have a "rename" command. git looks at the
> same contents disappearing from one part of the directory structure and
> showing up in another and says "Gee, that must have been a rename." If
> there are minor changes between what disappeared on this commit and what
> showed up somewhere else on that commit, git says "there's a 97%
> probability this was a renamed file."

o_O

OK, wow. I thought having to tell Darcs when I rename stuff was 
inconvenient, but this just sounds insane...
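
To convince myself the idea is at least workable, here is a rough Python sketch of guessing renames by comparing what vanished with what appeared. The function name, the threshold and the use of `difflib` as the similarity measure are all my own assumptions; I don't know what heuristic Git really uses.

```python
from difflib import SequenceMatcher


def detect_renames(old_tree, new_tree, threshold=0.5):
    """Pair paths that vanished with paths that appeared, by content similarity.

    Both trees map path -> file content (str). The threshold is arbitrary.
    """
    removed = {p: c for p, c in old_tree.items() if p not in new_tree}
    added = {p: c for p, c in new_tree.items() if p not in old_tree}
    renames = []
    for old_path, old_content in removed.items():
        best_path, best_score = None, 0.0
        for new_path, new_content in added.items():
            score = SequenceMatcher(None, old_content, new_content).ratio()
            if score > best_score:
                best_path, best_score = new_path, score
        if best_path is not None and best_score >= threshold:
            renames.append((old_path, best_path, best_score))
            del added[best_path]  # each new file can match only one old file
    return renames


old = {"src/util.py": "def f():\n    return 1\n"}
new = {"lib/util.py": "def f():\n    return 1\n"}
renames = detect_renames(old, new)
print(renames)  # [('src/util.py', 'lib/util.py', 1.0)]
```

Identical content scores 1.0; a file that was moved and lightly edited would score somewhat lower and still be reported as a probable rename.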

> The only reason git uses a pointer to earlier commits is when you merge
> things, you don't want to apply changes you already applied in an
> earlier merge.

And here I was thinking it was so you can revert to earlier versions if 
you want. You know - the entire purpose for a VCS to exist in the first 
place? ;-)

> Yeah, from the little I read about it, Darcs is another one of those
> "interesting" ideas. An actual mathematical system for defining a
> repository, like relational algebra did for databases.

It sounds simple enough. If this change affects line X and that change 
affects line Y, they are independent.

Ah, but wait. What if some change adds or removes lines? If change X 
adds a new line between lines 50 and 51 then change Y no longer affects 
line 150, it now affects line 151. But X and Y are still independent.
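
A small Python sketch of that example, just to show the bookkeeping. This is my own toy model with made-up helper names, not how Darcs actually represents patches: change X inserts a line after line 50, change Y edits what was originally line 150, and the two give the same file in either order as long as Y's line number is adjusted.

```python
def apply_insert(lines, after_line, text):
    """Insert text after the given 1-based line number."""
    return lines[:after_line] + [text] + lines[after_line:]


def apply_replace(lines, line_no, text):
    """Replace the given 1-based line."""
    return lines[:line_no - 1] + [text] + lines[line_no:]


def shift_for_insert(line_no, inserted_after):
    """Line number of an original line once one line was inserted above it."""
    return line_no + 1 if line_no > inserted_after else line_no


original = [f"line {n}" for n in range(1, 201)]

# Order 1: X first (insert after line 50), then Y, which now targets line 151.
x_then_y = apply_replace(
    apply_insert(original, 50, "new line"),
    shift_for_insert(150, 50),
    "edited",
)

# Order 2: Y first (replace line 150), then X at its original position.
y_then_x = apply_insert(apply_replace(original, 150, "edited"), 50, "new line")

print(x_then_y == y_then_x)  # True: same result either way
```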

There's more to it than meets the eye. Of course, if you just want to 
*use* Darcs, you just edit stuff and it "just works".

>> As far as I can tell, Git would require me to create a branch where I add
>> the comments, and another branch where I add the new code, and then merge
>> them back into the main branch, hoping that I don't get any conflicts. To
>> me, this seems like a lot more work and a lot more conceptual overhead.
>
> Nah. That's only if you want to have both at once working in parallel.

Isn't "working on both at once" kind of the entire point of distributed 
version control?

> That is, if you want one version with the comments but no function, and
> another version with the function but no comments, that's trivial in
> git. If you then want to combine them into a third version that has both
> comments and function, then you merge, which is also trivial unless you
> changed the same lines in both places. (I.e., it's as trivial as any
> other diff-patch based merge.)

And if you merge the comments branch into the main branch, and then 
somebody adds more stuff to the comments branch, then what?

>> It also seems that when you ask Git to perform a commit, you have to
>> tell it which files to record changes for.
>
> Sure. But you can say "add all changes" trivially. Or you can use
> interactive tools to commit just bits and pieces of this and that.

With Darcs, I tell it what files to watch, and then when I've finished 
editing stuff, I say "record this" and it shows me every modified line 
of every file and asks which modifications to keep. Git doesn't support 
recording half a file modification, and doesn't even figure out which 
files changed.

> This is trivial with GIT. I do it all the time. I'll be adding a new
> function, and while testing, realize there's a bug in some other
> function. So when everything works again, I'll do two commits, staging
> just particular hunks (in the diff sense of the word) and do two
> commits, one for the bugfix and one for the new change.

Given that Git can only record the new file or the old one, how is that 
possible?
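
The only way I can see it working is if the thing being committed is not the file on disk but an in-between version assembled from the old file plus the chosen hunks (I gather `git add -p` is the interactive tool for picking them). A Python sketch of that guess follows; the function name and the use of `difflib` to find the hunks are my assumptions, not Git's actual mechanism.

```python
from difflib import SequenceMatcher


def stage_hunks(old_lines, new_lines, wanted):
    """Build an intermediate file containing only the selected hunks.

    `wanted` holds the indices of the changed regions to include, in order.
    """
    staged, hunk_index = [], 0
    matcher = SequenceMatcher(None, old_lines, new_lines)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            staged.extend(old_lines[i1:i2])
            continue
        if hunk_index in wanted:
            staged.extend(new_lines[j1:j2])  # take the edited lines
        else:
            staged.extend(old_lines[i1:i2])  # leave this region as it was
        hunk_index += 1
    return staged


old = ["def a():", "    return 1", "", "def b():", "    return 2"]
new = ["def a():", "    return 10", "", "def b():", "    return 2",
       "", "def c():", "    return 3"]

bugfix_only = stage_hunks(old, new, {0})  # first commit: just the fix to a()
everything = stage_hunks(old, new, {0, 1})  # second commit: add c() as well
print(bugfix_only)
```

Each of the two commits would then be an ordinary whole-file snapshot, just of a file that never existed on disk in exactly that form.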

>> I wonder how well the illusion of one single sequence of file versions
>> works when you have multiple people editing the file in parallel.
>
> There's no single sequence of file versions. Every file is a new version.
>
> Given that it's the repository format used by Linux developers, I think
> it's safe to say it works adequately for multiple people editing the
> file in parallel.

This boggles my mind. Apparently I /don't/ understand how Git works at 
all, because the way it seems to work precludes two people touching the 
same file at the same time...

>> It just records what edits
>> happened, without recording their relative ordering [except where they
>> affect the same lines of code]. Git, on the other hand, appears to be
>> trying
>> to track what every file in the entire repository looked like in every
>> individual commit object.
>
> Yes, but since you have them all, you can recreate the diffs between any
> two versions whenever you want.

That's my point. If multiple people are editing the same files, you do 
*not* have all the changes.

>> You can email individual change-sets around, and this works.
>> Getting somebody else's changes just copies all change-sets from their
>> repository into yours. You can then resolve any conflicts.
>
> git is exactly the same, except it copies files instead of changes.

And the "minor detail" that if 200 people edit the same file, that's 200 
separate branches which have to be manually merged back together again.

> If you want to merge someone's repository into yours, you simply copy
> from them any files or names that they have that you don't, and you're
> done. You're merged.

It would be nice if Darcs worked that way.

> Now if you want to incorporate their changes into
> your work, you generate a diff between their latest version and some
> earlier version, and apply that diff to your latest version, and you're
> merged.

What a backwards way to look at it.
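
Still, spelling it out helps me see what is being claimed. Below is a Python sketch of the idea at whole-file granularity: compare both sides with the "earlier version" and only complain where both changed the same thing. The names are hypothetical, and as I understand it real tools apply the same rule to lines within a file rather than to whole files.

```python
def three_way_merge(base, ours, theirs):
    """Merge two snapshots (path -> content) against their common ancestor."""
    merged, conflicts = {}, []
    for path in sorted(set(base) | set(ours) | set(theirs)):
        b, o, t = base.get(path), ours.get(path), theirs.get(path)
        if o == t:
            result = o  # both sides agree (or neither touched it)
        elif o == b:
            result = t  # only they changed it: take their version
        elif t == b:
            result = o  # only we changed it: keep ours
        else:
            conflicts.append(path)  # both changed it, differently
            result = o
        if result is not None:  # None means the file was deleted
            merged[path] = result
    return merged, conflicts


base = {"main.c": "v1", "doc.txt": "v1"}
ours = {"main.c": "v2", "doc.txt": "v1"}
theirs = {"main.c": "v1", "doc.txt": "v2", "new.h": "v1"}
merged, conflicts = three_way_merge(base, ours, theirs)
print(merged)     # {'doc.txt': 'v2', 'main.c': 'v2', 'new.h': 'v1'}
print(conflicts)  # []
```

In this model, edits by different people only need manual attention when they touch the same thing.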

