#4691 Fix last commit info

v1.0.0
closed
sf-16 (2)
General
2015-08-20
2012-08-10
Cory Johns
No

The following corner cases exist for displaying last commit information correctly. There should be explicit test cases for these cases. It will almost certainly require a change to the LastCommitDoc structure.

  • Two identical files are committed at different times in the same tree level:
    • commit1 = Add /file1
    • commit2 = Add /file2
      • hash(/file1) == hash(/file2)
  • Two identical files are committed at different times in different tree levels:
    • commit1 = Add /dir1/file1
    • commit2 = Add /dir2/file2
      • hash(/dir1/file1) == hash(/dir2/file2)
      • hash(/dir1/) == hash(/dir2/)
    • commit3 = Add /dir2/file3
      • hash(/dir2/file2) != hash(/dir2/file3)
      • hash(/dir1/) != hash(/dir2/)
  • Two identical copies of a directory are committed at different times (e.g., recursive copy of dir):
    • commit1 = Add /dir1/file1
    • commit2 = Add /dir1-copy/file1
      • hash(/dir1/file1) == hash(/dir1-copy/file1)
      • hash(/dir1/) == hash(/dir1-copy/)
    • (i.e., trees, like files, can have matching hashes but different commit info)
  • File is changed then reverted to original content:
    • commit1 = Add /file1
      • hash(/file1) == F0
      • hash(/) == T0
    • commit2 = Change /file1
      • hash(/file1) == F1
      • hash(/) == T1
    • commit3 = Revert /file1
      • hash(/file1) == F0
      • hash(/) == T0
    • (view as of commit3 should show commit3 message, but view as of commit1 should show commit1 message despite identical hashes)
  • All of these cases also have to work across repos, though ideally with minimal duplication of data (forks)

  • Given a commit ID and a path, the solution needs to be able to find the most recent values for the tree's nodes that are not newer than the given commit. This means we need to have some way of determining ordering: is the tree information for commit ID X before or after the info for commit ID Y.

Related

Tickets: #4549
Tickets: #4935
Tickets: #5341
Tickets: #5514

Discussion

1 2 > >> (Page 1 of 2)
  • Cory Johns - 2012-08-10
    • Description has changed:

    Diff:

    --- old
    +++ new
    @@ -32,3 +32,5 @@
             * hash(`/`) == T0
         * (view as of commit3 should show commit3 message, but view as of commit1 should show commit1 message despite identical hashes)
     * All of these cases also have to work across repos, though ideally with minimal duplication of data (forks)
    +
    +* Given a commit ID and a path, the solution needs to be able to find the most recent values for the tree's nodes that are not newer than the given commit.  This means we need to have some way of determining ordering: is the tree information for commit ID X before or after the info for commit ID Y.
    
     
  • Dave Brondsema

    Dave Brondsema - 2012-08-10
    • milestone: forge-aug-24 --> forge-sep-07
     
  • Cory Johns - 2012-08-21

    It's looking more and more like a tree walking solution is the only way to get this 100% correct. However, there are some optimizations that can potentially be done to make tree walking less expensive.

    Some ideas:

    • Walk the tree in chunks by using the CommitRunDoc structure
    • Process the entire tree node as a unit instead of iteratively computing each child at a time
    • Precompute "interesting" points, such as branch HEADs
    • Caching on-demand computations

    Alternatively, we might want to investigate further and profile the tradeoffs of just getting this info from the file system would be.

     
  • Dave Brondsema

    Dave Brondsema - 2012-08-24
    • milestone: forge-sep-07 --> forge-sep-21
     
  • Dave Brondsema

    Dave Brondsema - 2012-09-13

    If we do it on-demand, a git implementation should probably use rev-list. E.g. git rev-list --max-count=1 commit-or-symbolic -- path/to/file.txt

     
  • Dave Brondsema

    Dave Brondsema - 2012-11-16
    • summary: Fix last commit info for corner cases --> Fix last commit info
    • size: --> 4
    • milestone: forge-backlog --> forge-nov-30
     
  • Cory Johns - 2012-11-26
    • status: open --> in-progress
     
  • Cory Johns - 2012-11-30
    • status: in-progress --> code-review
    • qa: Tim Van Steenburgh
     
  • Cory Johns - 2012-11-30

    allura:cj/4691

    Currently setup to fall back to existing LastCommitDoc data until the new format is pre-populated or, if the old-style data is missing, auto-vivify in the new format.

    Still having some performance issues with the pre-populate of LCD data. It may simply be an artifact of the increasing tree size, but it grows steadily so I wonder if there's some kind of leak that I'm not accounting for.

     
  • Dave Brondsema

    Dave Brondsema - 2012-11-30
    • Milestone: forge-nov-30 --> forge-dec-14
     
  • Leaf node (file) hrefs are incorrectly generated (they contain /tree/).

     
    • status: code-review --> in-progress
     
  • Cory Johns - 2012-12-05
    • status: in-progress --> code-review
     
    • status: code-review --> in-progress
     
  • https://sf-tvansteenburgh-7025.sb.sf.net/p/test4691/git2/ci/b140e04c6094dd861a0b3b37dcf245112107e1d3/tree/file2

    File '/home/tvansteenburgh/tvansteenburgh-7029/forge/Allura/allura/model/repo.py', line 616 in prev_commit
      tree = prev_commit and prev_commit.get_path(self.tree.path().rstrip('/'))
    File '/home/tvansteenburgh/tvansteenburgh-7029/forge/Allura/allura/model/repo.py', line 346 in get_path
      if path[0] == '/': path = path[1:]
    IndexError: string index out of range
    
     
  • Cory Johns - 2012-12-06
    • status: in-progress --> code-review
     
  • I pushed some small bug fixes to origin/cj/4691, so make sure you pull those.

    Latest test was pushing two new (no history in mongo) repos to the sandbox - Allura (git) and vim (hg).

    The refresh succeeded without error, and the repo browser worked fine, but there are no (or very few) last_commit_docs in mongo. Fixing this may force us to get to the root of the performance issues.

     
    • status: code-review --> in-progress
     
  • Dave Brondsema

    Dave Brondsema - 2012-12-14
    • milestone: forge-dec-14 --> forge-jan-11
     
  • Dave Brondsema

    Dave Brondsema - 2012-12-18
    • size: 4 --> 8
     
  • Dave Brondsema

    Dave Brondsema - 2013-01-11
    • Milestone: forge-jan-11 --> forge-jan-25
     
  • Cory Johns - 2013-01-23
    • status: in-progress --> code-review
     
  • Cory Johns - 2013-01-23

    allura:cj/4691

    Could maybe squash some of the commits together to clean up the history. shrug

     
  • Dave Brondsema

    Dave Brondsema - 2013-01-25
    • Milestone: forge-jan-25 --> forge-feb-08
     
1 2 > >> (Page 1 of 2)

Log in to post a comment.