{"id":3647,"date":"2013-12-05T09:50:57","date_gmt":"2013-12-05T11:50:57","guid":{"rendered":"http:\/\/blog.plataformatec.com.br\/?p=3647"},"modified":"2014-09-19T12:05:37","modified_gmt":"2014-09-19T15:05:37","slug":"sharing-large-repositories-with-your-team","status":"publish","type":"post","link":"https:\/\/blog.plataformatec.com.br\/2013\/12\/sharing-large-repositories-with-your-team\/","title":{"rendered":"Sharing large repositories with your team"},"content":{"rendered":"

Hey, there! Here at Plataformatec we like to do project rotations. It means that every three months or so, developers can swap projects. It has lots of benefits like working with different people, getting out of the comfort zone, sharing skills and knowledge, and the best one: a new developer can spot problems that people working for a longer time in the project may not see, since they are used to it.<\/p>\n

But, it comes at a cost. Each project has its own setup process and may slow down the development. We\u2019re using Boxen from GitHub to solve this problem. It works very well and allows us to have a project environment quickly set up.<\/p>\n

But recently we ran into a problem that Boxen couldn’t solve. We had a project with multiple repositories, some of them very large. It would take a long time just to git clone<\/strong> the ones over 3GB in size.<\/p>\n

Our first thought was to create a tar file with gzip or lzma compression. The trouble would come when extracting it, since file ownership, permissions and symlinks could all cause problems. So, the solution was to git clone<\/strong> the smaller repos and git bundle<\/strong> the larger ones. Git bundle<\/a> ships with git, but only a few people know about it.<\/p>\n

The workflow we have is simple. Someone who already has the repo cloned and up to date with origin types the following command:<\/p>\n

\r\n$ git bundle create .bundle master\r\n<\/pre>\n

It will create a file called .bundle with a compressed version of the repository containing the master history. This file can then be shared among developers via a flash drive, AirDrop or netcat (using the internal network), and that takes far less time than cloning the repository.<\/p>\n
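
Before handing the file over, it can be worth checking what actually went into it. The sketch below is only an illustration: it builds a throwaway repository under a temporary directory (the paths, the file name project.bundle and the commit identity are all made up for the example), creates a bundle with the same command form as above, and then inspects it with git bundle verify and git bundle list-heads.

```shell
#!/bin/sh
set -eu

# Throwaway repository so the sketch is self-contained (hypothetical paths).
workdir=$(mktemp -d)
git init -q "$workdir/project"
cd "$workdir/project"
# Force the branch name used in the post, whatever the local default is.
git symbolic-ref HEAD refs/heads/master
echo "hello" > README
git add README
git -c user.name="Example" -c user.email="example@example.com" \
    commit -q -m "Initial commit"

# Same command form as in the post, with an explicit (made-up) file name.
bundle="$workdir/project.bundle"
git bundle create "$bundle" master

# verify checks that the bundle file is valid and usable from this repository.
git bundle verify "$bundle"

# list-heads prints the refs stored inside the bundle.
heads=$(git bundle list-heads "$bundle")
echo "$heads"
```

If the branch you expect does not show up in the list-heads output, the person receiving the bundle will not be able to clone it with -b for that branch.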

Now that you have the file in hand, two steps are needed to get it working properly. The first one is cloning a repository from the bundle. This can be achieved by:<\/p>\n

\r\n$ git clone .bundle -b master\r\n<\/pre>\n

If you want to clone into a path other than the current working directory, you can pass the path after the -b master option. As you can see, the repo is cloned from a bundle file just as it would be from a URL or a bare git repo on your machine. That said, take a look at your origin remote.<\/p>\n

\r\n$ git remote show origin\r\n\r\n* remote origin\r\n  Fetch URL: \/path\/to\/.bundle\r\n  Push  URL: \/path\/to\/.bundle\r\n\u2026<\/pre>\n

It points to the bundle file, so every time you fetch or push, git will try to do so against this bundle file. To fix that, we go to the second step, which is setting the proper remote URL.<\/p>\n

\r\n$ git remote set-url origin \r\n<\/pre>\n

That\u2019s it. You\u2019re ready to go. If you have multiple repositories to share, you can create a script to automate the cloning and the URL setting for the origin. You can share all the repos together with this script, for faster and easier setups.<\/p>\n
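
As a rough idea of what such a script could look like, here is a minimal sketch. The helper name clone_from_bundle, its argument order and every path in the demo are assumptions made for this example, not something we ship. The function does the two steps described above: it clones from the bundle and then points origin at the real remote. The demo part builds a small local repository so the whole thing can be run without network access.

```shell
#!/bin/sh
set -eu

# clone_from_bundle BUNDLE BRANCH DEST URL  (hypothetical helper)
# Step 1: clone DEST from the bundle file. Step 2: point origin at URL.
clone_from_bundle() {
  git clone -q -b "$2" "$1" "$3"
  git -C "$3" remote set-url origin "$4"
}

# --- demo with a throwaway repository (all paths are made up) ---
workdir=$(mktemp -d)
git init -q "$workdir/upstream"
git -C "$workdir/upstream" symbolic-ref HEAD refs/heads/master
echo "hello" > "$workdir/upstream/README"
git -C "$workdir/upstream" add README
git -C "$workdir/upstream" -c user.name="Example" \
    -c user.email="example@example.com" commit -q -m "Initial commit"
git -C "$workdir/upstream" bundle create "$workdir/upstream.bundle" master

# In a real setup script there would be one call per shared repository,
# each with its own bundle file and its real remote URL.
clone_from_bundle "$workdir/upstream.bundle" master \
    "$workdir/copy" "$workdir/upstream"

# Confirm that origin no longer points at the bundle file.
origin_url=$(git -C "$workdir/copy" config --get remote.origin.url)
echo "$origin_url"
```

Keeping the clone and the set-url in one function means nobody ends up with a working copy whose origin still points at a file on a flash drive.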

Example<\/h2>\n

The Ruby language repository is about 200MB. That is not big enough to require a bundle, but it makes a nice example.<\/p>\n

The first step is cloning the repo:<\/p>\n

\r\n$ time git clone https:\/\/github.com\/ruby\/ruby.git\r\nCloning into 'ruby'...\r\nremote: Finding bitmap roots...\r\nremote: Reusing existing pack: 269821, done.\r\nremote: Counting objects: 2813, done.\r\nremote: Compressing objects: 100% (1403\/1403), done.\r\nremote: Total 272634 (delta 1707), reused 2213 (delta 1390)\r\nReceiving objects: 100% (272634\/272634), 136.77 MiB | 1.20 MiB\/s, done.\r\nResolving deltas: 100% (210263\/210263), done.\r\nChecking connectivity... done\r\nChecking out files: 100% (4187\/4187), done.\r\n\r\nreal\t4m51.501s\r\nuser\t1m38.837s\r\nsys\t0m18.808s\r\n<\/pre>\n

As you can see, it takes almost five minutes to clone the full repository – the time may vary depending on your bandwidth. So, now we’re gonna create a bundle file and then clone a new repo from it:<\/p>\n

\r\n$ cd ruby\/\r\n# Just a reminder, the main branch of ruby repo is not master, it's trunk.\r\n$ git bundle create \/tmp\/ruby.bundle trunk\r\n<\/pre>\n

Now that we’ve created a bundle file and placed it in \/tmp<\/strong>, we just need to clone it:<\/p>\n

\r\n$ cd \/tmp\/\r\n$ time git clone \/tmp\/ruby.bundle -b trunk ruby\r\nCloning into 'ruby'...\r\nReceiving objects: 100% (206590\/206590), 84.16 MiB | 22.43 MiB\/s, done.\r\nResolving deltas: 100% (158583\/158583), done.\r\nChecking connectivity... done\r\nChecking out files: 100% (4187\/4187), done.\r\n\r\nreal\t0m46.490s\r\nuser\t1m9.339s\r\nsys\t0m10.021s\r\n<\/pre>\n

Cloning from the bundle file was much faster and took less than a minute. Now, in order to pull and fetch changes, you need to set the remote URL:<\/p>\n

\r\n$ git remote set-url origin https:\/\/github.com\/ruby\/ruby.git\r\n<\/pre>\n
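
To see why this last step matters, the sketch below replays the whole flow locally. It is only an illustration under assumptions: a small throwaway repository stands in for the real remote, and the paths, the helper make_commit and the commit identity are invented for the example. The remote gets a new commit after the bundle is created, and the clone can only fetch it once origin points at the remote instead of the bundle file.

```shell
#!/bin/sh
set -eu

workdir=$(mktemp -d)

# Hypothetical helper: adds one commit to the stand-in remote repository.
make_commit() {
  echo "$1" >> "$workdir/upstream/log.txt"
  git -C "$workdir/upstream" add log.txt
  git -C "$workdir/upstream" -c user.name="Example" \
      -c user.email="example@example.com" commit -q -m "$1"
}

# Stand-in for the real remote, with trunk as its main branch.
git init -q "$workdir/upstream"
git -C "$workdir/upstream" symbolic-ref HEAD refs/heads/trunk
make_commit "First commit"
git -C "$workdir/upstream" bundle create "$workdir/upstream.bundle" trunk

# Clone from the bundle, then point origin at the stand-in remote.
git clone -q -b trunk "$workdir/upstream.bundle" "$workdir/copy"
git -C "$workdir/copy" remote set-url origin "$workdir/upstream"

# The remote moves on after the bundle was created.
make_commit "Second commit"

# Fetch now talks to the remote, not to the bundle file.
git -C "$workdir/copy" fetch -q origin
behind=$(git -C "$workdir/copy" rev-list --count trunk..origin/trunk)
echo "commits to pull: $behind"
```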

Enjoyed this post? Was it as useful for you as it was for us? Tell us your stories in the comments below! See you!<\/p>\n","protected":false},"excerpt":{"rendered":"

Hey, there! Here at Plataformatec we like to do project rotations. It means that every three months or so, developers can swap projects. It has lots of benefits like working with different people, getting out of the comfort zone, sharing skills and knowledge, and the best one: a new developer can spot problems that people … \u00bb<\/a><\/p>\n","protected":false},"author":28,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"ngg_post_thumbnail":0,"footnotes":""},"categories":[1],"tags":[],"aioseo_notices":[],"jetpack_sharing_enabled":true,"jetpack_featured_media_url":"","_links":{"self":[{"href":"https:\/\/blog.plataformatec.com.br\/wp-json\/wp\/v2\/posts\/3647"}],"collection":[{"href":"https:\/\/blog.plataformatec.com.br\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blog.plataformatec.com.br\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blog.plataformatec.com.br\/wp-json\/wp\/v2\/users\/28"}],"replies":[{"embeddable":true,"href":"https:\/\/blog.plataformatec.com.br\/wp-json\/wp\/v2\/comments?post=3647"}],"version-history":[{"count":6,"href":"https:\/\/blog.plataformatec.com.br\/wp-json\/wp\/v2\/posts\/3647\/revisions"}],"predecessor-version":[{"id":4214,"href":"https:\/\/blog.plataformatec.com.br\/wp-json\/wp\/v2\/posts\/3647\/revisions\/4214"}],"wp:attachment":[{"href":"https:\/\/blog.plataformatec.com.br\/wp-json\/wp\/v2\/media?parent=3647"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blog.plataformatec.com.br\/wp-json\/wp\/v2\/categories?post=3647"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blog.plataformatec.com.br\/wp-json\/wp\/v2\/tags?post=3647"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}