I've been using source control systems for a long time (CVS for an OS course at uni, then SVN at work for a while, now migrating to Git). But I've never had to keep binary data under source control. I was wondering how exactly source control needs to be handled for binary files like images, music, and Maya files. I would imagine things like this are 'unmergeable', so conflicts would likely just completely break the file...
How are binary files stored in a version control system? How do you handle merges and conflicts, which are really easy to deal with for plain-text files but would be impossible with, say, a JPEG?
How do you merge plain-text files so easily? I've heard that some systems can perform automatic merging. I haven't worked (much) with CVS or SVN, to be honest, but at my last job (where we used a different version control system) merges were manual, someone would perform a code review on top of that, and the unit tests would be updated if they had to be. Using capable tools for manual merging is an entirely different story. I mean, the notion of an automatic merge (if that is at all what you mean) is to me like merging pictures by alpha blending.
(I'm joking, but you get my point.)
PS: After thinking a bit about it, the problem really is not the nature of the media, but the contrast between source and compiled output. You can store the source and the executable both in the repository, but only the source can be merged. Similarly, if you are creating art, you might have a layered vector format description and the resulting raster image. The raster result is not merge-able, but the vector original probably is.
PS II: It occurred to me yesterday that we used to merge Word documents sometimes. There is a feature in Word (I forget its name) that allows you to compare and merge formatted documents. My point is, to perform merging you need merging capabilities in the authoring tools, or other software to help you out. Whether there is such software for the format you use is the topic of another conversation.
Well, with plain text, merges are usually done based on diffs. Git is extremely good at this, Subversion not so much, but the logic is the same: compare the old file and the new file, then apply the differences to produce the new revision. Conflicts are usually pretty easy to resolve (especially with Git) because it's easy to see the differences between the two files and choose the right one. With a binary file that doesn't have human-readable diffs, it's impossible to merge properly if you need one part from one commit and another part from a different commit. Usually you can only pick one version or the other, which works OK I guess, but someone always ends up losing their work.
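To make the contrast concrete, here is a rough Python sketch of the idea, not how Git or Subversion actually implement it. The names `merge_lines` and `merge_binary` are made up for illustration, and the text version assumes all three versions have the same number of lines (real tools run a diff first to align lines that were inserted or deleted). The point is only that text can be compared line by line against a common ancestor, while an opaque binary file can only be taken whole.

```python
def merge_lines(base, ours, theirs):
    """Naive three-way merge of text, line by line.

    Simplifying assumption: all three versions have the same number of
    lines, so line i in one version corresponds to line i in the others.
    """
    merged, conflicts = [], []
    for i, (b, o, t) in enumerate(zip(base, ours, theirs)):
        if o == t:        # both sides agree (or neither changed the line)
            merged.append(o)
        elif o == b:      # only "theirs" changed this line
            merged.append(t)
        elif t == b:      # only "ours" changed this line
            merged.append(o)
        else:             # both changed the same line differently
            conflicts.append(i)
            merged.append(f"<<<<<<< ours\n{o}\n=======\n{t}\n>>>>>>> theirs")
    return merged, conflicts


def merge_binary(base, ours, theirs):
    """An opaque binary file is one indivisible unit: all or nothing."""
    if ours == theirs:
        return ours
    if ours == base:
        return theirs
    if theirs == base:
        return ours
    return None           # both sides changed it: someone has to pick one


base = ["width = 10", "height = 20", "color = red"]
ours = ["width = 12", "height = 20", "color = red"]
theirs = ["width = 10", "height = 20", "color = blue"]

merged, conflicts = merge_lines(base, ours, theirs)
print(merged, conflicts)

# The same two edits made to a binary blob cannot be combined.
print(merge_binary(b"\x00\x01\x02", b"\x00\xff\x02", b"\x00\x01\xff"))
```

In the text case both edits survive because they touch different lines. In the binary case both sides differ from the base, so the function gives up, which is the "someone always ends up losing their work" situation.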
What also happens is that repositories with lots of binary data in them usually end up growing in size very fast, because (nearly) full copies of each file end up stored in the history instead of just small diffs. Some source control systems use binary deltas, which track only the changes between revisions of a file so that less has to be stored (I'm 90% sure SVN does this), but that doesn't help with merge conflicts at all.
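As a rough illustration of what a binary delta is, here is a small Python sketch. It is not the format SVN or any other system uses; it borrows `difflib.SequenceMatcher` from the standard library as a stand-in for a real delta algorithm, and `make_delta` and `apply_delta` are hypothetical names. The delta is a list of "copy this range from the old revision" and "insert these literal bytes" instructions.

```python
import random
from difflib import SequenceMatcher


def make_delta(old: bytes, new: bytes):
    """Describe `new` as copies from `old` plus literal bytes."""
    ops = []
    matcher = SequenceMatcher(None, old, new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            ops.append(("copy", i1, i2))       # reuse old[i1:i2]
        elif tag in ("replace", "insert"):
            ops.append(("data", new[j1:j2]))   # bytes that must be stored
        # "delete" needs no instruction: those old bytes are simply skipped
    return ops


def apply_delta(old: bytes, ops) -> bytes:
    """Rebuild the new revision from the old one and the delta."""
    out = bytearray()
    for op in ops:
        if op[0] == "copy":
            out += old[op[1]:op[2]]
        else:
            out += op[1]
    return bytes(out)


# Made-up "binary file": 2000 pseudo-random bytes, then a two-byte edit.
rng = random.Random(42)
old = bytes(rng.getrandbits(8) for _ in range(2000))
new = old[:500] + b"\xff\xfe" + old[500:]

delta = make_delta(old, new)
stored = sum(len(op[1]) for op in delta if op[0] == "data")
print(f"full copy: {len(new)} bytes, literal bytes in delta: {stored}")
assert apply_delta(old, delta) == new
```

This saves space when an edit touches only part of the file. Many binary formats are compressed, though, so a small edit can change most of the bytes and the delta ends up nearly as large as a full copy. Either way, a delta only says how to get from one revision to the next; it gives you no way to combine two independent edits.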
Since direct merging is pretty much impossible, there must be some other convenient way to manage binary revisions efficiently. I could see it being a huge pain on a team of 40 or so devs (like for a game), where two artists may work on the same 3D model at the same time and both sets of changes are required, so you can't just discard one when a conflict occurs.