Data versioning

Definition

Versioning refers to saving new copies of your files when you make changes so that you can go back and retrieve specific versions of your files later.

Naming versions

When creating new versions of your files, record what changes are being made to the files and give the new files a unique name. Follow the general advice on the site for naming files, but also consider the following:

  • Include a version number, e.g. "v1," "v2," or "v2.1".
  • Include information about the status of the file, e.g. "draft" or "final," as long as you don't end up with confusing names like "final2" or "final_revised".
  • Include information about what changes were made, e.g. "cropped" or "normalized".
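The naming advice above can be sketched as a small helper. This is only an illustration: the function name `versioned_name`, its parameters, and the underscore separator are assumptions made for the example, not conventions prescribed by this guide.

```python
def versioned_name(stem, major, minor=None, status=None, change=None, ext=""):
    """Build a file name such as 'DMSSiteHome_v2.1_draft_cropped.jpg'.

    The underscore separator and the order of the parts are assumptions
    chosen for this sketch; adapt them to your own naming scheme.
    """
    version = f"v{major}" if minor is None else f"v{major}.{minor}"
    # Keep only the optional parts (status, change) that were supplied.
    parts = [stem, version] + [p for p in (status, change) if p]
    return "_".join(parts) + ext


print(versioned_name("DMSSiteHome", 2, ext=".jpg"))
print(versioned_name("DMSSiteHome", 2, 1, status="draft", change="cropped", ext=".jpg"))
```

Generating names from one function keeps the version number, status, and change note in the same order every time, which makes a directory of versions easier to sort and scan.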

Simple file versioning

One simple way to version files is to manually save new versions when you make significant changes. This works well if:

  • You don't need to keep a lot of different versions.
  • Only one person is working on the files.
  • The files are always accessed from one location.

The directory below shows multiple versions of a web page mock-up called DMSSiteHome.jpg. Note the use of v1, v2, etc. to indicate versions. The notations "FISH" and "SandC" indicate different images that were swapped into some versions, i.e. major changes that were made.

[Screenshot of the file versions in a directory. Image by Amy Hodge.]

Saving multiple versions lets you decide later that you prefer an earlier version. You can then revert to that version immediately instead of retracing your steps to recreate it.

This method of versioning depends on your remembering to save a new version whenever it is appropriate. It can also become confusing when several people collaborate on the same document.
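A short script can take some of the bookkeeping out of manual versioning. The sketch below copies a working file to a sibling file with the next unused version number. The function name `save_new_version`, the `_vN` suffix pattern, and the optional `label` argument are assumptions made for this example; the files in the screenshot may follow a different pattern.

```python
import re
import shutil
from pathlib import Path


def save_new_version(path, label=None):
    """Copy `path` to a sibling file carrying the next unused version number.

    Assumes versions are named '<stem>_v<N>[_<label>]<suffix>',
    e.g. 'DMSSiteHome_v2_FISH.jpg' (a hypothetical pattern).
    """
    src = Path(path)
    pattern = re.compile(
        re.escape(src.stem) + r"_v(\d+)(?:_.*)?" + re.escape(src.suffix)
    )
    # Collect the version numbers already present next to the working file.
    existing = [
        int(m.group(1))
        for f in src.parent.iterdir()
        if (m := pattern.fullmatch(f.name))
    ]
    next_version = max(existing, default=0) + 1
    name = f"{src.stem}_v{next_version}"
    if label:
        name += f"_{label}"
    dest = src.with_name(name + src.suffix)
    shutil.copy2(src, dest)  # copy2 also tries to preserve file timestamps
    return dest
```

Calling `save_new_version("DMSSiteHome.jpg", label="FISH")` after a significant change would leave the working file untouched and add a numbered copy beside it. You still have to decide when a change is significant enough to warrant a new version.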

Simple software options

Everyone at Stanford has access to two cloud services that offer version control features: Google Drive and Box.

Google Drive

Google Drive's word processing, spreadsheet, and presentation software automatically creates versions as you edit.

  • Any time you edit files created on Google Drive, new versions are saved as you go.
  • Version information includes who was editing the file and the date and time the new version was created.
  • You can also see what changed from one version to the next (or between the current version and any older version) and revert to a previous version at any time.

Pros: The real-time editing feature means that Google Drive works well for collaborating on files with multiple people. And because the files are on Google Drive, they are accessible from anywhere.

Cons: You are restricted to the software provided by Google, which may not have all the bells and whistles of your desktop word processing, spreadsheet, or presentation software.

More: Find out more about using Google Drive for Stanford.

Stanford Box

Stanford Box creates and tracks versions of your files for you.

  • Any type of document can be stored and versioned with Box.
  • The comments feature lets you indicate changes that have been made between versions.
  • Documents can be shared with others, and Box will track who uploaded or updated each file and when.

Pros: Box can automatically sync folders on your desktop to your Box account. Microsoft Excel, Word, and PowerPoint files, as well as Google Docs and Spreadsheets, can be edited directly within the Box interface.

Cons: Box does not offer real-time editing the way Google Drive does.

Add-on: The Box Edit add-on allows you to launch local editing of any type of file from your Box account. Saving the file automatically creates a new version back on your Box account.

More: Find out more about using Stanford Box.

Advanced software options

If you have more sophisticated version control needs, consider a distributed version control system such as Git. Files are kept in a repository; each user clones a copy of the repository, edits files in that copy, commits the changes locally, and then pushes the commits back to the shared repository.

Version control systems like Git are most often used by groups writing software, but they can be used for any kind of files or projects. Many people share their Git repositories on GitHub.
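As a rough illustration of the commit step, the sketch below drives the Git command line from Python. It assumes Git is installed, that `repo_dir` is an existing clone, and that your Git user name and email are already configured. The function names `commit_commands` and `commit_changes` are hypothetical helpers written for this example.

```python
import subprocess
from pathlib import Path


def commit_commands(message, paths=(".",)):
    """Return the Git commands that stage `paths` and commit them."""
    return [
        ["git", "add", "--", *paths],      # stage the changed files
        ["git", "commit", "-m", message],  # record them with a message
    ]


def commit_changes(repo_dir, message, paths=(".",)):
    """Run the staging and commit commands inside an existing repository."""
    for command in commit_commands(message, paths):
        # check=True raises CalledProcessError if Git reports a failure,
        # for example when there is nothing new to commit.
        subprocess.run(command, cwd=Path(repo_dir), check=True)
```

For example, `commit_changes("my-project", "Normalize measurements", ["data.csv"])` would record the current state of `data.csv` in the local repository. Sharing the commit with collaborators still requires a separate `git push`.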