Endy:Data storage

From OpenWetWare
Revision as of 14:29, 19 June 2007 by Ilya (talk | contribs) (4. Potential solutions)

1. The problem, briefly stated

We need an easy, secure and efficient way to store all our files:

  • individual user files (backup)
  • shared project files (centralized storage and backup)
  • old user and project files (centralized storage and backup)

2. Current specifications of our backup system

Most people use Bionet to store their files.


  • Bionet consists of two fiber channel disk filers (bionet and bionet2, located on the 3rd floor of building 68) with about 3TB of usable storage shared among several labs. Data on this system is mirrored to a NearStore R200 in building NE47 (as of 2007-06-04).
  • Storage space (as of 2007-06-04):
    • total: 175GB
    • available: 58GB
  • Problems with bionet:
    • Not enough space to back up all our files, e.g., microscope images
    • As of 2007-06-04, bionet and bionet2 are no longer covered by a support contract; the R200 is under a support contract paid for by the Biology Department. If the bionet filers fail, a copy of the data should still exist on the R200 in building NE47.
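
As a rough check on the figures above, the sketch below computes how much of the lab's bionet allocation is in use and whether the current microscope data (~170GB, from section 3) would fit in the remaining space. The function and variable names are illustrative only.

```python
def space_report(total_gb, available_gb):
    """Return (used_gb, percent_used) for a storage allocation."""
    used_gb = total_gb - available_gb
    percent_used = 100.0 * used_gb / total_gb
    return used_gb, percent_used


# Figures from this page, as of 2007-06-04.
TOTAL_GB = 175
AVAILABLE_GB = 58

used, pct = space_report(TOTAL_GB, AVAILABLE_GB)
print(f"used: {used}GB ({pct:.0f}% of the allocation)")

# Current microscope data (~170GB) compared with the free space on bionet.
MICROSCOPE_GB = 170
fits = MICROSCOPE_GB <= AVAILABLE_GB
print("microscope data fits in free space:", fits)
```

With these numbers the microscope images alone exceed the free space, which is the first problem listed above.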

3. Ideal specifications of our future backup system

(lab and individual data storage, sharing and backup needs - please list what you would like to have available)

  • Capacity: we want to be able to store all files in a single location
  • Easy: automatic backup
  • Secure: the backup system shouldn't be located in building 68, in case of a fire
  • Efficient: backing up or retrieving files should be speedy
  • Affordable

Types of data

  • Individual user data
    • active:
      • stored on: Bionet (easy to access, backed up), some on lab computers
      • size: ?
    • inactive:
      • stored on: Bionet (easy to access, backed up), some on lab computers, including shmoo (~10GB?)
      • size: ?
  • Project data
    • active:
      • stored on: Bionet (easy to access, backed up)
      • size: ?
    • inactive:
      • stored on: Bionet (easy to access, backed up)
      • size: ?
  • Microscope data
    • stored on: lab computer
    • current data: ~170GB
    • new data (within 6 months from June 2007): Samantha ~100GB, Jason ~500GB, Francois ?GB
    • total: ~800GB
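
The microscope total can be reproduced from the per-person estimates. The sketch below sums only the known figures and reports which ones are still missing; the names and the use of None for an unknown estimate are conventions chosen for this example.

```python
# Estimates from this page, in GB. None marks a figure that is not yet known.
current_gb = 170
new_gb = {"Samantha": 100, "Jason": 500, "Francois": None}


def known_total(current, new):
    """Sum the current data and every new-data estimate that is known."""
    return current + sum(v for v in new.values() if v is not None)


def unknown_names(new):
    """List the people whose estimate is still missing."""
    return [name for name, v in new.items() if v is None]


print("known subtotal (GB):", known_total(current_gb, new_gb))
print("estimates still missing:", unknown_names(new_gb))
```

The known figures sum to 770GB, so the ~800GB total leaves only about 30GB of headroom for the missing estimate.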

4. Potential solutions


  • Primary storage: 200GB on bionet in bldg 68 mirrored to R200 in bldg NE47.
  • Backup storage: ~170GB on R200 in NE47.

BioMicro plus own storage

  • Primary storage: 200GB on bionet in bldg 68 mirrored to R200 in bldg NE47.
  • Backup storage: a NAS box with a RAID 5 array would provide ~1.5TB of usable storage. Cost: ~$1,000.

Own storage

  • Primary storage: build or buy a storage server with a RAID array (e.g., 4 x 500GB drives in RAID 5 configuration would provide about 1.5TB of usable storage space). Cost: ~$1,500. Host it in the BioMicro Center on the 3rd floor of building 68.
  • Backup storage:
    • Either get a second identical server or a NAS box and host it in Tech Square (NE47). Cost: ~$1,500 for a server or ~$1,000 for a NAS box.
    • Or use MIT TSM (if an appropriate service is available; limited to 300GB per machine as of June 2007). Cost: unknown monthly fee (currently $7.50/month for 300GB).
  • Biosupport would provide maintenance for free.
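
The capacity figures above follow from the usual RAID 5 rule that one drive's worth of space goes to parity. A minimal sketch of that arithmetic is below, together with the up-front hardware cost of the two "own storage" backup variants, using the prices quoted on this page. Function and variable names are illustrative. Formatted capacity will come out somewhat lower in practice because of filesystem overhead.

```python
def raid5_usable_gb(num_drives, drive_gb):
    """Usable capacity of a RAID 5 array: one drive's worth is used for parity."""
    if num_drives < 3:
        raise ValueError("RAID 5 needs at least three drives")
    return (num_drives - 1) * drive_gb


# 4 x 500GB drives, as in the example above.
print("usable capacity (GB):", raid5_usable_gb(4, 500))

# Approximate prices from this page (USD).
SERVER_USD = 1500
NAS_USD = 1000

options = {
    "server + second server": SERVER_USD + SERVER_USD,
    "server + NAS box": SERVER_USD + NAS_USD,
}
for name, cost in options.items():
    print(f"{name}: ~${cost}")
```

The same rule gives the 750GB quoted further down for a 4 x 250GB TeraStation in RAID 5.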


The win.mit.edu Domain

MIT TSM Backup Service

  • Monthly service charge: $7.50 per month per computer
  • Storage limit: 300GB
    • a soft limit, some users go over
    • an approximate figure because it includes both "active" and "inactive" files but this is offset by data compression
  • TSM software is required to use the service and is available for Windows, Mac and Linux (free to MIT community per site license)
  • Backups are stored on one of the TSM backup servers in buildings W91 and E40 (no mirroring)
  • Types of backup:
    • Scheduled: everything by default but can be configured to exclude directories
    • Manual: nothing by default, need to specify which directories to back up
  • Inactive files (old versions of current files and deleted files) are kept for 30 days using incremental storage (only changes are stored)
  • Need a separate account for each computer to be backed up
  • Performance will vary depending on the time of day, network conditions and the machine itself
  • 5,000 users, 250,000 files restored per quarter
  • 128-bit encryption available
  • coming soon (summer of 2007 at the earliest):
    • free service (for personal use): 10-20GB
    • enhanced service (for DLC use): 1TB and up, offsite mirroring, will be expensive, etc
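
To get a feel for what TSM would cost for a data set larger than one account's limit, the sketch below estimates the number of accounts and the yearly fee. It rests on assumptions that are not confirmed on this page: that the data could be split across several computers (one account each), that the 300GB soft limit is treated as a hard cap, and that compression is ignored.

```python
import math

MONTHLY_FEE_USD = 7.50  # per computer, from this page
LIMIT_GB = 300          # soft limit per account, treated here as a hard cap


def tsm_accounts_needed(data_gb, limit_gb=LIMIT_GB):
    """Number of per-computer accounts needed to hold data_gb."""
    return math.ceil(data_gb / limit_gb)


def tsm_yearly_cost_usd(data_gb):
    """Yearly fee for enough accounts to hold data_gb."""
    return tsm_accounts_needed(data_gb) * MONTHLY_FEE_USD * 12


# Example: the ~800GB of microscope data estimated in section 3.
print("accounts needed:", tsm_accounts_needed(800))
print("yearly cost (USD):", tsm_yearly_cost_usd(800))
```

Under these assumptions, 800GB needs three accounts at $22.50 per month, or $270 per year.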


Network Attached Storage

A Tale of Two Terabyte NAS Boxes

Buffalo Technology

  • Buffalo TeraStation Home
    • Example disk configuration: 4 x 250GB IDE (750GB in RAID5)
    • Protocols: FTP, SMB
    • USB 2.0 port for external hard drive (backup or additional storage)
    • Review by PC Magazine
      • Bottom line: Flexible and reliable storage for everyone on your network. Print sharing is a plus, as is expandable USB disk storage.
      • Pros: Offers RAID level data protection; easy-to-configure shared and private storage for all workgroup members; print sharing is a plus.
      • Cons: Large footprint. No logging or reporting features.
    • Review by ExtremeTech
    • TeraStation wiki
  • Buffalo TeraStation Pro
    • Released in March 2006
    • S-ATA drives

Infrant ReadyNAS

  • ReadyNAS NV
  • ReadyNAS NV+
  • Infrant ReadyNAS NV+ and 1100: Small steps forward - review
    • comes with a 5+5-user license for EMC's Retrospect for Windows and Macintosh client backup software
    • The NV+ is a slight improvement over the NV, with most of the value coming in the Retrospect backup client bundle
    • Since both the NV and NV+ use the same processor and have the same memory, the performance difference I saw is more due to better drives in the NV+ and newer firmware than anything else
    • with the lowest price at time of review at $831 for a driveless NV+ and $517 for an NV, you might be better served by using the $300 to buy drives
  • Infrant ReadyNAS NV Review
    • X-RAID (Expandable RAID) allows adding capacity without deleting existing data; it automatically adjusts the RAID level and formatted capacity to match the available drives
  • ReadyNAS NV - user review
  • ReadyNAS NV - AnandTech review
  • ReadyNAS NV - PracticallyNetworked.com review