BioMicroCenter:Servers

{{BioMicroCenter}}

The BioMicro Center has developed and manages a computational infrastructure to support our genomics experimentation and [[BioMicroCenter:BioInformaticsStaff|bioinformatics analysis]]. In response to requests from our users, we have expanded these servers to create a new collaborative computational environment at MIT. These public servers are designed to offer CORE labs access to inexpensive analysis and storage systems that piggy-back on the existing infrastructure. These services are built on a full cost recovery model where the total cost of the servers and services is expected to be recovered over three years. Charges for the equipment are billed on an annual basis.
== STORAGE ==
 
Large-scale data storage is available through the BioMicro Center. Space is available in 1TB increments on an annual basis. The storage is accessible from Windows, Macintosh and Linux operating systems and is backed up by MIT's [http://ist.mit.edu/backup/tsm TSM service].<BR><BR>
 
'''We strongly encourage all labs using Illumina sequencing or bioinformatics analysis to have networked data storage. Either Rowley or BMC-PUB storage is *required* to utilize our computer cluster or our Galaxy server.'''
 
=== STORAGE: [[BioMicroCenter:BMCPub|BMC-PUB servers]] ===
[[ Image:BioMicroCenter_BMCpub1.png | thumb | right | 150px | BMC-pub server]]
The BioMicroCenter Public Server (BMC-PUB) is a data storage service offered by the Center. The servers are designed to provide low-cost, server-based storage for labs and facilities at MIT that is easily accessible from Linux, Windows and Macintosh operating systems. Space is available in 1TB increments. The servers use a RAID6 architecture to accommodate drive failures and are backed up routinely by MIT's [http://ist.mit.edu/backup/tsm TSM] service. Currently the BioMicro Center supports 128TB of storage within the BMC-PUB system. More details on how to use the BMC-pub1 system are [[BioMicroCenter:BMCPub|'''found HERE''']]. Pricing for the servers is [[BioMicroCenter:Pricing|on the pricing page.]] Users who would like data storage on the BMC-pub systems should contact [[BioMicroCenter:People|Stephen Goldman]].
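As an illustration, the storage mounts like any networked volume. The hostname and share path below are placeholders; the actual values are provided when your space is set up:

<pre>
# Linux: mount the share with cifs-utils (hostname and share are placeholders)
sudo mount -t cifs //bmc-pub1.mit.edu/mylab /mnt/bmc-pub -o username=my_kerberos_id

# Macintosh: connect from a terminal (or Finder > Go > Connect to Server)
open 'smb://bmc-pub1.mit.edu/mylab'

# Windows: map the share to a drive letter from a command prompt
#   net use Z: \\bmc-pub1.mit.edu\mylab
</pre>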
 
=== STORAGE: [http://rous.mit.edu/index.php/BCC_Computing_Resources Koch Institute Isilon Cluster]===
For users with appointments in the [http://ki.mit.edu Koch Institute], storage is also available on the KI's Isilon server, Rowley. This Isilon cluster was purchased in September 2010. The server is named after [http://en.wikipedia.org/wiki/Janet_Rowley Janet Rowley], a pioneer in the field of chromosome translocations in cancer and winner of the 1998 National Medal of Science and the 2009 Presidential Medal of Freedom. The cluster currently consists of seven 36NL nodes and three 108NL nodes with a total capacity of over 500TB and is accessible from any networked computer. Rowley serves as the primary storage device used by the BioMicro Center for [[BioMicroCenter:Sequencing|Illumina Sequencing]] and data delivery. Users who would like data storage on Rowley should contact [http://rous.mit.edu/index.php/Charlie_Whittaker Charlie Whittaker].
 
== COMPUTATION ==
=== [http://rous.mit.edu/index.php/Rous.mit.edu_accounts BMC/BCC Computation Cluster (ROUS)] ===
[[ Image:BioMicroCenter_ROUSimage.jpg | thumb | right | 150px | ROUS ]]
Rous is a Linux cluster, initially purchased in September 2009. Rous, named after the Nobel Prize-winning cancer researcher [http://www.nobelprize.org/nobel_prizes/medicine/laureates/1966/rous-bio.html Peyton Rous] (and not [http://www.urbandictionary.com/define.php?term=R.O.U.S. Rodents Of Unusual Size]), is equipped with a wide range of bioinformatics software. It uses the x86 architecture, with 160 processing cores and over 500 GB of RAM. Rous is the primary server for handling data analysis from Illumina sequencing and for the GALAXY instance at MIT. Users of Rous '''must''' have an account on either BMC-PUB or on Rowley. <BR><BR>
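Logging in is a standard SSH session; a minimal sketch (the username and storage mount point below are placeholders):

<pre>
# Connect to the cluster head node
ssh my_kerberos_id@rous.mit.edu

# Verify that your lab's networked storage (BMC-PUB or Rowley) is
# visible from the cluster; the path here is a placeholder
df -h /net/bmc-pub1/mylab
</pre>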
 
Currently, we are reworking the way the queues on Rous work. In addition to a general queue, similar to what is on Rous now, the new system will have lab-specific queues where jobs from that lab will have priority on a specific node (or nodes) of the server. Similar to the data storage model, access to these nodes will be on a chargeback basis to recover the cost of the instrumentation. We are currently in the testing phase of this project. <BR><BR>
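Once the lab-specific queues are live, selecting one should be a single flag at submission time. The queue name below is hypothetical, pending the end of testing:

<pre>
# Submit to the general queue (the current default behavior)
qsub myjob.sh

# Submit to a lab's dedicated queue, giving the job priority
# on that lab's node(s); "mylab.q" is a placeholder name
qsub -q mylab.q myjob.sh
</pre>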
 
Some additional useful facts about Rous (a sketch of a submission script follows this list):
* Rous uses Sun Grid Engine (SGE) to manage jobs. <BR>[http://rous.mit.edu/index.php/SGE_Instructions_and_Tips HOW TO SUBMIT JOBS WITH SGE] <BR>
* Rous uses Modules to handle software packages. <BR> [http://rous.mit.edu/index.php/Managment_of_Software_Packages_with_module HOW TO USE MODULES] <BR>
* Currently, users are limited to 24 simultaneous processes on the server.
* Requests for software changes to Rous should be made to [[BioMicroCenter:People|Stuart Levine]] and [http://rous.mit.edu/index.php/Charlie_Whittaker Charlie Whittaker].
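
The sketch below combines the first two points: a job script that loads its software through Modules and is handed to SGE with qsub. The module versions are taken from the list in the next section; the job and file names are placeholders. See the linked instructions above for the authoritative details.

<pre>
#!/bin/bash
#$ -N align_sample1   # job name shown by qstat
#$ -cwd               # run the job in the directory it was submitted from

# Load software through the module system
module add bowtie2/2.1.0
module add samtools/0.1.19

# Align placeholder reads and convert the output to BAM
bowtie2 -x genome_index -U sample1.fastq | samtools view -bS - > sample1.bam
</pre>

The script would be submitted with <code>qsub align_sample1.sh</code>, subject to the 24-simultaneous-process limit noted above.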


=== Server Software Installed on ROUS ===
A large amount of software is installed on our cluster server. The modules available as of June 2012 include:<BR><BR>
'' Active by default ''
<pre>
----------------------------------------------------------------- /home/software/modulefiles -----------------------------------------------------------------
allpathslg/46066(default)           cufflinks/1.3.0                     ki-bmc/pipeline_1.1.2               signalp/4.1(default)
bamtools/1.0.2                      cufflinks/2.0.0                     ki-bmc/pipeline_1.2                 snpeff/2.0.5d
bedtools/2.15.0(default)            cufflinks/2.0.1                     ki-bmc/pipeline_1.2.1               snpsnift/1.3.4b
bedtools/2.16.1                     cufflinks/2.0.2                     materialsstudio/6.0                 sra/2.1.10
bedtools/2.17.0                     cufflinks/2.1.1                     matlab/2011b(default)               tabix/0.2.5(default)
blast/2.2.27(default)               fasta/35.4.12(default)              muscle/3.8.31(default)              tabix/0.2.6
bowtie/2.0.0b3(default)             freec/5.7(default)                  novoalign/2.08.03(default)          tmhmm/2.0c(default)
bowtie2/2.0.0-beta5                 gasv/2012.10.12                     olb/1.9.0                           tophat/1.4.1(default)
bowtie2/2.0.0-beta6(default)        gatk/1.3-21-gcb284ee                olb/1.9.4(default)                  tophat/2.0.0
bowtie2/2.0.2                       gatk/1.4-30-gf2ef8d1                pasa/r2012-06-25(default)           tophat/2.0.3
bowtie2/2.0.3                       gcc/4.8.0(default)                  polysh/0.4(default)                 tophat/2.0.4
bowtie2/2.0.5                       gmap/2013-01-23                     python/2.7.2(default)               tophat/2.0.6
bowtie2/2.0.6                       hdf5/1.8.8(default)                 r/2.14.0(default)                   tophat/2.0.7
bowtie2/2.1.0                       hmmer/3.0(default)                  r/2.14.2                            tophat/2.0.8
breakdancer/1.1_2011_02_21(default) jre/1.6.0-29(default)               r/2.15.0                            trinityrnaseq/r2012-10-05(default)
bwa/0.4.6                           ki-bmc/20120516                     r/2.15.3                            trinityrnaseq/r2013-02-25
bwa/0.6.1(default)                  ki-bmc/pipeline_0.9                 rsem/1.2.3(default)                 ucsc-tools/20120530(default)
casava/1.8.2                        ki-bmc/pipeline_1.0(default)        samtools/0.1.18(default)            vcftools/0.1.10
clustalo/1.1.0(default)             ki-bmc/pipeline_1.0.1               samtools/0.1.19                     vcftools/0.1.8a(default)
cufflinks/1.2.1(default)            ki-bmc/pipeline_1.1                 shera/2012-03-23(default)           wx/2.8.12(default)
</pre>
 
'' Active after module add jre ''
<pre>
------------------------------------------------------ /home/software/jre/jre-1.6.0-29/pkg/modulefiles -------------------------------------------------------
jalview/2.7            picard/1.63            picard/1.72            picard/1.89            varscan/2.3.2(default)
</pre>
 
'' Active after module add python ''
<pre>
----------------------------------------------------- /home/software/python/python-2.7.2/pkg/modulefiles -----------------------------------------------------
HTSeq/0.5.4p1               macs/1.4.2                  numpy/1.6.1(default)        pysam/0.6(default)          virtualenv/1.7(default)
biopython/1.61              macs/2.0.10_6_6_2012        pip/1.1                     scipy/0.10.0(default)       wxpython/2.8.12.1
cython/0.15.1(default)      matplotlib/1.1.0(default)   pybedtools/0.6              setuptools/0.6c11(default)
h5py/2.0.1(default)         mysql-python/1.2.3(default) pybedtools/0.6.2            sicer/1.1
</pre>
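
Because the jre and python package trees are nested, their modules only become visible after the parent module is loaded, as the headings above indicate. A quick illustration at the shell (versions as listed above):

<pre>
module avail               # shows only the top-level /home/software tree
module add python          # loads the default python/2.7.2
module avail               # now also lists the python sub-packages
module add macs/1.4.2      # e.g., load MACS from the python tree
</pre>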
 
 
Other packages are installed but have not yet been converted to module packages. These include:
 
* BLAT
* GenePattern
* IGV
* MAQ
* MEME
* Velvet
 
== BIO-IT ==
=== Jingzhi Zhu, PhD ===
'''Research Computing Specialist'''<BR>
Jingzhi Zhu is a new member of our team who will focus on managing our BioIT environment and data processing pipelines. Jingzhi received his PhD from Penn State, where he studied computational materials science using high-performance computing.
 
=== Stephen Goldman ===
[[Image:Stephen.JPG|right|100px]]'''Systems Administrator'''<BR>
Stephen Goldman has been involved in IT at MIT for 21 years. Stephen manages BioMicro's on-site data storage servers and provides desktop and network support to the Departments of Biology and Biological Engineering. Stephen has been a critical member of the team in managing data integrity, data security and data access concerns as the core has transformed over the past few years.


=== TechSquare Consultants ===
