Updated September 2013
GINA is designed as an enterprise-quality computing, data storage and GIS environment. We attempt to limit single points of failure by using redundant hardware or highly available, active-active load-balanced systems where possible.
The majority of our systems run CentOS Linux. Most of our custom code is developed in Ruby. We use GitHub for most of our version control, Chef for configuration management, and Sensu for monitoring. Postgres and MongoDB handle the majority of our database work. For desktop GIS we tend to default to QGIS unless we have a compelling reason to use something else.
We are currently moving a lot of data from Fibre Channel storage shared out over NFS to GlusterFS. We use Supermicro servers from Silicon Mechanics as the back-end storage. Each server is a 4U, 24-disk chassis (plus 2 disks for the OS) with another 4U, 24-disk chassis connected as a JBOD via external SAS. These 48 disks are sliced into 4 mdadm RAID 6 sets of 11 disks each, with the remaining 4 disks in a shared hot spare group. By using GlusterFS with LVM bricks on top of the RAID sets, we can provision chunks of storage in virtually any size we want and grow them later if needed. Identical LVs on multiple boxes can be replicated using Gluster for HA and improved read performance. The Gluster volumes can also be mounted as normal NFSv3 exports, so we can move to newer storage and still attach it to older servers using the legacy NFSv3 protocol.
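The disk counts above can be checked with a little arithmetic. The following is only a rough sketch: the set sizes and spare count come from the description above, but the per-disk capacity (DISK_TB) is a made-up placeholder, since the actual disk size is not stated here, and the result ignores filesystem, LVM and Gluster overhead.

```python
# Disk layout arithmetic for one storage server, using the counts given above.
TOTAL_DISKS = 48         # 24 in the head chassis + 24 in the JBOD (OS disks excluded)
RAID_SETS = 4            # mdadm RAID 6 sets per server
DISKS_PER_SET = 11       # member disks in each RAID 6 set
HOT_SPARES = 4           # shared hot spare group
RAID6_PARITY_DISKS = 2   # RAID 6 gives up two disks' worth of capacity per set

DISK_TB = 4.0            # ASSUMPTION: placeholder per-disk capacity, not from the text


def layout_summary(disk_tb=DISK_TB):
    """Return data-disk counts and raw usable capacity for one server."""
    accounted_for = RAID_SETS * DISKS_PER_SET + HOT_SPARES
    if accounted_for != TOTAL_DISKS:
        raise ValueError(
            f"layout uses {accounted_for} disks, expected {TOTAL_DISKS}"
        )
    data_disks_per_set = DISKS_PER_SET - RAID6_PARITY_DISKS
    data_disks = RAID_SETS * data_disks_per_set
    return {
        "data_disks_per_set": data_disks_per_set,
        "data_disks": data_disks,
        "usable_tb": data_disks * disk_tb,
        "usable_fraction": data_disks / TOTAL_DISKS,
    }


if __name__ == "__main__":
    for name, value in layout_summary().items():
        print(f"{name}: {value}")
```

With this layout, 36 of the 48 disks hold data, so about three quarters of the raw capacity is usable per server before any overhead. Replicating a volume across two servers with Gluster would halve the usable space across that pair in exchange for the HA and read performance noted above.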
We use three datacenters on campus for our infrastructure, in the West Ridge Research, Syun-Ichi Akasofu, and Butrovich buildings. We try to keep a copy of everything spinning on disk locally and also keep a copy of everything in deep archive. Currently ARSC's tape silo serves as our deep archive, but we are investigating other deep archives that could provide more geographic separation.
Currently, most of our virtual servers run on KVM/libvirt/QEMU. Operating system volumes are either Fibre Channel volumes presented over clustered LVM (a mixture of image files and block devices) or image files on GlusterFS. Additional, separate GlusterFS volumes are mounted inside the VMs for large data storage areas.
Real-time processing, such as for VIIRS and MODIS data, is mostly done on dedicated physical hardware rather than VMs, with either Fibre Channel or GlusterFS volumes used to store the finished products.