SmartOS and Science: Handling Big Data Software

Christopher Hogue is the principal investigator at the Mechanobiology lab at the National University of Singapore, where he wears many hats, including sys-admin and coder. Mechanobiology is the study of cellular and molecular systems that either respond to or generate forces. To do this work, Hogue maintains and develops a software package called TraDES, which is used by scientists who study protein molecules with Nuclear Magnetic Resonance. TraDES generates 3D protein structures.

As Hogue says, “TraDES' job is simple. Make hundreds of thousands to billions of 3D protein structures chosen from random sampling and expected structure behavior.” In other words, there is a lot of data. Hogue tried running the software on a number of different operating systems but ran into difficulties. He was experiencing storage failure fatigue and realized he wanted “a compute/storage server that is as fast to set up as an iPhone ... [and] computational capability as close to my data as possible.” Ultimately, he found his needs were met when he started running on SmartOS. He explains:

ZFS is the solution to my storage failure fatigue. It is the core filesystem of FreeBSD, Solaris and Illumos...ZFS was engineered to prevent silent data corruption, scale to as big as you can imagine, and much more. For someone who has seen filesystems fail for a myriad of reasons, the engineering behind ZFS is truly sensible.

The bottom line is, for high-performance networking and filesystem access, a SmartOS Zone is always preferred providing you can build your software to run on it. Yet the system is flexible enough to support any operating system as a KVM guest, and an additional layer of security is provided because KVM runs in a Zone. Kernel optimized version of Linux KVM images are provided by Joyent, which are preferred.

You can read more about Hogue’s lab and his experience using SmartOS in his blog post, “Why SmartOS in My Lab?”



Post written by rachelbalik