
Networked storage systems allow computer programs to run on one server
while their data is stored on another. These programs access data
by sending messages over the network. In datacenters, the time to send a
message between servers has declined as 10GbE and optical switches have
become common. It is now often faster to access data held in another
server's main memory over the network than to read from a local disk.
Further, well-designed networked stores can scale out, i.e., they can
increase throughput simply by adding servers.
This research studied networked storage systems for Internet services,
i.e., computer programs that receive requests from end users, process
data related to the request and return results with fast response time.
During the course of this work, the structure of Internet services
changed. Older services that performed 3-5 database accesses per
request gave way to modern services that require hundreds or thousands
of data lookups per request. Why do modern Internet services access so
much data? For starters, modern services that are qualitatively similar
to older services have larger datasets now, e.g., consider the growth of
a search engine index. Second, modern services execute complicated data
mining and machine learning operations, e.g., consider a recommendation
engine for e-commerce products. Finally, modern services use data that
is often noisy and redundant, inflating the number of data accesses
required for a dependable result.
The intellectual contributions of this research included:
+ The paper titled Zoolander: Efficiently Meeting Very Strict,
Low-Latency SLOs introduced an analytic model of replication for
predictability (a.k.a. redundancy). It used a probabilistic approach,
comparable to the famous birthday problem, to estimate the processing
time for k networked storage requests. When paired with queuing models,
this approach provided accurate predictions of the effects of redundancy
on response time. Managing redundancy is now a hot topic in computer
systems research. Further, this contribution was accessible: questions
on modeling replication for predictability have been added to
undergraduate and graduate architecture courses.
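The intuition behind replication for predictability can be pictured with a small Monte Carlo sketch: if a request is duplicated across k independent replicas and the fastest answer wins, all k copies must be slow for the request to be slow, much as the birthday problem reasons about coinciding events. The latency distribution and numbers below are illustrative, not taken from the paper:

```python
import random

def latency(rng):
    # Illustrative latency model (assumption, not Zoolander's):
    # 1 ms base service time, with a 5% chance of a 20 ms hiccup
    # (queueing delay, GC pause, etc.).
    return 1.0 + (20.0 if rng.random() < 0.05 else 0.0)

def p99_with_replicas(k, n=100_000, seed=1):
    """Estimate the 99th-percentile response time when each request
    is sent to k replicas and the fastest answer is taken."""
    rng = random.Random(seed)
    samples = sorted(min(latency(rng) for _ in range(k)) for _ in range(n))
    return samples[int(0.99 * n)]
```

With one copy, roughly 5% of requests hit the hiccup, so the 99th percentile sits near 21 ms; with two independent copies the chance that both are slow drops to 0.25%, pulling the 99th percentile back toward 1 ms.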
+ The paper titled Obtaining and Managing Answer Quality
for Online Data-Intensive Services showed that operating system software
can capture and manage answer quality without modifying hosted Internet
services. For example, a cloud infrastructure provider can use our
techniques to throttle data accesses per query without degrading
results. A key result showed that a well-tuned operating-system
implementation could achieve throughput comparable to one in which the
service itself is modified.
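One way to picture throttling data accesses per query without modifying the service is an interposition layer that counts storage accesses on the service's behalf and returns the partial answer accumulated once a budget is exhausted. The class and function names below are hypothetical, a minimal sketch rather than the paper's interface:

```python
class AccessBudgetExceeded(Exception):
    """Raised when a query has spent its data-access budget."""

class ThrottledStore:
    """Hypothetical interposition layer: wraps a key-value store and
    enforces a per-query cap on data accesses, so answer quality can
    be traded for performance without changing the hosted service."""
    def __init__(self, store, budget):
        self.store = store
        self.budget = budget
        self.used = 0

    def get(self, key):
        if self.used >= self.budget:
            raise AccessBudgetExceeded()
        self.used += 1
        return self.store[key]

def score_terms(store, query_terms, budget):
    """Scores as many query terms as the budget allows; the partial
    result degrades gracefully instead of failing outright."""
    throttled = ThrottledStore(store, budget)
    scores = {}
    for term in query_terms:
        try:
            scores[term] = throttled.get(term)
        except AccessBudgetExceeded:
            break
    return scores
```

With a budget of 2, the sketch returns scores for the first two terms only; with a large budget it returns the full answer.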
+ The paper titled Adaptive Power Profiling for Many-Core HPC
Architectures showed that power consumption with 1 active core provides
predictive information on power consumption as active cores are added.
A short profile covering 1-2% of the workload, combined with data on
1-core power usage, accurately characterizes power usage.
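A naive way to see why one-core power is informative is a linear extrapolation: treat one-core power as idle power plus one core's marginal draw, and scale the marginal draw with the number of active cores. This is an illustrative simplification, not the paper's model:

```python
def predict_power(p_idle_watts, p_one_core_watts, n_cores):
    """Illustrative linear extrapolation (assumption, not the
    paper's method): per-core marginal draw is the gap between
    one-core power and idle power, repeated for each active core."""
    per_core = p_one_core_watts - p_idle_watts
    return p_idle_watts + n_cores * per_core
```

For example, with 50 W idle and 65 W at one active core, the sketch predicts 110 W at four active cores.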
The broader impacts achieved by this research included:
+ Dr. Jaimie Kelley and Dr. Aniket Chakrabarti worked on this
project during their PhD studies. Dr. Kelley is now an assistant
professor at Denison University. Dr. Chakrabarti is an applied scientist
at Microsoft AI & Research. Four Master's and undergraduate theses were
derived from this project.
+ Data Management in the Cloud is a new course taught at The
Ohio State University. It is required for Data Analytics majors. PI
Stewart taught the initial offering and worked with his colleagues to
ensure that principles for networked storage systems were included.
+ We created a 1-hour workshop to showcase our research. The
workshop culminated with a demo: a competition between workshop
attendees and the OpenEphyra question-answering system with adaptive
answer quality. We presented this workshop to over 150 people, most of
them 7th-8th graders. The workshop helped to motivate Buck-I-Code, a
weekend coding workshop for Columbus-area girls. Buck-I-Code is now in
its 5th year.
+ We developed a map-reduce platform based on the open-source
BashReduce system that was used by Nationwide Children's Hospital. The
platform subsampled genetic data to improve running time, i.e., it
traded answer quality for performance.
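The quality-for-performance trade in the subsampling platform can be sketched in a few lines: evaluate a query on a random fraction of the records, then scale the result back up. The estimator below is an illustration of the idea, not the hospital pipeline:

```python
import random

def estimated_count(records, predicate, fraction, seed=7):
    """Illustrative quality/performance trade-off: count matching
    records in a random subsample, then scale by 1/fraction. The
    answer is approximate, but only a fraction of the data is read."""
    rng = random.Random(seed)
    sample = [r for r in records if rng.random() < fraction]
    matches = sum(1 for r in sample if predicate(r))
    return matches / fraction
```

Counting even numbers among 10,000 records from a 10% subsample lands near the true count of 5,000 while touching only about a tenth of the data.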