
Networked storage systems allow computer programs to run on one server
while their data is stored on another. These programs access data
by sending messages over the network. In datacenters, the time to send a
message between servers has declined as 10GbE and optical switches have
become common. It is now often faster to access data held in another
server's main memory over the network than to read from a local disk.
Further, well-designed networked stores can scale out, i.e., they can
increase throughput simply by adding servers.
This research studied networked storage systems for Internet services,
i.e., computer programs that receive requests from end users, process
data related to the request and return results with fast response time.
During the course of this work, the structure of Internet services
changed. Older services that performed 3-5 database accesses per
request gave way to modern services that require hundreds or thousands
of data lookups per request. Why do modern Internet services access so
much data? For starters, modern services that are qualitatively similar
to older services have larger datasets now, e.g., consider the growth of
a search engine index. Second, modern services execute complicated data
mining and machine learning operations, e.g., consider a recommendation
engine for e-commerce products. Finally, modern services use data that
is often noisy and redundant, inflating the number of data accesses
required for a dependable result.
The intellectual contributions of this research included:
+ The paper titled Zoolander: Efficiently Meeting Very Strict,
Low-Latency SLOs introduced an analytic model of replication for
predictability (a.k.a. redundancy). It used a probabilistic approach,
comparable to the famous birthday problem, to estimate the processing
time for k networked storage requests. When paired with queuing models,
this approach provided accurate predictions of the effects of redundancy
on response time. Managing redundancy is now a hot topic in computer
systems research. Further, this contribution was accessible: questions
on modeling replication for predictability have been added to
undergraduate and graduate architecture courses.
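The intuition behind replication for predictability can be pictured with a small Monte Carlo sketch: if a request is duplicated across k independent replicas and the fastest answer wins, all k copies must be slow for the request to be slow, much as the birthday problem reasons about coinciding events. The latency distribution and numbers below are illustrative, not taken from the paper:

```python
import random

def latency(rng):
    # Illustrative latency model (assumption, not Zoolander's):
    # 1 ms base service time, with a 5% chance of a 20 ms hiccup
    # (queueing delay, GC pause, etc.).
    return 1.0 + (20.0 if rng.random() < 0.05 else 0.0)

def p99_with_replicas(k, n=100_000, seed=1):
    """Estimate the 99th-percentile response time when each request
    is sent to k replicas and the fastest answer is taken."""
    rng = random.Random(seed)
    samples = sorted(min(latency(rng) for _ in range(k)) for _ in range(n))
    return samples[int(0.99 * n)]
```

With one copy, roughly 5% of requests hit the hiccup, so the 99th percentile sits near 21 ms; with two independent copies the chance that both are slow drops to 0.25%, pulling the 99th percentile back toward 1 ms.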
+ The paper titled Obtaining and Managing Answer Quality
for Online Data-Intensive Services showed that operating system software
can capture and manage answer quality without modifying hosted Internet
services. For example, a cloud infrastructure provider can use our
techniques to throttle data accesses per query without degrading
results. A key result showed that a well-tuned operating-system
implementation could achieve throughput comparable to one in which the
service itself is modified.
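One way to picture throttling data accesses per query without modifying the service is an interposition layer that counts storage accesses on the service's behalf and returns the partial answer accumulated once a budget is exhausted. The class and function names below are hypothetical, a minimal sketch rather than the paper's interface:

```python
class AccessBudgetExceeded(Exception):
    """Raised when a query has spent its data-access budget."""

class ThrottledStore:
    """Hypothetical interposition layer: wraps a key-value store and
    enforces a per-query cap on data accesses, so answer quality can
    be traded for performance without changing the hosted service."""
    def __init__(self, store, budget):
        self.store = store
        self.budget = budget
        self.used = 0

    def get(self, key):
        if self.used >= self.budget:
            raise AccessBudgetExceeded()
        self.used += 1
        return self.store[key]

def score_terms(store, query_terms, budget):
    """Scores as many query terms as the budget allows; the partial
    result degrades gracefully instead of failing outright."""
    throttled = ThrottledStore(store, budget)
    scores = {}
    for term in query_terms:
        try:
            scores[term] = throttled.get(term)
        except AccessBudgetExceeded:
            break
    return scores
```

With a budget of 2, the sketch returns scores for the first two terms only; with a large budget it returns the full answer.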
+ The paper titled Adaptive Power Profiling for Many-Core HPC
Architectures showed that power consumption with 1 active core provides
predictive information on power consumption as active cores are added.
A short profile covering 1-2% of the workload, combined with data on
1-core power usage, accurately characterizes power usage.
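A naive way to see why one-core power is informative is a linear extrapolation: treat one-core power as idle power plus one core's marginal draw, and scale the marginal draw with the number of active cores. This is an illustrative simplification, not the paper's model:

```python
def predict_power(p_idle_watts, p_one_core_watts, n_cores):
    """Illustrative linear extrapolation (assumption, not the
    paper's method): per-core marginal draw is the gap between
    one-core power and idle power, repeated for each active core."""
    per_core = p_one_core_watts - p_idle_watts
    return p_idle_watts + n_cores * per_core
```

For example, with 50 W idle and 65 W at one active core, the sketch predicts 110 W at four active cores.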
The broader impacts achieved by this research included:
+ Dr. Jaimie Kelley and Dr. Aniket Chakrabarti worked on this
project during their PhD studies. Dr. Kelley is now an assistant
professor at Denison University. Dr. Chakrabarti is an applied scientist
at Microsoft AI & Research. Four Master's and undergraduate theses were
derived from this project.
+ Data Management in the Cloud is a new course taught at The
Ohio State University. It is required for Data Analytics majors. PI
Stewart taught the initial offering and worked with his colleagues to
ensure that principles for networked storage systems were included.
+ We created a 1-hour workshop to showcase our research. The
workshop culminated with a demo: a competition between workshop
attendees and the OpenEphyra question-answering system with adaptive
answer quality. We presented this workshop to over 150 people, most of
them 7th-8th graders. The workshop helped to motivate Buck-I-Code, a
weekend coding workshop for Columbus-area girls. Buck-I-Code is now in
its 5th year.
+ We developed a map-reduce platform based on the open-source
BashReduce system that was used by Nationwide Children's Hospital. The
platform subsampled genetic data to improve running time, i.e., it
traded answer quality for performance.
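The quality-for-performance trade in the subsampling platform can be sketched in a few lines: evaluate a query on a random fraction of the records, then scale the result back up. The estimator below is an illustration of the idea, not the hospital pipeline:

```python
import random

def estimated_count(records, predicate, fraction, seed=7):
    """Illustrative quality/performance trade-off: count matching
    records in a random subsample, then scale by 1/fraction. The
    answer is approximate, but only a fraction of the data is read."""
    rng = random.Random(seed)
    sample = [r for r in records if rng.random() < fraction]
    matches = sum(1 for r in sample if predicate(r))
    return matches / fraction
```

Counting even numbers among 10,000 records from a 10% subsample lands near the true count of 5,000 while touching only about a tenth of the data.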