Distributed systems and cloud computing are becoming more prevalent due to increasing
demand for large compute resources with access to storage. This has led to the
natural separation of machines for storage-related functions and compute-related functions.
High-speed networking has also made moving data across the network seem like a feasible
solution. However, we leave performance on the table when we could instead execute code on
the storage servers and avoid sending data unnecessarily over the network.
We look into realistic scenarios in which the data used is not all stored on a single
machine but is spread across multiple systems. We have created a framework for just
this scenario and have built it so that various customizable storage configurations can
be tested against custom benchmarks. This work tests various storage configurations
using the framework and presents the results. We analyze these results and explain the
observed behavior.