Identifier

etd-11042009-151525

Degree

Master of Science in Electrical Engineering (MSEE)

Department

Electrical and Computer Engineering

Document Type

Thesis

Abstract

In recent years, scientific applications have become increasingly data intensive. The increase in the size of data generated by scientific applications necessitates collaboration and sharing data among the nation's education and research institutions. To address this, distributed storage systems spanning multiple institutions over wide area networks have been developed. One of the important features of distributed storage systems is providing global unified name space across all participating institutions, which enables easy data sharing without the knowledge of actual physical location of data. This feature depends on the ``location metadata'' of all data sets in the system being available to all participating institutions. This introduces new challenges. In this thesis, we study different metadata server layouts in terms of high availability, scalability and performance. A central metadata server is a single point of failure leading to low availability. Ensuring high availability requires replication of metadata servers. A synchronously replicated metadata servers layout introduces synchronization overhead which degrades the performance of data operations. We propose an asynchronously replicated multi-master metadata servers layout which ensures high availability, scalability and provides better performance. We discuss the implications of asynchronously replicated multi-master metadata servers on metadata consistency and conflict resolution. Further, we design and implement our own asynchronous multi-master replication tool, deploy it in the state-wide distributed data storage system called PetaShare, and compare performance of all three metadata server layouts: central metadata server, synchronously replicated multi-master metadata servers and asynchronously replicated multi-master metadata servers.

Date

2009

Document Availability at the Time of Submission

Release the entire work immediately for access worldwide.

Committee Chair

Kosar, Tevfik

DOI

10.31390/gradschool_theses.4113

Share

COinS