EFFICIENT AND SECURE STORAGE METHOD FOR LARGE SCALE FILE SERVERS UTILIZING CLIENT SIDE DE-DUPLICATION

TABLE OF CONTENTS
CHAPTER ONE
Introduction
1.1. Background
1.2. Problem Statement
1.3. Objective
1.3.1 General Objective
1.3.2 Specific Objective
1.4. Methodology
1.5 Thesis Outline

CHAPTER TWO
Literature Review

CHAPTER THREE
Hash and Encryption Standards
3.1 Secure Hash Standard (SHS)
3.2 Advanced Encryption Standard (AES)
3.2.1 Description of AES Algorithm
3.2.2 High-level description of the AES algorithm

CHAPTER FOUR
Client Side De-duplication
4.1 The Challenge of Client Side De-duplication
4.2 Convergent Encryption

CHAPTER FIVE
Analysis, Design and Implementation
5.1 Overview of proposed scheme
Case I: First upload
Case II: Subsequent uploads
5.1.1 Storage Manager
5.1.2 Proof of Ownership Verifier
5.2. Implementation of the system
5.2.1. The Server Side Application
5.2.2 The Client Component

CHAPTER SIX
Result and Discussion

CHAPTER SEVEN
Conclusion and Recommendation
Bibliography
Appendix


Abstract
According to a recent survey by the International Data Corporation (IDC) [63], 75% of today’s digital data are duplicate copies. To reduce this unnecessary redundancy, storage servers handle de-duplication (either at the file level or on chunks of data sized 4KB and larger). De-duplication can be managed at both the server side and the client side. In order to identify duplicate copies, files must be left unencrypted. Users, however, may be worried about the security of their files and may want their data to be encrypted. Encryption makes ciphertext indistinguishable from theoretically random data, i.e., encrypted data are always randomly distributed, so identical plaintexts encrypted with randomly generated cryptographic keys will very likely produce different ciphertexts, which cannot be de-duplicated. In this research, a method that resolves this conflict between de-duplication and encryption is presented.
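
To make this conflict concrete, the following Python sketch is an illustration only, not the scheme developed in this thesis. It assumes the third-party "cryptography" package and AES in CTR mode: encrypting the same content with two independently generated random keys yields different ciphertexts, while deriving the key and nonce from the content's own SHA-256 hash (the convergent-encryption idea treated in Chapter Four) yields identical ciphertexts that a server could de-duplicate.

# Illustrative sketch: why random keys defeat de-duplication, and how a
# hash-derived (convergent) key restores it. Assumes the "cryptography" package.
import os
import hashlib
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

def encrypt_aes_ctr(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """AES-256 in CTR mode; key is 32 bytes, nonce is 16 bytes."""
    enc = Cipher(algorithms.AES(key), modes.CTR(nonce)).encryptor()
    return enc.update(data) + enc.finalize()

file_content = b"the same file uploaded by two different users"

# Conventional encryption: each user picks a random key, so the ciphertexts differ.
c1 = encrypt_aes_ctr(os.urandom(32), os.urandom(16), file_content)
c2 = encrypt_aes_ctr(os.urandom(32), os.urandom(16), file_content)
print("random keys -> identical ciphertexts?", c1 == c2)   # False

# Convergent encryption: key and nonce are derived from the content itself,
# so identical plaintexts produce identical ciphertexts that can be de-duplicated.
k = hashlib.sha256(file_content).digest()        # 32-byte content-derived key
n = hashlib.sha256(k).digest()[:16]              # deterministic 16-byte nonce
c3 = encrypt_aes_ctr(k, n, file_content)
c4 = encrypt_aes_ctr(k, n, file_content)
print("convergent  -> identical ciphertexts?", c3 == c4)   # True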


Chapter One
Introduction
1.1. Background
Currently, commercial large-scale storage services such as Microsoft SkyDrive, Amazon and Google Drive have attracted millions of users. While data redundancy was once an acceptable operational part of the backup process, the rapid growth of digital content in the data center has pushed organizations to rethink how they approach this issue and to look for ways to optimize storage capacity utilization across the enterprise. Explosive data growth over recent years has put considerable pressure on infrastructure and storage management.

Systems such as the Flud backup system [4] and Google [6] save on storage costs by removing duplication. According to a recent survey by IDC [63], 75% of today’s digital data are duplicate copies. To reduce this unnecessary redundancy, storage servers handle de-duplication (either at the file level or on chunks of data sized 4KB and larger) by keeping only one or a few copies of each file and creating a link to that copy for every user who asks to store the file, regardless of how many copies are uploaded. The duplicate copies are replaced by pointers that reference the original block of data in a way that is seamless to the user, who continues to use the file as if all of the blocks of data it contains were his or hers alone.
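
As a rough illustration of this pointer-based approach (the class and method names below are hypothetical, not the data structures specified later in Chapter Five), the following Python sketch indexes stored files by their SHA-256 fingerprint: a second upload of identical content stores no new bytes, only another reference to the existing copy.

# Minimal sketch of file-level de-duplication on the server side.
# Files are indexed by their SHA-256 fingerprint; duplicate uploads only add a pointer.
import hashlib

class DedupStore:
    def __init__(self):
        self.blocks = {}    # fingerprint -> file content (stored once)
        self.catalog = {}   # (user, filename) -> fingerprint (the "pointer")

    def put(self, user: str, filename: str, data: bytes) -> bool:
        """Store a file; returns True if new bytes were written, False if de-duplicated."""
        fp = hashlib.sha256(data).hexdigest()
        is_new = fp not in self.blocks
        if is_new:
            self.blocks[fp] = data           # keep a single physical copy
        self.catalog[(user, filename)] = fp  # every user still "owns" the file
        return is_new

    def get(self, user: str, filename: str) -> bytes:
        return self.blocks[self.catalog[(user, filename)]]

store = DedupStore()
print(store.put("alice", "report.doc", b"identical content"))  # True: stored
print(store.put("bob",   "copy.doc",   b"identical content"))  # False: de-duplicated
print(store.get("bob", "copy.doc") == b"identical content")    # True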

De-duplication can be managed at both the server side and the client side; client-side de-duplication is best known for effectively delivering the following benefits (a minimal sketch of the idea follows this list):

1. Reduced bandwidth requirements

2. Reduced storage space requirements

3. Lower electricity consumption (hence a greener environment)

4. Lower overall cost of storage ...
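
The bandwidth saving in particular comes from checking a fingerprint before transferring any data. The Python sketch below uses hypothetical function names (server_has, server_store, client_upload), not the protocol specified later in this thesis; it shows the client computing the hash locally and uploading the file body only when the server does not already hold it.

# Minimal sketch of client-side de-duplication: the client sends only a
# fingerprint first, and transfers the file body only if the server lacks it.
import hashlib

server_index = set()  # fingerprints of files the server already stores

def server_has(fingerprint: str) -> bool:
    return fingerprint in server_index

def server_store(fingerprint: str, data: bytes) -> None:
    server_index.add(fingerprint)   # a real server would also persist the bytes

def client_upload(data: bytes) -> int:
    """Upload a file; returns the number of bytes actually sent over the network."""
    fp = hashlib.sha256(data).hexdigest()
    if server_has(fp):              # duplicate: only the short fingerprint crossed the wire
        return len(fp)
    server_store(fp, data)          # first copy: the full transfer is unavoidable
    return len(fp) + len(data)

payload = b"x" * 4096
print(client_upload(payload))  # first upload: fingerprint + 4096 bytes
print(client_upload(payload))  # duplicate upload: fingerprint only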
