Xiao Qin's Research

Auburn University

QoSec Project

A Middleware Approach to Teaching Computer Security (2009 - )



Project 4: Writing Hadoop Applications


Project Description

In this project, you will write a security application that runs on Hadoop. The application focuses on cryptography. We will use the Advanced Encryption Standard (AES), the Data Encryption Standard (DES), and the Triple Data Encryption Standard (DESede) as our encryption and decryption algorithms. The goal is to speed up a processing-intensive application by running it on a Hadoop cluster.


Resources

1. Hadoop: The Definitive Guide. Author: Tom White. O’Reilly Media, 2009. Chapter 5.

2. Input files are located on BlackBoard (i.e. 1MB.txt, 64MB.txt, 512MB.txt, 1GB.txt, 2GB.txt, 4GB.txt, 8GB.txt)



System Requirements

1. Ubuntu version 8.04 or later.

2. Sun Java 6

3. SSH installed

4. Hadoop Cluster

5. JCE Policy installed in Java Directory
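
One way to check requirement 5 is to ask the JVM how large a key it will accept for each algorithm. The sketch below uses the standard Cipher.getMaxAllowedKeyLength call; the class name PolicyCheck and its helper method are hypothetical and not part of the assignment. On a JVM with the default restricted policy files, AES is typically capped at 128 bits, while a much larger value suggests the unlimited-strength policy files are in place.

```java
import java.security.NoSuchAlgorithmException;
import javax.crypto.Cipher;

// Hypothetical helper (not part of the assignment) that reports the largest
// key size the installed JCE policy allows for each project algorithm.
final class PolicyCheck {
    static final String[] ALGORITHMS = {"AES", "DES", "DESede"};

    // Returns the maximum permitted key length in bits, or -1 if the
    // transformation string is rejected by the JVM.
    static int maxKeyBits(String algorithm) {
        try {
            return Cipher.getMaxAllowedKeyLength(algorithm);
        } catch (NoSuchAlgorithmException e) {
            return -1;
        }
    }

    public static void main(String[] args) {
        for (String algorithm : ALGORITHMS) {
            System.out.println(algorithm + ": " + maxKeyBits(algorithm) + " bits");
        }
    }
}
```

Running PolicyCheck on every node of the cluster before submitting a job can save a failed run, since each map task uses the policy files of the JVM it runs in.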



Project Tasks

This project gives you hands-on experience implementing cryptography with the Java Cryptography Extension (JCE) library provided by Sun. We will use Password-Based Encryption (PBE) to avoid key management, and Cipher Block Chaining (CBC) mode for all three algorithms.

We will not need a reducer for this application because the keys serve as sequence numbers in the file being encrypted. There will be no combiner or reducer classes.

1. (XX points) Creating a Driver:
Use Chapter 5 of the book listed under Resources as a guideline for creating a Driver class for your cryptographic application. You will create two files that are listed below:



2. (XX points) Creating a Mapper:

Use the Java API to create ciphers for each of the three algorithms listed above (AES, DES, DESede). To run the application, you will need to install the JCE Policy Files, which grant permission to use the cryptography. You will create six files listed below:



Hint: Each mapper should have a static block of code at the beginning of the class to generate the keys. Key generation should not be done in the map() function!
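
The hint above can be sketched as follows. This is only an illustration under stated assumptions: the class AesEncryptMapperSketch stands in for a real mapper (which would extend Hadoop's Mapper class), and the password, salt, iteration count, and the derivations counter are hypothetical values added for the demonstration.

```java
import java.security.GeneralSecurityException;
import javax.crypto.SecretKey;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;
import javax.crypto.spec.SecretKeySpec;

// Stand-in for a mapper class; a real one would extend Hadoop's Mapper.
final class AesEncryptMapperSketch {
    // Hypothetical password, salt, and iteration count chosen for illustration.
    private static final char[] PASSWORD = "project4-password".toCharArray();
    private static final byte[] SALT = {1, 2, 3, 4, 5, 6, 7, 8};
    private static final int ITERATIONS = 1000;

    static final SecretKey KEY;
    static int derivations = 0;   // counts how often the key is derived

    // Runs once when the class is loaded, so the costly PBKDF2 derivation
    // is not repeated for every record.
    static {
        try {
            SecretKeyFactory factory = SecretKeyFactory.getInstance("PBKDF2WithHmacSHA1");
            byte[] raw = factory.generateSecret(
                    new PBEKeySpec(PASSWORD, SALT, ITERATIONS, 128)).getEncoded();
            KEY = new SecretKeySpec(raw, "AES");
            derivations++;
        } catch (GeneralSecurityException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    // Placeholder for map(): it reuses the shared key instead of deriving one.
    static String map(String line) {
        return KEY.getAlgorithm() + " key reused for: " + line;
    }
}
```

However many records pass through map(), the counter stays at 1, which is the point of the hint: PBKDF2 is deliberately slow, so deriving the key per record would dominate the run time.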

For key generation, we will use “PBKDF2WithHmacSHA1” as the instance of the SecretKeyFactory. We will use “(Algorithm)/CBC/PKCS5Padding” as the instance of the cipher. You will also have to create your own initialization vector (AES = 16 bytes, DES = 8 bytes, and DESede = 8 bytes).
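
As a minimal sketch of how these pieces fit together, the class below derives a key with PBKDF2WithHmacSHA1, builds an IV of the size given above, and initializes an (Algorithm)/CBC/PKCS5Padding cipher. The class name CipherSetup, the password, the salt, the iteration count, the requested key sizes, and the all-zero IV are assumptions made for illustration, not requirements of the project.

```java
import java.nio.charset.Charset;
import java.security.GeneralSecurityException;
import javax.crypto.Cipher;
import javax.crypto.SecretKey;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.PBEKeySpec;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical helper class; names and constants are illustrative only.
final class CipherSetup {
    // Assumed demo values: a real job would choose its own password and salt.
    private static final char[] PASSWORD = "project4-password".toCharArray();
    private static final byte[] SALT = {8, 7, 6, 5, 4, 3, 2, 1};
    private static final int ITERATIONS = 1000;

    // Key size in bits requested from PBKDF2 for each algorithm (assumed).
    static int keyBits(String algorithm) {
        if ("AES".equals(algorithm)) return 128;
        if ("DES".equals(algorithm)) return 64;      // 56 key bits + 8 parity bits
        if ("DESede".equals(algorithm)) return 192;  // three 64-bit DES keys
        throw new IllegalArgumentException("Unsupported algorithm: " + algorithm);
    }

    // IV length equals the cipher block size: 16 bytes for AES, 8 otherwise.
    static int ivBytes(String algorithm) {
        return "AES".equals(algorithm) ? 16 : 8;
    }

    // Builds an initialized "(Algorithm)/CBC/PKCS5Padding" cipher.
    static Cipher newCipher(String algorithm, int mode) throws GeneralSecurityException {
        SecretKeyFactory factory = SecretKeyFactory.getInstance("PBKDF2WithHmacSHA1");
        byte[] raw = factory.generateSecret(
                new PBEKeySpec(PASSWORD, SALT, ITERATIONS, keyBits(algorithm))).getEncoded();
        SecretKey key = new SecretKeySpec(raw, algorithm);

        // Fixed all-zero IV, used here only so that a separate decrypt step can
        // rebuild the same cipher; this is a simplification, not a recommendation.
        byte[] iv = new byte[ivBytes(algorithm)];

        Cipher cipher = Cipher.getInstance(algorithm + "/CBC/PKCS5Padding");
        cipher.init(mode, key, new IvParameterSpec(iv));
        return cipher;
    }

    // Encrypts a short string and returns the ciphertext length in bytes.
    static int encryptedLength(String algorithm, String text) {
        try {
            Cipher cipher = newCipher(algorithm, Cipher.ENCRYPT_MODE);
            return cipher.doFinal(text.getBytes(Charset.forName("UTF-8"))).length;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        for (String algorithm : new String[] {"AES", "DES", "DESede"}) {
            System.out.println(algorithm + " -> " + encryptedLength(algorithm, "hello") + " bytes");
        }
    }
}
```

Because PKCS5Padding always rounds up to a whole block, the ciphertext is longer than the plaintext: a 5-byte input becomes 16 bytes under AES and 8 bytes under DES or DESede. Keep this growth in mind when comparing input and output file sizes.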

The overall project should take the given input file, output encrypted data according to the chosen algorithm, and then decrypt the output of the encrypt mapper via the decrypt mapper. The decrypted output should be the same as the original file.
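
The round trip can be tried on a single line before running a full job. The sketch below makes one design assumption that the assignment does not prescribe: the ciphertext is written as hexadecimal text, so that each encrypted record stays on one line and a line-oriented decrypt mapper can read it back. The class LineCodec, its method names, and the password, salt, and all-zero IV are hypothetical.

```java
import java.nio.charset.Charset;
import java.security.GeneralSecurityException;
import javax.crypto.Cipher;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.PBEKeySpec;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical per-line codec; class and method names are illustrative.
final class LineCodec {
    private static final Charset UTF8 = Charset.forName("UTF-8");
    private static final SecretKeySpec KEY;
    private static final IvParameterSpec IV = new IvParameterSpec(new byte[16]);

    static {
        try {
            // Assumed demo password, salt, and iteration count.
            PBEKeySpec spec = new PBEKeySpec("project4-password".toCharArray(),
                    new byte[] {1, 2, 3, 4, 5, 6, 7, 8}, 1000, 128);
            byte[] raw = SecretKeyFactory.getInstance("PBKDF2WithHmacSHA1")
                    .generateSecret(spec).getEncoded();
            KEY = new SecretKeySpec(raw, "AES");
        } catch (GeneralSecurityException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    // Runs one AES/CBC/PKCS5Padding operation in the given mode.
    private static byte[] run(int mode, byte[] input) {
        try {
            Cipher cipher = Cipher.getInstance("AES/CBC/PKCS5Padding");
            cipher.init(mode, KEY, IV);
            return cipher.doFinal(input);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // What an encrypt mapper could emit for one input line: hex text, no newlines.
    static String encryptLine(String line) {
        byte[] encrypted = run(Cipher.ENCRYPT_MODE, line.getBytes(UTF8));
        StringBuilder hex = new StringBuilder(encrypted.length * 2);
        for (byte b : encrypted) {
            hex.append(String.format("%02x", b & 0xff));
        }
        return hex.toString();
    }

    // What a decrypt mapper could do with one line of that output.
    static String decryptLine(String hex) {
        byte[] encrypted = new byte[hex.length() / 2];
        for (int i = 0; i < encrypted.length; i++) {
            encrypted[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        }
        return new String(run(Cipher.DECRYPT_MODE, encrypted), UTF8);
    }
}
```

For any input line, LineCodec.decryptLine(LineCodec.encryptLine(line)) should return the original line. Hex encoding doubles the size of the encrypted output, which is a cost of this particular choice and worth noting in the performance analysis if you adopt it.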



3. (XX points) Performance Analysis:

Run your application with the given input files from Project 3, and record the run time for each file size. Create two graphs, one for encryption and one for decryption, showing response time against file size. Describe in detail what you observed and what problems you encountered. You must also include a README.txt.

Submission

You need to submit a detailed lab report describing what you have done and what you have observed; you also need to explain any observations that are interesting or surprising. Also compress all source files and include a README.txt that gives detailed instructions on how to compile and run your application.