Storage
By Satish Nikam_Senior Architect_CTO Organisation Posted June 22, 2012 In Technology Frontier
Windows Azure Blobs are part of the Windows Azure Storage service, along with Queues and Tables. Windows Azure
Blob Storage can store large amounts of data such as videos, audio, and images. Data stored in Blob storage can be
exposed publicly or kept private, and can be accessed from anywhere via HTTP or HTTPS. A single blob can store up to
200GB (block blob) or 1TB (page blob). A storage account can hold up to 100TB of blobs. Data stored in Windows Azure
Storage is durable: it is replicated three times within the datacenter, providing resiliency to hardware failures.
By default, blobs are also replicated to another sub-region, which provides a high degree of disaster recovery.
Blob Storage can be accessed using the Windows Azure SDK for Java, which is a wrapper over the REST API and provides
a way to work with containers and blobs.
Here we will demonstrate the use of Windows Azure Blob Storage service from a Java application. Blob Storage can be
accessed from a Java application running locally or within Windows Azure worker and web role instances. We recently
published CloudNinja for Java to GitHub, a reference application illustrating how to build multi-tenant Java-based
applications for Windows Azure. CloudNinja for Java uses Windows Azure Blob Storage for storing Tomcat access logs
and tenant logo files.
Here we will discuss the following operations on Blob Storage:
Create and delete a blob container
Create and delete blobs inside a container
Verify the integrity of the blob content
Lease blobs
Create and delete blob snapshots
Set Access Control Levels (ACLs) on blobs and containers
List blobs in a container
Create a directory structure of blobs and containers
Use Shared Access Signatures on containers
PREREQUISITES
The prerequisites for using Windows Azure Blob Storage service from a Java application are:
Windows Azure Libraries for Java
Windows Azure SDK
The Windows Azure SDK provides a Storage Emulator that emulates Windows Azure Storage and is backed by a local SQL
Server instance (SQL Express, by default). While the storage emulator is fine for development, it differs from Windows
Azure Storage. Please see this MSDN article for details about specific differences.
The code below retrieves the emulated storage account. Before running it, ensure that the Storage Emulator
is up and running.
CloudStorageAccount storageAccount =
    CloudStorageAccount.getDevelopmentStorageAccount();
While developing an application, the CloudStorageAccount.getDevelopmentStorageAccount method can be used to
access the emulated storage account. This is particularly useful if the developer does not have access to a Windows
Azure Storage account. However, you should not use this method in code that you deploy to Windows Azure, because
the development storage account is not available in Windows Azure.
An alternative approach to accessing the local emulator storage account is to access it just like you would access a real
storage account, with a storage account name and key in your configuration file. The emulator account has a special
account name and key:
Account name: devstoreaccount1
Account key:
Eby8vdM02xNOcqFlqUwJPLlmEtlCDXJ1OUzFT50uSRZ6IFsuFq2UVErCz4I6tq/K1SZFPTOtr/KBHBeksoGMGw==
You can place these in the local configuration file, and place your real credentials in the cloud configuration file, allowing
you to easily run code against either account without changing any code.
Development storage account details are documented in this MSDN article.
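// Create the blob client from the storage account
CloudBlobClient blobClient = storageAccount.createCloudBlobClient();
// Get a reference to the container and create it if it does not already exist
CloudBlobContainer blobContainer =
    blobClient.getContainerReference("container-name");
blobContainer.createIfNotExist();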
The createIfNotExist method checks whether a container with the same name already exists and creates the blob
container only if it does not. Otherwise, no operation is performed.
Prefer createIfNotExist over the create method, because create throws StorageException if the
specified container name already exists.
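// Delete the container; throws StorageException if the container does not exist
blobContainer.delete();
// Alternatively, delete the container only if it exists
blobContainer.deleteIfExists();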
HOW TO CHECK THE EXISTENCE OF A CONTAINER
If a container exists, the exists method of the container returns true.
For more information on block blobs and page blobs, see Understanding Block Blobs and Page Blobs.
We create a block blob for an image file using the following code. If a blob with the same name already exists, the code
overwrites the existing blob.
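// Delete the blob; throws StorageException if the blob does not exist
blob.delete();
// Alternatively, delete the blob only if it exists
blob.deleteIfExists();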
HOW TO VERIFY INTEGRITY OF THE BLOB CONTENT
Data transferred over a network can be corrupted in transit. While uploading data to or downloading data from the cloud,
the data may get corrupted due to network behavior or other intermittent issues.
To reduce the risk of corrupt data being processed, Windows Azure Blob Storage supports MD5 hashing, which
provides an end-to-end data integrity check.
While uploading a blob, we calculate the Base64-encoded MD5 hash of the blob content and send it in a request
header along with the upload. The service uses this hash to verify the integrity of the data it receives. If the
content of the blob and its MD5 hash don't match, the upload operation fails: the blob is not uploaded, and the
operation throws StorageException.
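As a rough illustration of the hash described above, the following sketch computes a Base64-encoded MD5 digest using only the Java standard library. The class and method names (ContentMd5Example, base64Md5, matches) are hypothetical and are not part of the Windows Azure SDK, and the sketch assumes Java 8 or later for java.util.Base64.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;

// Hypothetical helper, not part of the Windows Azure SDK.
final class ContentMd5Example {

    /** Returns the Base64-encoded MD5 digest of the given bytes. */
    static String base64Md5(byte[] content) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            return Base64.getEncoder().encodeToString(md5.digest(content));
        } catch (NoSuchAlgorithmException e) {
            // Every Java platform is required to provide MD5.
            throw new IllegalStateException("MD5 is not available", e);
        }
    }

    /** Returns true if the content hashes to the expected Base64 MD5 value. */
    static boolean matches(byte[] content, String expectedBase64Md5) {
        return base64Md5(content).equals(expectedBase64Md5);
    }

    public static void main(String[] args) {
        byte[] data = "hello".getBytes(StandardCharsets.UTF_8);
        String hash = base64Md5(data);
        System.out.println("Base64 MD5: " + hash);
        System.out.println("Intact copy matches: " + matches(data, hash));
    }
}
```

The same comparison can be applied after a download: recompute the hash over the received bytes and compare it with the hash stored with the blob.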
blob.releaseLease(accessCondition);
BREAK LEASE
Breaking a lease ends the lease, but ensures that another client cannot acquire a new lease on
the blob until the current lease period has expired. Breaking a lease leaves the blob in an unlocked state for the remaining
duration of the lease period.
// Break Lease
blob.breakLease();
HOW TO CREATE A BLOB SNAPSHOT
Sometimes an application may corrupt the blob content while processing it, for example when a runtime exception
occurs before an operation completes. In this situation, we may need to restore the blob to its original content.
This can be done by creating a blob snapshot beforehand.
Another use of blob snapshots is for creating backups of blobs. The name of a snapshot consists of the base blob
name followed by a DateTime value as a suffix. The DateTime value indicates the time at which the snapshot was taken.
Blob snapshots are read-only. They can be read, copied, and deleted, but never modified.
Upon creation, snapshots have no associated cost. However, as committed blocks (or pages) are replaced in the base
blob, storage costs begin to accrue as the base blob diverges from the snapshot.
More details about snapshots may be found here. Snapshot billing details are here.
The createSnapshot method creates a read-only snapshot of a blob.
// Create a snapshot
CloudBlob snapshotBlob = blob.createSnapshot();
A unique ID is associated with each snapshot blob. Generally, the timestamp of the snapshot blob is the ID. It appears as a
string. For example, 2012-03-26T14:16:18.0174890Z.
// Get the snapshot ID
String snapshotId = snapshotBlob.getSnapshotID();
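Because a snapshot ID is a UTC timestamp, it can be parsed and compared like any other point in time. The sketch below assumes Java 8 or later (java.time) and IDs in the format shown above; the class and method names are hypothetical and not part of the SDK.

```java
import java.time.Instant;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper, not part of the Windows Azure SDK.
final class SnapshotIdExample {

    /** Parses a snapshot ID such as 2012-03-26T14:16:18.0174890Z. */
    static Instant parseSnapshotId(String snapshotId) {
        return Instant.parse(snapshotId);
    }

    /** Returns the most recent snapshot ID from a non-empty list. */
    static String latest(List<String> snapshotIds) {
        return snapshotIds.stream()
                .max(Comparator.comparing(SnapshotIdExample::parseSnapshotId))
                .get();
    }

    public static void main(String[] args) {
        List<String> ids = Arrays.asList(
                "2012-03-26T14:16:18.0174890Z",
                "2012-03-27T09:00:00.0000000Z");
        System.out.println("Latest snapshot: " + latest(ids));
    }
}
```

Picking the latest snapshot this way is one possible basis for restoring a blob to its most recent known-good state.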
Clients can read the container metadata and the blob content, and can also list the blobs within the container.
// Look up the stored access policy named "policy" on the container, then sign with it
SharedAccessPolicy policy =
    blobContainer.downloadPermissions().getSharedAccessPolicies().get("policy");
String signature = blobContainer.generateSharedAccessSignature(policy);
After generating the signature, the format of the URL to access the blobs in the container is:
http://<storage-account-name>.blob.core.windows.net/<container-name>/<blob-name>?<signature>
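The URL format above can be assembled with plain string handling. The sketch below is only an illustration: the class and method names are hypothetical, and the account, container, blob, and signature values are placeholders rather than real credentials.

```java
// Hypothetical helper, not part of the Windows Azure SDK.
final class SasUrlExample {

    /** Builds a blob URL with the shared access signature appended as the query string. */
    static String sasUrl(String account, String container, String blobName, String signature) {
        // Tolerate a signature that already starts with '?'.
        String query = signature.startsWith("?") ? signature.substring(1) : signature;
        return "http://" + account + ".blob.core.windows.net/"
                + container + "/" + blobName + "?" + query;
    }

    public static void main(String[] args) {
        // All values below are placeholders.
        System.out.println(sasUrl("myaccount", "logos", "tenant1/logo.png", "sig=abc"));
    }
}
```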
HOW TO USE SHARED ACCESS SIGNATURE ON BLOBS
A Shared Access Signature (SAS) on a blob allows operations on it for a specific duration, as specified in the code.
The following code generates a SAS with a duration of 30 minutes.
A SAS token might start or expire earlier or later than expected as a result of clock skew. To handle clock skew,
specify a start time a few minutes earlier than required and an expiry time a few minutes later than required.
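The padding described above amounts to simple time arithmetic. The following sketch computes a padded validity window with java.time (Java 8 or later). The class, the Window type, and the five-minute skew allowance are illustrative assumptions, not SDK types or recommended values.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical helper, not part of the Windows Azure SDK.
final class SasWindowExample {

    /** Start and expiry times for a shared access policy. */
    static final class Window {
        final Instant start;
        final Instant expiry;

        Window(Instant start, Instant expiry) {
            this.start = start;
            this.expiry = expiry;
        }
    }

    /** Widens the window [now, now + validity] by the skew allowance on both ends. */
    static Window paddedWindow(Instant now, Duration validity, Duration skew) {
        return new Window(now.minus(skew), now.plus(validity).plus(skew));
    }

    public static void main(String[] args) {
        // 30-minute SAS with an assumed 5-minute allowance for clock skew.
        Window w = paddedWindow(Instant.now(), Duration.ofMinutes(30), Duration.ofMinutes(5));
        System.out.println("start=" + w.start + " expiry=" + w.expiry);
    }
}
```

If the policy setters in the SDK version in use expect java.util.Date values, the instants can be converted with java.util.Date.from.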
CloudBlob blob =
    container.getBlockBlobReference("MainDir/SubDir/sampleTextDoc.txt");
This helps in organizing blobs in a container.
To search through a structure of subdirectories, use the listBlobs method of the blob container and specify
the prefix parameter. The prefix value can be the path to the subdirectory whose contents are to be listed.
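Blob names are flat strings in which "/" merely acts as a delimiter, so listing a "subdirectory" is a matter of matching a name prefix. The sketch below models that matching locally on an in-memory list of names. It does not call the storage service, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Local model of prefix matching; not a call to the Blob Storage service.
final class BlobPrefixExample {

    /** Returns the blob names that start with the given prefix, in their original order. */
    static List<String> withPrefix(List<String> blobNames, String prefix) {
        List<String> result = new ArrayList<>();
        for (String name : blobNames) {
            if (name.startsWith(prefix)) {
                result.add(name);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList(
                "MainDir/SubDir/sampleTextDoc.txt",
                "MainDir/other.txt",
                "logo.png");
        System.out.println(withPrefix(names, "MainDir/SubDir/"));
    }
}
```

Ending the prefix with "/" keeps similarly named directories, such as MainDir/SubDir2, out of the result.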
SUMMARY
In this article we discussed using the Blob Storage service to perform various operations on blobs. We also discussed
how acquiring leases on blobs can be used to manage concurrency, and demonstrated generating shared access
signatures to provide access rights to blobs for specific time periods.