
WEB DOWNLOADS

Build Smarter ASP.NET File Downloading Into Your Web Applications


Joe Stagner

This article discusses:
• Dynamic downloading from ASP.NET sites
• Generating links on the fly
• Resumable downloads and custom handlers
• Security concerns involved with custom downloading mechanisms

This article uses the following technologies:
ASP.NET

Code download available at: Downloading2006_09.exe (174KB)

Contents
The Basic Download Link
Forcing Downloads for All File Types
Downloading Huge Files in Small Pieces
A Better Solution
Resuming Downloads that Fail

Sidebars
Unintended File Access

Chances are good that your users need to download files from your organization's Web site. And since providing a download is as easy as providing a link, you certainly don't need to read an article about the process, right? Well, thanks to so many Web advances, there are many reasons it might not be that easy. Maybe you want the file to be downloaded as a file rather than shown as content in the browser. Maybe you don't yet know the path to the files (or maybe they're not on disk at all), so those simple HTML links aren't possible. Maybe you need to worry about your users losing connectivity during large downloads.

In this article I'll present some solutions to these problems so your users will have a faster, error-free downloading experience.

Along the way I'll discuss dynamically generated links, explain how to bypass default file behaviors, and illustrate resumable

ASP.NET-driven downloads using HTTP 1.1 features.

The Basic Download Link

Let's tackle the missing link problem first. If you don't know what the path to a file is going to be, you could simply pull the list of

links from a database. You could even build the link list dynamically by enumerating the files in a given directory at run time.

Here I'll explore that second approach.

Imagine I built a DataGrid in Visual Basic® 2005 and filled it with links to all the files in the download directory, like you see in

Figure 1. This could be done by using Server.MapPath within the page to retrieve the full path to the download directory

(./downloadfiles/ in this case), retrieving a list of all files in that directory using DirectoryInfo.GetFiles, and then from the resulting

array of FileInfo objects building up a DataTable with columns for each of the relevant properties. That DataTable can be bound to a

DataGrid on the page, through which links can be generated with a HyperLinkColumn definition as follows:

<asp:HyperLinkColumn DataNavigateUrlField="Name"
    DataNavigateUrlFormatString="downloadfiles/{0}"
    DataTextField="Name"
    HeaderText="File Name:"
    SortExpression="Name" />
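To make that description concrete, here is a minimal sketch of the page-load code just outlined. The grid name (grdFiles) and column set are illustrative assumptions, not the exact code behind Figure 1:

' Code-behind sketch; assumes Imports System.IO and Imports System.Data.
' grdFiles is a hypothetical DataGrid declared on the page.
Private Sub Page_Load(ByVal sender As Object, ByVal e As EventArgs) Handles Me.Load
    If Not IsPostBack Then
        ' Resolve the physical path to the download directory.
        Dim dirInfo As New DirectoryInfo(Server.MapPath("./downloadfiles/"))

        ' Build a DataTable from the resulting FileInfo array.
        Dim table As New DataTable()
        table.Columns.Add("Name", GetType(String))
        table.Columns.Add("Length", GetType(Long))
        For Each fileInfo As FileInfo In dirInfo.GetFiles()
            table.Rows.Add(fileInfo.Name, fileInfo.Length)
        Next

        ' Bind the table to the grid; the HyperLinkColumn above
        ' turns each Name value into a download link.
        grdFiles.DataSource = table
        grdFiles.DataBind()
    End If
End Sub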

If you were to click on the links, you would see that the browser treats each file type differently, depending on which helper

applications are registered to open each file type. By default, if you clicked on the .asp page, the .html page, the .jpg, the .gif, or the

.txt, it would open in the browser itself and no Save As dialog would appear. The reason for this is that these file extensions are of

known MIME types. So either the browser itself knows how to render the file, or the operating system has a helper application that

the browser will use. Webcasts (.wmv, .avi, and so on), PodCasts (.mp3 or .wma), PowerPoint® files, and all Microsoft® Office

documents are of known MIME types, presenting a problem if you don't want them opened inline by default.

Figure 1 Simple HTML Links in a DataGrid

In addition, if you allow downloading in this manner, you have only a very general access control mechanism at your disposal.

You can control download access on a directory-by-directory basis, but controlling access to individual files or file types would require detailed access control—a very labor-intensive process for Web masters and system administrators. Fortunately, ASP.NET

and the .NET Framework provide a number of solutions. They include:

• Using the Response.WriteFile method

• Streaming the file using the Response.BinaryWrite method

• Using the Response.TransmitFile method in ASP.NET 2.0

• Using an ISAPI filter

• Writing a custom browser control

Forcing Downloads for All File Types

The most easily employed of the solutions I just listed is the Response.WriteFile method. The basic syntax is very simple; this

complete ASPX page looks for a file path specified as a query string parameter and serves that file up to the client:
<%@ Page language="VB" AutoEventWireup="false" %>
<html>
<body>
<%
' Guard against an empty query string value before writing the file.
If Request.QueryString("FileName") <> "" Then
    Response.Clear()
    Response.WriteFile(Request.QueryString("FileName"))
    Response.End()
End If
%>
</body>
</html>

When your code, which is running in an IIS worker process (aspnet_wp.exe on IIS 5.0 or w3wp.exe on IIS 6.0), calls Response.WriteFile, the ASP.NET worker process starts to send data to the IIS process (inetinfo.exe or dllhost.exe). As the data is sent from the worker process to the IIS process, the data is buffered in memory. In many cases this is not a cause for concern. However, it's not a great solution for very large files.

On the plus side, because the HTTP response that sends the file is created in the ASP.NET code, you have full access to all of ASP.NET's authentication and authorization mechanisms and can therefore make decisions based on authentication status, on the existence of Identity and Principal objects at run time, or on any other mechanism you see fit.

Thus, you can integrate existing security mechanisms like the built-in ASP.NET user and group mechanisms, Microsoft server add-ins such as Authorization Manager and defined role groups, Active Directory® Application Mode (ADAM), or even Active Directory, to provide granular control over download permissions.

Initiating the download from inside your application code also lets you supersede the default behavior for known MIME types. To

accomplish this you need to change the link you display. Here is code to construct a hyperlink that will post back to the ASPX page:

<!-- in the DataGrid definition in FileFetch.aspx -->
<asp:HyperLinkColumn DataNavigateUrlField="Name"
    DataNavigateUrlFormatString="FileFetch.aspx?FileName={0}"
    DataTextField="Name"
    HeaderText="File Name:"
    SortExpression="Name" />

Next you need to check the query string when the page is requested to see whether the request includes a filename argument identifying a file to be sent to the client's browser (see Figure 2). Now, thanks to the Content-Disposition response header, when you click

on one of the links in the grid, you get the save dialog regardless of the MIME type (see Figure 3). Notice, too, that I've restricted

what files can be downloaded based on the result of calling a method named IsSafeFileName. For more information on why I'm doing

this and on what this method accomplishes, see the "Unintended File Access" sidebar.
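The heart of the Figure 2 code can be sketched as follows. This is a minimal, hedged version (the exact path handling lives in the code download); IsSafeFileName is the validation helper discussed in the sidebar:

' A minimal sketch of the forced-download logic described above;
' the download directory path is illustrative.
Dim fileName As String = Request.QueryString("FileName")
If fileName <> "" AndAlso IsSafeFileName(fileName) Then
    Response.Clear()
    ' Content-Disposition forces the Save As dialog for any MIME type.
    Response.AddHeader("Content-Disposition", _
        "attachment; filename=" & fileName)
    Response.WriteFile(Server.MapPath("./downloadfiles/" & fileName))
    Response.End()
End If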

Figure 3 Forcing a File Download Dialog

An important metric to consider when using this technique is the size of the file download. You must limit the size of the file or

you'll expose your site to denial-of-service attacks. Attempts to download files that are larger than resources permit will generate a

runtime error stating that the page cannot be displayed or will display an error like this:

Server Application Unavailable

The Web application you are attempting to access on this Web server is currently unavailable. Please hit the "Refresh" button in your Web browser to retry your request.

Administrator Note: An error message detailing the cause of this specific request failure can be found in the system event log of the Web server. Please review this log entry to discover what caused this error to occur.

The maximum downloadable file size is a function of the hardware configuration and runtime state of the server. To deal with this

issue, see the Knowledge Base article "FIX: Downloading Large Files Causes a Large Memory Loss and Causes the Aspnet_wp.exe

Process to Recycle" at support.microsoft.com/kb/823409.

Symptoms of this problem typically appear when downloading large files such as videos, particularly on Web servers running Windows 2000 and IIS 5.0 (or Windows Server™ 2003 with IIS 6.0 running in compatibility mode). This issue will be exacerbated on Web

servers that are minimally configured with memory since the file must be loaded into server memory before it can be downloaded to

the client.

Empirical evidence generated on my test machine, a server running IIS 5.0 with 2GB of RAM, indicates download failure when file

sizes approach 200MB. In a production environment, the more user downloads running concurrently, the more server memory

constraints will result in user download failures. The solution to this problem requires a few more straightforward lines of code.

Downloading Huge Files in Small Pieces

The file size problem with the previous code sample stems from the single call to Response.WriteFile, which buffers the entire

source file in memory. A better approach for a large file is to read and send it to the client in smaller, manageable chunks, an

example of which is shown in Figure 4. This version of the Page_Load event handler uses a while loop to read the file 10,000 bytes

at a time and then sends those chunks to the browser. Therefore, no significant portion of the file is held in memory at run time. The

chunk size is currently set as a constant, but it could also be modified programmatically, or even moved into a configuration file so it

can be changed to meet server constraints and performance needs. I tested this code with files up to 1.6GB, and the downloads

were fast and resulted in no significant server memory consumption.
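In the spirit of Figure 4, here is a minimal sketch of such a read/send loop. The filePath variable and the 10,000-byte chunk size are illustrative; the real code adds the validation and headers discussed elsewhere in this article:

' Chunked download sketch; assumes Imports System.IO at the top of the
' code-behind and a previously validated filePath string.
Const ChunkSize As Integer = 10000
Dim buffer(ChunkSize - 1) As Byte
Response.Clear()
Using stream As FileStream = File.OpenRead(filePath)
    Dim bytesLeft As Long = stream.Length
    ' Stop sending as soon as the client disconnects.
    While bytesLeft > 0 AndAlso Response.IsClientConnected
        Dim bytesRead As Integer = stream.Read(buffer, 0, ChunkSize)
        Response.OutputStream.Write(buffer, 0, bytesRead)
        Response.Flush()
        bytesLeft -= bytesRead
    End While
End Using
Response.End()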


IIS itself does not support file downloads greater than 2GB in size. If you require larger downloads, you will need to use FTP, a

third-party control, the Microsoft Background Intelligent Transfer Service (BITS), or a custom solution like streaming the data

through sockets to a browser-hosted custom control.

A Better Solution

The commonality of file download requirements, and the ever-increasing size of the files in general, caused the ASP.NET

development team to add a specific method to ASP.NET for downloading files without buffering the file in memory before sending it

to the browser. That method is Response.TransmitFile, which is available in ASP.NET 2.0.

TransmitFile can be used just like WriteFile, but typically yields better performance characteristics. TransmitFile also comes complete with additional functionality. Take a look at the code in Figure 5, which uses some additional features of the newly added

TransmitFile to avoid the aforementioned memory usage problems.

I was able to add some security and fault tolerance with just a few additional lines of code. First, I added a bit of security and logic constraint by using the file extension of the requested file to determine the MIME type, specifying that MIME type in an HTTP header by setting the ContentType property of the Response object:

Response.ContentType = "application/x-zip-compressed"

This allowed me to limit downloads to only certain content types, and map different file extensions to a single content type. Notice

also the statement that adds a Content-Disposition header. This statement let me specify the file name to download, separate from

the original file name on the server's hard disk.

In this code I create a new file name by appending a prefix to the original name. While the prefix here is static, I could

dynamically create a prefix so that the downloaded file name will never conflict with a file name already on the user's hard disk.
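Putting those pieces together, a minimal sketch along the lines of Figure 5 might look like this. The .zip-only restriction and the "Downloaded_" prefix are illustrative choices, not the article's exact listing:

' TransmitFile sketch; IsSafeFileName is the sidebar's validation helper.
Dim fileName As String = Request.QueryString("FileName")
If IsSafeFileName(fileName) AndAlso _
   fileName.ToLower().EndsWith(".zip") Then
    Response.Clear()
    ' Constrain the response to a known content type.
    Response.ContentType = "application/x-zip-compressed"
    ' Rename the file on the client side via Content-Disposition.
    Response.AddHeader("Content-Disposition", _
        "attachment; filename=Downloaded_" & fileName)
    ' TransmitFile streams from disk without buffering the file in memory.
    Response.TransmitFile(Server.MapPath("./downloadfiles/" & fileName))
    Response.End()
End If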

But, what if halfway through fetching a large file, my download fails? While the code thus far has come a long way from a simple

download link, I still can't gracefully handle a failed download and resume downloading a file that has already been partially moved

from the server to the client. All the solutions I have examined so far would require the user to start the download over again from

the beginning in the event of a failure.

Resuming Downloads that Fail

To address the question of resuming a failed download, let's go back to the approach of manually chunking a file for transmission.

While not as simple as the code that uses the TransmitFile method, there is an advantage to manually writing the code to read and send the file in chunks. At any given point in time, the runtime state contains the number of bytes that have already been sent to

the client, and by subtracting that from the total file size, you get the number of bytes remaining to be transmitted in order for the

file to be complete.

If you look back at the code, you'll see that the read/send loop uses the result of Response.IsClientConnected as a loop condition. This test ensures that transmission is suspended if the client is no longer connected. At the first loop iteration in which this test is false (the Web browser that initiated the file download is no longer connected), the server stops sending

data and the remaining bytes required to complete the file can be recorded. What's more, the partial file received by the client can

be saved in the event the user attempts to complete the failed download.

The rest of the resumable download solution comes via some little-known features in the HTTP 1.1 protocol. Normally, HTTP's

stateless nature is the bane of the Web developer's existence, but in this case the HTTP specification is a big help. Specifically, there

are two HTTP 1.1 header elements relevant to the task at hand: Accept-Ranges and ETag.

The Accept-Ranges header element quite simply tells the client, the Web browser in this case, that this process supports resumable downloads. The Entity Tag, or ETag, element specifies a unique identifier for the particular version of the file being served. So the HTTP headers that the ASP.NET application might send to the browser to begin a resumable download might look like this:

HTTP/1.1 200 OK
Connection: close
Date: Mon, 22 May 2006 11:09:13 GMT
Accept-Ranges: bytes
Last-Modified: Mon, 22 May 2006 08:09:13 GMT
ETag: "58afcc3dae87d52:3173"
Cache-Control: private
Content-Type: application/x-zip-compressed
Content-Length: 39551221

Because of the ETag and Accept-Ranges headers, the browser knows that the Web server will support resumable downloads.

If the download fails, when the file is requested again, Internet Explorer will send the ETag, file name, and the value range

indicating how much of the file has been successfully downloaded before the interruption so that the Web server (IIS) can attempt to

resume the download. That second request might look something like this.

GET http://192.168.0.1/download.zip HTTP/1.0
Range: bytes=933714-
Unless-Modified-Since: Sun, 26 Sep 2004 15:52:45 GMT
If-Range: "58afcc3dae87d52:3173"

Notice that the If-Range element contains the original ETag value that the server can use to identify the file to be resent. You'll also

see that the Unless-Modified-Since element contains the date and time that the original download began. The server will use this to

determine whether the file has been modified since the original download began. If it has, the server will restart the download from

the beginning.

The Range element, which is also in the header, tells the server how many bytes are required to complete the file, which the server can use to determine where in the partially downloaded file it should resume.
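To make that concrete, here is a minimal, hedged sketch of how server-side code might parse such a Range header and mark the response as partial. The fileLength variable is an illustrative stand-in for the file's total size, and the parsing handles only the simple "bytes=N-" form:

' Parse a "Range: bytes=N-" request header; format handling is simplified.
Dim rangeHeader As String = Request.Headers("Range")  ' e.g. "bytes=933714-"
Dim startByte As Long = 0
If rangeHeader IsNot Nothing AndAlso rangeHeader.StartsWith("bytes=") Then
    startByte = Long.Parse(rangeHeader.Substring(6).TrimEnd("-"c))
End If
If startByte > 0 Then
    Response.StatusCode = 206  ' Partial Content
    Response.AppendHeader("Content-Range", String.Format( _
        "bytes {0}-{1}/{2}", startByte, fileLength - 1, fileLength))
End If
' ...then seek to startByte in the file stream and run the chunked send loop...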

Different browsers use these headers a bit differently. Other HTTP headers that a client might send to uniquely identify the file

are: If-Match, If-Unmodified-Since, and Unless-Modified-Since. Note that the HTTP 1.1 specification is not specific about which headers a client

should be required to support. It is therefore possible that some Web browsers will not support any of these HTTP headers and

others may use a different header than those that are expected by Internet Explorer®.

By default, IIS will include a header set like the following:

HTTP/1.1 206 Partial Content
Content-Range: bytes 933714-39551221/39551222
Accept-Ranges: bytes
Last-Modified: Sun, 26 Sep 2004 15:52:45 GMT
ETag: "58afcc3dae87d52:3173"
Cache-Control: private
Content-Type: application/x-zip-compressed
Content-Length: 2021408

This header set includes a different response code than that of the original request. The originating response included a code of 200, whereas this one uses a response code of 206, Partial Content, which tells the client that the data to follow is not a complete file, but rather the continuation of a previously initiated download whose file is identified by the ETag.

While some Web browsers rely on the file name itself, Internet Explorer very specifically requires the ETag header. If the ETag

header is not present in the initial download response or the download resumption, Internet Explorer will not attempt to resume the

download; it will simply begin a new one.

In order for the ASP.NET download application to implement a resumable download feature, you need to be able to intercept the

request (for download resumption) from the browser and use the HTTP headers in the request to formulate an appropriate response

in the ASP.NET code. In order to do this you should catch the request a little earlier in the normal sequence of processing.
Thankfully, the .NET Framework is here to help. This is a great example of a fundamental design premise of .NET—providing a

well-factored object library of functionality for a large portion of the standard plumbing work that developers are called on to perform

daily.

In this case, you can take advantage of the IHttpHandler interface provided by the System.Web namespace in the .NET

Framework in order to build your own custom HTTP handler. By creating your own class that implements the IHttpHandler interface, you will

be able to intercept Web requests for a specific file type and respond to those requests in your own code rather than simply allowing

IIS to respond with its default behaviors.
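As a starting point, the skeleton of such a handler might look like the following minimal sketch; the class name and header logic here are illustrative assumptions:

' Skeleton of a custom HTTP handler; names are illustrative.
Imports System.Web

Public Class DownloadHandler
    Implements IHttpHandler

    Public ReadOnly Property IsReusable() As Boolean _
        Implements IHttpHandler.IsReusable
        Get
            Return True
        End Get
    End Property

    Public Sub ProcessRequest(ByVal context As HttpContext) _
        Implements IHttpHandler.ProcessRequest
        ' Inspect the resume-related request headers...
        Dim range As String = context.Request.Headers("Range")
        Dim ifRange As String = context.Request.Headers("If-Range")
        ' ...and answer with either a 200 (full file) or a
        ' 206 (partial content) response built as shown earlier.
    End Sub
End Class

The handler is then mapped to a path or file extension through the <httpHandlers> section of web.config (and, for extensions IIS doesn't already hand to ASP.NET, through an aspnet_isapi.dll mapping in the IIS console).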

The code download for this article contains a working implementation of an HTTP handler that supports resumable downloads.

While there is quite a bit of code to this feature, and its implementation requires some understanding of HTTP mechanics, the .NET

Framework nevertheless makes this a relatively simple implementation. This solution provides the capability to download very large

files, and after the download is initiated, browsing can continue. However, there are certain infrastructure considerations that will be

beyond your control.

For example, many companies and Internet service providers maintain their own caching mechanisms. Broken or misconfigured

Web cache servers can cause large downloads to fail due to file corruption or premature session termination, especially if your file

size is greater than 255MB.

If you require file downloads in excess of 255MB or other custom functions, you may want to consider custom or third-party

download managers. You may, for example, build a custom browser control or browser helper function to manage the downloads,

hand them off to BITS, or even hand off the file request to an FTP client in the custom code. The options are endless and should be

tailored to your specific needs.

From large file downloads in two lines of code to segmented, resumable downloads with custom security, the .NET Framework and

ASP.NET provide a full range of options for building the most suitable download experience for the Web site's end users.

Joe Stagner joined Microsoft in 2001 as a Technical Evangelist and is now a Program Manager for Developer Community in the Tools and Platform
Products group. His 30 years of development experience have afforded him the opportunity to create commercial software applications across a wide
diversity of technical platforms.
