Riding the Bandwidth Wagon in Delphi 5
by Lasse Vågsæther Karlsen
After two weeks of non-stop programming, your web application is ready and
tested. Everything is A-OK and the grins on your clients' faces remind you in
a strange way of "Jaws". Except for one thing. Someone in the back timidly asks
if something could be done to speed up that search result page which contains
so much text. At that instant you know you should have brought a full copy of
the application with you instead of demonstrating it over your 33.6 kbps modem
line at work.
But all is not lost. There are ways to reduce the amount of data that
you need to send to the client, and I'm not talking about transmitting less
information but rather about sending compressed data. Sounds interesting? Read
on and learn how.
Frivolous assumptions
Since this article describes functionality and techniques that will add to
the complexity of a web application, it's assumed you already know how to create
a web application, specifically an ISAPI dll, as well as how such an application
works. As such I will skip over some details but rest assured - I will present
you with all the options and code needed specifically for the techniques we're
implementing here.
With that in mind, let's go squeeze some more juice out of your good ol' Internet line.
How is the magic done?
How is this possible you might ask? You've probably at one time or another
downloaded a compressed file only to find that your browser somehow interpreted
the data as either a web page or text and displayed this on your screen in all
its glory. Not a pretty picture to say the least. If you were to compress the
data before sending it to the client, wouldn't it look just as strange then?
Not if you make sure the client knows that it should handle the data differently.
The secret is hidden both inside the data the client sends to the webserver
and inside the data the webserver responds with. It's called Content Encoding.
In short, you can encode the data your application returns to the client, and
the only precaution you need to take is to make sure the client knows how to handle
the data in the encoding-format you choose. This in turn is simple since the
client tells you what formats it can handle when it sends the request to the
server.
So what it all boils down to, is that you need to take the following steps
if you want to encode the data returned to the client:
- Check whether the client can handle the encoding type you want to use
- Encode the data in the chosen format
- Return the newly encoded data and tell the client what format you've encoded it in
What format should I use?
We are interested in compressing the data that we return to the client. There is an encoding type specifically for this purpose, and its name is "deflate". The compression algorithm used in the deflate encoding type corresponds to the algorithm that the zLib compression library implements. You can read more about this library here:
http://www.cdrom.com/pub/infozip/zlib
or check the RFC describing the algorithm and its binary format here: http://www.funet.fi/pub/doc/rfc/rfc1950.txt.
Although you might think that you already have the files needed for using the zLib compression library, you don't! At least not exactly. Even though the
Delphi installation CD comes with a copy of the zLib compression library in
the form of precompiled object files and some import files, they hide the details
we need to use. More on that later, but for now suffice it to say that we need
a better interface to the library, and for that I have chosen to supply you with
my own zLib import unit and a link to the downloadable
precompiled dll: http://www.winimage.com/zLibDll/.
As for the 'deflate' encoding type, only Microsoft Internet Explorer appears
to handle it, and only the later versions at that (version 4 and up handle it
for sure; I can't vouch for anything below that). This is not a big problem,
however, since other browsers, like Netscape, don't claim to handle the
deflate encoding type. In that case our web application simply won't return
compressed data. The only difference is that the data takes a little longer
to download. That is no worse than what we have today, so I think we can live
with it.
Ok, I got the files, now what?
Now it's time to get down to the gory details. Let's get off to a good start
by creating a new ISAPI project in Delphi 5 and see where that takes us. You
should add the downloaded import unit to this project as well. The dll you just
downloaded can be put either in the C:\Winnt\System32 directory (or your corresponding
directory) or in the same directory as your web application.
After creating the new project let's add some action to it, literally. Add
an action to the web module and create an empty event handler for it. Make the
action the default action as well since this is just a demo application for
trying out our new way of returning data.
We now have an empty action event handler so let's add some code to it that
will make it do what we need. I'll show the complete event handler first and
then I'll go through the details.
procedure TWebModule1.WebModule1WebActionItem1Action(Sender: TObject;
  Request: TWebRequest; Response: TWebResponse; var Handled: Boolean);
var
  PlaintextStream  : TStream;
  CompressedStream : TStream;
begin
  if ( ClientAcceptsDeflate( Request ) ) then
  begin
    // 1. First, create temporary stream with the data to return
    PlaintextStream := TStringStream.Create( 'This text is compressed' );
    try
      // 2. Second, create temporary stream for our compressed data
      CompressedStream := TMemoryStream.Create;
      try
        // 3. Now compress the stream...
        zLibCompressStream( PlaintextStream, CompressedStream );
        // ... and return it
        CompressedStream.Position := 0;
        Response.ContentStream := CompressedStream;
      except
        FreeAndNil( CompressedStream );
        raise;
      end; // try except - avoid memory leaks
    finally
      // 4. Finally tidy up temporary object
      FreeAndNil( PlaintextStream );
    end; // try finally - destroy plaintext stream object
    Response.ContentType := 'text/plain';
    Response.ContentEncoding := 'deflate';
    Response.StatusCode := 200;
    Handled := True;
  end // if client accepts compressed data
  else begin
    Response.Content := 'Not compressed';
    Response.ContentType := 'text/plain';
    Response.StatusCode := 200;
    Handled := True;
  end; // if client does not accept compressed data
end; // procedure TWebModule1.WebModule1WebActionItem1Action
The primary if-statement here determines whether or not the client can handle
the compressed data and then sends either compressed or uncompressed data back
to the client accordingly. The uncompressed data is handled as you have always
handled data in a web application so we won't discuss that further. Instead
we're going to concentrate on the if-then part of the if-statement that handles
compressed data. You probably noticed that we're using two new procedures/functions
here, namely ClientAcceptsDeflate and zLibCompressStream. I will go through
those later in this article.
Assuming we have a procedure that takes one stream as input, compresses the
data this stream holds and writes the compressed data to a stream as output,
we can describe the code shown above like this:
- First, create a temporary stream containing whatever we want to return to
the client
- Second, compress this data and put the compressed data into a new stream
- This new stream, holding our compressed data, we simply return to the client
- Finally, we tidy up our temporary objects
You can find the matching points of this list in the numbered comments of the
above event handler. It's pretty basic code, and it ought to be too since we've
hidden the gory details in two functions, which we'll discuss next.
One thing to note is that once we assign our stream to the ContentStream
property of the Response object, the response object takes ownership of the
stream. Once the response data has been sent to the client, the stream will be
freed for us, so we must make sure we don't accidentally free it ourselves. In
the case of an exception, however, I assume that the assignment went haywire
and thus free the stream before propagating the exception higher up.
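The ownership hand-over can also be written with a try/finally instead of a try/except. The following is only a sketch of that pattern, not the code used in the example project: TOwningResponse is a hypothetical stand-in for the real response object (assumed here to free whatever stream it was given), and DemoHandOver and FillResponse are made-up names for illustration.

```pascal
program OwnershipSketch;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils, Classes;

type
  // Hypothetical stand-in for the response object: it frees the stream it
  // was handed, which is the behaviour described above for ContentStream.
  TOwningResponse = class
  private
    FContentStream : TStream;
  public
    destructor Destroy; override;
    property ContentStream: TStream read FContentStream write FContentStream;
  end;

destructor TOwningResponse.Destroy;
begin
  FContentStream.Free;
  inherited Destroy;
end;

procedure FillResponse( Response: TOwningResponse );
var
  Body : TMemoryStream;
begin
  Body := TMemoryStream.Create;
  try
    Body.Size := 5;                  // pretend we produced 5 bytes of output
    Body.Position := 0;
    Response.ContentStream := Body;  // ownership moves to the response here
    Body := nil;                     // so we drop our own reference
  finally
    Body.Free;                       // no-op after hand-over, frees on failure
  end;
end;

// Runs the hand-over once and reports the size of the stream the response owns.
function DemoHandOver: Int64;
var
  Response : TOwningResponse;
begin
  Response := TOwningResponse.Create;
  try
    FillResponse( Response );
    Result := Response.ContentStream.Size;
  finally
    Response.Free;                   // frees the stream exactly once
  end;
end;

begin
  WriteLn( DemoHandOver );           // prints 5
end.
```

Setting the local variable to nil right after the assignment means the finally block only frees the stream when the hand-over never happened, which is the same guarantee the try/except version gives.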
Parlez-vous français?
To determine whether the client knows how to handle compressed data we have
to take a look at the data it sends us in the first place. A typical web request
looks like this (fake request, so the details might not be 100% correct):
GET /index.html HTTP/1.0
Accept: */*
Accept-Encoding: gzip, deflate
User-Agent: Mozilla 4.0 (Microsoft Internet Explorer 5.0 Compatible; NT)
What we're interested in is the line that goes Accept-Encoding: gzip, deflate.
It tells us what encoding types the client is able to accept, and in this case
it can accept data that is encoded in the gzip format as well as the deflate
format. The latter is the one we need, so let's see how to obtain that knowledge
from within our web application. The function looks like this:
function ClientAcceptsDeflate( const Request: TWebRequest ): Boolean;
var
  EncodingTypes : string;
begin
  // Get and reformat list of encoding types from the request
  EncodingTypes := Request.GetFieldByName( 'HTTP_ACCEPT_ENCODING' );
  EncodingTypes := UpperCase( StringReplace( EncodingTypes, ',', '/', [ rfReplaceAll ] ) );
  EncodingTypes := '/' + StringReplace( EncodingTypes, ' ', '', [ rfReplaceAll ] ) + '/';
  // Return the flag
  Result := ( Pos( '/DEFLATE/', EncodingTypes ) > 0 );
end; // function ClientAcceptsDeflate
In short I reformat the values gzip, deflate into /GZIP/DEFLATE/
and then check to see if the string /DEFLATE/ is found within it. If
you're interested in knowing what other fields can be found in the request then
I suggest you take a look at http://msdn.microsoft.com/library/psdk/iisref/isre504l.htm
and use the ALL_HTTP variable to check what variables the client actually sends.
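To try the reformatting trick without a web server, the same string handling can be pulled out into a function that takes the raw header value. This is a minimal sketch: HeaderAcceptsDeflate is a hypothetical name, not part of the downloadable unit, and the sample header values are made up.

```pascal
program AcceptEncodingDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils;

// Hypothetical helper: same normalisation as ClientAcceptsDeflate, but it
// takes the header value directly so it can run without a TWebRequest.
function HeaderAcceptsDeflate( const HeaderValue: string ): Boolean;
var
  EncodingTypes : string;
begin
  // 'gzip, deflate' becomes '/GZIP/DEFLATE/'
  EncodingTypes := UpperCase( StringReplace( HeaderValue, ',', '/', [ rfReplaceAll ] ) );
  EncodingTypes := '/' + StringReplace( EncodingTypes, ' ', '', [ rfReplaceAll ] ) + '/';
  // Only a whole entry matches, not a substring of a longer name
  Result := ( Pos( '/DEFLATE/', EncodingTypes ) > 0 );
end;

begin
  WriteLn( HeaderAcceptsDeflate( 'gzip, deflate' ) );   // TRUE
  WriteLn( HeaderAcceptsDeflate( 'gzip' ) );            // FALSE
  WriteLn( HeaderAcceptsDeflate( '' ) );                // FALSE
  WriteLn( HeaderAcceptsDeflate( 'x-deflate-like' ) );  // FALSE
  WriteLn( HeaderAcceptsDeflate( 'deflate;q=0.5' ) );   // FALSE
end.
```

The last case shows a limitation of the slash-delimited match: an entry that carries a parameter, such as a quality value, is not recognised, so such a client simply gets the uncompressed response. That is the safe direction to fail in.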
Naturellement, parlons!
After we've determined that the client can indeed handle compressed data all
we've got left to do is actually produce the compressed data and this is where
the magic enters.
As stated earlier, we will use the zLib compression library to do the actual
compressing. The code involves the following steps:
1. Set up buffers for feeding data to the engine as well as accepting compressed
   data from it
2. Initialize the compression engine
3. Feed plain text data into the input buffer from the input stream
4. Compress the input buffer to the output buffer
5. Write data from the output buffer to the output stream
6. Repeat steps 3-5 until there is no more data in the input stream and the
   buffers have been emptied
7. Close the compression engine
Let's dig into the details and see what we have to deal with:
procedure zLibCompressStream( const Source, Destination: TStream );
var
  z_s : z_stream;
  rc  : Integer;
  // 1. Buffers for input and output
  SourceBuffer      : array[ 0..BufferSize-1 ] of Byte;
  DestinationBuffer : array[ 0..BufferSize-1 ] of Byte;
begin
  // 2. Prepare the zLib data record
  z_init_zstream( z_s );
  z_s.next_in := @SourceBuffer;
  z_s.next_out := @DestinationBuffer;
  z_s.avail_out := BufferSize;
  // 2. Initialize the compression engine (negative window size = no header)
  deflateInit2( z_s, Z_BEST_COMPRESSION, Z_DEFLATED, -15, 9, Z_DEFAULT_STRATEGY );
  // Now compress the stream
  try
    repeat
      // 3. See if we got to feed more data to the compression engine
      if ( z_s.avail_in = 0 ) and ( Source.Position < Source.Size ) then
      begin
        z_s.next_in := @SourceBuffer;
        z_s.avail_in := Source.Read( SourceBuffer, BufferSize );
      end; // if input data completely depleted
      // 4. Compress the data. Z_STREAM_END is a return code, not a flush
      //    mode, so pass Z_NO_FLUSH while there is still input to consume
      //    and Z_FINISH once the source is exhausted.
      if ( z_s.avail_in = 0 ) then
        rc := deflate( z_s, Z_FINISH )
      else
        rc := deflate( z_s, Z_NO_FLUSH );
      // 5. Check if we got compressed data to write to the destination
      if ( z_s.avail_out = 0 ) or ( rc = Z_STREAM_END ) then
      begin
        Destination.WriteBuffer( DestinationBuffer, BufferSize - z_s.avail_out );
        z_s.avail_out := BufferSize;
        z_s.next_out := @DestinationBuffer;
      end; // if got data available for writing
      // 6. Repeat until buffers exhausted
    until ( rc <> Z_OK ) or ( ( rc = Z_STREAM_END ) and ( z_s.avail_out = BufferSize ) and ( z_s.avail_in = 0 ) );
  finally
    // 7. Clean up the engine data
    deflateEnd( z_s );
  end; // try finally - clean up after engine
end; // procedure zLibCompressStream
As before, you can match the points from this list with the numbered comments
above. The reason we could not use the zLib code that's included with Delphi
is that it hides the deflateInit2 routine and the necessary parameters inside
the implementation part of the unit, and does not expose all the code we
need.
In order to produce compressed data in a manner that the browser can handle,
we need to compress the data with no header record. The header record is a small
record of information that is written to the very start of the compressed data
and tells the decompression engine how the data that follows was compressed.
We can opt not to write this header record by passing a negative value for the
wBitSize parameter to the deflateInit2 procedure. Since the deflate encoding
that the browsers implement neither expects nor knows how to handle this
header, we have to leave it out. Since we could not call deflateInit2 directly
with the zLib code that came with Delphi, we had to resort to a full dll copy
of the compression library.
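One way to check that the header really is gone is to look at the first two bytes of the output. In the zLib format described by RFC 1950 the header is two bytes: the low four bits of the first byte hold the compression method (8 for deflate), the high four bits hold the window size, and the two bytes read as a 16-bit number must be a multiple of 31. The sketch below tests for that pattern; LooksLikeZlibHeader is a hypothetical helper, not part of the zLib library or the import unit, and the sample bytes are only illustrations.

```pascal
program ZlibHeaderCheck;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

// Hypothetical helper: True when the two bytes satisfy the RFC 1950 rules
// for a zLib header, False otherwise.
function LooksLikeZlibHeader( B0, B1: Byte ): Boolean;
begin
  Result := ( ( B0 and $0F ) = 8 )                 // method 8 = deflate
        and ( ( B0 shr 4 ) <= 7 )                  // window size up to 32K
        and ( ( ( B0 * 256 + B1 ) mod 31 ) = 0 );  // header check bits
end;

begin
  WriteLn( LooksLikeZlibHeader( $78, $9C ) );  // TRUE
  WriteLn( LooksLikeZlibHeader( $78, $DA ) );  // TRUE
  WriteLn( LooksLikeZlibHeader( $4B, $04 ) );  // FALSE
end.
```

Treat a TRUE result as a hint rather than proof: headerless data can start with two bytes that happen to satisfy the rule.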
The compression engine compresses data from the input buffer and writes the
result to the output buffer. When the output buffer is full, our code needs
to flush it and write the data in it to the destination, in our case a stream.
When the engine has compressed all the data in the input buffer, our code
needs to fill the buffer up again with as much data as possible. The
compression engine takes care of the rest.
Testing it
After compiling your web application (see bottom of article for a copy of the
example project implemented in this article) you should ideally test it with
both a browser that handles compressed data and with one that doesn't. You can
use Internet Explorer 4/5 as the former and Netscape 4.06 as the latter. The
browser that handles compression should show the text 'This text is compressed'
and the other one 'Not compressed' for verification.
The average compression ratio on text-based content is approximately 5-6 times
(the result is 15-20% of the original size), so the effect should be clearly
noticeable on large web pages.
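To put rough numbers on that, here is a back-of-the-envelope calculation for a 33.6 kbps line. It is a sketch under simplifying assumptions: the 100,000-byte page size is made up, and the formula ignores protocol overhead, latency and any compression the modem itself performs. TransferSeconds is a hypothetical helper name.

```pascal
program TransferTimeEstimate;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

// Hypothetical helper: seconds needed to push PayloadBytes over a line that
// carries LineBitsPerSecond, with no overhead of any kind.
function TransferSeconds( PayloadBytes, LineBitsPerSecond: Integer ): Double;
begin
  Result := ( PayloadBytes * 8 ) / LineBitsPerSecond;
end;

begin
  WriteLn( TransferSeconds( 100000, 33600 ):0:1 );  // uncompressed: 23.8
  WriteLn( TransferSeconds(  20000, 33600 ):0:1 );  // 20% of original: 4.8
  WriteLn( TransferSeconds(  15000, 33600 ):0:1 );  // 15% of original: 3.6
end.
```

Under these assumptions a page that takes about 24 seconds uncompressed arrives in roughly 4 to 5 seconds.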
Wrapping it up
Well, that's it. With the code and knowledge contained in this article you
should now be able to deal with compressed data from your web application. Even
though we created an ISAPI dll in this article, the theory and code should remain
the same also for CGI and NSAPI applications.
I've taken the liberty to create a unit with the two functions described above,
as well as a copy of the example produced in this article. You can download
the files from the list below. If there are any suggestions or things you would
like to comment on I can be reached at lasse@cintra.no.
Files for download:
There are a couple of finishing notes to bear in mind:
- The compression engine does not determine whether the data lends itself
easily to compression or not before it starts chewing on it. This means that
it is possible to feed data through it that cannot be compressed and might
even increase in size instead. For text and web pages this is not a problem,
but I would do some tests before feeding jpegs or gifs into it.
- The compression is done server-side before the data is sent, so if the client
is trying to download a very large web page then essentially the web application
loads the entire page into memory, compresses it and sends it. If memory consumption
on the server-side is a problem then I would suggest implementing the compression
code in a TStream-derived class that compresses when you read from it. That
way compression is done on-the-fly as the data is sent and can be fed directly
off the disk through the compression library to the client. Classes for doing
this are available at my homepage in the package called StreamFilter.
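The shape of such a read-through stream can be sketched without the compression library. The class below is not the StreamFilter package and not a compressor: TUpperCaseReadStream is a hypothetical stand-in whose Read method pulls bytes from a source stream and transforms them on the way out (here by upper-casing ASCII letters). A real version would run the bytes through the compression engine at that point instead, and would need more care, since compressed output does not map one-to-one onto input bytes.

```pascal
program ReadFilterSketch;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils, Classes;

type
  // Hypothetical read-only, forward-only filter stream.
  TUpperCaseReadStream = class( TStream )
  private
    FSource : TStream;
  public
    constructor Create( ASource: TStream );
    function Read( var Buffer; Count: Longint ): Longint; override;
    function Write( const Buffer; Count: Longint ): Longint; override;
    function Seek( Offset: Longint; Origin: Word ): Longint; override;
  end;

constructor TUpperCaseReadStream.Create( ASource: TStream );
begin
  inherited Create;
  FSource := ASource;   // not owned; the caller frees the source
end;

function TUpperCaseReadStream.Read( var Buffer; Count: Longint ): Longint;
var
  Bytes : PByteArray;
  I     : Integer;
begin
  // Pull raw data from the source only when somebody asks for it...
  Result := FSource.Read( Buffer, Count );
  // ...and transform it in place before handing it on
  Bytes := @Buffer;
  for I := 0 to Result - 1 do
    if Bytes^[ I ] in [ Ord( 'a' )..Ord( 'z' ) ] then
      Dec( Bytes^[ I ], 32 );
end;

function TUpperCaseReadStream.Write( const Buffer; Count: Longint ): Longint;
begin
  raise EStreamError.Create( 'This stream is read-only' );
end;

function TUpperCaseReadStream.Seek( Offset: Longint; Origin: Word ): Longint;
begin
  raise EStreamError.Create( 'This stream is forward-only' );
end;

// Reads Text through the filter in small chunks and returns what came out.
function UpperViaFilter( const Text: string ): string;
var
  Source : TStringStream;
  Filter : TUpperCaseReadStream;
  Chunk  : array[ 0..7 ] of Byte;
  Got, I : Integer;
begin
  Result := '';
  Source := TStringStream.Create( Text );
  try
    Filter := TUpperCaseReadStream.Create( Source );
    try
      repeat
        Got := Filter.Read( Chunk, SizeOf( Chunk ) );
        for I := 0 to Got - 1 do
          Result := Result + Chr( Chunk[ I ] );
      until Got = 0;
    finally
      Filter.Free;
    end;
  finally
    Source.Free;
  end;
end;

begin
  WriteLn( UpperViaFilter( 'fed straight off the source' ) );
  // prints FED STRAIGHT OFF THE SOURCE
end.
```

Only eight bytes are held in memory at a time here, which is the point of the design: the whole page never has to be loaded before sending starts. Whether the response object can work with a stream that cannot seek or report its size is something to verify before relying on this approach.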
About the Author
Lasse Vågsæther Karlsen is a senior software developer working with
Cintra Software Engineering in Porsgrunn, Norway. He is 28 years old
and has been using Borland development tools since Turbo Pascal 3.0.
If you have an article you'd like to contribute to the community, please
send it along with any bitmaps and file attachments to
David Intersimone (davidi@inprise.com).