This article explains how to compress multiple files with System.IO.Compression.GZipStream using C#.
I searched all over the web for how to do this, and everyone says that it can't be done. The truth is that you can write code to compress multiple files with GZipStream; it just takes some work.
Before jumping into the code, it helps to go over the key concepts and to understand what the System.IO.Compression.GZipStream class can do before we start adding functionality.
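Constructors - both constructors take a Stream object as the first argument and a CompressionMode value as the second argument. One of them also takes a third bool argument, leaveOpen, which controls whether the underlying Stream is left open when the GZipStream is closed.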
The CompressionMode values are:
System.IO.Compression.CompressionMode.Compress
System.IO.Compression.CompressionMode.Decompress
GZipStream has a Read method and a Write method that we will use.
Let's assume that we are first going to compress a file and later decompress it. Create a Stream object: use System.IO.FileStream to open a Stream on the destination file that will hold the compressed data. Next, create the GZipStream object by passing in your FileStream and CompressionMode.Compress.
Now that you have handed the Stream over to the GZipStream class, you no longer interact with the Stream directly. Instead, you read from and write to the GZipStream, and when you close it, the underlying Stream is closed as well, unless you passed true for leaveOpen (as the samples below do), in which case you still have to close the underlying Stream yourself.
Note: When working with Streams you should always have a try, catch, finally block to ensure that the Stream gets closed no matter what happens.
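As an alternative to hand-written try/finally blocks, C# `using` statements give the same guarantee with less code. The following is only a minimal sketch, not the code used in the rest of this article; the class and method names (`UsingSketch`, `CompressFile`) are made up for illustration.

```csharp
using System.IO;
using System.IO.Compression;

public static class UsingSketch
{
    // Compresses srcFile into dstFile. Each using block disposes its stream
    // when the block exits, even if an exception is thrown inside it.
    public static void CompressFile(string srcFile, string dstFile)
    {
        using (FileStream fsIn = new FileStream(srcFile, FileMode.Open, FileAccess.Read, FileShare.Read))
        using (FileStream fsOut = new FileStream(dstFile, FileMode.Create, FileAccess.Write, FileShare.None))
        using (GZipStream gzip = new GZipStream(fsOut, CompressionMode.Compress))
        {
            byte[] buffer = new byte[4096];
            int count;
            // Read returns 0 at the end of the file; copy whatever each call returns
            while ((count = fsIn.Read(buffer, 0, buffer.Length)) > 0)
            {
                gzip.Write(buffer, 0, count);
            }
        } // gzip is disposed first, which flushes the compressed data into fsOut
    }
}
```

Because leaveOpen is not passed here, disposing the GZipStream also closes fsOut; the second dispose of fsOut is harmless.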
Your C# code file should have the following using statements:
using System;
using System.IO;
using System.IO.Compression;
using System.Text;
Here are sample functions to compress and decompress a file:
private void TestCompress()
{
string srcFile = "C:\\temp\\file-to-compress.txt";
string dstFile = "C:\\temp\\compressed-file.gzip";
FileStream fsIn = null; // will open and read the srcFile
FileStream fsOut = null; // will be used by the GZipStream for output to the dstFile
GZipStream gzip = null;
byte[] buffer;
int count = 0;
try
{
fsOut = new FileStream(dstFile, FileMode.Create, FileAccess.Write, FileShare.None);
gzip = new GZipStream(fsOut, CompressionMode.Compress, true);
fsIn = new FileStream(srcFile, FileMode.Open, FileAccess.Read, FileShare.Read);
buffer = new byte[4096];
// compress to the destination file in chunks;
// a single Read call may return fewer bytes than requested
while ((count = fsIn.Read(buffer, 0, buffer.Length)) > 0)
{
gzip.Write(buffer, 0, count);
}
fsIn.Close();
fsIn = null;
}
catch (Exception ex)
{
// handle or display the error
System.Diagnostics.Debug.Assert(false, ex.ToString());
}
finally
{
if (gzip != null)
{
gzip.Close();
gzip = null;
}
if (fsOut != null)
{
fsOut.Close();
fsOut = null;
}
if (fsIn != null)
{
fsIn.Close();
fsIn = null;
}
}
}
private void TestDecompress()
{
string srcFile = "C:\\temp\\compressed-file.gzip";
string dstFile = "C:\\temp\\decompressed-file.txt";
FileStream fsIn = null; // will open and read the srcFile
FileStream fsOut = null; // will be used by the GZipStream for output to the dstFile
GZipStream gzip = null;
const int bufferSize = 4096;
byte[] buffer = new byte[bufferSize];
int count = 0;
try
{
fsIn = new FileStream(srcFile, FileMode.Open, FileAccess.Read, FileShare.Read);
fsOut = new FileStream(dstFile, FileMode.Create, FileAccess.Write, FileShare.None);
gzip = new GZipStream(fsIn, CompressionMode.Decompress, true);
// Read returns 0 only at the end of the stream;
// a short read does not mean the end has been reached
while ((count = gzip.Read(buffer, 0, bufferSize)) > 0)
{
fsOut.Write(buffer, 0, count);
}
}
catch (Exception ex)
{
// handle or display the error
System.Diagnostics.Debug.Assert(false, ex.ToString());
}
finally
{
if (gzip != null)
{
gzip.Close();
gzip = null;
}
if (fsOut != null)
{
fsOut.Close();
fsOut = null;
}
if (fsIn != null)
{
fsIn.Close();
fsIn = null;
}
}
}
Note that "compressed-file.gzip" is the destination/output file for the TestCompress method and it is the source/input for the TestDecompress method.
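To see the compress and decompress steps without touching the disk, here is a small in-memory round trip. It is a sketch under my own naming (`RoundTripSketch` is not part of the article's classes), and it swaps the FileStream objects for MemoryStream objects; the GZipStream calls are the same ones used above.

```csharp
using System.IO;
using System.IO.Compression;

public static class RoundTripSketch
{
    public static byte[] Compress(byte[] data)
    {
        using (MemoryStream output = new MemoryStream())
        {
            // leaveOpen: true keeps 'output' usable after the GZipStream is disposed
            using (GZipStream gzip = new GZipStream(output, CompressionMode.Compress, true))
            {
                gzip.Write(data, 0, data.Length);
            } // disposing the GZipStream flushes the remaining compressed bytes
            return output.ToArray();
        }
    }

    public static byte[] Decompress(byte[] compressed)
    {
        using (MemoryStream input = new MemoryStream(compressed))
        using (GZipStream gzip = new GZipStream(input, CompressionMode.Decompress))
        using (MemoryStream output = new MemoryStream())
        {
            byte[] buffer = new byte[4096];
            int count;
            // keep reading until Read returns 0; short reads are normal
            while ((count = gzip.Read(buffer, 0, buffer.Length)) > 0)
            {
                output.Write(buffer, 0, count);
            }
            return output.ToArray();
        }
    }
}
```

Note that the compressed bytes are only complete after the GZipStream has been closed or disposed, which is why Compress reads the MemoryStream after the inner using block ends.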
Ok, now that the basics are out of the way it is time to get to the heart of this article which is compressing multiple files with GZipStream. The code is not really that challenging once you have a good game plan.
The approach that I have taken is to create my own "zip format" for storing multiple files in a single file that is compressed with GZipStream. I first write each source file to a FileStream (that writes to a temp file on the disk) with some information that identifies the file, its folder, and size. The result is a big file that is somewhat larger than all of the source files combined. The next step is to use GZipStream (as seen above) to compress the big temp file. Finally, the temp file can be deleted.
To decompress the gzip file, first use GZipStream to decompress to a temp file - you will end up with the same temp file that you created when you were compressing/zipping. To extract the individual files in the temp file, open the temp file with a FileStream and start reading the contents of the file based on the information in the temp file that tells your code where a file starts, ends, what its name is, and what folder it should go into.
Here is an example of the "VWD-CMS GZip Format" (the binary contents of the image files have been removed):
------------------------------------------------------
0,vwdcms/forum/images/announcement.gif,6/4/2007 10:13:36 AM,255
[binary contents of the gif file]
1,vwdcms/forum/images/example.gif,6/4/2007 10:14:04 AM,198
[binary contents of the gif file]
2,vwdcms/forum/images/question.gif,6/4/2007 10:14:04 AM,276
[binary contents of the gif file]
------------------------------------------------------
The file header lines contain the file index, the relative file path, the last modified date, and the length (bytes) of the file.
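The header-plus-contents layout can be sketched in a few lines. This is a simplified illustration, not the VwdCms code below: it assumes a three-field header (index, path, length) without the modified date, uses UTF-8 for the header text, and the names `PackSketch`, `WriteEntry`, and `ReadEntries` are hypothetical.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Text;

public static class PackSketch
{
    // Appends one entry: "index,path,length\n", the raw bytes, then a linefeed.
    public static void WriteEntry(Stream output, int index, string relativePath, byte[] content)
    {
        byte[] header = Encoding.UTF8.GetBytes(index + "," + relativePath + "," + content.Length + "\n");
        output.Write(header, 0, header.Length);
        output.Write(content, 0, content.Length);
        output.WriteByte(10);
    }

    // Reads every entry back into a path -> contents map.
    public static Dictionary<string, byte[]> ReadEntries(Stream input)
    {
        Dictionary<string, byte[]> entries = new Dictionary<string, byte[]>();
        string line;
        while ((line = ReadLine(input)) != null)
        {
            if (line.Length == 0) continue; // the linefeed that trails each entry
            string[] parts = line.Split(',');
            int length = int.Parse(parts[2]);
            byte[] content = new byte[length];
            int read = 0;
            // the length from the header says exactly how many bytes belong to this file
            while (read < length)
            {
                int n = input.Read(content, read, length - read);
                if (n == 0) throw new EndOfStreamException("entry is truncated");
                read += n;
            }
            entries[parts[1]] = content;
        }
        return entries;
    }

    // Returns the next header line without its linefeed, or null at the end of the stream.
    private static string ReadLine(Stream input)
    {
        List<byte> bytes = new List<byte>();
        int b;
        while ((b = input.ReadByte()) != -1)
        {
            if (b == 10) return Encoding.UTF8.GetString(bytes.ToArray());
            bytes.Add((byte)b);
        }
        return bytes.Count > 0 ? Encoding.UTF8.GetString(bytes.ToArray()) : null;
    }
}
```

Because the reader trusts the length field rather than scanning for a delimiter, the file contents may contain linefeeds or commas. A comma in a file path, however, would break the header parsing in this sketch.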
Note that the dashed lines at the top and bottom are not part of the format. Also note that zipping and compressing image files is pointless from a pure compression point of view because they are already compressed and the resulting file ends up being larger than the individual files. But, there are some advantages to packaging many small files into a single large file, so adding image files to a gzip file is still a valuable feature.
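The point about already-compressed data is easy to check. In the sketch below, random bytes stand in for an already-compressed image (an assumption made for the sake of a self-contained example), and `OverheadSketch` is a hypothetical helper, not part of the VwdCms classes.

```csharp
using System.IO;
using System.IO.Compression;

public static class OverheadSketch
{
    // Returns the number of bytes GZipStream produces for the given input.
    public static long CompressedSize(byte[] data)
    {
        using (MemoryStream output = new MemoryStream())
        {
            using (GZipStream gzip = new GZipStream(output, CompressionMode.Compress, true))
            {
                gzip.Write(data, 0, data.Length);
            } // dispose first so the size below includes all flushed data
            return output.Length;
        }
    }

    // Same idea as GetCompressionPercent in the listing below:
    // the value drops to zero or below when the output is not smaller than the input.
    public static int Percent(long originalSize, long compressedSize)
    {
        return (int)((originalSize - compressedSize) * 100.0 / originalSize);
    }
}
```

Calling CompressedSize on an array of zeros gives a tiny result, while calling it on an array filled by System.Random.NextBytes gives a result slightly larger than the input, because the gzip header and trailer are added to data that cannot shrink.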
Ok, here is what you are really after....the code... add these 3 files to your project and you will be able to start compressing multiple files with GZipStream.
File: VwdCms.GZip.cs
//************************************************************
// Copyright (c) Jeffrey Bazinet, VWD-CMS
// http://www.vwd-cms.com All rights reserved.
//************************************************************
using System;
using System.Data;
using System.Configuration;
using System.Web;
using System.Web.Security;
using System.Web.UI;
using System.Web.UI.WebControls;
using System.Web.UI.WebControls.WebParts;
using System.Web.UI.HtmlControls;
using System.IO;
using System.Text;
using System.IO.Compression;
using System.Collections;
namespace VwdCms
{
public class GZip
{
/// <summary>
/// Compress
/// </summary>
/// <param name="lpSourceFolder">The location of the files to include in the zip file, all files including files in subfolders will be included.</param>
/// <param name="lpDestFolder">Folder to write the zip file into</param>
/// <param name="zipFileName">Name of the zip file to write</param>
public static GZipResult Compress(string lpSourceFolder, string lpDestFolder, string zipFileName)
{
return Compress(lpSourceFolder, "*.*", SearchOption.AllDirectories, lpDestFolder, zipFileName, true);
}
/// <summary>
/// Compress
/// </summary>
/// <param name="lpSourceFolder">The location of the files to include in the zip file</param>
/// <param name="searchPattern">Search pattern (e.g. "*.*" or "*.txt" or "*.gif") to identify which files in lpSourceFolder to include in the zip file</param>
/// <param name="searchOption">Only files in lpSourceFolder or include files in subfolders also</param>
/// <param name="lpDestFolder">Folder to write the zip file into</param>
/// <param name="zipFileName">Name of the zip file to write</param>
/// <param name="deleteTempFile">Boolean, true deletes the intermediate temp file, false leaves the temp file in lpDestFolder (for debugging)</param>
public static GZipResult Compress(string lpSourceFolder, string searchPattern, SearchOption searchOption, string lpDestFolder, string zipFileName, bool deleteTempFile)
{
DirectoryInfo di = new DirectoryInfo(lpSourceFolder);
// use the caller's search pattern instead of always matching every file
FileInfo[] files = di.GetFiles(searchPattern, searchOption);
return Compress(files, lpSourceFolder, lpDestFolder, zipFileName, deleteTempFile);
}
/// <summary>
/// Compress
/// </summary>
/// <param name="files">Array of FileInfo objects to be included in the zip file</param>
/// <param name="lpBaseFolder">Base folder to use when creating relative paths for the files
/// stored in the zip file. For example, if lpBaseFolder is 'C:\zipTest\Files\', and there is a file
/// 'C:\zipTest\Files\folder1\sample.txt' in the 'files' array, the relative path for sample.txt
/// will be 'folder1/sample.txt'</param>
/// <param name="lpDestFolder">Folder to write the zip file into</param>
/// <param name="zipFileName">Name of the zip file to write</param>
public static GZipResult Compress(FileInfo[] files, string lpBaseFolder, string lpDestFolder, string zipFileName)
{
return Compress(files, lpBaseFolder, lpDestFolder, zipFileName, true);
}
/// <summary>
/// Compress
/// </summary>
/// <param name="files">Array of FileInfo objects to be included in the zip file</param>
/// <param name="lpBaseFolder">Base folder to use when creating relative paths for the files
/// stored in the zip file. For example, if lpBaseFolder is 'C:\zipTest\Files\', and there is a file
/// 'C:\zipTest\Files\folder1\sample.txt' in the 'files' array, the relative path for sample.txt
/// will be 'folder1/sample.txt'</param>
/// <param name="lpDestFolder">Folder to write the zip file into</param>
/// <param name="zipFileName">Name of the zip file to write</param>
/// <param name="deleteTempFile">Boolean, true deletes the intermediate temp file, false leaves the temp file in lpDestFolder (for debugging)</param>
public static GZipResult Compress(FileInfo[] files, string lpBaseFolder, string lpDestFolder, string zipFileName, bool deleteTempFile)
{
GZipResult result = new GZipResult();
if (!lpDestFolder.EndsWith("\\"))
{
lpDestFolder += "\\";
}
string lpTempFile = lpDestFolder + zipFileName + ".tmp";
string lpZipFile = lpDestFolder + zipFileName;
result.TempFile = lpTempFile;
result.ZipFile = lpZipFile;
int fileCount = 0;
if (files != null && files.Length > 0)
{
CreateTempFile(files, lpBaseFolder, lpTempFile, result);
if (result.FileCount > 0)
{
CreateZipFile(lpTempFile, lpZipFile, result);
}
// delete the temp file
try
{
if (deleteTempFile)
{
File.Delete(lpTempFile);
result.TempFileDeleted = true;
}
}
catch (Exception)
{
// handle or display the error
throw; // rethrow without resetting the stack trace
}
}
return result;
}
private static void CreateZipFile(string lpSourceFile, string lpZipFile, GZipResult result)
{
byte[] buffer;
int count = 0;
FileStream fsOut = null;
FileStream fsIn = null;
GZipStream gzip = null;
// compress the file into the zip file
try
{
fsOut = new FileStream(lpZipFile, FileMode.Create, FileAccess.Write, FileShare.None);
gzip = new GZipStream(fsOut, CompressionMode.Compress, true);
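fsIn = new FileStream(lpSourceFile, FileMode.Open, FileAccess.Read, FileShare.Read);
buffer = new byte[4096];
// compress to the zip file in chunks;
// a single Read call may return fewer bytes than requested
while ((count = fsIn.Read(buffer, 0, buffer.Length)) > 0)
{
gzip.Write(buffer, 0, count);
}
fsIn.Close();
fsIn = null;
// close the GZipStream before measuring so that all compressed data
// has been flushed to fsOut (fsOut stays open because leaveOpen is true)
gzip.Close();
gzip = null;
result.ZipFileSize = fsOut.Length;
result.CompressionPercent = GetCompressionPercent(result.TempFileSize, result.ZipFileSize);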
}
catch (Exception)
{
// handle or display the error
throw; // rethrow without resetting the stack trace
}
finally
{
if (gzip != null)
{
gzip.Close();
gzip = null;
}
if (fsOut != null)
{
fsOut.Close();
fsOut = null;
}
if (fsIn != null)
{
fsIn.Close();
fsIn = null;
}
}
}
private static void CreateTempFile(FileInfo[] files, string lpBaseFolder, string lpTempFile, GZipResult result)
{
byte[] buffer;
int count = 0;
byte[] header;
string fileHeader = null;
string fileModDate = null;
string lpFolder = null;
int fileIndex = 0;
string lpSourceFile = null;
string vpSourceFile = null;
GZippedFile gzf = null;
FileStream fsOut = null;
FileStream fsIn = null;
if (files != null && files.Length > 0)
{
try
{
result.Files = new GZippedFile[files.Length];
// open the temp file for writing
fsOut = new FileStream(lpTempFile, FileMode.Create, FileAccess.Write, FileShare.None);
foreach (FileInfo fi in files)
{
lpFolder = fi.DirectoryName + "\\";
try
{
gzf = new GZippedFile();
gzf.Index = fileIndex;
// read the source file, get its virtual path within the source folder
lpSourceFile = fi.FullName;
gzf.LocalPath = lpSourceFile;
vpSourceFile = lpSourceFile.Replace(lpBaseFolder, string.Empty);
vpSourceFile = vpSourceFile.Replace("\\", "/");
gzf.RelativePath = vpSourceFile;
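fsIn = new FileStream(lpSourceFile, FileMode.Open, FileAccess.Read, FileShare.Read);
buffer = new byte[fsIn.Length];
// a single Read call may return fewer bytes than requested,
// so loop until the buffer is full
count = 0;
while (count < buffer.Length)
{
int n = fsIn.Read(buffer, count, buffer.Length - count);
if (n == 0)
{
throw new EndOfStreamException("Unexpected end of file: " + lpSourceFile);
}
count += n;
}
fsIn.Close();
fsIn = null;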
fileModDate = fi.LastWriteTimeUtc.ToString();
gzf.ModifiedDate = fi.LastWriteTimeUtc;
gzf.Length = buffer.Length;
fileHeader = fileIndex.ToString() + "," + vpSourceFile + "," + fileModDate + "," + buffer.Length.ToString() + "\n";
header = Encoding.Default.GetBytes(fileHeader);
fsOut.Write(header, 0, header.Length);
fsOut.Write(buffer, 0, buffer.Length);
fsOut.WriteByte(10); // linefeed
gzf.AddedToTempFile = true;
// update the result object
result.Files[fileIndex] = gzf;
// increment the fileIndex
fileIndex++;
}
catch (Exception)
{
// handle or display the error
throw; // rethrow without resetting the stack trace
}
finally
{
if (fsIn != null)
{
fsIn.Close();
fsIn = null;
}
}
if (fsOut != null)
{
result.TempFileSize = fsOut.Length;
}
}
}
catch (Exception)
{
// handle or display the error
throw; // rethrow without resetting the stack trace
}
finally
{
if (fsOut != null)
{
fsOut.Close();
fsOut = null;
}
}
}
result.FileCount = fileIndex;
}
public static GZipResult Decompress(string lpSourceFolder, string lpDestFolder, string zipFileName)
{
return Decompress(lpSourceFolder, lpDestFolder, zipFileName, true);
}
public static GZipResult Decompress(string lpSrcFolder, string lpDestFolder, string zipFileName, bool deleteTempFile)
{
GZipResult result = new GZipResult();
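if (!lpDestFolder.EndsWith("\\"))
{
lpDestFolder += "\\";
}
// the source folder is concatenated with the file name below,
// so it needs a trailing backslash as well
if (!lpSrcFolder.EndsWith("\\"))
{
lpSrcFolder += "\\";
}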
string lpTempFile = lpDestFolder + zipFileName + ".tmp";
string lpZipFile = lpSrcFolder + zipFileName;
result.TempFile = lpTempFile;
result.ZipFile = lpZipFile;
string line = null;
string lpFilePath = null;
GZippedFile gzf = null;
FileStream fsTemp = null;
ArrayList gzfs = new ArrayList();
// extract the files from the temp file
try
{
fsTemp = UnzipToTempFile(lpZipFile, lpTempFile, result);
if (fsTemp != null)
{
while (fsTemp.Position != fsTemp.Length)
{
line = null;
while (string.IsNullOrEmpty(line) && fsTemp.Position != fsTemp.Length)
{
line = ReadLine(fsTemp);
}
if (!string.IsNullOrEmpty(line))
{
gzf = GZippedFile.GetGZippedFile(line);
if (gzf != null && gzf.Length > 0)
{
gzfs.Add(gzf);
lpFilePath = lpDestFolder + gzf.RelativePath;
gzf.LocalPath = lpFilePath;
WriteFile(fsTemp, gzf.Length, lpFilePath);
gzf.Restored = true;
}
}
}
}
}
catch (Exception)
{
// handle or display the error
throw; // rethrow without resetting the stack trace
}
finally
{
if (fsTemp != null)
{
fsTemp.Close();
fsTemp = null;
}
}
// delete the temp file
try
{
if (deleteTempFile)
{
File.Delete(lpTempFile);
result.TempFileDeleted = true;
}
}
catch (Exception)
{
// handle or display the error
throw; // rethrow without resetting the stack trace
}
result.FileCount = gzfs.Count;
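// result.Files is still null here, so allocate it before copying
result.Files = new GZippedFile[gzfs.Count];
gzfs.CopyTo(result.Files);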
return result;
}
private static string ReadLine(FileStream fs)
{
string line = string.Empty;
const int bufferSize = 4096;
byte[] buffer = new byte[bufferSize];
byte b = 0;
byte lf = 10;
int i = 0;
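int next = 0;
// stop at the linefeed, at the end of the stream, or when the buffer is full
while (i < bufferSize)
{
next = fs.ReadByte(); // returns -1 at the end of the stream
if (next == -1 || next == lf)
{
break;
}
b = (byte)next;
buffer[i] = b;
i++;
}
line = System.Text.Encoding.Default.GetString(buffer, 0, i);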
return line;
}
private static void WriteFile(FileStream fs, int fileLength, string lpFile)
{
FileStream fsFile = null;
try
{
string lpFolder = GetFolder(lpFile);
if (!string.IsNullOrEmpty(lpFolder) && lpFolder != lpFile && !Directory.Exists(lpFolder))
{
Directory.CreateDirectory(lpFolder);
}
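byte[] buffer = new byte[fileLength];
int count = 0;
// a single Read call may return fewer bytes than requested
while (count < fileLength)
{
int n = fs.Read(buffer, count, fileLength - count);
if (n == 0)
{
break; // unexpected end of the temp file
}
count += n;
}
fsFile = new FileStream(lpFile, FileMode.Create, FileAccess.Write, FileShare.None);
// write the contents once (the original wrote the buffer twice)
fsFile.Write(buffer, 0, count);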
}
catch (Exception)
{
// handle or display the error
throw; // rethrow without resetting the stack trace
}
finally
{
if (fsFile != null)
{
fsFile.Flush();
fsFile.Close();
fsFile = null;
}
}
}
private static string GetFolder(string filename)
{
string vpFolder = filename;
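// paths passed in by Decompress use backslashes (see GetGZippedFile),
// so look for either separator
int index = filename.LastIndexOfAny(new char[] { '\\', '/' });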
if (index != -1)
{
vpFolder = filename.Substring(0, index + 1);
}
return vpFolder;
}
private static FileStream UnzipToTempFile(string lpZipFile, string lpTempFile, GZipResult result)
{
FileStream fsIn = null;
GZipStream gzip = null;
FileStream fsOut = null;
FileStream fsTemp = null;
const int bufferSize = 4096;
byte[] buffer = new byte[bufferSize];
int count = 0;
try
{
fsIn = new FileStream(lpZipFile, FileMode.Open, FileAccess.Read, FileShare.Read);
result.ZipFileSize = fsIn.Length;
fsOut = new FileStream(lpTempFile, FileMode.Create, FileAccess.Write, FileShare.None);
gzip = new GZipStream(fsIn, CompressionMode.Decompress, true);
// Read returns 0 only at the end of the stream;
// a short read does not mean the end has been reached
while ((count = gzip.Read(buffer, 0, bufferSize)) > 0)
{
fsOut.Write(buffer, 0, count);
}
}
catch (Exception)
{
// handle or display the error
throw; // rethrow without resetting the stack trace
}
finally
{
if (gzip != null)
{
gzip.Close();
gzip = null;
}
if (fsOut != null)
{
fsOut.Close();
fsOut = null;
}
if (fsIn != null)
{
fsIn.Close();
fsIn = null;
}
}
fsTemp = new FileStream(lpTempFile, FileMode.Open, FileAccess.Read, FileShare.None);
if (fsTemp != null)
{
result.TempFileSize = fsTemp.Length;
}
return fsTemp;
}
private static int GetCompressionPercent(long tempLen, long zipLen)
{
if (tempLen == 0)
{
return 0; // avoid dividing by zero for an empty temp file
}
double tmp = (double)tempLen;
double zip = (double)zipLen;
double hundred = 100;
double ratio = (tmp - zip) / tmp;
double pcnt = ratio * hundred;
return (int)pcnt;
}
}
}
File: VwdCms.GZippedFile (this doesn't actually contain the file, just the file information)
//************************************************************
// Copyright (c) Jeffrey Bazinet, VWD-CMS
// http://www.vwd-cms.com All rights reserved.
//************************************************************
using System;
using System.Data;
using System.Configuration;
using System.Web;
using System.Web.Security;
using System.Web.UI;
using System.Web.UI.WebControls;
using System.Web.UI.WebControls.WebParts;
using System.Web.UI.HtmlControls;
using System.IO;
using System.Text;
using System.IO.Compression;
namespace VwdCms
{
public class GZippedFile
{
public int Index = 0;
public string RelativePath = null;
public DateTime ModifiedDate;
public int Length = 0;
public bool AddedToTempFile = false;
public bool Restored = false;
public string LocalPath = null;
public string Folder = null;
public static GZippedFile GetGZippedFile(string fileInfo)
{
GZippedFile gzf = null;
if (!string.IsNullOrEmpty(fileInfo))
{
// get the file information
string[] info = fileInfo.Split(',');
if (info != null && info.Length == 4)
{
gzf = new GZippedFile();
gzf.Index = Convert.ToInt32(info[0]);
gzf.RelativePath = info[1].Replace("/", "\\");
gzf.ModifiedDate = Convert.ToDateTime(info[2]);
gzf.Length = Convert.ToInt32(info[3]);
}
}
return gzf;
}
}
}
File: VwdCms.GZipResult (this object contains information about the compression and decompression operations)
//************************************************************
// Copyright (c) Jeffrey Bazinet, VWD-CMS
// http://www.vwd-cms.com All rights reserved.
//************************************************************
using System;
using System.Data;
using System.Configuration;
using System.Web;
using System.Web.Security;
using System.Web.UI;
using System.Web.UI.WebControls;
using System.Web.UI.WebControls.WebParts;
using System.Web.UI.HtmlControls;
using System.IO;
using System.Text;
using System.IO.Compression;
namespace VwdCms
{
public class GZipResult
{
public VwdCms.GZippedFile[] Files = null;
public int FileCount = 0;
public long TempFileSize = 0;
public long ZipFileSize = 0;
public int CompressionPercent = 0;
public string TempFile = null;
public string ZipFile = null;
public bool TempFileDeleted = false;
}
}
Conclusion
You can zip up multiple files using GZipStream, you just need to pack all of the files that you want to zip into a single file before compressing. Of course, you will want to decompress and extract the files, so you just need to come up with a zip file format that allows you to extract the files later.
I hope this article was helpful.
-Jeff
I will post a sample project with some good usage examples in the VWD-CMS Member Downloads section soon.
-Jeff
Looks great I was just trying to do that.
Do you have the sample code for download?
great job.
Looks great, I have been trying to figure this one out for a few days. Do you have anything for download? It sure would be helpful.
Thanks
Hey guys - sorry for the delay in replying to your posts...the site is supposed to send me an email notification when a new post happens...looks like I need to fix a bug.
Yes, if you download VWD-CMS you get the classes that are included in this article.
-Jeff
Where do I download the complete source code for this app?
Is it a Windows application?
The C# code using GZipStream (the classes above) is included with VWD-CMS 2.1, so just download the CMS and you can find these classes in the App_Code/VwdCms folder.
-Jeff
You explained here how to zip up multiple files into a single file using GZipStream. I want to do the reverse process. I have a single GZip file that contains multiple files and I want to decompress all the files separately with their extensions and use one of the containing files. Any help how to do it?
Do you know what software was used to create the Gzip file originally? In order to know how to unzip the gzip file you will need to know the format of the data in the file. You might be able to use some of the code provided here to unzip the file into a temp file (just a big string after decompression) and then manually look at it to figure out the format. Once you know the format, you can write code to reconstruct the original files...
Thanks a lot Jeff.
I used your code to write a temp file, but the problem is how to reconstruct the original file from that temp file. I am using the zip file "GeoIP.tar.gz" to write the temp file. I need to reconstruct a "GeoIP.dat" file out of all the files in the temp file.
From the extension I can see that both "tar" and "gz" are involved, but I have no idea how to deal with it. I don't want to use third-party software like SharpZipLib.
It would be really cool if VWD-CMS was able to zip and unzip using a common/standard compression format rather than the custom format described in this article...so I will investigate this and post any ideas when I get some free time.
I found this article: http://en.wikipedia.org/wiki/Gzip which gives some good information about the various steps involved in creating a .tar.gz archive.
-Jeff
Hi solbrun_fyr,
After I read the articles on the Tar format, I realized that it would not take too much work to decompress the .tar.gz file and then extract the files from the .tar file. So I went ahead and coded it up this afternoon. Thanks for inspiring me to do this one - I have wanted to use a standard zip format but never got around to it before.
You can download the classes and demo project from the VWD-CMS Member downloads page.
I hope you like it! - Jeff
Hi Jeff
Thanks a lot for your time. You are great :-)
Best regards
solbrun_fyr
Thank you! You saved my life.
Sir, I incorporated your code into my app...
and when I tried to compress, the result was larger than the original file...
How is that possible?
I hope you can give me some tips...
I'm just a newbie...
The compression percentage is showing a negative value.