In this exercise you will:
- Design the first version of a ZIP file manager and extractor
- Produce a set of UML class and sequence diagrams to document your design
- Experience design patterns in actual Java code
- Write a covering set of unit tests
- Experience working with Eclipse, JUnit and Ant
Create an executable named xzip which is able to print and extract the contents of ZIP files.
The program's usage from the command line should be:
xzip [-l | -s | -x] [-nr] <zip file name>
Note that the first two parameters are optional, but the file name
is not. Parameters must appear in this order.
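The usage line above can be turned into a small parser. The following is only a sketch: the class name XzipArgs, its fields and the USAGE constant are hypothetical, and rejecting a file name that starts with "-" is an assumption, not something the exercise specifies. It is written in Java 5 style (no later language features).

```java
/** Hypothetical value object for the parsed command line; all names are illustrative. */
final class XzipArgs {
    static final String USAGE = "Usage: xzip [-l | -s | -x] [-nr] <zip file name>";

    final char action;          // 'l', 's' or 'x'
    final boolean recursive;    // false when -nr was given
    final String zipFileName;

    private XzipArgs(char action, boolean recursive, String zipFileName) {
        this.action = action;
        this.recursive = recursive;
        this.zipFileName = zipFileName;
    }

    /** Parses the arguments in their fixed order; throws IllegalArgumentException otherwise. */
    static XzipArgs parse(String[] args) {
        int i = 0;
        char action = 'l';      // -l is the default action
        boolean recursive = true;
        if (i < args.length && args[i].matches("-[lsx]")) {
            action = args[i].charAt(1);
            i++;
        }
        if (i < args.length && args[i].equals("-nr")) {
            recursive = false;
            i++;
        }
        // Exactly one argument must remain: the zip file name.
        // Assumption: a name starting with '-' is treated as a misplaced or unknown switch.
        if (i != args.length - 1 || args[i].startsWith("-")) {
            throw new IllegalArgumentException(USAGE);
        }
        return new XzipArgs(action, recursive, args[i]);
    }
}
```

Because the optional switches are consumed strictly in order, an argument list such as -nr -l my.zip is rejected rather than silently reordered.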
A zip file can contain files, which may be compressed (but don't have to be).
In addition, a zip file can also contain directories, which may by themselves
contain files and other directories. When a zip file is extracted, the
directories it contains are extracted as well, and files within these
directories are extracted to their correct relative location.
For the examples given below, assume that the zip file
contains the following: a file called mysong.mp3 which is an MP3 song,
a file called myicon.gif which is a GIF image, and a directory called 'web'
which includes two HTML files (one of them called other.html) and one directory called
file which includes two other files with these same names.
The first argument of xzip dictates what action will be
performed on the zip file:
-l (long print): This action prints the names and details of
the files stored in the given zip file. The long format means that in addition
to the file names, more details are printed for each file. For example:
> java xzip -l my.zip
mysong.mp3 (7901 bytes, MP3 file)
myicon.gif (910 bytes, 32x32 GIF image)
web (directory, total 5 files)
   (5657 bytes, 122 lines of text)
   (6983 bytes, 208 lines of text)
   (directory, total 2 files)
      (5622 bytes, 120 lines of text)
      (7501 bytes, 234 lines of text)
As you can see, the details printed for each file type are different, and follow these rules:
1. For each text file, the name of the file is printed, followed by
a tab, and then the text (N bytes, M lines of text) where
N is the uncompressed size of the file, and M is the number of lines
(measured by the number of new-line characters) in the file. A text file is
defined as any file whose extension is
2. For each image file, the name of the file is printed, followed
by a tab, and then the text (N bytes, WxH EXT image) where
N is the uncompressed size of the file, W is the image's width,
H is the image's height, and EXT is the image file extension
in uppercase. An image file is defined as any file whose extension is
3. For any other file type - any regular file that is not a text or image file,
such as backup.zip in the above example - print the file name, a
tab, and the text (N bytes, EXT file) where N is the
size of the uncompressed file and EXT is the file extension in uppercase.
4. For each directory, the name of the directory is printed,
followed by a tab, and then the text (directory, total N files) where
N is the number of files in that directory. Each directory that the directory contains
is counted as one (for the directory itself) plus the number of files inside
that directory, recursively. In addition, as the example above shows, the
contents of the directory are printed in the following lines, under the same
rules, with an indent of three spaces relative to the indent of the directory.
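The counting and indentation rules for directories map naturally onto a Composite structure. The sketch below is one possible shape, not a prescribed design: the class names Entry, PlainFile and Directory are hypothetical, and only the "any other file type" rule is implemented for leaves, to keep it short.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical Composite node: anything that can appear in the listing. */
abstract class Entry {
    final String name;
    Entry(String name) { this.name = name; }

    /** How much this node adds to its parent's "total N files". */
    abstract int count();

    /** The text in parentheses printed by the long format. */
    abstract String details();

    void print(StringBuilder out, String indent) {
        out.append(indent).append(name).append('\t').append(details()).append('\n');
    }
}

/** Leaf for "any other file type"; text and image files would be further subclasses. */
final class PlainFile extends Entry {
    private final long size;
    PlainFile(String name, long size) { super(name); this.size = size; }

    int count() { return 1; }

    String details() {
        // For a name without a dot this yields the whole name in uppercase.
        String ext = name.substring(name.lastIndexOf('.') + 1).toUpperCase(Locale.ENGLISH);
        return "(" + size + " bytes, " + ext + " file)";
    }
}

final class Directory extends Entry {
    private final List<Entry> children = new ArrayList<Entry>();
    Directory(String name) { super(name); }

    Directory add(Entry child) { children.add(child); return this; }

    /** Files inside this directory; a nested directory counts as one plus its own contents. */
    int filesInside() {
        int n = 0;
        for (Entry child : children) n += child.count();
        return n;
    }

    int count() { return 1 + filesInside(); }

    String details() { return "(directory, total " + filesInside() + " files)"; }

    void print(StringBuilder out, String indent) {
        super.print(out, indent);
        // Children are indented three more spaces than the directory itself.
        for (Entry child : children) child.print(out, indent + "   ");
    }
}
```

With this counting rule, a directory holding two files and one subdirectory of two files reports 2 + (1 + 2) = 5, which agrees with the "total 5 files" line in the example output.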
-s (short print): This action prints the names of the files and
directories in the given zip file, but without the extra details (file size,
number of text lines and so forth) which the long format provides. Printing is
equivalent to what the -l option outputs, including indentation and
recursion into directories and zip files - only the text in parentheses for each
file/directory is not printed. For example:
> java xzip -s my.zip
-x (extract): This action actually extracts the contents of the
zip file into the file system. That is, for each file/directory in the zip file,
a new uncompressed file/directory should be created in the file system. Files
should be created in the current directory, but files that are inside
directories in the zip file should be created inside these relative directories
in the file system. If a file is being extracted and another file with the same
name already exists, then the existing file should be overwritten.
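A minimal sketch of extracting a single entry to its relative location follows. It assumes the standard java.util.zip classes are used; the class name EntryExtractor and the targetDir parameter are hypothetical (for xzip, targetDir would be the current directory). It is written in Java 5 style, so streams are closed in finally blocks.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

/** Hypothetical helper that writes one zip entry below a target directory. */
final class EntryExtractor {
    static void extract(ZipFile zip, ZipEntry entry, File targetDir) throws IOException {
        // Entry names are relative paths such as "web/other.html".
        File target = new File(targetDir, entry.getName());
        if (entry.isDirectory()) {
            if (!target.isDirectory() && !target.mkdirs()) {
                throw new IOException("Cannot create directory " + target);
            }
            return;
        }
        // A zip may list a file without listing its parent directory first.
        File parent = target.getParentFile();
        if (parent != null && !parent.isDirectory() && !parent.mkdirs()) {
            throw new IOException("Cannot create directory " + parent);
        }
        InputStream in = zip.getInputStream(entry);
        try {
            // FileOutputStream truncates an existing file, which gives the overwrite behavior.
            OutputStream out = new FileOutputStream(target);
            try {
                byte[] buffer = new byte[8192];
                int n;
                while ((n = in.read(buffer)) != -1) {
                    out.write(buffer, 0, n);
                }
            } finally {
                out.close();
            }
        } finally {
            in.close();
        }
    }
}
```

The method reports problems by throwing IOException, so a caller can catch the exception per entry, print the error, and carry on with the remaining entries. The sketch does not guard against entry names containing "..", which a real implementation may want to reject.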
Upon completion, this action prints one line to the console in this format:
Extracted N files, M files failed.
Here N is the total number of files and directories that were successfully created, and
M is the total number of files and directories that failed. In addition,
one line is printed for each failed file or directory, including a detailed
error message. Whenever possible, an error should not result in halting the
entire program, and the program should output the error message and continue
normally. For example:
> java xzip -x my.zip
Error: Failed to extract myicon.gif: Cannot overwrite existing file - file is in
Extracted 7 files, 1 files failed
And the reported 7 successful files will be the following (locations are
relative to the current directory):
The second command-line argument of xzip, -nr, means "no recursion",
and if it appears then all actions should be performed without recursion into
directories. This means that only one summary line is printed for every
directory in the printing actions, and that the directory is created but not
populated in the extract action. For example:
> java xzip -l -nr my.zip
mysong.mp3 (7901 bytes, MP3 file)
myicon.gif (910 bytes, 32x32 GIF image)
web (directory, total 5 files)
> java xzip -x -nr my.zip
Extracted 3 files, 0 files failed
In this example, the 3 extracted files will be mysong.mp3, myicon.gif and the (empty) directory web.
The default action is -l, meaning that xzip my.zip is equivalent to
xzip -l my.zip. You should print an informative usage message if the
program is activated with no or illegal command-line arguments. You should print
a detailed error message and exit gracefully when a critical error occurs (the
given zip file does not exist, the given zip file is corrupted and so on).
While this exercise can be programmed within a single class, doing so is not a good idea: this
xzip is only a first version, so it is crucial to maintain an open mind with
respect to possible future requirements. Consider the following:
You must design your program so that it is easy to add code that implements the requirements listed below.
Assume that you are the one who will actually have to code it - that's how it usually is in "real life".
For each of these requirements write an explanation in your README file, not more than three
sentences long, which explains how it should be coded:
- It may be required to be able to read input formats other than ZIP, such
as TAR, ARJ, CAB and other archive file formats. The input does not even have to
be a file: It can be the set of files of a given directory, a given FTP server
address and so forth.
- It may be required to support other file types that the long print action provides
additional details about. For example, printing the image size for image files
such as myicon.gif, or printing the song duration for music files.
- It may be required to produce the output in formats other than
plain text - HTML, PDF, Word or others. It may also be required to write output
in several formats at once, for example:
xzip -pdf myzipfiles.pdf -html myzipcontents.html my.zip.
- It may be required to modify the input zip file instead of just printing or
extracting it. For example, new actions may enable adding files and directories to the zip
file, changing the date and time signature of zipped files, and so on.
- It may be required to support recursion into zip files, and not only into
directories. For example, if a zip file contains another zip file, then its
contents would have to be printed (recursively) like a directory's contents are
printed, and when a file is extracted any zip files it contains would be
extracted as well.
- It may be required to activate all of the program's features not only from
the command-line, but also from a graphical user interface, or perhaps even two
user interfaces (for example, one that is a custom UI for handling zip files,
and another which is fully integrated with the Windows Explorer).
It is important that each solution you present will be at most three sentences long. The intention is
to enforce the use of design patterns vocabulary rather than elaborating specific class and object
Requirement: It may be required to define filters on which files get
printed or extracted, in addition to the -nr switch. For example, new
command-line arguments can dictate that only text files should be acted on, only
files that match a given pattern (such as *.cpp), and so on.
Solution: Write an Iterator for each kind of filter, whose next() method moves to the
next element for which the filter is true. Such iterators are implemented as Decorators of other
iterators, which makes it easy to combine different filters dynamically and does not require
changing or recompiling existing code.
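The Decorator idea in this solution can be sketched as follows. This is one possible reading, not the required implementation: a single decorator class parameterized by a filter object is used here instead of one iterator class per filter kind, and it iterates over entry names as plain strings. NameFilter, FilteringIterator, SuffixFilter and PrefixFilter are all hypothetical names.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

/** Hypothetical predicate over entry names. */
interface NameFilter {
    boolean accepts(String name);
}

/** Decorator: wraps any Iterator<String> and yields only the names the filter accepts. */
final class FilteringIterator implements Iterator<String> {
    private final Iterator<String> inner;
    private final NameFilter filter;
    private String lookahead;       // next accepted element, valid only when found is true
    private boolean found;

    FilteringIterator(Iterator<String> inner, NameFilter filter) {
        this.inner = inner;
        this.filter = filter;
        advance();
    }

    /** Moves to the next element of the wrapped iterator for which the filter is true. */
    private void advance() {
        found = false;
        while (inner.hasNext()) {
            String candidate = inner.next();
            if (filter.accepts(candidate)) {
                lookahead = candidate;
                found = true;
                return;
            }
        }
    }

    public boolean hasNext() { return found; }

    public String next() {
        if (!found) throw new NoSuchElementException();
        String result = lookahead;
        advance();
        return result;
    }

    public void remove() { throw new UnsupportedOperationException(); }
}

/** Accepts names with a given ending, e.g. ".cpp". */
final class SuffixFilter implements NameFilter {
    private final String suffix;
    SuffixFilter(String suffix) { this.suffix = suffix; }
    public boolean accepts(String name) { return name.endsWith(suffix); }
}

/** Accepts names below a given directory, e.g. "web/". */
final class PrefixFilter implements NameFilter {
    private final String prefix;
    PrefixFilter(String prefix) { this.prefix = prefix; }
    public boolean accepts(String name) { return name.startsWith(prefix); }
}
```

Filters are combined by nesting: new FilteringIterator(new FilteringIterator(names, new SuffixFilter(".cpp")), new PrefixFilter("web/")) yields only the .cpp files under web/, and adding a new filter kind means adding one new class without touching the existing ones.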
Code & Unit Test
You are expected to divide your time equally between actual coding on the one hand and design,
writing UML diagrams, and answering the above six design questions on the other. With a proper design, this
exercise is quick and simple to code. You are required to write in Java 5.0, and you
may use the standard libraries to
their full extent - the standard streams, strings and data structures. In
particular, working with zip files is done using the
package, and working with image files is done using the
javax.swing.ImageIcon class (see references and code samples below).
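As a starting point, here is a small sketch of reading a zip file and measuring an image. It assumes that the zip package meant above is the standard java.util.zip (the package name is missing from this text); javax.swing.ImageIcon is the class named above. The class ZipListing and its method names are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import javax.swing.ImageIcon;

/** Hypothetical helper showing the library calls involved; not a full xzip design. */
final class ZipListing {
    /** Returns one line per entry: the entry's path, a tab, and a short description. */
    static List<String> describe(File zip) throws IOException {
        List<String> lines = new ArrayList<String>();
        ZipFile zf = new ZipFile(zip);
        try {
            Enumeration<? extends ZipEntry> entries = zf.entries();
            while (entries.hasMoreElements()) {
                ZipEntry entry = entries.nextElement();
                if (entry.isDirectory()) {
                    lines.add(entry.getName() + "\t(directory)");
                } else {
                    byte[] data = readAll(zf.getInputStream(entry));
                    lines.add(entry.getName() + "\t(" + data.length + " bytes)");
                }
            }
        } finally {
            zf.close();
        }
        return lines;
    }

    /** Reads a stream to its end and closes it. */
    static byte[] readAll(InputStream in) throws IOException {
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[8192];
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
            return out.toByteArray();
        } finally {
            in.close();
        }
    }

    /** Returns {width, height} of the image data; ImageIcon reports -1 if it cannot load it. */
    static int[] imageSize(byte[] imageData) {
        ImageIcon icon = new ImageIcon(imageData);
        return new int[] { icon.getIconWidth(), icon.getIconHeight() };
    }
}
```

Note that a zip file stores a flat list of path names (directory entries end with "/"), so the directory tree that the print actions need has to be built from these paths.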
You are also required to use the Eclipse environment for this
exercise, and are encouraged to take advantage of its editing and debugging
features to their full extent. Submit the Eclipse project file together with
your exercise, so that it would be possible to open your project and read your
code, UML diagrams and Ant file within Eclipse. The UML diagrams should be done
using the Omondo UML plugin (see the Technical Help
page for details on installing it at home). You are required to provide class
diagrams that include all your classes; there may be more than one diagram, if
this is visually easier. You are also required to provide at least two sequence
diagrams, depicting two interactions in your design which you consider
It is also required that you submit unit tests to test your work,
and use the JUnit framework to do so. Organize your unit tests into classes by
subject, and write a method for each small test. Organize the code such that the
program source code is in one package (for example xzip), and the
tests are in a separate package (for example xziptest). Each test
should be self-validating - that is, know by itself whether it has passed or
failed. Writing unit tests should be an integral part of coding: This is
essential when code must be changed in newer versions. We recommend that you try
test-first programming - read the following article
as a starting point - and in any case you will lose time if you only write all unit tests
after you "finish" coding. You will have another chance to estimate the
convenience of unit tests in exercise 3.
One metric for measuring the usefulness of a set of unit tests is called coverage, which means
the percentage of your code that the unit tests actually run. Coverage of 90% or above is
considered good, and you should aim for that goal.
The code you submit must be built with no compiler warnings, and pass all unit tests.
You must submit an Ant file (build.xml) with your exercise, whose
default target compiles the entire program and runs all unit tests.
This is an advanced course, so there is no intention to deduct points for coding style or
naming conventions - the emphasis is on proper design. However, you are as always expected
to write clear code with a consistent style.