University of North Carolina at Asheville

CSCI 202: Introduction to Data Structures


Lab 14: Using the Java Collections Framework



Introduction

In this lab you will rewrite the Word Index Generator introduced in Lab 12, this time using off-the-shelf classes provided by the Java Collections Framework. In particular, you will redefine the WordIndex class that you used in that lab to scan a textfile and build an alphabetized list of all the distinct words it contains. As you may recall, this application also accumulates a list of all the line numbers in the file where each word occurs.

As usual, the basic GUI framework is already written for you. Your first task will be to download a set of source files, one of which (WordIndex.java) contains several stubbed methods for you to fill in. After you complete this file, the application should be ready to run.

To remind you how this application works, the image below shows the initial window:

As you can see, the center panel contains two multiline text boxes (of type javax.swing.JTextArea). When you enter the name of a valid textfile (and its directory pathname) in the bottom control panel and press the Start button, the application reads the file and displays each line of text in the upper text area. For convenience, the display adds line numbers to the start of each line.

As it scans the file, the application also quietly inserts each word it encounters (and the line number where it occurs) into a sorted collection, the java.util.TreeSet described in the Instructions below. After the file has been completely scanned, the application then traverses this collection in order and displays an accumulated word index in the lower text area. For example, scanning the sample file stuff.txt yields the display shown below:


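As a rough illustration of the scanning pass described above, the sketch below numbers each line from 1 and records every word together with the line on which it occurs. The real scanning logic lives in DisplayFrame.buildIndex(), which is not reproduced in this handout, so the class name ScanSketch, the rule for splitting a line into words (anything that is not a letter separates words), and the folding to lower case are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class ScanSketch {
    /** Returns one "word:lineNumber" entry per word occurrence, in reading order. */
    static List<String> scan(List<String> lines) {
        List<String> occurrences = new ArrayList<>();
        int lineNumber = 0;
        for (String line : lines) {
            lineNumber++;                              // lines are numbered from 1
            // Assumed tokenization: any run of non-letters separates words.
            for (String word : line.split("[^A-Za-z]+")) {
                if (!word.isEmpty()) {                 // split() can yield a leading empty string
                    occurrences.add(word.toLowerCase() + ":" + lineNumber);
                }
            }
        }
        return occurrences;
    }

    public static void main(String[] args) {
        List<String> lines = List.of("The cat sat.", "The end");
        System.out.println(scan(lines));   // prints [the:1, cat:1, sat:1, the:2, end:2]
    }
}
```

Each (word, line number) pair produced this way corresponds to one item handed to the word index.
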
Instructions

  1. Log in and enter the directory csci/202/labs. If this directory does not exist, create it now.
  2. Launch the NetBeans IDE, select the File/New Project... option in the toplevel menu bar, and create a new project called Lab14. Define the project to be a General Java Application and accept all the default settings so that NetBeans will automatically write a class lab14.Main for you. You should find the source file for this class in Lab14/src/lab14.
  3. Download the following files and import them into your Lab14/src/lab14 project folder:

    DisplayFrame.java WordIndex.java Item.java IndexItem.java ErrorDialog.java
  4. Next create a new subfolder in your Lab14 project folder, named Lab14/samples. Then download the following two sample textfiles into this directory for later testing:

    stuff.txt penrose.txt
  5. Open the source file for lab14.Main and insert the following lines of code into its main() method:
    	DisplayFrame frame = new DisplayFrame();
    	frame.setVisible(true);
    
  6. Once you have assembled all these files, you should be able to generate a complete set of Javadoc pages for the classes they define. But for a quick overview, the only task of the main() method in lab14.Main is to create a DisplayFrame, which is the toplevel application window for this project. Both the DisplayFrame and ErrorDialog classes are completely implemented and ready for use. However, you might want to take the time to look over the code in DisplayFrame.java, since this class contains the event handler buildIndex() that is actually responsible for scanning textfiles.

    In Lab 13, WordIndex was defined as a customized subclass of labtrees.BinaryTree. In this lab, this class must still accumulate the list of words and their associated line numbers as they are found in the textfile. But rather than subclassing your own BinaryTree class, this time the application will perform its task by making use of an internal java.util.TreeSet.

    As you should verify for yourself, the new source file WordIndex.java already has an instance variable that references a TreeSet of IndexItem elements. In addition, the WordIndex constructor already creates an empty TreeSet object for you. However, this file is still not a complete implementation of this class, since its key methods are stubbed. You will need to complete this class definition as specified below to make everything work properly.

    As before, IndexItem is defined as a subclass of the Item class that you have used earlier with both ordered lists and binary search trees. For this lab the Item superclass is provided as one of the completed source files.

    The specialized feature in an IndexItem is the addition of a Queue of Integer objects as new member data, which will be used to accumulate the list of line numbers marking the occurrences of each distinct word found in the file. It also overrides the Item version of the toString() method, so that a traversal of the complete WordIndex will display each word and its line-number list as shown in the second figure above. This class has also been completely implemented, and is ready for use.

  7. Now open the incomplete file WordIndex.java, and locate the stubbed methods

    	public void insert (IndexItem item)
    	public void removeAll ()
    	public int getItemCount ()
    	public Iterator listAscending ()
    

    The insert() method is similar in some respects to the BinaryTree insert() method that you implemented in Lab 10. If the IndexItem contains a new word, a new TreeSet node is created to hold that IndexItem. Note that in addition to the word itself, the new IndexItem stores, in its line-number queue, the line number where the word was first found.

    By now you may recall that it is slightly trickier to handle repeated occurrences of a word. In essence, the method must extract the single line number from the queue of the input IndexItem parameter, and add that line number to the existing IndexItem for that word. In other words, it effectively discards the input IndexItem parameter after extracting its line number. Of course the method must also increment the occurrence count for that word, using its inherited incOccurrences() method as in your earlier labs.

  8. Once you have completed your implementation of the WordIndex class, try to compile your Lab14 project. Of course if you encounter any compiler errors, you will have to fix them all before you can proceed...
  9. Run the project. You should now verify that the application behaves as described in the Introduction. If you don't want to bother writing your own sample textfiles, just use the two freebies stuff.txt and penrose.txt that you should have downloaded earlier. The second of these files provides a more substantial test case, since it contains several complete paragraphs copied from the introduction to The Emperor's New Mind by Roger Penrose (Oxford University Press, 1989).
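
For readers new to java.util.TreeSet, the behavior that step 6 relies on can be seen in a small sketch. It uses plain String elements rather than IndexItem, and the class name TreeSetDemo and the helper distinctSorted() are invented for illustration; they are not part of the lab files.

```java
import java.util.TreeSet;

class TreeSetDemo {
    /** Collects the given words into a TreeSet, which keeps them distinct and sorted. */
    static TreeSet<String> distinctSorted(String... words) {
        TreeSet<String> set = new TreeSet<>();
        for (String w : words) {
            set.add(w);          // add() returns false and changes nothing for a duplicate
        }
        return set;
    }

    public static void main(String[] args) {
        TreeSet<String> words = distinctSorted("pear", "apple", "pear");
        System.out.println(words);                 // prints [apple, pear]
        System.out.println(words.size());          // prints 2
        System.out.println(words.add("apple"));    // prints false: already present
        System.out.println(words.floor("pear"));   // prints pear: the stored element equal to the argument
    }
}
```

Because the set keeps its elements in sorted order at all times, iterating over it yields the alphabetized listing directly, with no separate sorting step.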

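The insert() logic described in step 7, together with the three simpler stubs, might be sketched as follows. This is one possible approach under stated assumptions, not the reference solution: SketchItem and SketchIndexItem are hypothetical stand-ins for the provided Item and IndexItem classes, whose real field names, constructors, and toString() format are not shown in this handout. The sketch also assumes that items are ordered by their word and that a new item starts with an occurrence count of one.

```java
import java.util.ArrayDeque;
import java.util.Iterator;
import java.util.Queue;
import java.util.TreeSet;

// Hypothetical stand-in for the lab's Item class: a word plus an occurrence count.
class SketchItem implements Comparable<SketchItem> {
    final String word;
    int occurrences = 1;                        // assumption: a new item counts one occurrence

    SketchItem(String word) { this.word = word; }

    void incOccurrences() { occurrences++; }

    @Override
    public int compareTo(SketchItem other) { return word.compareTo(other.word); }
}

// Hypothetical stand-in for IndexItem: adds a queue of line numbers.
class SketchIndexItem extends SketchItem {
    final Queue<Integer> lines = new ArrayDeque<>();

    SketchIndexItem(String word, int lineNumber) {
        super(word);
        lines.add(lineNumber);
    }

    @Override
    public String toString() {                  // output format invented for this sketch
        return word + " (" + occurrences + ") " + lines;
    }
}

class WordIndexSketch {
    private final TreeSet<SketchIndexItem> items = new TreeSet<>();

    void insert(SketchIndexItem item) {
        // New word: add() stores the item and returns true.
        if (items.add(item)) {
            return;
        }
        // Repeated word: floor() returns the stored element equal to item.
        SketchIndexItem existing = items.floor(item);
        existing.lines.add(item.lines.remove()); // move the single line number across
        existing.incOccurrences();
    }

    void removeAll() { items.clear(); }

    int getItemCount() { return items.size(); }

    Iterator<SketchIndexItem> listAscending() { return items.iterator(); }

    public static void main(String[] args) {
        WordIndexSketch index = new WordIndexSketch();
        index.insert(new SketchIndexItem("the", 1));
        index.insert(new SketchIndexItem("cat", 1));
        index.insert(new SketchIndexItem("the", 2));
        for (Iterator<SketchIndexItem> it = index.listAscending(); it.hasNext(); ) {
            System.out.println(it.next());      // prints "cat (1) [1]" then "the (2) [1, 2]"
        }
    }
}
```

TreeSet.add() leaves the set unchanged and returns false when an equal element is already present, which is what lets the sketch detect a repeated word. The stubbed listAscending() in WordIndex.java is declared with a raw Iterator return type; the sketch uses a parameterized one only for readability.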

What To Submit

When your project is complete and working correctly, demonstrate it to your lab instructor. Then, before you exit NetBeans, clean your Lab14 project. Finally, before you log out, open a terminal window and use the cd command to enter your csci/202/labs directory. Then create a JAR file of your Lab14 project folder, using the command

jar cf Lab14.jar Lab14
Please leave Lab14.jar in your csci/202/labs directory for the remainder of the semester.