Our problem for the day
Our task is very similar to Homework 4, except that we are looking for CSCI 107.
Breaking up the task
Modularization is always a good idea.
Go ahead and create a Java application, but this time write a main routine that reads one line at a time, using Scanner, and passes that line on to processLine, which will do the checking.
For now, have processLine use a Scanner to break the line into tokens, which are printed one per line. This is similar to what we did in Lab 3 to break up the lines of the ZIP table.
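The two routines described above could be sketched roughly as follows. This is only one possible layout: the class name TokenLister and the helper tokensOf are made up for illustration, and the sketch assumes tokens are separated by whitespace, which is the default for Scanner.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

class TokenLister {
    // Hypothetical helper: splits one line into whitespace-separated tokens.
    static List<String> tokensOf(String line) {
        List<String> tokens = new ArrayList<>();
        Scanner lineStream = new Scanner(line);
        while (lineStream.hasNext()) {
            tokens.add(lineStream.next());
        }
        return tokens;
    }

    // Prints each token of the line on its own output line.
    static void processLine(String line) {
        for (String token : tokensOf(line)) {
            System.out.println(token);
        }
    }

    // Reads standard input one line at a time and hands each line on.
    public static void main(String[] args) {
        Scanner inStream = new Scanner(System.in);
        while (inStream.hasNextLine()) {
            processLine(inStream.nextLine());
        }
    }
}
```

Keeping the token splitting in its own helper makes it easy to check without typing input by hand.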
Adding state
To solve problems like this, you need to write a loop that remembers what it has already seen.
Modify your program so that it prints a token only if the preceding token was "CSCI". You can do this by introducing a boolean variable called lastWordCSCI that remembers whether the previous token was "CSCI".
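A loop that remembers the previous token might look like the sketch below. The class name AfterCsci and the method tokensAfterCsci are invented for this example, and the method collects the tokens in a list rather than printing them so that the result is easy to inspect.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

class AfterCsci {
    // Returns the tokens that immediately follow a "CSCI" token.
    static List<String> tokensAfterCsci(String line) {
        List<String> found = new ArrayList<>();
        Scanner lineStream = new Scanner(line);
        boolean lastWordCSCI = false;   // state carried from one iteration to the next
        while (lineStream.hasNext()) {
            String word = lineStream.next();
            if (lastWordCSCI) {
                found.add(word);
            }
            // Update the state for the next trip around the loop.
            lastWordCSCI = word.equals("CSCI");
        }
        return found;
    }
}
```

Note that the state is updated on every iteration, after it has been used, so it always describes the token just before the current one.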
Using state
Now modify your program so that it prints the entire line only if there are two successive tokens "CSCI" and "107".
In your first attempt, the line may be printed several times if the two tokens occur more than once. Modify processLine so that this does not happen. Add an additional variable called matched that is set to true the first time "CSCI" and "107" occur in order.
Finite state machines and enumerations
At this point you should have two variables, lastWordCSCI and matched, that track the state of the loop.
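For reference, a two-variable version could look something like the sketch below. The class name LineMatcher and the helper containsCsci107 are assumptions made for this example, and printing once after the loop is just one way to avoid duplicate output; your own solution may print inside the loop and guard it with matched instead.

```java
import java.util.Scanner;

class LineMatcher {
    // Returns true if "CSCI" is immediately followed by "107" somewhere in the line.
    static boolean containsCsci107(String line) {
        Scanner lineStream = new Scanner(line);
        boolean lastWordCSCI = false;   // was the previous token "CSCI"?
        boolean matched = false;        // have we seen "CSCI" then "107" yet?
        while (lineStream.hasNext()) {
            String word = lineStream.next();
            if (lastWordCSCI && word.equals("107")) {
                matched = true;
            }
            lastWordCSCI = word.equals("CSCI");
        }
        return matched;
    }

    // Prints the line at most once, however often the pair occurs.
    static void processLine(String line) {
        if (containsCsci107(line)) {
            System.out.println(line);
        }
    }
}
```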
Let’s replace these with a single variable that uses Java enumerations.
To do this, you must first define the enumeration with a line similar to the following, which must appear inside your class but outside any of your method declarations. By the way, the values of the enumeration are considered constants, so they are written in all capital letters.
enum ProcessState { INITIAL, LASTWORDCSCI, MATCHED } ;
Now you must declare and initialize a state variable. The syntax for this is odd and wordy.
ProcessState loopState = ProcessState.INITIAL ;
Now you have to think a little. Your loop will move through three states: ProcessState.INITIAL, ProcessState.LASTWORDCSCI, and ProcessState.MATCHED. You can use the usual == operator to test the value of loopState and the usual = operator to set the value of loopState.
Try it out.
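If you get stuck, here is one way the three-state loop might be arranged. The class name StateMachineMatcher and the method scan are hypothetical, and stopping the loop as soon as MATCHED is reached is a design choice of this sketch rather than a requirement.

```java
import java.util.Scanner;

class StateMachineMatcher {
    enum ProcessState { INITIAL, LASTWORDCSCI, MATCHED }

    // Runs the state machine over one line and returns the final state.
    static ProcessState scan(String line) {
        ProcessState loopState = ProcessState.INITIAL;
        Scanner lineStream = new Scanner(line);
        while (lineStream.hasNext() && loopState != ProcessState.MATCHED) {
            String word = lineStream.next();
            if (loopState == ProcessState.LASTWORDCSCI && word.equals("107")) {
                loopState = ProcessState.MATCHED;        // "CSCI" then "107"
            } else if (word.equals("CSCI")) {
                loopState = ProcessState.LASTWORDCSCI;   // remember the "CSCI"
            } else {
                loopState = ProcessState.INITIAL;        // anything else resets
            }
        }
        return loopState;
    }

    // Prints the line only if the machine ended in the MATCHED state.
    static void processLine(String line) {
        if (scan(line) == ProcessState.MATCHED) {
            System.out.println(line);
        }
    }
}
```

The single variable loopState now carries everything that lastWordCSCI and matched used to carry between them.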
Enumerations for tokens
What we are doing is called parsing. Usually parsing is accomplished by representing the possible values of the tokens with an enumerated type. Let’s try this out.
First, define an appropriate enumerated type.
enum TokenType { CSCI, INTROCOURSE, OTHER } ;
Now write a method called word2token that is passed a token and returns the appropriate TokenType.
Use the following code to get started.
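private static TokenType word2token (String token) {
    if (token.equals("CSCI")) {
        return TokenType.CSCI ;
    }
    // Placeholder so the starter code compiles; add the remaining cases above this line.
    return TokenType.OTHER ;
}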
Now you need to modify processLine to use these tokens. This will involve adding a call to word2token within the loop of processLine, something like the following:
TokenType token = word2token(lineStream.next()) ;
Your program must now test against tokens rather than Strings. For example, an expression like word.equals("CSCI") may need to be changed to token==TokenType.CSCI.
Was this worth the trouble?
Modify your program so that "CSCI" can be in lower or upper case and the course number can be 107, 181, or 182. All you have to do is change word2token.
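A changed word2token might look roughly like this. The wrapper class Tokenizer is made up for the example, and the sketch assumes that mixed case such as "Csci" should also be accepted, which is what equalsIgnoreCase does; if only all-lower and all-upper case are wanted, compare against the two spellings explicitly.

```java
class Tokenizer {
    enum TokenType { CSCI, INTROCOURSE, OTHER }

    // Classifies a single word from the input line.
    static TokenType word2token(String word) {
        if (word.equalsIgnoreCase("CSCI")) {
            return TokenType.CSCI;
        }
        if (word.equals("107") || word.equals("181") || word.equals("182")) {
            return TokenType.INTROCOURSE;
        }
        return TokenType.OTHER;
    }
}
```

Because the rest of the program only looks at TokenType values, nothing in processLine needs to change.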
How would the pros do this?
They would probably use a regular expression.
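// Requires: import java.util.Scanner ; import java.util.regex.Pattern ;
public static void main(String[] args) {
    Scanner inStream = new Scanner(System.in) ;
    // "CSCI" or "csci", whitespace, then a course number, all delimited by whitespace or line ends
    Pattern regex = Pattern.compile("(^|\\s)(CSCI|csci)\\s+(107|181|182)(\\s|$)") ;
    while (inStream.hasNextLine()) {
        String line = inStream.nextLine() ;
        if (regex.matcher(line).find()) {
            System.out.println(line) ;
        }
    }
}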
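To see what the pattern accepts without running the whole program, it can be wrapped in a small helper, as in the sketch below. The class name RegexCheck and the method mentionsIntroCourse are invented for illustration; the pattern itself is the one shown above.

```java
import java.util.regex.Pattern;

class RegexCheck {
    // Same pattern as in the main routine above.
    static final Pattern REGEX =
        Pattern.compile("(^|\\s)(CSCI|csci)\\s+(107|181|182)(\\s|$)");

    // Returns true if the line contains a matching course reference.
    static boolean mentionsIntroCourse(String line) {
        return REGEX.matcher(line).find();
    }
}
```

Unlike a word2token based on equalsIgnoreCase, this pattern accepts only the spellings "CSCI" and "csci", so a mixed-case "Csci 107" is not matched.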