For this assignment, we want to process WORDS in input files and, potentially, change some of those WORDS
before printing them out.
For our assignment, a WORD is a sequence of non-whitespace characters delimited by whitespace (that is,
spaces, tabs, and newlines: whatever cctype’s isspace() returns true for).
It is possible that a WORD may begin and/or end with one or more punctuation characters (what cctype’s
ispunct() returns true for). Such a WORD therefore has leading and/or trailing punctuation.
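One way to separate a WORD's leading and trailing punctuation from the rest is sketched below. The names WordParts and splitWord are hypothetical and not part of the assignment, and the sketch assumes the WORD has already been cut out at whitespace boundaries.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper type: the three pieces of a WORD.
struct WordParts {
    std::string lead;   // leading punctuation (possibly empty)
    std::string core;   // whatever sits between the punctuation runs
    std::string trail;  // trailing punctuation (possibly empty)
};

// Split a whitespace-free WORD into leading punctuation, core, and trailing punctuation.
WordParts splitWord(const std::string& word) {
    std::size_t begin = 0;
    std::size_t end = word.size();
    // Cast to unsigned char: passing a negative char value to ispunct() is undefined.
    while (begin < end && std::ispunct(static_cast<unsigned char>(word[begin]))) ++begin;
    while (end > begin && std::ispunct(static_cast<unsigned char>(word[end - 1]))) --end;
    return {word.substr(0, begin), word.substr(begin, end - begin), word.substr(end)};
}
```

Punctuation inside the core (the apostrophe in don't, for instance) is left alone, because only the two ends are scanned. A WORD made up entirely of punctuation ends up with everything in lead and an empty core.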
A SUBSTITUTION is a line containing two WORDS.
Your program will process a file containing SUBSTITUTIONS and a file containing WORDS. Each
SUBSTITUTION in the first file is a replacement rule for WORDS in the second file. Your program should
process the input files and apply the SUBSTITUTIONS to each input WORD before printing it out. If a WORD
in the input matches the first WORD in a SUBSTITUTION, it is replaced with the second word in the
SUBSTITUTION, and then it is printed. If no SUBSTITUTION applies, the WORD is printed unchanged.
For example, imagine that we have these SUBSTITUTIONS in the first file:
foo many
hi hello
fish bicycle
And suppose we are given this second file:
So, hi everyone! This reminds me of foo things.
I need a new fish for my birthday.
The resulting output would be:
So, hello everyone! This reminds me of many things.
I need a new bicycle for my birthday.
NUMBER OF CHANGES: 3
There are a few rules about the file of SUBSTITUTIONS:
1. Blank lines are ignored
2. Lines that do not have two WORDS are ignored
3. Any leading and trailing punctuation in WORDS in a SUBSTITUTION is discarded
4. All of the letters in the WORDS in the SUBSTITUTIONS file should be converted to lower case
5. If there are multiple SUBSTITUTIONS with the same first WORD, the last SUBSTITUTION is used, and
all prior SUBSTITUTIONS for that WORD are discarded.
Because we discard punctuation in the SUBSTITUTIONS, and because we convert to lower case, the following
three lines in the SUBSTITUTIONS file all have the same first WORD. According to the rules, the last
SUBSTITUTION is the only one that is retained.
Hi!!! hello
“hi” hello
Hi Hello
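The five rules above could be implemented roughly as follows. This is only a sketch: normalize and loadSubstitutions are hypothetical names, and skipping a rule whose first WORD is nothing but punctuation is an assumption, since the assignment does not say what to do in that case.

```cpp
#include <cctype>
#include <istream>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Strip leading/trailing punctuation, then lower-case what is left (rules 3 and 4).
std::string normalize(const std::string& word) {
    std::size_t begin = 0, end = word.size();
    while (begin < end && std::ispunct(static_cast<unsigned char>(word[begin]))) ++begin;
    while (end > begin && std::ispunct(static_cast<unsigned char>(word[end - 1]))) --end;
    std::string out = word.substr(begin, end - begin);
    for (char& c : out) c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
    return out;
}

// Read SUBSTITUTIONS line by line into a map from first WORD to second WORD.
std::map<std::string, std::string> loadSubstitutions(std::istream& in) {
    std::map<std::string, std::string> rules;
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream fields(line);
        std::vector<std::string> words;
        std::string w;
        while (fields >> w) words.push_back(w);
        if (words.size() != 2) continue;   // rules 1 and 2: blank or malformed line
        const std::string key = normalize(words[0]);
        if (key.empty()) continue;         // assumption: an all-punctuation key is skipped
        rules[key] = normalize(words[1]);  // rule 5: a later line overwrites an earlier one
    }
    return rules;
}
```

Using operator[] on the map gives the "last SUBSTITUTION wins" behaviour for free, because assigning to an existing key overwrites the earlier value.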
When deciding if an input WORD matches a word in a SUBSTITUTION, the following rules should apply:
1. Any leading and trailing punctuation is ignored when deciding if a WORD matches
2. Any difference in case is ignored when deciding if a word matches
3. When performing a replacement, leading and trailing punctuation from the WORD is preserved
4. If the first letter after any leading punctuation in the input WORD is a capital letter, then the first letter in
the resulting replacement should also be capitalized
5. At most one replacement should be performed for each WORD
For example, imagine that we have these substitutions:
foo
hi hello
fish bicycle
And suppose we process this input:
“Hi, everyone!” said the boy. “Yeah, hi! I want a brand new fish!”
The resulting output would be:
“Hello, everyone!” said the boy. “Yeah, hello! I want a brand new bicycle!”
NUMBER OF CHANGES: 3
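The matching rules can be sketched as a single function that handles one WORD at a time. The name substituteWord is hypothetical, and the sketch assumes the rules map already holds lower-case, punctuation-free keys.

```cpp
#include <cctype>
#include <map>
#include <string>

// Hypothetical helper: apply at most one rule to a single WORD.
// `rules` is assumed to map lower-case, punctuation-free keys to replacements.
std::string substituteWord(const std::string& word,
                           const std::map<std::string, std::string>& rules,
                           int& changes) {
    std::size_t begin = 0, end = word.size();
    while (begin < end && std::ispunct(static_cast<unsigned char>(word[begin]))) ++begin;
    while (end > begin && std::ispunct(static_cast<unsigned char>(word[end - 1]))) --end;
    if (begin == end) return word;  // nothing but punctuation: nothing to match

    std::string key = word.substr(begin, end - begin);
    const bool capitalized = std::isupper(static_cast<unsigned char>(key[0])) != 0;
    for (char& c : key) c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));

    const auto it = rules.find(key);
    if (it == rules.end()) return word;  // no rule applies: print unchanged

    std::string replacement = it->second;
    if (capitalized && !replacement.empty())
        replacement[0] = static_cast<char>(
            std::toupper(static_cast<unsigned char>(replacement[0])));
    ++changes;
    // Put the original leading and trailing punctuation back around the replacement.
    return word.substr(0, begin) + replacement + word.substr(end);
}
```

Because the map is consulted exactly once and the result is never looked up again, at most one replacement happens per WORD.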
The program should keep a count of the number of substitutions applied. If any replacements were made, the
last line of output, after processing the entire input file, should be the line:
NUMBER OF CHANGES: N
Where N is the number of replacements that were made.
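Note that the summary line is printed only if at least one replacement was made. A small sketch of that rule, using the hypothetical name changeSummary and assuming the line ends with a newline:

```cpp
#include <string>

// Hypothetical helper: build the final summary line.
// Returns an empty string when no replacement was made, so nothing is printed.
std::string changeSummary(int changes) {
    if (changes <= 0) return "";
    return "NUMBER OF CHANGES: " + std::to_string(changes) + "\n";
}
```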
The program takes two command line arguments: the first is the name of a file containing SUBSTITUTIONS,
and the second is the name of a file containing WORDS. The program should process the SUBSTITUTIONS
file according to the rules described above. Then, it should read the file containing WORDS, apply any
replacements indicated from the SUBSTITUTIONS, and print out the result.
Note that any and all whitespace between WORDS in input is printed unchanged to output.
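Since whitespace has to come out exactly as it went in, extracting WORDS with operator>> is not enough by itself, because it throws the whitespace away. One possible approach is to read one character at a time. The sketch below uses hypothetical names (WordTransform, copyPreservingWhitespace, transformText) and takes the per-WORD processing as a callback so that it stays independent of how substitutions are stored.

```cpp
#include <cctype>
#include <functional>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

using WordTransform = std::function<std::string(const std::string&)>;

// Copy `in` to `out`, passing every WORD through `transform` and
// echoing every whitespace character unchanged.
void copyPreservingWhitespace(std::istream& in, std::ostream& out,
                              const WordTransform& transform) {
    std::string word;
    char c;
    while (in.get(c)) {
        if (std::isspace(static_cast<unsigned char>(c))) {
            if (!word.empty()) {  // a WORD just ended
                out << transform(word);
                word.clear();
            }
            out << c;             // whitespace goes out as-is
        } else {
            word += c;
        }
    }
    if (!word.empty()) out << transform(word);  // last WORD, no trailing whitespace
}

// Convenience wrapper for trying the function on an in-memory string.
std::string transformText(const std::string& text, const WordTransform& transform) {
    std::istringstream in(text);
    std::ostringstream out;
    copyPreservingWhitespace(in, out, transform);
    return out.str();
}
```

An empty input, or one containing only whitespace, never calls the callback and is copied through unchanged.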
If exactly two filenames are not provided, print “TWO FILENAMES REQUIRED”, and stop. If a file cannot be
opened or read for any reason, print the error message “BAD FILE FILENAME”, and stop. It is possible that
either or both of the files your program reads may be empty. It is also possible that the second file may contain
no words. Neither of these situations is an error.
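The error cases might be checked up front along these lines. The name checkInputs is hypothetical, and the sketch assumes that FILENAME in the message stands for the name of the file that could not be opened.

```cpp
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: returns an empty string when both files can be opened,
// otherwise the error message that should be printed before stopping.
std::string checkInputs(const std::vector<std::string>& filenames) {
    if (filenames.size() != 2) return "TWO FILENAMES REQUIRED";
    for (const std::string& name : filenames) {
        std::ifstream file(name);
        // Assumption: FILENAME in the message is the offending file's name.
        if (!file.is_open()) return "BAD FILE " + name;
    }
    return "";
}
```

In main, the arguments after the program name can be collected with std::vector<std::string> names(argv + 1, argv + argc) and passed to this function. An empty file opens successfully, so it is not reported as an error.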
The program will be submitted and graded in separate parts:
Part 1
Recognizing error cases associated with the wrong number of file names and files that cannot be opened.
Handling cases with an empty SUBSTITUTIONS file.
Handling simple substitutions (no punctuation, no case changes).
test case:
This file shows all the test case names, and what will be run for each test case
You can run an individual test case by name by saying "runcase NAME", where NAME
is the name of the test case. If you type "runcase" it will give you the names
of all the test cases.
When a test case runs, it saves the output of your program and compares it to the
contents of the correct run of the program. The file containing the correct answer
is NAME.correct, where NAME is the name of the test case.
If the test case fails, the Unix "diff" program shows you the differences between
your output and the correct output. Lines that begin with < or > are highlighting
the differences.
If you want to run on your own machine, you can download cases.tar.gz and extract
everything. The archive includes a shell script called StudentTest. You can use
StudentTest on your machine, the same way you would use runcase, as long as
a. everything is in the same directory
b. your executable is named "prog1"
c. you have bash and unix utilities like diff
> For example, imagine that we have these SUBSTITUTIONS in the first file:
> foo many
Write the program which allows you to print out
Replacing instances of foo with many
> And suppose we are given this second file:
> So, hi everyone! This reminds me of foo things.
Write the program which allows you to print out
Found word "So,"
Found word "hi"
Found word "everyone!"
....
I've given you a strategy to help you solve this problem, which actually helps you solve many problems.
> Could u write the code for me, please?
Doing so would be a waste of my time.
Doing so would be depriving you of a chance to learn how to analyse a problem, and to write code.
Doing so would just put off for another week the realisation that you're already behind.
So, either buckle down to some actual work (more than posting your assignment on forums, and begging).
Or just drop the course, because your path at the moment can only end in failure.
I remember when I was in this position a year ago. I came to this site hoping someone would do my code for me the night before it was due. Real eye opener when I realised how much I didn't know and how much time I had wasted during the semester. I paid someone to do my assignment and I still failed. Now I'm repeating the course and I can't stop programming. Once you get a basic handle on it and start fiddling with more complex things ahead of what you're learning, it becomes a frustrating kind of fun. Right now I am trying to stay ahead of the current year 1s, since it would be a shame to be outdone by them.
As salem c said, you should buckle down and do some work. He gave you a strategy, not so much an algorithm, but use the strategy: write pseudocode, make an algorithm, and then you might be able to make an attempt at what you're supposed to do. The assignment doesn't sound too difficult: read in your files, store them in strings, find the words, compare and replace. You'd get more help if you had some form of working code rather than just an assignment description.