Problem Set 7: Word frequencies

Submit this assignment to ps7 on Handin.

Note: Whenever you write a function in this class, follow the design recipe. You will be graded accordingly.

Problem 1. Develop a data definition and a structure definition for a Frequency, which combines a String and a Number and represents that many uses of that string. Call the structure frequency, with fields word and count.

Problem 2. Develop a data definition ListOfString that uses cons and empty to hold arbitrarily many Strings. Similarly, develop a data definition for ListOfFrequency that uses cons and empty to hold arbitrarily many Frequencys.

Note: Throughout the rest of this assignment, a ListOfFrequency should never contain multiple Frequencys with the same word. You can make this assumption when your function receives a ListOfFrequency in its input, and you must maintain this guarantee when your code produces a ListOfFrequency in its output. For example, the following is a valid ListOfFrequency:
(cons (make-frequency "hello" 5)
      (cons (make-frequency "hi" 4)
            (cons (make-frequency "bye" 2)
                  empty)))
But the following is not a valid ListOfFrequency, so you don’t need to worry about receiving it as input, and you must not produce it as output:
(cons (make-frequency "hello" 5)
      (cons (make-frequency "hi" 4)
            (cons (make-frequency "hello" 2)
                  empty)))

Problem 3. Design a function count-word that consumes a ListOfFrequency and a String and adds 1 to the frequency for that string, producing a new ListOfFrequency. If there is no Frequency for that string, the resulting ListOfFrequency should have a Frequency with that string and the number 1.
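As an illustration only, the two cases described above can be written out as data, without implementing count-word. This sketch assumes the structure definition from Problem 1. The constant names are hypothetical, and placing a new Frequency at the end of the list is just one possible ordering, since the problem does not say where it should go.

```racket
; Assumed structure definition, using the names given in Problem 1.
(define-struct frequency [word count])

; A sample input list (hypothetical name).
(define FREQS-BEFORE
  (cons (make-frequency "hello" 5)
        (cons (make-frequency "hi" 4)
              empty)))

; Expected result of counting "hi" once more: only the count for "hi" changes.
(define FREQS-AFTER-HI
  (cons (make-frequency "hello" 5)
        (cons (make-frequency "hi" 5)
              empty)))

; Expected result of counting the unseen word "bye": a new entry with count 1.
; Its position (here, the end of the list) is an assumption.
(define FREQS-AFTER-BYE
  (cons (make-frequency "hello" 5)
        (cons (make-frequency "hi" 4)
              (cons (make-frequency "bye" 1)
                    empty))))
```

Constants like these can serve as the inputs and expected outputs of the examples that the design recipe asks for.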

Problem 4. Design a function count-all-words that takes a ListOfString and produces a ListOfFrequency with the frequencies counted from the entire list of strings.
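To make the expected result concrete, here is a small input and one acceptable output, written as data rather than as an implementation. The sketch assumes the structure definition from Problem 1. The constant names are hypothetical, and the order of the Frequencys in the result is an assumption, since the problem does not specify one.

```racket
; Assumed structure definition, using the names given in Problem 1.
(define-struct frequency [word count])

; A sample ListOfString (hypothetical name).
(define SAMPLE-WORD-LIST
  (cons "to" (cons "be" (cons "or" (cons "not" (cons "to" (cons "be" empty)))))))

; One acceptable ListOfFrequency for SAMPLE-WORD-LIST.
; Each distinct word appears exactly once, and the counts add up to 6,
; the length of the input list.
(define SAMPLE-WORD-FREQUENCIES
  (cons (make-frequency "to" 2)
        (cons (make-frequency "be" 2)
              (cons (make-frequency "or" 1)
                    (cons (make-frequency "not" 1)
                          empty)))))
```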

Problem 5. Download the text of Hamlet into the same folder as the file in which you are completing this assignment.

Then create a list of words from the downloaded file, using the read-words function from the 2htdp/batch-io library.
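A minimal sketch of how read-words behaves, using a small sample file so that it does not depend on the Hamlet download. The file name "sample.txt" and the constant names are hypothetical. Substitute the name of your downloaded file when you do the problem.

```racket
(require 2htdp/batch-io)

; Write a small sample file next to this program.
; write-file produces the name of the file it wrote.
(define SAMPLE-FILE (write-file "sample.txt" "to be or not to be"))

; read-words produces the file's contents as a list of strings,
; one per whitespace-separated token.
(define SAMPLE-WORDS (read-words SAMPLE-FILE))
```

Because read-words splits only at whitespace, punctuation attached to a word stays part of that word.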

Then compute the frequencies of all of the words in the file. Because this operation might take a while, don’t define it as a constant. Instead, put the operation (not its result) in a short comment, like this:
; (define hamlet-frequencies ...)

Problem 6. Design a function frequents that consumes a ListOfFrequency and produces a ListOfFrequency containing only the Frequencys from the original list whose count is more than 100. Use this function to compute all the words used more than 100 times in Hamlet. Include the resulting list in your submission, as another comment.
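The cutoff can be illustrated with data alone. The sketch below assumes the structure definition from Problem 1. The constant names, words, and counts are invented for illustration and are not taken from Hamlet. Note that "more than 100" excludes a count of exactly 100.

```racket
; Assumed structure definition, using the names given in Problem 1.
(define-struct frequency [word count])

; A sample input with counts above, at, and just over the cutoff (invented data).
(define SAMPLE-FREQS
  (cons (make-frequency "alpha" 150)
        (cons (make-frequency "beta" 100)
              (cons (make-frequency "gamma" 101)
                    empty))))

; Expected result of frequents on SAMPLE-FREQS:
; "beta" is dropped because 100 is not more than 100.
(define SAMPLE-FREQUENTS
  (cons (make-frequency "alpha" 150)
        (cons (make-frequency "gamma" 101)
              empty)))
```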