Problem 1: (60 points)

You have a large collection (say, a million) of ten-character strings. You are to search this collection for circular rotations of a key. So, if you were given the key ALGORITHMS, you would search for:

    ALGORITHMS LGORITHMSA GORITHMSAL ORITHMSALG RITHMSALGO ITHMSALGOR ....

Now, suppose you have three different architectures on which to run this problem: (1) a moderate-sized (say, 10-dimensional) hypercube, (2) a vector supercomputer (like a Cray), and (3) 1000 IBM PCs with 40-megabyte hard disks, each with its own modem and telephone line.

Part A: Describe how you could efficiently solve this problem using each of the above three architectures. Don't write code, but outline the algorithm you'd use to solve the problem. Go into moderate detail about how you partition the database. Describe any synchronization required by the processors. Estimate the performance of your algorithmic solution.

Part B: Suppose your system is going to receive several key requests concurrently. If necessary, modify your solution to process concurrent rotational key requests. Again, estimate the performance of your algorithm.

Part C: Finally, suppose that rather than a database of one million strings of length ten, you had a database of one thousand strings of length ten thousand. Would you modify your solution? How?

Problem 2: (20 points)

Show the data dependencies in the following code and describe how it might be efficiently vectorized.

          DO 20 I=2,N
          B(I) = C(I-1)
          C(I) = D(I) + C(I-1)
       20 D(I) = A(I) + B(I)

Problem 3: (20 points)

Describe the parallelism that is obtained in executing the following Id Nouveau program on a dataflow machine.

    rec_silly A i = if i==0 then 0 else A[i DIV 2] + 1
    % the Pascal DIV operator -- 16 DIV 2 == 8, 17 DIV 2 == 8

    make_array (l, u) generate =
        { A = array (l, u);
          { for i from l to u do A[i] = generate i}
        in A} ;
    % program (8) on page 473 of the Arvind and Ekanadham article

    V = make_array (0, 15) (rec_silly V)
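
Note on Problem 1: the set of circular rotations of a key can be made concrete with a short sequential sketch. This is only an illustration of the definition, not a solution to any part of the problem; the function name `rotations` is hypothetical, and Python is used purely for convenience.

```python
def rotations(key):
    """Return every circular left-rotation of key, starting with key itself.

    Hypothetical helper for illustration only; a string of length n
    yields n rotations (not necessarily distinct).
    """
    return [key[i:] + key[:i] for i in range(len(key))]


if __name__ == "__main__":
    # Print the ten rotations of the example key from Problem 1.
    for r in rotations("ALGORITHMS"):
        print(r)
```

A ten-character key therefore produces ten search targets, which is the set the database must be checked against.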