Computer Networking — Python Practice II

A little about modules

Look at the Python tutorial on Modules (up to Packages). Create the file as described in the tutorial and work through Section 6.3; you can skip Section 6.1.3.

Start up Python, perhaps in idle or charm, and learn a bit more about the socket module by trying out the following commands.

import socket
s = socket.socket()
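
One way to continue exploring from here is sketched below. It is only an illustration: the variable names public and family are made up for this sketch, and the defaults noted in the comments are those of the standard socket module.

```python
import socket

# With no arguments, socket.socket() creates an IPv4 (AF_INET)
# stream (SOCK_STREAM, i.e. TCP) socket.
s = socket.socket()

# dir() lists the attributes of the socket object; filter out the
# underscore-prefixed names to see the methods you would normally call.
public = [name for name in dir(s) if not name.startswith("_")]
print(public)

# Remember the address family, then release the socket.
family = s.family
print(family == socket.AF_INET)
s.close()
```

Look for names such as connect, send and recv in the list. They come up again whenever you write a client.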

Opening URLs

Open the documentation for urllib2, which everyone seems to agree is pretty bad, but better than urllib.

Restart Python and import the urllib2 module. Use the dir function on urllib2. Find a function to open the URL (hint: urlopen) and use dir on the object representing the opened URL. It should look a lot like an object for reading a file.

With a single method call, read the web page at the URL. There’s a little bit of redirection here that may disappoint you. That’s another reason why urllib2 is unpopular.
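
The sketch below illustrates the "looks like a file" idea without touching the network. It rests on several assumptions: a temporary local file with made-up content stands in for the real web page, it is opened through a file: URL, and the try/except import is there only so the same code also runs on Python 3, where urlopen lives in urllib.request. The names path, response, file_like and page are hypothetical.

```python
import os
import tempfile

try:
    # Python 2, as used in this lab
    from urllib2 import urlopen
    from urllib import pathname2url
except ImportError:
    # Python 3 moved both functions into urllib.request
    from urllib.request import urlopen, pathname2url

# Write a small stand-in page to a temporary file (made-up content).
fd, path = tempfile.mkstemp(suffix=".html")
with os.fdopen(fd, "w") as f:
    f.write("<html><body><h1>Parse me!</h1></body></html>")

# Open it through urlopen using a file: URL instead of a web address.
response = urlopen("file:" + pathname2url(path))

# Check which of the usual file methods the response object offers.
file_like = [name for name in ("read", "readline", "readlines", "close")
             if hasattr(response, name)]

# A single read() call returns the whole page.
page = response.read()
response.close()
os.remove(path)

print(file_like)
print(page)
```

With a real http URL the call pattern is the same, but the object returned and what dir shows for it may differ from this local stand-in.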

Parsing HTML

You have been advised to use HTMLParser in Homework 3.

I'm assuming you have already tried out the example in Section 19.1.1. Your assignment for this lab is to modify this example so that it records and prints the level of each start tag. This means the output should be something like the following:

Encountered a start tag: html [level 0]
Encountered a start tag: head [level 1]
Encountered a start tag: title [level 2]
Encountered some data  : Test
Encountered an end tag : title
Encountered an end tag : head
Encountered a start tag: body [level 1]
Encountered a start tag: h1 [level 2]
Encountered some data  : Parse me!
Encountered an end tag : h1
Encountered an end tag : body
Encountered an end tag : html

Java and C++ have similar ways of handling initialization of derived classes. Python is weird. I suggest you take a look at some of the following.
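
As one illustration of the difference, here is a minimal sketch of a derived class that adds its own state. It is not the lab solution: the class name TagCounter and the attribute start_tags are made up, and the class only counts start tags. The point is that Python does not run the base-class initializer for you once you define your own __init__, so the sketch calls it explicitly. The try/except import is an assumption added so the code also runs on Python 3.

```python
try:
    from HTMLParser import HTMLParser      # Python 2, as used in this lab
except ImportError:
    from html.parser import HTMLParser     # Python 3 location

class TagCounter(HTMLParser):
    """Hypothetical example class: counts the start tags it sees."""

    def __init__(self):
        # Unlike Java or C++, Python does not call the base-class
        # initializer automatically when a subclass defines __init__.
        # Skipping this call leaves the parser's internal state unset.
        HTMLParser.__init__(self)
        # State that belongs to the derived class.
        self.start_tags = 0

    def handle_starttag(self, tag, attrs):
        self.start_tags += 1

parser = TagCounter()
parser.feed("<html><head><title>Test</title></head>"
            "<body><h1>Parse me!</h1></body></html>")
print(parser.start_tags)
```

The explicit HTMLParser.__init__(self) form is used here because it works whether or not the base class supports super().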