We discuss page monitors and then go into some detail about advanced RSS feed-manipulation Web sites.
Class held on 10/22/2008. Student notes are available on this page. Possible questions are available on this page.
Class structure
- Go through “At beginning of class” information
- Go through diagram explaining page monitors & RSS filters
- Work on exercises
At beginning of class
- I haven't graded anything new.
- Check the class notes page to see who is taking notes today and who is coming up with questions.
- Go over any other announcements you might have missed since last class.
- Check the start page for any blogs you might be interested in that you might have missed.
- I have arranged the Yahoo visit for early December. Don't miss that class!
- I want to review what this class is about.
- Let's look over the upcoming industry updates.
- For this class:
- Google update: Dylan Burkhardt (dylanb)
- For next class:
- Search industry update: Max Rossiter
My notes

Page monitoring software
Overview
Page monitors were the next big thing five years ago. A page monitor is a program that you download, or a Web-based service. Each day (or at whatever interval you set), it downloads the Web page, and if the page is different from the previous copy, it sends you an email. Some monitors tell you what has changed, while others just tell you that something has changed.
At first, you might not be that impressed with page monitors. But once you realize that they can be used for a lot more than news, they can be quite useful tools. WatchThatPage.com is the best free site.
WatchThatPage limits URLs to 250 characters. Also, shortened URLs (from tinyurl.com) do not work. To get around these problems, use TrackEngine, which has neither limitation.
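The basic check that a page monitor performs can be sketched in a few lines of Python. This is only an illustration of the idea described above, not how WatchThatPage or TrackEngine is actually implemented; the function names and the sample page text are made up for the example.

```python
import difflib
import hashlib


def fingerprint(page_text: str) -> str:
    """Return a digest that changes whenever the page text changes."""
    return hashlib.sha256(page_text.encode("utf-8")).hexdigest()


def has_changed(previous_fingerprint: str, current_text: str) -> bool:
    """Compare the digest stored from the last download with today's download."""
    return fingerprint(current_text) != previous_fingerprint


def added_lines(old_text: str, new_text: str) -> list:
    """List the lines that appear in the new copy but not in the old one."""
    diff = difflib.ndiff(old_text.splitlines(), new_text.splitlines())
    return [line[2:] for line in diff if line.startswith("+ ")]


# Hypothetical snapshots of the same page on two different days.
monday = "Press releases\nNo new releases."
tuesday = "Press releases\nQ3 earnings announced."

stored = fingerprint(monday)
print(has_changed(stored, monday))    # same content, so no alert is needed
print(has_changed(stored, tuesday))   # content differs, so an email would go out
print(added_lines(monday, tuesday))   # what a "tell me what changed" monitor reports
```

A monitor that only reports that a page changed needs just the fingerprint comparison; one that highlights the changes also has to keep the previous copy of the page so it can compute the difference.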
- Capabilities
- Automatically determine if a Web page, or part of a Web page, has changed
- Results might be delivered via email, RSS feed, or a summary Web page
- Page Monitoring Software Examples
- Track a company's press release page (Goldman Sachs)
- Find out when a new version of software is released (BBEdit)
- Find out when a new product is released (Canon cameras)
- Track a product category (Flat panel LCD TVs at Amazon)
- Monitor product information (comments at CNet about a Tivo)
- Track auctions
- Track new jobs (at Google)
- Monitor earnings releases (at JPMorganChase)
- Track who is linking to you (e.g., David Pogue)
- Follow investment information about a company (e.g., Goldman Sachs at The Motley Fool)
Feed creation software
Overview
- Capabilities
- Demonstrations
- Demonstration with Feed43 and the JPMorganChase Annual report (the feed)
- Demonstration with Feed43 and a Google Web search results page
- Demonstration with Feed43 and the Goldman Sachs press release page
Make a feed
From a page
- Dapper
- Description: Dapper is pretty slick. You can look through user-created Dapps or (easily) create your own. Don’t forget to use the “get a nice short url” option to create a URL that is easier to read and use. Dapper lets you get an RSS feed for more things than just news and blogs, such as searches.
- The Glory, Bliss and How-to of Screen Scraping for RSS
- Demo
- Video tutorial
- Useful Dapps
- FeedYes
- Feed43
- Feed43 is a little bit more complicated. You have to find the actual HTML within the source code of the page.
- Define extraction rules by finding the specific places (within the code) that hold the information you want the RSS feed to monitor. The site gives directions for what specific code to use.
- Then click extract
- Then you can give it a title, description, URL, etc.
- Then specify where the title, date, etc. are
- If a site is updated only once a month, it's too much of a hassle to make one of these feeds (use a page monitor instead). But if it is updated daily and you want to monitor it, then it might be a good idea to make one!
- Free, or $29/year for 20 hourly updates
- My feeds
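To make the idea of an extraction rule more concrete, here is a small Python sketch. It assumes a pattern syntax in the style of Feed43, where `{%}` marks a piece of text to capture and `{*}` marks text to skip; the translation to a regular expression, the function names, and the sample HTML are all illustrative assumptions, not Feed43's actual implementation.

```python
import re


def pattern_to_regex(pattern: str) -> "re.Pattern":
    """Turn a {%}/{*} style pattern into a compiled regular expression."""
    regex = ""
    # The capturing group keeps the placeholders in the split result.
    for part in re.split(r"(\{%\}|\{\*\})", pattern):
        if part == "{%}":
            regex += "(.*?)"          # capture this piece
        elif part == "{*}":
            regex += ".*?"            # skip this piece
        else:
            regex += re.escape(part)  # literal HTML must match exactly
    return re.compile(regex, re.DOTALL)


def extract_items(html: str, pattern: str) -> list:
    """Return one tuple of captured values per repeated item on the page."""
    return [match.groups() for match in pattern_to_regex(pattern).finditer(html)]


# Hypothetical source code of a press release page.
html = """
<ul>
<li><a href="/press/1.html">First release</a> <span>Oct 20</span></li>
<li><a href="/press/2.html">Second release</a> <span>Oct 21</span></li>
</ul>
"""
rule = '<li><a href="{%}">{%}</a>{*}<span>{%}</span></li>'

for link, title, published in extract_items(html, rule):
    print(title, link, published)
```

Each captured tuple corresponds to one feed item; the later steps (title, URL, date) amount to saying which captured value goes into which field of the feed.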
From other feeds
- FeedRinse: From their site, “Feed Rinse is an easy to use tool that lets you automatically filter out syndicated content that you aren't interested in. It's like a spam filter for your RSS subscriptions.”
- Yahoo Pipes
- FeedZero: This uses adaptive filtering software to learn what feed articles you like, and which you don't like, based on your input.
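The kind of processing these feed tools do can be sketched in Python as three small steps: combine feeds, keep only the items that match a keyword, and sort the result. This is only a rough model of operations such as Union, Filter, and Sort; the item structure, function names, and sample items are assumptions made for the example, and none of this is the tools' real code.

```python
from datetime import date


def union(*feeds):
    """Merge several feeds (lists of items) into one list."""
    merged = []
    for feed in feeds:
        merged.extend(feed)
    return merged


def keep_matching(items, keywords):
    """Keep only items whose title contains at least one keyword."""
    lowered = [keyword.lower() for keyword in keywords]
    return [
        item for item in items
        if any(keyword in item["title"].lower() for keyword in lowered)
    ]


def newest_first(items):
    """Sort items by publication date, most recent first."""
    return sorted(items, key=lambda item: item["published"], reverse=True)


# Hypothetical items from two general news feeds.
feed_a = [
    {"title": "Google releases new search feature", "published": date(2008, 10, 20)},
    {"title": "Weather update", "published": date(2008, 10, 21)},
]
feed_b = [
    {"title": "Yahoo and Google ad deal news", "published": date(2008, 10, 22)},
]

result = newest_first(keep_matching(union(feed_a, feed_b), ["google"]))
for item in result:
    print(item["published"], item["title"])
```

A filter that removes unwanted items instead of keeping wanted ones is the same test with the condition negated.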
Purpose of today's tools
We’ve already set up some Email alerts. The remaining question is: why are there so many different tools, and when would we use each one?
- Focused RSS feed: If you’re lucky, the site offers a keyword-based or topic-specific RSS feed that you can subscribe to.
- General RSS feed: If there's simply a general RSS feed (such as "Yahoo breaking news"), then you should run that feed through a keyword tool:
- FeedRinse
- Yahoo Pipes (if other processing is needed)
- The following are useful if there's no RSS feed available on a page:
- FeedYes: I would try this first since it's the easiest to use when setting up a feed.
- Feed43: This is more powerful but more difficult to use.
- Dapper: This is another powerful tool. One of the benefits of this tool is that you can use pre-defined feeds ("dapps").
- Page monitor: Use a page monitor to have any updates to a specific page sent to your Email account.
- Email alerts: These were a precursor to RSS feeds. Some sites will send you updates via email, not RSS.
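The choices in this list can be summarized as a small decision function. This is just one reading of the guidance above, written as a sketch; the function name and its parameters are invented for the example.

```python
def suggest_tool(has_focused_feed, has_general_feed,
                 needs_other_processing=False, prefers_email=False):
    """Suggest a tool for following a site, based on what the site offers."""
    if has_focused_feed:
        # Best case: the site already publishes exactly the feed you want.
        return "Subscribe to the focused RSS feed"
    if has_general_feed:
        # Narrow a general feed down with a keyword tool.
        return "Yahoo Pipes" if needs_other_processing else "FeedRinse"
    if prefers_email:
        # No feed, and updates should arrive in your inbox.
        return "Page monitor"
    # No feed at all: build one from the page itself.
    return "FeedYes (then Feed43 or Dapper if more power is needed)"


print(suggest_tool(has_focused_feed=False, has_general_feed=True))
print(suggest_tool(has_focused_feed=False, has_general_feed=False))
```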
Hints about possible test questions
You're definitely going to be held responsible for the following topics:
- What WatchThatPage (as an example of a page monitor) can do
- What Dapper can do
- What Feed43 can do and how its search patterns work
- What Yahoo Pipes can do and how feeds can be manipulated (for example, Fetch Feeds, Union, Filter, Sort)
- Under what circumstances would you use each one of these tools (as opposed to another)
I'll add to this later but this should give you an idea of the type of questions that I might ask.
Possible blog topics
You do not have to write a blog. These are suggested blog topics if you were to write one. There are lots of possibilities in this class. Describe different ways that you found these tools useful. Describe how you used Yahoo Pipes, possibly differently from how we have described it here.
Resources
Page monitors
Web-based
- WatchThatPage
- Free (for any number of pages), or $20/year for priority service
- Can highlight changes in pages
- Changes sent in an email
- Keyword matching
- This site doesn't appear to be updated any more (3+ years)
- TrackEngine
- Free for 5 bookmarks, or $20/year for 10 pages, or $53/year for 50 pages
- Highlights new content in HTML email
- Monitors changes daily
- Does keyword matching
- This site hasn't been worked on for 6+ years
- Other possible sites: InfoMinder, ChangeDetect, Trackle
Mac software
Windows software
- WebSite-Watcher
- Free for 30 days; $47 purchase