Automatically recording programs with the KiSS DP-558
Written by Erik Brakkee
Last year, I bought a DP-558 KiSS hard disk recorder. One cool feature the player has is the ability to schedule recordings
through an Electronic Programme Guide (EPG) on the internet. This is
very simple: using your own user id, you log on to the site, browse for
interesting programs, and record them. The hard disk recorder, which is
connected to the internet, regularly polls a KiSS server to find out which
recordings have been scheduled. This is possible since the hardware id
of the KiSS player is linked to your user id. Now this is cool, but
wouldn't it be even more cool to automatically record certain shows
when they appear and to send notifications by mail about possibly
interesting programs?
This is how I came up with the idea to crawl the KiSS EPG site for
programme information and automatically record interesting programs.
Every day at a very early time (e.g. 5 AM), the crawler scans the site for
interesting shows. To implement the crawling, I developed a very simple
web crawling framework. The framework defines the concepts of Page and
Action. A Page is basically an HTML page transformed into an XML
content model of that page. An Action is basically a hyperlink with a
user-friendly (content-based) name. To implement crawling, a program
must be written that performs the necessary navigation using pages and
actions, extracting the content as it progresses.
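
To make the Page and Action concepts a bit more concrete, here is a minimal sketch in Java. It is not the framework's actual code: the class names `PageSource` and `NavigationSketch`, the method names, and the simplification of a page's XML content model to a plain string are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** A hyperlink identified by a content-based name rather than by its URL. */
final class Action {
    private final String name;
    private final String url;

    Action(String name, String url) {
        this.name = name;
        this.url = url;
    }

    String name() { return name; }
    String url() { return url; }
}

/** The content of one page plus the actions that lead away from it. */
final class Page {
    private final String content; // simplified: the real content model is XML
    private final List<Action> actions;

    Page(String content, List<Action> actions) {
        this.content = content;
        this.actions = new ArrayList<>(actions);
    }

    String content() { return content; }

    /** Returns the action with the given name, or null if the page has none. */
    Action action(String name) {
        for (Action candidate : actions) {
            if (candidate.name().equals(name)) {
                return candidate;
            }
        }
        return null;
    }
}

/** Hypothetical stand-in for fetching a URL and transforming it into a Page. */
interface PageSource {
    Page execute(Action action);
}

final class NavigationSketch {
    /** Follows the named actions in order, collecting the content of every page visited. */
    static List<String> crawl(PageSource source, Action start, List<String> actionNames) {
        List<String> contents = new ArrayList<>();
        Page page = source.execute(start);
        contents.add(page.content());
        for (String name : actionNames) {
            Action next = page.action(name);
            if (next == null) {
                break; // expected link is missing, so stop navigating
            }
            page = source.execute(next);
            contents.add(page.content());
        }
        return contents;
    }
}
```

A crawl program written this way only refers to actions by name (for example "tv-guide", then "channel-1"), so it does not need to change when the URLs behind those links change.
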
The framework creates Page objects by first tidying up the HTML page into XHTML (using JTidy)
and then reducing that XHTML (content plus presentation) to XML that
holds the content only. The framework determines which XSLT to use from
its configuration, which is keyed on page type or URL. If the
action in the transformed page contains a page type, then that is used
to determine the XSLT. Otherwise, a URL match determines which XSLT to
use.
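
The selection rule and the transformation step described above can be sketched as follows. This is only an illustration under stated assumptions: the class `PageTransformSketch` and its method names are hypothetical, stylesheets are held as source text rather than as files, the input is assumed to be already tidied into well-formed, namespace-free XHTML (the JTidy step is left out), and falling back to a URL match when a page type is unknown is my own choice. The transformation itself uses the standard `javax.xml.transform` API.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

/** Hypothetical sketch: picks a stylesheet by page type, else by URL, and applies it. */
final class PageTransformSketch {
    /** One URL rule: a regular expression and the stylesheet it maps to. */
    private static final class UrlRule {
        final Pattern pattern;
        final String xslt;

        UrlRule(String regex, String xslt) {
            this.pattern = Pattern.compile(regex);
            this.xslt = xslt;
        }
    }

    private final Map<String, String> byPageType = new HashMap<>();
    private final List<UrlRule> byUrl = new ArrayList<>();

    void registerPageType(String pageType, String xslt) {
        byPageType.put(pageType, xslt);
    }

    void registerUrl(String regex, String xslt) {
        byUrl.add(new UrlRule(regex, xslt));
    }

    /** Returns the stylesheet for a page; pageType may be null if the action carried none. */
    String select(String pageType, String url) {
        if (pageType != null && byPageType.containsKey(pageType)) {
            return byPageType.get(pageType); // a configured page type takes precedence
        }
        for (UrlRule rule : byUrl) { // assumption: the first matching rule wins
            if (rule.pattern.matcher(url).matches()) {
                return rule.xslt;
            }
        }
        throw new IllegalArgumentException("No stylesheet configured for " + url);
    }

    /** Applies the stylesheet to well-formed XHTML and returns the resulting XML text. */
    static String transform(String xhtml, String xslt) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(xslt)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xhtml)),
                    new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }
}
```

In use, a stylesheet that turns every `<a>` element into an `<action name="..." href="..."/>` element could be registered for a URL pattern such as `.*/guide/.*`; calling `select(null, url)` followed by `transform(xhtml, xslt)` would then yield the content-only XML for that page.
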
The KiSS crawler is just an application of this simple web crawling
framework. It consists of XSLT transformations, a program for
navigating the KiSS site, configuration of the basic crawler, and a
configuration at application level defining which programs should be
recorded and for which programs a notification should be sent. The
crawler works like a charm. It retrieves detailed program information
including time, title, description, and categorization and sends a
detailed notification afterwards. The source code will be released as open
source under the wamblee.org flag. In fact, it can already be accessed
using the Subversion URL https://wamblee.org/svn/public/utils.
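
The application-level configuration mentioned above, which decides what gets recorded and what only triggers a notification, could look roughly like the sketch below. The actual configuration format is not shown here, so everything in this example is an assumption: the `Program`, `Decision`, and `ProgramFilterSketch` names, the use of case-insensitive regular expressions, and the rule that recording takes priority over notification.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

/** The program details this sketch looks at (a subset of what the crawler retrieves). */
final class Program {
    final String title;
    final String description;
    final String category;

    Program(String title, String description, String category) {
        this.title = title;
        this.description = description;
        this.category = category;
    }
}

enum Decision { RECORD, NOTIFY, IGNORE }

/** Hypothetical rule set: title patterns to record, text patterns to notify about. */
final class ProgramFilterSketch {
    private final List<Pattern> recordRules = new ArrayList<>();
    private final List<Pattern> notifyRules = new ArrayList<>();

    void recordIfTitleMatches(String regex) {
        recordRules.add(Pattern.compile(regex, Pattern.CASE_INSENSITIVE));
    }

    void notifyIfTextMatches(String regex) {
        notifyRules.add(Pattern.compile(regex, Pattern.CASE_INSENSITIVE));
    }

    /** Recording rules are checked first; notification rules see title and description. */
    Decision decide(Program program) {
        for (Pattern rule : recordRules) {
            if (rule.matcher(program.title).find()) {
                return Decision.RECORD;
            }
        }
        String text = program.title + " " + program.description;
        for (Pattern rule : notifyRules) {
            if (rule.matcher(text).find()) {
                return Decision.NOTIFY;
            }
        }
        return Decision.IGNORE;
    }
}
```

With a rule such as `recordIfTitleMatches("star trek")`, every program whose title contains that phrase would be scheduled for recording, while a rule such as `notifyIfTextMatches("documentary")` would only put the program in the notification mail.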
Last Updated ( Monday, 20 March 2006 )