From Web Page to Podcast
Posted on July 17, 2006
Autodiscover MP3 links as you browse and auto-generate a Podcast RSS file you can import into a media player.

Have you ever come across a web page with many MP3 links and wondered how to play them through iTunes, your MP3 player, or your iPod? If it's just one MP3 link, you can save the MP3 file to the desktop and then drag the file into your iTunes jukebox view. But what if there are 10 or 30 MP3 links? What a drag, literally, it would be to save each one and import it into iTunes. Even if you did that, your files would be scattered all over your player's library rather than grouped into a simple, findable, titled structure like the one your player's Podcast area provides.
A much cooler approach for playing a web page's audio files through your media player would be to have your browser detect the sound files on any web page and automatically generate a single Podcast file, which contains links to all those MP3s your browser found. Then you'd only need to import the one file into your Podcast player and you're good to go.
This hack does just that by integrating various technologies, including the Firefox browser, Greasemonkey, tabs, JavaScript, MP3, XML, XPath, RSS, DOM, regular expressions, CSS, a local web server, and your favorite Podcast media player (e.g. iTunes).
Already know how to use Greasemonkey? Want to hack the hack?
- Turn on Greasemonkey (you must be using the Firefox browser of course)
- Load & activate the Greasemonkey script webpagetopodcast.user.js
- Visit any web page with sound file links.
- If MP3s are found, the script displays a small yellow box in the upper right corner, showing how many MP3 files were found along with a link.
- Click on the link to open a new browser tab which contains your Podcast RSS XML file.
- Save that file to your local web server (e.g. the ~/Sites directory on Mac OS X)
- Import the RSS file into your Podcast reader, then drag items to your MP3 player, and you're ready to go.
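To make the yellow box and its link a little more concrete, here is a minimal sketch of how such a notice could be built. It is not taken from webpagetopodcast.user.js: the helper names (noticeLabel, toDataUri, showNotice), the styling, and the idea of handing the XML to a new tab through a data: URI are all assumptions made for illustration.

```javascript
// Hypothetical helper: the text shown in the notice for a given link count.
function noticeLabel(count) {
  return count + (count === 1 ? " MP3 found" : " MP3s found");
}

// Hypothetical helper: wrap generated XML in a data: URI so it can serve
// as a link target. (Assumption: the real script may open its tab differently.)
function toDataUri(xml) {
  return "data:text/xml;charset=utf-8," + encodeURIComponent(xml);
}

// Build the small yellow box in the upper right corner of the page.
// `doc` is the page's document object, `xml` the generated Podcast file.
function showNotice(doc, count, xml) {
  var box = doc.createElement("div");
  box.style.cssText =
    "position:fixed;top:5px;right:5px;z-index:9999;" +
    "background:#ffc;border:1px solid #999;padding:4px;font:12px sans-serif;";

  var link = doc.createElement("a");
  link.href = toDataUri(xml);
  link.target = "_blank"; // ask the browser for a new tab or window
  link.textContent = noticeLabel(count) + ": open Podcast RSS";

  box.appendChild(link);
  doc.body.appendChild(box);
  return box;
}
```

Inside a user script this would be called as something like showNotice(document, 3, rssText), where rssText is the XML string generated for the page.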
I tested this script on web pages with MP3s scattered in different DOM arrangements and with different approaches to labeling sound file links. Much of the effort went into extracting what seemed like a good label for each sound file, ignoring hyperlink text such as "listen" or "play", which wouldn't make any sense in a Podcast reader. Perhaps you can find a better approach than my hack uses.
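One simple way to approach the labeling problem is sketched below. This is only an illustration, not the heuristic the script actually uses: the list of generic labels and the function names (chooseTitle, titleFromUrl) are assumptions. The idea is to keep the link text when it says something, and fall back to a cleaned-up file name otherwise.

```javascript
// Link texts that say nothing about the audio itself.
// (Assumed list for illustration; extend as needed.)
var GENERIC_LABELS = /^(listen|play|download|mp3|click here|here)$/i;

// Turn "http://host/path/my_song-01.mp3" into "my song 01".
function titleFromUrl(href) {
  // Drop any query string or fragment, then keep the last path segment.
  var file = href.split(/[?#]/)[0].split("/").pop();
  var name;
  try {
    name = decodeURIComponent(file);
  } catch (e) {
    name = file; // malformed % escapes: use the raw file name
  }
  return name
    .replace(/\.mp3$/i, "")      // strip the extension
    .replace(/[_\-+]+/g, " ")    // separators become spaces
    .replace(/\s+/g, " ")
    .trim();
}

// Prefer the link text unless it is empty or generic.
function chooseTitle(linkText, href) {
  var text = (linkText || "").replace(/\s+/g, " ").trim();
  if (text && !GENERIC_LABELS.test(text)) {
    return text;
  }
  return titleFromUrl(href);
}
```

For example, chooseTitle("Listen", "http://example.com/audio/my_song-01.mp3") falls back to the file name, while a descriptive link text is returned unchanged apart from whitespace cleanup. A fuller heuristic might also look at nearby headings or table cells in the DOM.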
It tested well on the following examples:
- http://podbop.org/artists
- http://www.houstonjones.com/cds_and_samples_hojo_mojo.htm
- http://www.bloggercon.org/2006/06/23
- http://jobster.blogs.com/
- http://fuzzyblog.com/podcasts/
- http://www.nytimes.com/ref/sports/olympics/podcasts-olympics.html
- http://www.ajet-bone.com/angus/sound_movie/sound.html
Note: This script is intended for use with publicly hosted, web-accessible MP3s, and the resulting Podcast file is intended for personal use only. To use this script, you must visit the original publisher's web page. The functionality contained here merely creates a link to the original content, arranged in an XML format, and allows playback through your computer. There is no physical difference created or implied as compared to merely playing the hosted MP3 file with your computer's media player. The script only helps you to categorize the audio in the convenient Podcast format for local personal use. The audio content itself is copyrighted by the producer of the web page through which the original audio files are hosted.
New to Greasemonkey and Podcasts? How does it work?
Greasemonkey was created by Aaron Boodman to make the web a better place. Greasemonkey is a Firefox extension that lets you augment any web page with your own JavaScript. Besides handling the JavaScript and DOM manipulation features, Greasemonkey provides powerful utilities that take advantage of the Firefox browser's features such as tabs and XPath. Greasemonkey also lets a script make Ajax-style HTTP requests to any server, not only the one that served the page.
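As a rough illustration of what a user script looks like, the sketch below uses the browser's XPath support (document.evaluate) to collect every link on a page and a regular expression to keep the ones that point at MP3 files. It is not the source of webpagetopodcast.user.js: the metadata values and the function names (isMp3Href, collectMp3Links) are hypothetical.

```javascript
// ==UserScript==
// @name        MP3 link finder (illustrative sketch)
// @namespace   http://example.com/hypothetical
// @include     *
// ==/UserScript==

// True when the URL path ends in .mp3, optionally followed by a
// query string or fragment.
function isMp3Href(href) {
  return /\.mp3([?#].*)?$/i.test(href);
}

// Walk all <a href> elements via XPath and keep the MP3 ones.
function collectMp3Links(doc) {
  var snapshot = doc.evaluate("//a[@href]", doc, null,
      XPathResult.ORDERED_NODE_SNAPSHOT_TYPE, null);
  var found = [];
  for (var i = 0; i < snapshot.snapshotLength; i++) {
    var a = snapshot.snapshotItem(i);
    if (isMp3Href(a.href)) {
      found.push({ url: a.href, text: a.textContent });
    }
  }
  return found;
}

// Only run against a real page; outside a browser there is no document.
if (typeof document !== "undefined") {
  var links = collectMp3Links(document);
  // links.length is the number a notice box would report.
}
```

Filtering in JavaScript rather than in the XPath expression keeps the match case-insensitive, which XPath 1.0 does not make easy.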
A Podcast is an XML text file, written in the RSS format, which contains titles, descriptions and links to MP3 or other sound files. Because a Podcast file follows a standard format, a variety of media players support Podcasts. I like using the iTunes player, and although Apple has done some odd things to extend the RSS format to suit its player, I find iTunes to be a well-thought-out interface, and it's the only one through which I can update my iPod. Note that, despite the name, a Podcast does not require an iPod: it can be played on any media player which supports the Podcast RSS XML format.
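To show what such a file contains, here is a minimal sketch that builds a bare-bones RSS 2.0 Podcast document from a list of titles and MP3 URLs. It is an illustration under stated assumptions rather than the script's actual output: the function names (escapeXml, buildPodcastRss) are made up, the description text is a placeholder, and length="0" is used because the file size is not known without fetching each MP3.

```javascript
// Escape the characters that are special in XML text and attributes.
function escapeXml(s) {
  return String(s)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// items: array of { title: string, url: string }
function buildPodcastRss(channelTitle, pageUrl, items) {
  var lines = [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<rss version="2.0">',
    '  <channel>',
    '    <title>' + escapeXml(channelTitle) + '</title>',
    '    <link>' + escapeXml(pageUrl) + '</link>',
    '    <description>MP3 links found on ' + escapeXml(pageUrl) + '</description>'
  ];
  items.forEach(function (item) {
    lines.push('    <item>');
    lines.push('      <title>' + escapeXml(item.title) + '</title>');
    // The enclosure element is what makes an RSS item a Podcast episode.
    lines.push('      <enclosure url="' + escapeXml(item.url) +
               '" length="0" type="audio/mpeg"/>');
    lines.push('      <guid>' + escapeXml(item.url) + '</guid>');
    lines.push('    </item>');
  });
  lines.push('  </channel>', '</rss>');
  return lines.join("\n");
}
```

Calling buildPodcastRss("Sample page", "http://example.com/music", [{ title: "Track one", url: "http://example.com/one.mp3" }]) returns a string that can be saved with an .xml extension. The <title> elements inside each <item> are the ones a player shows in its episode list.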
More detailed Instructions:
- Install the Greasemonkey extension for Firefox. The extension is available at: https://addons.mozilla.org/firefox/748
- Be sure the little monkey icon on the bottom right corner of your browser is enabled.
- Install the "Web Page to Podcast" Greasemonkey script by clicking here: webpagetopodcast.user.js. (I recommend right-clicking this link and opening it in a new tab.) Then click the Install button that appears in the upper right.
- Surf to a web page that has some MP3 links. First, visually check for a Podcast link on the web page, as the author may have already created one for you. Look for a small orange RSS or XML button, or a link which says "Podcast". If you find such a link, click on it and skip ahead to step 10 (subscribing to the Podcast in your player).
- If the script finds any MP3 links, it will display a little yellow box near the upper right corner, indicating how many MP3 links were found.
- Click on the link in the little yellow box, and a new tab will be created, containing a full Podcast RSS/XML file for your web page's MP3s.
- Switch to the new tab and save its contents with the browser's File > Save As into a local web server directory, one you can reach with a URL starting with http://localhost.... On a Mac, simply save it to your ~/Sites directory (the default Apache server location). Use an .xml extension. (If you're on a PC, check with a PC expert on the best approach to running a local web server and saving files to it.)
- The script makes an attempt to extract the best descriptive title for each MP3 link. Occasionally, it will not do a good job. In this case, simply edit the file's <title> XML elements as needed.
- Ensure that your web server is running. On Mac OS X, go to System Preferences > Sharing > Services, select Personal Web Sharing, and click "Start" (if it's already running, the button says "Stop").
- If you use iTunes, start it up, go to the Advanced menu along the top, choose Subscribe to Podcast, and paste or type in the URL of your saved Podcast XML file. On a Mac, that might be something like http://localhost/~al/samplepodcast.xml. And that's it. iTunes will add a new item under the Podcasts section and automatically load the first MP3. You can load the other MP3s in that Podcast by clicking on the tiny "GET" icon.
- If you use another Podcast player, it should have some utility for you to paste in a Podcast URL.
- If the MP3 titles are not adequate or distinct, go back to the saved Podcast file with your text editor, alter the <title> element contents, and import the file into your media player again.
- Once the MP3 files are in iTunes, attach your iPod and drag and drop to your device as you like.
- You don't need to keep Greasemonkey active in your browser. But next time you see some MP3s on a web page, click on the little monkey, and refresh the web page. This Web Page to Podcast script will autodiscover all the MP3s.
The possibilities with Greasemonkey are endless. I hope this script can be a helpful source for your future Greasemonkey hacks. Check in with us in the future for more scripts.
Reference:
Greasemonkey download for Firefox
http://greasemonkey.mozdev.org
Mozilla Javascript Reference
http://devedge-temp.mozilla.org/central/javascript/index_en.html
Massive Greasemonkey script library
http://userscripts.org
Greasemonkey community's discussion forum
http://www.mozdev.org/pipermail/greasemonkey
Free Greasemonkey scripting eBook
http://diveintogreasemonkey.org
Greasemonkey Hacks book
http://search.barnesandnoble.com/booksearch/isbnInquiry.asp?z=y&isbn=0596101651&itm=1
Aaron Boodman's home page (creator of Greasemonkey)
http://youngpup.net
Greaseblog blog
http://greaseblog.blogspot.com
Couple of presentations by Aaron Boodman, creator of Greasemonkey
http://greaseblog.blogspot.com/2005/12/slides-from-nov-8-emerging-technology_02.html
http://greaseblog.blogspot.com/2005/08/aarons-oscon-2005-slides.html
Greasemonkey API (lets you do things a normal web page cannot)
http://diveintogreasemonkey.org/api/index.html
Probably the most famous Greasemonkey script (now a Firefox extension)
http://bookburro.org
Podcasting
http://en.wikipedia.org/wiki/Podcast
RSS
http://en.wikipedia.org/wiki/RSS_%28file_format%29
http://blogs.law.harvard.edu/tech/rss
Juice - Podcast Receiver (formerly called iPodder)
http://juicereceiver.sourceforge.net/
RSSRadio
http://www.dorada.co.uk
How to find Podcasts
http://lifehacker.com/software/top/technophilia-find-great-podcasts-183411.php
Happy hacking..
Al Nevarez
Product Manager & weekend hacker
Comments
This Information Week article on Greasemonkey
http://informationweek.smallbizpipeline.com/howto/160402289
mentions a cool script:
"Want to listen to MP3s in your browser rather than launching RealAudio or Windows Media Player when you click on a URL ending in .mp3?" Very nice. It puts small play/stop button next to each link while keeping your web page on the screen.
Greasemonkey script:
http://musicplayer.sourceforge.net/greasemonkey/inline.player.user.js
I found reference in this month's Wired magazine to a Greasemonkey script that helps you to download videos from YouTube and Google and make them iPod friendly.
http://www.joshkinberg.com/blog/archives/2005/11/greased_google.php
http://www.userscripts.org/scripts/show/3982
The king of Greasemonkey has spoken! This will undoubtedly be a useful tool for downloading tech talks onto my iPod to listen to while travelling to and from work! Cheers