Reading the Economist - Hpricot, Ruby-RSS, Festival
Well, having the Economist read at any rate.
First, set up Festival (configuring it to use ALSA and an ‘English’ voice):
apt-get install festival
apt-get install festvox-rablpc16k
cat > ~/.festivalrc <<END
(Parameter.set 'Audio_Command "aplay -D plug:dmix -q -c 1 -t raw -f s16 -r \$SR \$FILE")
(Parameter.set 'Audio_Method 'Audio_Command)
(voice_rab_diphone)
END
Then liberally sprinkle some Ruby:
#!/usr/bin/ruby
require 'rss/1.0'
require 'rss/2.0'
require 'open-uri'
require 'yaml'
require 'hpricot'
include YAML
TEMPFILE = "/tmp/economistreader"
puts "Fetching feed"
source = "http://www.economist.com/rss/full_print_edition_rss.xml"
content = ""
open(source) do |s| content = s.read end
rss = RSS::Parser.parse(content, false)
puts "Title: #{rss.channel.title}"
puts "Found #{rss.items.size} items"
for item in rss.items
  puts "#{item.title}"
  puts "Read? [Y/n]"
  if readline.strip.downcase =~ /^n/
    next
  end
  doc = Hpricot(open(item.link))
  paras = doc.search("//div[@class='col-left']/p[@class='']")
  File.open("#{TEMPFILE}.body", "w") do |f|
    paras.each do |p|
      f.write(p.inner_text + "\n")
      puts p.inner_text
    end
  end
  system("festival", "--tts", "#{TEMPFILE}.body")
end
I give it about 3 articles before the voice drives me completely insane. There’s a character-set issue that puts ‘?’s in odd places and causes Festival to get confused. Even without confusing characters, free text-to-speech software still isn’t ‘all that’.
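One possible way to attack the stray ‘?’s is to flatten the text to plain ASCII before Festival ever sees it. The sketch below is untested against the real feed and rests on an assumption: that the culprits are typographic characters (curly quotes, dashes, ellipses, non-breaking spaces) in UTF-8 page text. The names asciify and REPLACEMENTS are made up for illustration and are not part of the script above.

```ruby
# Hypothetical lookup table: typographic characters and their ASCII stand-ins.
# Assumption: these are the characters that end up as '?' in the output.
REPLACEMENTS = {
  "\u2018" => "'", "\u2019" => "'",    # single curly quotes
  "\u201C" => '"', "\u201D" => '"',    # double curly quotes
  "\u2013" => "-", "\u2014" => " - ",  # en and em dashes
  "\u2026" => "...",                   # ellipsis
  "\u00A0" => " "                      # non-breaking space
}.freeze

# Hypothetical helper: returns a US-ASCII copy of text.
def asciify(text)
  # Treat the input as UTF-8 and drop any invalid byte sequences.
  utf8 = text.dup.force_encoding("UTF-8").scrub("")
  # Swap each known typographic character for its ASCII stand-in.
  swapped = utf8.gsub(Regexp.union(REPLACEMENTS.keys), REPLACEMENTS)
  # Whatever is still outside ASCII is removed rather than turned into '?'.
  swapped.encode("US-ASCII", invalid: :replace, undef: :replace, replace: "")
end
```

In the loop above, that would mean writing asciify(p.inner_text) to the file instead of the raw p.inner_text. Dropping unknown characters loses accented letters, so it trades accuracy for a less confused Festival.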
You could also, it’s worth pointing out, visit PimpMyNews. You’ll find the Economist’s feed under ‘Business/World Business News’. Unfortunately, they are lazy and their software only reads out the text from the RSS ‘Description’ field rather than parsing the whole article. That said, if what you want is to hear the first 200 words of every article in the Economist, that’s your badger.
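If the first 200 words really are all you want, you can get roughly the same effect without leaving the script. This is a rough sketch, assuming the feed’s description field contains a usable summary, possibly wrapped in HTML; preview is a hypothetical helper name, and the tag stripping is deliberately crude.

```ruby
# Hypothetical helper: strip markup from a feed description and keep
# only the first `limit` words (200 by default).
def preview(description, limit = 200)
  # Crude tag removal; good enough for simple summaries, not real HTML parsing.
  plain = description.to_s.gsub(/<[^>]+>/, " ")
  # Split on any whitespace, keep the first `limit` words, rejoin with spaces.
  plain.split.first(limit).join(" ")
end
```

Inside the loop, something like f.write(preview(item.description) + "\n") would then replace the Hpricot fetch and paragraph search, at the cost of never hearing how the article ends.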