Hpricot Limitations
Well, after fighting with Cygwin and Hpricot for a few days, I've given in and dropped Cygwin from my stack for the Ruby/Windows testing for now.
Within five minutes of doing that, I was able to get something done with Hpricot, and bump into its limitations within a minute or so after that.
For example:
irb(main):001:0> require 'rubygems'
=> false
irb(main):002:0> require 'hpricot'
=> true
irb(main):003:0> require 'open-uri'
=> true
irb(main):004:0> doc = Hpricot(open("http://code.whytheluckystiff.net/hpricot/wiki/HpricotBasics"))
=> #<Hpricot::Doc {doctype "<!DOCTYPE html\n" " PUBLIC \"-//W3C//DTD XHTML 1.
...
"> "http://trac.edgewall.com/" </a>} "\n " </p>} "\n" </div>} "\n\n\n\n "} </bod
y>} "\n" </html>} "\n\n">
irb(main):005:0> (doc/"//img")
=> #<Hpricot::Elements[{emptyelem <img src="/hpricot/chrome/site/images/hpricot-
small.png" alt="hpricot">}, {emptyelem <img src="/hpricot/chrome/common/trac_log
o_mini.png" height="30" alt="Trac Powered" width="107">}]>
irb(main):006:0> (doc/"//img[@alt='hpricot'")
=> #<Hpricot::Elements[{emptyelem <img src="/hpricot/chrome/site/images/hpricot-
small.png" alt="hpricot">}, {emptyelem <img src="/hpricot/chrome/common/trac_log
o_mini.png" height="30" alt="Trac Powered" width="107">}]>
irb(main):007:0> (doc/"//img[@alt='hpricot']")
=> #<Hpricot::Elements[{emptyelem <img src="/hpricot/chrome/site/images/hpricot-
small.png" alt="hpricot">}]>
irb(main):008:0> (doc/"//a/img[@alt='hpricot']")
=> #<Hpricot::Elements[{emptyelem <img src="/hpricot/chrome/site/images/hpricot-
small.png" alt="hpricot">}]>
irb(main):009:0> (doc/"//a[img/@alt='hpricot']")
=> #<Hpricot::Elements[]>
There's nothing wrong with the last query, the one that came back empty; it's a valid XPath expression. It just isn't supported by jQuery, which means it isn't supported by Hpricot. Too bad; I guess I'll have to work around the syntax limitations.