Discussion:
[XOM-interest] Bug in Builder.canonicalizeURL
Adam Constabaris
2010-06-18 18:36:48 UTC
Hello,

It would appear there is a rather nasty bug in XOM 1.2.5's Builder
class, in that it will duplicate the query string on URIs that contain
'em when canonicalizeURL is called, viz. the result of passing

http://www.example.com/?id=1

to canonicalizeURL (which is called by Builder.build(String)) yields
up the "canonical"

http://www.example.com/?id=1?id=1

The culprit appears to be the use of URL.getFile() to get the "path"
portion of the systemId -- on JDK 1.6, this value *includes* the query
string, which is later appended.
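
To make the getFile() behaviour concrete, here is a minimal sketch using only java.net.URL. The class GetFileDemo and the helper naiveRebuild are hypothetical names made up for illustration; this is not the code in Builder.java, only a reassembly that assumes getFile() returns the bare path and so reproduces the same symptom.

```java
import java.net.MalformedURLException;
import java.net.URL;

public class GetFileDemo {

    // Hypothetical helper (not XOM's code): rebuilds a URL while wrongly
    // assuming that getFile() returns only the path component.
    static String naiveRebuild(String systemId) {
        try {
            URL u = new URL(systemId);
            // getFile() already carries "?query" when a query is present
            String result = u.getProtocol() + "://" + u.getHost() + u.getFile();
            if (u.getQuery() != null) {
                // the query is appended a second time here
                result += "?" + u.getQuery();
            }
            return result;
        } catch (MalformedURLException ex) {
            throw new IllegalArgumentException(ex);
        }
    }

    public static void main(String[] args) throws MalformedURLException {
        URL u = new URL("http://www.example.com/?id=1");
        System.out.println(u.getFile());  // prints "/?id=1"
        System.out.println(u.getPath());  // prints "/"
        System.out.println(u.getQuery()); // prints "id=1"
        // prints "http://www.example.com/?id=1?id=1"
        System.out.println(naiveRebuild("http://www.example.com/?id=1"));
    }
}
```

A URL without a query string passes through naiveRebuild unchanged, which is presumably why the problem only shows up for system IDs that carry one.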

A diff against Builder.java from the 1.2.5 distribution is attached. If
it's stripped by the list software, the fix that worked on my test
cases was to change line 1110 to use getPath() (String path = u.getPath())
and, at line 1121, to add a check for a null query before appending "/" to path.
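
Since the attachment may not survive, here is a rough sketch of the shape of that fix. It is not the actual patch: the class CanonicalizeSketch and the method rebuild are hypothetical, and the sketch deliberately ignores user info and fragments. It only shows getPath() used for the path and the query appended once, and only when it is non-null.

```java
import java.net.MalformedURLException;
import java.net.URL;

public class CanonicalizeSketch {

    // Hypothetical helper (not the real Builder.canonicalizeURL):
    // reassembles a URL from its parts, appending the query at most once.
    static String rebuild(String systemId) {
        try {
            URL u = new URL(systemId);
            StringBuilder sb = new StringBuilder();
            sb.append(u.getProtocol()).append("://").append(u.getHost());
            if (u.getPort() != -1) {
                // getPort() returns -1 when no port is given explicitly
                sb.append(':').append(u.getPort());
            }
            // getPath() excludes the query string, unlike getFile()
            String path = u.getPath();
            if (path.isEmpty()) {
                path = "/";
            }
            sb.append(path);
            String query = u.getQuery();
            if (query != null) {
                sb.append('?').append(query);
            }
            return sb.toString();
        } catch (MalformedURLException ex) {
            throw new IllegalArgumentException(ex);
        }
    }

    public static void main(String[] args) {
        // prints "http://www.example.com/?id=1"
        System.out.println(rebuild("http://www.example.com/?id=1"));
        // prints "http://www.example.com/"
        System.out.println(rebuild("http://www.example.com"));
    }
}
```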

cheers,

AC
Elliotte Rusty Harold
2010-06-18 23:42:40 UTC
On Fri, Jun 18, 2010 at 2:36 PM, Adam Constabaris wrote:
> Hello,
>
> It would appear there is a rather nasty bug in XOM 1.2.5's Builder
> class, in that it will duplicate the query string on URIs that contain
> 'em when canonicalizeURL is called, viz. the result of passing
>
> http://www.example.com/?id=1
>
> to canonicalizeURL (which is called by Builder.build(String)) yields
> up the "canonical"
>
> http://www.example.com/?id=1?id=1

Sounds like a bug. I'll try and look at this on Sunday. What would be
most helpful would be a unit test demonstrating the problem. And yes,
the list strips attachments. :-)
--
Elliotte Rusty Harold
elharo at ibiblio.org
Elliotte Rusty Harold
2010-06-19 00:25:17 UTC
I've been able to reproduce this bug, and a fix is checked into head.

I want to also upgrade jaxen to 1.1.3, and then I'll release 1.2.6.
--
Elliotte Rusty Harold
elharo at ibiblio.org