#1192 Planet Fedora feed headers do not provide the Content-Encoding
Closed: Fixed None Opened 15 years ago by lmacken.

Our RSS 1.0, RSS 2.0 and Atom feeds do not specify the content encoding (the charset parameter of Content-Type) in the HTTP response headers.


Do you have a test I can use to verify that I've fixed it?

{{{
import httplib
from pprint import pprint

# Python 2 (httplib, print statement): print the response headers for host + path.
def printheaders(host, path, port=80):
    print host + path
    h = httplib.HTTPConnection(host, port)
    h.request('GET', path)
    r = h.getresponse()
    pprint(r.getheaders())

printheaders('planet.fedoraproject.org', '/rss20.xml')
printheaders('feeds.washingtonpost.com', '/wp-dyn/rss/politics/index_xml')

================================================

planet.fedoraproject.org/rss20.xml
[('content-length', '139120'),
 ('accept-ranges', 'bytes'),
 ('server', 'Apache/2.2.3'),
 ('last-modified', 'Wed, 18 Feb 2009 21:26:19 GMT'),
 ('connection', 'close'),
 ('etag', '"8e01b4-21f70-46338120cf4c0"'),
 ('date', 'Wed, 18 Feb 2009 21:29:55 GMT'),
 ('content-type', 'text/xml')]
feeds.washingtonpost.com/wp-dyn/rss/politics/index_xml
[('x-content-type-options', 'nosniff'),
 ('transfer-encoding', 'chunked'),
 ('expires', 'Wed, 18 Feb 2009 21:29:56 GMT'),
 ('server', 'GFE/1.3'),
 ('last-modified', 'Wed, 18 Feb 2009 07:21:47 GMT'),
 ('etag', 'idw5V9pVKPxoUfy4ck4opdZSAqs'),
 ('cache-control', 'private, max-age=0'),
 ('date', 'Wed, 18 Feb 2009 21:29:56 GMT'),
 ('content-type', 'text/xml; charset=iso-8859-1')]
}}}
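
The script above is Python 2. For anyone repeating the check on Python 3, here is a rough port, offered as a sketch rather than a drop-in replacement. The helper names fetch_headers and charset_of are made up for this example; the only thing it adds beyond the original is pulling the charset parameter out of Content-Type, which is the difference between the two responses shown above.

```python
import http.client
from email.message import Message


def fetch_headers(host, path, port=80):
    """Return the response headers for GET path as a list of (name, value)."""
    conn = http.client.HTTPConnection(host, port, timeout=10)
    try:
        conn.request("GET", path)
        return conn.getresponse().getheaders()
    finally:
        conn.close()


def charset_of(headers):
    """Return the charset parameter of Content-Type (lowercased), or None."""
    for name, value in headers:
        # Header names are case-insensitive, so compare in lowercase.
        if name.lower() == "content-type":
            msg = Message()
            msg["Content-Type"] = value
            return msg.get_content_charset()
    return None
```

Calling charset_of(fetch_headers('planet.fedoraproject.org', '/rss20.xml')) would then report whether the feed advertises a charset. Applied to the header lists pasted above, it gives None for the Planet response and 'iso-8859-1' for the Washington Post one.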

Specifying <?xml version="1.0" encoding="UTF-8"?> in our feeds would probably be sufficient, although I don't think adding the charset to the Content-Type header would hurt.
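
To check the first option, something like the following could report which encoding a feed body declares in its XML declaration. This is only a sketch: declared_encoding is a hypothetical helper, and it assumes the body is in an ASCII-compatible encoding with the declaration at the very start (an optional UTF-8 byte order mark is skipped). It is not a full XML encoding detector.

```python
import re

# Matches an XML declaration at the start of the document and captures the
# value of its encoding pseudo-attribute when one is present.
_XML_DECL = re.compile(
    rb"""^<\?xml\s+version\s*=\s*["'][^"']*["']"""
    rb"""(?:\s+encoding\s*=\s*["']([A-Za-z][A-Za-z0-9._-]*)["'])?"""
)

_UTF8_BOM = b"\xef\xbb\xbf"


def declared_encoding(data):
    """Return the encoding named in the XML declaration of data, or None."""
    # Assumption: skip a UTF-8 BOM if present; other BOMs are not handled.
    if data.startswith(_UTF8_BOM):
        data = data[len(_UTF8_BOM):]
    match = _XML_DECL.match(data)
    if match and match.group(1):
        return match.group(1).decode("ascii")
    return None
```

A feed that starts with a bare <?xml version="1.0"?> yields None here, which is the case where the HTTP charset is the only explicit hint a consumer gets.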

Okay, I added

AddCharset UTF-8 .xml

to the Planet Apache config, and now I see

('content-type', 'text/xml; charset=utf-8')

Is that sufficient to get your parser to work?

Excellent, this did the trick. Thanks a lot, Seth!
