unicode - feedparser fails during script run, but can't reproduce in interactive Python console

When I run my script from Eclipse or from IPython, it fails with this error:

  'ascii' codec can't decode byte 0xe2 in position 32: ordinal not in range(128)

I don't know why, but when I simply execute the feedparser.parse(url) statement in the interactive console, using the same URL, no error is thrown. This is stumping me big time.

The code is as simple as:

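  import logging
  import feedparser

  # Python 2.6 syntax. formatExceptionInfo and formatExceptionInfo1 are
  # helper functions that are not shown in this post.
  try:
      d = feedparser.parse(url)
  except Exception, e:
      logging.error('error while retrieving feed.')
      logging.error(e)
      logging.error(formatExceptionInfo(None))
      logging.error(formatExceptionInfo1())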

Here is a stack trace:

  d = feedparser.parse(url)
  File "C:\Python26\lib\site-packages\feedparser.py", line 2623, in parse
    feedparser.feed(data)
  File "C:\Python26\lib\site-packages\feedparser.py", line 1441, in feed
    sgmllib.SGMLParser.feed(self, data)
  File "C:\Python26\lib\sgmllib.py", line 104, in feed
    self.goahead(0)
  File "C:\Python26\lib\sgmllib.py", line 143, in goahead
    k = self.parse_endtag(i)
  File "C:\Python26\lib\sgmllib.py", line 320, in parse_endtag
    self.finish_endtag(tag)
  File "C:\Python26\lib\sgmllib.py", line 360, in finish_endtag
    self.unknown_endtag(tag)
  File "C:\Python26\lib\site-packages\feedparser.py", line 476, in unknown_endtag
    method()
  File "C:\Python26\lib\site-packages\feedparser.py", line 1318, in _end_content
    value = self.popContent('content')
  File "C:\Python26\lib\site-packages\feedparser.py", line 700, in popContent
    value = self.pop(tag)
  File "C:\Python26\lib\site-packages\feedparser.py", in pop
    output = _resolveRelativeURIs(output, self.baseuri, self.encoding)
  File "C:\Python26\lib\site-packages\feedparser.py", line 1594, in _resolveRelativeURIs
    p.feed(htmlSource)
  File "C:\Python26\lib\site-packages\feedparser.py", line 1441, in feed
    sgmllib.SGMLParser.feed(self, data)
  File "C:\Python26\lib\sgmllib.py", line 104, in feed
    self.goahead(0)
  File "C:\Python26\lib\sgmllib.py", line 138, in goahead
    k = self.parse_starttag(i)
  File "C:\Python26\lib\sgmllib.py", line 296, in parse_starttag
    self.finish_starttag(tag, attrs)
  File "C:\Python26\lib\sgmllib.py", line 338, in finish_starttag
    self.unknown_starttag(tag, attrs)
  File "C:\Python26\lib\site-packages\feedparser.py", line 1588, in unknown_starttag
    attrs = [(key, ((tag, key) in self.relative_uris) and self.resolveURI(value) or value) for key, value in attrs]
  File "C:\Python26\lib\site-packages\feedparser.py", line 1584, in resolveURI
    return _urljoin(self.baseuri, uri)
  File "C:\Python26\lib\site-packages\feedparser.py", line 286, in _urljoin
    return urlparse.urljoin(base, uri)
  File "C:\Python26\lib\urlparse.py", line 215, in urljoin
    params, query, fragment))
  File "C:\Python26\lib\urlparse.py", line 184, in urlunparse
    return urlunsplit((scheme, netloc, url, query, fragment))
  File "C:\Python26\lib\urlparse.py", line 192, in urlunsplit
    url = scheme + ':' + url
  File "C:\Python26\lib\encodings\cp1252.py", line 15, in decode
    return codecs.charmap_decode(input, errors, decoding_table)

Partially solved:

The failure is reproducible when the URL passed to feedparser.parse() is a unicode string rather than an ASCII byte string. For the record, you also need a feed that contains some high-range (non-ASCII) characters. I'm not sure why this is.
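
One plausible explanation, offered as an assumption rather than something confirmed in this post: in Python 2, concatenating a unicode string with a byte string makes the interpreter decode the byte string implicitly, and that decode fails on any non-ASCII byte such as 0xe2. The last urlparse frame in the trace (url = scheme + ':' + url) is consistent with that. The sketch below spells the implicit decode out explicitly so it behaves the same on Python 2.6+ and Python 3. The names base_url, raw_href, implicit_py2_concat and as_byte_url, and the example.com URL, are all hypothetical and do not come from feedparser.

```python
# -*- coding: utf-8 -*-
# Sketch of the suspected failure mode. All names here are illustrative only.

# A unicode base URL, as it might be passed to feedparser.parse() (hypothetical).
base_url = u"http://example.com/feed"

# A byte-string link from the feed body containing a UTF-8 encoded
# right single quotation mark (bytes e2 80 99).
raw_href = b"/posts/it\xe2\x80\x99s-here"


def implicit_py2_concat(text, raw):
    """Mimic Python 2's `unicode + str`: the bytes are decoded as ASCII first."""
    return text + raw.decode("ascii")


try:
    implicit_py2_concat(base_url, raw_href)
    failed = False
    message = ""
except UnicodeDecodeError as exc:
    failed = True
    message = str(exc)


def as_byte_url(url, encoding="ascii"):
    """Possible workaround on Python 2: pass a byte-string URL, not unicode."""
    if isinstance(url, bytes):
        return url
    return url.encode(encoding)
```

If this assumption holds, calling feedparser.parse(as_byte_url(url)) on Python 2 would keep every URL join in byte strings and avoid the implicit decode. That is untested against the feed in question, so treat it as a starting point for debugging.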
