Hey, can anyone help with the following?
I am trying to scrape a site. Here is the piece of code I used to get the data, using mechanize and BeautifulSoup. I'm stuck because it will not let me use the find() function on a list.
    br_results = mechanize.urlopen(br_results)
    html = br_results.read()
    soup = BeautifulSoup(html)
    local_links = soup.findAll("a", {"class": "down-arrow csa"})
    upc_code = soup.findAll("ul", {"class": "bc-meta3"})
    for upc in upc_code:
        upc_text = upc.contents.contents
        print upc_text
I imagine upc_code is the list you are showing us, and local_links has nothing to do with your question, correct? Given that you do not use it any further in your code...?
So I'm not sure what upc_text is supposed to be. In the body of your loop, upc is a ul tag and upc.contents is a list of (possibly) li tags, so I do not see how upc.contents.contents can work. What are you seeing as a result of that code? I would expect an exception!
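A small sketch of why upc.contents.contents should fail, in case it helps. It assumes Python 3 and the newer bs4 package rather than the older BeautifulSoup import the question seems to use, and the sample HTML is made up for illustration, since the real page is not shown.

```python
from bs4 import BeautifulSoup

# Hypothetical markup; the real page's structure is an assumption.
SAMPLE_HTML = """
<ul class="bc-meta3">
<li><strong>UPC:</strong> 012345678905</li>
</ul>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
upc = soup.find("ul", {"class": "bc-meta3"})

# .contents is a plain Python list of child nodes, not a tag.
contents_type = type(upc.contents).__name__

try:
    upc.contents.contents  # a list has no .contents attribute
    error_name = None
except AttributeError as exc:
    error_name = type(exc).__name__

print(contents_type, error_name)
```

So the second .contents is being asked of a list, which is why an exception is the likely outcome.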
Anyway, the way I would write the loop is something like this:
    for upc in upc_code:
        listitems = upc.findAll('li')
        for anitem in listitems:
            print anitem.contents[1]
since you want the second child of each list item (the first is a strong tag; the second is the string you want).
If it is not the second child of each list item that you want, please clarify. For example, you could locate the strong tag and get its next sibling, if that suits you better. Just change the nested loop body to:
    print anitem.find('strong').nextSibling
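For anyone who wants to try both variants side by side, here is a rough, self-contained sketch. It assumes Python 3 with the bs4 package, where find_all and next_sibling are the current spellings of findAll and nextSibling. The HTML is invented sample data, not the real page.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the scraped page.
SAMPLE_HTML = (
    '<ul class="bc-meta3">'
    "<li><strong>UPC:</strong> 012345678905</li>"
    "<li><strong>Format:</strong> Paperback</li>"
    "</ul>"
)

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

by_index = []    # second child of each li
by_sibling = []  # node right after the strong tag
for upc in soup.find_all("ul", {"class": "bc-meta3"}):
    for anitem in upc.find_all("li"):
        by_index.append(anitem.contents[1].strip())
        by_sibling.append(anitem.find("strong").next_sibling.strip())

print(by_index)
print(by_sibling)
```

On markup shaped like this sample the two approaches pick out the same strings. The sibling version may hold up better if extra nodes appear before the strong tag, though that depends on the actual page.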