Questions tagged [pyquery]

PyQuery is a jQuery-like library for Python that allows you to make jQuery queries on XML documents.

PyQuery uses lxml for fast XML and HTML manipulation.

It allows you to make jQuery-style CSS-selector queries on XML/HTML documents. The API is intended to match jQuery's API whenever possible, though it has been made more Pythonic where appropriate.

It can be used for many purposes. The main idea is to use it for templating with pure http templates that you modify using pyquery. It can also be used for web scraping or for theming applications with Deliverance.

92 questions
105
votes
6 answers

How to "log in" to a website using Python's Requests module?

I am trying to post a request to log in to a website using the Requests module in Python, but it's not really working. I'm new to this, so I can't figure out if I should make my Username and Password cookies or some type of HTTP authorization thing I…
Marcus Johnson
  • 1,923
  • 5
  • 18
  • 26
21
votes
2 answers

Getting attributes in PyQuery?

I'm using PyQuery and want to print a list of links, but can't figure out how to get the href attribute from each link in the PyQuery syntax. This is my code: e = pq(url=results_url) links = e('li.moredetails a') print len(links) for link…
Richard
  • 52,447
  • 93
  • 287
  • 475
16
votes
3 answers

Using lxml to parse namespaced HTML?

This is driving me totally nuts, I've been struggling with it for many hours. Any help would be much appreciated. I'm using PyQuery 1.2.9 (which is built on top of lxml) to scrape this URL. I just want to get a list of all the links in the…
Richard
  • 52,447
  • 93
  • 287
  • 475
14
votes
6 answers

Iterating over objects in pyquery

I'm scraping a page with Python's pyquery, and I'm kinda confused by the types it returns, and in particular how to iterate over a list of results. If my HTML looks a bit like this:
blah blah

Something…

AP257
  • 72,861
  • 84
  • 184
  • 258
13
votes
3 answers

pip error: unrecognized command line option ‘-fstack-protector-strong’

When I sudo pip install pyquery, sudo pip install lxml, and sudo pip install cython, I get very similar output with the same error that says: x86_64-linux-gnu-gcc: error: unrecognized command line option ‘-fstack-protector-strong’ Here's the…
cooltoast
  • 247
  • 2
  • 3
  • 8
12
votes
2 answers

Installing PyQuery Via Pip

I'm attempting to install PyQuery via pip but I'm getting an error I do not understand. The command I used was: sudo pip install pyquery I get the output below: Requirement already satisfied (use --upgrade to upgrade): pyquery in…
Torra
  • 1,022
  • 3
  • 12
  • 23
11
votes
3 answers

Why can this unbound variable work in Python (pyquery)?

The code is from the pyquery guide: from pyquery import PyQuery d = PyQuery('<p>Hi</p><p>Bye</p>') d('p').filter(lambda i: PyQuery(this).text() == 'Hi') My question is: "this" in the 3rd line is an unbound variable and is never defined…
Hanfei Sun
  • 39,245
  • 33
  • 107
  • 208
10
votes
1 answer

Convert unicode with utf-8 string as content to str

I'm using pyquery to parse a page: dom = PyQuery('http://zh.wikipedia.org/w/index.php', {'title': 'CSS', 'printable': 'yes', 'variant': 'zh-cn'}) content = dom('#mw-content-text > p').eq(0).text() but what I get in content is a unicode string with…
wong2
  • 28,972
  • 43
  • 121
  • 165
8
votes
3 answers

How to get an individual css style in pyquery

You can set a css style using several methods: p = PyQuery('

') p.css('font-size', '16px') p.css['font-size'] = '16px' p.css = {'font-size': '16px'} Great, but how do I get an individual css style? p.css('font-size') # jquery-like method…
shafty
  • 599
  • 6
  • 17
8
votes
4 answers

How do I make this Python code less ugly

First of all, Python is an awesome language. This is my first project using Python and I've made a ridiculous amount of progress already. There's no way that this code below is the best way to do this. What's the most idiomatic way to write a class…
Tyler
  • 4,441
  • 9
  • 37
  • 55
5
votes
2 answers

PyQuery How do I append and rename an element into each of its subelements

How can I append or insert a class attribute into its sub-elements, but only for direct children and then to repeat it for the next class and sub-elements. In the docs it is referenced here pyquery manipulating >>> d = pq('
sayth
  • 6,154
  • 10
  • 47
  • 93
5
votes
2 answers

Entire JSON into One SQLite Field with Python

I have what is likely an easy question. I'm trying to pull a JSON from an online source, and store it in a SQLite table. In addition to storing the data in a rich table, corresponding to the many fields in the JSON, I would like to also just dump…
user1893148
  • 1,542
  • 3
  • 20
  • 32
5
votes
1 answer

PyQuery: Get only text of element, not text of child elements

I have the following HTML:

$325.00$295.00

I'd like to get the $295 out. However, if I simply use PyQuery as follows: price = pq('h1').text() I get both prices. Extracting only direct child…
Richard
  • 52,447
  • 93
  • 287
  • 475
4
votes
4 answers

What’s the most forgiving HTML parser in Python?

I have some random HTML and I used BeautifulSoup to parse it, but in most cases (>70%) it chokes. I tried using BeautifulSoup 3.0.8 and 3.2.0 (there were some problems with 3.1.0 upwards), but the results are almost the same. I can recall…
Vaibhav Mishra
  • 9,447
  • 11
  • 41
  • 56
4
votes
1 answer

parse html body fragment in lxml

I'm trying to parse a fragment of html:

title

I use lxml.html.fromstring. And it is driving me insane because it keeps stripping the tag of my fragments: >…
fserb
  • 3,443
  • 2
  • 23
  • 22