Questions tagged [scrapinghub]

Scrapinghub, a web scraping development and services company, supplies cloud-based web crawling platforms.

175 questions
1
vote
0 answers

Crawlera/Zyte proxy authentication using C# and Selenium

I've tried a number of ways of using Zyte (formerly Crawlera) proxies with Selenium. They provide 1) an API key (the username) and 2) a proxy URL/port. No password is needed. What I have tried... ChromeOptions options = new ChromeOptions(); var proxy =…
MattHodson
  • 508
  • 4
  • 16
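
Plain Chrome proxy flags accept only a host and port, so one way to sanity-check the Crawlera-style authentication scheme (API key as the proxy username, empty password) is outside the browser first. A minimal sketch in Python with requests, assuming a placeholder API key and the classic proxy.crawlera.com:8010 endpoint; substitute whatever host/port your Zyte account shows:

```python
import requests

# Placeholder values: substitute the API key and proxy endpoint
# from your own Zyte/Crawlera account.
API_KEY = "<your-api-key>"
PROXY = "proxy.crawlera.com:8010"

# Crawlera-style auth: the API key is the proxy username, password is empty.
proxies = {
    "http": f"http://{API_KEY}:@{PROXY}",
    "https": f"http://{API_KEY}:@{PROXY}",
}

# Crawlera re-signs HTTPS traffic, so for a quick smoke test either
# install its CA certificate or disable verification as done here.
response = requests.get("https://httpbin.org/ip", proxies=proxies, verify=False)
print(response.text)
```

If this succeeds, the same user:pass@host URL form can be handed to a credential-aware proxy layer (for example, selenium-wire on the Python side), since the browser flags themselves cannot carry credentials.
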
1
vote
1 answer

Not able to scrape image URLs using Beautiful Soup and Python

So basically I am using the below code to scrape the image URLs of the credit cards from the respective links in the explore_more_url variable. from urllib.request import urlopen from bs4 import BeautifulSoup import json, requests, re from selenium…
user15215612
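
For the static parts of such a page, a minimal Beautiful Soup sketch that collects every `<img>` src is usually enough; images injected by JavaScript only appear after the page is rendered through Selenium. The URL below is a placeholder for one of the links in explore_more_url:

```python
import requests
from bs4 import BeautifulSoup

# Placeholder URL; substitute a link from the explore_more_url variable.
url = "https://example.com/credit-cards"
soup = BeautifulSoup(requests.get(url).text, "html.parser")

# Collect the src of every <img> tag in the static HTML.
image_urls = [img["src"] for img in soup.find_all("img") if img.get("src")]
print(image_urls)
```
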
1
vote
1 answer

How can I scrape the image using Beautiful Soup and Python

I am trying to scrape the image link from the below link, but I am not able to. Link: https://www.online.citibank.co.in/credit-card/rewards/citi-rewards-credit-card?eOfferCode=INCCCCTWAFCTRELM I have used the below code x = '…
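
When the picture is not in an `<img>` tag at all, it often hides in an og:image meta tag or an inline background-image style. A hedged sketch that checks both, using the URL from the question:

```python
import re
import requests
from bs4 import BeautifulSoup

url = ("https://www.online.citibank.co.in/credit-card/rewards/"
       "citi-rewards-credit-card?eOfferCode=INCCCCTWAFCTRELM")
soup = BeautifulSoup(requests.get(url).text, "html.parser")

# Try the Open Graph image first, then fall back to inline
# background-image styles; JS-rendered images need Selenium instead.
og = soup.find("meta", property="og:image")
if og:
    print(og["content"])
for tag in soup.find_all(style=re.compile(r"background-image")):
    match = re.search(r"url\(['\"]?(.*?)['\"]?\)", tag["style"])
    if match:
        print(match.group(1))
```
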
1
vote
0 answers

Is it possible to use a monitor on a script if it fails?

I use Scrapinghub to run my spiders. I have a FinishReasonMonitor that sends me a Slack message if a spider fails. Is it possible to apply this to a script? My spiders rarely fail, but my scripts occasionally do. In Scrapinghub it shows script outcomes as…
weston6142
  • 115
  • 8
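
Spidermon monitors such as FinishReasonMonitor hook into the spider-close signal, which a plain script never emits. A common workaround is a small try/except wrapper that posts the traceback to a Slack incoming webhook; the webhook URL below is a placeholder:

```python
import sys
import traceback

import requests

# Placeholder webhook URL: create one under your Slack workspace's
# "Incoming Webhooks" app and paste it here.
SLACK_WEBHOOK = "https://hooks.slack.com/services/XXX/YYY/ZZZ"

def main():
    ...  # the actual script body goes here

if __name__ == "__main__":
    try:
        main()
    except Exception:
        # Report the failure to Slack, then exit non-zero so the
        # Scrapinghub job outcome also reflects the error.
        requests.post(SLACK_WEBHOOK, json={
            "text": f"Script failed:\n{traceback.format_exc()}"
        })
        sys.exit(1)
```
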
1
vote
0 answers

I am using Scrapy to scrape data from Yelp. I cannot see any error, but no data is scraped from the start URLs mentioned in the spider

Code for items.py and the other files is included below, along with the logs at the end. I am not getting any error, but according to the logs Scrapy has not scraped any pages. import scrapy class YelpItem(scrapy.Item): #…
sneha s
  • 11
  • 1
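
When the logs report zero pages crawled without any error, the usual suspects are a wrong allowed_domains (offsite requests are filtered silently at DEBUG level) or robots.txt denials. A minimal sketch of the shape such a spider should have; the CSS selector is a placeholder, and Yelp actively blocks scrapers, so requests may be refused regardless:

```python
import scrapy

class YelpSpider(scrapy.Spider):
    name = "yelp"
    # If allowed_domains is wrong, every request is filtered out silently;
    # check the logs for "Filtered offsite request" lines.
    allowed_domains = ["yelp.com"]
    start_urls = ["https://www.yelp.com/search?find_desc=coffee"]

    def parse(self, response):
        # Placeholder selector; adjust to the actual page markup.
        for name in response.css("a.business-name::text").getall():
            yield {"name": name}
```
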
1
vote
0 answers

How to use Crawlera proxies in Selenium

I have a Selenium project and want to use a Crawlera proxy with it. I already have a Crawlera API key. headless_proxy = "127.0.0.1:3128" proxy = Proxy({ 'proxyType': ProxyType.MANUAL, 'httpProxy':…
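
A common pattern, which the 127.0.0.1:3128 address in the question suggests, is to run Zyte's crawlera-headless-proxy locally so that it holds the API key and the browser only needs a plain, credential-free proxy address. A sketch under that assumption:

```python
from selenium import webdriver

# Assumes crawlera-headless-proxy is running locally on port 3128 and
# already holds the Crawlera API key, so Chrome needs no credentials.
headless_proxy = "127.0.0.1:3128"

options = webdriver.ChromeOptions()
options.add_argument(f"--proxy-server=http://{headless_proxy}")
options.add_argument("--ignore-certificate-errors")  # Crawlera re-signs TLS

driver = webdriver.Chrome(options=options)
driver.get("https://httpbin.org/ip")
print(driver.page_source)
driver.quit()
```
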
1
vote
0 answers

504 Timeout Exception when using scrapy-splash with crawlera

I tried scrapy-splash with http://www.google.com and followed all the prerequisite steps given in the following GitHub repo, https://github.com/scrapy-plugins/scrapy-splash, and I was able to render the Google page. However, when I tried the same…
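
A 504 from Splash usually means the render exceeded Splash's own timeout rather than anything on the Crawlera side. The standard scrapy-splash wiring from the plugin's README looks like this:

```python
# settings.py — the standard scrapy-splash configuration.
SPLASH_URL = "http://localhost:8050"
DOWNLOADER_MIDDLEWARES = {
    "scrapy_splash.SplashCookiesMiddleware": 723,
    "scrapy_splash.SplashMiddleware": 725,
    "scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware": 810,
}
SPIDER_MIDDLEWARES = {"scrapy_splash.SplashDeduplicateArgsMiddleware": 100}
DUPEFILTER_CLASS = "scrapy_splash.SplashAwareDupeFilter"
```

On the request side, SplashRequest(url, callback, args={"wait": 5, "timeout": 90}) gives slow pages more time than the default, provided Splash itself was started with --max-timeout 90 or higher; routing through Crawlera adds latency, so generous timeouts are a reasonable first fix.
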
1
vote
1 answer

Scrapinghub Deploy Failed

I am trying to deploy a project to Scrapinghub and here's the error I am getting: slackclient 1.3.2 has requirement websocket-client<0.55.0,>=0.35, but you have websocket-client 0.57.0. Warning: Pip checks failed, please fix the conflicts. WARNING:…
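
The deploy image installs the project's dependencies and then runs pip check, which fails on the conflicting pins. One fix, assuming the project's scrapinghub.yml points at a requirements.txt, is to pin websocket-client into the range slackclient 1.3.2 accepts:

```
# requirements.txt
slackclient==1.3.2
websocket-client>=0.35,<0.55.0
```
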
1
vote
0 answers

scrapinghub requests.exceptions.HTTPError: 401 Client Error: Unauthorized for url: https://storage.scrapinghub.com

I am trying to run scrapy_price_monitor in a local environment, but when I run "scrapy crawl spidername", it returns "unauthorized" when trying to send the item to storage.scrapinghub. I have already successfully run "shub login" (added my…
pedrovgp
  • 545
  • 5
  • 20
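
Inside Scrapy Cloud the job key and API key are injected into the environment, but a local run has neither, so writes to storage.scrapinghub.com come back 401; shub login only configures the shub CLI, not the running spider. A sketch of supplying the key explicitly with the python-scrapinghub client (the project id is a placeholder); the same key can also be exported as the SH_APIKEY environment variable:

```python
from scrapinghub import ScrapinghubClient

# Placeholder API key and project id; both are shown in the Scrapinghub UI.
client = ScrapinghubClient("<your-api-key>")
project = client.get_project(12345)
print(project.jobs.count())  # fails with 401 if the key is wrong or missing
```
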
1
vote
0 answers

Scrapy: settings, multiple concurrent spiders, and middlewares

I'm used to running spiders one at a time, because we mostly work with scrapy crawl and on Scrapinghub, but I know that one can run multiple spiders concurrently, and I have seen that middlewares often have a spider parameter in their…
kenshin
  • 197
  • 10
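
Each call to CrawlerProcess.crawl() builds its own Crawler, so every concurrently running spider gets its own middleware instances and its own merged settings (including custom_settings); the spider parameter in middleware hooks then identifies which crawl a given request or response belongs to. A minimal sketch, assuming two spiders registered in the project:

```python
from scrapy.crawler import CrawlerProcess
from scrapy.utils.project import get_project_settings

# Must run inside the Scrapy project so the spider names resolve.
process = CrawlerProcess(get_project_settings())
process.crawl("spider_one")  # placeholder spider names
process.crawl("spider_two")
process.start()  # blocks until both crawls finish
```
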
1
vote
0 answers

Why the Splash headless browser cannot fetch LinkedIn pages

I have tried to get the page source of LinkedIn pages, but I am not able to fetch even one URL. I get a response like "Failed loading page". A few samples: https://www.linkedin.com/company/amazon https://www.linkedin.com/company/apple Splash version:…
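
LinkedIn aggressively blocks anonymous headless browsers, so "Failed loading page" is often the block itself rather than a Splash misconfiguration. One way to confirm is to call Splash's render.html endpoint directly with a desktop User-Agent; a sketch assuming Splash listens on localhost:8050, with the caveat that LinkedIn may refuse the request regardless:

```python
import requests

# Splash's render.html endpoint accepts custom headers when the
# request is sent as a JSON POST body.
resp = requests.post("http://localhost:8050/render.html", json={
    "url": "https://www.linkedin.com/company/amazon",
    "wait": 5,
    # A desktop User-Agent; LinkedIn may still refuse anonymous bots.
    "headers": {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"},
})
print(resp.status_code, resp.text[:200])
```
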
1
vote
1 answer

How to scrape multiple websites with different data in URLs

I'm scraping some data from a webpage whose URL ends with the ID of the product. It appears to rewrite the data at every single row, as if it's not appending the data from the next line. I don't know exactly what's going on, if my first…
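
That symptom, every row overwriting the last, usually means the output file is reopened in "w" mode inside the per-product loop, truncating it each time. A sketch of opening the file once before iterating over the product ids; scrape_product here is a hypothetical stand-in for the real request/parse step:

```python
import csv

def scrape_product(pid):
    """Hypothetical stand-in for the real request/parse step."""
    return {"id": pid, "price": "0.00"}

product_ids = ["101", "102", "103"]  # placeholder ids

# Open the file once, outside the loop: reopening it with mode "w"
# inside the loop truncates it and rewrites the data on every row.
with open("products.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["id", "price"])
    writer.writeheader()
    for pid in product_ids:
        writer.writerow(scrape_product(pid))
```
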
1
vote
1 answer

mysql.connector.errors.InterfaceError: 2003: Can't connect to MySQL server on '127.0.0.1:3306' on Scrapinghub

I am trying to run my spider on Scrapinghub, and running it produces an error. Traceback (most recent call last): File "/usr/local/lib/python3.6/site-packages/twisted/internet/defer.py", line 1418, in _inlineCallbacks result = g.send(result) File…
Biddaris
  • 33
  • 1
  • 4
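
Scrapy Cloud jobs run in isolated containers with no local MySQL server, so 127.0.0.1:3306 points at the job's own container and the connection is refused. The database has to be reachable over the network; a sketch with a placeholder public hostname:

```python
import mysql.connector

# On Scrapy Cloud, 127.0.0.1 is the job's own container, which runs no
# MySQL server; point at a host the container can actually reach.
conn = mysql.connector.connect(
    host="db.example.com",  # placeholder public hostname
    port=3306,
    user="scrapy",
    password="<password>",
    database="items",
)
conn.close()
```
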
1
vote
1 answer

"'str' object has no attribute 'get'" when using Google Cloud Storage with ScrapingHub

I'm trying to get Google Cloud Storage working with a Scrapy Cloud + Crawlera project so that I can save the text files I'm trying to download. I'm encountering an error when I run my script that seems to be related to my Google permissions not…
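
That particular error frequently means a raw JSON string was handed to code expecting a parsed mapping, for example a service-account key passed through a Scrapy setting or environment variable without json.loads. A hedged sketch of building GCS credentials from a parsed key file (the file name is a placeholder):

```python
import json

from google.oauth2 import service_account

# If the service-account key arrives as a string (from an env var or a
# Scrapy setting), parse it first: passing the raw string where a dict
# is expected is a typical cause of "'str' object has no attribute 'get'".
with open("service-account.json") as f:  # placeholder key file
    info = json.load(f)

credentials = service_account.Credentials.from_service_account_info(info)
```
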