Questions tagged [httrack]

HTTrack (Website copier)

HTTrack is a free and open source Web crawler and offline browser, developed by Xavier Roche and licensed under the GNU General Public License Version 3.

HTTrack allows users to download World Wide Web sites from the Internet to a local computer.[4][5] By default, HTTrack arranges the downloaded site by the original site's relative link-structure. The downloaded (or "mirrored") website can be browsed by opening a page of the site in a browser.

HTTrack can also update an existing mirrored site and resume interrupted downloads. HTTrack is configurable by options and by filters (include/exclude), and has an integrated help system. There is a basic command line version and two GUI versions (WinHTTrack and WebHTTrack); the former can be part of scripts and cron jobs.
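As a rough illustration of how include/exclude filters can combine, here is a minimal Python sketch. It assumes rules are written as "+pattern" or "-pattern" and that a later matching rule overrides an earlier one. The helper name is_accepted, the sample rules, and the use of Python's fnmatch wildcards are assumptions made for this example; HTTrack's own pattern syntax is richer and is not reproduced here.

```python
from fnmatch import fnmatchcase


def is_accepted(url, rules, default=True):
    """Return True if `url` passes the include/exclude rules.

    Hypothetical helper: each rule is "+pattern" (include) or
    "-pattern" (exclude). Rules are scanned in order and the last
    matching rule decides the outcome.
    """
    verdict = default
    for rule in rules:
        sign, pattern = rule[0], rule[1:]
        if fnmatchcase(url, pattern):
            verdict = (sign == "+")
    return verdict


# Illustrative rule set: reject everything, then allow PNG and JPG files.
rules = ["-*", "+*.png", "+*.jpg"]

print(is_accepted("www.example.com/img/logo.png", rules))
print(is_accepted("www.example.com/index.html", rules))
```

With this rule set only the image URL is accepted, because the broad exclude comes first and the narrower includes come after it.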

HTTrack uses a Web crawler to download a website. By default, some parts of a website may not be downloaded because HTTrack honors the robots exclusion protocol, unless that behavior is disabled in the program's options. HTTrack can follow links that are generated with basic JavaScript and inside Applets or Flash, but not complex links (generated using functions or expressions) or server-side image maps.
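To show how the robots exclusion protocol can keep a crawler away from part of a site, here is a small Python sketch using the standard library's urllib.robotparser. The robots.txt content, the user-agent name "ExampleCrawler", and the URLs are invented for illustration. This is not HTTrack's own implementation, only an example of the kind of check a robots-aware crawler makes.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, invented for this example.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def allowed_urls(robots_txt, user_agent, urls):
    """Return the URLs that the given robots.txt allows for `user_agent`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if parser.can_fetch(user_agent, url)]


urls = [
    "https://www.example.com/index.html",
    "https://www.example.com/private/report.html",
]

# Only the URL outside /private/ is kept.
print(allowed_urls(ROBOTS_TXT, "ExampleCrawler", urls))
```

A crawler that honors these rules would skip everything under /private/, which is one reason a mirror can end up incomplete.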

References:

http://www.httrack.com/

http://en.wikipedia.org/wiki/HTTrack

61 questions
0
votes
1 answer

How do I export a website made in Dotcms?

I'm trying to save one of my websites (with all its files) to my PC so I can upload it to another server. I've tried using HTTrack and wget, but with both I only got a small part of the images and most of it was scattered in a new folder called…
ZuZu
  • 1
0
votes
0 answers

How do I mirror a website with its embedded content with Httrack?

Pretty much the title. I’m trying to mirror store.taylorswift.com, but most of the images are embedded from Shopify. So while I succeed in mirroring the website by itself, the result is not as usable as intended, and not at all what I would consider…
0
votes
0 answers

How to make a slider work on a cloned HTML page?

I have cloned an HTML page containing a slider using the HTTrack tool, but the slider looks static and does not move. It is effectively a snapshot of the slider's current state, with no way to change that state, even though all links to images are…
vitaliy4us
  • 253
  • 1
  • 12
0
votes
0 answers

Is it possible to mirror the content of a Canvas LMS website with HTTrack?

I am trying to mirror course materials from Canvas (Learning Management System) so that I can study offline. (Blame it on COVID-19) I tried to follow this tutorial to download but it did not work. The URL it captured is:…
cccfran
  • 97
  • 5
0
votes
0 answers

How to configure HTTrack in remote Linux server to download a site?

I'm trying to download a website with HTTrack from the Linux command line: httrack https://www.example.com/test/ -O /home/user/websites/test. This runs perfectly on my local machine and does the job. But when I try this same command in a remote linux…
Zed Shaw
  • 65
  • 12
0
votes
1 answer

Detect URL and Redirecting URL using Javascript

I have a question about how to detect the domain of a URL using JavaScript and redirect. My goal is to redirect if the URL is not on my domain (e.g., my domain is website.com; if the domain is not website.com, it redirects to website.com). I think…
Gen Happy
  • 9
  • 1
0
votes
0 answers

How can I download web pages in HTML format when a login is required?

I have an educational site whose structure is as follows. There are 20 modules, each module has 30-50 submodules (the number varies), and each submodule has 20-30 videos. Under the videos there are comments. It's an educational site and…
Fasty
  • 672
  • 7
  • 26
0
votes
1 answer

Httrack convert wordpress to HTML

I am trying to convert a WordPress website into a simple HTML/CSS website, but whenever I use HTTrack, it downloads all of the WordPress files, making it hard for me to extract the simple HTML/CSS files. Is there a way to solve that…
0
votes
0 answers

How to stop httrack from copying websites?

HTTrack can download all the files stored on the server of any website. How can I stop HTTrack from doing so? Can I achieve this by using a robots.txt file?
user11755923
0
votes
1 answer

How to download/mirror my website built using iWeb on Mac when I have no access to hosting?

I created a site using iWeb on a Mac a few years ago, but I no longer have the domain.sites file to edit the site. I also don't have access to the hosting account, since it has been a long time since I was active on it. I used the HTTrack website copier and dozens…
Brainy Prb
  • 393
  • 5
  • 20
0
votes
1 answer

Website Host Gone | Recover the old website

My client lost his website hosting because he didn't pay for the last 3 months, so the host deleted the website from their servers. Now we have a version on https://web.archive.org. How can we recover it and upload it again to another host…
GreenZone
  • 1
  • 3
0
votes
1 answer

HTTrack on angular.io

I was trying to use HTTrack with default settings to download the angular.io docs, without any success. What options and preferences should I use for performing this task?
octo-developer
  • 181
  • 1
  • 1
  • 9
0
votes
0 answers

Download Webpage with HTTrack executed JavaScript

I want to save a webpage with HTTrack, including the output of the executed JavaScript. I'm using: httrack -r1 URL -O PATH. Currently I'm only getting the .js source. Is there any option I can add to…
night4awk
  • 21
  • 1
  • 6
0
votes
1 answer

wrong srcset attributes from httrack

I have spidered a website with HTTrack, and a lot of files on different levels were generated. But the website uses picture/source tags with srcset attributes, which HTTrack does not handle, so those pictures do not work well offline. httrack…
Bernd Wilke πφ
  • 8,555
  • 1
  • 13
  • 32
0
votes
0 answers

How to not create empty folders?

I am trying to download only images from a certain website while preserving the original folder structure in which the images are located on the website's server. In the filter settings I set up the file types to download…
Evgeniy
  • 1,838
  • 18
  • 41