Full Site
To download a full website, do
wget -c -w 15 --mirror -p -k -P /path/to/dir http://my.website.com -a my.log
This will download the entire website, waiting 15 seconds between retrievals (-w 15). It will copy everything needed to display the pages correctly (-p), convert links for local viewing (-k), store the files in /path/to/dir (-P), and append messages to my.log instead of printing them to the console (-a). If interrupted, run the same command again and it will resume where it left off (-c).
Add --spider at the end to do a dry run first.
Add --random-wait to vary the wait time (for sites that block automated downloading).
Add --limit-rate=20k to limit download speed to 20 kilobytes per second (an example politeness limit; use your discretion).
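For example, a politer mirror command combining the options above might look like this (the URL, target directory, and rate limit are placeholders; adjust them for the site you are mirroring):
wget -c --random-wait -w 15 --limit-rate=20k --mirror -p -k -P /path/to/dir http://my.website.com -a my.log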
Courtesy: this, this and this.
Small part of site
To download a small part (e.g. a given page and its first-level links) to the current directory, do
wget -c -w 3 -a my.log -r -l 1 -p -k http://somesite.com/interesting/link.html
This will download the page link.html and all links in it (-r), but will not recurse further to links of links (-l 1).
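For the dry run mentioned earlier, one option is to append --spider to the same command, which makes wget check the page and its first-level links without saving anything (somesite.com is, of course, a placeholder):
wget -c -w 3 -a my.log -r -l 1 -p -k http://somesite.com/interesting/link.html --spider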
A URL list
To download a list of files specified by URLs, do
wget -c -w 15 -i urls.txt
This will download all URLs listed in the file urls.txt.
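Here, urls.txt is just a plain-text file with one URL per line, for example (these URLs are placeholders):
http://my.website.com/report.pdf
http://my.website.com/data/archive.zip
http://somesite.com/interesting/link.html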