A Beginner’s Guide to Using wget in Windows
Many Windows users are so accustomed to the graphical interface and the web browser as the universal tool of choice that they forget there are a host of other tools out there. Wget is a GNU command-line utility popular mainly in the Linux and Unix communities, primarily used to download files from the internet. However, there is a version of wget for Windows, and using it you can download anything you like, from entire websites to movies, music, podcasts and large files from anywhere online.
Not many Microsoft users know about this neat tool, which is why I wrote this beginner’s guide to using wget in Windows. We tend to use our browser for everything, which is fine, but it isn’t always the most efficient way to get something done. Wget is just one of the many tools that have been around for eons but that very few people know about.
Table of Contents
- Getting wget for Windows
- Download a single file
- Download a single file but save it as something else
- Download to a specific folder
- Resume an interrupted download
- Download a newer version of a file
- Download multiple web pages
- Download an entire website
- Download a specific file type from a website
- Download all website images
- Check a website for broken links
- Download files without overloading the web server
Getting wget for Windows
Getting wget is very easy. Follow this guide to installing and configuring wget.
- Download wget from here and install it. Make sure you get the setup program and not just the source, otherwise it won’t work.
- Once installed, you should be able to run the wget command from a command line window. Open a CMD window as an administrator and type ‘wget -h’ to test. If it prints the help text, you’re golden. If you get a ‘not recognized as an internal or external command’ error, either you downloaded the wrong package or the wget folder is not on your PATH. Try again.
- Create a download directory to save all your files. Type ‘md \directory name’ to create it. I called mine ‘downloadz’ to be recognizable.
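If you happen to have Python installed (an assumption on my part, nothing in this guide requires it), the check and the folder creation from the steps above can be scripted. This is only a minimal sketch: find_wget and make_download_dir are names I made up, and the folder is created relative to wherever you run the script rather than at the root of the drive.

```python
import shutil
from pathlib import Path


def find_wget():
    """Return the full path to wget if it is on PATH, otherwise None."""
    return shutil.which("wget")


def make_download_dir(name="downloadz"):
    """Create the download folder if it does not exist yet and return it."""
    folder = Path(name)
    folder.mkdir(parents=True, exist_ok=True)
    return folder


location = find_wget()
if location:
    print("wget found at", location)
else:
    print("wget is not on your PATH yet")

folder = make_download_dir()
print("Downloads will go to", folder.resolve())
```

If the script reports that wget is not on your PATH, go back to the install step before trying any of the commands below.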
Once installed, you’re ready to set to work. Below I have listed a selection of popular wget commands that can achieve a wide range of things.
Download a single file
wget website.com/file.zip
Download a single file but save it as something else
wget --output-document=newname.html website.com
Download to a specific folder
wget --directory-prefix=folder/subfolder website.com/file.zip
Resume an interrupted download
wget --continue website.com/file.zip
Download a newer version of a file
wget --continue --timestamping website.com/file.zip
Download multiple web pages
For this you need to create a list in Notepad or another text editor. Put each full URL (with http://) on its own line. Then point wget to the file. In this example I named the file Filelist.txt and saved it in the wget folder.
wget --input-file=Filelist.txt
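If typing the list by hand gets tedious, it can also be generated. The sketch below assumes you have Python available, and the page addresses are hypothetical placeholders. write_url_list is my own helper name, and it keeps only entries that start with http:// or https://, in line with the advice above to use full URLs.

```python
from pathlib import Path


def write_url_list(urls, list_path="Filelist.txt"):
    """Write one URL per line, keeping only full URLs (http:// or https://)."""
    full_urls = [u.strip() for u in urls
                 if u.strip().startswith(("http://", "https://"))]
    Path(list_path).write_text("".join(u + "\n" for u in full_urls),
                               encoding="utf-8")
    return full_urls


# Hypothetical page addresses; replace them with the pages you want.
pages = [
    "http://website.com/page1.html",
    "http://website.com/page2.html",
    "website.com/page3.html",  # no http://, so it is left out
]

kept = write_url_list(pages)
print(f"Wrote {len(kept)} URLs to Filelist.txt")
print("Now run: wget --input-file=Filelist.txt")
```

Run it from the wget folder so that Filelist.txt ends up where the wget command expects it.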
Download an entire website
wget --execute robots=off --recursive --no-parent --continue --no-clobber http://website.com
You might find, as I often do, that web hosts block wget. You can try to get around these blocks by impersonating Googlebot. Try typing this:
wget --user-agent="Googlebot/2.1 (+http://www.googlebot.com/bot.html)" -r http://website.com
Download a specific file type from a website
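wget --level=1 --recursive --no-parent --accept FILETYPE http://website.com/FILETYPE/
For example, replace FILETYPE with mp3, mp4, zip or whatever you like. Type the extension the way it appears in the file names, because wget matches it case-sensitively unless you add --ignore-case.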
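To make the substitution concrete, here is a small sketch that fills in the FILETYPE placeholder and prints the finished command. It assumes Python is installed. filetype_command is a hypothetical helper of mine, and like the command above it assumes the files live in a folder named after the file type.

```python
def filetype_command(filetype, site="http://website.com"):
    """Build the wget arguments for fetching one file type from a site."""
    suffix = filetype.strip().lstrip(".")  # ".zip" becomes "zip"
    return [
        "wget", "--level=1", "--recursive", "--no-parent",
        "--accept", suffix,
        f"{site.rstrip('/')}/{suffix}/",
    ]


command = filetype_command(".zip")
print(" ".join(command))
# prints: wget --level=1 --recursive --no-parent --accept zip http://website.com/zip/
```

Copy the printed line into your CMD window, or adjust the URL by hand if the files sit somewhere else on the site.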
Download all website images
wget --directory-prefix=files/pictures --no-directories --recursive --no-clobber --accept jpg,gif,png,jpeg http://website.com/images/
Check a website for broken links
wget --output-file=logfile.txt --recursive --spider http://website.com
Download files without overloading the web server
wget --limit-rate=20k --wait=60 --random-wait --mirror http://website.com
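These settings are polite but slow, so it helps to know roughly how long a job will take. The sketch below is back-of-the-envelope arithmetic, not something wget reports. It assumes 20k means 20 × 1024 bytes per second, that --random-wait averages out to the 60 seconds given by --wait (it varies each pause between 0.5 and 1.5 times that value), and that there is one pause between each pair of files. The file sizes are invented for the example.

```python
def estimate_seconds(file_sizes, rate_bytes_per_s=20 * 1024, wait_s=60):
    """Rough total time: rate-limited transfer plus the average pause between files."""
    transfer = sum(size / rate_bytes_per_s for size in file_sizes)
    pauses = wait_s * max(len(file_sizes) - 1, 0)
    return transfer + pauses


# Ten made-up files of 200 KB each.
sizes = [200 * 1024] * 10
total = estimate_seconds(sizes)
print(f"About {total / 60:.1f} minutes")
# prints: About 10.7 minutes
```

Most of that time is waiting rather than downloading, which is exactly the point: the server gets long breathing spaces between requests.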
There are hundreds, if not thousands, of possible wget commands and I’ve only shown you a few of them here. Now that you’re familiar with the tool and how it works, it’s up to you what you use it for!
Do you have any cool commands that can achieve wonders? Share them with us below!