Which Website Downloader?

pvsurfer · Feb 23, 2006

I'm looking to get a website downloader (and offline browser).

I'm aware of the likes of Netgrabber, Offline Explorer, Teleport, etc. but don't know much about them other than their own boilerplate.

I've also come across a freeware product, HTTrack, but if the 'payware' products are better, I wouldn't mind buying one.

Any advice or feedback would be appreciated.

dog · Feb 23, 2006

I'd suggest you forget those tools altogether... they do more harm than good. The resource spikes they cause for sites are tremendous: they pull page after page, multiple pages at a time, aggressively downloading information. The bandwidth and server load are very taxing. They'll only serve to cripple servers and raise costs, which means more ads as sites try to gain revenue to cover those costs, and may ultimately push sites/domains to go subscription-based.

Why don't you try an RSS reader instead, and then view the pages you fancy from the descriptions?

Steve

pvsurfer · Feb 23, 2006

Steve~

I didn't realize that they put that kind of load on a website!

Do all of them do that to about the same extent (don't they have settings to prevent that)?

~pv

NGRhodes · Feb 24, 2006

I use WinHTTrack. I've never explored the commercial tools, but this is the only one I've found that works flawlessly and has good configuration options.

http://www.httrack.com/

Steve, I agree, but sometimes we need to grab offline copies of sites.

dog · Feb 24, 2006

Hi pv,

Unfortunately ... they all have the same effect. Some of the tools mentioned can be throttled down to some extent, but even then they cause unnecessary load. They work much, much faster than a real person: a person reads as they go and differentiates between what does and doesn't interest them, whereas these tools just instantly download everything, following every link. They're also designed to pull using multiple connections simultaneously (to reduce the time needed to accomplish their goal). Truthfully, I don't see the merit of these tools for offline browsing. Yes, they could be useful if someone wanted/needed to mirror a site that was experiencing difficulties, but not much beyond that.

You also have to remember that search engine spiders/crawlers use similar technology, and there are hundreds of thousands of those indexing every piece of content available. Even though they are scripted to index things at a relatively reasonable rate, they too can cause heavy loads, especially when multiple spiders (10-100) from the same search engine are all indexing one site ... not to mention the spiders/crawlers of other search engines indexing at the same time. Search engine spiders really are a catch-22 for site operators: yes, they are wanted for their indexing/search results ... but the kind of bunch-ups mentioned are very taxing. If you add several individuals doing the same thing, you can really begin to imagine the damage it could inflict, as well as the cause and effect I mentioned in my previous post.

I hope anyone using this type of technology for personal use really reconsiders doing so. It'll only serve to harm us all in the end.

HTH,

Steve

NGRhodes · Feb 24, 2006

dog, yes, and a lot of these crawlers ignore the HTTP/1.1 guideline (RFC 2616) that a client should keep no more than 2 simultaneous connections to a server.

The best thing you can do is make sure you use only 1 or 2 connections to a website, to prevent excessive strain on the servers.
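For anyone scripting their own downloader rather than using one of the tools above, the "1 or 2 connections" advice can be sketched roughly like this in Python. This is only an illustration, not how HTTrack or any other product works internally: the `PoliteFetcher` class, its `max_connections` and `min_delay` parameters, and the `urllib_fetch` helper are all made-up names, and the one-second default delay is an assumption rather than a recommended value.

```python
import threading
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def urllib_fetch(url):
    """Hypothetical default fetcher: download one URL with the standard library."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return response.read()


class PoliteFetcher:
    """Caps simultaneous connections and spaces out the start of each request."""

    def __init__(self, fetch=urllib_fetch, max_connections=2, min_delay=1.0):
        self._fetch = fetch                      # callable(url) -> result
        self._slots = threading.Semaphore(max_connections)
        self._min_delay = min_delay              # seconds between request starts
        self._lock = threading.Lock()
        self._next_start = 0.0

    def _wait_turn(self):
        # Reserve the next free start time, then sleep until it arrives.
        with self._lock:
            now = time.monotonic()
            start = max(now, self._next_start)
            self._next_start = start + self._min_delay
        time.sleep(max(0.0, start - now))

    def get(self, url):
        # The semaphore guarantees at most max_connections fetches in flight.
        with self._slots:
            self._wait_turn()
            return self._fetch(url)

    def get_all(self, urls, workers=8):
        # Extra worker threads simply queue up behind the semaphore.
        with ThreadPoolExecutor(max_workers=workers) as pool:
            return list(pool.map(self.get, urls))
```

Something like `PoliteFetcher(max_connections=1, min_delay=2.0).get_all(urls)` would then pull the pages one at a time with a pause between them, which is much closer to how a person browses.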

Good point about dynamic, searchable sites. I always stop crawlers from using any search facilities (and make sure they have access to an index/sitemap instead).
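On the downloader side, the same idea (stay out of search pages, respect what the site disallows) might look something like the sketch below. The search-detection rules are guesses for illustration only: `SEARCH_PATH_HINTS`, `SEARCH_QUERY_KEYS`, the helper names, and the `example-mirror` user agent are all assumptions, and a real site may name its search pages differently. The robots.txt handling uses Python's standard `urllib.robotparser`.

```python
from urllib.parse import parse_qs, urlparse
from urllib.robotparser import RobotFileParser

# Assumed conventions for spotting search pages; adjust per site.
SEARCH_PATH_HINTS = ("/search",)
SEARCH_QUERY_KEYS = {"q", "query", "search", "s"}


def looks_like_search(url):
    """Guess whether a URL points at a site's search facility."""
    parts = urlparse(url)
    path = parts.path.lower()
    if any(path.startswith(hint) for hint in SEARCH_PATH_HINTS):
        return True
    # keep_blank_values so that "?q=" still counts as a search query.
    params = parse_qs(parts.query, keep_blank_values=True)
    return any(key.lower() in SEARCH_QUERY_KEYS for key in params)


def robots_from_text(text):
    """Build a robots.txt parser from text that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


def allowed(url, robots, user_agent="example-mirror"):
    """Fetch only what robots.txt permits and what is not a search page."""
    return robots.can_fetch(user_agent, url) and not looks_like_search(url)
```

A crawl loop would call `allowed(url, robots)` before queuing each link, so it follows the index/sitemap pages but never hammers the search script.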
