Removing HOSTS File Duplicate Entries
subhrobhandari
November 6th, 2009, 09:24 AM
How can duplicate entries in a HOSTS file be removed? ;) It's really impossible to manually edit a 30 MB text file.
funkydude
November 6th, 2009, 09:59 AM
HostsMan has this functionality http://www.abelhadigital.com/
subhrobhandari
November 7th, 2009, 08:17 AM
I have already tested it, but it crashes every time I try to find duplicate entries, although it does remove comments successfully. :(
Keyboard_Commando
November 7th, 2009, 09:53 AM
A 30 MB HOSTS file, yikes!
Just out of interest ... do you see any performance slowdowns with a HOSTS file that large? And how did you end up with one that size? You must be subscribed to a lot of blocking definitions. :o
subhrobhandari
November 8th, 2009, 12:30 AM
Yes, I am subscribed to 8 lists or more, and among those, the one list that filters adult sites is around 20 MB by itself. :P There is no noticeable slowdown. The file did shrink to 28.4 MB after the comments were removed. The only problem I face is that it takes around 3-4 minutes to merge an additional list, but that's expected.
vroom23
November 28th, 2009, 04:20 PM
I use the Boxer text editor to change everything to lowercase, remove trailing spaces, and then remove duplicate lines under edit>delete>duplicate lines. I have around 850k entries.
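The same clean-up can also be scripted. Below is a minimal Python sketch, assuming one entry per line. The function name `dedupe_hosts_lines` and the sample entries are made up for illustration and are not part of Boxer or any other tool mentioned in this thread. It lowercases each line, strips trailing whitespace, and keeps only the first occurrence of each line, preserving the original order.

```python
def dedupe_hosts_lines(lines):
    """Lowercase each line, strip trailing whitespace, drop repeated lines."""
    seen = set()      # normalized lines already emitted
    result = []       # output, in original order
    for line in lines:
        normalized = line.rstrip().lower()
        if normalized in seen:
            continue  # duplicate after normalization, skip it
        seen.add(normalized)
        result.append(normalized)
    return result


if __name__ == "__main__":
    # Hypothetical sample data; a real run would read the lines from a file.
    sample = [
        "127.0.0.1 Ads.Example.com  \n",
        "127.0.0.1 ads.example.com\n",
        "127.0.0.1 tracker.example.net\n",
    ]
    for entry in dedupe_hosts_lines(sample):
        print(entry)
```

For a real file, the lines could come from something like `open(path, encoding="utf-8", errors="replace")`. A set holding several hundred thousand short strings should fit in memory on most machines, but that is worth checking on your own system.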
subhrobhandari
November 30th, 2009, 02:41 AM
I have now removed the 20 MB adult sites subscription, so HOSTS is around 6 MB and HostsMan can remove the dupes.
siljaline
November 30th, 2009, 03:24 PM
Sounds like you've combined many to arrive at such a huge file! (http://www.mvps.org/winhelp2002/hostsfaq.htm#Merging)
Start with a fresh one and go from there? (http://www.mvps.org/winhelp2002/hosts.htm)
You could edit out the duplicates manually. (http://www.mvps.org/winhelp2002/hostsfaq.htm#Editor)
inka
December 2nd, 2009, 03:21 AM
-{ Quote: "How can duplicate entries in HOSTS file be removed? ;) Its really impossible to manually edit a 30 MB text file." }-
TextPad32 handles files of unlimited size.
Tools -} Sort (and checkmark 'remove duplicates')
www.textpad.com
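For comparison, a sort with duplicates removed can be sketched in a few lines of Python. This is only an illustration of the end result, not a description of how TextPad works internally, and `sort_unique` is a hypothetical helper name. Note that sorting discards the original order of the file.

```python
def sort_unique(lines):
    """Return the distinct lines in sorted order, ignoring line endings."""
    return sorted({line.rstrip("\r\n") for line in lines})


if __name__ == "__main__":
    # Hypothetical sample data standing in for a HOSTS file.
    sample = [
        "127.0.0.1 b.example\n",
        "127.0.0.1 a.example\n",
        "127.0.0.1 b.example\n",
    ]
    unique = sort_unique(sample)
    print(len(unique), "unique lines")
    print("\n".join(unique))
```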
How many total lines (hostnames) does your HOSTS file currently contain?
Do yourself a favor.
Come to grips with the reality that you can't win by attempting to (collect and) block by hostname.
-=-
After you block smuttylefthandedpilgrims. com
and
www .smuttylefthandedpilgrims. com
next week you'll be right back, chasing your tail and adding
ns1.smuttylefthandedpilgrims. com
and
ns2.smuttylefthandedpilgrims. com
and
download.smuttylefthandedpilgrims. com
ad nauseam
and nowadays yer chasing skeeters, playing Bop-the-Gopher, across 100+ country code TLDs
smuttylefthandedpilgrims. com.my
and
www .smuttylefthandedpilgrims. com.my
and
smuttylefthandedpilgrims. com.kr
and
www .smuttylefthandedpilgrims. com.kr
Instead, you can block based on the (one) pattern:
smuttylefthandedpilgrims
and be done with it using the freeware DNSKong from
http://pyrenean.com
FWIW, I use an older version of DNSKong.exe, which has a GUI; its file size is 260 KB.
The author ditched the GUI for the current version; dnskong.exe is now about 28 KB
(tiny. no frills. terse documentation.)
My current 'named.txt' (blocklist) contains 176,000+ patterns & occupies 2.8 MB
I've tested DNSKong using a blocklist containing 500K+ lines. No problem, no slowdown.
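To make the difference between hostname lists and patterns concrete, here is a rough Python sketch. It assumes that a pattern blocks any hostname containing it as a substring. DNSKong's actual matching rules may differ, and `is_blocked` and the sample hostnames are invented for the illustration.

```python
def is_blocked(hostname, patterns):
    """Return True if any pattern occurs anywhere in the hostname."""
    host = hostname.lower()
    return any(pattern in host for pattern in patterns)


if __name__ == "__main__":
    # One pattern stands in for every hostname variant listed above.
    patterns = ["smuttylefthandedpilgrims"]
    hosts = [
        "smuttylefthandedpilgrims.com",
        "ns1.smuttylefthandedpilgrims.com",
        "www.smuttylefthandedpilgrims.com.kr",
        "example.org",
    ]
    for host in hosts:
        print(host, "blocked" if is_blocked(host, patterns) else "allowed")
```

This naive version scans every pattern on each lookup, so it only shows the idea. A real tool handling 176,000+ patterns would presumably use a faster matching strategy.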
siljaline
December 7th, 2009, 01:53 AM
-{ Quote: "snipped" }-
In English, please? :wacko:
inka
December 7th, 2009, 08:36 PM
First part of my post explains "how to" remove duplicate entries
by loading the giant HOSTS file into TextPad32 and using its Sort command.
-{ Quote: "Its really impossible to manually edit a 30 MB text file" }-
The balance of the post presents an argument for using a blocklist based on patterns (vs. full hostnames) and recommends an app (DNSKong) which will enable you to do so.
Copyright ©2002 - 2012, Wilders Security Forums