Simple DIY filtering proxy using Apache

Discussion in 'all things UNIX' started by Gullible Jones, Feb 23, 2015.

  1. Gullible Jones

    Gullible Jones Registered Member

    Joined:
    May 16, 2013
    Posts:
    1,466
    First off, big fat warning: be careful configuring Apache for any purpose. It is easy to goof things up and expose open ports to the whole Internet. If you don't understand what I'm doing here, then please don't try using this.

    Okay, that aside... I am currently running the following configuration as a local filtering proxy. Why? Because I got tired of slow, iffy browser extensions that don't quite work. I wanted something set-and-forget. Thus:

    Code:
    Listen 127.0.0.1:8144
    ProxyRequests On
    ProxyVia On
    
    # Deny all requests by default.
    <Proxy *>
     Order deny,allow
     Deny from all
    </Proxy>
    
    # Allow requests to some top-level domains. Hopefully this is a sane heuristic...
    # The optional part at the beginning is to accommodate unencrypted sites.
    # The hostname has to end at the TLD (optional port, then "/" or end of string),
    # so fake sites (e.g. phishing) with a top-level domain further along their URL
    # string SHOULD be blocked, but please don't put too much faith in this regex.
    <ProxyMatch "^(http://)?[a-z0-9.-]+\.(com|org|net|edu|gov|io)(:[0-9]+)?(/|$)">
     Order deny,allow
     Deny from all
     Allow from 127.0.0.1
    </ProxyMatch>
    
    # Block a bunch of major online advertisers.
    # This is not even close to comprehensive, but knocks out most of the more
    # obnoxious and intrusive ads I've come across. I don't mind if the occasional
    # ad slips through, as long as it's not malicious or gratuitously annoying.
    <ProxyMatch "(adzerk|pagead|adserv|tribalfusion|exoclick|doubleclick)">
     Order deny,allow
     Deny from all
    </ProxyMatch>
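    
    For anyone on Apache 2.4: Order/Deny/Allow is the older 2.2 syntax, and on 2.4 it only works if mod_access_compat is loaded. A rough, untested sketch of the same three rules using the newer Require directives might look like the following. I'm assuming the default AuthMerging Off behaviour here, where a later matching section replaces the earlier authorization rules instead of combining with them.

    Code:
    # Deny all requests by default.
    <Proxy *>
     Require all denied
    </Proxy>
    
    # Allow the local machine to reach the listed top-level domains.
    # The hostname is anchored so that the TLD has to come last.
    <ProxyMatch "^(http://)?[a-z0-9.-]+\.(com|org|net|edu|gov|io)(:[0-9]+)?(/|$)">
     Require ip 127.0.0.1
    </ProxyMatch>
    
    # Block the advertisers; this later section overrides the allow rule.
    <ProxyMatch "(adzerk|pagead|adserv|tribalfusion|exoclick|doubleclick)">
     Require all denied
    </ProxyMatch>
    
    Either way, running apachectl configtest before restarting should catch syntax errors.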
    
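    Note that I'm also passing HTTPS through the Apache proxy, since it's limited strictly to the local machine. HTTPS goes through as a CONNECT tunnel, so Apache never decrypts the traffic, and the rules above only get to see the host and port of an HTTPS request, not the full URL. As far as I can tell, this does not introduce any Superfish-like hazards, but don't be too sure; my testing has not been very thorough.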

    Other ideas:
    - Clever use of regexes could block unencrypted versions of sites, or force redirects to the encrypted versions using mod_rewrite.
    - Somewhat less clever, but probably more effective: one could just blanket deny unencrypted HTTP, maybe with a few exceptions.
    - More clever: time-wasting sites could be redirected to more useful ones.
    - Content blocking of all sorts via URL matching regex. Might be more reliable for some things than for others though....
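    
    The blanket-deny idea is probably the easiest of these to try. Here is a minimal sketch, which I have not tested in this exact form. It is meant to sit between the allow rule and the ad-blocking rule in the config above, so that it overrides the former but not the latter. The host example.org is only a placeholder for whatever exception you actually need.

    Code:
    # Deny all unencrypted HTTP. HTTPS requests show up without the
    # "http://" prefix, so this rule does not touch them.
    <ProxyMatch "^http://">
     Order deny,allow
     Deny from all
    </ProxyMatch>
    
    # Exception (hypothetical host): still allow plain HTTP to example.org.
    <ProxyMatch "^http://(www\.)?example\.org(:[0-9]+)?(/|$)">
     Order deny,allow
     Deny from all
     Allow from 127.0.0.1
    </ProxyMatch>
    
    With that in place, something like curl -x http://127.0.0.1:8144 http://example.com/ should come back with a 403, while the same request for example.org should still go through.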

    And possible issues:
    - False sense of security, as always.
    - Possible lag when initiating connections.
    - If a networking vulnerability is discovered in Apache, there might be trouble!
     
  2. fblais

    fblais Registered Member

    Joined:
    Jul 31, 2008
    Posts:
    1,341
    Location:
    Québec, Canada
    Cool.
    Can I set up Apache on my Linux partition to act as a proxy for Linux itself?
    Or is it intended to be run on a separate PC to act as a proxy for another one?
    Or are both scenarios possible?

    Thanks again!
     
  3. Gullible Jones

    Gullible Jones Registered Member

    Joined:
    May 16, 2013
    Posts:
    1,466
    It was intended as the former, i.e. a proxy for the same machine it runs on. The latter would also work.

    I would strongly advise against doing either, unless you have used Apache for a while and are familiar with it; no offense. For most users I would recommend setting up IPFire, or some other such router/gateway distro, and using the transparent proxy that ships with it.
     
  4. fblais

    fblais Registered Member

    Joined:
    Jul 31, 2008
    Posts:
    1,341
    Location:
    Québec, Canada
    Thanks.
    Not for me then. :)
     