Can Proxomitron convert encoded characters?

Discussion in 'privacy general' started by Devinco, Sep 6, 2004.

Thread Status:
Not open for further replies.
  1. Devinco

    Devinco Registered Member

    Joined:
    Jul 2, 2004
    Posts:
    2,524
    Hi Everyone,

    I have read that Proxomitron can help protect our privacy by filtering out nosey Javascript commands (amongst other things).

    One method that the bad guys use to evade web filters like Proxomitron is to encode characters in various ways like this:
    <&#115;cript type="java&#115;cript">

    or this:
    document.write('<'+'j'+'a'+'v'+'a'+'s'+'c'+'r'+'i'+'p'+'t'+'>')

    or URL encoding like this:
    goodwebsite. com/site/dir/helpdesk.asp@1234567890N#v1 [insert nasty javascript url encoded here]

    Does Proxomitron prefilter the webpage to convert these types of encoded characters before the main filters get the page?

    Or, is there a filter that does this encoded character conversion?
     
    Last edited: Sep 6, 2004
  2. GlobalForce

    GlobalForce Regular Poster

    Joined:
    Jun 30, 2004
    Posts:
    3,581
    Location:
    Garden State, USA
    Hi Devinco,

    Seeing as this post has provoked such fear, I enter not with an explanation, but rather shared interest.

    I've pondered your question (though not tech enough.....yet), and since you seem so inquisitive (good thing),
    I was hoping to spur the gears of progress.

    Proxomitron does intrigue me, but at the moment I'm dealing with a less involved agenda :D.

    So in getting on with it, I've located a few sites rolling back to my original intention...

    http://www.ccs.neu.edu/home/gene/cs-info.html

    http://www.w3.org/WAI/ER/existingtools.html

    http://www.geocities.com/u82011729/prox/filter.html

    http://www.securityconfidence.com/seccon.php?page=Security-Scanner-Database

    They don't all deal with proxo, but you may find them of interest.
    Please keep me posted (no hurry) as I'm curious to what you find.


    GF ;)
     
  3. Paranoid2000

    Paranoid2000 Registered Member

    Joined:
    May 2, 2004
    Posts:
    2,839
    Location:
    North West, United Kingdom
    Some interesting links there GlobalForce - thanks for posting them!

    I was hoping for a Proxomitron-guru to answer this thread (I'd only consider myself a semi-n00b1e with Prox... :D) but I'll try to answer - you can (and will need to) create filters to unencode obfuscated Javascript/HTML.

    There appear to be two methods of obfuscation - URI encoding (where characters are replaced with a % followed by their hexadecimal character code - so a space would appear as %20) and HTML entity encoding, which can use either decimal (space = &#32;) or hexadecimal (space = &#x20;) values. A list of characters and codes is given in ISO 8859-1 Characters as Named and Numeric HTML Entities; however, the full list is defined by ISO 10646.
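
    To make the two encodings concrete, here is a minimal sketch in Python (not Proxomitron filter syntax) that decodes each form back to plain text. The function names and sample strings are made up for illustration; only the standard-library calls html.unescape and urllib.parse.unquote are real.

```python
import html
from urllib.parse import unquote


def decode_entities(text):
    """Turn numeric or named HTML entities back into plain characters."""
    return html.unescape(text)


def decode_uri(text):
    """Turn %XX URI escapes back into plain characters."""
    return unquote(text)


# Illustrative samples: each one hides the letter "s" of "script".
samples = {
    "decimal entity": "&#115;cript",
    "hex entity": "&#x73;cript",
    "URI escape": "%73cript",
}

for label, raw in samples.items():
    # Apply both decoders so either kind of encoding is undone.
    decoded = decode_entities(decode_uri(raw))
    print(f"{label}: {raw!r} -> {decoded!r}")
```

    All three samples decode to "script", which is the form a plain-text filter would need to see in order to match.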

    Either method can be de-obfuscated - though given the size of ISO10646 (over 34,000 characters as mentioned in this technical overview) it may be easier just to filter out such characters (unless you normally visit webpages using foreign characters).

    As for obfuscation via concatenation (using +'s), any Javascript has to have <script> and </script> tags so you only need to worry about these being hidden (document.write is itself Javascript so has to be within such tags) as in "<SCRI--><!-- PT LANGUAGE="JavaScript">". A search term like <*s*c*r*i*p*t*> should catch these out.
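
    As a rough illustration of that search term, the sketch below translates it into a Python regular expression. It assumes that each * means "any run of characters" and that matching ignores case; that is my reading of the wildcard rather than a statement of how Proxomitron's matcher is implemented. The helper name and the "harmless" sample are hypothetical.

```python
import re


def wildcard_to_regex(pattern):
    """Translate a pattern where '*' means 'any run of characters' into a regex."""
    parts = [re.escape(piece) for piece in pattern.split("*")]
    # Assumption: case-insensitive matching, and '*' may span line breaks.
    return re.compile(".*?".join(parts), re.IGNORECASE | re.DOTALL)


script_tag = wildcard_to_regex("<*s*c*r*i*p*t*>")

hidden = '<SCRI--><!-- PT LANGUAGE="JavaScript">'   # example from the post above
harmless = '<a href="/news">Some crisp text</a>'    # made-up ordinary markup

print(bool(script_tag.search(hidden)))    # True
print(bool(script_tag.search(harmless)))  # True
```

    The second result shows the cost of so broad a term: under these assumptions it also matches ordinary markup that merely contains the letters s, c, r, i, p, t in that order, so it would probably need tightening before real use.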

    Asking this question at the Official Proxomitron Forum may yield some better answers.
     
  4. Devinco

    Devinco Registered Member

    Joined:
    Jul 2, 2004
    Posts:
    2,524
    GlobalForce,

    Thank you for the links. They look very interesting. I will check them out.

    :)
     
  5. Devinco

    Devinco Registered Member

    Joined:
    Jul 2, 2004
    Posts:
    2,524
    Paranoid2000,

    I sent a PM to Kye-U a couple of days ago, but I'm sure Kye-U is very busy. Or perhaps the PM JS alert was filtered out by Proxo.
    Thanks for the idea about the Proxo Forum, that is my next destination for this topic. I will also ask at Kye-U's forum.

    I was hoping that this wasn't a brand new idea. That someone already had a filter for it. Or perhaps that Proxo natively decodes them.
    I like the idea of filtering out the foreign characters (that would be the crackers' next choice of obfuscation).
    The web browser is able to decode these characters. I wonder if there is a table or component already written that could be used to do the decoding?

    A lot to learn.

    Thank you! :)
     
  6. GlobalForce

    GlobalForce Regular Poster

    Joined:
    Jun 30, 2004
    Posts:
    3,581
    Location:
    Garden State, USA
    Thanks for the replies gentlemen,

    Glad to hear you both approve.....A bit late getting back to this, but I've been busy trying to put together some useful info for this thread.

    I hope you don't mind hearing this, but I feel other members and a few select guests consider you the "Proxo Guru" here.
    Your genuine and concise explanations have exposed some of the mysteries (and benefits) of setting proxo up.

    This is just a short response from me. I'll be back on this one shortly.....

    GF
     
    Last edited: Sep 16, 2004
  7. Kye-U

    Kye-U Security Expert

    Joined:
    Jun 11, 2004
    Posts:
    481
    Devinco, I apologize for this very late post.

    I do not receive many PMs on WS, so I don't pay attention to the box at the top right corner :X

    Anyways, back to your... nearly 2 month old question.

    With Proxomitron, nothing is impossible (well, web/header based).

    With a very "broad" filter, like the example Paranoid2000 pointed out, you could make the matching portion look like:

    Code:
    <*(s*c*r*i*p*t|\&#115|%4E%23%76%31%7F)*>
    I wouldn't write such a filter, given the almost limitless hex, octal, and decimal values and the different combinations needed to filter out all "<javascript>" tags.

    I've learned that Javascript is almost impossible to filter, due to the possibilities and variables which are all different.

    Instead I go with specific functions of Javascript, such as "location.replace" or "window.open" and I filter those out.
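
    The idea of targeting specific calls can be sketched in Python as below. This only shows the matching half (finding the calls), not a Proxomitron filter; the two call names come from this post, while the list, the function name and the sample page are hypothetical.

```python
import re

# The two call names come from the post; the list itself is just an example.
WATCHED_CALLS = ("location.replace", "window.open")

# Match a watched name followed by "(", but not when it is the tail of a longer identifier.
_call_pattern = re.compile(
    r"(?<![\w$])(" + "|".join(re.escape(name) for name in WATCHED_CALLS) + r")\s*\("
)


def find_watched_calls(page):
    """Return each watched call found in the page text, in order of appearance."""
    return [match.group(1) for match in _call_pattern.finditer(page)]


# Made-up page fragment with two plain calls and one concatenated one.
page = """
<script>
  window.open("http://ads.example/popup");
  if (top != self) location.replace("http://ads.example/");
  window["op" + "en"]("http://ads.example/hidden");
</script>
"""

print(find_watched_calls(page))  # ['window.open', 'location.replace']
```

    The third call in the sample is built by string concatenation and is not found, which is the same limitation raised at the start of the thread.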

    Proxomitron does not automatically convert hexadecimal to ASCII.

    But there is a function in Proxomitron that you may find useful:

    $UESC(text)

    Source: http://www.sankey.ws/proxlang.html
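
    For a feel of what URL unescaping does, here is a small Python analogue using urllib.parse.unquote. It is not $UESC itself, and whether $UESC behaves identically in every case is an assumption; the escaped string is the one from the matching expression earlier in this post.

```python
from urllib.parse import unquote

# The escaped run taken from the matching expression earlier in the post.
escaped = "%4E%23%76%31%7F"

# Each %XX pair is replaced by the character with that hexadecimal code.
decoded = unquote(escaped)

print(repr(decoded))  # 'N#v1\x7f'
```

    The first four characters decode to N#v1, the same text that appears in the URL example in the opening post; the last one (%7F) is the non-printing DEL character.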

    I would not specifically find ways to filter out different types of encoded tags, especially Javascript.

    If you find a common method that spammers/advertisers are using to bypass web filters, then feel free to write a filter based on that method :D

    I apologize for reviving this topic.
     
  8. GlobalForce

    GlobalForce Regular Poster

    Joined:
    Jun 30, 2004
    Posts:
    3,581
    Location:
    Garden State, USA
    I'm not! I'm thankful for the input Kye-U. I'm just getting underway with Proxo,
    so any new developments surrounding this great little web filter are always of interest.

    I'm sure Devinco will be pleasantly surprised.
    I'll also be expecting P2k to chime in.....


    GF
     
  9. Paranoid2000

    Paranoid2000 Registered Member

    Joined:
    May 2, 2004
    Posts:
    2,839
    Location:
    North West, United Kingdom
    Not much more I can add (I'm still struggling with custom filters to strip all the superfluous tables from Tom's Hardware...). Disabling Javascript via the browser would appear the surest solution - but not all browsers offer the capability of doing this on a site-by-site basis.

    The $UESC command does look like a useful tool here but the SRC command listed further down looks potentially better still - "displays the real source of the web page (not just unaffected by The Proxomitron, but unaffected by JavaScript stunts too!) ". With all those commands though, you could spend half a lifetime just experimenting with Proxo - and probably pick up a Master's Degree in HTML in the process...
     
  10. Kye-U

    Kye-U Security Expert

    Joined:
    Jun 11, 2004
    Posts:
    481
    The "Src" command works like...

    http://src..www.wilderssecurity.com/showthread.php?t=47031

    For this to work on every page, you'd need to write a Header filter to catch the URL, kill the connection, then redirect to "http://src..\1"

    Example:

    Code:
    [HTTP headers]
    In = FALSE
    Out = TRUE
    Key = "URL: src. (Out)"
    Match = "http://(^src..)\1"
    Replace = "\k$JUMP(http://src..\1)"
    This works fine :D

    Still need to find out if it really works, so I'll go test it.

    EDIT: Does not seem to convert everything to a readable type.
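
    The redirect logic of the header filter above can be restated as a short Python sketch. This is my reading of the filter (skip URLs that already start with src.., otherwise jump to the src.. form), not Proxomitron code, and the function name is hypothetical.

```python
SRC_PREFIX = "src.."
SCHEME = "http://"


def redirect_target(url):
    """Return the src.. form of an http:// URL, or None if no redirect is needed."""
    if not url.startswith(SCHEME):
        return None
    rest = url[len(SCHEME):]
    if rest.startswith(SRC_PREFIX):
        # Already in src.. form; redirecting again would loop forever.
        return None
    return SCHEME + SRC_PREFIX + rest


first = redirect_target("http://www.wilderssecurity.com/showthread.php?t=47031")
print(first)                   # http://src..www.wilderssecurity.com/showthread.php?t=47031
print(redirect_target(first))  # None
```

    The check for an existing src.. prefix mirrors the (^src..) part of the Match line, which is what keeps the filter from redirecting its own redirect.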
     
    Last edited: Oct 29, 2004
  11. Devinco

    Devinco Registered Member

    Joined:
    Jul 2, 2004
    Posts:
    2,524
    Kye-U,

    Thank you for replying! I am very glad you did revive this thread. :)

    A web browser can "understand" all of the various obfuscation methods and render (or in the case of javascript, execute) the code properly as a web page.
    How is this done in the browser? (Take Firefox or IE for example)
    Are there common .dll (modules) that convert all of these obfuscations prior to rendering?
    Could these "common modules" be utilized by Proxomitron to pre-filter all this obfuscation first, then Proxo would only have to deal with the regular filtering?
    (Perhaps these common conversion modules are generic in nature and could be used by a 3rd party program like Proxo.)

    If the browser can do this conversion, why can't Proxomitron?
    Why can't there be a general hex, octal, and decimal converter filter in Proxo?

    The $UESC would appear to take care of the URL obfuscation.
    What is your opinion of that command?

    Also, you mentioned that the src command does not seem to convert everything to a readable type.
    How well does this method work?
    What isn't converted to readable type?
     