HTTP Switchboard for Chrome/Chromium:

Discussion in 'other software & services' started by apathy, Nov 25, 2013.

  1. gorhill

    gorhill Guest

    I came up with this scoped recipe for www.stumbleupon.com:

    http%3A%2F%2Fwww.stumbleupon.com%0A%09wh
    itelist%0A%09%09cookie%20stumbleupon.com
    %0A%09%09script%20netdna-cdn.com%0A%09%0
    9script%20stumbleupon.com%0A%09%09*%20*%
    0A%09blacklist%0A%09%09cookie%20*%0A%09%
    09script%20*%0A
    The thing is, StumbleUpon is a tool to browse the web through the site's own <iframe>, so the out-of-the-box settings didn't work (iframes are blacklisted out of the box). Then I found that netdna-cdn.com is required. Etc.

    Now the above ruleset blacklists cookies/scripts in general, except for what is required for stumbleupon itself to work. If you want cookies/scripts enabled also for pages which load inside the StumbleUpon frame, then you will have to deblacklist cookies/scripts and save or decode/import:

    http%3A%2F%2Fwww.stumbleupon.com%0A%09wh
    itelist%0A%09%09*%20*%0A
    At least this way, the permissive rules are restricted to that one scope.
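    For readers who want to see what such a recipe contains before importing it, here is a minimal sketch of the "decode" step in Python. It assumes only what is visible above: the recipe is a percent-encoded string that the forum wrapped at a fixed width. The function name `decode_recipe` is made up for illustration and is not part of HTTPSB.

```python
from urllib.parse import unquote


def decode_recipe(wrapped: str) -> str:
    """Join the forum's wrapped lines and percent-decode the recipe."""
    # The forum wraps the encoded string at a fixed width, so drop the
    # surrounding whitespace (and any stray zero-width spaces) first.
    joined = "".join(
        part.strip().replace("\u200b", "") for part in wrapped.splitlines()
    )
    return unquote(joined)


# The second, permissive recipe from the post above.
encoded = """
http%3A%2F%2Fwww.stumbleupon.com%0A%09wh
itelist%0A%09%09*%20*%0A
"""

decoded = decode_recipe(encoded)
print(decoded)
```

    The decoded text is a scope URL followed by tab-indented `whitelist`/`blacklist` sections, each holding `type hostname` pairs.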

    EDIT: Oops, I realize some of my own blacklist choices (facebook.com, linkedin.com, etc.) made it into the above recipes. I fixed this.
     
    Last edited by a moderator: Jan 12, 2014
  2. wat0114

    wat0114 Registered Member

    Joined:
    Aug 5, 2012
    Posts:
    4,100
    Location:
    Canada
    Exactly. In some cases I simply decline to view a website if I have to allow content just to get a minimally proper rendering.
     
  3. gorhill

    gorhill Guest

    I don't control content providers, but users do: refusing to go along with the bloat/trackers/etc. will force content providers to reduce the bloat. In the end, if bloat prevails, it's because users collectively go along with it. Now, I think a good first step is to help users become aware of the bloat (which is not that visible without the right tools). Let's make it visible so that users now have a way to be informed and to act on that information if they want.
     
  4. J_L

    J_L Registered Member

    Joined:
    Nov 6, 2009
    Posts:
    8,738
    Thanks for going the extra mile, it works great. Unfortunately, I'm too used to the convenience of not whitelisting every site, so the code will have to help someone else at the moment. Great to see such an attentive developer like you.
     
  5. wat0114

    wat0114 Registered Member

    Joined:
    Aug 5, 2012
    Posts:
    4,100
    Location:
    Canada
    gorhill needs his username in lights in this forum. Well, okay, at least he should get the "Developer" tag and appropriate font color that goes along with it :)

    Getting http://www.denverpost.com/ was like pulling teeth o_O

    Code:
    http%3A%2F%2F*.denverpost.com%0A%09white
    list%0A%09%09script%20sb.scorecardresear
    ch.com%0A%09%09object%20sb.scorecardrese
    arch.com%0A%09%09image%20sb.scorecardres
    earch.com%0A%09%09script%20b.scorecardre
    search.com%0A%09%09object%20b.scorecardr
    esearch.com%0A%09%09cookie%202o7.net%0A%
    09%09image%20mngi.112.2o7.net%0A%09%09*%
    20gravatar.com%0A%09%09*%20smugmug.com%0
    A%09%09*%20ajax.googleapis.com%0A%09%09*
    %20ad.auditude.com%0A%09%09*%20yahooapis
    .com%0A%09%09*%20brightcove.vo.llnwd.net
    %0A%09%09*%20overture.com%0A%09%09*%20np
    c-denvernews.overture.com%0A%09%09*%20mi
    lehighmamas.com%0A%09%09*%20nhregister.c
    om%0A%09%09*%20slideshowpro.com%0A%09%09
    *%20mnginteractive.com%0A%09%09*%20tout.
    com%0A%09%09*%20newsinc.com%0A%09%09*%20
    ytimg.com%0A%09%09*%20twimg.com%0A%09%09
    *%20crowdynews.com%0A%09%09*%20digitalfi
    rstmedia.com%0A%09%09*%20breakingburner.
    com%0A%09%09*%20medianewsgroup.com%0A%09
    %09*%20auditude.com%0A%09%09*%20legolas-
    media.com%0A%09%09*%20brightcove.com%0A%
    09%09*%20denverpost.com%0A%09blacklist%0
    A%09%09*%20*%0A
     
    Last edited: Jan 13, 2014
  6. TheWindBringeth

    TheWindBringeth Registered Member

    Joined:
    Feb 29, 2012
    Posts:
    2,171
    Not to try to push you in the regex direction, gorhill, but something didn't come up during those posts about regex support. Instead of switching everything over to regular expressions, would it be possible to simply bolt on a regex-based rules layer that exists in parallel with what you have now? I haven't fleshed this out, but here is what just popped into my head...

    A regex based rule contains:

    RE to match against the page URL
    RE to match against the (sub) request URL
    An array of permission values for those types you need, where 0 = block, 1 = whitelist, and 2 = ignore.

    The regex based rules are checked, in order, *before* your existing checks and have priority. However, they can also ignore things so that the operation falls through to your existing checks and rules. They can match not only hostnames but (also) other portions of the URLs, which adds that level of granularity to your extension. Edit: if necessary and concerned about repetitively performing the matches, perhaps you could use a cache/map where the key is pageURL concatenated with requestURL.
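    The proposal above can be sketched roughly as follows. This is only an illustration of the idea as described in the post, not HTTPSB code: the rule fields, the example patterns and the `existing_check` stand-in are all assumptions. The cache uses the (page URL, request URL, type) tuple as its key, which plays the same role as the concatenated-URL key suggested above.

```python
import re
from dataclasses import dataclass
from functools import lru_cache

BLOCK, WHITELIST, IGNORE = 0, 1, 2


@dataclass
class RegexRule:
    page_re: str       # RE matched against the page URL
    request_re: str    # RE matched against the (sub) request URL
    permissions: dict  # request type -> BLOCK / WHITELIST / IGNORE


# Hypothetical rules, checked in order.
RULES = [
    RegexRule(r"^https?://example\.com/", r"/ads/",
              {"script": BLOCK, "image": BLOCK}),
    RegexRule(r"^https?://example\.com/", r"^https?://cdn\.example\.net/",
              {"script": WHITELIST}),
]


def existing_check(page_url, request_url, req_type):
    """Stand-in for the extension's existing hostname-based rules."""
    return BLOCK  # pretend we are in block-all mode


@lru_cache(maxsize=4096)
def evaluate(page_url, request_url, req_type):
    """Regex rules first; anything they ignore falls through."""
    for rule in RULES:
        if (re.search(rule.page_re, page_url)
                and re.search(rule.request_re, request_url)):
            verdict = rule.permissions.get(req_type, IGNORE)
            if verdict != IGNORE:
                return verdict
    return existing_check(page_url, request_url, req_type)


print(evaluate("http://example.com/a", "http://cdn.example.net/lib.js", "script"))
```

    A type that a matching rule does not mention is treated as "ignore", so the request is decided by the existing checks, which is the fall-through behaviour described in the post.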

    Advanced users could handle the concept of having two types of rules and also understand how this would affect what your existing user interface displays. Perhaps something as simple as one footnote line of text at the bottom of your existing UI: "Advanced rules are enabled, N Matches" would be sufficient to keep advanced users informed. They could identify what needs to be blocked through the network panel, and check the console output to see which regex rules weren't ignored. Since this is meant for advanced users you could, I presume, entirely hide this functionality from basic users when it isn't configured. Firefox extensions would use hidden preferences that advanced users would manipulate via about:config. It would be nice to have an editor for these rules, and even the button to bring that up could also be hidden via the sooper secret mechanism to enable advanced rules.

    This sounds low impact all around but only you would know how it, or something like it, would mesh with what you have. Just a thought that came to me, hope you don't mind.
     
    Last edited: Jan 13, 2014
  7. moontan

    moontan Registered Member

    Joined:
    Sep 11, 2010
    Posts:
    3,931
    Location:
    Québec
    soon the content and the sludge will be so intertwined it will not be possible to separate them.

    i see a day soon where unfortunately it will be almost impossible to use something like NoScript or HTTPSB.
    ----
    i agree, Ray should get all the praise he deserves. :thumb:
     
    Last edited: Jan 13, 2014
  8. tlu

    tlu Guest

    But doesn't a concatenated list have some advantages? With several big hosts files integrated, a zillion duplicate entries are inevitable. In order to avoid that, I'm using a slightly modified version of this script, which downloads several hosts files, removes duplicates and concatenates those lists. Wouldn't this lead to a smaller footprint for HTTPSB and, possibly, to performance gains?
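    The script tlu refers to is not shown in the thread, so the following is only a minimal sketch of the general technique (parse several hosts files, drop duplicates, emit one list). It works on text that has already been downloaded; the function names and the `0.0.0.0` sink address are assumptions made for the example.

```python
def parse_hosts(text):
    """Yield hostnames from hosts-file text, skipping comments and blanks."""
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        # Typical entries look like "0.0.0.0 ads.example.com".
        for host in fields[1:]:
            yield host.lower()


def merge_hosts(*texts, sink="0.0.0.0"):
    """Concatenate several hosts files, keeping the first copy of each host."""
    seen = set()
    merged = []
    for text in texts:
        for host in parse_hosts(text):
            if host != "localhost" and host not in seen:
                seen.add(host)
                merged.append(f"{sink} {host}")
    return merged


list_a = "127.0.0.1 localhost\n0.0.0.0 ads.example.com # ad server\n0.0.0.0 tracker.example.net\n"
list_b = "# another list\n127.0.0.1 ADS.example.com\n0.0.0.0 stats.example.org\n"

merged = merge_hosts(list_a, list_b)
print("\n".join(merged))
```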

    Another thing that came to my mind: On your comparison page you link to a discussion about Disconnect where someone asked to include Request Policy in your comparison. Yes, Request Policy doesn't exist for Chrome, but there is KISS Privacy. You might want to test that, too.
     
    Last edited by a moderator: Jan 13, 2014
  9. Dave0291

    Dave0291 Registered Member

    Joined:
    Nov 17, 2013
    Posts:
    553
    Location:
    U.S
    Aren't most hosts files easily searched through? If so, and of course depending on how gorhill implements the files in the database, wouldn't something like Hostsman make it pretty easy to find and delete duplicates? That's one of the features of that particular program after all.
     
  10. tlu

    tlu Guest

    Well, with the script which I mentioned Hostsman is superfluous, IMO. And gorhill is running Linux anyhow ;)
     
  11. kupo

    kupo Registered Member

    Joined:
    Jan 25, 2011
    Posts:
    1,121
    Hello, is using CsFire together with HTTPSB redundant?
     
  12. gorhill

    gorhill Guest

    No, internally a map (Object) is used, and this takes care of collisions. Maps are blazingly fast in Chromium-based browsers. The downside is memory footprint. So far I have decided that performance is far more important than size, given that requests are handled in real time. I reduced the memory footprint a lot early in HTTPSB development, and more can probably be done, but at the cost of performance. For now, I don't see memory footprint as an issue, especially considering that many users appear fine with AdBlock, which consumes almost three times as much memory (well, on Chromium at least).

    But when the pile of bugs shrinks to a point where I am able to do (fun) investigative work, I will be able to try stuff; there are many ideas I want to try in there code-wise. In publicsuffixlist.js I did use a mixed approach (map/indexOf/binary-search) to reach a sweet spot between great performance and small memory footprint, and it works very well. I want to try the same for preset blocked hosts (which is by far the largest memory consumer in HTTPSB).
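    To illustrate the two lookup styles mentioned above, here is a small Python sketch. It is not the code from publicsuffixlist.js or HTTPSB; it only contrasts a hash-based container, where duplicate hosts collapse on insert, with a sorted list searched by bisection. The actual memory and speed trade-off depends on the JavaScript engine and is not reproduced here.

```python
from bisect import bisect_left

# Entries as they might arrive from several overlapping lists.
hosts = [
    "ads.example.com",
    "tracker.example.net",
    "ads.example.com",      # duplicate from a second list
    "stats.example.org",
]

# Map-style storage: duplicates collapse on insert, lookups are hash-based.
blocked_map = set(hosts)

# Alternative: one sorted list of unique hosts, searched by bisection.
blocked_sorted = sorted(blocked_map)


def in_sorted(host):
    """Binary search for host in the sorted list."""
    i = bisect_left(blocked_sorted, host)
    return i < len(blocked_sorted) and blocked_sorted[i] == host


print(len(blocked_map), in_sorted("tracker.example.net"), in_sorted("example.com"))
```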

    EDIT: Also, merging everything into one file would prevent the user from picking a specific list. Users might have become accustomed to using this list or that one, and these users will appreciate that the extension just provides them as is. And then there is the question of the license etc. of the lists. I'd rather keep it simple and just mirror these lists separately locally.

    I didn't know about KISS Privacy, I will give it a look.
     
    Last edited by a moderator: Jan 13, 2014
  13. gorhill

    gorhill Guest

    I didn't know about CsFire, so I looked into it.

    From a quick glance at the log, it does appear that whatever CsFire does, HTTPSB does too; hopefully some users more knowledgeable about CsFire can correct/confirm. HTTPSB doesn't have any special treatment for redirects, but when working in block-all/allow-exceptionally mode, whatever results from the redirection is blocked by default.

    EDIT: One thing I like in the extension, and which I have meant to enter as an improvement issue, is a log with time-stamped entries of the meaningful things HTTPSB does (not to be confused with the existing HTTP request log). Examples: "Cookie '{name}' was deleted", "Cookie '{name}' could not be deleted because of scope '{scope_name}'", "Scope '{scope_name}' auto-created", "Domain '{domain_name}' auto-whitelisted", etc.
     
    Last edited by a moderator: Jan 13, 2014
  14. tlu

    tlu Guest

    Beyond what it says on its homepage there is a detailed scientific paper.
     
  15. gorhill

    gorhill Guest

    I was just looking at the code and I thought it was very well written, clearly by a skilled and experienced programmer, and built with portability in mind from the ground up. I really need to read this paper to educate myself.
     
  16. gorhill

    gorhill Guest

    Ok, I checked, and CsFire and HTTPSB are not compatible: one of them might not be able to do what it is expected to do. This is a by-design limitation of the Chrome API:

    "Only one extension is allowed to redirect a request or modify a header at a time"
    HTTPSB strips cookie information from the outgoing headers for blocked cookies. It also adds a CSP directive to incoming headers to prevent execution of blocked inline JavaScript.
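    The two header modifications described above can be sketched as plain functions over (name, value) pairs. This is a simplified model, not the extension's code and not the Chrome API. The exact CSP directive HTTPSB adds is not stated in the thread; `script-src *` is used here as an assumption, relying on the fact that a `script-src` directive without `'unsafe-inline'` (or a nonce/hash) disallows inline scripts.

```python
def strip_cookie_header(request_headers, cookies_blocked):
    """Return outgoing headers without Cookie when cookies are blocked."""
    if not cookies_blocked:
        return list(request_headers)
    return [(name, value) for name, value in request_headers
            if name.lower() != "cookie"]


def add_inline_script_csp(response_headers, inline_scripts_blocked):
    """Return incoming headers with a CSP directive that forbids inline scripts."""
    if not inline_scripts_blocked:
        return list(response_headers)
    # Assumed directive: no 'unsafe-inline', so inline scripts do not run.
    return list(response_headers) + [("Content-Security-Policy", "script-src *")]


outgoing = [("Host", "example.com"), ("Cookie", "id=123"), ("Accept", "*/*")]
incoming = [("Content-Type", "text/html")]

print(strip_cookie_header(outgoing, cookies_blocked=True))
print(add_inline_script_csp(incoming, inline_scripts_blocked=True))
```

    Because both changes rewrite headers, they fall under the limitation quoted above: another extension that also modifies headers on the same request can conflict with them.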
     
    Last edited by a moderator: Jan 13, 2014
  17. tlu

    tlu Guest

    This might be the reason why I had always got an error message by Lastpass that it couldn't connect to the login server. The CsFire author has added a rule to solve that problem but it didn't help. That's why I've disabled CsFire in the meantime.
     
  18. wat0114

    wat0114 Registered Member

    Joined:
    Aug 5, 2012
    Posts:
    4,100
    Location:
    Canada
    It's a disturbing thought but it could happen :( The denver post site (I'm a Peyton Manning fan :D ) is one of several I've come across where googleapis.com has to be allowed to render some of the more important content. Even scorecardresearch in the site has to be allowed as well.
     
  19. tlu

    tlu Guest

    In all cases where Google services are involved I'm building rules with a domain-level or site-level scope. That's fortunately very easy with HTTPSB.
     
  20. wat0114

    wat0114 Registered Member

    Joined:
    Aug 5, 2012
    Posts:
    4,100
    Location:
    Canada
    Absolutely same here. Raymond's HTTPSB has been a godsend for Chrome/Chromium users :)
     
  21. tlu

    tlu Guest

    Absolutely. And he may even be godlike since considering his activity in the past weeks/months I wonder if he ever sleeps :D :D :D
     
  22. wat0114

    wat0114 Registered Member

    Joined:
    Aug 5, 2012
    Posts:
    4,100
    Location:
    Canada
    Hi Raymond,

    HTTPSB has the nice option to delete non-blocked session cookies after xx minutes. Is it possible to add an option to keep selected cookies from being deleted? For example, I don't want Lastpass cookies to be deleted, otherwise I need to log back in after I close and reopen the browser.
     
  23. gorhill

    gorhill Guest

    If this occurs when you close/open the browser, then it is not HTTPSB. Do you have the option "Keep local data only until I quit my browser" in the privacy settings section of your browser? Otherwise I will have to investigate why this happens.
     
  24. wat0114

    wat0114 Registered Member

    Joined:
    Aug 5, 2012
    Posts:
    4,100
    Location:
    Canada
    I use the extension Click & Play and I just today discovered the option "Cookies to Keep" :oops: , so I've placed lastpass.com in there which has resolved that problem. However, I was under the assumption, I'm guessing wrongly, that the option in httpsb "Delete non-blocked session cookies xx minutes after last time they have been used" would delete the lastpass session cookie during my browsing sessions. Will this not happen?
     
  25. gorhill

    gorhill Guest

    Not if LastPass keeps sending you the session cookie before the time elapses: "after last time they have been used".

    Typically if you are busy using a web site which requires login, the session cookie should be kept around as long as you are active on the web site. I wouldn't be happy if my session was terminated while in the middle of form submission or whatever.

    If you close the page though, the session cookie won't be updated anymore, hence it will be removed after x minutes -- that is, unless you reopen the web site before the interval expires.
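    The "x minutes after last time they have been used" behaviour described above can be modelled with a last-used timestamp per cookie. The sketch below is an illustration under that assumption, not HTTPSB's implementation; the class and method names are invented, and time is passed in explicitly (in seconds) to keep the example deterministic.

```python
class SessionCookieTracker:
    """Forget session cookies that have been idle for too long."""

    def __init__(self, max_idle_minutes):
        self.max_idle = max_idle_minutes * 60  # seconds
        self.last_used = {}                    # cookie key -> last-use time

    def touch(self, cookie, now):
        """Record that the cookie was just sent or received."""
        self.last_used[cookie] = now

    def sweep(self, now):
        """Remove and return cookies idle for at least max_idle seconds."""
        expired = [c for c, t in self.last_used.items()
                   if now - t >= self.max_idle]
        for c in expired:
            del self.last_used[c]
        return expired


tracker = SessionCookieTracker(max_idle_minutes=15)
tracker.touch("lastpass.com/session", now=0)
tracker.touch("lastpass.com/session", now=600)   # still active on the site
still_active = tracker.sweep(now=900)            # only 5 minutes idle
after_closing = tracker.sweep(now=1500)          # 15 minutes since last use
print(still_active, after_closing)
```

    As long as the site keeps using the cookie, each `touch` resets the clock; once the page is closed, no further `touch` happens and the next sweep after the interval removes it.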
     