AI/Data Mining to de-anonymise using style, Bitcoin and time cues.

Discussion in 'privacy general' started by deBoetie, Sep 6, 2017.

  1. deBoetie

    deBoetie Registered Member

    Joined:
    Aug 7, 2013
    Posts:
    1,832
    Location:
    UK
    A team led by Rebecca Portnoff has used AI/data mining techniques to identify sex traffickers on Backpage.

    https://www.newscientist.com/articl...rail-to-find-and-help-sextrafficking-victims/

    https://dl.acm.org/citation.cfm?doid=3097983.3098082

    The tool developed and used by the researchers combined three steps:
    • identified the style in which ads are written, using AI
    • searched the Bitcoin blockchain to identify the wallet associated with the ad
    • correlated this with the timestamp of the ad posting
    It correctly identified about 90% of ads that had the same author, with a false positive rate of only 1%.
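
    To make the idea concrete, here is a rough sketch of how a style cue and a time cue could be computed. This is not the researchers' method (no details of their model are given here). Character n-gram cosine similarity is a generic stylometry stand-in, and the function names, the payment record layout and the 60-second window are all assumptions for illustration.

```python
from collections import Counter
from math import sqrt


def char_ngrams(text, n=3):
    """Count overlapping character n-grams in lower-cased, whitespace-normalised text."""
    text = " ".join(text.lower().split())
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def cosine(a, b):
    """Cosine similarity between two sparse n-gram count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def candidate_payments(ad_time, payments, window=60):
    """Return payments whose timestamp lies within `window` seconds of the ad posting.

    `payments` is a hypothetical list of dicts with "wallet" and "time" keys
    (times as Unix seconds); the window size is an arbitrary assumption.
    """
    return [p for p in payments if abs(p["time"] - ad_time) <= window]
```

    Two ads with a high style similarity whose postings both line up with payments from the same wallet would be a much stronger signal than either cue alone, which is presumably why the cues are combined.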

    The relatively new aspect of this is the use of AI to characterise "style", in conjunction with more traditional cues. I've written in the past about how individual use of pronouns is diagnostic. I imagine (there are no further details) that the way this AI works is more general.
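
    As a small illustration of the pronoun point, a per-author pronoun profile can be computed in a few lines. This is only a sketch: the pronoun list and the function name are my own choices, and a real stylometric system would use far more features than this.

```python
import re
from collections import Counter

# Illustrative, deliberately incomplete list of English pronouns (an assumption).
PRONOUNS = ("i", "me", "my", "we", "us", "our", "you", "your",
            "he", "him", "his", "she", "her", "it", "its",
            "they", "them", "their")


def pronoun_profile(text):
    """Relative frequency of each listed pronoun among all words in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(w for w in words if w in PRONOUNS)
    total = len(words) or 1  # avoid division by zero on empty input
    return {p: counts[p] / total for p in PRONOUNS}
```

    Comparing such profiles across a few thousand words of text from two accounts gives a crude measure of how alike their pronoun habits are.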

    Sadly, this class of technique is increasingly likely to be used in less laudable de-anonymisation attacks. This might give people pause for thought about taking the necessary care over opsec and segregation - as Mirimir has posted.
     
  2. mirimir

    mirimir Registered Member

    Joined:
    Oct 1, 2011
    Posts:
    9,252
    Yes, this is scary stuff. My meatspace persona does very little online in English. Mirimir, in case you haven't noticed, is a kludge of various variants of English that I've been exposed to. Plus some Mexican Spanish. And other personas have used other languages that I know well enough. But there's a limit to that, of course.

    I wonder how well those AI approaches work across languages.
     
  3. Palancar

    Palancar Registered Member

    Joined:
    Oct 26, 2011
    Posts:
    2,402

    I am guessing writing style is pretty idiosyncratic and differing languages only slow the process of determination down a bit.
     
  4. mirimir

    mirimir Registered Member

    Joined:
    Oct 1, 2011
    Posts:
    9,252
    Well, there is no "y'all" in my native language ;)
     
  5. RockLobster

    RockLobster Registered Member

    Joined:
    Nov 8, 2007
    Posts:
    1,812
    mirimir, it's "there ain't no y'all in my language" :)
     
  6. mirimir

    mirimir Registered Member

    Joined:
    Oct 1, 2011
    Posts:
    9,252
    Well, there is no "ain't" either ;) But yes, point taken :)

    I especially love "all y'all". I mean, one can say "every one of you", but this is so much shorter. And "some y'all", that's just awesome.
     
  7. deBoetie

    deBoetie Registered Member

    Joined:
    Aug 7, 2013
    Posts:
    1,832
    Location:
    UK
    Gorgeous! I guess all the language/regional/dialect variants, as well as characteristic use of slang/idiom/phrases, are quite diagnostic - let alone in-group slang. For example, there's a group of people who will know the phrase about humans - "mostly harmless" - from Douglas Adams.

    I do like the idea of running text through multiple translations, though I guess that would have to avoid cloud services.
     
  8. mirimir

    mirimir Registered Member

    Joined:
    Oct 1, 2011
    Posts:
    9,252
    If you look in junk shops -- or in my case, junk boxes -- you can find pre-Internet translation software. With local databases. You may need to run an XP (or even 95) VM, but hey ;)
     