Even Anonymous Coders Leave Fingerprints

Discussion in 'privacy general' started by ronjor, Aug 10, 2018.

  1. ronjor

    ronjor Global Moderator

    Joined:
    Jul 21, 2003
    Posts:
    164,083
    Location:
    Texas
    Louise Matsakis 08.10.18
     
  2. mirimir

    mirimir Registered Member

    Joined:
    Oct 1, 2011
    Posts:
    9,252
    I'm not surprised that it also applies to code. However, so far it seems to be language dependent:
    Also interesting is that spoofing is possible. And that implies that code stylometry can be readily defeated.
    Also, regarding prose in different languages, they say:
    OK, so I'm somewhat active online as my meatspace identity. But in my native tongue, and not in English. But I certainly don't just translate stuff. It's almost like I'm a different person when I'm thinking in American English. I draw on usage and slang that I've picked up from friends and coworkers, from many places. From "all y'all" (Southern US) to "gobsmacked" (British) to "take the decision" (Mexican). And that's totally absent from prose in my native tongue, so I doubt that stylometry would link it to Mirimir.
     
  3. deBoetie

    deBoetie Registered Member

    Joined:
    Aug 7, 2013
    Posts:
    1,832
    Location:
    UK
    I suspect this would be programming language and IDE dependent. For the coding I do, the stylistic aspects are in fact constrained by the IDE, architectural, component, library and style guidelines imposed by the projects - so relatively little to go on there, particularly when others are doing the same thing. I'd view what I do as more being a librarian and hooker-upper rather than a coder.

    Of course, stylometry on natural free text is likely to be far more individual, for example in the use of pronouns.

    As usual, the article is somewhat coy about quantifying false positives, and obviously its reasonable success rate holds only for a known pool of candidates whose previous work is already available.
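
    To make the known-pool point concrete, here is a minimal sketch. It is not the researchers' method: the four layout features and the names `layout_features`, `distance` and `attribute` are invented for illustration, and real code stylometry uses far richer features. It only shows that a nearest-match attribution over a fixed pool always names somebody, even when the real author is not in the pool.

```python
import math
import re


def layout_features(source):
    """Crude, hypothetical style features: indentation and naming habits."""
    lines = [ln for ln in source.splitlines() if ln.strip()]
    n = max(len(lines), 1)
    tab_indented = sum(ln.startswith("\t") for ln in lines) / n
    avg_len = sum(len(ln) for ln in lines) / n
    names = re.findall(r"[A-Za-z_][A-Za-z0-9_]*", source)
    m = max(len(names), 1)
    snake = sum("_" in name for name in names) / m
    camel = sum(bool(re.search(r"[a-z][A-Z]", name)) for name in names) / m
    # Scale line length so it does not swamp the ratio features.
    return [tab_indented, avg_len / 100.0, snake, camel]


def distance(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def attribute(sample, known_pool):
    """Return (closest known author, distance). Never answers 'nobody'."""
    target = layout_features(sample)
    scored = (
        (author, distance(target, layout_features(code)))
        for author, code in known_pool.items()
    )
    return min(scored, key=lambda pair: pair[1])


# Hypothetical pool of two known authors with previous work on file.
pool = {
    "alice": "def load_user_data(file_path):\n"
             "    raw_text = open(file_path).read()\n"
             "    return raw_text.split(',')\n",
    "bob": "function loadUserData(filePath) {\n"
           "\tvar rawText = readFile(filePath);\n"
           "\treturn rawText.split(',');\n"
           "}\n",
}

# Code by someone who is NOT in the pool still gets pinned on a pool member.
outsider = "int main(void)\n{\n  int total = 0;\n  return total;\n}\n"
guess, dist = attribute(outsider, pool)
print(guess, round(dist, 3))
```

    Unless a sketch like this is extended with some distance threshold for answering "none of the above", every sample from outside the pool is by construction a false positive, which is why the missing numbers matter.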
     
  4. mirimir

    mirimir Registered Member

    Joined:
    Oct 1, 2011
    Posts:
    9,252
    They do note that those who code by searching and combining stuff are harder to identify than those who write from scratch.
     