'Scrapers' Dig Deep for Data on the Web

Discussion in 'privacy general' started by Dermot7, Oct 16, 2010.

Thread Status:
Not open for further replies.
  1. Dermot7

    Dermot7 Registered Member

    Joined:
    Dec 20, 2009
    Posts:
    3,196
    Location:
    Surrey, England.
    "At 1 a.m. on May 7, the website PatientsLikeMe.com noticed suspicious activity on its "Mood" discussion board. There, people exchange highly personal stories about their emotional disorders, ranging from bipolar disease to a desire to cut themselves.

    It was a break-in. A new member of the site, using sophisticated software, was "scraping," or copying, every single message off PatientsLikeMe's private online forums."

    http://online.wsj.com/article/SB10001424052748703358504575544381288117888.html
     
  2. hierophant

    hierophant Registered Member

    Joined:
    Dec 18, 2009
    Posts:
    854
    This isn't exactly news. I've used "scraper" software in legitimate research for years -- e.g., pulling product databases from corporate websites and from archives on the Wayback Machine. The software just automates what any user could do manually.
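
    To give a rough idea of what "automating what a user could do manually" looks like, here is a minimal Python sketch using only the standard library. Everything in it is made up for illustration: the sample page, the ProductRowParser class and the scrape_table function are hypothetical, and a real scraper would also have to fetch pages, follow links and respect the site's terms.

```python
from html.parser import HTMLParser


class ProductRowParser(HTMLParser):
    """Collects the text of every <td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []          # finished rows, each a list of cell strings
        self._row = None        # row currently being read, or None
        self._in_cell = False   # True while inside a <td>

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._in_cell = True
            self._row.append("")

    def handle_data(self, data):
        # Only keep text that appears inside a table cell.
        if self._in_cell:
            self._row[-1] += data

    def handle_endtag(self, tag):
        if tag == "td" and self._in_cell:
            self._row[-1] = self._row[-1].strip()
            self._in_cell = False
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def scrape_table(html):
    """Return the table cells in `html` as a list of rows (hypothetical helper)."""
    parser = ProductRowParser()
    parser.feed(html)
    parser.close()
    return parser.rows


# Made-up sample page standing in for a product listing.
SAMPLE_PAGE = """
<table>
  <tr><td>Widget A</td><td>9.99</td></tr>
  <tr><td>Widget B</td><td>14.50</td></tr>
</table>
"""

if __name__ == "__main__":
    for row in scrape_table(SAMPLE_PAGE):
        print(row)
```

    This does nothing a person couldn't do by copying the table into a spreadsheet by hand; the script just does it for every page without getting bored.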
     
  3. CloneRanger

    CloneRanger Registered Member

    Joined:
    Jan 4, 2006
    Posts:
    4,833
    @ hierophant

    It might not be "news" as such, but it's BAD news :eek: There can only be one reason why they would do that, and it's NOT good :thumbd:
     
  4. hierophant

    hierophant Registered Member

    Joined:
    Dec 18, 2009
    Posts:
    854
    OK, I get your point.

    And yet, people posted this information to a public forum. Although it may have been "members only", PatientsLikeMe.com obviously didn't do a very good job of screening their members.

    If that's so, it would have been prudent to 1) ensure that posters are anonymous -- or, at least, that they know they should be -- and 2) carefully screen new members. If they didn't, they were irresponsible.

    It was a mole, and not a "break-in". And the forums weren't "private" -- obviously.

    Also, I gotta wonder what PatientsLikeMe has been doing with the data that they've collected. Perhaps they've been selling aggregated/anonymized data? They're certainly using it on their website, I see. And perhaps I'm just being paranoid.
     
  5. caspian

    caspian Registered Member

    Joined:
    Jun 17, 2007
    Posts:
    2,301
    Location:
    Oz
    Wow that is pretty amazing. You are full of surprises.

    I have gone to the Wayback Machine to look at old websites. But there is usually not much left to see, unfortunately.
     
  6. hierophant

    hierophant Registered Member

    Joined:
    Dec 18, 2009
    Posts:
    854
    I do work from time to time ;)

    Sometimes you luck out and find entire websites. Older websites, with relatively static link structures, actually tend to survive best.
     