'Scrapers' Dig Deep for Data on the Web

Discussion in 'privacy general' started by Dermot7, Oct 16, 2010.

Thread Status:
Not open for further replies.
  1. Dermot7

    Dermot7 Registered Member

    Joined:
    Dec 20, 2009
    Posts:
    3,430
    Location:
    Surrey, England.
    "At 1 a.m. on May 7, the website PatientsLikeMe.com noticed suspicious activity on its "Mood" discussion board. There, people exchange highly personal stories about their emotional disorders, ranging from bipolar disease to a desire to cut themselves.

    It was a break-in. A new member of the site, using sophisticated software, was "scraping," or copying, every single message off PatientsLikeMe's private online forums."

    http://online.wsj.com/article/SB10001424052748703358504575544381288117888.html
     
  2. hierophant

    hierophant Registered Member

    Joined:
    Dec 18, 2009
    Posts:
    854
    This isn't exactly news. I've used "scraper" software in legitimate research for years -- e.g., getting product databases from corporate websites and archives on the Wayback Machine. The software just automates what any user could do manually.
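
    To make "automates what any user could do manually" concrete, here is a minimal sketch of the core step in most scrapers: pulling every link out of a page so it can be visited in turn. This is not the software mentioned above. The names `LinkCollector`, `extract_links`, and `SAMPLE`, as well as the example.com URL, are made up for illustration, and the sketch works on an HTML string rather than fetching anything over the network.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag, resolved against a base URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Turn relative links into absolute ones, as a browser would.
                self.links.append(urljoin(self.base_url, value))


def extract_links(html, base_url):
    """Return all absolute link targets found in an HTML string."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    parser.close()
    return parser.links


# Hypothetical page fragment standing in for a product listing.
SAMPLE = (
    '<ul>'
    '<li><a href="/products/1">Widget</a></li>'
    '<li><a href="products/2">Gadget</a></li>'
    '</ul>'
)

if __name__ == "__main__":
    for link in extract_links(SAMPLE, "http://example.com/catalog/"):
        print(link)
```

    A real scraper would repeat this over each collected link, which is the same thing a person does by clicking through a site page by page, only faster.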
     
  3. CloneRanger

    CloneRanger Registered Member

    Joined:
    Jan 4, 2006
    Posts:
    4,978
    @ hierophant

    It might not be "news" as such, but it's BAD news :eek: There can only be one reason why they would do that, and it's NOT good :thumbd:
     
  4. hierophant

    hierophant Registered Member

    Joined:
    Dec 18, 2009
    Posts:
    854
    OK, I get your point.

    And yet, people posted this information to a public forum. Although it may have been "members only", PatientsLikeMe.com obviously didn't do a very good job of screening their members.

    If the forum really was members-only, it would have been prudent to 1) ensure that posters are anonymous -- or, at least, that they know they should be -- and 2) carefully screen new members. If they didn't, they were irresponsible.

    It was a mole, and not a "break-in". And the forums weren't "private" -- obviously.

    Also, I gotta wonder what PatientsLikeMe has been doing with the data that they've collected. Perhaps they've been selling aggregated/anonymized data? They're certainly using it on their website, I see. And perhaps I'm just being paranoid.
     
  5. caspian

    caspian Registered Member

    Joined:
    Jun 17, 2007
    Posts:
    2,363
    Location:
    Oz
    Wow that is pretty amazing. You are full of surprises.

    I have gone to the Wayback Machine to look at old websites. But there is usually not much left to see, unfortunately.
     
  6. hierophant

    hierophant Registered Member

    Joined:
    Dec 18, 2009
    Posts:
    854
    I do work from time to time ;)

    Sometimes you luck out and find entire websites. Older websites, with relatively static link structures, actually tend to survive best.
     