Don't trust the AV-Comparatives results

Discussion in 'other anti-virus software' started by dr pan k, Mar 9, 2010.

Thread Status:
Not open for further replies.
  1. Kees1958

    Kees1958 Registered Member

    Joined:
    Jul 8, 2006
    Posts:
    5,857
    Guys,

     I know I started a thread which laughed at the stupidity of SOME of the YouTube testers (trying to shoot themselves in the foot and then complaining that they managed to do so).

     But let's give Dr Pan_k (intriguing nickname: am I supposed to guess the _ ?) some credit. An enthusiast made the same observation as an expert. The best answer for me was that the AVs are used with high heuristic settings. Some heuristics go crazy when they find just one (usually suspicious) characteristic of an executable, so that explains it for me.

     To the OP: next time, better to state that you noticed something and ask for others' experiences or an explanation. The responses are usually more open and friendly when you do not attach a conclusion to such a small test sample.

     Regards, Kees
     
  2. dr pan k

    dr pan k Registered Member

    Joined:
    Nov 22, 2007
    Posts:
    204
     Yesterday night I checked what are probably the most highly regarded AV comparative test results of 2009, and the gold winner on false positives had as few as 20 FPs out of a huge selection of samples. It's simply hard to believe those results when you bump into several FPs for that same product in everyday use of VirusTotal. How can they have only 20 FPs when I get several just by routinely checking the stuff I download? I got some innocuous files tagged as trojans and worms.

     Maybe AutoIt scripts in general are to blame, or maybe some of the engines are tuned low or high and in a real environment they perform a lot better than on the VT website. Either way, after this I simply don't trust the findings of the comparatives the way I used to.

     As for the nick, it actually represents who I am through a pronunciation game and pokes fun at the social role in it.

     Thanks to all for the replies.
     
  3. dawgg

    dawgg Registered Member

    Joined:
    Jun 18, 2006
    Posts:
    818
     Believe them or not, IMO they're accurate for the specific set of samples and settings used to perform the test at that specific time.
     Again, not necessarily the same settings and engine as those on VT.

     You just won't see the same results if you're going through newer malware samples, let's say, found in the last 24 hours. You need to look at dynamic tests to get a more realistic view of this - although the AVs also use their other protection modules in those tests rather than only the scanner, it's still the most realistic IMO.

     The details are there with the statistics; you need to read and analyse what the statistics show, not only the percentages.
     
  4. dr pan k

    dr pan k Registered Member

    Joined:
    Nov 22, 2007
    Posts:
    204
     The point is that I didn't go over the net to gather fresh-out-of-the-box samples. I simply checked the files as I was downloading them for personal use. This means they were more than a few days old, and I didn't try to manipulate them in any way; they weren't zipped or anything. If 98% detection with fewer than 20 or 30 FPs were true, or even close to reality, this wouldn't have happened.

     What you suggest is that I shouldn't take the static tests seriously and should pay attention only to the dynamic ones.
     In the dynamic test several products had more than 90% detection with zero or one FP. Though the data isn't as detailed as in other tests, this still isn't even close to my personal experience.

     And one last thing: if I'm not wrong, in this specific test the team of experts used some 100 samples. Statistically speaking, my 20 or so isn't so far away, though they concern only one category of apps.

     PS: I believe the specific AV testing team is doing a pretty decent job, since they usually publish complete stats; that was not the case for the dynamic test. I would be very interested to read what someone who's directly involved in this kind of testing has to say.
     
  5. kwismer

    kwismer Registered Member

    Joined:
    Jan 4, 2008
    Posts:
    240
     The test uses a more representative sample of all files, rather than just AutoIt scripts. I mentioned this before, but I guess the significance was overlooked. Your sample selection is incredibly biased when you only use AutoIt scripts. AutoIt scripts may be (I can't conclusively say they are) more prone to triggering false alarms than other file types. When you only use one type of file, you're virtually guaranteed to get a different false positive rate (either higher or lower), unless you happen by chance to choose a file type that magically has exactly the same false positive rate as the average.

     While it's true that all comparative tests need to be taken with a grain of salt, your experiences, and more importantly the interpretation of those experiences that you've presented here, are far more suspect.
     
  6. kwismer

    kwismer Registered Member

    Joined:
    Jan 4, 2008
    Posts:
    240
     Yes, it would. I'm sorry to say, but you are repeatedly demonstrating a complete lack of understanding of statistics and sampling bias.

     So I'm going to explain it in terms that everyone can understand.

     Let's say you have a field with 100 sheep in it: 50 of the sheep are black and 50 are white. The black sheep are all on the west side of the field and the white sheep are all on the east side. If I go to the west side and start counting out 20 sheep, I'm going to find only black ones. Should I then question the assertion that only 50% of the sheep are black when my experience shows 100% are black? No, because my sample is too limited - not necessarily in size, but in other factors that affect how closely (or not) my sample matches the entire population.

     By only using AutoIt scripts you have done precisely the same thing. False positives are not uniformly distributed across all file types - some get more than others, and by only using a single file type your sample is incredibly biased instead of being representative.
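     A minimal sketch of the same idea in Python (the field layout and counts are hypothetical, simply mirroring the sheep example above): counting 20 sheep from one side of the field gives a very different estimate than drawing 20 at random from the whole field.

     Code:
     import random

     # Hypothetical field from the example above: 100 sheep,
     # 50 black on the west side, 50 white on the east side.
     field = [("west", "black")] * 50 + [("east", "white")] * 50

     def fraction_black(sample):
         return sum(1 for _, colour in sample if colour == "black") / len(sample)

     # Biased sample: 20 sheep counted only on the west side.
     west_only = [s for s in field if s[0] == "west"]
     biased = random.sample(west_only, 20)

     # Representative sample: 20 sheep drawn at random from the whole field.
     representative = random.sample(field, 20)

     print("black fraction, west-side-only sample:", fraction_black(biased))          # always 1.0
     print("black fraction, whole-field sample:   ", fraction_black(representative))  # usually near 0.5, varies run to run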
     
  7. dr pan k

    dr pan k Registered Member

    Joined:
    Nov 22, 2007
    Posts:
    204
     Your example has nothing to do with the whole question. My post speaks clearly of AutoIt scripts and doesn't take other forms of possible malware into consideration. I am not running some AV comparative, and you simply don't want to get it. The AVs are tested against AutoIt scripts too, so the final results contain percentages for both AutoIt and non-AutoIt samples. I am not comparing my personal experience to a complete test. I simply realised, after a random check - because that is what it was - that the detection and FP percentages are far from being close to the percentages published, generally speaking.

     As for my stats knowledge, I prefer not to make any comments. You or anybody else can PM me and I will be delighted to give you some of the files I used so you can try them out yourselves.
     
  8. kwismer

    kwismer Registered Member

    Joined:
    Jan 4, 2008
    Posts:
    240
     The example was to explain why your false positives differ from those in the test. As such, it has nothing to do with any form of possible malware and only pertains to file types. The false positive testing in professional tests uses more than just AutoIt scripts; they use many other file types. Their results are supposed to generalise to the population of all clean files. On average you can expect X% false positives across the entire population, but for any particular sub-population of file types the actual false positive rate may be higher or lower than X%.
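     A quick numeric sketch of that last point (all the rates and proportions below are made up purely for illustration): a very low false positive rate over the whole clean-file population can coexist with a much higher rate inside one small sub-population of file types.

     Code:
     # Hypothetical figures, only to illustrate the averaging effect described above.
     fp_rate_autoit = 0.05     # assume 5% of clean AutoIt-compiled files trip a heuristic
     fp_rate_other  = 0.0001   # assume 0.01% for clean files of every other type
     share_autoit   = 0.002    # assume AutoIt files are 0.2% of all clean files

     overall = share_autoit * fp_rate_autoit + (1 - share_autoit) * fp_rate_other
     print(f"overall FP rate across all clean files:  {overall:.4%}")        # ~0.02%
     print(f"FP rate seen scanning only AutoIt files: {fp_rate_autoit:.2%}") # 5.00%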

     You are comparing your results (your FP results) to those in an AV comparative, and you simply don't want to see that doing so implies your results are comparable to an AV comparative.

     And my example demonstrates why your false positive rate is so different from the ones in the professional tests. Your 'random check' was biased - some would question whether it was even random at all, because it only contained AutoIt scripts. A random number generator that always produces numbers starting with 1 is a rather suspect random number generator.
     
  9. johnyjohn

    johnyjohn Registered Member

    Joined:
    Jan 2, 2010
    Posts:
    126
  10. kwismer

    kwismer Registered Member

    Joined:
    Jan 4, 2008
    Posts:
    240
  11. dr pan k

    dr pan k Registered Member

    Joined:
    Nov 22, 2007
    Posts:
    204
     This is an interesting conclusion from the above-mentioned article:

    If we get rid of static on-demand-tests with their mass of unvalidated samples, the copying of classifications will at least be significantly reduced, test results will correspond more closely to reality (even if that means saying good bye to 99.x% detection rates) and in the end everyone will benefit: the press, the users and of course us as well.
     
  12. NoIos

    NoIos Registered Member

    Joined:
    Mar 11, 2009
    Posts:
    607
    This thread has become a joke!
     
  13. biscuits

    biscuits Registered Member

    Joined:
    Feb 16, 2010
    Posts:
    113
    Dear dr pan k

     Dude, the service that VT provides is not the same as the service the actual AVs give. VT uses the engines and virus signatures, but you must also consider that all AVs have particular settings which VT doesn't have.

     Also, please understand how AV-Comparatives came up with their percentages and what they mean. They are testing a large number of samples of different types.

     Let's put it this way. AV-Comparatives' samples consist of x, y, and z files, while the files you uploaded over the past days are x1, x2, x3 files. In AV-Comparatives' test, an AV product detected the x file as a trojan while it's actually not (an FP); y and z were detected as malware (and they were real malware). That gives the AV product a rating of 66.66%. Now, if all the samples had been x files (which I think is your situation), the AV would most probably get a rating of 0%.
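     A tiny sketch of that arithmetic (the file labels and verdicts are the hypothetical ones from the paragraph above): the rating depends entirely on the mix of files in the sample set.

     Code:
     # Hypothetical verdicts from the example above: True = the AV judged the file correctly.
     mixed_set   = {"x": False, "y": True, "z": True}        # x is clean but flagged as a trojan (FP)
     x_type_only = {"x1": False, "x2": False, "x3": False}   # every sample is the troublesome file type

     def rating(results):
         return 100 * sum(results.values()) / len(results)

     print(f"mixed sample set:       {rating(mixed_set):.2f}%")    # ~66.7% (2 of 3 correct)
     print(f"single-type sample set: {rating(x_type_only):.2f}%")  # 0.00%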

     If I am not mistaken, you are pretty much uploading apps (apps that are very similar) that you downloaded from websites providing illegal copies of those apps, so you are using VT to find out whether the files are clean.
     
  14. dawgg

    dawgg Registered Member

    Joined:
    Jun 18, 2006
    Posts:
    818
     Seems to me we're going round in circles; we have tried to explain it, but have not got far.
     