Don't trust the av comparatives results

dr pan k · Mar 9, 2010

In the last few days i was searching for a certain not so "legit" app, so i tried several uploads at VirusTotal, and what i found is that almost every time there is a huge discrepancy between the results of the various av comparatives and the real thing.
To make it clearer, there are two possible cases: either the various av's have way more false positives, or the av's have way lower detection rates.
I tried about 20 samples and there was not a single file that didn't have at least 4 positives, and at the same time there was not a single file that came up as a threat on all 42 engines, or even close to that.
This means only one thing: the detection rates and the false positive rates declared by vendors, and the results of the various av comparatives, simply can't be true.

ps1: i know that 20 files don't stand as a statistically significant test, yet it's a number of cases indicative of a trend
ps2: i am not referring to a specific av or av comparative

guest · Mar 9, 2010

Did you make the test using VirusTotal?
There are many keygens and cracks that are FPs, but these don't count for their

dr pan k · Mar 9, 2010

guest said:

Did you make the test using VirusTotal?
There are many keygens and cracks that are FPs, but these don't count for their

the files i uploaded to http://www.virustotal.com/ were all written in AutoIt, and they were not cracks or keygens.

NoIos · Mar 9, 2010

Well, although I appreciate your post, I have to say that it brings nothing new. You bring here a personal opinion and I respect that. Maybe others share this opinion too. Personally I find this type of post or thread useless and forgive me...honestly...I find all discussions of this type stupid.

AV tests or comparatives are made with various methods and for various reasons. There are tests that deserve some attention and others that don't.

Bringing here a VirusTotal result for 20 samples...is not a serious indication of anything...cannot reveal a trend...cannot really be taken into consideration. You cannot accuse others of not doing their jobs in a serious manner when you provide proof that completely lacks seriousness.

Forgive me but it is time to face some things like intelligent beings and not like 13 year old kiddies.

bellgamin · Mar 9, 2010

dr pan k said:

ps2: i am not referring to a specific av or av comparative

The OP used undefined test methodology, a pathetically small & undefined malware database, & then used a glittering generality such as that which I have quoted.

This is pure ap-cray -- kiddie stuff, as is well-said by NoIos.

AV tests by professionals are useful but NO ONE says that those tests are perfect -- not even the pros themselves.

AV tests by self-aggrandizing amateurs, such as the one cited by OP, are NOT useful.

Scoobs72 · Mar 9, 2010

dr pan k said:

ps1: i know that 20 files don't stand as a statistically significant test, yet it's a number of cases indicative of a trend

Your 'test' is so far off being statistically significant that in no way whatsoever does it indicate any trend.
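To put a rough number on how little 20 samples can say, here is a minimal sketch that computes an approximate 95% Wilson score interval for a detection rate. The figure of 10 detections out of 20 is a made-up illustration, not a result from this thread, and the function name `wilson_interval` is just a label chosen here. The sketch also assumes the samples are independent and randomly drawn, which a hand-picked set of AutoIt files is not.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Approximate Wilson score interval for a binomial proportion.

    z = 1.96 corresponds to roughly 95% confidence.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    p_hat = successes / n
    denom = 1 + z * z / n
    centre = (p_hat + z * z / (2 * n)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)


# Hypothetical example: an engine flags 10 of 20 uploaded files.
low, high = wilson_interval(10, 20)
print(f"observed 50%, plausible range about {low:.0%} to {high:.0%}")
```

With only 20 files the plausible range spans tens of percentage points, so a figure like this cannot contradict (or confirm) a published detection rate.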

Triple Helix · Mar 9, 2010

FWIW, it has also been said that VT doesn't use the full version of each product and that on most of them Advanced Heuristics are not turned on!

TH

dr pan k · Mar 9, 2010

Triple Helix said:

FWIW, it has also been said that VT doesn't use the full version of each product and that on most of them Advanced Heuristics are not turned on!

TH

this is the only comment i find of any use in explaining the situation described, as far as the low detection rate of some products goes. but what about the false positive issue? many products did give fp's.. maybe AutoIt is assumed to be a language used by a lot of malware? some av expert could enlighten me

Scoobs72 · Mar 9, 2010

dr pan k said:

this is the only comment i find of any use in explaining the situation described, as far as the low detection rate of some products goes

There is no 'situation' to explain. The tests are irrelevant and of no significance at all, as explained in the above posts. Sorry, but that's just the way it is.

dr pan k · Mar 9, 2010

maybe i wasn't clear. this is not a test. i don't believe in "private" testing, but i am starting to doubt the percentages of the official ones

kwismer · Mar 9, 2010

dr pan k said:

In the last few days i was searching for a certain not so "legit" app, so i tried several uploads at VirusTotal, and what i found is that almost every time there is a huge discrepancy between the results of the various av comparatives and the real thing.

you may wish to read this:
http://anti-virus-rants.blogspot.com/2008/12/why-perform-virustotal-based-av-tests.html

i don't know why you did what you did, but from your posts in this thread, very little if anything of what you did was done right.

you've done a test which you think calls into question both the true and false positive rates but haven't made it clear how many of each you tested (did you even know how many truly were malware and how many weren't? did you validate your samples at all?). you used virustotal in a way that even the maintainers of virustotal say you should not do. you threw statistical significance out the window not only by using an incredibly small sample but also by using the most biased sample selection technique i've ever heard of (they were all autoit scripts?).

i'm afraid you still have a lot to learn.

steve1955 · Mar 9, 2010

kwismer said:

i'm afraid you still have a lot to learn.

Doesn't everybody?

Fly · Mar 9, 2010

dr pan k said:

In the last few days i was searching for a certain not so "legit" app, so i tried several uploads at VirusTotal, and what i found is that almost every time there is a huge discrepancy between the results of the various av comparatives and the real thing.
To make it clearer, there are two possible cases: either the various av's have way more false positives, or the av's have way lower detection rates.
I tried about 20 samples and there was not a single file that didn't have at least 4 positives, and at the same time there was not a single file that came up as a threat on all 42 engines, or even close to that.
This means only one thing: the detection rates and the false positive rates declared by vendors, and the results of the various av comparatives, simply can't be true.

ps1: i know that 20 files don't stand as a statistically significant test, yet it's a number of cases indicative of a trend
ps2: i am not referring to a specific av or av comparative

VirusTotal doesn't use the regular/full AVs, and sometimes the online scans are not even up to date with regard to the signatures, so for that reason alone the test is defective.

dr pan k · Mar 9, 2010

@ kwismer: i do have a lot to learn but this is NOT a test. all the files were autoit scripts. i wanted to check the stuff i was downloading from the net before running it on my pc and i came out with the above mentioned results. there was no intention of running any sort of test at all. therefore i didn't post any stats or other info. as i was impressed by the results i decided to post them here. instead of telling me that it is not valid as a test, which is quite obvious, why doesn't someone who knows a bit more explain why there is such a discrepancy?

as for validating the files, there's no reason to do it. this is exactly the point here. if half the av's show a file as a trojan and the other half don't pick it up, and this happens with several files in a row, there's got to be some problem. either these are fp's or some av's have low detection rates. my guess is that both the detection rates and the fp rates are not the ones that are published.

Scoobs72 · Mar 9, 2010

You have had several good answers here as to why the whole basis for your questioning is meaningless. You really need to read and learn from them.

kwismer · Mar 9, 2010

dr pan k said:

@ kwismer: i do have a lot to learn but this is NOT a test.

you're interpreting the results as if it were. if you want to give up thinking of it as a test then what you're going to have to do is stop thinking the results mean anything about the detection capabilities of scanners. if you can't see how those two things are connected then there are deeper problems.

dr pan k said:

all the files were autoit scripts. i wanted to check the stuff i was downloading from the net before running it on my pc and i came out with the above mentioned results. there was no intention of running any sort of test at all. therefore i didn't post any stats or other info. as i was impressed by the results i decided to post them here. instead of telling me that it is not valid as a test, which is quite obvious, why doesn't someone who knows a bit more explain why there is such a discrepancy?

the reason for the discrepancy is the same reason it would be invalid as a test if it were a test. a number of reasons have already been detailed in this thread.

dr pan k said:

as for validating the files, there's no reason to do it. this is exactly the point here.

and this is exactly why you have to throw out any notion that your results have any meaning with regards to the capabilities of products. the ONLY question virustotal can answer is "is this file malware?", and even then the answers aren't necessarily accurate.

dr pan k said:

if half the av's show a file as a trojan and the other half don't pick it up, and this happens with several files in a row, there's got to be some problem.

the only problem is in the way you're interpreting the results outside of virustotal's intended scope.

NoIos · Mar 9, 2010

dr pan k said:

why there is such a discrepancy.

The discrepancy is visible to you because you use the wrong tools and a lot of simplification to reach your conclusions.

All the fellow forum members here have provided some good hints, explanations and guidelines for further reading and thinking. The forum itself is a great source.

best regards,
NoIos

dawgg · Mar 9, 2010

dr pan k said:

I tried about 20 samples and there was not a single file that didn't have at least 4 positives, and at the same time there was not a single file that came up as a threat on all 42 engines, or even close to that.
This means only one thing: the detection rates and the false positive rates declared by vendors, and the results of the various av comparatives, simply can't be true.

1. If you used 20 AutoIt files, then you may constantly be triggering a "loose" generic/packer detection. Some AV products may have these detections in a separate category, e.g. "potentially malicious/unwanted or riskware", with alerts indicating it is not necessarily malicious, so in your example they will probably constantly have this detection. Not to mention, AutoIt files are most often used by those who created them, so they would know whether a file is malicious or not and add it to exclusions if need be. If I download something and find an AutoIt file/script in it, I'd be suspicious.

More generally:

2. Some AVs on VT are - let's say - generous with tagging files as suspicious or something similar. You should really ignore these detections. Sometimes the VT engines are configured to detect more than their AVs do with default settings or settings they are tested with in false-positive tests.

3. For some AVs, the scanning engine (and effectiveness) used on the AV and VT are different. Sometimes AV products detect more than VT shows.

4. If you're talking about AVs claiming to detect (let's say) 98% according to a test, they are not lying, and neither are the testers. The static on-demand tests are full of old malware, which has given AVs time to find the files and add them to their detections. Chances are, if you scan live malware found now, you are dealing with a newer variant, so far fewer AVs will detect it.
No fallacy involved; it is mainly the age of the malware samples that varies, which may explain the discrepancy between what you generally see on VT and what comparatives show.

dr pan k · Mar 9, 2010

dawgg said:

4. If you're talking about AVs claiming to detect (let's say) 98% according to a test, they are not lying, and neither are the testers. The static on-demand tests are full of old malware, which has given AVs time to find the files and add them to their detections. Chances are, if you scan live malware found now, you are dealing with a newer variant, so far fewer AVs will detect it.
No fallacy involved; it is mainly the age of the malware samples that varies, which may explain the discrepancy between what you generally see on VT and what comparatives show.

tnx for the reply dawgg.

as a matter of fact i used virustotal because i was suspicious of the files i downloaded, though the kind of app i was looking for is usually written in autoit.

your 4th paragraph gets right to the point. I, like many others here, read the results of the various av comparatives and personally i find it very hard to believe that av's reach 98+% detection with very low fp's and then miss some samples i happened to check just because i wasn't sure of the "surprises" i would find in them...

by reading the replies of the other members it seems as if i am the only one who doesn't believe those fantastic percentages published all over the net

kwismer · Mar 10, 2010

dr pan k said:

tnx for the reply dawgg.

as a matter of fact i used virustotal because i was suspicious of the files i downloaded, though the kind of app i was looking for is usually written in autoit.

your 4th paragraph gets right to the point. I, like many others here, read the results of the various av comparatives and personally i find it very hard to believe that av's reach 98+% detection with very low fp's and then miss some samples i happened to check just because i wasn't sure of the "surprises" i would find in them...

by reading the replies of the other members it seems as if i am the only one who doesn't believe those fantastic percentages published all over the net

i'm guessing you didn't read the link i gave you - it specifically mentions retrospective testing, which makes av products look quite bad.

Stefan Kurtzhals · Mar 10, 2010

Well, I experience the same all day. Some products which have very low false positive ratings at AV Comparatives have a high rate when I am working on my "own" false positive files. I always find myself asking "how can these products have such a low fp rate at AV-C, when they are that trigger happy on all this stuff".

Besides that, I hate AutoIt. I really really really *HATE* it!

I vote for a general ban of it!

NoIos · Mar 10, 2010

Everyone who takes VirusTotal as a comparison tool (and I don't know why some of you don't get it) will always arrive at the wrong conclusions. Also, I wonder how you can talk about the configuration of VirusTotal's AV engines when nothing that specific has ever been published.

Stefan Kurtzhals said:

Well, I experience the same all day. Some products which have very low false positive ratings at AV Comparatives have a high rate when I am working on my "own" false positive files. I always find myself asking "how can these products have such a low fp rate at AV-C, when they are that trigger happy on all this stuff".

Besides that, I hate AutoIt. I really really really *HATE* it!

I vote for a general ban of it!

Thanks for sharing your experience. But since AV comparatives publish their results and in some way expose themselves, before blaming them for not doing their job correctly, it would be good to have some official publication on your part too. I understand that it's your experience that you describe. I just don't find the way you put it here informative or productive. Also, "voting" for a general ban of AutoIt...how can I say it...it sounds immature.

cruelsister · Mar 10, 2010

I think the problem here is that many people take the results of reputable AV tests as the Word of God. I can't count how many times I've read posts touting A over B because of a 1% better score on a test- as if margin of error didn't apply.

OP- Use the tests as they should be used- separating the obviously good from the obviously bad and then choose the product with the cutest interface.
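The margin-of-error point can be made concrete with a minimal sketch. The scores (98% vs 97%) and the sample sizes below are hypothetical, chosen only for illustration, and `diff_margin` is a name invented here. The sketch treats the two scores as independent binomial proportions, which is a simplification since real comparatives scan the same sample set with every product, and it covers sampling noise only, not bias in how the samples were selected.

```python
import math


def diff_margin(p1: float, p2: float, n: int, z: float = 1.96) -> float:
    """Approximate 95% margin of error for the difference p1 - p2.

    Simplifying assumption: each score is an independent binomial
    proportion measured over n samples.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    se = math.sqrt(p1 * (1 - p1) / n + p2 * (1 - p2) / n)
    return z * se


# Hypothetical scores: product A at 98%, product B at 97%.
for n in (500, 100_000):
    margin = diff_margin(0.98, 0.97, n)
    verdict = "within the noise" if margin >= 0.01 else "larger than sampling noise"
    print(f"n={n}: margin +/-{margin:.4f}, so a 1% gap is {verdict}")
```

On a small test set a 1% gap is indistinguishable from noise. On a very large set the sampling noise shrinks, but differences in how the test set was collected can still matter more than the gap itself.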

NoIos · Mar 10, 2010

dr pan k said:

tnx for the reply dawgg.
by reading the replies of the other members it seems as if i am the only one who doesn't believe those fantastic percentages published all over the net

Nobody here said that you have to believe the tests ( or that I/we believe the tests 100%). I don't know where you have read this thing.

NoIos · Mar 10, 2010

cruelsister said:

I think the problem here is that many people take the results of reputable AV tests as the Word of God. I can't count how many times I've read posts touting A over B because of a 1% better score on a test- as if margin of error didn't apply.

No, the problem here is not what you describe. Nobody in this thread said that you have to accept the AV comparatives as the word of god. The problem here is that somebody uses wrong methods and simplifications to present his doubt about the AV comparatives. We basically all say the same thing and we may even agree, but sometimes it's also important how you put it.

cruelsister said:

OP- Use the tests as they should be used- separating the obviously good from the obviously bad and then choose the product with the cutest interface.

dr pan k Registered Member

guest Guest

dr pan k Registered Member

NoIos Registered Member

bellgamin Registered Member

Scoobs72 Registered Member

Triple Helix Specialist

dr pan k Registered Member

Scoobs72 Registered Member

dr pan k Registered Member

kwismer Registered Member

steve1955 Registered Member

Fly Registered Member

dr pan k Registered Member

Scoobs72 Registered Member

kwismer Registered Member

NoIos Registered Member

dawgg Registered Member

dr pan k Registered Member

kwismer Registered Member

Stefan Kurtzhals AV Expert

NoIos Registered Member

cruelsister Registered Member

NoIos Registered Member

NoIos Registered Member
