Automatically Generated Signature -- Good or Bad?

ntl · Dec 21, 2003

Some people do it (including ewido & Mischel Internet Security). Some people want to do it ... and others really hate automatically generated signatures!

I would be interested in your opinion on auto-generated sigs.

IMHO, the automatic generation of signatures has some important advantages:

1.
A newcomer who wants to develop an AT scanner does not need to spend months or years in order to manually pick signatures for each and every trojan since 1995 ...

2.
Manual signature picking is terribly boring and will possibly frustrate a trojan/virus analyst.

3.
If your signature database gets cracked (and all your signatures are disclosed) you can quickly create new signatures.

4.
You do not need to employ and pay hundreds of virus analysts.

What are the disadvantages of automatically generated signatures?

1.
Is it difficult to automatically find a "strong", code-based signature like a call @ a defined offset?

2.
Do you need a scan engine which supports single-point scanning (like Kaspersky) in order to avoid false alerts?

I would be grateful if we could discuss this issue without flaming each other.

TIA,

Nautilus

Btw.: I am cross-posting again ;-) See http://www.rokop-security.de/board/index.php?showtopic=1370

ntl · Dec 21, 2003

Forgot to mention: Auto-generated sigs may facilitate the concept of "rotating signatures".

rerun2 · Dec 21, 2003

quoting: ntl
3.
If your signature database gets cracked (and all your signatures are disclosed) you can quickly create new signatures.

Wouldn't this be more of a disadvantage?
The disclosure of your entire signature database for even a short period sounds like a risk to me.

Isn't it also much harder to crack manually chosen signatures?

ntl · Dec 21, 2003

@rerun

I agree that the disclosure of the signature database is one of the worst things which can happen to an AV/AT software producer. However, if you create signatures manually, it will take you much longer to build a new signature database.

I do not believe that automatic signatures facilitate signature database cracking.

Cheers, ntl.

Jason_DiamondCS · Dec 22, 2003

Hi ntl,

Advantages
1) Yeah it is much faster to go through a database of files to find auto signatures for them. You still need a database though.
2) Some find it interesting.
3) Yes
4) True

Disadvantages
1) Yes, it is extremely difficult to write a program which detects "good" signatures simply because the definition of a good signature totally depends on the file in question
2) I am not too sure what you mean by single point scanning?

Basically I think automatic signatures may be ok as a backup system to proven signatures, as long as the automatic signatures aren't very simple like checksumming large parts of a file. In my opinion no existing automatic signature scheme I have seen has done it to an extent that is satisfactory. For example a well picked manual signature in a lot of cases cannot just be hex-edited or it would break the program, so it requires a recompile (hence you need source) with a lot of code change to stop the scanner from detecting it. Whereas with some automatic schemes you can just change one byte in a piece of code which never gets executed and you are no longer detected.
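The point about changing one byte can be sketched in a few lines of Python. This is only an illustration under assumptions: the sample bytes, the padding and the function names are hypothetical and not taken from any real scanner. It compares a checksum over a large region with a short pattern taken from code that actually runs.

```python
import hashlib

def checksum_signature(data: bytes, start: int, length: int) -> str:
    """Naive automatic signature: hash a fixed region of the file."""
    return hashlib.sha256(data[start:start + length]).hexdigest()

def matches_checksum(data: bytes, start: int, length: int, expected: str) -> bool:
    """True if the region still hashes to the stored signature."""
    return checksum_signature(data, start, length) == expected

# Hypothetical sample: a few bytes of live code followed by padding
# that is never executed.
live_code = bytes.fromhex("6a006a02e823e3")
sample = live_code + b"\x00" * 16

# The checksum signature covers the whole sample.
stored = checksum_signature(sample, 0, len(sample))

# Flip a single padding byte; the live code is untouched.
patched = bytearray(sample)
patched[-1] = 0x41
patched = bytes(patched)

print(matches_checksum(sample, 0, len(sample), stored))    # original file
print(matches_checksum(patched, 0, len(patched), stored))  # one padding byte changed
print(live_code in patched)                                # code-based pattern
```

The checksum stops matching after the one-byte change, while the pattern taken from the live code is still found.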

With signatures it isn't the number of signatures you have for each file, it is the quality that matters most. With a lot more R&D into automatic signatures maybe one day we will rely on them more and more.

-Jason-

Wayne - DiamondCS · Dec 22, 2003

Just to add to that - automatic signatures are relatively 'fixed' (ie. you can't change them because they're created by a constant algorithm). If there's ever a problem (such as a false alarm) the signature can't be changed, because the automatic signature algorithm only knows to look for one thing. Trojan authors can then code routines into their EditServer programs to make the server undetectable by the automatic signature algorithm. Manual signatures give the human analyst (who is always going to be smarter than any automatic signature extraction algorithm) the complete freedom to choose any signature(s) with virtually no restrictions, and if problems such as false alarms come up they are easily resolved, as a new signature can be found.

My main point as Jason said is that on their own automatically-extracted signatures are not sufficient, but if combined with manual signatures then they can certainly be useful as a second line of defence and they do work well in that regard.

>However, if you manually create signatures it will take you much longer until you have a new signature database.

Once a signature has been found we can manually add it within a minute, so it's not any slower than automatic signatures in that regard. I can understand why you may think that it would take longer because we have to look for a signature (as opposed to it being obtained automatically), but that doesn't actually take any longer: we always disassemble/analyse trojans anyway, so we can find good, strong signatures during that analysis at no extra cost in time.

However, if you were to create a new anti-trojan program now then you would have no choice but to use automatic signatures as you simply wouldn't have time to disassemble, analyse and manually extract signatures from some 10000+ trojans (it's taken us over half a decade).

Andreas Haak · Dec 22, 2003

>Just to add to that - automatic signatures are relatively 'fixed' (ie. you can't change them
>because they're created by a constant algorithm).

But hand picked signatures have the same weakness if they were all done by one and the same analyst, because after a few hundred signature extractions the process starts to become automated in its own way.

Most analysts have a fixed roadmap (sometimes it's even the same roadmap company-wide). For example, there are special areas within the file where they look first for a good signature. If you know how a few signatures are built, you can guess with quite a high hit rate what a signature for another nasty looks like.

For example Kaspersky likes to put API calls in its signatures. BitDefender prefers signatures that include string parts (for example the backdoor name) and so on.

Surely such approaches are not as easy to recognize as a normal "fingerprint" - but they are still recognizable.

Wayne - DiamondCS · Dec 22, 2003

>But hand picked signatures have the same weakness if they were done by one and the same analyst. Cause after a few hundred signature extractions the process starts to get its own automation

Yes each analyst will naturally prefer different signatures ie different code sequences, so some 'patterns' may be obvious - but only to that analyst, and as each trojan is different the signature chosen is usually quite unique to that trojan. Even if it's from the same family of trojans it's still impossible for a trojan to automatically predict which signature the analyst might've chosen - it needs prior knowledge, whereas with automated signatures there's the possibility for automated attacks against the algorithm itself, which is the main problem.

Andreas Haak · Dec 22, 2003

quoting: Wayne - DiamondCS
Yes each analyst will naturally prefer different signatures ie different code sequences, so some 'patterns' may be obvious - but only to that analyst, and as each trojan is different the signature chosen is usually quite unique to that trojan. Even if it's from the same family of trojans it's still impossible for a trojan to automatically predict which signature the analyst might've chosen - it needs prior knowledge, whereas with automated signatures there's the possibility for automated attacks against the algorithm itself, which is the main problem.

Agreed. Just wanted to point out that even hand picked signatures have this weakness. But as you said ... it's much more difficult to exploit it.

Gavin - DiamondCS · Dec 23, 2003

A human changes day to day too

Nautilus · Dec 23, 2003

Thanks everybody for your comments! I find this topic quite interesting and am trying to better understand the process of signature creation.

@ Jason

1.
" I am not too sure what you mean by single point scanning?"

I believe this technique is used by Kaspersky. The engine does not search for a signature within the entire file but merely searches at a single location (i.e., at a specific file offset). I believe this technique is used to avoid false positives. In addition, KAV requires a "double match" in order to detect malware: there are always two signatures @ different offsets which must be matched.
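As a rough sketch of the idea (not Kaspersky's actual engine, whose internals are not described here), single-point scanning with a double match could look like this in Python. The class name, offsets and byte patterns are all hypothetical.

```python
from typing import NamedTuple

class PointSignature(NamedTuple):
    offset: int      # file offset where the pattern must start
    pattern: bytes   # exact bytes expected at that offset

def match_at(data: bytes, sig: PointSignature) -> bool:
    """Compare only the bytes at sig.offset; no search through the file."""
    end = sig.offset + len(sig.pattern)
    return data[sig.offset:end] == sig.pattern

def double_match(data: bytes, first: PointSignature, second: PointSignature) -> bool:
    """Report a detection only if both fixed-offset signatures match."""
    return match_at(data, first) and match_at(data, second)

# Hypothetical signatures and sample data, for illustration only.
SIG_A = PointSignature(offset=8, pattern=b"\x6a\x00\x6a\x02")
SIG_B = PointSignature(offset=40, pattern=b"\xe8\x23\xe3")

def build_sample(offset_a: int, offset_b: int) -> bytes:
    """Build a 64-byte zero-filled buffer with both patterns placed in it."""
    buf = bytearray(64)
    buf[offset_a:offset_a + len(SIG_A.pattern)] = SIG_A.pattern
    buf[offset_b:offset_b + len(SIG_B.pattern)] = SIG_B.pattern
    return bytes(buf)

malware_like = build_sample(8, 40)           # patterns at the expected offsets
same_bytes_elsewhere = build_sample(16, 50)  # same bytes, different offsets

print(double_match(malware_like, SIG_A, SIG_B))
print(double_match(same_bytes_elsewhere, SIG_A, SIG_B))
```

A file that merely happens to contain the same bytes somewhere else is not reported, which is how fixed offsets help against false positives.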

2.
"For example a well picked manual signature in a lot of cases cannot just be hex-edited or it would break the program, so it requires a recompile (hence you need source) with a lot of code change to stop the scanner from detecting it."

This sounds pretty interesting to me (and it also explains why signature picking may not be as boring as I thought). However, I am wondering what a well picked signature should look like. For example, I took a closer look at the Beast 1.92c server:

Some scanners try to detect it by simply scanning for a text string called "Beasty". Obviously, such a signature cannot be considered safe. The same applies to very large, non-fuzzy signatures like fingerprints. Checksums are cr.p, too.

Also signatures like "s h e l l . c o m ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? T T Y U P D . C O M ?? ?? B e a s t y " can be easily hexedited without changing any executable code.
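To illustrate how such a wildcard signature behaves, here is a minimal Python sketch. It assumes a signature written as space-separated hex bytes with ?? standing for "any byte"; the short pattern and the sample data are made up for the example and are not the real Beast signature.

```python
import re

def compile_signature(sig: str):
    """Turn 'AA ?? BB' style text into a compiled bytes regex."""
    parts = []
    for token in sig.split():
        if token == "??":
            parts.append(b".")  # wildcard: any single byte
        else:
            parts.append(re.escape(bytes.fromhex(token)))
    # DOTALL so that the wildcard also matches a newline byte (0x0A).
    return re.compile(b"".join(parts), re.DOTALL)

# Hypothetical signature: "Be", one wildcard byte, then "sty".
SIG = compile_signature("42 65 ?? 73 74 79")

original = b"\x00\x00Beasty\x00"
hexedited = b"\x00\x00BeaSty\x00"   # one text byte changed, no code touched

print(SIG.search(original) is not None)
print(SIG.search(hexedited) is not None)
```

The wildcards tolerate variation between the fixed parts, but changing a single byte of a fixed text part is enough to stop the match.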

In addition, the following pair of signatures cannot be considered safe:

"Signature 1

646F6F72206F70 = door op

Signature 2

6A006A02E823E3 =

004171A8 push 0 ; dx
004171AA push 2 ; dwFlags
004171AC call mouse_event"

This is because a double-match is required and the first signature can be simply hexedited (e.g., you can write "dour op" instead of "door op").
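A small Python sketch of that weakness, using the two hex signatures quoted above. The surrounding "server" bytes are invented for the example, and the scanner logic is an assumption about how a double match might be implemented.

```python
# The two signatures quoted in the post.
SIG_TEXT = bytes.fromhex("646F6F72206F70")   # "door op"
SIG_CODE = bytes.fromhex("6A006A02E823E3")   # push 0 / push 2 / call ...

def detected(data: bytes) -> bool:
    """Double match: both signatures must be present somewhere in the file."""
    return SIG_TEXT in data and SIG_CODE in data

# Hypothetical server image: some code followed by a text string.
server = b"\x90" * 4 + SIG_CODE + b"\x00" * 4 + b"CD door open" + b"\x00"

# Hexedit only the text: same length, executable bytes untouched.
patched = server.replace(b"door op", b"dour op")

print(detected(server))
print(detected(patched))
print(SIG_CODE in patched)
```

The code signature is still present in the patched file, but because both signatures are required, the edited text string alone is enough to avoid detection.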

By contrast, if a signature is based on executable code like a "call [xyz address]" it cannot be simply "hexedited" w/o breaking the server. However, the server can be "patched", e.g., the call can be redirected to another location from where a jmp to the original location of the call takes place. Such signatures are safer than text strings and they are harder to find, too. However, they still do not require major changes of the trojan's code.

Therefore, I am wondering what a safe signature should look like. (I feel that I need to get an answer to this question first before I can understand whether it is possible to generate safe signatures automatically.)

Cheers, ntl

4A6F4A6F · Dec 23, 2003

hi

here are my comments on this interesting topic. btw this board sometimes has more interesting and kind of funny topics than other "security boards" *cough*

Jason / DiamondCS: For example a well picked manual signature in a lot of cases cannot just be hex-edited or it would break the program, so it requires a recompile (hence you need source) with a lot of code change to stop the scanner from detecting it.

But using easy-to-find keywords and readable strings as sigs is not really a good idea. For example, say a trojan has been written by a person with the nickname xyz, and a string like 'written by xyz' or 'xyz@someone.com' is used as the sig for this trojan. That sig was chosen by a human, but not very cleverly. A script kiddie only has to know the trojan coder, the trojan name, typical server file names or whatever, and that's all. So using easy-to-find strings is not really a good idea for a sig, just my opinion.

Jason_DiamondCS · Dec 25, 2003

You need to know assembly language / programming to understand which code can be changed and which cannot. Obviously there is code which can be switched around fairly easily and other code which can't. With code which cannot easily be changed, they can insert a jmp to other code which jmps back to the old location (after writing out any instructions that the new jmp displaced), provided they have enough "blank" space in the file to write this code into. Such space can usually be found at the end of sections, which are aligned on some form of boundary. So to get past a single scanner with only ONE signature it may be fairly easy to use this 'jmp' method. But most hackers aim to stop it being detected by at least a few scanners, and a lot of scanners have more than one signature, so this option quickly fades.

There are also a lot of cases where "text" based signatures are good, since they find a lot of variants, etc. They shouldn't be relied on as a single signature, but rather as PART of a group of signatures.
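One possible reading of "part of a group" is sketched below in Python: a text signature contributes to a detection but never triggers it alone. The threshold rule, the names and the sample data are assumptions made for illustration, not a description of how any particular scanner works.

```python
def count_hits(data: bytes, signatures: list) -> int:
    """Number of signatures from the group that occur in the data."""
    return sum(1 for sig in signatures if sig in data)

def group_detect(data: bytes, signatures: list, min_hits: int = 2) -> bool:
    """Flag the file only if at least min_hits signatures match."""
    return count_hits(data, signatures) >= min_hits

# Hypothetical group: one text signature plus one code signature.
GROUP = [
    b"Beasty",                          # text based, catches many variants
    bytes.fromhex("6A006A02E823E3"),    # code based
]

variant = b"\x90\x90" + bytes.fromhex("6A006A02E823E3") + b"\x00Beasty\x00"
harmless = b"My cat is called Beasty"

print(group_detect(variant, GROUP))
print(group_detect(harmless, GROUP))
```

With this rule a harmless file that only contains the text string is not flagged, while a variant carrying both the string and the code pattern is.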

-Jason-
