Possible Kaspersky AV bug on partition with 1 million files

Discussion in 'other anti-virus software' started by DonovanHawkins, Feb 3, 2013.

Thread Status:
Not open for further replies.
  1. DonovanHawkins

    DonovanHawkins Registered Member

    Joined:
    Feb 3, 2013
    Posts:
    3
    Location:
    USA
    I submitted the following bug report to Kaspersky and wanted to see if anyone else can confirm it on their system. I am using Kaspersky Anti-Virus, but I would assume that KIS behaves the same.


    I have spent the past week tracking down a problem that causes file copying to freeze with no way to cancel or kill the process. It would appear that KAV (Kaspersky Anti-Virus) has some problems with partitions which contain more than 1 million files. I have reproduced the problem on two different hard drive configurations, two different hard drive controllers, and a variety of different files and partition sizes.

    If there are more than 1,048,576 file records on a partition and you try to copy a bunch of files to that partition, the copy will freeze and be impossible to cancel or kill. Normal disk read/write will stop, but there will be write activity to $LogFile on the affected volume. I have done a lot of testing and made a few observations:

    • Partition size, used/free space, and file types/sizes already on the partition do not matter. I have reproduced the problem on partitions as small as 20 GB and as large as 2 TB. I have reproduced it with over a terabyte of data on the partition and with just zero-byte files that took up no space.
    • Source of the copy does not matter, but the simple method I have found for reproducing the issue is most reliable if the source is on the same partition. I originally encountered the problem while copying between two different drives, however.
    • The problem does not occur on a freshly formatted drive that has no System Volume Information folder. I run chkdsk once to trigger the creation of this folder in my method below.
    • The problem only occurs if you are adding more files than the partition has ever had in the past. For example, if you had 1.5 million files and deleted 200,000 of them, the problem would not occur until you tried to exceed 1.5 million again. It seems to only happen when new NTFS file records are being created, not when old file records in the MFT are being reused. I have not tested whether compacting the MFT would reset this (some third-party defrag software can do this).
    • Copying a single file doesn't necessarily cause the problem, and there is no hard limit. Each copy just has a high probability of freezing, so a freeze is virtually certain when copying a large number of files.
    • Copying simple .txt files will cause the problem only if File Antivirus is enabled. Copying .exe and .dll files will cause the problem even with KAV completely disabled and the shell extension unregistered. The only way to completely fix the problem is to disable KLIF.sys via the registry.

    I have developed a simple method to reproduce this problem. Many other scenarios will reveal the problem, but this is a straightforward way to reproduce it quickly. To prepare a fresh partition for testing:

    1. Create and format a new 50 GB partition using NTFS and call it K: (any letter will work)
    2. Open an administrator command prompt and run:
      chkdsk K:
    3. Create a folder K:\fill
    4. Create an empty text file K:\fill.txt
    5. Copy C:\Windows\notepad.exe to K:\fill.exe
    6. Open a regular command prompt in K:\ and run:
      for /L %F in (1,1,10) do start /min cmd /c for /L %G in (1,1,100000) do copy fill.txt fill\%F_%G.txt
    Once this is done you will have a drive which is roughly 48,500 files away from the problem.
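
As a rough check on the "48,500" figure, the arithmetic can be sketched in Python. This is only an estimate under stated assumptions: the `overhead` parameter (records already used by NTFS metadata, System Volume Information, the fill folder, fill.txt and fill.exe) is a hypothetical input, not a measured value, and it varies from system to system.

```python
# Rough estimate of how many new file records remain before the
# partition reaches 1,048,576 (2**20) MFT records.

THRESHOLD = 2 ** 20            # 1,048,576 file records
WINDOWS = 10                   # outer loop: 10 minimized cmd windows
COPIES_PER_WINDOW = 100_000    # inner loop: copies made by each window


def records_remaining(overhead: int = 0) -> int:
    """Return how many new files fit below the threshold.

    `overhead` is an assumed count of records already in use by
    NTFS metadata and the handful of files created during setup.
    """
    created = WINDOWS * COPIES_PER_WINDOW
    return THRESHOLD - created - overhead


if __name__ == "__main__":
    # Upper bound with no overhead at all.
    print(records_remaining(0))
```

With no overhead the headroom is 48,576 files; a few dozen records of overhead would bring that down to roughly 48,500.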


    To see the problem with .txt files, prepare a fresh partition for testing. Open a regular command prompt in K:\ and run:
    for /L %F in (1,1,100000) do copy fill.txt fill\%F.txt

    If File Antivirus is enabled, the previous command will stop around the 48,500 mark. If File Antivirus is disabled, the command will finish copying all 100,000 files.


    To see the problem with .exe files, prepare a fresh partition for testing. Open a regular command prompt in K:\ and run:
    for /L %F in (1,1,100000) do copy fill.exe fill\%F.exe

    This will again stop even if KAV is completely disabled. The only way to fix this is to disable the filesystem filter driver (KLIF.sys) by setting HKLM\SYSTEM\CurrentControlSet\services\KLIF\Start to 4 and rebooting. To return it to normal, set the value back to 1. Note that you must disable KAV's Self-Defense in order to do this.
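
For anyone who wants to check the current setting before editing it, here is a minimal read-only Python sketch. It assumes a Windows machine where the KLIF key exists at the path given above; the function names are hypothetical. The Start values 0 to 4 are the standard Windows service start types. The sketch does not change anything in the registry.

```python
import sys

# Standard meanings of the Start value under a service or driver key.
START_TYPES = {
    0: "boot",
    1: "system",
    2: "automatic",
    3: "manual",
    4: "disabled",
}

# Key path taken from the post; relative to HKEY_LOCAL_MACHINE.
KLIF_KEY = r"SYSTEM\CurrentControlSet\services\KLIF"


def describe_start(value: int) -> str:
    """Translate a Start value into a readable start type."""
    return START_TYPES.get(value, "unknown")


def read_klif_start() -> int:
    """Read the current Start value (Windows only, read-only)."""
    import winreg  # standard library module, present only on Windows
    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, KLIF_KEY) as key:
        value, _value_type = winreg.QueryValueEx(key, "Start")
    return value


if __name__ == "__main__" and sys.platform == "win32":
    print(describe_start(read_klif_start()))
```

On a normal install this should report "system" (Start = 1), and "disabled" (Start = 4) after the workaround has been applied.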


    Some final notes for reproducing this:

    • Sometimes a reboot is needed after changing KAV settings.
    • I originally tested by drag-and-drop copying the System32 folder from an old Vista install (has lots of .dlls). I also did the same with the System32 of the currently-running OS and with the KAV folder under Program Files. These tests all froze even when dragging from one physical drive to another. I suspect that using the shell results in some reads that are crucial to the timing or pointer/handle math that causes this problem. When reproducing this problem via the command prompt, having the source file on the same partition provides the necessary reads.
    • I also managed to reproduce the problem without any copying by selecting (single-click in Explorer) the zero-byte file fragment that got stuck during one of the above System32 copies. This non-copy version of the problem does not occur if File Antivirus is disabled.
    • The machine I have been testing with is a recent reinstall that has no other security software and no software that would install any kernel-mode filter drivers. I have only installed the OS, updated all drivers, configured various settings, and started to install some basic software like Adobe Reader and Firefox.


    EDIT:
    I did some additional testing with a second computer running Vista Ultimate x64 and was able to reproduce the problem using Kaspersky Internet Security. I then installed a fresh copy of Windows 7 Professional x64 on the same computer, followed immediately by Kaspersky Anti-Virus, and again I was able to reproduce the problem.

    Repeatedly copying notepad.exe (as specified in my previous method) did not trigger the problem on the second computer, but repeatedly copying msvcr100.dll from KAV's x64 folder did trigger it. Copying the Windows System32 folder via Explorer also triggered it.

    I tried setting the integrated Intel controller to both AHCI and legacy IDE modes and the problem occurred in both cases. The first computer I encountered this on had the problem occur on both the integrated Intel controller (single SATA drive) and on a PCI-E LSI RAID card (6-drive RAID-10 array).
     
    Last edited: Feb 5, 2013
  2. Bodhitree

    Bodhitree Registered Member

    Joined:
    Dec 5, 2012
    Posts:
    567
    I'm getting the impression, day after day, Kaspersky isn't the AV to run...
     
  3. The Red Moon

    The Red Moon Registered Member

    Joined:
    May 17, 2012
    Posts:
    3,872
    From personal experience, I would say your impression is wrong. :cautious:
     
  4. whitestar_999

    whitestar_999 Registered Member

    Joined:
    Apr 1, 2010
    Posts:
    101
    "Partition with 1 million files": this is the first time I am hearing about such a thing, and I am pretty sure you are the only one (or among 0.00001% of all Kaspersky users) with such a partition. I wouldn't consider it a bug but rather an exceptional situation that probably never crossed the minds of Kaspersky's developers. By the way, I am really interested in knowing what creates 1 million files on a partition (assuming a normal user/downloader point of view, of course, and not some kind of huge database or log-file system for something like a server or multiple virtual servers).
     
  5. DonovanHawkins

    DonovanHawkins Registered Member

    Joined:
    Feb 3, 2013
    Posts:
    3
    Location:
    USA
    It's still a bug when their kernel-mode driver chokes on a perfectly valid partition, but I agree that they shouldn't be blamed too much for having missed it during testing. It won't really affect my opinion of their product (assuming they are able to fix it).

    Keep in mind that their filter driver probably doesn't care how many files are on the disk, so it isn't just a matter of not bothering to support so many. It's more likely that they are truncating some file-related identifier and that it only causes a problem when the value is large enough to get trashed.


    As for what creates a million files: a couple of decades of accumulated junk that I've never finished going through. I tend to copy everything off my old drives into folders on my new computer when I upgrade, and those folders have just grown and grown.

    I was trying to get it all moved into one place so I could do some cleaning when I encountered this problem.
     
  6. _vx_

    _vx_ Registered Member

    Joined:
    Mar 13, 2013
    Posts:
    1
    Location:
    Ukraine
    The same issue.
     
  7. hawki

    hawki Registered Member

    Joined:
    Dec 17, 2008
    Posts:
    1,956
    Location:
    DC Metro Area
    Not so sure about that. Maybe it's the way Kaspersky counts files, BUT I use 137 GB on a 1000 GB partition, and at the end of a full Kaspersky scan it reports that it scanned 8.5 MILLION files.

    EDIT: I am thinking it's maybe because I have several AVI "films" on my drive, so I guess Kaspersky counts each "frame" as a separate file.
     
    Last edited: Mar 13, 2013
  8. whitestar_999

    whitestar_999 Registered Member

    Joined:
    Apr 1, 2010
    Posts:
    101
    I don't think AVI files (or, for that matter, any multimedia file type) are scanned frame by frame. The scenario you described applies to complex installers of hundreds of MB and GB-sized game ISOs, which pack thousands of compressed DLL and EXE files. Still, for millions of files to exist you must have hundreds of such installers/ISOs or, as the OP mentioned, years of data accumulated from various HDDs on a single partition. It is not possible for an average user to have a million-plus files on a single partition.
     
  9. DonovanHawkins

    DonovanHawkins Registered Member

    Joined:
    Feb 3, 2013
    Posts:
    3
    Location:
    USA
    Kaspersky confirmed on 2013/03/12 that they had found a bug in KLIF.sys and were testing a fix. That fix was released with 13.0.1.4190(g) and I have not been able to reproduce the problem with this version.

    In the end, it took more than a month to get past level 1 tech support, less than a week for them to fix the bug once they sent my report in, and more than another month for the fix to be released.

    For the curious, the last observation I had made was that the bug occurs when the MFT grows beyond the 4GB mark on the partition. Here is what I sent to Kaspersky regarding that:


    I think I found a possible significance to the 1 million files. The problem occurs when the Master File Table (MFT) grows beyond the 4GB point on the partition. I have verified this by running fsutil fsinfo ntfsinfo k: at an admin command prompt (see below for details).

    This could help identify where the problem is. For example, the KAV filter driver probably has to keep track of where the MFT is located on disk and handle driver requests for it differently than for regular files. If the filter driver uses 32 bits to store or compare the MFT upper limit, the special handling would not be invoked when a new entry is added to the MFT. This could result in a deadlock or infinite loop.
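
To illustrate the kind of failure speculated about above, here is a minimal Python sketch. It is purely hypothetical: `is_mft_offset_correct`, `is_mft_offset_buggy`, and the constants are made up for illustration and say nothing about how KLIF.sys is actually written. The start offset and record size are the typical values given in the details of this post. The sketch only shows how squeezing an upper limit into 32 bits makes a range check fail once that limit passes 4 GB.

```python
MASK32 = 0xFFFFFFFF  # keeps only the low 32 bits, like a 32-bit unsigned field

MFT_START = 0xC0000 * 4096                  # byte offset where the MFT begins
MFT_END = MFT_START + 1_048_577 * 1024      # one record past the 4 GB mark


def is_mft_offset_correct(offset: int) -> bool:
    """Range check done with full-width values."""
    return MFT_START <= offset < MFT_END


def is_mft_offset_buggy(offset: int) -> bool:
    """Same check, but the upper limit was truncated to 32 bits."""
    truncated_end = MFT_END & MASK32        # 4 GB + 1 KB becomes just 1 KB
    return MFT_START <= offset < truncated_end


if __name__ == "__main__":
    newest_record = MFT_END - 1024          # offset of the newest MFT record
    print(is_mft_offset_correct(newest_record))
    print(is_mft_offset_buggy(newest_record))
```

In this sketch the truncated limit ends up below the MFT start, so the buggy check rejects every offset inside the MFT, not just the newest record. Whether the real driver failed in this way is not known.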


    Details:

    Running the fsutil command above gives two important values: "Mft Valid Data Length" and "Mft Start Lcn". The former is the total number of bytes in the MFT while the latter is the logical cluster number where the MFT starts.

    For a typical NTFS partition the MFT starts at cluster 0xC0000, which is byte offset 3,221,225,472 (4 KB per cluster). With 1,048,576 entries the MFT will take up 1,073,741,824 bytes (1 KB per MFT entry). This puts the end of the MFT at byte offset 4,294,967,296 (2^32), which is the point at which a 32-bit unsigned integer wraps.

    I have verified that the partitions which reproduce the problem have "Mft Valid Data Length" slightly greater than 1,073,741,824 and "Mft Start Lcn" equal to 0xC0000.
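
The arithmetic above can be checked with a short Python sketch. The function names are hypothetical, and the 4,096-byte cluster and 1,024-byte record sizes are the typical values assumed in the text. It also assumes the MFT is one contiguous run starting at "Mft Start Lcn", which may not hold on a fragmented volume.

```python
BYTES_PER_CLUSTER = 4096      # typical NTFS cluster size (assumed)
BYTES_PER_RECORD = 1024       # typical MFT record size (assumed)
LIMIT_32BIT = 2 ** 32         # first value a 32-bit unsigned integer cannot hold


def mft_end_offset(start_lcn: int, valid_data_length: int) -> int:
    """Byte offset just past the end of the MFT, assuming it is contiguous."""
    return start_lcn * BYTES_PER_CLUSTER + valid_data_length


def records_until_wrap(start_lcn: int) -> int:
    """How many MFT records fit before the end offset reaches 2**32."""
    return (LIMIT_32BIT - start_lcn * BYTES_PER_CLUSTER) // BYTES_PER_RECORD


if __name__ == "__main__":
    start_lcn = 0xC0000
    print(start_lcn * BYTES_PER_CLUSTER)                            # 3221225472
    print(mft_end_offset(start_lcn, 1_048_576 * BYTES_PER_RECORD))  # 4294967296
    print(records_until_wrap(start_lcn))                            # 1048576
```

With a different "Mft Start Lcn" the number of records that fit below the 4 GB mark would differ, which may be why the trigger point is not the same on every volume.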
     
  10. zapjb

    zapjb Registered Member

    Joined:
    Nov 15, 2005
    Posts:
    3,518
    Location:
    USA - Back in a real State in time for a real Pres
    Thanks for the wrap up.
     
  11. er34

    er34 Guest

    @DonovanHawkins

    Hey!

    Thank you for your patience and persistence re. this issue. You deserve a present :thumb:
     
  12. Terra Branford

    Terra Branford Registered Member

    Joined:
    Aug 8, 2013
    Posts:
    1
    I'm sorry to bump a topic that is a few months old, but I want to express my sincere gratitude to DonovanHawkins, for posting this thread.

    I was hit by a rather nasty Trojan that propagated all over my host drive, and Kaspersky was only able to enter into a "Found it! Disinfected it! Ah, crud, it's back!!!!" cycle. I eventually gave up and wiped the host drive. The problem is that I also have two additional drives (2TB each) that were connected at the time of infection... so there was a very real worry the Trojan may have left a "present" waiting for me in one of those drives' shadow directories.

    I've spent the last four days trying to do ~deep~ scans with Kaspersky (cranking the settings to "For dangerous environments", and even enforcing Rootkit scans across all files, regardless of size)... and each time, it would get to 44% complete (a little over a million files), and promptly Kaspersky would crash out, and then restart itself.

    I was terrified that the virus was still present, and it was causing Windows to spazz out, or something. But now that I know about the limitations on file numbers it can safely scan, I know I need to either do individual "Custom Scans" for these drives, ~remove~ a drive temporarily, or delete some files.

    You have no idea how much this thread has helped me to not have a full blown panic attack over this. I've been making intensive backups of all my data onto these drives, for the last fifteen years of my life, and while it's a "structured backup" system (all nice and tidy, folders organized by date of backup), it's hard to know ~exactly~ what is safe to delete at this point, and what shouldn't be touched... so I tend to just let the backups sit there, and not touch them "just in case".

    So, thank you, again. I really thought my system had just been re-hijacked by that stupid Trojan, and that I was completely up the creek without a paddle.
     