Event ID:6004 - A driver packet received from the I/O subsystem was invalid.

Discussion in 'ESET NOD32 Antivirus' started by dwood, Jan 30, 2008.

Thread Status:
Not open for further replies.
  1. capatt

    capatt Registered Member

    Joined:
    Jan 23, 2007
    Posts:
    84
    I'm seeing exactly this same problem on a home computer with EAV (NOT ESS) 3.0.642 installed. It doesn't seem to affect actual performance, except to fill the logs.
     
  2. GhostMan

    GhostMan Eset Staff Account

    Joined:
    Jun 8, 2007
    Posts:
    99
    Location:
    Bratislava
    Guys

    use this driver

    (only under EAVBE, IT'S NOT WORKING FOR ESSBE)
    and send us some feedback, please. Bug 6004 should be resolved...

    Regards.

     
    Last edited by a moderator: Mar 11, 2008
  3. capatt

    capatt Registered Member

    Joined:
    Jan 23, 2007
    Posts:
    84
    I'd love to try this, but what exactly do I do? Where do I put it?
     
  4. Marcos

    Marcos Eset Staff Account

    Joined:
    Nov 22, 2002
    Posts:
    14,456
    Just rename the original file, copy the new epfwtdir.sys instead and restart the computer.
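
For anyone who would rather script this than do it by hand, here is a minimal Python sketch of the rename-and-copy step. It is only an illustration, not an ESET tool: the function name `replace_driver`, the `.orig` backup suffix, and passing the target directories as arguments are all assumptions. Posts further down the thread report that epfwtdir.sys lives in two directories, so the sketch accepts a list.

```python
import shutil
from pathlib import Path


def replace_driver(new_driver: Path, target_dirs: list[Path],
                   backup_suffix: str = ".orig") -> list[Path]:
    """Rename the existing driver in each directory, then copy the new one in.

    Returns the backup files that were created (hypothetical helper).
    """
    backups = []
    for directory in target_dirs:
        current = directory / new_driver.name
        if current.exists():
            backup = current.with_name(current.name + backup_suffix)
            if backup.exists():
                # Refuse to overwrite an earlier backup of the original file.
                raise FileExistsError(f"backup already present: {backup}")
            current.rename(backup)         # keep the original for rollback
            backups.append(backup)
        shutil.copy2(new_driver, current)  # put the replacement in place
    return backups
```

Called with the downloaded file and the two directories named later in the thread (windows\system32\drivers and C:\Program Files\ESET\ESET NOD32 Antivirus\Drivers\epfwtdir), it would leave an epfwtdir.sys.orig beside each new copy so the change can be rolled back. It needs administrator rights, and the restart is still required.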
     
  5. dwood

    dwood Registered Member

    Joined:
    Jan 11, 2005
    Posts:
    92

    Testing on a couple of PCs that have this problem; will post results shortly. :thumb:
     
  6. mps_surcouf

    mps_surcouf Registered Member

    Joined:
    Mar 5, 2008
    Posts:
    33
    GhostMan/Marcos

    WinXP SP2, NOD32 3.0.642

    Renamed old file.
    Copied the downloaded file to my workstation.
    Restarted

    Still get

    Event Type: Warning
    Event Source: MRxSmb
    Event Category: None
    Event ID: 3019

    and
    Event Type: Error
    Event Source: EventLog
    Event Category: None
    Event ID: 6004

    When connecting to network shares.
    So this is not fixed for me.

    On all 35 workstations (or at least the 10 or so I spot-checked), the event logs show this error only after the NOD32 3.0.642 install.
    There were no errors of this type before EAV 3.0.642.

    Thanks Mike
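
The before/after comparison described in this post can be done mechanically on an exported System log. Below is a minimal Python sketch, assuming the log has been exported to CSV with the hypothetical column names `Timestamp`, `Source` and `EventID` (a real export will likely use different headers and date formats). The sample rows and the install date are made up purely to show the shape of the data.

```python
import csv
import io
from collections import Counter
from datetime import datetime

# The two source/ID pairs quoted in this post.
WATCHED = {("MRxSmb", 3019), ("EventLog", 6004)}


def tally(rows, installed_at):
    """Count watched events before and after the install time."""
    counts = {"before": Counter(), "after": Counter()}
    for row in rows:
        key = (row["Source"], int(row["EventID"]))
        if key not in WATCHED:
            continue
        when = datetime.fromisoformat(row["Timestamp"])
        counts["before" if when < installed_at else "after"][key] += 1
    return counts


# Made-up sample rows standing in for a real export.
sample = io.StringIO(
    "Timestamp,Source,EventID\n"
    "2008-02-01T09:00:00,Service Control Manager,7036\n"
    "2008-03-06T10:15:00,MRxSmb,3019\n"
    "2008-03-06T10:15:02,EventLog,6004\n"
)
result = tally(csv.DictReader(sample), datetime(2008, 3, 5))
print("before:", dict(result["before"]))
print("after:", dict(result["after"]))
```

Running the same tally against each workstation's export would show at a glance whether any 3019 or 6004 entries predate the install.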
     
  7. STI

    STI Registered Member

    Joined:
    Feb 25, 2008
    Posts:
    10
    hi,
    I have changed the driver in the windows\system32\drivers dir and in the program dir and have had no event 3019 since then.

    Will this driver also solve the stuck server problem?

    nico :)
     
  8. mps_surcouf

    mps_surcouf Registered Member

    Joined:
    Mar 5, 2008
    Posts:
    33
    Hi

    I only changed it in the program directory (lack of info?).

    Marcos, could you explain why it is in two places?

    I will change it in the system32\drivers directory as well and report back.

    Thanks

    Mike
     
  9. mps_surcouf

    mps_surcouf Registered Member

    Joined:
    Mar 5, 2008
    Posts:
    33
    Hi

    Now I have changed epfwtdir.sys in the system32\drivers directory (as well as the C:\Program Files\ESET\ESET NOD32 Antivirus\Drivers\epfwtdir directory).

    No more 3019 or 6004. Thank you!
    Please let us know when you may have a full product update for distribution.

    I never had a performance problem, but I read a few posts about server performance being affected by this. Is it possible to get any technical info?
    I want to be sure before putting V3.0 on our servers. If a workstation hangs it's not a problem, but of course a server is a whole different story.

    Thanks again for the update

    Mike
     
  10. dwood

    dwood Registered Member

    Joined:
    Jan 11, 2005
    Posts:
    92
    Guys,

    Can confirm that both test machines are clear of the 6004 errors after the driver replacement. :D

    Any idea when this will be released?

    Thanks

    Dan
     
  11. noons

    noons Registered Member

    Joined:
    Apr 27, 2007
    Posts:
    115
    So glad I just found this post. I have been banging my head about these log entries for a while now and didn't even think of the issue being NOD32. Thought at first it was some sort of hardware issue, but then found the same entries on another computer using NOD32.
     
  12. CrookedBloke

    CrookedBloke Registered Member

    Joined:
    Oct 15, 2007
    Posts:
    110
    DO NOT INSTALL THIS "FIX" ON YOUR PRODUCTION SERVERS WITHOUT READING THIS FIRST!

    (sorry for shouting)

    Okay. I downloaded the driver file from a link provided by GhostMan above. I installed EAV 3.0.642 on a Windows Server 2003 SP2 system (after completely uninstalling NOD32 2.7 and rebooting), then I renamed the epfwtdir.sys file in both locations on the system and placed the new driver in those locations. I restarted the server and tested it.

    The Event ID 6004 error message did not reappear in the System log. Tentative jubilation.

    However, the Event ID 3019 warning messages still appear any time shares on the server are browsed in an Explorer window on the same server. That, in and of itself, is not a serious problem. But, unfortunately, the REAL problem is not fixed.

    When I tried to log off of the test server the terminal services session hung. I closed the session on the remote client, logged on to another server on the domain, and tried to use the Terminal Services Manager to reset the session. The session could not be reset. The server was now unresponsive to attempts to connect via RDP or from its physical console. Yes, that's right. It couldn't even be controlled from its own keyboard. It had to be forced to shut down by holding its power button depressed for >5 seconds, as per its operator’s manual.

    This is almost EXACTLY what we have been seeing on our servers all along with version 3 of EAV. The ONLY difference is that, this time, I didn’t see the Event ID 6004 error in the System log of the compromised server.

    The server's shares could still be seen from remote systems, but it was not possible to log on to the server locally or remotely. If it continued to run, I would expect it to slowly sink into total unresponsiveness even to requests for access to network shares, as our other servers did previously when they were running EAV 3.x.

    NOT FIXED!

    Sorry. I hope Eset can get this corrected soon. As it stands, the software is utterly useless to us.
     
  13. mps_surcouf

    mps_surcouf Registered Member

    Joined:
    Mar 5, 2008
    Posts:
    33
    Hi CrookedBloke

    RE: browsing shares on the same server

    That will always produce 3019 errors, even on a machine I have here that never had any AV on it. I don't fully understand why, but if you google it you will see this is "normal" behaviour and nothing to do with ESET. I think this might be a red herring in your case.

    I don't have v3 on my servers and don't plan to after your comments.
    However, you do seem to be in a unique situation. Could anyone else confirm this behaviour? This is not to say it isn't very important to resolve, just that we could do with a few other instances of this to study.
    I will be setting up a test server soon and may be able to help.

    It seems strange that my client machines do not become unresponsive due to this driver. I always thought the network stacks were very similar. I was wondering if you could list the apps running on your server. It seems to me the most likely explanation is that EAV 3.0 is not dealing with a particular application correctly, perhaps some sort of indexing or database app.

    Cheers

    Mike
     
  14. Marcos

    Marcos Eset Staff Account

    Joined:
    Nov 22, 2002
    Posts:
    14,456
    If someone happens to replicate the problem on servers, please enable complete memory dumps and create one when the server stops responding (instructions are available at Microsoft's website). I'm aware that no one will be willing to try it on production servers, but in case someone manages to replicate it, we'd highly appreciate a memory dump. Please PM me if you have one for perusal.
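
As a rough illustration of what enabling complete memory dumps involves, here is a minimal Python sketch that only builds the text of a .reg file; it does not touch the registry itself. The thread does not quote the Microsoft instructions, so the key and value names below are an assumption drawn from general Windows documentation and should be checked against the Microsoft article before use. `CrashDumpEnabled` set to 1 selects a complete memory dump, and `CrashOnCtrlScroll` enables the keyboard-triggered crash for PS/2 keyboards handled by the i8042prt driver. The function name `crash_dump_reg` is hypothetical.

```python
def crash_dump_reg(enable_keyboard_crash: bool = True) -> str:
    """Return .reg file text that requests complete memory dumps."""
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        r"[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\CrashControl]",
        '"CrashDumpEnabled"=dword:00000001',  # 1 = complete memory dump
    ]
    if enable_keyboard_crash:
        # With this set, holding right CTRL and pressing SCROLL LOCK twice
        # forces a bugcheck (PS/2 keyboards only).
        lines += [
            "",
            r"[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\i8042prt\Parameters]",
            '"CrashOnCtrlScroll"=dword:00000001',
        ]
    return "\n".join(lines) + "\n"


print(crash_dump_reg())
```

The output would be saved as a .reg file, imported on the test server, and followed by a restart. A complete dump also needs a page file on the boot volume at least as large as physical RAM, so that is worth checking beforehand.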
     
  15. CrookedBloke

    CrookedBloke Registered Member

    Joined:
    Oct 15, 2007
    Posts:
    110
    Hi, yes I'm aware that Event ID 3019 warnings are created when browsing a server's own network shares with Explorer. But what I started seeing with EAV 3 (as opposed to NOD32 2.7 and SAV CE) were LOTS of these warning messages caused by other actions -- like using Robocopy to mirror shares on different servers.

    The test machine on which I performed this experiment is not running ANY application whatsoever. This has absolutely nothing to do with applications. I've been working on the 3019/6004 problem since December. My situation is not at all unique. There are quite a few posts here about servers becoming unresponsive when running EAV 3.x.

    I have seen hearsay (not direct information from Eset) from some on this forum and in other places that Eset was blaming the Event ID 6004 issue on Microsoft -- claiming that the error is an inappropriate response by the OS to something the driver is doing. I hope that Eset's development team is not going on that assumption. What I'm seeing is almost certainly a malfunction of the driver, NOT of the OS. I have seen this behavior on WS2000 SP4, WS2003 SP2, and WS2003 R2 SP2. I have seen it reported on other operating system configurations. This is a problem with the Eset software, NOT with the operating system.
     
  16. CrookedBloke

    CrookedBloke Registered Member

    Joined:
    Oct 15, 2007
    Posts:
    110
    I will try to do this tomorrow, if time permits. I will be in touch with you if I have questions. I can't test this system to destruction because it must be available as a spare in case of failure of another system, but I'll do my best.

    Edit: On second thought, I would appreciate a much more explicit set of instructions from you. The MSKB article is one with which I am (or at least think I am) familiar. I don't think it will work for us under these circumstances. I cannot force a memory dump from a keyboard when the console has been rendered inoperable by the condition I'm trying to replicate. -- Or am I missing something?
     
  17. mps_surcouf

    mps_surcouf Registered Member

    Joined:
    Mar 5, 2008
    Posts:
    33
    Hi Crooked bloke

    First off I want to see this fixed as much as you.

    If the situation is that simple, i.e. file serving only, I don't see why it can't be reproduced quite quickly.

    If you send me your EAV config file and tell me how to reproduce it, I will happily give it a go.

    I have a backup box here with Windows 2003 SP2 that I can install EAV 3.0 on.
    If I know how to make it die, i.e. a copy script or something, I should be able to confirm quite quickly.

    Cheers

    Mike
     
  18. CrookedBloke

    CrookedBloke Registered Member

    Joined:
    Oct 15, 2007
    Posts:
    110
    Hi, Mike.

    It isn't complicated at all. From my communications with others on this matter you don't need my config file or any information specific to my locale or system. All you have to do is install the antivirus software on the server, configure it to your liking, log on to the server via RDP (version doesn't matter), and browse the server's own network shares with Windows Explorer. Then you just quit the Explorer instance and (try to) log off the server.

    The most interesting thing about this (to me, at least) is that it gets progressively worse over time on a given server. Early on, in December, I ran EAV 3.x for a couple of weeks before I actually encountered the problem with loss of responsiveness to RDP and console sessions. The system I tested today had been running 2.70.39 for a few weeks with no problem whatsoever. I removed that version of NOD32 completely, rebooted, installed 3.0.642, replaced the driver file (in both locations), rebooted, and then browsed the network shares with Explorer.

    When I tried to log off of the remote session the logoff screen just froze. I let it stay that way for 30 minutes before I finally just terminated it. I was unable to log on again remotely. I went to the site where the server was and tried to log on locally. No dice. I had to force the system to shut down to restore responsiveness to console or RDP. During this time the shares on the server could still be seen from other systems in the domain. I have seen systems in this state gradually stop responding to requests for share access over a period of hours to days.

    When I restarted this system and logged on I saw the Userenv warning I had expected to see, but no other warnings or errors -- other than the Event ID 3019 warnings related to the network share browsing. Before the "fixed" driver was installed I would have seen many more 3019 warnings and just about one or two 6004 errors in the System log before a failure.

    Marcos is asking for a user to produce a memory dump from the keyboard when the condition is duplicated, but I don't think that can work because once the condition is duplicated, the keyboard has no effect upon the system. I may give it a go tomorrow anyway, if I get a chance. But this is not really high on my list of priorities right now. I've been working on it for too long, with far too little to show for it.

    If you take a look around I'm sure you'll find quite a bit of information about this issue, though many related posts have not been very informative. I've tried to provide as much information as possible, but the problem has been very easy for me to reproduce (too easy, in fact) on a variety of systems. And I've seen many other users make the same assertion. I had to totally remove it from my production domain -- twice. I would never have tried a new and unproven AV on that domain but we had run into catastrophic failures with SAV CE and had to replace the AV software pronto. On the other hand, NOD32 runs on all of these systems with nary a glitch. I can even schedule in-depth on demand scans during operations with no discernible effect upon production.
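
For anyone who wants something closer to the "copy script" asked about earlier, here is a minimal Python sketch that repeatedly enumerates a share to generate browsing traffic. This is an assumption about what might exercise the same path: the failures described in this post were triggered by browsing with Windows Explorer inside an RDP session, and a script may well not reproduce them. The function name `browse_share` and its parameters are hypothetical.

```python
import os
import time


def browse_share(root: str, passes: int = 1, pause_s: float = 0.0) -> int:
    """Walk every directory under root `passes` times.

    Returns the total number of directory entries seen. Note that os.walk
    silently skips directories it cannot read.
    """
    seen = 0
    for _ in range(passes):
        for _dirpath, dirnames, filenames in os.walk(root):
            seen += len(dirnames) + len(filenames)
        time.sleep(pause_s)  # optional gap between passes
    return seen
```

On the test server it could be pointed at the server's own UNC path, for example `browse_share(r"\\SERVERNAME\sharename", passes=50)`, where the server and share names are placeholders.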
     
  19. guest

    guest Guest

    Those of you having server lockups, have you looked at this?

    http://support.microsoft.com/kb/948496

    The following issues may occur when Windows Server 2003 SNP is turned on:
    • When you try to connect to the server by using a VPN connection, you receive the following error message:
    Error 800: Unable to establish connection.
    • You cannot create a Remote Desktop Protocol (RDP) connection to the server.
    • You cannot connect to shares on the server from a computer on the local area network.
    • You cannot join a client computer to the domain.
    • You cannot connect to the Exchange server from a computer that is running Microsoft Outlook.
    • Inactive Outlook connections to the Exchange server may not be cleaned up.
    • You experience slow network performance.
    • You may experience slow network performance when you communicate with a Windows Vista-based computer.
    • You cannot create an outgoing FTP connection from the server.
    • The Dynamic Host Configuration Protocol (DHCP) server service crashes.
    • You experience slow performance when you log on to the domain.
    • Network Address Translation (NAT) clients that are located behind Windows Small Business Server 2003 or Internet Security and Acceleration (ISA) Server experience intermittent connection failures.
    • You experience intermittent RPC communications failures.
    • The server stops responding.
    • The server runs low on nonpaged pool memory.
     
  20. Marcos

    Marcos Eset Staff Account

    Joined:
    Nov 22, 2002
    Posts:
    14,456
    If you manage to replicate the problem, it would also help us to know:
    1. whether disabling the real-time protection makes a difference (e.g. disable automatic real-time protection startup and restart the server)
    2. whether browsing to \\computer_name produces error 8003 in the system event log
     
  21. CrookedBloke

    CrookedBloke Registered Member

    Joined:
    Oct 15, 2007
    Posts:
    110
    guest

    Thanks for your suggestion, but, no, the MSKB article to which you refer doesn't describe this situation. For one thing, most of our servers are WS2000 SP4, not 2003. There are some similarities to the issues described in the article, but also considerable differences. Besides, the problem doesn't occur when the servers are using ANY other antivirus software besides EAV 3.x. (Have tried NOD32 2.7, SAV CE various versions, McAfee various versions, and Trend various versions.) The problem is definitely an incompatibility between our server operating systems (in several quite different configurations) and EAV 3.x.

    Are you running a standard version of Windows Server (2000 SP4, 2003 SP2, or 2003 R2 SP2) with EAV 3.x that does not exhibit this behavior? So far, the only people I've actually run into (admittedly on a forum where problems are reported) who claim to be successfully running EAV 3.x on one of these server versions without having serious issues with network shares are folks who are running Advanced or Enterprise versions of the OS. Not sure whether or not that is really what makes a difference or whether it could be a hardware issue.

    The servers I'm using are Dells with mirrored RAID on the OS drives and RAID 5, RAID 10 or RAID 50 on external arrays, all using various PERC controllers. They are dual Pentium III or dual Xeon systems with memory ranging from 512 MB to 2 GB. Some are DCs, some are just member servers, some have SQL Server 2000 running on them, and some have no app software running on them at all. All of them respond in the same way to EAV 3.x. None of them have this problem with any other antivirus software, though two of them were suffering spontaneous reboots under very heavy network traffic loads (gigabit) when they were running SAV CE. (That was what prompted us to change antivirus software from Symantec to Eset.)

    The responsiveness issue does NOT occur if EAV's real-time protection is turned off. But that makes the software useless to us. So I've reverted the domain to version 2.7 -- twice. Been working on this since December. And there are no other ongoing network / domain configuration problems here.

    Thanks again for your suggestion.
     
  22. CrookedBloke

    CrookedBloke Registered Member

    Joined:
    Oct 15, 2007
    Posts:
    110
    Hi, Marcos.

    I may have covered this in our previous communications, but thought I should mention it here:

    1. Disabling real-time protection makes the problem stop happening. No matter how hard I try, with real-time protection disabled, I cannot duplicate the issue.

    2. Browsing to \\computer_name has not produced error 8003 in the System logs at any time AFAICT.

    Unfortunately, I was called and asked to put the system I was using for testing back into production, so I'm not going to be able to participate any further in testing for now.
     
  23. guest

    guest Guest

    I have 16 servers running a mix of Server 2003 Enterprise and Standard, both 32-bit and 64-bit. Each has NOD32 3.0.621 or 3.0.642 installed.

    So far only one of them has locked up like previous users have described. It took a hard reboot (during production hours) to get it back online. It was a file server which is accessed by about 25 users.

    I have network scanning disabled on all installs of Nod32. I can't say my lockup was caused by Nod32 yet.
     
  24. CrookedBloke

    CrookedBloke Registered Member

    Joined:
    Oct 15, 2007
    Posts:
    110
    guest,

    Thanks for that information. To give me a better idea of what's going on, would you mind telling me what you mean when you say that you "have network scanning disabled"? Are you excluding network shares from coverage by real-time protection, or do you mean something else?

    I don't know your network, but I'd be pretty suspicious about that server lockup you mentioned. My domain here is a known quantity. The ONLY functional issues it has experienced in six years of operation have involved antivirus software -- Symantec AV Corporate Edition and Eset EAV 3.x.

    SAV CE was able to cause spontaneous reboots on two of our most heavily laden servers on occasion. I debugged dumps that proved this to Symantec's satisfaction. I was unable to work with their upper level support on the issue because the ONLY way we (or anyone else I know of) could duplicate this issue was on a system that was really being hammered on a big network with a ton of clients. We had to switch to another client very quickly.

    NOD32 2.7 worked very well for us, but was somewhat lacking in admin capabilities. That's why I went to version 3.x when it came out. That was a disaster for the aforementioned reasons, and we have had to revert the domain to 2.7.

    Interestingly, the problems were initially kind of slow to come on. But now I can guarantee the reproducibility of the problem with 3.x within minutes of installing and testing the software on a server. I have to wonder if one or more MS updates have exacerbated the issue, but the errors were there all along, as I can see in my logs. It is just that the responsiveness issue has got worse.

    Turning off real-time protection altogether does prevent the problem from happening, but killing RT protection, or even just excluding network shares from protection, is not an option for us. Our customers send data to us that goes onto those shares. A slip-up by one customer could have disastrous results for all. And the lead time from getting a significant portion of the data to putting it into use on the production network is nil. There is no way to get the data first, scan it, and then place it on the network as a separate operation.

    I'm between a rock and a hard place.

    BTW, for grins I have actually eliminated all recommended exclusions for NOD32 2.7 on our most heavily hammered server and run an in-depth scan on it while in full production -- with no ill effects. That's kind of impressive.
     
  25. Marcos

    Marcos Eset Staff Account

    Joined:
    Nov 22, 2002
    Posts:
    14,456
    There is a special AMON driver available for testing which has one of the new scanning features in v3 disabled. This feature was not supported in v2. If someone experiencing a server lockup is willing to test it, please PM.
     