AV Definition update caused error

Discussion in 'ESET NOD32 Antivirus' started by ThomasAdams, Sep 2, 2010.

Thread Status:
Not open for further replies.
  1. wenetu

    wenetu Registered Member

    Joined:
    Sep 2, 2010
    Posts:
    1
    Same problem here in Italy, and it's past midnight!

    It wasn't easy to understand that the problem was caused by the NOD32 definition update (I didn't see the ekrn.exe memory error at first), especially with 100 clients screaming like mad that their PCs were unusable!

    I wonder if ESET will do more testing on its updates.

    Thanks to everybody on the forum, and thanks to Alan, who offered to bring me a pizza.

    Goodnight
    Maurizio
     
  2. 8bit

    8bit Registered Member

    Joined:
    Jun 18, 2008
    Posts:
    9
    So far so good! The tool is working like a champ for me. No rebooting at all!

    Thank you!
     
  3. 0verlord

    0verlord Registered Member

    Joined:
    Dec 18, 2008
    Posts:
    17
    All my servers are back up; luckily I only had to reboot two. Counting the minutes till I can go home and have a very stiff drink.
     
  4. tommycat1313

    tommycat1313 Registered Member

    Joined:
    Sep 2, 2010
    Posts:
    8
    I have spent the entire day remedying this problem. I almost got fired for it. I promise ESET that THEY will get fired if anything like this ever happens again.

    I spend too much time fighting spyware and viruses to worry about fighting the software that is supposed to prevent them.

    Thank you ESET for turning me into a fully-fledged alcoholic. :mad:

    ---TC
     
  5. murdamcloud

    murdamcloud Registered Member

    Joined:
    Jul 30, 2007
    Posts:
    19
    This thread and the extra work I've had to do over the past 24 hours have made me think about two things (ok, two more things). <soapbox>
    1. The nature of responsibility in a situation like this. Blaming is easy but taking responsibility is harder for me to do. I like shifting responsibility. Ultimately though, servers and clients on the network I look after are pretty much my responsibility. I chose what goes on them and how and when. But I'm squeezed by time and money and resources so I have to resort to trusting that people I don't know, who write software updates in countries I have never visited, will get it right day after day after day. And let's be realistic here, it could be any company we trust that gets it wrong. (Dell shipping malware on servers? McAfee's problems earlier this year? Dodgy antenna on an iPhone? The list goes on.) This brings me to:

    2. There are so many unintended consequences in life that I have begun to think that we do not understand the nature of risk and don't really have enough robustness built into our systems.
    For instance, your server has RAID and you back up every night according to some rotation that will ensure you have several chances of getting your data back or keeping it available WHEN something goes wrong. No ifs. When. A long time ago, someone spilled a coffee on their server, lost their data and thought, 'Jeez, I wish I could figure out a way to make sure that when that happens again, I don't get fired.'
    This ESET situation has pointed out that there is a weakness in our system, i.e. some risk that most of us weren't really aware of before. If we had been, then we'd have some kind of system in place to deal with the fallout or to prevent it in the first place. Someone mentioned testing updates on our test networks before pushing to production. But he also said that it would be extremely time consuming to do so. And he was right. Who can do that? Well, can we have a reasonable expectation that ESET should be doing it better than they did this time? As Marcos said, though, the fact that it was a problem in 5417 that did not manifest until 5418 was applied gives it the look of one of those fat tail probability type happenings. Small chance piled on top of small chance has big impact. Just because it has a small chance of occurring doesn't mean we should devote fewer resources to it. We have to work out how much. Hearing people point to their experiences ("In all my years I've never had something like this happen") doesn't really convince me that it won't happen in the future or that they weren't just 'lucky'. Or that I haven't just been lucky...so far. Some variables are hidden from us; it takes events like this to reveal them. Then we have 'the prescience of hindsight', i.e. being good at predicting what just happened.<end soapbox>
     
  6. gisuck

    gisuck Registered Member

    Joined:
    Nov 4, 2008
    Posts:
    56
    I thought I'd chime in a little.

    I work in IT, and my company does run ESET as its antivirus product. We encountered a "minor inconvenience" with the 5418 update, and I cannot understand why there is such a large outcry of "OMG... who's paying my overtime. I can't get work done"

    The fix was simple. Reboot. Press Update. Reboot again. Problem solved. The business was back up and running within 2 hours of the first report of problems, including the 30 minutes I spent finding out about the bad update and how to fix it. I simply sent out an email with instructions and screenshots, and everyone did what they were told.

    For me, I cannot understand how a server can be so fragile that you "can't reboot" the server whatsoever. If your server is offline, rebooting it isn't going to make it more offline than it already is. So reboot the server and get it online as soon as possible, instead of waiting for "I need a tool so I can keep my uptime on my server to brag to all of my nerdy friends."

    Also, people might want to re-evaluate running security programs on their servers. If you think about how the huge majority of viruses need an actual application to be launched on the computer in question before the infection can take place, there really isn't a need to have it on the server. Even if the malware was stored on the server, you are pretty much 100% safe until you execute that code. If all of your clients are running ESET, the chances of an infection being spread throughout your network are almost nil.

    Keep the security on the clients. They are going to pick up the infection if you enable network scan, so what's the worry? Instead of running A/V on your Exchange server, push that work to your gateway antispam server. If there is an infection, your clients running ESET are going to pick up and clean the same infection that the ESET on the Exchange server is getting.

    I know several companies in Canada that don't have any security software on their servers, and they have been running flawlessly for years. There hasn't been an outbreak of viruses. If any type of malware infection is picked up, it's usually contained to one workstation and never spreads beyond that station. Call it luck if you want, but I just don't see how a problem could spread otherwise.

    Besides, you are going to save a lot of money and time by not having to troubleshoot performance issues caused by the security suite checking whether your server is safe all the time. You know that the hardware you have purchased is 100% dedicated to the processes you have built it for, and doesn't have to share it with a security suite to make sure you are safe.
     
  7. tommycat1313

    tommycat1313 Registered Member

    Joined:
    Sep 2, 2010
    Posts:
    8
    I can see your point (murdamcloud) and I totally understand how incredibly nuanced and granular this profession can be, and I do not envy the ESET engineers their burden.

    But here's the thing. It's a burden THEY CHOSE. I talked my superiors into purchasing ESET and they did so, on my word, because they trusted me. I'm then in essence vouching for ESET. That trust, today, was devastated because of this. I do not see a way back to simply explaining this off as "one of those crazy things that happens in I.T."...my company spent a lot of scratch with ESET and they want to know why ESET cost them a full day of productivity....guess who they turn to for answers....

    In everything I do at my company, I choose quality because I know in the long-run quality is superior, easier, less stressful. ESET was chosen (over Norton and McAfee) for their quality....much egg on face today for that.
     
  8. murdamcloud

    murdamcloud Registered Member

    Joined:
    Jul 30, 2007
    Posts:
    19
    I'm in the same boat. We are using the product across the whole of our organisation, in part because of my recommendation. Sheepish grin? You bet. That ESET is responsible for screwing up the update is without a doubt. My screw-up was in allowing the update to go ahead, i.e. trusting. How can I rectify that part of the equation? That's within my control (hopefully). The only way I can potentially change ESET's future actions is by directly giving some hellish feedback.

    I feel for everyone out there. The extra hours, the stress, the anger and frustration directed at everyone who has to deal with this 'soupe de merde'.
    Good luck. I'm glad you guys care like you do.
     
  9. TermX

    TermX Registered Member

    Joined:
    Sep 2, 2010
    Posts:
    1
    gisuck,

    The inconvenience was anything but minor for us. We are a Fortune 50 company and have NOD32 running on approximately 3000 servers for our customer base. Having this many servers drop left and right, requiring hard resets multiple times while your phones are ringing off the hook, is not a fun experience... not to mention that the workstations of our support technicians had the same issues, complicating things. It's also a major inconvenience for our customers, who are medical professionals providing care to sick patients. I don't think someone who is sick or dying agrees that it's a minor inconvenience for their doctor to be having problems pulling up their electronic medical records. Just putting this in perspective for everyone to see how this affects other people's lives and is not just an IT headache. This has generated quite a bit of attention in our organization and despite my past love for NOD32, I don't think it will have a chance of surviving in our organization, since this is the second major event caused by NOD32 in a month for our customers. The first was at the beginning of August but nowhere near as severe or widespread. This hit everything from our servers to our workstations to my home PC. We're seriously looking for other solutions now. NOD32 4.x has been a laundry list of problems since its launch and this is just the nail in the coffin, unfortunately.
     
  10. tommycat1313

    tommycat1313 Registered Member

    Joined:
    Sep 2, 2010
    Posts:
    8
    And, gisuck, I guess we do not all share your clairvoyance with the understanding of hundreds of users crying 'my computer is frozen'. I, for example, did not instantly deduce that a simple ESET update was the CAUSE of this mayhem, regardless of whether the REMEDY was simple (just a reboot...who knew??). SO we spent many valuable hours trying to discern the reason for the problem.

    Share with me your secret sir.

    --TC
     
  11. bradtech

    bradtech Registered Member

    Joined:
    Nov 16, 2009
    Posts:
    84
    In the wake of the McAfee incident I think it's sad no lessons have been learned in quality control. You would think the testing would be more widespread among different machine types in some kind of virtual machine mock environment. At the last organization I worked at, I recommended and brought in ESET. I heard today they had problems, and are restoring the file system. This is embarrassing for me, and it saddens me, having championed a product that I have recommended and brought in everywhere I went professionally. I know mistakes happen, but today my confidence in the quality control measures of the product has been hit, and I don't like my friends at my last job having their day turned to chaos by a product I recommended.
     
  12. bradtech

    bradtech Registered Member

    Joined:
    Nov 16, 2009
    Posts:
    84
    I'd also like to know from a programming standpoint what occurred. How does a definition update cause the entire ekrn process to crash? Was it a memory leak? I could see a definition causing false positives, and possibly deleting important Windows files. I've used ESET for a long time and never witnessed anything like this.
     
  13. nanana1

    nanana1 Frequent Poster

    Joined:
    Jun 22, 2007
    Posts:
    947
    This is the first time ESET screwed up on its daily virus definition updates and what a disaster it's been for so many users:thumbd:
     
  14. murdamcloud

    murdamcloud Registered Member

    Joined:
    Jul 30, 2007
    Posts:
    19
    I hope that ESET have a good review process in place for exactly this reason. I'd be interested to see if there is some transparency into the mistakes that took place, even if that sounds counter-intuitive to the powers that be.

    Perhaps some good will could be generated by showing the users/consumers just what went wrong and how it will be mitigated in the future.
     
  15. mastj25

    mastj25 Registered Member

    Joined:
    Apr 20, 2009
    Posts:
    22
    This isn't the first bad update from ESET; this is the 3rd that I can remember in the last 2 years. This was, however, the worst. About 15 hours in and only 124 more clients and servers left. We have never rebooted so many production servers. Our company definitely lost money today because of this.
     
  16. nanana1

    nanana1 Frequent Poster

    Joined:
    Jun 22, 2007
    Posts:
    947
    Will ESET compensate for the lost productivity and money ? :doubt: McAfee did some goodwill measures after their last boo-boo.;)
     
  17. kennyt2000

    kennyt2000 Registered Member

    Joined:
    Sep 2, 2010
    Posts:
    3
    Well, it's 5:30am and still going strong. Been in work for 20 hours now and still a fair few to get up and running before the 9am bedlam.

    Nice tool in the KB, shame you still have to log onto each server and workstation to apply it.

    I think some kind of good will gesture is the least they can do.
     
  18. mitch179

    mitch179 Registered Member

    Joined:
    Sep 3, 2010
    Posts:
    6
    I can't even RDP into the affected servers at my client sites. Some I can, some I can't: I get a 'Cannot locate RPC server' error when I log in... They are still running, but the ones I can't log into are a problem because:

    A) one is a 100km drive away and it just so happens it's one of the few that isn't a VM and doesn't have a remote access controller :(

    B) I can't log in to run that tool!

    Anyone got a workaround for the RPC error? Because if I can log in it's happy days.
     
  19. bradtech

    bradtech Registered Member

    Joined:
    Nov 16, 2009
    Posts:
    84
  20. the6thday

    the6thday Registered Member

    Joined:
    Sep 2, 2010
    Posts:
    2

    Just use "shutdown -i" from another workstation (while logged in as Domain administrator) and reboot the server....
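
    With a long list of machines to restart, the same remote reboot could be scripted rather than clicked through the "shutdown -i" dialog. The following is only a rough Python sketch and not something posted in this thread: it builds the command lines and prints them, it does not run anything. The host names are hypothetical placeholders, and it assumes the usual /r, /m, /t and /f switches of the Windows shutdown command; check "shutdown /?" on your own systems before relying on it.

```python
from typing import List


def build_reboot_command(host: str, delay_seconds: int = 0, force: bool = True) -> List[str]:
    """Build the argument list for restarting a remote machine with shutdown.exe.

    Nothing is executed here; the caller decides whether to run the command.
    """
    if not host or any(ch.isspace() for ch in host):
        raise ValueError(f"invalid host name: {host!r}")
    if delay_seconds < 0:
        raise ValueError("delay_seconds must be non-negative")

    # /r = restart, /m \\host = target machine, /t N = delay in seconds
    argv = ["shutdown", "/r", "/m", "\\\\" + host, "/t", str(delay_seconds)]
    if force:
        # /f = force running applications to close without warning users
        argv.append("/f")
    return argv


def plan_reboots(hosts: List[str]) -> List[List[str]]:
    """Return one reboot command per host, in the order given."""
    return [build_reboot_command(host) for host in hosts]


if __name__ == "__main__":
    # Hypothetical server names, for illustration only.
    for argv in plan_reboots(["SERVER01", "SERVER02"]):
        print(" ".join(argv))
```

    Printing the commands first makes it possible to review the list before anything is restarted. Each one would still need to be run from an account with administrative rights on the target, as the post above says.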
     
  21. jimwillsher

    jimwillsher Registered Member

    Joined:
    Mar 4, 2009
    Posts:
    667
    I had the same on a TS, and winlogon.exe crashed upon login, but a THIRD reboot sorted that.


    Jim
     
  22. Furbykiller

    Furbykiller Registered Member

    Joined:
    Sep 2, 2010
    Posts:
    3
    Okay...for those of you asking why we are moaning and bitching, and saying these things happen etc......

    The issue was not the fact that this happened....it was the fact that the lockout happened on system-critical servers over a wide geographical area and the only real solution for most of the day was to PHYSICALLY RESET the servers numerous times....do you have any idea what that did to finance servers, clustered systems and web farms?

    The knock-on effect from the end users being down as well was....well, I bet most IT departments switched off their phones.....I have no issue rebooting any of my servers in a CONTROLLED, SYSTEMATIC way, within a timeframe where solutions are in place.....having to hard reboot, and let's face it we had no real option until midnight last night, is a nightmare.

    Bitching about the overtime, cost to business etc.....we have every right to....yesterday's issues probably cost in the tens of thousands....and that's just for individual businesses.

    If you're going to put your product out there with the likes of Forefront, etc. then you have to live up to it. Let's face it, we all chose ESET for its small footprint and cost....and to be honest it's been a good product....but in this environment good sometimes is not enough.....

    Rant over....been in since 6.30am after a 12.00am finish....

    Gonna go sleep under my desk for a bit :)
     
  23. mitch179

    mitch179 Registered Member

    Joined:
    Sep 3, 2010
    Posts:
    6
    That's the plan but it won't change a thing if it doesn't come back up (It's never had an issue in the past but this update has clearly screwed with it).

    That little hotfix works magic on machines I can access though :)
     
  24. burgesshodg

    burgesshodg Registered Member

    Joined:
    Sep 3, 2010
    Posts:
    1
    New poster here, just registered so I can get my point across!

    I see a lot of posts saying blah blah get on with it, you must be new in IT and so on.

    I've been in IT for 20 years now and have never had such a costly update as this. One of my clients runs a payroll remotely through our systems for over 3000 employees. Because of the mess this update caused the BACS cut off was missed meaning my client now has to CHAPS each payment at a cost to them of over £50K!

    IT downtime at my site costs production £22K a day, we lost half a day yesterday so do the maths.

    This was a nightmare, simple as!
     
  25. luka1002

    luka1002 Registered Member

    Joined:
    Feb 24, 2010
    Posts:
    21
    5419 does not resolve anything. Tried everything...
    Seems to work better if you disable the E-mail checker... Mail is coming in, and the stations seem to crash less and be more stable.
    Waiting for today's update..
    This is total ****. I have over 15 servers and more than 250 stations, and cannot restart them every hour!!!
     