Help! Corruption when xfering large files on gigabit network


shoek
March 2nd, 2005, 02:30 PM
I'm hoping someone can give me something to go on here, because I've been pulling my hair out for about a week now on this one...

I just built a new workstation, and in the course of getting the machine up and running I was using TrueImage to restore a disk image stored on the server on my gigabit network. The restore failed with an "Image Corrupted" message... long story short, I find that when I transfer very large files (like > 2GB on up to 35GB) between my server and workstation, I get invisible corruption of the file. By invisible I mean that I would never have known the file was corrupted until I went to use it had I not had TrueImage verify fail on me. I subsequently used MD5sum on the file to verify its corruption (see Test Case below).

My setup:

Server
Asus A7M266-D, dual Athlon 2400's, 512MB RAM
HighPoint RocketRAID 1820A with 6 Maxtor DM9+ 160GB's in RAID5 (in 64-bit/66MHz slot)
Broadcom BCM5701-based gigabit NIC (in 64-bit/66MHz slot)

Workstation
Asus NCCH-DL, dual Xeon 3.0GHz, 1GB DDR400 RAM
Maxtor DM9+ 80GB IDE drive
Broadcom BCM5703-based gigabit NIC (in 64-bit/66MHz slot)
Intel Pro/1000 CT onboard gigabit NIC (using the CSA 266MHz connection to the 875P Northbridge)

Network
Netgear GCS105 Gigabit switch
30-40ft CAT5e cable between workstation and switch
2ft CAT5e patch cable between server and switch

Test Case
Using Windows Explorer, find the test file on the server's share
Copy the file to the workstation's drive. I get 20-40Mbs throughput when doing this (pretty good for gigabit IMO)
MD5sum the file on the workstation
Remote desktop into the server, MD5sum the file on the server

THE FILES ARE DIFFERENT!
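
For anyone who wants to repeat the checksum step without the md5sum tool, here is a minimal sketch in Python. It is an assumption on my part that Python is available on both machines; the function name md5_of_file and the 1MB chunk size are just illustrative choices. It reads the file in chunks so a 35GB image does not have to fit in RAM.

```python
import hashlib


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of a file, reading it in fixed-size chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:  # empty read means end of file
                break
            digest.update(chunk)
    return digest.hexdigest()
```

Run it once against the copy on the workstation and once locally on the server, then compare the two hex strings. Any difference means the copy is not byte-identical to the original.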

What I've Tried (that didn't help)
Ran new CAT5e cable
Tried both the Intel and Broadcom gigabit NICs in the workstation
Turned off Large Send Offload and Checksum Offload on workstation's NIC (server NIC does not support these features)
Tried a virgin install of XP on the workstation
Updated Broadcom BCM5703 BIOS to latest
Updated NIC drivers to latest
Wrote Broadcom support - they suggested turning off Large Send Offload and/or Checksum Offload, but this did not help
Popped an old Netgear 10/100 NIC into the workstation and, while slow, it transferred the file just fine
Transferred from the server to a P3-1GHz machine with Intel Pro/1000 MT Desktop NIC and, while slower (i.e. 15Mbs), the file transferred just fine

What I intend to try
Waiting for a new Gigabit switch that supports jumbo frames
Try the BCM5703 in a 32-bit/33MHz PCI slot in the workstation

One thing that I should mention... after a late night trying to figure this out, something got me on the idea of playing with the RWIN and MTU and other TCP/IP parameters on the workstation. I used speedguide.net's TCPOptimizer to set the RWIN large (~500000 bytes) and the MTU to 1500, along with some other settings. I ran a test with a 6GB file and it worked, so I was encouraged and went to sleep. The next morning I tried the big 35GB file and it came through corrupted.

Anyone have any ideas that I can try? Thanks in advance everyone...
-shoek

mikeblas
March 2nd, 2005, 06:03 PM
-{ Quote: "I get 20-40Mbs throughput when doing this (pretty good for gigabit IMO)" }-

You should be able to do better than that, especially with the hardware you have. Turning on jumbo packets will be very beneficial.

I found the HGS16ST from Hawking Technologies (http://www.newegg.com/app/viewProductDesc.asp?description=33-164-009&depa=0) to be a nice, semi-managed and very inexpensive gigabit switch that supports jumbo packets. With this switch I've gotten transfers over 85 megabytes/second.

I'm afraid I can't offer much advice about your corruption problem, though.

.B ekiM

Menorcaman
March 2nd, 2005, 06:05 PM
Hello shoek,

Although this <previous thread> (http://www.wilderssecurity.com/showthread.php?t=65129) deals with corruption of large files when xfering over USB, it might be worth running MemTest86+ overnight as suggested by a few users. Also, as mentioned in the same thread, select a fixed file size of 700MB when creating your images. The smaller individual .tib files should prevent corruption until you fathom out (hopefully!!) the root cause of your problem.
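
In the same spirit of working with smaller pieces, hashing a big file block by block can show where in the file the corruption lands, which may help tell a one-off glitch from a repeating pattern. A rough sketch in Python follows. The names block_digests and differing_blocks and the 64MB block size are my own illustrative choices, not anything from TrueImage or md5sum.

```python
import hashlib


def block_digests(path, block_size=64 * 1024 * 1024):
    """Return a list of hex MD5 digests, one per fixed-size block of the file."""
    digests = []
    with open(path, "rb") as f:
        while True:
            block = f.read(block_size)
            if not block:  # empty read means end of file
                break
            digests.append(hashlib.md5(block).hexdigest())
    return digests


def differing_blocks(a, b):
    """Return indexes of blocks that differ or exist in only one of the lists."""
    longest = max(len(a), len(b))
    return [
        i for i in range(longest)
        if i >= len(a) or i >= len(b) or a[i] != b[i]
    ]
```

Run block_digests on the server's copy and on the workstation's copy, bring the two lists together (a text file will do), and pass them to differing_blocks. Multiplying a reported index by the block size gives the approximate byte offset of the damaged region.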

Regards

gszatkowski
April 26th, 2005, 01:58 PM
I am having the same problem with server 8.0. File copies (robocopy, xcopy and drag/drop) in the 600MB - 4GB range are corrupted (verified via md5sum). This happens on Intel 100Mb PCI cards on 2k and 2k3 server. Removing the Acronis server program causes the problem to stop. Copies from one server hard disk to another do not have the problem.

A better solution would be nice.

gjs

regnim
April 26th, 2005, 03:12 PM
#1 Bad Memory
#2 Overclocking
Bet it is one of those...

MiniMax
April 26th, 2005, 03:55 PM
Just guessing ... IRPStackSize problems? You will find plenty of posts here talking about IRPStackSize. Here is one:

http://www.wilderssecurity.com/showthread.php?t=66633

Outside links:
http://www.windowsnetworking.com/kbase/WindowsTips/Windows2000/UserTips/Miscellaneous/Windows2000Event2506IRPStackSizeError.html
http://service1.symantec.com/SUPPORT/tsgeninfo.nsf/docid/2000092713243506
http://www.dslreports.com/faq/4817


Worth trying

Acronis Support
April 27th, 2005, 09:12 AM
Hello shoek,

Thank you for choosing Acronis Workstation Disk Backup Software (http://www.acronis.com/enterprise/products/ATICW/).

Could you please send an email to support@acronis.com along with the link to this thread? It will allow us to investigate the problem.

Thank you.
--
Irina Shirokova