Help needed analyzing WireShark capture file

rookiez · Jun 10, 2008

Hi,

I would appreciate it if someone could help me analyze a capture file that I saved with WireShark.

Lately, throughput from our backup server has dropped. We have a separate network team that manages the Cisco switches and routers. They say utilization on the switches is low, but they do see packets being dropped or lost, and they claim the server is the cause.

I have done everything possible on this backup server, which runs on HP DL380 hardware: I have upgraded the firmware, drivers, NICs, etc. I have also installed WireShark and used the Expert Info feature, which shows a lot of bad TCP checksums, TCP duplicate ACKs, TCP previous segment lost, TCP out of order, and TCP fast retransmission.
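For readers unfamiliar with the "bad TCP checksum" item: the checksum is a 16-bit ones' complement sum over the segment, and a receiver (or an analyzer) recomputes it to detect corruption. The sketch below shows only the arithmetic, on a made-up 8-byte buffer whose bytes 4-5 stand in for the checksum field; a real TCP checksum also covers a pseudo-header built from the IP addresses. One general caveat, not a diagnosis of this capture: when the capture is taken on the sending machine and the NIC performs checksum offload, the analyzer can see outgoing packets before the NIC has filled in the checksum, so bad checksums on outbound packets are not by themselves proof of corruption.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit ones' complement checksum (the arithmetic used by TCP/IP)."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]  # add each big-endian 16-bit word
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back in
    return ~total & 0xFFFF

# Hypothetical buffer: bytes 4-5 play the role of the checksum field (zeroed).
segment = bytearray(b"\x12\x34\x56\x78\x00\x00\x9a\xbc")
checksum = internet_checksum(bytes(segment))
segment[4:6] = checksum.to_bytes(2, "big")

# With the checksum filled in, recomputing over the whole buffer gives 0.
verified = internet_checksum(bytes(segment)) == 0

# Flipping a single bit makes verification fail, which is what gets flagged.
corrupted = bytearray(segment)
corrupted[0] ^= 0x01
corrupted_ok = internet_checksum(bytes(corrupted)) == 0

print(hex(checksum), verified, corrupted_ok)
```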

I have no experience with WireShark, and unfortunately our network team can't interpret the capture file. Can someone please assist and advise on what may be causing the problem and the drop in throughput?

Thanks in advance for any advice that can be provided.

Mrkvonic · Jun 10, 2008

Hello,

Could you please tell us what the layout of the network is?

LAN?
What OS?
Does it have a DHCP / DNS server?
Did you introduce anything new to the network lately?
Do you have firewalls controlling the network traffic?
Do you use any traffic shaping software?
What is the size of the MTU?

Could you please check the routes from and to the backup server?
Could you please ping / traceroute the backup server?
What is the Metric of your routes to the server?
Does the problem occur equally for every host trying to access the server?

Do you have access to any Linux terminal?

Could you please look into the packet details and see what the sources of those packets are? Are the errors limited to a particular network segment or hosts or ports?
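One way to answer that last question is to export the packet list to CSV and count the flagged packets per source/destination pair. The following is a minimal sketch, assuming the export has columns named `Source`, `Destination` and `Info`; the column names, the keyword list and the sample rows (which use documentation-only addresses) are illustrative assumptions, not data from this capture.

```python
import csv
import io
from collections import Counter

# Lower-case fragments to look for in the Info column (assumed wording;
# adjust to whatever your export actually shows).
KEYWORDS = ("dup ack", "previous segment", "out-of-order", "retransmission")

def tally_by_pair(csv_text, keywords=KEYWORDS):
    """Count flagged packets per (source, destination) pair."""
    counts = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        info = row["Info"].lower()
        if any(keyword in info for keyword in keywords):
            counts[(row["Source"], row["Destination"])] += 1
    return counts

# Made-up sample rows, for illustration only.
SAMPLE = """Source,Destination,Info
192.0.2.10,192.0.2.20,[TCP Dup ACK 5#1] 445 > 1033 [ACK]
192.0.2.10,192.0.2.20,[TCP Dup ACK 5#2] 445 > 1033 [ACK]
192.0.2.20,192.0.2.10,[TCP Fast Retransmission] 1033 > 445 [ACK]
192.0.2.30,192.0.2.20,1044 > 445 [ACK] Seq=1 Ack=1
"""

counts = tally_by_pair(SAMPLE)
for pair, n in counts.most_common():
    print(pair, n)
```

If most of the flagged packets share one pair or one segment, that narrows the search; an even spread points at something common to all paths.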

Mrk

rookiez · Jun 10, 2008

Hi,

Thanks for taking the time to look into this. Before I continue, I should say that I do not have all of the information you asked for, but I will answer to the best of my knowledge.

All of our systems are on a LAN with 2GB connections. The backup server, which hosts the backup application, sits on a 167.x.x.x network and runs Windows 2003. A separate NAS device, where all the backup data is stored, is on a 10.x.x.x network. The NAS device runs a BSD Unix operating system; it has minimal commands and is basically just a huge storage device.

The backup server talks to other servers through the backup agent to start a backup. Servers with backup agents sit on various network segments created by our network administrator. From what I'm told, the backup server is on a different switch from the NAS device and also on a different VLAN.

To simplify things, I'm concentrating only on the traffic between the NAS device and the backup server. I have placed a new test server (running Windows 2003) on the same switch and VLAN as the NAS device. I believe I was able to replicate the slow throughput by doing a regular file transfer: a 512MB file takes approximately 15 minutes to transfer.
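As a rough sanity check on that test transfer, the effective rate can be worked out directly. This sketch assumes "512MB" means 512 * 1024 * 1024 bytes and that the transfer took exactly 15 minutes; both figures are approximate.

```python
def effective_throughput(size_bytes, seconds):
    """Return (megabytes per second, megabits per second)."""
    bytes_per_second = size_bytes / seconds
    mbytes_per_s = bytes_per_second / (1024 * 1024)
    mbits_per_s = bytes_per_second * 8 / 1_000_000
    return mbytes_per_s, mbits_per_s

size = 512 * 1024 * 1024   # assumed file size in bytes
duration = 15 * 60         # assumed transfer time in seconds

mb_s, mbit_s = effective_throughput(size, duration)
print(f"{mb_s:.2f} MB/s, {mbit_s:.2f} Mbit/s")  # prints "0.57 MB/s, 4.77 Mbit/s"
```

That is under 5 Mbit/s, a very small fraction of a gigabit-class link.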

Our environment does have DHCP and DNS servers. But the servers are configured to use static IP addresses. The servers do point to the proper DNS servers.

Our network team introduced a new IPS device into the environment, but I was told that they have since put everything in bypass mode. We have not been able to get them to turn the IPS device off and take it out of the equation entirely.

I also do not believe we have any traffic shaping software in our environment.

The MTU is set to 1500 on the servers. I cannot confirm that it is the same on the switches, but our network team tells me it is set to 1500 there as well.
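The path MTU can also be checked end to end, without relying on the switch configuration, by pinging with the Don't Fragment flag set and the largest payload that fits. For a 1500-byte MTU that payload is 1500 minus 20 bytes of IPv4 header (without options) minus 8 bytes of ICMP header. The sketch below only builds the command line; `nas-device` is a placeholder host name, not a real one from this network.

```python
IPV4_HEADER = 20  # bytes, assuming no IP options
ICMP_HEADER = 8   # bytes, echo request header

def max_ping_payload(mtu):
    """Largest ICMP payload that fits in one unfragmented IPv4 packet."""
    return mtu - IPV4_HEADER - ICMP_HEADER

def df_ping_command(host, mtu, windows=True):
    """Build a ping command with Don't Fragment set and a full-size payload."""
    size = max_ping_payload(mtu)
    if windows:
        return f"ping -f -l {size} {host}"   # Windows: -f sets DF, -l sets size
    return f"ping -M do -s {size} {host}"    # Linux iputils equivalent

print(df_ping_command("nas-device", 1500))  # prints "ping -f -l 1472 nas-device"
```

If a 1472-byte ping with DF set fails while a smaller one succeeds, some hop on the path has an MTU below 1500.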

I am able to ping successfully between the backup server and the NAS device.

The trace route takes 3 hops in both directions.
The metric is 10.

The problem seems to vary. We did several tests, and the file transfer seems to be "fast" when the servers are on the same switch (regardless of VLANs). With server1 on the same switch as the backup server, we don't seem to have slow throughput. With server2 on the same switch as the NAS device, we don't seem to have the slow throughput problem either.

We did another file transfer test between server1 and server2 and did not see any slow throughput. We originally thought the NAS device was the cause of the problem, but that doesn't explain why throughput is fine when we transfer files to or from a server on the same switch. I have called our vendor to verify that the NAS device is working and configured properly.

The errors are not limited to a particular segment or set of hosts. I have installed WireShark on several servers in different segments, and on all of them I get a lot of bad TCP checksums, TCP duplicate ACKs, TCP previous segment lost, TCP out of order, and TCP fast retransmission.
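Several of these items usually appear together. When a segment is lost or reordered on the path, the receiver keeps repeating the same acknowledgement number (duplicate ACKs), and under standard TCP fast retransmit the sender resends the missing segment once it has seen three duplicates. The sketch below is a simplified illustration of that counting rule; the acknowledgement numbers are made up, and real TCP applies further conditions before treating an ACK as a duplicate.

```python
def fast_retransmit_points(ack_numbers, threshold=3):
    """Return the indexes at which the duplicate-ACK count reaches threshold."""
    triggers = []
    last_ack = None
    dup_count = 0
    for i, ack in enumerate(ack_numbers):
        if ack == last_ack:
            dup_count += 1
            if dup_count == threshold:
                triggers.append(i)  # sender would fast-retransmit here
        else:
            last_ack = ack          # a new ACK number resets the count
            dup_count = 0
    return triggers

# Made-up ACK stream: 2460 is repeated while the receiver waits for a gap to fill.
acks = [1000, 2460, 2460, 2460, 2460, 5380]
print(fast_retransmit_points(acks))  # prints [4]
```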

I hope I have answered all your questions. If you need more info, please let me know.

Thanks again for any help that you can provide!


Attached Files:

1.JPG
