开发者

Detecting TCP dropout over an unreliable network

开发者 https://www.devze.com 2022-12-30 08:55 出处:网络
I am doing some experimentation over an unreliable radio network (home brewed) using very rudimentary java socket programming to transfer messages back and forth between the end nodes.

I am doing some experimentation over an unreliable radio network (home brewed) using very rudimentary java socket programming to transfer messages back and forth between the end nodes.

The setup is as follows:

Node A --- Relay Node --- Node B

One problem I am constantly running into is that somehow the connection drops out and neither Node A or B knows that the link is dead, and yet continues to transmit data. The TCP connection does not time out either. I have added in a heartbeat message that causes a timeout after a while, but I still would like to know what is the underlying cause of why TCP does not time out.

Here are the options I am en开发者_开发问答abling when setting up a socket:

channel.socket().setKeepAlive(false);
channel.socket().setTrafficClass(0x08); // for max throughput

This behavior is strange since it is totally different than when I have a wired network. On a wired network, I can simulate a disconnected connection by pulling out the ethernet cord, however, once I plug the cord back in, the connection becomes restablished and messages begin to be passed through once more.

On the radio network, the connection is never reestablished and once it silently dies, the messages never resume.

Is there some other unknown java implentation or setting for a socket that I can use, also, why am I seeing this behavior in the first place?

And yes, before anyone says anything, I know TCP is not the preffered choice over an unreliable network, but in this case I wanted to ensure no packet loss.


The TCP protocol was designed to be quiet. The RFC requires keepalive heartbeat no more frequent than 2 hours. Unless you have control over the system on both ends to change the default 2 hour heartbeat (sometimes, it requires kernel rebuild), you have to add heartbeat in your own app.

If you send heartbeat, it still needs to wait till Retransmit Timeout, which varies depending on the RTT. On a high-latency network, the timeout can be very high but it should be within minutes.

You get notification on local network because the system can detect link-down status and drop all connections on that network.

BTW, you want set Keepalive to TRUE, instead of false. With Keepalive, you at least get the slow heartbeat.


In the OSI 7-layer model, the first two layers are physical and data link. Your physical hardware running the data link protocol on wired ethernet can detect when the cable is pulled. Your wireless hardware, and corresponding protocol, probably not so much. The TCP stack can't do anything to timeout if the layer1/2 stuff isn't signaling that it is disconnected.


Define 'never'?

I expect you will be notified by a send failing eventually. You're probably just expecting to be notified sooner than you will be. The TCP stack will be retransmitting segments that it doesn't get ACKs for and the timeout before retransmission for each attempt is doubled each time it retransmits. Depending on how the stack is working out when to retransmit it's probably going to be longer than you're expecting before the stack will decide that the connection is broken and only then will it let you know.

See here: http://www.ietf.org/rfc/rfc2988.txt, here: http://msdn.microsoft.com/en-us/library/ms819737.aspx, etc.

You're used to having a wired network where the drivers can notify higher level layers that the connection has been physically broken. If you were to configure a wired network to route via a router which you then deliberately set up to not route correctly then you'd probably see similar behaviour....

0

精彩评论

暂无评论...
验证码 换一张
取 消

关注公众号