FT-450 Hamlib Timeout Glitch
I control my FT-450D using Hamlib. This worked fine as long as only a single program was connected to the rig. To connect multiple programs at the same time, I wanted to run the rigctld server as a service and connect the programs using model #2 (“NET rigctl”). This setup worked fine, but every now and then the connection to the FT-450D stopped working. Here is how I found out what was going on and how I fixed it.
The problem only occurred when WSJT-X or Fldigi were running in parallel with CQRLog. To find out more about the root cause, I started rigctld with tracing and timestamps enabled:
rigctld -m 127 -Z -vvvvv
This showed that every now and then there was no answer from the FT-450 after switching off the PTT:
2019-05-13:18:57:16.688467: client lock engaged
2019-05-13:18:57:16.688555: rig_strvfo called
2019-05-13:18:57:16.688578: rigctl(d): T 'currVFO' '0' '' ''
2019-05-13:18:57:16.688614: rig_set_ptt called
2019-05-13:18:57:16.688637: newcat_valid_command called
2019-05-13:18:57:16.688656: newcat_valid_command TX
2019-05-13:18:57:16.688680: newcat_set_ptt: cmd_str = TX0;
2019-05-13:18:57:16.688701: serial_flush called
2019-05-13:18:57:16.688737: cmd_str = TX0;
2019-05-13:18:57:16.688758: write_block called
2019-05-13:18:57:16.693892: write_block(): TX 4 bytes
2019-05-13:18:57:16.693992: 0000 54 58 30 3b TX0;
2019-05-13:18:57:16.694020: cmd_str = ID;
2019-05-13:18:57:16.694045: write_block called
2019-05-13:18:57:16.699202: write_block(): TX 3 bytes
2019-05-13:18:57:16.699302: 0000 49 44 3b ID;
2019-05-13:18:57:16.699327: read_string called
2019-05-13:18:57:18.701045: read_string(): Timed out 2.1690 seconds after 0 chars
2019-05-13:18:57:18.701105: serial_flush called
2019-05-13:18:57:18.701136: cmd_str = TX0;
2019-05-13:18:57:18.701147: write_block called
2019-05-13:18:57:18.706296: write_block(): TX 4 bytes
2019-05-13:18:57:18.706385: 0000 54 58 30 3b TX0;
2019-05-13:18:57:18.706398: cmd_str = ID;
2019-05-13:18:57:18.706410: write_block called
2019-05-13:18:57:18.711580: write_block(): TX 3 bytes
2019-05-13:18:57:18.711641: 0000 49 44 3b ID;
2019-05-13:18:57:18.711656: read_string called
2019-05-13:18:57:18.711709: read_string(): RX 7 characters
2019-05-13:18:57:18.711724: 0000 49 44 30 32 34 34 3b ID0244;
2019-05-13:18:57:18.711734: newcat_set_cmd: read count = 7, ret_data = ID0244;
2019-05-13:18:57:18.711747: client lock disengaged
The default timeout in Hamlib is 2 seconds when reading from the serial line. If there is no answer within these 2 seconds, Hamlib sends the command again. As you can see in the trace, the second attempt is successful. Unfortunately, WSJT-X is a bit impatient and aborts the communication with an error message after 2 seconds. You then have to restart the communication manually by clicking in the error dialog.
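Spotting these timeouts by eye in a long trace is tedious. Here is a minimal Python sketch that pulls them out of a saved trace. It assumes the trace looks like the excerpt above; the function name find_timeouts and the regular expression are my own and only match the "Timed out" message in the form shown there.

```python
import re

# Matches timeout entries in the form seen in the trace excerpt, e.g.
# "2019-05-13:18:57:18.701045: read_string(): Timed out 2.1690 seconds after 0 chars"
TIMEOUT_RE = re.compile(
    r"(\d{4}-\d{2}-\d{2}:\d{2}:\d{2}:\d{2}\.\d+): "
    r"read_string\(\): Timed out ([\d.]+) seconds after (\d+) chars"
)


def find_timeouts(trace_text):
    """Return (timestamp, seconds_waited, chars_received) for each timed-out read."""
    return [
        (timestamp, float(seconds), int(chars))
        for timestamp, seconds, chars in TIMEOUT_RE.findall(trace_text)
    ]


# Two entries copied from the trace excerpt: one normal, one timed out.
sample = (
    "2019-05-13:18:57:16.699327: read_string called\n"
    "2019-05-13:18:57:18.701045: read_string(): Timed out 2.1690 seconds after 0 chars\n"
)
for timestamp, seconds, chars in find_timeouts(sample):
    print(f"{timestamp}: no answer for {seconds:.3f} s ({chars} chars received)")
```

To use it on a real trace, pass the contents of the saved trace file to find_timeouts instead of the sample string.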
First Try: Let’s use the big gun
Being a software engineer, I’m a bit prone to over-engineering. So instead of trying to fix the problem in Hamlib itself, I decided to build a proxy that would give me full control over the communication between my programs and Hamlib. It sounded like a lot of fun with network programming and all that. It was a lot of fun indeed, except that it did not fix my problem at all.
After scratching the network-programming itch, I stepped back and did what a good engineer should do: look at all the facts, including the Hamlib documentation, and use my brain. It is possible to tweak the communication parameters in rigctld, for example the timeout for the serial line. And this was the solution: just set the timeout on the serial line (given in milliseconds) below the timeout of WSJT-X:
rigctld -m 127 -C timeout=500
Since Hamlib resends the command and the second attempt is always successful, WSJT-X does not notice the glitch and stays happy.
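The arithmetic behind this can be sketched in a few lines of Python. This is a simplified model, not a description of Hamlib's internals. It assumes that exactly one read times out, that the retry is answered after about 10 ms (roughly what the trace shows), and that WSJT-X gives up after the 2 seconds observed above. All names are made up for illustration.

```python
CLIENT_PATIENCE_S = 2.0  # assumption: WSJT-X gives up after about 2 seconds


def time_to_answer(serial_timeout_s, failed_attempts=1, reply_time_s=0.01):
    """Seconds until the rig's answer arrives if the first reads time out.

    Each failed attempt costs one full serial timeout; the successful
    retry then takes reply_time_s (about 10 ms in the trace).
    """
    return failed_attempts * serial_timeout_s + reply_time_s


# Compare the default 2 s timeout with the 500 ms setting.
for timeout_s in (2.0, 0.5):
    total = time_to_answer(timeout_s)
    verdict = "client gives up" if total >= CLIENT_PATIENCE_S else "client never notices"
    print(f"serial timeout {timeout_s:.1f} s: answer after {total:.2f} s, {verdict}")
```

Under these assumptions, a 500 ms timeout would even leave room for up to three failed attempts before the 2 second limit is reached.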
That was easy.