Monday, 17 October 2022

OpenAI Whisper joins the Radio War


Recently, I have been trying to use the OpenAI Whisper program to transcribe
and translate signals received over radio. The above screenshot is from a
Russian-language transmission on 7.07 MHz LSB in the 40m amateur band.
I used the medium model and an NVIDIA RTX 2080 Super graphics card.
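
A minimal sketch of how a recording might be passed to Whisper from Python, assuming the `openai-whisper` package is installed and the received audio has been saved to a file. The file name `recording.wav` and the helper names `decode_options` and `translate_recording` are hypothetical, chosen only for illustration; this is not the exact script used for the screenshot.

```python
def decode_options(language=None, translate=True):
    """Build the keyword options passed on to Whisper's transcribe().

    task="translate" asks for English output; task="transcribe" keeps
    the text in the spoken language. If language is None, Whisper
    tries to detect the language itself.
    """
    opts = {"task": "translate" if translate else "transcribe"}
    if language is not None:
        opts["language"] = language
    return opts


def translate_recording(path, model_name="medium", language="ru"):
    """Run Whisper on one audio file and return the recognised text."""
    import whisper  # imported here so the helper above works without it

    # load_model() downloads the weights on first use and selects a
    # CUDA GPU automatically when PyTorch can see one.
    model = whisper.load_model(model_name)
    result = model.transcribe(path, **decode_options(language))
    return result["text"].strip()


if __name__ == "__main__":
    # "recording.wav" is a placeholder for audio captured from the receiver.
    print(translate_recording("recording.wav"))
```

Leaving `language` as `None` may be more convenient on a band where several languages turn up, at the cost of an occasional wrong guess on noisy signals.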

Although the results are nowhere near perfect, it does seem to understand 
the gist of what is being said. Click on the picture above to see more detail.

The program works on a 30-second sliding window, so it has some limitations;
for example, it does not like quick-fire overs, as it loses context.
The large model performs even better but requires more VRAM than
is on the card I am using. I may at some stage try an RTX 3090, which has
24 GB of VRAM.
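
The choice of model against available VRAM can be sketched roughly as below. The figures are the approximate requirements listed in the Whisper README, not measurements from this setup. The 8 GB figure for the RTX 2080 Super, the 1 GB of headroom, and the helper name `largest_model_that_fits` are assumptions made for illustration.

```python
# Approximate VRAM needed per model in GB, as listed in the Whisper README.
# Ordered from smallest to largest model.
APPROX_VRAM_GB = {
    "tiny": 1,
    "base": 1,
    "small": 2,
    "medium": 5,
    "large": 10,
}


def largest_model_that_fits(vram_gb, headroom_gb=1.0):
    """Return the largest model whose approximate VRAM need fits the card.

    headroom_gb is an assumed allowance for the desktop and other
    programs already using the GPU. Returns None if nothing fits.
    """
    usable = vram_gb - headroom_gb
    best = None
    for name, needed in APPROX_VRAM_GB.items():
        if needed <= usable:
            best = name  # later entries are larger models
    return best


if __name__ == "__main__":
    print(largest_model_that_fits(8))   # card assumed to have 8 GB
    print(largest_model_that_fits(24))  # RTX 3090 with 24 GB
```

On these assumptions an 8 GB card tops out at the medium model, while a 24 GB card has room for the large one.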

The program does not like co-channel interference or selective fading (on AM signals).

The program runs in an Anaconda environment and uses PyTorch (an AI framework).
At the moment my Python program is very simple and, I am sure, can be improved.
On receive, I tried a number of WebSDRs. I have used it to listen to QO-100
narrowband, where it translated many different languages. I also used it to translate
traffic from Russian AM stations around 3.1 MHz.

