Sunday, 1 May 2022

Using LimeSDR, GNURadio and NVIDIA's Riva voice tools

 

 

This is just a short blog entry. I have been experimenting with AI speech recognition 
and GNU Radio. I apologise for the quality of the video.
 
I am using NVIDIA's Riva speech tools running on an RTX3090, fed with 
audio samples generated by GNURadio. 
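
As a rough illustration of the hand-off between the two tools: GNU Radio audio streams are normally floats in the range -1.0 to 1.0, while speech recognisers are commonly fed 16-bit linear PCM. The sketch below shows that conversion using only the Python standard library. The helper name `floats_to_pcm16` is hypothetical and this is not the code used in the video; it also assumes the flowgraph has already resampled the audio to whatever rate the recogniser expects.

```python
import struct


def floats_to_pcm16(samples):
    """Convert floats in [-1.0, 1.0] to little-endian signed 16-bit PCM bytes.

    Hypothetical helper for illustration; values outside the range are clipped.
    """
    # Clip first so an over-driven demodulator cannot overflow the int16 range.
    clipped = [max(-1.0, min(1.0, s)) for s in samples]
    # Scale to the symmetric int16 range and round to the nearest integer.
    ints = [int(round(s * 32767)) for s in clipped]
    # "<" = little-endian, "h" = signed 16-bit integer, one per sample.
    return struct.pack("<%dh" % len(ints), *ints)
```

Each input sample becomes two bytes, so a block of N float samples yields 2N bytes of PCM.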
 
The example above is listening to BBC Radio 4, but I have also tried it with NBFM 
on the 2 m band. 
 
The duplication in the output window is down to the Python code not wrapping at 
the end of the line correctly and is not an issue with Riva. 
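
One plausible mechanism, offered as an assumption since the client code is not shown here: if interim transcripts are redrawn in place with a carriage return, any transcript longer than the terminal width spills onto a new row, and each redraw then leaves the earlier rows behind as apparent duplicates. A minimal sketch of one workaround is to show only the tail of the transcript that fits on a single row. The function name `fit_to_line` is hypothetical.

```python
import shutil


def fit_to_line(text, width=None):
    """Return a string that redraws `text` on one terminal row.

    Hypothetical helper: keeps only the most recent characters that fit,
    and pads with spaces so a shorter update erases a longer previous one.
    """
    if width is None:
        # Falls back to 80 columns when the size cannot be determined.
        width = shutil.get_terminal_size().columns
    # Leave the last column free so the cursor never triggers a wrap.
    usable = max(1, width - 1)
    tail = text[-usable:]
    # "\r" moves the cursor to the start of the current row before redrawing.
    return "\r" + tail.ljust(usable)
```

It would be used as `print(fit_to_line(transcript), end="", flush=True)` for each interim result, with a plain `print()` once the recogniser marks a result as final.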
 
To use Riva, you have to register with NVIDIA before you can download the suite. 
 
I tried using it on an RTX2080 Super card, but it gave loads of out-of-memory errors, 
so the above example is running on an RTX3090 with 24 GB of memory. 
 
In the above setup, I am running GNURadio on one computer talking to another 
computer running the Riva server via Wi-Fi. I am only doing this because my 
big machine is nowhere near my antennas.  
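
To give a feel for why Wi-Fi is adequate for this kind of link, here is a small sketch of the arithmetic, together with a generator that splits PCM audio into fixed-duration chunks ready for streaming. The 16 kHz mono 16-bit format and the 100 ms chunk length are assumptions chosen for illustration, not values taken from this setup, and both function names are hypothetical.

```python
def pcm_bitrate(sample_rate_hz, bytes_per_sample=2, channels=1):
    """Raw bit rate of uncompressed PCM audio in bits per second."""
    return sample_rate_hz * bytes_per_sample * channels * 8


def chunk_pcm(pcm_bytes, sample_rate_hz, chunk_ms=100, bytes_per_sample=2):
    """Yield successive chunks of mono PCM, each covering `chunk_ms` of audio.

    The final chunk may be shorter if the input is not a whole multiple.
    """
    samples_per_chunk = sample_rate_hz * chunk_ms // 1000
    chunk_size = samples_per_chunk * bytes_per_sample
    for start in range(0, len(pcm_bytes), chunk_size):
        yield pcm_bytes[start:start + chunk_size]
```

Under these assumptions `pcm_bitrate(16000)` comes to 256000 bits per second, a small fraction of what a typical Wi-Fi link carries, and each 100 ms chunk is 3200 bytes.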

I have not tried SSB on HF yet; that will be the next challenge. Riva also supports 
text to speech.

Voice recognition has come a long way since I last played with it 20 or more 
years ago. Back then I was using Dragon NaturallySpeaking, which I used to 
control an HF radio running my PC-ALE software.

https://developer.nvidia.com/riva

4 comments:

  1. Very good. I had a look at RIVA after I saw your Twitter post. It seems quite good but takes some programming to use.

  2. Just a small cut and paste of the provided example files. You do need to create a shared directory to store the new Python files in, though, as the client code runs in a Docker container. I can think of many other applications it can be used for, maybe a chatbot for DMR/DStar/M17?

  3. Thank you for sharing this! Have you considered other voice to text applications that do not require as much processing power? For instance Google and Microsoft both have real time dictate features available. I'm just wondering why so much processing power (RTX3090/24GB) is needed for transcribing in your application.

    Replies
    1. It is simply the size of the model being used, not the processing power. It will also run on the Jetson Xavier NX, Xavier and Orin embedded platforms. The model it uses is QuartzNet, which is based on Jasper. I believe that both Google's and Microsoft's dictate features use cloud computing, whereas I am using edge computing (no internet connection required). For enterprise applications you can run Riva in the cloud if you want.
