This is just a short blog entry. I have been experimenting with AI speech decoding
and GNU Radio. I apologise for the quality of the video.
I am using NVIDIA's Riva speech tools running on an RTX 3090, fed with
audio samples generated by GNU Radio.
The example in the video is listening to BBC Radio 4, but I have also tried it with NBFM
on the 2 m band.
The duplication in the output window is down to the Python code not wrapping at
the end of the line correctly and is not an issue with Riva.
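One way to avoid this kind of duplication is sketched below. It is an assumption about the cause, not the code used in the video: if partial transcripts are redrawn in place with a carriage return, any line longer than the terminal wraps, and the earlier rows stay on screen. Keeping only the tail that fits on one row sidesteps that. The helper names `fit_to_width` and `show_interim` are made up for illustration.

```python
import shutil
import sys


def fit_to_width(text, width):
    """Return the tail of text that fits on a single terminal row.

    One column is left free so the cursor never triggers a wrap.
    """
    usable = max(width - 1, 1)
    if len(text) <= usable:
        # Pad with spaces so a shorter update erases the previous one.
        return text.ljust(usable)
    if usable > 3:
        # Keep the most recent characters and mark the cut with dots.
        return "..." + text[-(usable - 3):]
    return text[-usable:]


def show_interim(text, stream=sys.stdout):
    """Redraw the current (partial) transcript on one row."""
    width = shutil.get_terminal_size(fallback=(80, 24)).columns
    stream.write("\r" + fit_to_width(text, width))
    stream.flush()
```

Calling `show_interim()` for every partial result and printing a newline only when a result is final should keep each utterance on its own line.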
To use Riva, you have to register with NVIDIA before you can download the suite.
I tried using it on an RTX 2080 Super card, but it gave loads of out-of-memory errors,
so the example in the video is running on an RTX 3090 with 24 GB of memory.
In this setup, I am running GNU Radio on one computer, talking over Wi-Fi to
another computer running the Riva server. I am only doing this because my
big machine is nowhere near my antennas.
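A minimal sketch of preparing the audio for that hand-off, under stated assumptions: the samples leave GNU Radio as floats in the range -1.0 to 1.0, and the recogniser wants 16-bit little-endian mono PCM at 16 kHz, delivered in fixed-size chunks. The function names, sample rate and chunk length are illustrative and not taken from the setup described here.

```python
import struct


def floats_to_pcm16(samples):
    """Convert floats in [-1.0, 1.0] to 16-bit little-endian PCM bytes."""
    ints = []
    for s in samples:
        s = max(-1.0, min(1.0, s))  # clip so loud audio cannot overflow
        ints.append(int(round(s * 32767)))
    return struct.pack("<%dh" % len(ints), *ints)


def chunk_pcm(pcm, sample_rate=16000, chunk_ms=100):
    """Yield fixed-duration chunks of 16-bit mono PCM; the last may be short."""
    chunk_bytes = sample_rate * chunk_ms // 1000 * 2  # 2 bytes per sample
    for start in range(0, len(pcm), chunk_bytes):
        yield pcm[start:start + chunk_bytes]
```

Small chunks (around 100 ms here) keep latency low on a Wi-Fi link, at the cost of more packets per second.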
I have not tried SSB on HF yet; that will be the next challenge. Riva also supports
text to speech.
Voice recognition has come a long way since I last played with it 20 or more
years ago. Back then I was using Dragon NaturallySpeaking, which I used to
control an HF radio running my PC-ALE software.
Very good. I had a look at RIVA after I saw your Twitter post. It seems quite good but takes some programming to use.
Just a small cut and paste of the provided example files. You do need to create a shared directory to store the new Python files in, though, as the client code runs in a Docker container. I can think of many other applications it can be used for, maybe a chatbot for DMR/DStar/M17?
Thank you for sharing this! Have you considered other voice-to-text applications that do not require as much processing power? For instance, Google and Microsoft both have real-time dictation features available. I'm just wondering why so much processing power (RTX 3090, 24 GB) is needed for transcribing in your application.
It is simply the size of the model being used that needs the memory, not the processing power. It will also run on the Jetson Xavier NX, Xavier and Orin embedded platforms. The model it uses is QuartzNet, which is based on Jasper. I believe that both Google's and Microsoft's dictation features use cloud computing, whereas I am using edge computing (no internet connection required). For enterprise applications you can run Riva in the cloud if you want.
Great read, thanks.