Sunday, 20 July 2014

Parallel Processing CUDA OpenGL UHD USRP2

For a bit of light relief the last few days I have been immersed in
CUDA and OpenGL programming. My initial goal is to use the
USRP2 I have to digitise a large piece of spectrum and display
it inside the window of a Qt5 application. I will be using CUDA to
do the parallel processing and OpenGL to display the results.

So far I have managed to create an OpenGL widget that displays a
window in a Qt application, grab samples using UHD, process
them using the CUDA library and write some simple kernel code.
I need to learn a bit more about using OpenGL before I go any further
as I want to display the results as a waterfall and getting CUDA to
talk to OpenGL via the GLWidget does not look too easy to do.
Getting CUDA to share buffers with OpenGL is not difficult but adding
the extra complexity of the GLWidget means I start to stray off the
beaten path.
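
To make the waterfall idea concrete, here is a minimal CPU-only sketch in plain Python. It is not the CUDA/OpenGL code described above: the naive DFT stands in for whatever FFT the GPU would run, and the names `power_spectrum_db` and `Waterfall` are made up for illustration. Each block of complex samples becomes one row of dB values, and the newest row pushes the oldest off the bottom.

```python
import cmath
import math
from collections import deque


def power_spectrum_db(samples):
    """Naive DFT of one block of complex samples, in dB, with DC in the centre."""
    n = len(samples)
    bins = []
    for k in range(n):
        acc = sum(s * cmath.exp(-2j * math.pi * k * i / n)
                  for i, s in enumerate(samples))
        power = abs(acc / n) ** 2
        bins.append(10.0 * math.log10(power + 1e-12))  # small floor avoids log(0)
    half = n // 2
    return bins[half:] + bins[:half]  # rotate so negative frequencies come first


class Waterfall:
    """Fixed-height history of spectrum rows, newest row first."""

    def __init__(self, height):
        # A full deque silently drops the oldest row from the other end.
        self.rows = deque(maxlen=height)

    def push(self, samples):
        self.rows.appendleft(power_spectrum_db(samples))
```

In a real display each row would become one line of a texture, with the dB values mapped to colours.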

The biggest problem has been installing the CUDA toolkit and more
especially the correct NVIDIA driver. The one that Ubuntu wants me
to install is not the right one. It has to be the latest one on the NVIDIA
website for CUDA 6.0 to work. I am using a GTX680 card for the GPUs.
I have also been looking at OpenCL but as I am using an NVIDIA card
I thought it better to use CUDA for the moment.

I bought the GTX680 a few years ago and for the same price I could get
something much more powerful now.

I notice it is not going to be long before Ubuntu 12.04 LTS is no longer
supported at which point I will have to consider upgrading. I have heard
upgrades never go smoothly so I am not looking forward to it.

I will post some more about Odroid in the next epistle.  

Friday, 11 July 2014

FFTs and optimisation

Changing the FFT from FFTW to av_fft made little or no difference
to the CPU load when running DVB-T. However, optimising the
RS encoder has taken a few percent off the CPU load.

I am using the Valgrind tools to profile the program. The highest CPU load
varies between the iFFT, the RS encoder and the interleaver. Currently
the interleaver is the biggest CPU drain and I can't find any obvious way
to optimise it.

I have been able to get 2 MHz wide DVB-T to work on the Odroid U3+
but I had to reduce the oversampling which has caused a small amount
of aliasing to appear in the output spectrum. Work continues.

Tuesday, 8 July 2014

Odroid U3+

Not a very good picture but what you are seeing is an Odroid U3+ in its
case sitting on a tangle of wires which is my desktop. The Odroid is made
by Hardkernel. It is a quadcore ARM device as used in Galaxy smartphones.
It is a lot faster than other ARM devices I have been experimenting with and
is well supported with an in-house magazine and hardware accelerators for the

Currently with DATV-Express there are 2 main development strands. Firstly
I am adding support on the PC platform for analogue video capture and encoding
in software using libavcodec.

On the ARM platform I am trying to get the code fast enough to handle low
bandwidth DVB-T. I have moved from using fftw to do the iFFT to
av_fft which can be found in libavcodec. The new iFFT uses single precision
maths and on the ARM platform has special modules that use the NEON SIMD
instructions available on later ARM devices. The combination of these two
features means the code should run a lot faster.

Thanks to a suggestion by Ron W6RZ (who has ported my DVB-S2
implementation to GNURadio) I have managed to optimize part of the S2 code
to knock around 5% off the CPU load. I am sure further optimizations can be found.

Finally, I have been reading a book on OpenCL called "Heterogeneous Computing
with OpenCL". One of the chapters gives an example of using it to do real-time
graphics processing and it has sent my mind racing as to the possibilities.

OpenCL, for those that don't know, is a framework based on the C language that
allows you to harness both CPUs and GPUs (graphics cards) in a parallel
processing environment. So far I have installed the Intel OpenCL SDK and
compiled and run a few pieces of example code. I have an NVIDIA card on my
Linux machine and the NVIDIA drivers now include the bits needed to work
with OpenCL.

What I am thinking of doing is taking the video capture code I have already
written for DATV-Express and running it on the CPUs to capture up to 4 video
channels. The video frames would then be passed to the GPUs for manipulation and
back to the CPUs for MPEG compression using libavcodec. The resultant
transport stream would then be sent to DATV-Express. I can then get rid of the
old analogue video mixer / effects unit I have. The video mixer cost me
about the same amount as a dedicated PC would.

Till next time!

Wednesday, 11 June 2014

DATV-Express Yahoo Group

There is now a DATV-Express Yahoo group open to all.
I hope it will be a place that people can share their
experiences of DATV-Express both good and bad.

We would also like to hear from people that are experimenting
with 3rd party software such as FFMPEG, OpenCaster or
GnuRadio, also those using Linux to receive Amateur DTV.

In the next release of the DATV-Express software 2.03
we will have implemented a UDP socket interface.
This will allow people to send a UDP Transport Stream
either via the loopback address or via Ethernet to / from

I hope you will find this facility useful and will do great things
with it.
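
As a rough idea of what feeding such a socket could look like, here is a minimal Python sketch. It is not the DATV-Express interface itself: the port number, the host and the function names are placeholders I have made up, and packing 7 transport stream packets into each datagram is only the common convention for TS over UDP, not something the 2.03 release is known to require.

```python
import socket

TS_PACKET_SIZE = 188          # bytes in one MPEG transport stream packet
PACKETS_PER_DATAGRAM = 7      # 7 * 188 = 1316 bytes, fits a 1500 byte Ethernet MTU


def null_packet():
    """Build one TS null packet (PID 0x1FFF), handy as test filler."""
    # Sync byte 0x47, PID 0x1FFF, payload only, continuity counter 0.
    header = bytes([0x47, 0x1F, 0xFF, 0x10])
    return header + bytes([0xFF]) * (TS_PACKET_SIZE - len(header))


def datagrams(ts_bytes, packets_per_datagram=PACKETS_PER_DATAGRAM):
    """Split a buffer of whole TS packets into UDP payloads."""
    if len(ts_bytes) % TS_PACKET_SIZE:
        raise ValueError("buffer is not a whole number of TS packets")
    step = TS_PACKET_SIZE * packets_per_datagram
    return [ts_bytes[i:i + step] for i in range(0, len(ts_bytes), step)]


def send_ts(ts_bytes, host="127.0.0.1", port=1314):
    """Send the buffer to host:port. Both defaults are placeholders."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        for payload in datagrams(ts_bytes):
            sock.sendto(payload, (host, port))
```

Calling `send_ts(null_packet() * 14)` would send two datagrams of null packets to the loopback address. Pointing `host` at another machine sends the same stream over Ethernet.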

Saturday, 17 May 2014

Solar power PI based Broadband-Hamnet node

The title pretty much says it all. The Ralink RT5370 WiFi dongle is at the
end of a 10 m powered USB2 cable. Attached to it is a high gain omni
(10 dBi, a figure I don't believe). There is about a 1 V drop along the cable, so
I am feeding 8 V up the modified USB2 cable. At the far end a 5 V regulator
provides the correct voltage to power both the dongle and the 4 port
FE1.1s USB2 hub chip, which is in the end of the 'active' cable.

This is a bit of fun as I am not expecting to be able to MESH with anyone
other than with my Linksys WRT54GL but .....

Friday, 25 April 2014

DVB-ASI revisited

I have had to admit defeat on my attempts to decode a DVB-ASI
signal using an FPGA. It was simply asking too much of the FPGA.

So I have decided to follow the suggestion in one of the comments
I received on this blog.

I have bought a DVB-ASI PCI card for $28 on eBay. It uses CY7B933
chips, which have a synchronous output that appears compatible with
one of the modes of the Cypress USB chip we use on Express. That means
it should be a fairly easy job to remove the PCI interface chip and substitute
the FX2 FIFO one instead. I will post more info when I get the 4 channel card.
In anticipation I bought a small quantity of FX2 boards from China
(the 56 pin ones).

I was thinking of making a small PCB for the project but as this will be a one
off project a nest of prototyping wire and my hot glue gun will be enough.
After all that is what boxes are for.

Monday, 21 April 2014

MK802IV and Picuntu 4.5

Just a quick picture of DATV-Express running on a Rikomagic MK802IV quadcore
ARM device. The picture is of the device running Picuntu 4.5 and sending DVB-S at
12 MSymbols/sec with an FEC of 7/8 (which equates to 16 Mbits/s video).
I left it running for over an hour to check for stability. The DVB-S encoding was
being done in the FPGA so all the MK802IV was doing was reading the
program stream from the capture device, generating a valid transport stream and
sending the stream to the DATV-Express board.
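
For anyone wondering how much the carrier itself can hold, here is a small worked calculation in Python. It assumes standard DVB-S framing, that is QPSK at 2 bits per symbol, the inner code at the stated FEC rate and the outer RS(204,188) code. The function name is my own. The 16 Mbit/s video figure above is not derived here; the sketch only gives the total transport stream rate that video, audio, tables and null packets have to share.

```python
from fractions import Fraction


def dvbs_ts_bitrate(symbol_rate, fec, bits_per_symbol=2):
    """Useful transport stream bit rate of a DVB-S carrier in bit/s.

    QPSK carries 2 bits per symbol, the inner convolutional code keeps
    `fec` of them, and RS(204,188) keeps 188 of every 204 bytes.
    """
    return symbol_rate * bits_per_symbol * Fraction(fec) * Fraction(188, 204)


rate = dvbs_ts_bitrate(12_000_000, Fraction(7, 8))
print(f"{float(rate) / 1e6:.2f} Mbit/s")  # prints "19.35 Mbit/s"
```

So at 12 MSymbols/sec and FEC 7/8 the transport stream can carry roughly 19.35 Mbit/s in total.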