Thursday, 8 September 2016

S2 Developments

I have managed to get a little further with the S2 (maybe S2X) decoder.
I now have a decoder for the bi-orthogonal code that S2 uses to
encode the Mode and Code field (MODCOD). Some Googling showed
that the way to do this is to multiply the received codeword by a
Hadamard matrix. You use the index of the maximum entry in the result
and its sign to give you the received MODCOD value.
The original S2 protocol used a (32,6) code and this works fine with the
Hadamard approach. S2X has extended this to (32,7) and it is slightly more
tricky to decode. What I have done is to decode it as normal, save the maximum
value, subtract the extra line in the S2X generator matrix from the received
codeword, then re-run the matrix multiply. That way I can tell if the extra
bit in the S2X code is a 1 or a 0 by selecting which of the two matrix multiplies
results in the larger maximum value. If the bit is a 1 it means the waveform
conforms to S2X and if it is a 0 then it is S2, so the receiver can handle
both standards. There is another bit that is differentially encoded in the
codeword as well, which has to be detected and removed before the
bi-orthogonal decode can occur, but that is fairly trivial to do.
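
For anyone wanting to experiment with the idea, here is a minimal C++ sketch of
the two-pass Hadamard decode described above. It is not my receiver code. The bit
layout (bits 0-4 select the Walsh row, bit 5 is the sign, bit 6 is the extra bit)
and the function names are assumptions made for illustration, and extra_row() is
a made-up placeholder, not the real S2X generator row. The differentially encoded
bit is not handled here.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstdint>

constexpr int N = 32;
using Soft = std::array<float, N>;   // +1.0 = bit 0, -1.0 = bit 1 (plus noise)

// Parity of a 5-bit value.
inline int parity5(unsigned x) { x ^= x >> 4; x ^= x >> 2; x ^= x >> 1; return x & 1; }

// HYPOTHETICAL seventh generator row (NOT the real S2X row): j0 AND j1.
inline int extra_row(int j) { return (j & 1) & ((j >> 1) & 1); }

// Toy encoder used for testing; the bit layout is an assumption.
Soft encode32_7(uint8_t bits) {
    assert(bits < 128);
    Soft cw{};
    for (int j = 0; j < N; ++j) {
        int b = parity5((bits & 31u) & unsigned(j)) ^ ((bits >> 5) & 1)
              ^ (((bits >> 6) & 1) & extra_row(j));
        cw[j] = b ? -1.0f : 1.0f;
    }
    return cw;
}

// In-place fast Walsh-Hadamard transform, equivalent to multiplying
// the received vector by a 32x32 Hadamard matrix.
void fwht(Soft& v) {
    for (int h = 1; h < N; h <<= 1)
        for (int i = 0; i < N; i += 2 * h)
            for (int j = i; j < i + h; ++j) {
                float a = v[j], b = v[j + h];
                v[j] = a + b;
                v[j + h] = a - b;
            }
}

struct Decoded { uint8_t bits; float metric; };

// (32,6) decode: index of the largest magnitude gives 5 bits, its sign the 6th.
Decoded decode32_6(Soft soft) {
    fwht(soft);
    int best = 0;
    for (int k = 1; k < N; ++k)
        if (std::fabs(soft[k]) > std::fabs(soft[best])) best = k;
    int sign_bit = soft[best] < 0.0f ? 1 : 0;
    return { static_cast<uint8_t>(best | (sign_bit << 5)), std::fabs(soft[best]) };
}

// (32,7) decode: try both values of the extra bit and keep the larger maximum.
Decoded decode32_7(const Soft& soft) {
    Decoded d0 = decode32_6(soft);
    Soft stripped = soft;
    for (int j = 0; j < N; ++j)
        if (extra_row(j)) stripped[j] = -stripped[j];   // remove the extra row
    Decoded d1 = decode32_6(stripped);
    if (d1.metric > d0.metric) {
        d1.bits = static_cast<uint8_t>(d1.bits | 0x40);
        return d1;
    }
    return d0;
}
```

With soft values, "subtracting" the extra row just means flipping the sign of the
samples where that row has a 1. Calling decode32_7(encode32_7(value)) with a
couple of samples negated is a quick way to check the error correction.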

I have tested it using a modified version of my S2 transmitter C++ class
so I know it is working (and correcting errors).

I noticed on Twitter yesterday that there is now a proposal to launch a
lunar orbiting Cubesat in 2018 (the Heimdallr Lunar Cubesat) and that they are
proposing that an S2X 1/5 code be used as part of the experiment.
The proposal is for a 25 kbps data rate. If this makes it off the launchpad
then it will be a very interesting thing to listen for and I should have the
software working long before then.

I hope to make my S2 decoder scalable so that it can be used on everything
from a TK1 single board computer up to a high end graphics card. Having a
portable TK1 based solution would be ideal but if that fails then there will be
one of my many GPU based PCs to fall back on. The more GPU power
available the closer to the theoretical performance I can achieve so there
is room for a trade off.

I love challenging (for me) projects!


Saturday, 27 August 2016

Small steps in DVB-S2 decoder

Firstly I finally found a suitable enclosure for the Jetson TK1 I bought a while
ago as you can see in the photo above. The red box is a Hammond enclosure.
I mounted the Jetson on a piece of 1.6 mm Aluminium sheet and below that I
added an SSD.

The Jetson has a 192 Core GPU as well as USB3 and Gigabit Ethernet so may
well make a nice basis for a portable computer to use with one of the LimeSDRs
I have on order. Last time I played with it I had difficulty in getting the expected
USB3 performance out of it. By default it sets the USB3 port to USB2 only (daft).

I have done some work on the DVB-S2 receiver software since I last posted.
I managed to write a BCH hard decision decoder. So far I have only tested it with
one of the S2 Normal frame formats but it should work on all the various formats.
The Short frames use slightly different generator polynomials so some extra work
will have to be done to support that. I based it on the RS decoder I wrote a while ago.

I have also moved the syndrome calculation code to run on the CUDA device, which
in my case is a GTX 980 Ti.

The reason for doing that is the syndrome calculation has to run on every received
frame to check to see if there are any errors. If no errors are detected further
processing is not required. The Host communicates with the Device (GPU)
via PCI Express and that is very time consuming, so it is best to make as few memory
transfers between the GPU and host as possible. The syndromes are quite small
compared to the data frame so the overhead is not very great.
The BCH decoder is implemented as a predictor so is not easy to implement using
parallelism, well at least not for me. So it makes more sense to do it on the host.
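
To illustrate the syndrome check, here is a small C++ sketch using a toy
BCH(15,7) code over GF(2^4). This is only a host-side illustration, not my CUDA
code: the DVB-S2 codes work in much larger fields, and the names below are made
up for the example. The principle is the same though. Each syndrome is the
received polynomial evaluated at a power of alpha, and if they are all zero no
further decoding is needed.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

constexpr int kN = 15;   // codeword length of the toy BCH(15,7) code
constexpr int kT = 2;    // number of correctable errors

// Powers of alpha in GF(2^4), built from the primitive polynomial x^4 + x + 1.
inline std::array<uint8_t, kN> make_exp_table() {
    std::array<uint8_t, kN> t{};
    unsigned x = 1;
    for (int i = 0; i < kN; ++i) {
        t[i] = static_cast<uint8_t>(x);
        x <<= 1;
        if (x & 0x10) x ^= 0x13;   // reduce modulo x^4 + x + 1
    }
    return t;
}

// S_i = r(alpha^i) for i = 1..2t, where bit j of r is the coefficient of x^j.
std::array<uint8_t, 2 * kT> syndromes(uint16_t r) {
    assert(r < (1u << kN));
    static const std::array<uint8_t, kN> exp_table = make_exp_table();
    std::array<uint8_t, 2 * kT> s{};
    for (int i = 1; i <= 2 * kT; ++i)
        for (int j = 0; j < kN; ++j)
            if ((r >> j) & 1) s[i - 1] ^= exp_table[(i * j) % kN];
    return s;
}

// True when every syndrome is zero, i.e. no detectable errors in the frame.
bool error_free(uint16_t r) {
    for (uint8_t s : syndromes(r))
        if (s != 0) return false;
    return true;
}
```

The generator polynomial x^8 + x^7 + x^6 + x^4 + 1 (0x01D1) is itself a valid
codeword, so error_free(0x01D1) is true, and flipping any single bit makes it
false. Each syndrome is independent of the others, which is why this step suits
a GPU far better than the decoder that follows it.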

One thing I learned today is that if the GPU code takes more than a couple of
seconds to run then Windows times out and resets the graphics card. This can be
avoided by either changing a registry entry or by breaking the code up into smaller
tasks. There is a lot for me to learn. The different types of memory on the GPU
and how you use them is also something that needs careful attention when
writing code.

I am not sure how much time I am going to be able to spend on this little project in
the next few weeks as I have been asked to create a video to go with my CAT16
talk at the end of September and that is going to be time consuming no doubt.

That is it for now.......

Sunday, 14 August 2016

DATV-Express on Windows update

I have been keeping busy even though I have not posted for a couple of months.
After the AMSAT-UK conference it became obvious that DVB-S2 would take
a starring role in the new Es'hailsat 2 Satellite (which apparently is named after the
Arabic name for the star we know as Canopus). The satellite is now scheduled for
a Q3 2017 Launch.

To this end I have ported my Linux DVB-S2 implementation over to Windows.
The actual implementation is a C++ class and runs almost entirely on the host.
This means it uses more CPU than the DVB-S implementation does. I am still
ironing out bugs, by which I mean that things which work fine for me are not
quite as successful when given to others to test.

While we have receiver support for normal symbol rates it may be a while before
we have RBTV (Reduced Bandwidth TV) support in place. Still we have a year
to get this all up and running.

I have also started a new challenge and that is the development of a GPU based
DVB-S2 receiver. I have done some similar work in the past but not at these
high symbol rates. DVB-S2 was specifically designed to take advantage of
parallel processing. I have been looking for an excuse to use the 2816 cores on
my GTX 980 Ti graphics card. Originally I had planned on just implementing
the LDPC decoder on the GPU using the "Belief Propagation" algorithm but
bit by bit it is looking like most of the modem will in fact be on the GPU as
it is a better fit than on the CPU.

After having struggled with Graphics Drivers when installing the NVIDIA
CUDA 7.5 toolkit on my Linux machine it came as a relief when I found how
simple the Windows Toolkit was to install. My only mistake was not installing
Visual Studio 2013 before the toolkit. The current Toolkit does not support
the 2015 edition I was using for development.

The GPU modem is more an academic exercise than something that would be
practical for most people.

I am currently looking at ways to estimate the noise variance on the
communications channel as this is required to calculate the Log Likelihood
Ratios (LLRs) which are required for the BP algorithm.
This is really pushing my maths skills. Wish me luck!
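
As a rough illustration of how the two quantities fit together, here is a C++
sketch of one textbook approach. It is not necessarily what will end up in the
receiver. It assumes known (pilot or header) symbols are available for a
data-aided variance estimate, an AWGN channel, and QPSK with bit 0 mapped to +a
on each axis. The function names are made up for the example.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using cf = std::complex<float>;

// Data-aided estimate of the noise variance per dimension (I or Q):
// the mean squared error against the known symbols, halved.
float noise_variance(const std::vector<cf>& rx, const std::vector<cf>& known) {
    assert(rx.size() == known.size() && !rx.empty());
    float acc = 0.0f;
    for (std::size_t i = 0; i < rx.size(); ++i)
        acc += std::norm(rx[i] - known[i]);   // |rx - known|^2
    return acc / (2.0f * static_cast<float>(rx.size()));
}

struct Llr2 { float i, q; };

// QPSK LLRs for an AWGN channel: LLR = 2 * a * y / sigma^2 on each axis,
// where a is the per-axis symbol amplitude. Positive means bit 0 is more likely.
Llr2 qpsk_llr(cf y, float a, float sigma2) {
    assert(sigma2 > 0.0f);
    return { 2.0f * a * y.real() / sigma2, 2.0f * a * y.imag() / sigma2 };
}
```

The LLR expression comes from taking the log of the ratio of two Gaussian
densities centred on +a and -a. An underestimated variance makes the decoder
over-confident, which is why the estimate matters for BP.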

Saturday, 11 June 2016

EVM measurements of DATV-Express

Above is the EVM measurement taken of DATV-Express using an Agilent E4406A.
It is of an unmodified board with no IQ offset applied. Some of the error is due to
a symbol rate mismatch between Express and the test gear. Express derives its
symbol rate from a cheap xtal oscillator (the one used for the USB2 clock).
When generating the same waveform using an Agilent E4437B using a common
reference the EVM is only a couple of percent better.

The tests were done using the W-CDMA option that allows QPSK EVM
measurements to be taken provided the symbol rate is around
4 MSymbols per second.

The maths

EVM (%) = √(P_error / P_ref) × 100

Where P_error is the power of the error vector and P_ref is the power of the ideal reference.
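
The same formula in C++, as a minimal sketch. It assumes you already have the
measured symbols and the ideal constellation points they should have landed on.
The analyser works this out internally, so this is only to show the arithmetic,
and the function name is made up.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using cf = std::complex<float>;

// EVM (%) = sqrt(P_error / P_ref) * 100, with both powers summed over
// the same set of symbols.
float evm_percent(const std::vector<cf>& measured, const std::vector<cf>& ideal) {
    assert(measured.size() == ideal.size() && !ideal.empty());
    float p_error = 0.0f, p_ref = 0.0f;
    for (std::size_t i = 0; i < ideal.size(); ++i) {
        p_error += std::norm(measured[i] - ideal[i]);   // error vector power
        p_ref += std::norm(ideal[i]);                   // reference power
    }
    assert(p_ref > 0.0f);
    return 100.0f * std::sqrt(p_error / p_ref);
}
```

For example, unit-amplitude symbols that are each off by 0.1 give an EVM of 10%.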

Tuesday, 7 June 2016

Couple of mods that I have done to DATV-Express

External reference added

C91 change
The first was to modify one of my boards to accept an external frequency reference
in this case 10 MHz. The reference must be greater than 0.3 V pk-pk and less
than 4 V pk-pk.

The second change I made was to change C91 in the PLL filter from 4n7 to 47n.
As you can see from the phase noise plot this reduces the noise at 100 kHz by
around 17 dB. I very much doubt this makes much difference to the DVB-S
performance but as I intend using the Express board for SSB as well, the
2 changes I have made should make a difference.

I am currently waiting for my SRP (Special Research Permit) for 71 MHz to
arrive, more on that when it turns up.

I am also waiting on some eBay 'bargains' to arrive as well. These include some
tunable 50 - 80 MHz BPFs from Poland that I will put on the VNA as soon as
they turn up.

Finally I have added an E4437B 4 GHz signal generator to my rack of test gear.
It is the unit at the top of the pile. I offered the dealer around half what he was
asking and I was surprised when he accepted the offer. The guy on the
Signal Path Blog recently tore down and repaired one of these units, so I had
an idea of the risk before I bought.

I was getting a bit frustrated with my old Marconi 1 GHz signal generator as
most of my activities are above 1 GHz now. The generator has lots of neat
features including the ability to do multi tone testing of amplifiers and of
playing back test waveforms downloaded over GPIB.

Other projects in the pipeline include a 70 MHz PA re-vamp and a universal
receive converter for DATV (it uses eBay surplus so isn't reproducible at
low cost).

That is all for now.

Wednesday, 4 May 2016

SDR part 2

Work on the SDR continues but it is proving to be quite complicated to
get to work. I am currently working on the ADC interface to the software.

Originally I tried to implement it with plain logic but that did not prove fast
enough. I am now doing the job properly using ISERDESE2 and IDELAYE2 blocks.

The ISERDESE2 block does the serial to parallel conversion but
it is necessary to use IDELAYE2 blocks to delay the sample clocks so they
are centred at the eye of the symbol. The reason for this is the time uncertainty due
to delays caused by buffering and routing of the high speed data clock.

The software has to go through a training sequence where it moves the sample
point to find the bit transition points. This is controlled by a state machine.
An ISERDESE2 block is used in the clock sync subsystem where it
correlates the clock against itself to find the optimum sample point. More
ISERDESEs are used for the actual de-serialisation of the data lanes.
There are 2 data lanes per channel and four channels.
The channels operate at 500 MHz (4 x 125 MHz). Data is read on both
the rising and the falling edges of the data clock.
Because of the delay added by the IDELAYE2 to the data clock it is also
necessary to use bit slipping to regain frame alignment of the bitstream.
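
The training logic can be modelled in software before committing it to a state
machine. The C++ sketch below is a hedged model of the two steps, not the actual
FPGA code. It assumes a sweep over the delay taps has already recorded which
taps gave a stable sample, and it treats a bit slip as a simple one-bit rotation
of an 8-bit word. The real ISERDESE2 Bitslip shift order depends on the mode, so
check the Xilinx documentation. The function names are made up.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Given one flag per delay tap (true = the sample was stable at that tap),
// return the tap in the middle of the longest stable run, or -1 if none.
// The edges of the run are the bit transition points; the middle is the eye.
int centre_of_eye(const std::vector<bool>& stable) {
    int best_start = -1, best_len = 0, run_start = 0, run_len = 0;
    for (int t = 0; t < static_cast<int>(stable.size()); ++t) {
        if (stable[t]) {
            if (run_len == 0) run_start = t;
            ++run_len;
            if (run_len > best_len) { best_len = run_len; best_start = run_start; }
        } else {
            run_len = 0;
        }
    }
    return best_len > 0 ? best_start + best_len / 2 : -1;
}

// Count how many one-bit slips (modelled as left rotations) are needed before
// the deserialised word matches the expected training pattern; -1 if never.
int bitslips_needed(uint8_t word, uint8_t expected) {
    for (int n = 0; n < 8; ++n) {
        if (word == expected) return n;
        word = static_cast<uint8_t>((word << 1) | (word >> 7));
    }
    return -1;
}
```

Picking the middle of the widest stable window gives the most margin against
jitter on either side, and the slip count only needs to be found once after the
delay has been set.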

The good news is that according to the Xilinx data sheet what I am trying to do
is well within the abilities of my Zynq, it is just going to take a bit longer than
I had hoped.

On the bright side this exercise is forcing me to finally learn how to write code
for Zynq based systems as I have had to use so many aspects of the device.

I have FreeRTOS working as the embedded OS, DMA from PL to PS working,
SPI comms to the ADC working and of course TCP/IP over Ethernet using
lwIP for comms working.

I have included an image of the prototype system talking to SDR# using the
NetSdr protocol used by RFSpace just to prove I have been busy.

Monday, 18 April 2016

Back to my Diversity SDR project

4 Channel Diversity Hardware

Vivado Block Diagram

I finally took some time out from DATV-Express to work on my SDR.
The hardware consists of a Digilent Zedboard (Zynq 7020) an Analog Devices
interposer board (which required a slight modification) and an Analog Devices
AD9253 ADC evaluation board. The Zedboard was bought used from a
seller on eBay so I paid about half the retail price but as I didn't get the
Xilinx licence I am using the free WebPACK version.

The OS I plan to run is FreeRTOS. I think I have the FPGA code just about
finished and am now starting on the C code for the ARM part of the Zynq.
This will primarily just be taking data DMAed from the PL (FPGA) fabric
and sending it out over Ethernet. 

The board is currently running a demo program that uses the lwIP
stack which I plan to use as a framework for my own application.
I am calling it Diversity4SDR (because it has 4 channels).

The AD9253 is a 4 channel 125 MS/s 14 bit ADC. The FPGA code which uses
mainly free Xilinx AXI4 IP blocks decimates the 125 MS/s into 4 x 1 MS/s
channels so they will fit over 1 Gbit Ethernet.
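
A quick back-of-envelope check of those numbers in C++. The sample format is an
assumption on my part for the sake of the sum: each decimated channel is taken
to be complex with 16 bits each for I and Q, and protocol overhead is ignored.

```cpp
#include <cassert>

constexpr double kAdcRate = 125e6;   // ADC sample rate per channel, samples/s
constexpr double kOutRate = 1e6;     // decimated rate per channel, samples/s
constexpr int kChannels = 4;

// Overall decimation ratio needed between the ADC and the output stream.
inline double decimation(double in_rate, double out_rate) {
    assert(out_rate > 0.0);
    return in_rate / out_rate;
}

// Payload bit rate for complex (I + Q) samples, ignoring packet overhead.
inline double payload_bits_per_second(int channels, double sample_rate,
                                      int bits_per_component) {
    assert(channels > 0 && bits_per_component > 0);
    return channels * sample_rate * 2.0 * bits_per_component;
}
```

That is a decimation of 125 and about 128 Mbit/s of payload for the four
channels, comfortably inside what Gigabit Ethernet can carry.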

As SDRs are pretty simple to do I am using this as a training exercise to
get up to speed with Xilinx Vivado. Eventually I plan to use Vivado
to implement a DPD system.

What has spurred me on to get back to this is the excellent work done by
Pavel Demin. Examining what he has done has helped tremendously with the
FPGA code.

Now you know why I seldom ever get on the radio.