Designing A DSP Broadband Modem for Mesh Radio Networks

Article published in EETimes (USA), 10 September 2002

David Spreadbury, Principal Technology Consultant at Plextek Ltd

Figure: The single-board Radiant Modem, combining DSP and FPGA technologies to best advantage

Recently the engineering team at Plextek was involved in the initial design and construction of a broadband wireless access system for Radiant Networks plc (Essex, England). The system is a mesh of point-to-point microwave links in the 28GHz band. With paths of variable length and quality, each modem is designed to support four bi-directional Time-Division Multiple-Access (TDMA) channels at 100Mb/s where possible, falling back to 50Mb/s or 25Mb/s if conditions demand, using adaptive equalisation and modulation. In addition there are mesh-specific operations such as antenna steering, exploration, node discovery, and link formation and maintenance. Statistics are also gathered to support mesh management.

The system design phase explored a range of installation, operating and management scenarios, resulting in high-level operational and technical requirements. The modem described here was conceived as a system component with convenient electrical and mechanical boundaries. It had to be fully functional but flexible, and developed within time and cost constraints. The uniqueness of mesh radio required additional monitoring and control features, which suggested that the development would not be without technical risk. Both raw signal processing capacity and significant flexibility and intelligence were required, in a limited space. The tasks range from demodulating a received IF signal near 50MHz to controlling the drive to motors which steer the microwave horns! Any practical solution would have to contain hardware and software.

Plextek have considerable experience of implementing Digital Signal Processing (DSP) in both areas. For this application a high-performance fixed-point Texas Instruments TMS320C6201 DSP was selected to run C code at 1600 MIPS, and a pair of Xilinx XCV400 Field-Programmable Gate Arrays (FPGAs) was selected to complement it with about a million "system gates". The logic design would be entered in VHDL. This is a powerful route which, although perhaps lacking in circuit visualisation, allows comprehensive synthesis and placement with timing constraints, and efficient simulation.

The modem is constructed on a single circuit board of approximately A5 size. The board also carries 200Msps Analogue-to-Digital and Digital-to-Analogue Converters (ADCs and DACs) for interfacing directly to the radio at Intermediate Frequency (IF), and a pair of 100Msps DACs for debugging and diagnostics (e.g. displaying a constellation on an oscilloscope).

Lower-level functions to be performed include:

  • Receive In-phase and Quadrature (I and Q) sampling
  • Root Raised Cosine (RRC) Channel filtering
  • Correlation
  • Frame, slot and symbol timing extraction
  • Half-symbol spaced ("T/2") equalisation and demodulation
  • Carrier phase and frequency offset tracking
  • Scrambling, and Hamming and Reed-Solomon Forward Error Correction (FEC) codes
  • Automatic Gain Control (AGC) and power control
  • Transmit RRC IQ and IF synthesis
  • Quadrature Phase-Shift Keying (QPSK) and Quadrature Amplitude Modulation (QAM16 and QAM64) operation, depending on link conditions

Detailed design required each function to be allocated to hardware, software, or a mix of the two, at an early stage, with well-defined interfaces. Both contribute necessary resources but neither is adequate on its own.

Simplicity reigns

Perhaps contrary to modern trends, simplicity is the key. Although device performance is ever increasing (which has come to the rescue of some projects!), for a given technology a simpler design will always be cheaper, faster, and consume less power. Here are a few areas where simplicity can be won or lost:

  • Each resource should be used for functions that make the best use of its strengths, i.e. functions that can be readily implemented using its native primitive operations. Hardware is perfect for deterministic, repetitive and logical processes at continuous high speed. Software provides far greater decision-making capability and flexibility.
  • Careful frequency planning is important: integer-related IF, symbol, logic and sampling frequencies can greatly reduce the complexity of key operations like IQ sampling and mixing to and from the IF. Here, the IF was twice the symbol frequency, and the sampling frequency was eight times the symbol frequency (four times the IF). This allowed direct IQ subsampling and very simple mixing to and from IF by assigning samples to I, Q, -I and -Q in sequence. Non-integer relationships can be accommodated through the use of numerical oscillators, CORDIC processors, and the like. These are extremely useful, but consume silicon area, power, and time, and should only be used when careful system design confirms their necessity.
  • The use of a common clock for the logic and DSP processor makes communication between the two faster, more deterministic, and reliable.
  • For processing-intensive functions (e.g. correlation and equalisation), pipelining and/or parallelism are key hardware advantages. The silicon efficiency of fixed-architecture hardware is highest when running the logic at its maximum speed.
  • Where coefficients have constant or trivial values and where there is an advantage in not using a multiplier or adder, the rearrangement of some algorithms at bit level can reduce complexity. This applies equally well to data, e.g. when constructing complex modulation from symbol values. Rather than building a complete Finite Impulse Response (FIR) filter, a look-up table (LUT) can be used. The table length is given by the product of bits per symbol, samples per symbol, and the extent (in symbols) of the filter transfer function. This approach is fast, readily adaptable (by changing the table entries or using multiple banks), and handles a wide range of linear and non-linear symbol-based modulation schemes. This method is used in the Radiant modem, with a 64-entry table for each modulation type, incorporating the necessary scaling to give the required transmit power in each case. Note that this approach is also particularly efficient when used with FPGAs that employ LUTs in their internal construction.
  • High level tools should be used with due consideration of the target architecture - portability usually comes at a price. One can describe processes in C or VHDL that read elegantly but give rise to cumbersome implementations. Tools can sanitise design entry, but they should not be expected to replace design effort.
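
The frequency-planning point can be illustrated with a minimal sketch, assuming the sampling rate is exactly four times the IF as described above. The carrier's cosine and sine then only take the values 0, +1 and -1, so mixing reduces to routing samples and flipping signs. The names `to_if`, `from_if`, `COS4` and `SIN4` are illustrative and are not taken from the modem design. With the sign convention chosen here the receive sequence comes out as I, -Q, -I, Q; the opposite choice for the sign of Q gives the I, Q, -I, -Q order.

```python
# Quarter-rate mixing: with the sampling rate at exactly 4x the IF,
# the carrier phase advances by pi/2 per sample, so cos and sin
# only ever take the values 0, +1 and -1.
COS4 = (1, 0, -1, 0)   # cos(n*pi/2) for n = 0..3
SIN4 = (0, 1, 0, -1)   # sin(n*pi/2) for n = 0..3


def to_if(i_samples, q_samples):
    """Mix baseband I/Q (one value per IF-rate sample) up to a real IF signal."""
    return [i * COS4[n % 4] - q * SIN4[n % 4]
            for n, (i, q) in enumerate(zip(i_samples, q_samples))]


def from_if(x):
    """Split real IF samples into I and Q streams using routing and sign flips only."""
    i_out, q_out = [], []
    for n, sample in enumerate(x):
        phase = n % 4
        if phase == 0:
            i_out.append(sample)      # +I
        elif phase == 1:
            q_out.append(-sample)     # sample carries -Q
        elif phase == 2:
            i_out.append(-sample)     # sample carries -I
        else:
            q_out.append(sample)      # +Q
    return i_out, q_out
```

Each output stream runs at half the sampling rate, and no multiplier or numerical oscillator is involved. In hardware the same idea amounts to a two-bit counter driving a multiplexer and a negate control.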
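
The look-up-table idea can be sketched in the same spirit. The example below is an assumption-laden illustration, not the Radiant modem's table layout: the tap values in `PULSE` are placeholders rather than a real RRC response, and `SPS`, `LEVELS`, `TABLE`, `shape` and `shape_direct` are hypothetical names. It precomputes every product of symbol level and filter tap once, so that pulse shaping at run time needs only table reads and additions, and it includes a conventional multiply-accumulate filter for comparison.

```python
SPS = 4                               # samples per symbol (illustrative)
PULSE = (1, 3, 6, 9, 11, 9, 6, 3)     # placeholder taps spanning 2 symbols, not a real RRC
LEVELS = (-3, -1, 1, 3)               # amplitude levels on one axis of QAM16

# Every level-times-tap product is computed once, ahead of time.
# Transmit power scaling could be folded into these entries as well.
TABLE = {level: tuple(level * tap for tap in PULSE) for level in LEVELS}


def shape(symbols):
    """Pulse-shape one axis using table reads and additions only."""
    if not symbols:
        return []
    out = [0] * ((len(symbols) - 1) * SPS + len(PULSE))
    for k, sym in enumerate(symbols):
        for j, value in enumerate(TABLE[sym]):
            out[k * SPS + j] += value
    return out


def shape_direct(symbols):
    """Reference version: zero-stuff the symbols, then run a multiplying FIR filter."""
    if not symbols:
        return []
    up = []
    for sym in symbols:
        up.append(sym)
        up.extend([0] * (SPS - 1))
    n_out = (len(symbols) - 1) * SPS + len(PULSE)
    out = []
    for n in range(n_out):
        acc = 0
        for j, tap in enumerate(PULSE):
            if 0 <= n - j < len(up):
                acc += tap * up[n - j]
        out.append(acc)
    return out
```

Both functions produce identical samples for any symbol sequence drawn from `LEVELS`. Switching modulation type then means switching to a different table bank rather than changing the datapath.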

Traditionally, a hardware front end runs continuously at the ADC sampling rate to process the incoming signal, implementing a complete signal path which feeds partially processed results (e.g. symbols or bits) on to the DSP. This can be inflexible, however, and certainly would not be adequate here because of the many modes of operation required, many of them responsive to "live" conditions. The final solution employed in the modem was therefore to use the hardware in two complementary ways:

  • Functions which are intimately related to time-critical signals (e.g. front and back end sampling and filtering, frame timing) run continuously in hardware to provide truly deterministic operation. These functions tend to transfer data to and from real-time memory, mapping time to signal memory location and thereby relieving other functions of this burden (for example, the hardware correlator, which is used to find precise symbol positioning, informs the processor where in memory to initiate a demodulation process, not when to do it).
  • The remaining functions interact with the software through control, status and data registers, mapped into the processor address space. The trick is to design in flexibility through the use of coefficients, parameters and flags, with an adequate repertoire that does not compromise performance. Such functions operate as peripherals, hardware accelerators or "toolboxes" for the software, allowing their intelligent and flexible use, so that the software is free to construct and adapt the overall processing algorithm as necessary. The software benefits both from the speed and parallelism of the hardware (both between different hardware functions and with the software) and from a simple interface mechanism, so that enough processing capacity remains to handle those tasks which are performed in software. Even if the processor takes a large fraction of a microsecond to get round to a task, as long as each task is completed in adequate time all is well. Run-time concerns have been migrated from symbol level to slot level, which is far more appropriate.
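
The "where, not when" idea from the first point can be sketched as follows. This is a software stand-in for illustration only: the modem's correlator is a hardware function, and the sync word, buffer contents and the names `SYNC`, `find_sync` and `payload_start` are assumptions made for the example. A 7-chip Barker code is used simply because it is a well-known sequence with a sharp correlation peak.

```python
SYNC = (1, 1, 1, -1, -1, 1, -1)   # 7-chip Barker code, used here as a stand-in sync word


def find_sync(buffer, sync=SYNC):
    """Return (index, score) of the best correlation between buffer and sync word."""
    best_index, best_score = 0, None
    for start in range(len(buffer) - len(sync) + 1):
        # zip stops at the end of the sync word, so this is one correlation window
        score = sum(b * s for b, s in zip(buffer[start:], sync))
        if best_score is None or score > best_score:
            best_index, best_score = start, score
    return best_index, best_score


def payload_start(buffer, sync=SYNC):
    """Buffer index at which demodulation of the payload should begin."""
    index, _ = find_sync(buffer, sync)
    return index + len(sync)
```

Because the received samples are already sitting in memory, the result is an address rather than an interrupt with a deadline. The software can act on it whenever it is ready within the slot, for example `buffer[payload_start(buffer):]` selects the samples to demodulate.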

The end application may be unusual, but this basic approach is applicable to any high-performance DSP-based wireless product. With care, "the best of both worlds" can be achieved, as long as sufficient effort is given to task partitioning and interfacing - both between the hardware and the software, and between the logic designers and the programmers!