# Analogue CMOS Direct Sequence Spread Spectrum Transceiver with Carrier Recovery Employing Complex Spreading Sequences

Abstract—DSSS (direct sequence spread spectrum) is a form of digital communication where each bit is represented by a unique spreading sequence. DSSS is a promising technique for implementing high speed wireless links. For any DSSS communication system to be successful, an effective, low cost transceiver is required. This paper presents transceiver systems which have the potential to fulfil this role. An analogue DSSS transceiver is presented to demonstrate an ideal receiver and a DDC-CRL (decision directed Costas carrier recovery loop) is presented which performs carrier and phase estimation in the receiver. These systems operate anywhere over a 20 MHz bandwidth within the 2.4 GHz to 2.4835 GHz ISM (industrial, scientific and medical) band and are implemented for the 0.35 µm CMOS process from Austria Microsystems (AMS).

*Index Terms*— DSSS, analogue DDC-CRL, complex spreading sequences, cross coupled oscillator, power amplifier, Costas loop, low noise amplifier

## I. INTRODUCTION

DSSS refers to the application of spread spectrum theory to a communication system, where each bit in a binary data stream is multiplied by a unique spreading sequence prior to modulation. Drawing the spreading sequences from a family of CSS (complex spreading sequences) offers several advantages over conventional binary spreading sequences.

Such a system is presented in [1], where QPSK (quadrature phase shift keying) is the modulation scheme used. In-phase and quadrature components are required in the transmitted signal to facilitate carrier recovery and phase estimation at the receiver and to represent the real and imaginary components of the CSS.

A DDC-CRL performs carrier recovery and phase estimation by comparing the mismatch between the in-phase and quadrature components of the received signal in the presence of a frequency or phase error between the local carrier reference and the carrier of the received signal.

Previous DSSS transceivers were digital in nature and required complicated digital architectures to realize the mathematical operations required.

The analogue implementation presented greatly simplifies the transceiver structure by using analogue systems which closely approximate the ideal mathematical operations required. This allows for operations in the transceiver to be performed without digital memory elements and negates the requirement for a high speed digital controller.

## II. DESIGN SPECIFICATIONS

The transceiver specifications are dictated by channel characteristics and limitations imposed by the CSS used. The DDC-CRL performance is primarily quantified by its operating bandwidth, tracking range and BER (bit error rate).

### A. Operating Bandwidth

The transceiver and DDC-CRL are designed to comply with the IEEE 802.11 wireless standard, which defines 20 MHz channels within the 2.4 GHz ISM band. The operating bandwidth of these systems is limited to 20 MHz at 2.4 GHz to comply with this standard.

### B. Complex Spreading Sequences

The CSS used are from the family of GCL (generalized chirp-like) sequences [2]. These sequences have excellent auto- and cross-correlation properties and perfectly flat spectra. The sequence length directly influences the signal bandwidth and is chosen to keep the signal bandwidth within the channel restrictions.
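As a rough illustration of these properties, the sketch below generates a Zadoff-Chu sequence, which is one member of the GCL family, and checks its periodic autocorrelation and spectrum numerically. The length of 13 matches the sequence length quoted for testing later in the paper; the root index of 1 and all function names are assumptions made for illustration and are not taken from the design.

```python
import cmath


def zadoff_chu(length, root=1):
    """Zadoff-Chu sequence of odd length (root must be coprime with length)."""
    return [cmath.exp(-1j * cmath.pi * root * n * (n + 1) / length)
            for n in range(length)]


def periodic_autocorrelation(seq, lag):
    """Periodic (cyclic) autocorrelation of seq at the given lag."""
    n = len(seq)
    return sum(seq[k] * seq[(k + lag) % n].conjugate() for k in range(n))


def spectrum_magnitudes(seq):
    """Magnitude of each DFT bin of seq, computed directly."""
    n = len(seq)
    return [abs(sum(seq[k] * cmath.exp(-2j * cmath.pi * f * k / n)
                    for k in range(n)))
            for f in range(n)]


if __name__ == "__main__":
    css = zadoff_chu(13)
    # Largest off-peak autocorrelation magnitude (ideally zero)
    print(max(abs(periodic_autocorrelation(css, lag)) for lag in range(1, 13)))
    # Spread between the largest and smallest DFT bin (ideally zero)
    mags = spectrum_magnitudes(css)
    print(max(mags) - min(mags))
```

For this stand-in sequence every chip has unit magnitude, the off-peak periodic autocorrelation vanishes to within rounding error, and all DFT bins have the same magnitude, which is what "perfectly flat spectra" refers to.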

### C. Tracking Range

The tracking range of the DDC-CRL is the range over which the loop is able to lock onto and track a carrier. If left unconstrained, the loop may lock onto a signal in an adjacent channel. The pull-in and hold-in ranges of the DDC-CRL are therefore restricted to ±10 MHz around the carrier, which corresponds to the full 20 MHz operating bandwidth of the DDC-CRL.

The tracking range of the DDC-CRL is constrained by the internal bandwidth of the DDC-CRL and device limitations which restrict the tuning range of the VCO.

### D. Bit Error Rate

The digital implementation in [3] achieved a BER of $10^{-3}$ without any form of source coding. This figure was adopted as the worst acceptable BER for the transceiver and DDC-CRL.

### E. Additional Specifications

The transceiver can operate in both balanced and unbalanced configurations. The balanced configuration provides higher noise immunity, whereas the unbalanced configuration allows the data rate to be increased by up to four times. A transmitted power of 10 dBm and a receiver sensitivity of -71 dBm were chosen. A typical power dissipation of 120 mW for the ideal transmitter and receiver and 100 mW for the DDC-CRL was also specified.

## III. DESIGN METHODOLOGY

Prior to the development of the CMOS implementation of the transceiver and DDC-CRL, mathematical models were developed in the MATLAB/SIMULINK simulation environment. The models were used to test conceptual designs and identify problems inherent to these designs. The models also served as an important analysis and rapid prototyping tool during the development of the CMOS transceiver and DDC-CRL.

The models were used to test specifications that were impractical to verify in SPICE, and served as a platform for the development of the SPICE models required to construct a test bench for the transceiver and the DDC-CRL.

Each subsystem of the mathematical models was used as the basis for the development of each CMOS subsystem and for performance comparisons.

## IV. CONCEPTUAL DESIGN

The ideal transceiver realizes equations (1) and (2).  

$$
\begin{aligned}
u(t) = {} & a(t)C_r \cos \omega_c t + a(t)C_i \cos \omega_c t \\
          & + a(t)C_r \sin \omega_c t + a(t)C_i \sin \omega_c t
\end{aligned}
\tag{1}
$$

$$
a_{1}(t) = \begin{cases} 1, & T_{b} < t < 2T_{b} \ \text{iff} \ \left. \left\{ \int_{0}^{T_{b}} \left[ u(t)C_{r} \cos \omega_{c} t \otimes \mathcal{F}^{-1}\, \Pi \left( \frac{f}{BW} \right) \right] dt \right\} \right|_{T_{b}} > 0, \\ -1, & T_{b} < t < 2T_{b} \ \text{otherwise} \end{cases}
\tag{2}
$$

where $u(t)$ is the transmitted signal, $a(t)$ is the digital data signal in NRZ (non-return-to-zero) form, $C_r$ and $C_i$ are the real and imaginary parts of the CSS, $\omega_c$ is the carrier frequency, $T_b$ is the bit period and $BW$ is the signal bandwidth. Figure 1 shows the conceptual design of the ideal transceiver.



Fig. 1. Block diagram of the full transceiver system (including the channel) showing the transmission and recovery processes.

The data signal that has to be transmitted is first processed digitally to distribute the data along the four branches of the system (depending on whether the user has chosen the balanced or unbalanced configuration). Each of the four signals is spread with either the real or imaginary part of the CSS. Spreading is followed by modulation, in which the four spread signals are modulated onto quadrature or in-phase carriers, creating orthogonal signals. Finally, the signals are summed to form the transmitted signal. The transmitted signal power is significantly amplified prior to transmission by means of a power amplifier.

The DDC-CRL reduces to an ideal receiver if the carrier recovery element is removed, thus ideal receiver operation is not discussed individually.

The LNA (low noise amplifier) in the receiver compensates for any attenuation that the transmitted signal has undergone.

The conceptual design of the DDC-CRL is based on the DDC-CRL topology presented in [3], which is derived from a Costas loop for carrier recovery in a QPSK based system. Figure 2 shows the conceptual design used.



Fig. 2. Functional diagram of the DDC-CRL. The core of the DDC-CRL is the phase detector marked as FU4 and the local carrier reference marked as FU5.

The DDC-CRL consists of five main subsystems, with a large degree of symmetry within the system. It has two branches, an in-phase branch and a quadrature branch, which are identical and differ only in their inputs. The remainder of the DDC-CRL consists of a phase detector and a variable carrier reference.

Both the in-phase and quadrature branches have two main stages: a despreading and demodulation stage, and an integrate and dump stage. The received signal is first despread by multiplying it with the relevant spreading sequence. The sequence used for the in-phase component is orthogonal to the sequence used for the quadrature component, and thus no form of signal separation is required prior to despreading. For the despreading operation to be successful, spreading code lock is required. Spreading code lock is performed by a CDLL (coherent delay lock loop); such a loop is demonstrated in [1]. Demodulation is achieved by multiplying the despread signal with a local carrier reference, which produces a term at baseband and a term at double the carrier frequency. The double frequency term is heavily attenuated by the integrate and dump stage, which also processes the baseband term.
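The two mixing products described above follow from a standard product-to-sum identity. As a sketch, assume a unit-amplitude received carrier and a local reference with a frequency error $\Delta\omega$ and a phase error $\phi$ (both symbols are introduced here for illustration only):

$$
\cos(\omega_c t)\,\cos\big((\omega_c + \Delta\omega)t + \phi\big) = \tfrac{1}{2}\cos(\Delta\omega t + \phi) + \tfrac{1}{2}\cos\big((2\omega_c + \Delta\omega)t + \phi\big)
$$

The first term is the baseband (difference) term, which carries the frequency and phase error and is passed on to the integrator. The second term lies at roughly twice the carrier frequency and is the one that the integrate and dump stage suppresses.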

The integrate operation converts the sequence envelope to a ramp which is more easily used in the bit detection and phase detection stages. The integrator is reset at the end of each bit period so that each bit is represented by a distinct ramp. The bandwidth of the integrator determines the tracking range of the DDC-CRL, by suppressing any difference terms not within the integrator bandwidth prior to phase detection.

The output of the integrate and dump stage is processed by a comparator which performs bit detection based on the polarity of the integrator output. The comparator output is used both as the final data output and as an input to the phase detector.
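The chain described so far (spreading, modulation, summation, then despreading, demodulation, integrate and dump, and comparison) can be sketched numerically. The following is a minimal discrete-time model under several assumptions that are not taken from the paper: a length-13 Zadoff-Chu sequence stands in for the CSS, the carrier is scaled down to two cycles per chip, the channel is ideal, code and carrier lock are assumed, and each of the four branches carries its own bit (the unbalanced configuration). All names are hypothetical.

```python
import cmath
import math
from itertools import product

SAMPLES_PER_CHIP = 16  # assumption: simulation resolution
CYCLES_PER_CHIP = 2    # assumption: scaled-down carrier, integer cycles per chip


def chirp_sequence(length=13, root=1):
    """Zadoff-Chu sequence used as a stand-in for the CSS."""
    return [cmath.exp(-1j * math.pi * root * n * (n + 1) / length)
            for n in range(length)]


def branch_waveforms(seq):
    """Spreading sequence times carrier for each of the four branches."""
    waves = [[], [], [], []]
    for n, chip in enumerate(seq):
        for s in range(SAMPLES_PER_CHIP):
            index = n * SAMPLES_PER_CHIP + s
            phase = 2 * math.pi * CYCLES_PER_CHIP * index / SAMPLES_PER_CHIP
            cos_c, sin_c = math.cos(phase), math.sin(phase)
            waves[0].append(chip.real * cos_c)  # real part on in-phase carrier
            waves[1].append(chip.imag * cos_c)  # imaginary part on in-phase carrier
            waves[2].append(chip.real * sin_c)  # real part on quadrature carrier
            waves[3].append(chip.imag * sin_c)  # imaginary part on quadrature carrier
    return waves


def transmit(bits, seq):
    """Sum the four spread and modulated branches for one bit period.

    bits holds four NRZ values (+1 or -1), one per branch.
    """
    waves = branch_waveforms(seq)
    return [sum(b * w[k] for b, w in zip(bits, waves))
            for k in range(len(waves[0]))]


def receive(signal, seq):
    """Despread, demodulate, integrate over the bit period and compare with zero."""
    decisions = []
    for wave in branch_waveforms(seq):
        integrator = 0.0  # integrator starts each bit period from zero (dumped)
        for rx, ref in zip(signal, wave):
            integrator += rx * ref  # despreading and demodulation, then integration
        decisions.append(1 if integrator > 0 else -1)  # comparator
    return decisions


if __name__ == "__main__":
    css = chirp_sequence()
    for bits in product((1, -1), repeat=4):
        assert receive(transmit(bits, css), css) == list(bits)
    print("all 16 bit patterns recovered")
```

With this stand-in sequence the real and imaginary parts are not perfectly orthogonal over one period, so each integrator output contains a small cross-talk term. In this noiseless sketch the cross-talk is small enough that the comparator decision is unaffected.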

An error signal proportional to the frequency error is generated by the phase detector. The phase detector consists of two multipliers and a difference amplifier. The integrator output of each branch is multiplied by the comparator output of the opposite branch (for example, the in-phase branch integrator output is multiplied by the quadrature branch comparator output). The difference between the two products forms the raw error signal. This error signal consists of an AC and a DC term. The magnitude of the DC term is proportional to the magnitude of the frequency error, and the polarity of the DC term indicates whether the frequency error is positive or negative. Any AC term is undesirable.
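A short worked example may help to show why the cross-multiplication yields a usable error signal. Assume, purely for illustration, that the two integrator outputs at the end of a bit period are a QPSK symbol with data $a, b \in \{-1, +1\}$ rotated by a static phase error $\phi$, and that $|\phi| < \pi/4$ so that both comparator decisions are correct ($\hat{a} = a$, $\hat{b} = b$). The symbols $I$, $Q$, $\hat{a}$, $\hat{b}$, $e$ and the sign convention are assumptions of this sketch:

$$
I = a\cos\phi - b\sin\phi, \qquad Q = a\sin\phi + b\cos\phi
$$

$$
e = Q\,\hat{a} - I\,\hat{b} = (a^2 + b^2)\sin\phi + ab\cos\phi - ab\cos\phi = 2\sin\phi
$$

Under these assumptions the data-dependent terms cancel, and the error depends only on the phase error: its sign gives the direction of the error and, for small $\phi$, its magnitude is approximately proportional to $\phi$.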

The phase detector output, after low pass filtering, is used to control a variable oscillator. The control signal drives the variable oscillator to the desired carrier frequency, and the output of the oscillator is used to demodulate the received signal. To accommodate the quadrature branch of the DDC-CRL, the oscillator output is shifted by 90 degrees.
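The closed-loop behaviour can be illustrated with a per-bit, discrete-time analogy of the loop. This is only a sketch and not the analogue circuit: the received signal is reduced to one QPSK symbol per bit, the low pass filter and oscillator are replaced by a proportional-plus-integral update, and the gains `kp` and `ki` are hypothetical values chosen for a stable example.

```python
import cmath


def phase_detector(i_out, q_out):
    """Multiply each integrator output by the opposite comparator decision
    and take the difference of the two products."""
    i_dec = 1.0 if i_out > 0 else -1.0
    q_dec = 1.0 if q_out > 0 else -1.0
    return q_out * i_dec - i_out * q_dec


def track_carrier(freq_offset, n_bits=400, kp=0.1, ki=0.005):
    """Track a carrier offset of freq_offset radians per bit period.

    kp and ki are hypothetical loop gains, not values from the paper.
    Returns (residual phase error, oscillator frequency estimate).
    """
    symbols = [1 + 1j, -1 + 1j, -1 - 1j, 1 - 1j]  # fixed QPSK data pattern
    rx_phase = 0.0  # accumulated phase of the received carrier
    lo_phase = 0.0  # phase of the local carrier reference
    lo_freq = 0.0   # oscillator frequency offset set by the filtered error
    for k in range(n_bits):
        rotated = symbols[k % 4] * cmath.exp(1j * (rx_phase - lo_phase))
        error = phase_detector(rotated.real, rotated.imag)
        lo_freq += ki * error             # slowly varying control signal
        lo_phase += lo_freq + kp * error  # oscillator advances one bit period
        rx_phase += freq_offset
    return rx_phase - lo_phase, lo_freq


if __name__ == "__main__":
    residual, estimate = track_carrier(0.05)
    print(residual, estimate)
```

In this simplified model the oscillator frequency estimate settles at the applied offset and the residual phase error decays towards zero, provided the initial error stays small enough for the comparator decisions to remain correct.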

## V. MULTIPLICATION AND CARRIER REFERENCES

Two types of multiplier can be identified in the transceiver and DDC-CRL: fully analogue and pseudo digital.

The modulation, despreading and demodulation multipliers are implemented with Gilbert mixers [4]. For the remaining multipliers, a digital multiplier was not suitable because one of the multiplicands is analogue, and a Gilbert mixer would require additional biasing of the inputs before multiplication could be performed. For these reasons simpler multipliers were implemented using the principle of Figure 3. One of the multiplicands is either a digital high or a digital low and thus always has a known magnitude but unknown polarity. The other multiplicand can assume any (analogue) value. Multiplication is achieved by using only the polarity (or digital state) of the digital input to determine the product.

In the presence of a digital high, the multiplier behaves as a follower, and the analogue input is simply multiplied by '1'. When a digital low is present, the multiplier behaves as a unity gain inverter, which corresponds to multiplication by '-1'.

This allows a simple and effective multiplier to be implemented in the transmitter and phase detector.
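The follower/inverter behaviour described above can be written as a one-line selection. The sketch below is a behavioural model only, with hypothetical function names, and it compares the result with explicit multiplication by the NRZ value of the digital input.

```python
def pseudo_digital_multiply(analogue_in, digital_in):
    """Follower for a digital high, unity gain inverter for a digital low."""
    return analogue_in if digital_in else -analogue_in


def nrz(digital_in):
    """Map a digital state to its NRZ value (+1 for high, -1 for low)."""
    return 1.0 if digital_in else -1.0


if __name__ == "__main__":
    for level in (0.7, -0.25, 0.0):
        for state in (True, False):
            # Both approaches give the same product
            assert pseudo_digital_multiply(level, state) == level * nrz(state)
    print("switching matches explicit multiplication")
```

Because only the state of the digital input is used, no analogue multiplication is needed; the circuit only has to switch between a non-inverting and an inverting unity gain path.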



Fig. 3. Basic principle used to construct the multipliers in the transmitter and phase detector. By employing two of the switching schemes shown, a simple multiplier can be implemented which generates the same output as a more complex analogue multiplier.

To prevent the generation of undesired phase errors caused by a non-ideal carrier reference, it is desired for the in-phase and quadrature references to be exactly 90 degrees out of phase, regardless of the frequency. Techniques for generating phase shift are presented in [4] and require a large amount of signal processing before an accurate shift is achieved. To alleviate the requirement for a phase shifter a reference is required with quadrature outputs regardless of frequency and hardware mismatches. Such a reference is discussed in section VI.

## VI. CMOS IMPLEMENTATION

### A. Serial to Parallel Converter

Serial to parallel conversion is required to distribute the input data to the four branches of the transmitter. The clock signal used to synchronize the conversion is derived from the input data using digital clock recovery techniques [5].

### B. Analogue Mixing

Modulation, despreading and demodulation involve the multiplication of one analogue signal with another. The transceiver and DDC-CRL use Gilbert mixers for these operations. The design of Gilbert mixers is presented in [4] and [6]. These mixers are easily cascaded, but all inputs and outputs are in differential form and require biasing prior to multiplication. The differential output of the demodulation mixer is converted to single-ended form using an active load differential amplifier.

### C. Power Amplifier and Low Noise Amplifier

The power amplifier consists of two voltage gain stages and a class E current amplifier for current gain. The amplifier design is based on that of [7] and [8]. A differential LNA is used in the receiver.

### D. Pseudo Digital Multipliers

Figure 4 shows the multiplier used in the transmitter. The multiplier switches between its two analogue inputs based on the digital state of the digital input. This provides the same result at the output as explicit multiplication of a digital signal with a spreading sequence.



Fig. 4. Pseudo digital mixer in the transmitter.

Phase detector operation is described in section IV. A single phase detector multiplier is shown in Figure 5.



Fig. 5. CMOS implementation of a single multiplier used in the phase detector. The multiplier consists of a unity gain input buffer, a configurable amplifier and switching circuitry to control the amplifier configuration.

Multiplication is performed by a configurable amplifier, where the amplifier configuration is determined by two pairs of NMOS switches. When a digital high is presented to the digital input of the multiplier, the switches configure the amplifier as a unity gain non-inverting amplifier. When a digital low is present, the switches configure the amplifier as a unity gain inverter. This realizes the multiplication by "1" and "-1" with a relatively simple circuit. The feedback loop used within the multiplier ensures linear operation.

### E. Integrate and Dump

Integration is performed using an active non-inverting or Deboo integrator. The gain element within the integrator is implemented with an operational amplifier developed specifically for the DDC-CRL. Dumping refers to rapidly discharging the capacitive element within the integrator, which resets the integrator output to 0 V. A clock signal synchronized with the bit stream is used to drive CMOS switches which discharge the capacitive element at the end of each bit period.

The integrator is the primary subsystem limiting the tracking range of the DDC-CRL since it behaves as a first order low pass filter. Any difference terms generated that lie outside the bandwidth of the integrator are suppressed and have minimal influence on the phase detector. The integrator was designed to have a 3 dB bandwidth of 6 MHz. This figure was chosen to provide linear integration of the CSS while still allowing difference terms below 10 MHz to pass without significant attenuation.
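To get a feel for these numbers, the integrator can be approximated by an ideal single-pole low pass response with its 3 dB point at 6 MHz. This is an assumption of the sketch below; the real Deboo integrator with a non-ideal operational amplifier will deviate from it, and the function name is hypothetical.

```python
import math


def first_order_gain_db(freq_hz, cutoff_hz=6e6):
    """Magnitude response in dB of an ideal single-pole low pass filter."""
    return -10.0 * math.log10(1.0 + (freq_hz / cutoff_hz) ** 2)


if __name__ == "__main__":
    # 3 dB point, edge of the tracking range, and twice the 2.4 GHz carrier
    for f in (6e6, 10e6, 4.8e9):
        print(f"{f / 1e6:.0f} MHz: {first_order_gain_db(f):.2f} dB")
```

Under this idealization a difference term at 10 MHz is reduced by roughly 5.8 dB, while the double frequency term near 4.8 GHz is suppressed by about 58 dB.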

### F. Comparator

The comparator used is a simple two stage comparator with no hysteresis, as presented in [9]. The comparator consists of a differential input stage for accurately controlling the trip point, and a current inverter drive stage to preserve the slew rate in the presence of a capacitive load. The output of the comparator is converted to CMOS digital logic levels by a PMOS/NMOS cascode, which pulls the output up to VDD or down to ground.

### G. Low Pass Filter

This filter is intended to be user definable for several reasons. The settling time of the filter directly influences the tracking performance of the loop by determining the time required by the loop to respond to a step change in the frequency.

Ideally, the filter must only pass DC. This requirement is relaxed to allow frequencies up to approximately 20 kHz to be passed. Frequencies in this range vary relatively slowly compared to the bit rate, and thus introduce an allowable phase jitter.
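As a rough scale check, assume (for illustration only) that the loop filter is a first order low pass filter with cut-off frequency $f_c = 20$ kHz. Its time constant $\tau$ is then

$$
\tau = \frac{1}{2\pi f_c} = \frac{1}{2\pi \times 20 \times 10^{3}\ \text{Hz}} \approx 7.96\ \mu\text{s}
$$

At a data rate of 1 Mbit/s the bit period is $T_b = 1\ \mu\text{s}$, so under this assumption the filter responds over roughly eight bit periods and the error signal is effectively averaged over several bits.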

### H. Carrier References

As stated in section V, a phase shifter is undesirable as it introduces unnecessary hardware complexity. All carrier references are based on a cross coupled oscillator. The oscillator has four outputs with all outputs in quadrature with each other. The VCO implemented is shown in Figure 6.



Fig. 6. CMOS implementation of a cross coupled VCO. All the outputs are in quadrature with each other: if O1 is taken as the 0 degree reference, then O2, O3 and O4 are 180, 270 and 90 degrees out of phase with O1, respectively.

The in-phase and quadrature components are generated by employing direct and cross coupling techniques between two identical LC oscillators. This results in oscillations that are in quadrature [10]. By varying the load capacitance of the oscillator a VCO is implemented which shares all the properties of a cross coupled oscillator.
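The dependence of the oscillation frequency on the load capacitance can be illustrated with the ideal LC tank relation $f_0 = 1/(2\pi\sqrt{LC})$. The sketch below is only an idealization: it ignores parasitics and the effect of the coupling transistors, and the 2 nH inductance is a hypothetical value, not one taken from the design.

```python
import math


def resonant_frequency(inductance_h, capacitance_f):
    """Resonant frequency in Hz of an ideal LC tank."""
    return 1.0 / (2.0 * math.pi * math.sqrt(inductance_h * capacitance_f))


def capacitance_for(frequency_hz, inductance_h):
    """Tank capacitance in farads that places the resonance at frequency_hz."""
    return 1.0 / (inductance_h * (2.0 * math.pi * frequency_hz) ** 2)


if __name__ == "__main__":
    inductance = 2e-9  # hypothetical tank inductance (2 nH)
    c_low_edge = capacitance_for(2.4e9, inductance)
    c_high_edge = capacitance_for(2.4835e9, inductance)
    print(f"C at 2.4000 GHz: {c_low_edge * 1e12:.3f} pF")
    print(f"C at 2.4835 GHz: {c_high_edge * 1e12:.3f} pF")
    print(f"capacitance ratio: {c_low_edge / c_high_edge:.4f}")
```

For an ideal tank the required capacitance scales with the inverse square of the frequency, so covering the 2.4 GHz to 2.4835 GHz band needs a capacitance change of about 7 %, independent of the inductance chosen.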

### I. Operational Amplifier

Central to the operation of the integrator and phase detector is an operational amplifier designed to introduce design flexibility. The op-amp is a two stage un-buffered amplifier with differential input stage and common source gain stage and is based on the design in [9].

## VII. SIMULATION RESULTS

Simulations were performed in both MATLAB and SPICE to verify the operation of the system and to determine whether the specifications were met. The MATLAB results were used as the benchmark for measuring the performance of the CMOS implementation.

### A. Model Comparison

Figure 7 indicates a close correlation between the ideal despreading and demodulation stage in MATLAB and the same operation performed by the despreading and demodulation stage in SPICE.



Fig. 7. Comparison of the despreading and demodulation stage between the MATLAB and SPICE models. Signal lock is an assumed condition. The random bit streams used for the simulations are not identical.

### B. Transmitter Performance

Figure 8 shows the final output of the transmitter. Shown are four individual bit streams on four orthogonal bases with a maximum signal power of 10 dBm.



Fig. 8. Transmitted signal characteristics.

### C. Phase Detector Performance

The phase detector's response to a sustained error is shown in Figure 9 for one of the two conditions under which the loop may operate. The figure shows the phase detector response when the error signal is determined over several bit periods.

When the settling time of the low pass filter is longer than the bit period, the phase detector output is as shown in Figure 9. The figure indicates the presence of phase inversion, where the phase detector generates an error signal with correct magnitude but incorrect polarity. Phase inversion is caused by phase ambiguity between the in-phase and quadrature branches and is inherited from the modulation scheme used. Phase inversions are avoided if differential QPSK is used as the modulation scheme. Phase inversion is less likely to occur in loops where the error signal is determined within one bit period.



Fig. 9. Phase detector response for a sustained frequency error when the error signal is determined over several bit periods. Dashed lines indicate the detector's response in the absence of phase inversion.

The asymmetry of the response in Figure 9 is caused by the distortion introduced by the non-ideal integration of the short CSS used.

### D. System Specifications

#### 1) Operating Bandwidth

The operating bandwidth was restricted to a 20 MHz range by choosing the bit rate and CSS length such that the spread signal occupies this bandwidth. The bit rate used to test the system was 1 Mbps with a CSS length of 13. This allows for a high bit rate at the expense of loop performance. The relatively short sequence used distorts the integrator output and degrades the phase detector's performance.

#### 2) Bit Error Rate

Testing of the BER requires a brute force approach, simulating the system over an extended period of time under normal operating conditions.

The bit error rate was only tested using the MATLAB model because a test of the CMOS implementation requires an impractically long simulation time. Constant verification of the CMOS subsystems against the MATLAB model ensured correlation between the MATLAB and SPICE simulation results. This allows the MATLAB model to be used to test the system BER with confidence. The CMOS implementation specifications are summarized in Table I.

TABLE I. SUMMARY OF SPECIFICATIONS

| Parameter                                       | Value              |
|-------------------------------------------------|--------------------|
| CSS length                                      | Arbitrary          |
| CSS sampling frequency                          | Arbitrary          |
| Data rate (after serial-to-parallel conversion) | 1 Mbit/s           |
| Oscillator frequency                            | 2.4 GHz            |
| Transmission power                              | 10 dBm             |
| Power of the pseudo-received signal             | -71 dBm            |
| Power consumption (transmitter)                 | 125 mW             |
| Power consumption (receiver)                    | 97 mW              |
| Power consumption (DDC-CRL)                     | 125.4 mW           |
| BER                                             | At least $10^{-3}$ |
| Operating bandwidth (DDC-CRL)                   | 20 MHz             |
| Tracking range (DDC-CRL)                        | $\pm 10$ MHz       |

## VIII. CONCLUSION

The translation of the mathematical model to a CMOS implementation was successfully accomplished, but performance problems arose from non-ideal circuit characteristics and from signal characteristics inherent to the modulation scheme used.

The use of a spreading sequence which is relatively short compared to the bit period has resulted in undesirable signal distortion caused by the non-ideal integrating element. This can be resolved by using a longer sequence, at the expense of either a reduced bit rate or an increased bandwidth. Opting for a longer sequence allows for the overall bit rate to be preserved. Longer sequences also have more sequences in a family. This makes it possible to assign multiple bit streams to a user. The transceiver and the DDC-CRL can accommodate this by the addition of a new branch for each bit stream, with only one phase detector and reference required to perform carrier recovery.

Phase inversion is a phenomenon of the QPSK modulation scheme and is most prominent when the error signal is determined over several bit periods. It was found that the likelihood of phase inversion can be decreased by employing Manchester encoding of the bit stream.

It has been shown that an *analogue* transceiver and DDC-CRL can be implemented in CMOS, and that this implementation has several advantages over its digital counterparts. The most prominent advantage is the ability to accept arbitrary spreading sequences and bit rates. The circuit complexity is also significantly reduced when compared to the digital implementations presented in [1] and [3].

## References

- [1] F.E. Marx and L.P. Linde, "A combined coherent carrier recovery and decision-directed delay-lock scheme for DS/SSMA communication systems employing complex spreading sequences," *The Transactions of the SAIEE – Special Issue: CDMA Technology – Changing the face of wireless access*, Vol. 89, No. 3, 1998, pp. 131-139.
- [2] F.E. Marx, "DSSS communication link employing complex spreading sequences," Master dissertation, School of Engineering, University of Pretoria, Pretoria, South Africa, 2005.
- [3] F.E. Marx and L.P. Linde, "Theoretical analysis and practical implementation of a balanced DSSS transmitter and receiver employing complex spreading sequences," *Proceedings of the 4th Africon*, Vol. 1, Stellenbosch, 1996, pp. 402-407.
- [4] B. Razavi, *RF Microelectronics*. New Jersey: Prentice Hall, 1998.
- [5] H. Twyman, (2005, November 6), Digital clock recovery [online]. Available: http://www.twyman.org.uk/clock_recovery.
- [6] R.M.M. Greyling, "Design of an integrated direct sequence spread spectrum (DSSS) transceiver for cellular systems," Final Project Report, Dept. Elect. Eng., University of Pretoria, Pretoria, South Africa, 2003.
- [7] C. Coleman, *An introduction to radio frequency engineering*. Cambridge, New York: Cambridge University Press, 2004, pp. 178-179.
- [8] N.O. Sokal and A.D. Sokal, "Class E: A new class of high-efficiency tuned single-ended switching power amplifiers," *IEEE Journal of Solid-State Circuits*, Vol. SC-10, No. 3, pp. 168-176, June 1975.
- [9] E.A. Allen and D.R. Holberg, *CMOS Analogue circuit design*. New York: Cambridge University Press, 1998, pp. 267-281.
- [10] A. Rofougaran, J. Rael, M. Rofougaran and A. Abidi, "A 900 MHz LC Oscillator with Quadrature Outputs," Dept. Elect. Eng., University of California, Los Angeles, CA, 1996.