FPGA Implementation of BCG Signal Filtering Scheme by Using Weight Update Process

Manjula B. M*1, Chirag Sharma2
1Department of Electronics and Communication Engineering, NMIT, Bangalore
2Department of Electronics and Communication Engineering, NREA, UOM, Mysore
*Corresponding author, email: bmmanjula2001@gmail.com


Keywords: ballistocardiography, weight update process, adaptive filter, noise signal, artifacts

Copyright © 2016 Institute of Advanced Engineering and Science. All rights reserved.

1. Introduction
With recent advances in the medical field, EEG and BCG signals are used in continuous health monitoring systems. Information extracted from EEG and BCG recordings supports the efficient study of brain activity. Neuroscientists use this information, but brain activity is non-reproducible, which makes it difficult to study with EEG alone. EEG recordings are affected by the interaction between the patient's body, the electrodes and the magnetic field. During imaging, switching between magnetic fields generates an electromagnetic field. As a result, EEG signals are obscured by a regular artifact whose amplitude is about 100 times greater than the EEG amplitude [1].

Another source of artifacts is the tilting movement of the user's head during MR scanning. These artifacts can be classified into two categories. The first category covers head movement, amplitudes, etc. The second is the ballistocardiogram (BCG) artifact, which is caused by micro-movements of the head resulting from cardiac pulsation; it affects the alpha band (8-13 Hz) of the EEG with an amplitude of around 150 μV. BCG is also a technique for assessing myocardial and vascular health. BCG signals result from the reaction forces on the body caused by cardiac ejection of blood through the vasculature. Because they carry information about the contraction and relaxation of the heart, myocardial function can be analyzed using BCG signals.

Various approaches have been proposed in recent years for BCG measurement, including tables, beds, chairs, weighing scales and electromagnets [3]. Systems based on weighing scales offer advantages such as ease of use and reliability. However, they suffer from motion artifacts and floor vibrations: motion artifacts are induced by the subject's movement during recording, and floor vibration likewise couples into the recorded signal. Several approaches have been proposed in recent years to remove these artifacts. A. Hoffmann et al. [2] proposed a filtering scheme in which signal components whose frequencies match a power-spectrum template are filtered out.

In an early attempt to remove the BCG artifact, a method based on average subtraction was proposed in [1]. The QRS complexes of the subject's ECG are first detected. Then, a limited number of EEG signal slices corresponding to the QRS timing are averaged to create a template for the BCG artifact, which is subtracted from each channel. This method, called average artifact subtraction (AAS), is very popular [2]. However, the assumption that all the waveforms are similar during the scans is not always valid [8]. To deal with heartbeat timing variations, a weighted averaging approach is proposed in a subsequent study [12]. In [7], the variability of the artifact is addressed using a clustering algorithm. All methods based on averaging require a reference ECG channel. However, in some cases this channel is not present or the heartbeats cannot be detected accurately.

A new type of multipath EEG cap is proposed in [13] that oversamples the electrode space to provide an overcomplete representation of the data. Using the assumption that neural activity is Kirchhoffian and the BCG artifacts are not, the artifacts are removed by solving an overcomplete representation of the single-trial EEG data. Adaptive filtering has also been used for BCG removal [9], [11], [14]. The reference signal comes from a movement detector, i.e., a piezoelectric sensor attached to the body of the subject inside the scanner [9], and median filtering is used instead of simple averaging to create the BCG template [11]. The authors in [15] enhanced their work by exploiting both average subtraction and adaptive filtering.

Different independent component analysis (ICA)-based methods have also been used for BCG removal [2], [16-18]. These methods assume that the brain neural activity, including evoked potentials, oscillatory waves, artifacts caused by muscles, and noise, is mixed linearly and is independent, or at least can be categorized in groups of independent components. As mentioned earlier, three phenomena with different characteristics generate the BCG artifact. This implies that BCG consists of more than one independent component added linearly to the EEG data [1], [10]. Hence, the artifact can still be separated using ICA methods. The advantage of these methods is that they do not require an ECG channel. More importantly, they do not assume that the BCG artifacts are reproducible. Infomax [17] is used in [19] to extract the BCG sources. In [16], fastICA [20], [21] is utilized to remove imaging, BCG, and ocular artifacts. In a comparative work, the performance of Infomax, fastICA, second-order blind identification (SOBI) [22], and complexity pursuit [23] is evaluated and compared to AAS in [2]. A sequential blind extraction method [24] is used in [18] to extract the BCG artifacts, and a simple peak detector is utilized to track the time-varying period.

Based on the assumption that each occurrence of the BCG artifact in any EEG channel is independent of the previous observations, principal component analysis is employed in the optimal basis set (OBS) method [8]. In the next step, a few of the principal components are chosen for each EEG channel as the basis set, which is then fitted (scaled in time and amplitude) and subtracted from each BCG instance. To remove any possible BCG residuals, it is proposed in [4] to apply Infomax to the OBS output.
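The average artifact subtraction procedure described above can be sketched in a few lines. This is a minimal illustration rather than the implementation used in the cited studies: the function name, the fixed window length and the toy data are assumptions, and the beat positions are taken as given instead of being detected from an ECG channel.

```python
def average_artifact_subtraction(signal, beat_starts, window):
    """Subtract the mean beat-locked template from each beat-locked slice."""
    # Keep only beats whose full window fits inside the recording.
    starts = [s for s in beat_starts if s >= 0 and s + window <= len(signal)]
    if not starts:
        return list(signal)
    # Template = sample-wise average of all beat-locked slices.
    template = [
        sum(signal[s + k] for s in starts) / len(starts)
        for k in range(window)
    ]
    cleaned = list(signal)
    # Assumes the windows do not overlap; overlapping windows would be
    # corrected twice in the overlap region.
    for s in starts:
        for k in range(window):
            cleaned[s + k] -= template[k]
    return cleaned


# Toy data (hypothetical): a flat baseline with an identical artifact per beat.
artifact = [0.0, 2.0, 5.0, 2.0]
beats = [3, 12, 21]
eeg = [0.0] * 30
for b in beats:
    for k, a in enumerate(artifact):
        eeg[b + k] += a

cleaned = average_artifact_subtraction(eeg, beats, len(artifact))
residual = max(abs(v) for v in cleaned)
```

Because every artifact in the toy data is identical, the template matches each occurrence exactly. With real recordings the waveforms vary from beat to beat, which is the limitation noted in [8].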

An important issue of concern in BCG artifact removal is selecting the correct number of BCG components. In ICA-based methods, an incorrect assumption about the number of BCGs may influence the independence assumption. It is assumed in [2] that the BCG artifacts are caused only by head movements inside the scanner. In this case, it is mathematically and experimentally shown that the number of independent BCG components is three. Their experiments also show that assuming three BCG components provides reliable results. In another attempt, the number of components is not fixed, and three to six independent components are chosen for different subjects by thresholding the correlation of the estimated independent components (ICs) with the ECG channel [16]. The authors in [8] opted for a conservative approach and fixed the number of components to three. In [13], only the strongest component (in terms of power) from the ICA decomposition of the EEG data is labeled as BCG.

In this paper, we propose an ICA-based blind source extraction method for extracting the sources with periodic statistics. Similar to other ICA methods, it is assumed that the original sources and the mixing medium are generally unknown; however, a priori knowledge about the periodicities helps to improve the extraction performance [25]. This method, called Cyclostationary source extraction (CSE), is used to remove the BCG artifacts from the EEG data recorded inside the MR scanner. The period of the second-order statistics is obtained directly from the EEG data (availability of the ECG channel, necessary to some of the other removal methods is not essential here). In order to find the appropriate number of BCG components, we analyze the outputs of different methods using the defined performance indices. Moreover, we show that the proposed method preserves the remaining data better than the other methods.

In this work we propose a new approach for BCG signal filtering using a modified LMS scheme. The proposed scheme is implemented in MATLAB and simulated with the Xilinx ISE simulator to evaluate the performance of the proposed filtering architecture. A comparative study is presented to show the efficiency of the proposed model in terms of delay, slice LUTs and power.

The remainder of the paper is organized as follows: Section 2 discusses related work on BCG filtering, Section 3 describes the proposed model, Section 4 presents the results and discussion, and Section 5 concludes the work.

2. Related Work

In this section we discuss the most recent work on artifact removal from BCG signals using various filtering techniques. As discussed in the previous section, the BCG signal is important for extracting information about the human brain and its activity, but artifacts are very strong in this signal. These artifacts are caused by electromagnetic induction from heartbeat-related movement and flow-related movements in a subject [12]. Several methods have been proposed for removing BCG artifacts from EEG. These include averaging [2], independent component analysis (ICA) [3], adaptive filtering [4], and PCA “optimal basis sets” (OBSs) [5]. These methods can provide cleaner looking EEG with reduced power at the BCG frequencies; however, in our hands, they reduce single-trial EEG classification performance when compared to not removing BCG at all, i.e., these methods attenuate or distort that part of the underlying neural signal that is informative for single-trial classification. This finding may be specific to situations where the number of trials is extremely limited. However, for such cases, single-trial classification methods are appealing because they are supervised and multivariate (e.g., can integrate spatially), and are hence applicable to recordings with very low SNR [6-7]. Template- or regression-based BCG removal approaches assume that the BCG artifact is reproducible and that a reference ECG recording is available. ICA approaches, on the other hand, are appealing because they do not rely on an ECG regressor signal and do not assume that the artifact is exactly reproducible. However, for ICA, the artifact is extracted by linear unmixing using matrices that are estimated blindly from the data. Hence, the ICA approach can unmix only to the extent that the mixture model is valid in combination with the independence assumption.

Continuous measurement of BCG signals using a wearable device would greatly enhance the capabilities of the technique for assessing cardiovascular health at home. If BCG signals were continuously obtained throughout the day and night, then specific responses of cardiac output and contractility to perturbations such as ambient temperature [19], posture [20], activity [21], and sleep [22] could be gathered, and a more comprehensive picture of the person’s cardiovascular health could be obtained. Accordingly, researchers have developed wearable systems based on miniature accelerometers to attempt to measure BCG signals continuously [23-24]. However, since the morphology and timing of these signals are significantly different from BCG signals measured using the weighing scale [25], or other historical techniques such as the Starr Table [11], the analysis and interpretation techniques developed for BCG signals should not directly be applied to these wearable acceleration measurements. For example, the time interval between the electrocardiogram (ECG) R-wave peak and the BCG J-wave peak (the R-J interval) was typically 250 ms for a healthy adult [17] measured with the static-charge-sensitive bed apparatus, and ranged from 203–290 ms for 92 healthy subjects participating in a study with the weighing scale system [26], whereas for the accelerometer-based wearable system the R-J interval was found to be between 150–180 ms [23]. Similar results were found by Wiard et al. with an accelerometer-based BCG system, where the R-J interval was 133 ms [27]. Cardiac timing measurements such as the R-J interval are clinically important for a number of reasons. Calcium ions regulate contractility and relaxation of the heart, and recycling of these ions controls the timing of cardiac events. Regulation of calcium ions is thus critically important in mechanical dysfunction and arrhythmia.

3. Proposed Model

In this section we discuss the LMS algorithm for filtering. In this approach, the filter output is computed and subtracted from the desired signal, and the resulting error is used to update the tap coefficients before the next sample arrives. The disadvantage of the existing LMS filtering approach is the delay that results from this decision-making process. Similar issues arise in parallel architectures, i.e., pipelined or systolic structures, which introduce delay into the architecture. To address these issues, we propose a new signal filtering architecture based on a weight update process.

The weight update is performed using Equation (1):

\[ W(s) = W(s-1) + \xi\, e(s-D)\, X(s-D) \tag{1} \]

The error is computed as:

\[ e(s-D) = d(s-D) - y(s-D) \tag{2} \]

and the filtered output is given as:

\[ y(s-D) = X'(s-D)\, W(s-D-1) \tag{3} \]

\( X(s) \) is the input vector, which can be denoted as:

\[ X(s) = [x(s), x(s-1), ..., x(s-N+1)]' \tag{4} \]

The weight (coefficient) vector is given as:

\[ W(s) = [w_0(s), w_1(s), ..., w_{N-1}(s)]' \tag{5} \]

Here \( d(s) \) is the desired signal, \( \xi \) is the step size, \( D \) is the adaptation delay and \( N \) is the number of taps.
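Equations (1) to (3) can be illustrated with a short floating-point simulation. The sketch below is only one possible reading of the recursion, assuming that ξ acts as the step size and that the weights are zero before the first update; the function name `dlms_filter`, the 2-tap test system and all numeric settings are hypothetical and are not taken from the hardware design.

```python
import random


def dlms_filter(x, d, n_taps, step, delay):
    """Delayed LMS: the weights at time s are corrected with the error from s - delay."""
    w = [0.0] * n_taps
    w_hist = []                      # w_hist[s] holds W(s)
    errors = []
    padded = [0.0] * (n_taps - 1) + list(x)
    for s in range(len(x)):
        t = s - delay                # index of the delayed sample, s - D
        if t >= 0:
            # X(s-D) = [x(s-D), x(s-D-1), ..., x(s-D-N+1)]'
            x_vec = [padded[t + n_taps - 1 - k] for k in range(n_taps)]
            # W(s-D-1); assumed to be all zeros before the first update
            w_old = w_hist[t - 1] if t >= 1 else [0.0] * n_taps
            y = sum(a * b for a, b in zip(x_vec, w_old))          # Eq. (3)
            e = d[t] - y                                          # Eq. (2)
            w = [wk + step * e * xk for wk, xk in zip(w, x_vec)]  # Eq. (1)
            errors.append(e)
        w_hist.append(list(w))
    return w, errors


# Hypothetical test: identify a known 2-tap system from a random input.
random.seed(0)
true_w = [0.5, -0.3]
x = [random.uniform(-1.0, 1.0) for _ in range(4000)]
d = [true_w[0] * x[n] + true_w[1] * (x[n - 1] if n > 0 else 0.0)
     for n in range(len(x))]
w_est, errs = dlms_filter(x, d, n_taps=2, step=0.05, delay=3)
```

Setting `delay=0` reduces the loop to the conventional LMS update. A larger delay generally calls for a smaller step size to keep the recursion stable.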

To minimize the mean square error, averages are taken over the input signal and the error signal. Equations (2) and (3) reduce to the conventional LMS algorithm for \( D = 0 \); iterating Equation (1) drives the weights toward the optimal solution \( W_{opt} \), which can be computed as:

\[ W_{opt} = H^{-1}P \tag{6} \]

where \( H = \text{avg}(X(s)X'(s)) \) is the autocorrelation matrix of the input and \( P = \text{avg}(X(s)d(s)) \) is the cross-correlation between the input and the desired signal.
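For a 2-tap filter, the optimal weights of Equation (6) can be evaluated directly from sample averages. The sketch below assumes the standard reading of that equation, with H as the averaged outer product of the input vector with itself and P as the averaged product of the input vector with the desired signal. The function name and the test signal are hypothetical.

```python
import random


def wiener_2tap(x, d):
    """Estimate W_opt = H^{-1} P for a 2-tap filter from sample averages."""
    n = len(x)
    h00 = h01 = h11 = p0 = p1 = 0.0
    for s in range(1, n):
        x0, x1 = x[s], x[s - 1]      # X(s) = [x(s), x(s-1)]'
        h00 += x0 * x0
        h01 += x0 * x1
        h11 += x1 * x1
        p0 += x0 * d[s]
        p1 += x1 * d[s]
    m = n - 1
    h00, h01, h11, p0, p1 = (v / m for v in (h00, h01, h11, p0, p1))
    det = h00 * h11 - h01 * h01
    if det == 0.0:
        raise ValueError("H is singular for this input")
    # Closed-form inverse of the symmetric 2x2 matrix H, applied to P.
    return [(h11 * p0 - h01 * p1) / det, (h00 * p1 - h01 * p0) / det]


# Hypothetical test system: d(s) = 0.5 x(s) - 0.3 x(s-1)
random.seed(1)
x = [random.uniform(-1.0, 1.0) for _ in range(500)]
d = [0.0] + [0.5 * x[s] - 0.3 * x[s - 1] for s in range(1, len(x))]
w_opt = wiener_2tap(x, d)
```

In this noise-free case the direct solution recovers the generating coefficients up to rounding error. The iterative update of Equation (1) approaches the same point without forming or inverting H, which is what makes it attractive for hardware.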

![Figure 1. Architecture of the LMS Algorithm for Signal Filtering](image_url)

This architecture contains an FIR filtering block and a weight update block. The weights of the LMS adaptive filter are updated during each iteration (Figure 1). In the \( i \)th iteration this process can be defined as:

$$W_{i+1} = W_i + s\, E_i\, X_i \tag{7}$$

where $E_i$ is the computed error, $s$ denotes the step size and $X_i$ is the input vector. The error $E_i$ can be written as:

$$E_i = R_{d,i} - y_{out,i} \tag{8}$$

where $R_{d,i}$ is the desired response of the signal and $y_{out,i}$ is the filtered output signal, which is computed as:

$$y_{out,i} = W_i^T X_i \tag{9}$$

$W_i$ is the weight vector in the $i$th iteration. The input vector and weight vector are given as:

$$X_i = [x_i, x_{i-1}, ..., x_{i-N+1}]^T \tag{10}$$

$$W_i = [w_i(0), w_i(1), ..., w_i(N-1)]^T \tag{11}$$
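As a worked illustration of Equations (7) to (9), consider a single iteration of a 2-tap filter, writing the error at iteration $i$ as $E_i$. The numbers are hypothetical and chosen only to make the arithmetic easy to follow; they are not taken from the recorded data.

$$W_i = [0.5,\ 0]^T, \quad X_i = [1,\ 2]^T, \quad R_{d,i} = 1.5, \quad s = 0.25$$

$$y_{out,i} = W_i^T X_i = 0.5 \cdot 1 + 0 \cdot 2 = 0.5$$

$$E_i = R_{d,i} - y_{out,i} = 1.5 - 0.5 = 1.0$$

$$W_{i+1} = [0.5,\ 0]^T + 0.25 \cdot 1.0 \cdot [1,\ 2]^T = [0.75,\ 0.5]^T$$

Re-evaluating the same input with the updated weights gives an output of $0.75 \cdot 1 + 0.5 \cdot 2 = 1.75$ and an error of $-0.25$, so the magnitude of the error falls from 1.0 to 0.25 after one update.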

In a pipelined architecture, the error becomes available only after $m$ delay cycles, where $m$ is the adaptation delay of the architecture. In the delayed LMS (DLMS) algorithm used here, this delayed error is used to update the current weights.

This weight update process is given by

$$W_{i+1} = W_i + s\, E_{i-m}\, X_{i-m} \tag{12}$$

In the proposed design, the architecture is divided into two parts: (i) reducing the delay using a pipelined architecture and (ii) the weight update.

The weight update of the proposed LMS algorithm can be written as:

$$W_{i+1} = W_i + s\, E_{i-i_1}\, X_{i-i_1} \tag{13}$$

where $E_{i-i_1} = R_{d,i-i_1} - y_{out,i-i_1}$ and

$$y_{out,i} = W_{i-i_2}^T X_i \tag{14}$$

Here $i_1$ and $i_2$ are read as the delays, in cycles, of the error-computation block and the weight-update block, respectively.

The proposed modified LMS algorithm separates the computation into an error-computation block and a weight-update block. This separation allows the feed-forward error computation to be pipelined, which reduces the adaptation delay.

a. Weight Update

During the weight update process, $N$ multiply-and-accumulate operations are performed to update the weights for each input sample. The step size $s$ is chosen as a negative power of two, so that scaling by $s$ is carried out with a shift operation. This process is repeated for each input sample, and the scaled correction is added to the corresponding previous weights.
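The replacement of the step-size multiplication by a shift can be sketched with integer arithmetic. This is a minimal illustration assuming a step size of the form s = 2^(-shift) and integer-coded weights, inputs and error; the function name and the sample values are hypothetical, and the rounding behavior of the actual hardware is not specified in the text.

```python
def update_weights_shift(weights, x_vec, error, shift):
    """One integer LMS weight update with step size s = 2**(-shift)."""
    # The product error * x is scaled by an arithmetic right shift instead of
    # a multiplication by s. In Python, >> rounds toward negative infinity.
    return [w + ((error * x) >> shift) for w, x in zip(weights, x_vec)]


weights = [100, -40]      # hypothetical integer-coded weights
x_vec = [64, -32]         # hypothetical integer-coded input samples
error = 24                # hypothetical integer-coded error
new_weights = update_weights_shift(weights, x_vec, error, shift=4)
# new_weights == [196, -88]
```

With `shift=4` the step size is 1/16, so the corrections are 24 * 64 / 16 = 96 and 24 * (-32) / 16 = -48.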

b. Delay Adaptation

The delayed error generated by the error-computation block at each cycle is scaled by $s$ and passed to the weight update block. This process introduces a delay of one cycle, after which the total delay introduced by the FIR filter is computed.

c. Error Computation

The proposed $N$-tap LMS filter architecture contains an error-computation block, which consists of a product generator, multipliers, adders and subtractors. The product generator block receives the original input signal and multiplies it by the defined step size to produce the product output. The adder-subtractor blocks use the outputs produced by the previous taps of the filter architecture. Similarly, the multiplier blocks use the delayed response and the updated weights for the $N$-tap filter stage.

d. Fixed-Point Implementation

The proposed architecture uses a fixed-point implementation, and the architecture of the filtering technique is optimized accordingly. For the fixed-point implementation we choose the word length and the radix point for the weights and internal signals.
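The effect of choosing a word length and a radix point can be illustrated with a small quantization helper. The paper does not state the formats it uses, so the 16-bit word with 12 fractional bits below is purely an assumption, as are the function names and the saturating overflow behavior.

```python
def to_fixed(value, word_length, frac_bits):
    """Quantize a real value to a signed two's-complement integer code."""
    lo = -(1 << (word_length - 1))
    hi = (1 << (word_length - 1)) - 1
    code = round(value * (1 << frac_bits))
    return max(lo, min(hi, code))     # saturate instead of wrapping around


def from_fixed(code, frac_bits):
    """Convert an integer code back to the real value it represents."""
    return code / (1 << frac_bits)


code = to_fixed(0.3, word_length=16, frac_bits=12)   # 1229
approx = from_fixed(code, frac_bits=12)              # 0.300048828125
saturated = to_fixed(10.0, word_length=16, frac_bits=12)
```

Moving the radix point trades range for resolution: more fractional bits reduce the quantization step 2^(-frac_bits) but lower the largest value that fits in the word.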

4. Results and Discussion

In this section we discuss the results of the proposed approach for BCG signal filtering. The scheme is implemented in IEEE-standard VHDL, and Xilinx ISE 14.3 is used for synthesis and simulation. The simulation study targets the Virtex-5 family, using the device XC5VSX95T with package FF1136 and a speed grade of -2.

The implementation of the filtering approach comprises three stages: 2-tap, 3-tap and 4-tap filter architectures. Initially, the data are read and AWGN noise is added using MATLAB. The resulting raw and noisy signals are shown in Figures 2 and 3.
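The noise-addition step can be sketched as follows. The original work performs this step in MATLAB, and its script and noise level are not given, so the Python function below, the 10 dB signal-to-noise ratio and the 1.2 Hz sine used as a stand-in waveform are all assumptions for illustration. Only the sampling frequency of 1000 follows Table 1, read here as Hz.

```python
import math
import random


def add_awgn(signal, snr_db, seed=None):
    """Add white Gaussian noise so the result has roughly the requested SNR."""
    rng = random.Random(seed)
    signal_power = sum(v * v for v in signal) / len(signal)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    sigma = math.sqrt(noise_power)
    return [v + rng.gauss(0.0, sigma) for v in signal]


fs = 1000                                     # sampling frequency (Table 1)
clean = [math.sin(2.0 * math.pi * 1.2 * n / fs) for n in range(5 * fs)]
noisy = add_awgn(clean, snr_db=10.0, seed=1)

# Check the SNR that was actually realized for this noise draw.
noise = [b - a for a, b in zip(clean, noisy)]
measured_snr_db = 10.0 * math.log10(
    sum(v * v for v in clean) / sum(v * v for v in noise)
)
```

The measured value differs slightly from the requested one because the noise is a finite random draw.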

In this work BCG signals are considered for heart contraction. Each signal contains various peaks and valleys corresponding to the specific event. These datasets were collected with approval from the Institutional Review Board (IRB) at the Georgia Institute of Technology, and with written informed consent obtained [26].

![Figure 2. Raw Data Signal](image)

![Figure 3. Noisy Data](image)

In order to generate the data we have considered various simulation parameters which are given in Table 1.

4.1. FPGA Implementation of 2-Tap Filtering

The noisy and raw signals are processed through Xilinx System Generator to perform filtering with the proposed scheme. This operation is first performed for 2-tap filtering, and the architecture is simulated using the Xilinx ISE simulator. The performance of the proposed scheme for the 2-tap architecture is evaluated in terms of maximum operating frequency, delay, slice logic utilization and total power consumption. The estimated results for the 2-tap filtering scheme are given in Table 2, Table 3 and Table 4.

Table 1. Simulation Parameters

<table>
<thead>
<tr>
<th>Parameter Considered</th>
<th>Values Considered</th>
</tr>
</thead>
<tbody>
<tr>
<td>No. of Beats</td>
<td>5</td>
</tr>
<tr>
<td>Peak Detection Threshold</td>
<td>0.6</td>
</tr>
<tr>
<td>Left Side Window BCG segmentation</td>
<td>0</td>
</tr>
<tr>
<td>Right Side Window BCG segmentation</td>
<td>700</td>
</tr>
<tr>
<td>Sampling Frequency</td>
<td>1000</td>
</tr>
<tr>
<td>Stop Band Freq. 1</td>
<td>0.2</td>
</tr>
<tr>
<td>Pass band freq. 1</td>
<td>1</td>
</tr>
</tbody>
</table>

4.2. FPGA Synthesis Results for 2-Tap Filter

Table 2. Timing Summary of 2-Tap Filter using FPGA

<table>
<thead>
<tr>
<th>Performance Parameter</th>
<th>Achieved Results</th>
</tr>
</thead>
<tbody>
<tr>
<td>Maximum Frequency</td>
<td>528.513 MHz</td>
</tr>
<tr>
<td>Input Arrival time before clock</td>
<td>3.702 ns</td>
</tr>
<tr>
<td>Maximum output required time after clock</td>
<td>0.682 ns</td>
</tr>
<tr>
<td>Maximum Combinational path delay</td>
<td>7.286 ns</td>
</tr>
</tbody>
</table>

4.3. FPGA Implementation of 3-Tap Filtering

The next stage is to implement the proposed filtering architecture for the 3-tap filter scheme. The synthesis results achieved for the 3-tap filter are shown in Tables 5, 6 and 7.

Table 5. Timing Summary of 3-Tap Filter using FPGA

<table>
<thead>
<tr>
<th>Performance Parameter</th>
<th>Achieved Results</th>
</tr>
</thead>
<tbody>
<tr>
<td>Maximum Frequency</td>
<td>528.513 MHz</td>
</tr>
<tr>
<td>Input Arrival time before clock</td>
<td>3.702 ns</td>
</tr>
<tr>
<td>Maximum output required time after clock</td>
<td>0.682 ns</td>
</tr>
<tr>
<td>Maximum Combinational path delay</td>
<td>8.366 ns</td>
</tr>
</tbody>
</table>

4.4. FPGA Implementation of 4-Tap Filtering

The next stage is to implement the proposed filtering architecture for the 4-tap filter scheme. The synthesis results achieved for the 4-tap filter are shown in Table 8.

4.5. FPGA Synthesis Results for 4-Tap Filter

The synthesis results for the 4-tap filter are given in Table 8.

<table>
<thead>
<tr>
<th>Performance Parameter</th>
<th>Achieved Results</th>
</tr>
</thead>
<tbody>
<tr>
<td>Maximum Frequency</td>
<td>528.513 MHz</td>
</tr>
<tr>
<td>Input Arrival time before clock</td>
<td>3.702 ns</td>
</tr>
<tr>
<td>Maximum output required time after clock</td>
<td>0.682 ns</td>
</tr>
<tr>
<td>Maximum Combinational path delay</td>
<td>9.446 ns</td>
</tr>
</tbody>
</table>

Table 8. Timing Summary of 4-Tap Filter using FPGA
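The figures in the three timing tables can be cross-checked with simple arithmetic. The sketch below only derives quantities from the reported values (the clock period implied by the maximum frequency, and the change in combinational path delay per added tap); it is not a new measurement, and the variable names are illustrative.

```python
f_max_mhz = 528.513                              # reported for 2, 3 and 4 taps
path_delay_ns = {2: 7.286, 3: 8.366, 4: 9.446}   # maximum combinational path delay

# Clock period implied by the reported maximum frequency, in nanoseconds.
clock_period_ns = 1000.0 / f_max_mhz

# Increase in combinational path delay for each additional tap.
taps = sorted(path_delay_ns)
increments_ns = [
    round(path_delay_ns[b] - path_delay_ns[a], 3)
    for a, b in zip(taps, taps[1:])
]
```

The implied clock period is about 1.89 ns, and the reported combinational path delay grows by 1.08 ns for each added tap while the reported maximum frequency stays the same.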

The proposed design concentrates mainly on the weight update process for the given noisy data; the resulting waveforms are shown in Figures 4 and 5.

Figure 4. Simulation Waveform for Proposed Filter Architecture

Figure 5. Weight Update Process for the Given Noisy Input Data

Table 9. Device Utilization Summary

<table>
<thead>
<tr>
<th>Slice Logic Utilization</th>
<th>Logic Utilized</th>
</tr>
</thead>
<tbody>
<tr>
<td>Slice Registers</td>
<td>113</td>
</tr>
<tr>
<td>Flip-Flops</td>
<td>113</td>
</tr>
<tr>
<td>Slice LUTs</td>
<td>267</td>
</tr>
<tr>
<td>Number used as Logic</td>
<td>259</td>
</tr>
</tbody>
</table>

Table 10. Power Consumption Results for 4-tap Filter

<table>
<thead>
<tr>
<th></th>
<th>Total</th>
<th>Dynamic</th>
<th>Static Power</th>
</tr>
</thead>
<tbody>
<tr>
<td>Supply Power (mW)</td>
<td>1482.44</td>
<td>5.80</td>
<td>1476.64</td>
</tr>
</tbody>
</table>
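The power figures in Table 10 can be checked for consistency and expressed as shares of the total. This is plain arithmetic on the reported values; the variable names are illustrative.

```python
total_mw = 1482.44
dynamic_mw = 5.80
static_mw = 1476.64

# The reported total should equal dynamic plus static power.
balance_error_mw = abs(total_mw - (dynamic_mw + static_mw))

# Fraction of the supply power attributed to each component.
dynamic_share = dynamic_mw / total_mw
static_share = static_mw / total_mw
```

The components add up to the reported total, and dynamic power accounts for roughly 0.4% of it. Most of the reported supply power is therefore static power rather than switching activity in the filter logic.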

In the next stage, the proposed model is simulated using Xilinx System Generator, which produces the filtered data and the weights updated during filtering. These results are shown in Figure 6:

Figure 6. Input, Filtered and Unfiltered Data

5. Conclusion

In this work, an efficient architecture is proposed to achieve low power and a high operating frequency for the LMS filter. A new weight update approach is used to remove the noise, and the architecture is pipelined to reduce the adaptation delay. A fixed-point implementation is proposed for the LMS filtering scheme. The architecture is implemented and simulated using the Xilinx ISE simulator to measure its performance in terms of operating frequency, slice logic utilization and power consumption. The implementation covers three stages, i.e., 2-tap, 3-tap and 4-tap filters, and the scheme is used to filter the BCG signal. The results show efficient performance in terms of frequency: an operating frequency of 528.513 MHz is achieved for the 2-tap, 3-tap and 4-tap filtering schemes.

REFERENCES


Bibliography Of Authors

Mrs. Manjula B.M graduated from Mangalore University in 2000. She completed her M.Tech in 2007 at Vishweshwaraiyah Technological University and is currently working as an Asst. Professor in the ECE Dept, Nitte Meenakshi Institute of Technology, Yelahanka, Bangalore.

Dr. Chirag Sharma graduated from Maharaja Sayajirao University, Vadodara, in June 2001. He completed his M.S. and Doctorate at Utah State University, USA, and is currently working as a Professor in the ECE Dept, Nitte Meenakshi Institute of Technology, Yelahanka, Bangalore.