Abstract

We classify driver distraction level (neutral, low, medium, and high) from wireless electroencephalogram (EEG) signals using different wavelets and classifiers. Data were collected from 50 subjects using 14 electrodes. We considered four distraction stimuli: Global Positioning System (GPS), music player, short message service (SMS), and mental tasks. The amplitude spectrum of three EEG frequency bands (theta, alpha, and beta) was derived by fusing the discrete wavelet packet transform (DWPT) with the FFT. Three classifiers (subtractive fuzzy clustering, probabilistic neural network, and k-nearest neighbor) were compared on spectral centroid and power spectral density features extracted with different wavelets (db4, db8, sym8, and coif5). The results indicate that the best average accuracy, 79.21%, is achieved by the subtractive fuzzy inference system classifier using the power spectral density feature extracted with the sym8 wavelet, which gave good class discrimination under the ANOVA test.

1. Introduction

In many countries, distraction is responsible for many car accidents. The National Highway Traffic Safety Administration (NHTSA) estimates that 100,000 police-reported crashes each year are caused by driver fatigue [1]. It is therefore important to develop automatic detectors of this state. Most automatic detection methods analyze driver behavior to detect abnormal actions [2] or use image processing techniques to monitor and evaluate the driver's head position and eye movement or blinking [3, 4]. Drowsiness can also be identified through electroencephalographic (EEG) signals, which contain alertness information [5]. The EEG plays a vital role in measuring the electrical activity of the brain [6]. Different signal processing techniques, such as the wavelet transform (WT) [5], means comparison test [7], and independent component analysis [8], combined with different classifiers such as neural networks (NNs) [8–10] and fuzzy logic [11], have been applied to detect drowsiness in EEG signals. Driving is a complex task in which different skills and functions are combined simultaneously; therefore, monitoring drivers' attention in terms of brain resources is a major challenge for researchers and analysts in the fields of cognitive brain research and brain-computer interfaces. The level of interference from task-irrelevant stimulus information (conflict), reflected in slowed responses and decreased accuracy for incompatible relative to compatible stimuli, is found to be reduced after the processing of an incompatible as compared with a compatible stimulus [12]. Recently, soft computing has been seen as an attractive alternative, and several methods have been developed for trajectory design and robot motion control using neurofuzzy techniques [13].
Data-driven approaches have been widely applied to solve real-life industrial problems, including control engineering, instrumentation and measurement, computer security, intelligent transportation systems, and vehicles [14]. Causes of distraction during driving are widespread and include eating, drinking, talking with passengers, using cell phones, reading, fatigue, problem solving, and using in-car equipment such as GPS, media players, and in-vehicle entertainment, thus making driver inattention a likely problem [15–18]. Many researchers have proposed methods to detect attention change using physiological changes such as eye blinking, heart rate, pulse rate, skin electric potential, and especially brain waves [19]. EEG-based methods mainly focus on monitoring the alertness variation of driver fatigue due to drowsiness, whereas approaches for detecting alertness change in tasks requiring sustained attention have seldom been explored [20]. This work has two objectives: (1) to select the optimal wavelet function for obtaining the best classification accuracy from the alpha, theta, and beta band features and (2) to determine the classifier that gives the best average and individual classification rates. In our work, we used audiovisual stimuli to evoke four different levels of distraction: neutral, low, medium, and high. Two features, spectral centroid and power spectral density (PSD), are derived using the wavelet transform on the theta, alpha, and beta bands. These numerical features are classified using three different classifiers, namely, k-nearest neighbor (KNN), probabilistic neural network (PNN), and fuzzy inference system. In our recent work, we used a PNN classifier to classify driver drowsiness level (sleepy state) and achieved 61% accuracy based on db4; we expect that this accuracy would be higher if a fuzzy classifier had been used [21].
This paper contributes a solution for driver distraction level classification that relates the EEG bands to their positions in the wavelet packet tree, derived mathematically through the corresponding equations. In this work, four distraction stimuli, namely, media player, GPS, mental task, and SMS messaging, are presented using audiovisual stimuli. The rest of this paper is organized as follows. In Section 2, we summarize the research methodology by describing the data acquisition process. Sections 3 and 4 explain feature extraction using the wavelet transform and classification of distraction level by different classifiers, respectively. Section 5 presents the results and discussion of the present work, and conclusions are given in Section 6.

2. Data Acquisition

The mobile phone is considered the main cause of driver distraction compared with other causes such as GPS, music and video players, and mental thinking. Therefore, we applied these four distractions to develop a suitable database for this work using EEG signals. Figure 1 shows a simulated environment of real driving, based on driving simulation software, in one of our university laboratories. An infrared camera was used to capture images of the driver's face for data validation after the experiment.

Before starting to drive, the subject was asked to keep the eyes closed for 2 minutes and then open for another 2 minutes. After this neutral initialization, the driver was asked to drive for 30 minutes while performing different distraction tasks, each of 2 minutes duration: using the media player, using the GPS, mental thinking by answering a few mental questions over the mobile phone, and finally typing and sending SMS messages. Through this protocol, and according to the continuous performance test (CPT), we can determine whether the subject is at a low, medium, or high level of distraction from his or her response time in looking back at the screen and controlling the steering wheel continuously. For the first 30 subjects, we first visually identified distractions of 1 second duration (such as typing in the GPS or SMS messages) and considered them low level. For the medium level, continuous distractions of 2 seconds were extracted, whereas continuous distractions of 3 seconds were taken as high level.

In this work, 50 subjects (43 males and 7 females) aged 24 to 34 years participated. The Emotive EEG system is used to acquire the EEG signals over the complete scalp through 14 electrodes (FP1, FP2, F7, F8, F3, F4, T7, T8, P7, P8, O1, O2, A1, and A2). All the electrodes are placed on the subject's scalp according to the international 10–20 system of electrode placement. EEG signals are acquired at a sampling frequency of 128 Hz and band-pass filtered between 0.05 Hz and 60 Hz. The reference and ground electrodes are placed on the right and left ear lobes. The impedance of the electrodes is kept below 5 kΩ.

3. Feature Extraction

Brain electrical signals are time-varying, nonstationary signals that have different frequency components at different times. Indeed, EEG signals cannot be considered stationary even over short durations, since they can exhibit considerable short-term nonstationarity [22]. Therefore, the DWT is a more suitable method than the FFT or STFT for decomposing the EEG signal into its frequency bands, because it retains the signal information in both the time and frequency domains [22, 23]. In this work, the spectrum features of the EEG signals for different distraction levels are derived from three frequency bands, namely, theta, alpha, and beta, by applying four different wavelets (db4, db8, sym8, and coif5). These wavelet functions were chosen for their near-optimal time-frequency localization properties. Moreover, their waveforms are similar to the waveforms to be detected in the EEG signal, they are orthogonal, and they have a suitable number of filter coefficients for reducing the computational complexity. Therefore, extraction of EEG signal features is more likely to be successful [23]. Because EEG signals are nonstationary, we analyze them on basis functions created by dilating and shifting the mother wavelet function. In general, the mother wavelet function should be similar in shape to the signal being processed. The extracted wavelet coefficients provide a compact representation that shows the energy distribution of the EEG signal in time and frequency [24].

Researchers use the discrete wavelet packet transform (DWPT) for efficient frequency band localization. The DWPT decomposes both the high- and low-frequency components of the input signal at every level of decomposition, as shown in Figure 2, unlike the ordinary wavelet transform, which decomposes only the approximation coefficients at subsequent levels. In this work, DWPT is used to obtain three frequency bands, namely, theta (4–8 Hz), alpha (8–12 Hz), and beta (14–32 Hz), for distraction detection. PSD estimates of noise signals from a finite number of samples are based on three fundamentally different approaches, namely, parametric, nonparametric, and subspace methods. Although computing the PSD with the combined DWPT and FFT approach has higher computational complexity, it gives good classification accuracy in distinguishing the distraction levels. As a first step in this research, we computed the PSD feature through DWPT and FFT. In future work, we aim to analyze the significance of PSD computed through DWPT alone for distraction level classification.
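The difference between a wavelet packet tree and an ordinary wavelet decomposition can be illustrated with a minimal sketch. It uses the Haar wavelet rather than db4, db8, sym8, or coif5, purely because its two filter coefficients are easy to write out; the function names `haar_split` and `wavelet_packet` and the padding rule for odd-length inputs are assumptions made for this illustration, not the implementation used in this work.

```python
import math

def haar_split(x):
    """One analysis step: Haar low-pass and high-pass filtering, then downsampling by 2."""
    x = list(x)
    if len(x) % 2:                      # assumed padding rule: repeat the last sample
        x.append(x[-1])
    s = 1.0 / math.sqrt(2.0)
    approx = [s * (x[i] + x[i + 1]) for i in range(0, len(x), 2)]
    detail = [s * (x[i] - x[i + 1]) for i in range(0, len(x), 2)]
    return approx, detail

def wavelet_packet(x, level):
    """Split every node (approximation and detail) at each level.

    Nodes are returned in natural (filter-bank) order, not frequency order.
    """
    nodes = [list(x)]
    for _ in range(level):
        next_nodes = []
        for node in nodes:
            approx, detail = haar_split(node)
            next_nodes.extend([approx, detail])
        nodes = next_nodes
    return nodes

# One second of a synthetic 128-sample signal, decomposed to level 3.
signal = [float(i % 8) for i in range(128)]
nodes = wavelet_packet(signal, 3)
print(len(nodes), len(nodes[0]))
```

An ordinary wavelet transform would call `haar_split` only on the approximation output at each level, so level 3 would yield four coefficient sets instead of eight equal-width packets.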

The approximation coefficients are derived by taking the samples of the input signal and extending the sequence by a constant that is equal to 0 in the even case or 1 in the odd case [25]. This extension is needed to match the number of input samples to the number of wavelet filter coefficients, and it must be applied to the input of every decomposition level. The extended signal is given in (1). Wavelet decomposition is then applied to this signal: convolving the input samples with the low-pass filter coefficients, as shown in Figure 3, produces the approximation coefficients in (2), and convolving them with the high-pass filter coefficients produces the first-level detail coefficients (CD0, CD1, and so on) in (3). The generalized equation for deriving the approximation and detail coefficients of the wavelet decomposition is given in (4). The basic relation between the input samples and the filter coefficients (low pass and high pass) for generating the approximation and detail coefficients at any level is stated in (5). The general wavelet packet transform equations for deriving the theta band (level 4, part 1), alpha band (level 4, part 2), beta 1 band (level 5, part 7), and beta 2 band (level 2, part 1), as shown in Figure 2, are given in (6) to (9), respectively, based on db4.
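The packet positions quoted above can be checked against the band limits with a short calculation. The sketch below assumes that "part" is a zero-based index counted in order of increasing frequency and that the sampling frequency is the 128 Hz stated in Section 2; the function name `packet_band` is hypothetical.

```python
def packet_band(level, part, fs=128.0):
    """Return (low_hz, high_hz) for a frequency-ordered wavelet packet node.

    Assumes `part` is zero-based and ordered by increasing frequency.
    """
    width = fs / 2 ** (level + 1)   # the 0..fs/2 range split into 2**level equal bands
    return part * width, (part + 1) * width

# (level, part) positions as quoted in the text.
nodes = {"theta": (4, 1), "alpha": (4, 2), "beta 1": (5, 7), "beta 2": (2, 1)}
bands = {name: packet_band(level, part) for name, (level, part) in nodes.items()}

for name, (low, high) in bands.items():
    print(f"{name}: {low:g}-{high:g} Hz")
```

Under these assumptions the four nodes map to 4–8 Hz, 8–12 Hz, 14–16 Hz, and 16–32 Hz, which agrees with the theta, alpha, and beta (14–32 Hz) ranges given in this section.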

3.1. Amplitude Spectrum

The amplitude spectrum is defined as the magnitude of the Fourier transform of a time-domain signal. Every signal can be written as a sum of sinusoids with different amplitudes and frequencies. Related terms such as spectral density, voltage spectrum, power spectrum, and spectral intensity describe how the power of a signal or time series is distributed over the different frequencies. The frequency spectrum of a time-domain signal is a representation of that signal in the frequency domain. It can be generated via a Fourier transform of the signal, and the resulting values are usually presented as amplitude and phase, both plotted versus frequency. A signal can be broken into short segments (sometimes called frames), and spectrum analysis may be applied to these individual segments. In this work, the average amplitude of the FFT output of the wavelet-transformed EEG bands is used to derive two different features, namely, spectral centroid and PSD.

3.1.1. Power Spectral Density (PSD)

Spectral analysis describes the distribution of power over frequency. It finds applications in many fields such as speech analysis, vibration monitoring, economics, and sonar systems. In medicine, spectral analysis of various signals measured from a patient, such as electrocardiogram (ECG) or electroencephalogram (EEG) signals, can provide useful material for diagnosis. A random signal usually has finite average power and can therefore be characterized by an average power spectral density, computed here from the output of the FFT over the positions of the FFT components.
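As a rough illustration of an average power computed from FFT output, the sketch below evaluates a direct DFT and averages the squared magnitudes. The normalization by N squared is one common convention and is an assumption here, as are the function names; the exact expression used for the PSD feature in this work may differ.

```python
import cmath
import math

def dft(x):
    """Direct discrete Fourier transform (O(N^2)); adequate for short frames."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def amplitude_spectrum(x):
    """Magnitude of each DFT component."""
    return [abs(c) for c in dft(x)]

def average_power(x):
    """Mean power from the spectrum; by Parseval this equals the mean of x[t]**2."""
    n = len(x)
    return sum(a ** 2 for a in amplitude_spectrum(x)) / n ** 2

# One cycle of a unit-amplitude sine sampled at 16 points.
x = [math.sin(2 * math.pi * t / 16) for t in range(16)]
power = average_power(x)
print(round(power, 6))
```

For a unit-amplitude sinusoid the result is 0.5, the familiar mean-square value, which is a quick sanity check on the normalization.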

3.1.2. Spectral Centroid Frequency

Spectral centroid frequency is commonly known as the subband spectral centroid [7, 10]. The spectral centroid is used to find the center value of the groups for each frequency band. Spectral centroid feature extraction has been widely used in audio recognition because of its robustness in recognizing the dominant frequency, and it has also been used to extract EEG features for stress identification [12, 13]. In this work, we use this feature for EEG classification. The spectral centroid is calculated as the weighted mean of the frequencies in the band, with the spectral magnitudes as weights.
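The spectral centroid in its standard textbook form is the magnitude-weighted mean frequency. The sketch below uses that standard definition with made-up alpha-band values; the function name and the example numbers are assumptions for illustration and are not taken from the recorded data.

```python
def spectral_centroid(freqs, amplitudes):
    """Magnitude-weighted mean frequency: sum(f * |X(f)|) / sum(|X(f)|)."""
    total = sum(amplitudes)
    if total == 0:
        return 0.0                      # silent frame: no meaningful centroid
    return sum(f * a for f, a in zip(freqs, amplitudes)) / total

# Hypothetical alpha-band amplitude spectrum, symmetric around 10 Hz.
freqs = [8.0, 9.0, 10.0, 11.0, 12.0]
amps = [1.0, 2.0, 4.0, 2.0, 1.0]
centroid = spectral_centroid(freqs, amps)
print(centroid)
```

Because the example spectrum is symmetric about 10 Hz, the centroid falls at 10 Hz; shifting amplitude toward higher bins would move it upward.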

3.1.3. Features Extraction Algorithm

(1) Load the input EEG signal from 14 channels.
(2) Apply a 4th-order Butterworth IIR band-pass filter followed by a notch filter to remove the effects of noise and artifacts.
(3) Frame the preprocessed signal into segments of 1 second duration.
(4) Decompose the EEG signal into five levels using the chosen wavelet function (db4, db8, sym8, or coif5) to extract the wavelet coefficients for the theta, alpha, beta 1, and beta 2 frequency bands through DWPT.
(5) Perform the FFT for each frequency band to get the frequency spectrum, $X(k) = \sum_{n=0}^{N-1} x(n)\, e^{-j 2\pi k n / N}$, where $k$ is the position of the sample after the FFT, $x(n)$ is the input wavelet coefficients corresponding to any of the four frequency bands, $n$ is the position of the input sample, and $N$ is the maximum length of the input wavelet coefficients.
(6) Determine the absolute value of the FFT to get the PSD and spectral centroid of the spectrum of each band.
(7) Add the PSD and spectral centroid of each band for the current channel to the running totals of these values for each band over the 14 channels.
(8) Take the average of the PSD and spectral centroid of each band by dividing by 14.
(9) Repeat steps 4 to 8 for the next 1 second of EEG, and continue the analysis for all the active EEG channels.
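The framing and channel-averaging parts of the steps above (steps 3, 7, and 8) can be sketched as follows. The filtering, DWPT, and FFT stages are deliberately left out, and a simple mean-square value stands in for the per-band features; the function names and the synthetic channel data are assumptions for illustration only.

```python
def frames(signal, fs=128, seconds=1.0):
    """Cut a signal into consecutive non-overlapping frames; drop any incomplete tail."""
    size = int(fs * seconds)
    return [signal[i:i + size] for i in range(0, len(signal) - size + 1, size)]

def mean_feature_per_frame(channels, feature, fs=128):
    """Compute `feature` on each 1 s frame of each channel, then average over channels."""
    per_channel = [[feature(f) for f in frames(ch, fs)] for ch in channels]
    n_frames = min(len(values) for values in per_channel)
    return [sum(values[i] for values in per_channel) / len(per_channel)
            for i in range(n_frames)]

def mean_square(frame):
    """Stand-in feature (not the PSD or centroid of the paper)."""
    return sum(s * s for s in frame) / len(frame)

# 14 synthetic channels, 2 seconds each at 128 Hz; channel c holds the constant c + 1.
channels = [[float(c + 1)] * 256 for c in range(14)]
features = mean_feature_per_frame(channels, mean_square)
print(features)
```

In a fuller pipeline, `feature` would be replaced by a function that runs the wavelet packet decomposition and FFT on the frame and returns the PSD or centroid of one band.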

4. The Classifiers

A standard classification problem generally follows a two-step procedure consisting of training and testing phases. During the training phase, a classifier is trained to achieve the optimal separation of the training data set. Then, in the testing phase, the trained classifier is used to discriminate new samples with unknown class information. As the predictive power of the features may vary, an exhaustive method was used to select the best combination of features; that is, all possible combinations of features were tried and those with the best performance were selected. In this paper, three different classifiers are compared in order to choose the most suitable one for distraction level classification.

4.1. PNN Classifier

In this work, the PNN architecture is constructed using the newpnn function in MATLAB 7.0. The PNN is a supervised learning network whose learning process differs in many respects from that of other networks. The training data set was used to train the designed PNN, which was then tested with the testing data set to assess the classification rate. The spread value of the radial basis function (RBF) was used as a smoothing factor, and classifier accuracy was examined with different spread values. The first step in training the PNN is to select the optimal spread value, which controls the spread of the RBF functions. If the spread value is too large, the model will not be able to fit the function closely; if it is too small, the model will overfit the data because each training point will have too much influence. In this work, a smoothing factor of 0.1 was used to classify the distraction level.
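The role of the spread value can be seen in a minimal probabilistic neural network written from the textbook description: each class score is the average of Gaussian kernels centered on that class's training vectors. This is a sketch only and not a reimplementation of MATLAB's newpnn, whose kernel scaling differs; the function name, kernel form, and toy data are assumptions.

```python
import math

def pnn_classify(train_x, train_y, x, spread=0.1):
    """Return the class whose training points give the largest mean Gaussian kernel at x."""
    scores, counts = {}, {}
    for xi, yi in zip(train_x, train_y):
        d2 = sum((a - b) ** 2 for a, b in zip(xi, x))
        scores[yi] = scores.get(yi, 0.0) + math.exp(-d2 / (2.0 * spread ** 2))
        counts[yi] = counts.get(yi, 0) + 1
    return max(scores, key=lambda label: scores[label] / counts[label])

# Toy feature vectors scaled to [0, 1]; labels 1 (neutral) and 4 (high) are illustrative.
train_x = [(0.1, 0.1), (0.2, 0.1), (0.8, 0.9), (0.9, 0.8)]
train_y = [1, 1, 4, 4]
pred_low = pnn_classify(train_x, train_y, (0.15, 0.12))
pred_high = pnn_classify(train_x, train_y, (0.85, 0.85))
print(pred_low, pred_high)
```

A small spread makes each kernel narrow, so only the closest training points matter; a large spread blurs the kernels until every class scores about the same.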

4.2. k-Nearest Neighbor Classifier

A new test feature vector is classified according to the classes of its k nearest neighbors. The classifier memorizes all vectors in the training set and then compares the test vector with them; it is therefore called memory-based learning. The KNN algorithm uses the Euclidean distance metric to locate the nearest neighbors. The Euclidean distance between two points $x$ and $y$ is given in (13), where $n$ is the number of coordinates:
$$d(x, y) = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2} \quad (13)$$
In this work, the value of k is varied from 2 to 9. The optimal value of k is selected based on the highest classification rate.
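A minimal k-nearest neighbor classifier with the Euclidean distance can be sketched as below. The function names and the toy data are assumptions; ties between classes (possible for even k) are resolved here in favor of the class encountered first among the nearest neighbors, which is a choice of this sketch rather than something stated in the text.

```python
import math
from collections import Counter

def euclidean(x, y):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def knn_classify(train_x, train_y, x, k=3):
    """Majority vote among the k training vectors closest to x."""
    nearest = sorted(zip(train_x, train_y), key=lambda pair: euclidean(pair[0], x))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Toy two-class data.
train_x = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
train_y = [1, 1, 1, 2, 2, 2]
pred_a = knn_classify(train_x, train_y, (0.5, 0.5), k=3)
pred_b = knn_classify(train_x, train_y, (5.5, 5.5), k=3)
print(euclidean((0, 0), (3, 4)), pred_a, pred_b)
```

Selecting k then amounts to running this classifier on held-out data for each candidate value and keeping the one with the highest classification rate.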

4.3. Fuzzy Subtractive Clustering

Fuzzy subtractive (FS) clustering is a fast, one-pass algorithm for estimating the number of clusters and the cluster centers in a set of data. The technique depends on a measure of the density of data points in the feature space; the aim is to find areas of the feature space with high densities of data points. The point with the highest number of neighbors is taken as the center of a cluster. The algorithm then removes the data points within a prespecified fuzzy radius and repeats the process until all the data points have been checked. The radii variable is a vector of entries between 0 and 1 that specifies a cluster center's range of influence. Small radii values generate many small clusters, whereas large values generate a few large clusters. Recommended values for the radii are between 0.2 and 0.5. In this work, a value of 0.5 was chosen for all the radii because it leads to fewer membership functions and less computation time without losing accuracy. Once the inputs for distraction classification are selected, the input membership functions must be determined. The Gaussian membership function shown in Figure 4 is selected because it is continuously differentiable. The function is given by $f(x; \sigma, c) = \exp\left(-\frac{(x - c)^2}{2\sigma^2}\right)$. It depends on two parameters, $c$ and $\sigma$, which represent the center and the width of the Gaussian function, respectively. The MATLAB fuzzy logic toolbox provides a function to generate the FS system. This function constructs a set of rules that model the data, using the subtractive clustering centers and clusters to assign the antecedent membership functions.
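The density-based search for cluster centers can be sketched in a simplified form of subtractive clustering. The sketch assumes the data are already scaled to the unit hypercube, uses a single rejection threshold as the stopping rule, and takes a squash factor of 1.25 and a reject ratio of 0.15 as illustrative defaults; none of these choices or names is taken from the toolbox function used in this work.

```python
import math

def subtractive_clustering(points, radius=0.5, squash=1.25, reject_ratio=0.15):
    """Simplified subtractive clustering on data scaled to [0, 1] per dimension."""
    alpha = 4.0 / radius ** 2                  # neighborhood width for the density measure
    beta = 4.0 / (squash * radius) ** 2        # wider neighborhood for potential removal

    def dist2(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))

    # Potential of each point: a smooth count of its neighbors.
    potential = [sum(math.exp(-alpha * dist2(p, q)) for q in points) for p in points]
    first_peak = max(potential)
    centers = []
    while True:
        best = max(range(len(points)), key=potential.__getitem__)
        peak = potential[best]
        if peak < reject_ratio * first_peak:   # simplified stopping rule (assumption)
            break
        center = points[best]
        centers.append(center)
        # Subtract the influence of the new center from every point's potential.
        potential = [p - peak * math.exp(-beta * dist2(x, center))
                     for p, x in zip(potential, points)]
    return centers

points = [(0.10, 0.10), (0.12, 0.10), (0.10, 0.12), (0.11, 0.11),
          (0.90, 0.90), (0.88, 0.90), (0.90, 0.88), (0.89, 0.89)]
centers = subtractive_clustering(points)
print(centers)
```

Lowering `radius` narrows both neighborhoods, so more points survive the subtraction step with a high potential and more centers are returned.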

4.4. Data Preparation for Classification

Generating the classifier system requires dividing the training data into two data sets. The first is an input data set in which each vector holds six values: two features (centroid frequency and power spectral density) over three bands (theta, alpha, and beta). Hence each of the 200 vectors contains 6 values, and the overall input data comprise 1200 values over 50 subjects for four levels (50 × 4 × 6). The second is an output data set with a single output that takes the value 1 for neutral, 2 for low level, 3 for medium level, or 4 for high level. These outputs were placed into a single output data set with 200 values, 50 for each class. Of the 200 vectors, 60% are used for training (120) and 40% for testing (80).
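The sizes quoted above can be reproduced with a small split routine. The sketch assumes a stratified split, in which each class contributes the same fraction to the training set, and a fixed random seed; the text does not state how the vectors were assigned, so both are assumptions, as is the function name.

```python
import random

def stratified_split(labels, train_fraction=0.6, seed=0):
    """Split indices so that each class contributes the same fraction to training."""
    rng = random.Random(seed)
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)
    train, test = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        cut = round(len(indices) * train_fraction)
        train.extend(indices[:cut])
        test.extend(indices[cut:])
    return sorted(train), sorted(test)

# 50 subjects x 4 levels -> 200 labeled vectors (1 neutral, 2 low, 3 medium, 4 high).
labels = [level for level in (1, 2, 3, 4) for _ in range(50)]
train_idx, test_idx = stratified_split(labels)
print(len(labels) * 6, len(train_idx), len(test_idx))
```

With 50 vectors per class, the split gives 30 training and 20 testing vectors for each level, or 120 and 80 in total.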

5. Results and Discussion

This research work investigates the effects of cognitive, visual, and auditory distraction using different stimuli. We exploited the ability of the DWPT to localize the frequency bands in EEG signals and fused it with the FFT for efficient feature extraction and distraction classification. The significance of the two features, spectral centroid and PSD, is checked with the Analysis of Variance (ANOVA) test for each wavelet (db4, db8, sym8, and coif5), as shown in Table 1.

All the results are presented as mean ± SD with p values. ANOVA p values generally less than 0.005 suggest that these feature measures can be used as classification features. We extracted the PSD and centroid frequency features from the amplitude spectrum and performed the ANOVA test on the four classes of distraction (neutral, low, medium, and high). These two features give excellent p values under the ANOVA test, as shown in Table 1. Features are computed from a 3-second window of the 14 EEG channels, and the ANOVA test is used to check whether the mean values differ between the classes. Table 1 shows the results of the amplitude spectrum parameters for different wavelets over the four levels of distraction. Based on db4, the mean centroid frequency magnitude after the neutral state decreases from low to medium to high distraction; therefore, both parameters are suitable for differentiation and classification. For db8, these two features cannot differentiate medium from high distraction. When sym8 is applied, the mean centroid frequency magnitude decreases from low to medium to high distraction, and the two features are very weak in the medium distraction state, so it is easy to distinguish this state from low and high distraction. Under coif5, the centroid decreases from the low to the medium to the high state, while the PSD shows almost no significant change. Finally, we concluded that sym8 is the most suitable wavelet for distraction classification: it gives the maximum classification rate of 79.10%, as shown in Tables 2 and 3, using the PSD feature with the fuzzy classifier, whose input vector distribution is shown in Figure 5 and whose structure is shown in Figure 6. Therefore, we considered this wavelet for subsequent analysis.
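The quantity behind the ANOVA test is the F statistic, the ratio of between-class to within-class variance of a feature. The sketch below computes it for a one-way design using made-up values; obtaining the p value additionally requires the F distribution, which is not included here. The function name and the example groups are assumptions.

```python
def one_way_anova_f(groups):
    """F statistic of a one-way ANOVA: between-group over within-group mean square."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Hypothetical feature values for three classes (not measured data).
groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_stat = one_way_anova_f(groups)
print(f_stat)
```

A large F value means the class means are far apart relative to the spread within each class, which is what makes a feature useful for classification.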

Sensitivity and specificity are commonly used performance measures of binary classification tests. Sensitivity is defined as the proportion of actual positives that are correctly identified as positive, and specificity is the proportion of actual negatives that are correctly identified as negative. Accuracy, sensitivity, specificity, true positive rate (TPR), and false negative rate (FNR) are calculated from the counts of true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN).
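The standard textbook forms of these measures can be written directly from the four counts. The sketch below uses those standard definitions, under which the true positive rate coincides with sensitivity and the false negative rate is its complement; the definitions behind the TPR and FNR values reported in this work may differ. The function name and the example counts are assumptions.

```python
def binary_metrics(tp, tn, fp, fn):
    """Standard binary classification measures from confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),          # identical to the true positive rate
        "specificity": tn / (tn + fp),
        "false_negative_rate": fn / (tp + fn),  # equals 1 - sensitivity
    }

# Hypothetical counts for illustration.
metrics = binary_metrics(tp=40, tn=30, fp=10, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

For a four-class problem such as the distraction levels, these counts are usually obtained one class at a time by treating that class as positive and the remaining three as negative.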

Table 3 summarizes the classification accuracy (% CR), sensitivity, specificity, TPR, and FNR of the KNN, PNN, and fuzzy classifiers for the two features (centroid and PSD) under sym8. The best classification performance of 79.21% was achieved by the fuzzy classifier using the PSD feature based on sym8, as shown in Table 3, with an average sensitivity of 82.09%, specificity of 70.36%, TPR of 73.88%, and FNR of 66.45%. The KNN and PNN classifiers produce maximum classification accuracies of 64.32% and 69.72%, respectively, both based on the same wavelet (sym8) and the same feature (PSD), as shown in Table 3. Therefore, sym8 can be considered the dominant wavelet type for achieving good classification accuracy for different levels of distraction based on the PSD feature.

Table 4 compares the maximum mean distraction classification rate of previous researchers' work with that of the present work. In this table, a maximum mean classification rate of 92% is reported for classifying two classes [28]. A maximum classification rate of 89.4% is reported for classifying two classes based on the Fisher linear discrimination method [26]. Junya et al. [27] obtained a maximum classification rate of 75.9% for classifying three classes based on a hybrid of the physical and performance methods mentioned in Section 1. The present recognition system, however, used 50 subjects and achieved average maximum mean rates of 98.7% and 79.21% for classifying two and four different levels of distraction, respectively.

6. Conclusion

Most research works have discussed the classification of driver distraction into two levels (distracted or nondistracted) based on EEG frequency bands. In addition, many researchers have not investigated different types of distraction stimuli. This paper presents the amplitude spectrum of three bands (theta, alpha, and beta) of the EEG signal, obtained with a hybrid scheme based on DWT and FFT. Fusing these two methods gives more significant results for the extraction of centroid and PSD features under ANOVA analysis. The proposed methodology has been tested on 50 subjects and provides a maximum accuracy of 79.21% using sym8 and the subtractive fuzzy inference system for the PSD feature, with an average sensitivity of 82.09% and specificity of 70.36%. In future work, we will focus on strengthening the present database with more subjects in order to develop a generalized driver distraction detection system using the proposed methodology.