# Seismological Research Letters

- © 2014 by the Seismological Society of America

## INTRODUCTION

Infrasonic data at volcanoes have been increasingly analyzed to get information about eruptive activity. Wind noise is an important problem, which cannot be solved using more classical seismological techniques such as deconvolution (Robinson, 1967), as the interaction between wind and original signal cannot be modeled simply as a convolution. The problem has therefore been tackled with a wide spectrum of original approaches, from the use of sensor arrays (Ripepe and Marchetti, 2002; Matoza *et al.*, 2011), to spatial filters consisting of a network of pipes (Hedlin *et al.*, 2003) or the location of the sensors in a densely forested area (Garcés *et al.*, 2003). For arrays of infrasonic sensors, a pure state data‐adaptive polarization filter has been proposed by Olson (1982), dependent upon a measure of the multivariate coherence. This cannot, however, be applied to single sensor time series, whereas for monitoring purposes the availability of a single sensor very close to the crater is very important, as it was demonstrated, for example, in the 2011 Shinmoe‐dake eruption. To enhance the recognition of infrasound signals of small amplitude produced during that eruption, Ichihara *et al.* (2012) proposed a method for exploiting the use of a co‐located seismometer through the cross‐correlation function with seismic data. The appearance of characteristic patterns indicates an infrasound signal possibly originated at the volcanic vent. An alternative strategy was adopted in Cannata *et al.* (2013) to detect explosive activity of Mt. Etna; in that case the method is based on joint analysis of seismic and infrasonic data by wavelet transform coherence.

Here, we apply the wind noise reduction procedure recently proposed by Cabras *et al.* (2014) for seismic data, based on non‐negative matrix factorization (NMF) with sparse coding (SC; Hoyer, 2002) and on the construction of a wind noise dictionary that is estimated from the available recording itself. The procedure is here extended to the filtering of wind noise in infrasonic microphones and its efficiency evaluated by the method proposed by Ichihara *et al.* (2012). The combined use of both methods strongly enhances the capability to distinguish infrasonic signals of possible volcanic origin and therefore monitoring volcanic activity of low energy.

## METHODS

Generally, we assume that recorded data **X** is a superposition of hidden components, with specific structures and physical interpretation, from a dictionary **D**, such that **X**=**DH**, in which **H** represents the synthesis coefficients matrix of **X** from **D**. Desirable data decomposition often leads to the sparsest (simplest or most economical) representation of the signal, that is, few synthesis coefficients have large magnitude, whereas most of them are close to zero; in this case we refer to **H** as sparse code. Methods that exploit non‐negativity and sparsity constraints usually lead to correct estimation of the hidden components with physical meaning, in contrast to other matrix factorization methods. In the following, we introduce a dictionary learning technique that adaptively learns **D** directly from a real world dataset **X**, as infrasonic records of possible volcanic origin. More details on these techniques can be found in Cabras (2014). An NMF combined with SC (Cichocki *et al.*, 2009) was used by Cabras *et al.* (2012) to separate seismic signals produced by the lava lake convection at Erta Ale, Ethiopia (Jones *et al.*, 2006), showing that a factor decomposition (Lee and Seung, 1999) can be carried out on single channel time series imposing only the constraints of non‐negativity and sparseness of the data. One of the requirements is the availability of a noise‐only section of data, which is not often the case with the wind noise at an erupting volcano. Cabras *et al.* (2014) then modified the original procedure into a two‐step learning process. In the first step, we learn a volcanic‐set dictionary selecting a nonwindy dataset for training; in the second training step, we learn the wind‐set basis components (or noise‐set dictionary) **D**_{n} selecting a windy dataset for training. The separation stage remains the same of (Cabras *et al.*, 2012), providing a fixed wind‐set basis components, **D**_{n} to the constrained sparse NMF learning loop.

NMF problem can be formalized as follows: given a non‐negative data matrix (with **X**≥0) and a reduced rank *K*: *K*≤min(*F*,*T*), find two non‐negative matrices called dictionary or basis components and called sparse code or weights, which factorize **X** as well as possible, that is (1)in which represents the approximation error to minimize. To estimate the factor matrices **D** and **H**, we need to minimize a chosen cost function between the data matrix **X** and the estimated model matrix , for example, an Euclidean distance.

Given the signal *x*(*t*)=*s*(*t*)+*n*(*t*), the interference *n*(*t*) and the source of interest, or target, *s*(*t*) should be estimated directly from the observable data *x*(*t*) with minimal *a priori* knowledge. First, we transform into a sparse non‐negative time–frequency representation suitable for NMF scheme: (2)

More specifically, we use an element wise *β*=1/3 exponentiated short‐time fourier transform (STFT) with 50% overlap (Schmidt *et al.*, 2007) (3)

Assuming the additivity of sources, we have (4)in which all matrices are non‐negative, as required by NMF constraints.

We use a constrained sparse NMF (NMF^{*}) to compute the dictionary of the target source **D**_{s} and the sparse code of both sources **H**_{s} and **H**_{n} assuming the dictionary **D**_{n} known *a priori* (Cabras *et al.*, 2012). Finally, we estimate the spectrogram of the source as (5)and the spectrogram of the wind noise as (6)

We first precompute a background target dictionary () when nonoverlapped to foreground noise, then learn foreground noise dictionary (**D**_{n}) from mixed recording. Finally, we recompute background target dictionary (**D**_{s}) from mixed recording with the foreground training and background learning enhancement model. Details of such procedure can be found in Cabras *et al.* (2014).

When an acoustic wave is propagating horizontally over a homogeneous elastic ground surface, it induces ground oscillation of which vertical velocity component is delayed by a quarter of a cycle relative to the acoustic wave itself (Ben‐Menahem and Singh, 1981). When the records of a microphone and a co‐located seismometer are dominated by such acoustic oscillation, the cross‐correlation function between them shows therefore a particular pattern having a node at zero time delay with positive and negative maximum correlation in the positive and negative time delay, respectively, of the vertical ground velocity to the pressure oscillation. Therefore, a graphical presentation of the cross‐correlation function is useful in recognizing infrasound signals hidden in wind noise (Ichihara *et al.*, 2012). If a separation between infrasound and wind noise is successful, we should therefore see a graphical enhancement of such patterns in the separated infrasound component and their partial or total disappearance in the separated wind noise component.

## A CASE STUDY: SHINMOE‐DAKE ERUPTION INFRASONIC DATA

Shinmoe‐dake volcano, part of the Mount Kirishima cluster of volcanoes in southwestern Japan, after precursory small‐scale phreatic eruptions in 2008 and 2010, last erupted in 2011 (Suzuki *et al.*, 2013). The event started with a phreatomagmatic eruption on 19 January, followed by three subplinian eruptions on 26 and 27 January, lava effusion from 28 to 31 January and a series of vulcanian eruptions between February and April. All the times indicated are GMT. An infrasonic microphone (Hakusan, SI102, 0.1–1000 Hz) was installed on 6 December 2010 at 5 m distance from an existing telemetered broadband seismic station (Trillium 120P), about 700 m from the active crater. The signals are recorded at 100 Hz. This eruptive period was one of the case studies presented by Ichihara *et al.* (2012) to illustrate the use of their cross‐correlation methodology, proving that the correlation pattern highlights qualitative changes in the eruptive activity. Although the infrasound signal that causes the patterns is not necessarily of volcanic origin, the location of the station that is close to the active crater and far from sources of artificial noise allows us to assume that the infrasound is of volcanic origin. In fact, the method was very useful for suggesting relatively short periods of potential volcanic activity to be examined in detail, for example, with visual observations or other sources of additional information. Moreover, disappearance and reappearance of the correlation pattern do not necessarily indicate cessation and restart of volcanic infrasound, but they do indicate a change in the relative strength of the incident seismic and infrasonic waves, which in turn indicates changing volcanic activity (Ripepe *et al.*, 2005; Johnson and Aster, 2005). The identification of such patterns and their visual appearance can be strongly affected by the presence of wind. A proper filtering of the wind noise should therefore highlight the infrasonic patterns in the cross‐correlation graphs, especially when the signal‐to‐noise ratio (SNR) is low, for example, when the infrasonic is caused by moderate degassing and small ash emissions. To evaluate the effectiveness of the wind noise filtering procedure and the possibility to improve the Ichihara *et al.* (2012) method, we processed the days of infrasonic data before the onset of the phreatomagmatic eruption on 19 January.

## RESULTS AND DISCUSSION

To estimate the preliminary target dictionary , we used data recorded on 2 January, from 17:24 to 18:36, a period where no wind was present. The recognition by the user of such a time period for which there is only source signal is essential for the methodology to work properly. The choice is helped by the validation of a statistical foreground activity detector, implemented as a time‐recursive averaging algorithm based on signal‐presence uncertainty, similar to statistical voice activity detector used in speech enhancement methods (e.g., Loizou, 2007). This pre‐estimated source model is then used to estimate the wind noise dictionary **D**_{n} applying a sparse NMF^{*} learning algorithm on a strong windy time frame, in this case 2 January, from 0:06 to 2:30. This is the final wind noise dictionary, and it is therefore essential to avoid leaking source components into it. This is the reason for using the pre‐estimated source model . The wind dictionary is saved on an external file and is used for the final learning step that is applied to any day of infrasonic data that is to be filtered, to produce **D**_{s}, **H**_{s}, and **H**_{n} of equation (4) and estimate the volcanic source spectrogram and the wind spectrogram with equations (5) and (6), respectively.

Usually this is the final result of the wind filtering procedure (Cabras *et al.*, 2014), whose effectiveness one has to trust. In this case study, however, we can actually evaluate the results of the filtering using the cross‐correlation procedure proposed by Ichihara *et al.* (2012). The cross‐correlation procedure, based on 5 s moving time windows, is applied first to the original microphone data and its co‐located seismic data. Both data are band‐pass filtered between 1 and 7 Hz. This guarantees that the microphone–seismometer distance is larger than the correlation length of wind noise for realistic wind speed but shorter than infrasonic wavelength. Then, exactly the same procedure is applied to the separated volcanic component and to the separated wind component, always using the original seismic signal (i.e., not processed with NMF) for the cross correlation. To illustrate the results, we show the analysis of three full days, 12 January (Fig. 1), 17 January (Fig. 2), and 18 January (Fig. 3). In all three cases, we see how the typical correlation patterns that, according to the method of Ichihara *et al.* (2012), indicate the presence of infrasonic signals of possible volcanic origin are significantly enhanced in the separated volcanic signal with respect to the original signal, and at the same time significantly reduced in the separated wind signal. In addition, Figure 4 shows an example where a short section (10 min) of time series is shown in panel (a) as raw data. After NMF decomposition is applied, this raw data is decomposed into separate wind and source components, shown in panels (b) and (c), respectively. Please note the different vertical scale in panel (c), indicating that SNR in the raw data was far from ideal. The fact that source is improved in the NMF‐filtered signal with respect to the raw data is proven by the increased correlation with seismic data illustrated in Figure 1.

One of the known limitations of the NMF filtering is that the procedure is applied only to the (exponentiated) amplitude spectra, that is, the spectral phases are not filtered. This means that both filtered, separated components will have exactly the same phases as the original, unfiltered time series. The separation of the phases does not seem to be an important issue in one of the key applications on NMF, that is, monaural audio separation (Virtanen, 2007). Correctly separating the phases would probably enhance the results of the cross‐correlation method of Ichihara *et al.* (2012), however, and remains an open issue to be investigated further. Nevertheless, the results presented here demonstrate already that even neglecting phases the proposed combination of NMF wind filtering (Cabras *et al.*, 2014) and cross correlation with seismic data (Ichihara *et al.*, 2012) constitutes a perfectly feasible way to distinguish and enhance infrasonic signals of possible volcanic origin in a time series affected by wind noise.

## CONCLUSIONS

We have presented an original method for highlighting and separating an infrasonic component of possible volcanic origin hidden in a time series affected by (possibly strong) wind noise. The method is a combination of a wind filtering procedure based on NMF and a successive cross correlation with seismic data recorded by a co‐located seismometer. A preliminary dictionary model for only the target source is precomputed and a wind dictionary model is then derived from a source + wind mixture, therefore avoiding the need for a wind‐only, volcano‐free time frame. Only a time period where volcanic source is not affected by wind is required. The method can work effectively with single channel infrasonic and seismic recordings and is therefore particularly useful in situations where available sensors are limited.

## ACKNOWLEDGMENTS

The original cross‐correlation method development was supported by the MEXT Grant‐in‐Aid for Scientific Research (20740251, 22540431). R. Carniel acknowledges the financial and logistical support of the International Research Promotion Office of the Earthquake Research Institute, The University of Tokyo, that made possible the collaboration that led to this paper.