# Seismological Research Letters

- © Seismological Society of America

## ABSTRACT

The standard measure for evaluating the immediate effects of an earthquake on people and man‐made structures is intensity. Intensity estimates are widely used for emergency response, loss estimation, and distribution of public information after an earthquake occurs (Wood and Neumann, 1931; Brazee, 1976). Modern intensity assessment procedures process a variety of information sources, which fall into two main categories: physical sensors (seismographs and accelerometers) and social sensors (witness reports). Acquiring new data sources in the second category can help to speed up the existing procedures for intensity calculations. One potentially important data source in this category is the widespread microblogging platform Twitter, ranked ninth worldwide as of January 2016 by number of active users, ∼320 million (Twitter, 2016). In our previous studies, empirical relationships between tweet rate and observed modified Mercalli intensity (MMI) were developed using data from the **M** 6.0 South Napa, California, earthquake (Napa earthquake) that occurred on 24 August 2014 (Kropivnitskaya *et al.*, 2016). These relationships allow us to stream data from social sensors, supplementing data from other sensors to produce more accurate real‐time intensity maps.

In this study, we validate empirical relationships between tweet rate and observed MMI using new data sets from earthquakes that occurred in California, Japan, and Chile during March–April 2014. The validation test and calibration process are complicated by the fact that the Twitter data stream is limited for open public access, reducing the number of available tweets. In addition, only spatially limited positive tweets (tweets identified as referring to the earthquake) are incorporated into the analysis, further limiting the data set and restricting our study to a historical data set. In this work, the predictive relationship for California is recalibrated slightly, and a new set of relationships is estimated for Japan and Chile.

## INTRODUCTION

Earthquake intensity is a location‐specific characteristic of the extent and amount of seismic damage that depends not only on the magnitude of the earthquake but also on the distance from the earthquake epicenter to the site of interest and the geological features of the surrounding area. The first earthquake intensity classification was designed and used by the Italian scientist Schiantarelli in the 1780s, and the first modern intensity scale was created in 1828 by P. N. G. Egen. The first widely adopted intensity scale, the Rossi–Forel scale, was introduced to the scientific community in the late nineteenth century. Since then, numerous intensity scales have been developed and are used in different parts of the world (Hough, 2007).

The traditional understanding of earthquake intensity is based on estimates of infrastructure damage at the specific site of interest, derived from the subjective perception of professional observers and/or volunteers who witnessed the earthquake and its consequences. For this purpose, specially designed questionnaires are used at particular locations. A successful example of the modern implementation of this approach is the “Did You Feel It?” program created by the U.S. Geological Survey (USGS), which collects information from people who experience an earthquake and volunteer to share their observations online to create Community Internet Intensity Maps showing observations and the extent of damage (Wald *et al.*, 1999).

In addition to the intensity evaluation method based on human observations, another approach is to determine intensity from the peak ground motion, either velocity or acceleration, at a station near the site of interest. Then, empirical relationships are employed to calculate intensity level (Worden *et al.*, 2012). Results can be obtained much faster than results obtained from the observation‐based analysis. A successful realization of this method is the ShakeMap application, also developed by the USGS (Wald *et al.*, 2006).

The intensity evaluation approaches can be divided into two main categories according to the input information source types mentioned above. The first category employs social sensor data from people who witness the consequences of the earthquake. The second utilizes data from physical sensors, such as seismometers and accelerometers, to estimate intensity. Although the first method historically estimated intensity at a much slower rate than physical sensors, today electronic questionnaires and observers’ reports can be supplemented with auxiliary online data sources from social networks, in which people also share their observations after an earthquake. Here, we access data from the online social networking service Twitter. Twitter enables users to send and read short 140‐character messages called “tweets.” Registered users can read and post tweets, whereas unregistered users can only read them. Users can access Twitter through the website interface, SMS, or a mobile device application, which is the true source of the real‐time nature of tweets (Twitter, 2016). Twitter data are immediately available in a data stream, which can be mined using stream‐mining techniques (Schonfeld, 2009). In this study, we work with data following the stream model, in which data arrive at high speed and data‐mining algorithms must predict intensity level in real time under strict constraints of space and time (Bifet *et al.*, 2010).
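As a rough sketch of how earthquake-related (“positive”) tweets might be flagged in such a stream, consider a simple keyword filter; the keyword list below is an illustrative assumption, not the actual classifier used in the study:

```python
import re

# Hypothetical keyword list for flagging earthquake-related ("positive")
# tweets; the actual classification rules used in the study may differ.
EARTHQUAKE_KEYWORDS = re.compile(
    r"\b(earthquake|quake|temblor|terremoto|sismo|shaking)\b", re.IGNORECASE
)

def is_positive_tweet(text: str) -> bool:
    """Return True if the tweet text mentions an earthquake."""
    return bool(EARTHQUAKE_KEYWORDS.search(text))

def count_positive(tweets) -> int:
    """Count positive tweets in one streaming window."""
    return sum(1 for t in tweets if is_positive_tweet(t))
```

In a true streaming setting, this filter would run per arriving tweet under the stream model's single-pass constraint rather than over a stored list.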

Previous studies have shown the potential of Twitter data for earthquake detection (Earle *et al.*, 2010, 2011; Sakaki *et al.*, 2010; Crooks *et al.*, 2012; Burks *et al.*, 2014) and intensity estimation (Kropivnitskaya *et al.*, 2016). Kropivnitskaya *et al.* (2016) created four empirical predictive relationships (linear, two‐segment linear, three‐segment linear, and exponential) that link the positive tweet rates in the first 10 min following an earthquake with the instrumental intensity level in modified Mercalli intensity (MMI) scale units, from a regression analysis of data from physical and social sensors during the Napa earthquake. Figure 1 compares the combined data (instrumental and Twitter) with the MMI values recorded by the USGS in the Napa region; the misfit between estimated and actual intensity is relatively low for this particular earthquake. The proposed joint processing technique using social and physical data demonstrates significant potential for near‐real‐time predictive streaming applications. However, the empirical relationships between earthquake intensity and tweet rates still need to be validated for all of California and for other seismically active regions of the world and, if necessary, spatially calibrated.
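The general shape of these predictive relationships can be sketched as follows; the coefficients and break point below are placeholders for illustration only, not the published values of Kropivnitskaya *et al.* (2016):

```python
import math

def mmi_linear(rate, a=1.0, b=3.0):
    """Linear model: MMI as a linear function of log10 tweet rate.
    Coefficients a and b are illustrative placeholders."""
    return a * math.log10(rate) + b

def mmi_two_segment(rate, a1=0.5, b1=3.0, a2=2.0, b2=1.5, x_break=1.0):
    """Two-segment linear model: separate slopes below and above a
    break point in log10 tweet rate (all values are placeholders)."""
    x = math.log10(rate)
    return a1 * x + b1 if x < x_break else a2 * x + b2
```

A three-segment model follows the same piecewise pattern with two break points; the exponential form replaces the linear map with an exponential one.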

The statistical complexity of validating and calibrating the model is increased by the fact that the Twitter data stream is limited for open public access: the basic access levels allow only up to 1% of the total tweet volume to be streamed (Twitter, 2016). For our purposes, only spatially limited positive tweets (tweets about an earthquake) are used, so the rate limitations are critical, and historical data are used in our application instead of a real data stream. The relationships are validated and calibrated for three regions: California, Chile, and Japan. The degree of social media engagement across these countries is relatively high; the proportion of each country’s population that had a Twitter account as of February 2012 was 36% in the United States, 24% in Japan, and 33% in Chile (Dawson, 2012). Therefore, the potential for the success of Twitter streaming applications in these regions is high. In the next section, we discuss the social and physical sensor data used in this study. The following section contains the details of the validation procedure, followed by an explanation of and results for the calibration process for the California, Japan, and Chile regions. Finally, concluding remarks are provided.

## DATA PREPARATION

To validate whether the numerical results of Kropivnitskaya *et al.* (2016), which quantify hypothesized relationships between logarithmic tweet rates in the 10 min following an earthquake and earthquake intensity on the MMI scale, can be used for other earthquakes in California and in other regions around the world, we selected independent events for testing purposes, listed in Table 1. The magnitude scale used in Table 1 is moment magnitude. Moment magnitude is a static measure of earthquake size that strongly depends on stress drop; it does not directly quantify the radiated energy and, as a result, there are limits to its ability to reliably estimate earthquake size (Hanks and Johnston, 1992). However, these considerations do not impact the current analysis, because earthquake size is given here only for reference purposes and is not used as a predictive attribute in the model. The events listed in Table 1 are shown in Figure 2a (California), Figure 2b (Chile), and Figure 2c (Japan). None of these events was included in the original analysis of Kropivnitskaya *et al.* (2016).

The selection of seismic events used in this validation is dictated by two constraints. The first is the need for a large amount of freely available Twitter data. Gathering information from social media feeds is, in essence, a web‐mining process (Kosala and Blockeel, 2000; Sakaki *et al.*, 2010). It entails three operations: extracting data from the data provider (in this case Twitter) via application programming interfaces; parsing, integrating, and storing these data in a resident database; and then analyzing these data to extract information of interest. However, the currently available Twitter tools offer limited capabilities for information‐gathering procedures (Twitter, 2016). As a result, we used an archived historical data set of Twitter records from March to April 2014 (see Data and Resources). The second constraint is the scarcity of geotagged data within available Twitter data sets. We selected seismic events that both showed a spike in the number of positive tweets within 10 min following the earthquake and for which at least 14 of those tweets were geotagged (the minimum number of tweets in the models; Kropivnitskaya *et al.*, 2016). Geotagged tweets contain the user’s location at the time of tweeting and can be used directly in location‐specific intensity estimation algorithms. However, Graham *et al.* (2014) showed that only 0.7% of a sample of 19.6 million tweets were geotagged. In such cases, tweets that lack a direct location reference but contain a link to specific cities can be used; the percentage of geotagging in this case is between 2% and 5% (Severo *et al.*, 2015). A geotag can also be obtained from a field in the user account description (7.5% of profiles contain latitude and longitude values, 57% include a named location, 20.4% reference information that can be used to identify a country, whereas 15.1% provide humorous or nonspatial information; Takhteyev *et al.*, 2012).
In the approach used by Kropivnitskaya *et al.* (2016), all three geotagging techniques were used, and the same logic has been implemented here. The text‐based geolocation algorithm has been improved by employing a location database extension for California, Japan, and Chile. In updating the location database, the complete, shortened, or abbreviated names are included for any settlement with a population of more than 5000 people within a 200‐km radius of the epicenters of the earthquakes listed in Table 1 (see Fig. 2).
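The text-based geolocation step can be sketched as a lookup against such a location table; the table below is a tiny stand-in for the authors’ database, with approximate coordinates:

```python
# Minimal sketch of text-based geolocation: match settlement names (full,
# shortened, or abbreviated) against a location table. This three-entry
# table and its rounded coordinates are illustrative stand-ins only.
CITY_COORDS = {
    "napa": (38.30, -122.29),
    "la habra": (33.93, -117.95),
    "iquique": (-20.21, -70.15),
}

def geolocate(text: str):
    """Return (lat, lon) of the first known settlement mentioned in the
    tweet text, or None if no listed name is found."""
    lowered = text.lower()
    for name, coords in CITY_COORDS.items():
        if name in lowered:
            return coords
    return None
```

A production implementation would also disambiguate identically named settlements and handle abbreviations explicitly, which a plain substring match does not.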

After building a data set for each event that is limited in time and space, we generated a tweet‐frequency time series with 1 s time bins, normalized to the number of tweets per minute. In the current study, we did not use strong‐motion records and did not calculate intensity level using empirical relationships. Instead, for each point in the Twitter data set, we assigned an MMI value from the instrumental intensity database of the USGS National Earthquake Information Center (see Data and Resources) corresponding to the same or closest latitude–longitude location (for closest‐location selection, the smallest distance between the points is estimated). For California, we selected the **M** 6.8 Ferndale, California, earthquake (10 March 2014) and the **M** 5.1 La Habra, California, earthquake (29 March 2014) (Fig. 2a; Kropivnitskaya *et al.*, 2016).
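These two steps — binning tweet times into a rate series and assigning each point the MMI of the nearest instrumental grid location — can be sketched as follows (a simple squared-degree proximity measure is assumed for the nearest-neighbor match):

```python
def tweet_rate_series(timestamps, window_s=600, bin_s=1):
    """Count tweets in 1 s bins over the first 10 min after origin time
    (timestamps in seconds), then normalize each bin to tweets per minute."""
    bins = [0] * (window_s // bin_s)
    for t in timestamps:
        if 0 <= t < window_s:
            bins[int(t) // bin_s] += 1
    return [c * 60.0 / bin_s for c in bins]

def nearest_mmi(lat, lon, mmi_points):
    """Assign MMI from the closest (lat, lon, mmi) grid point, using
    squared distance in degrees as a simple proximity measure."""
    closest = min(mmi_points, key=lambda p: (p[0] - lat) ** 2 + (p[1] - lon) ** 2)
    return closest[2]
```

Squared degrees suffice for picking the closest of several nearby grid points; a great-circle distance would be needed for distances used quantitatively.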

## RELATIONSHIP VALIDATION

Earlier relationships between positive tweet rates and observed MMI were developed using only the data from the Napa earthquake (Kropivnitskaya *et al.*, 2016). To employ them to supplement data from physical sensors for the production of more accurate real‐time intensity maps, the relationships have to be validated for other areas of California and for other tectonic regions worldwide. The validation process analyzes the goodness of fit of the regression for the Napa earthquake, determining whether the regression residuals are random and checking whether the model’s predictive performance deteriorates substantially when applied to data from the Ferndale and La Habra earthquakes, events in California that were not used in the original model derivation. Neither event is large enough to validate the empirical relationships at intensity levels higher than MMI VI; as a result, the three‐segment model is excluded from the validation test.

Both earthquakes’ data sets show that the positive tweet rate increases more slowly than predicted by the empirical relationships for the Napa earthquake derived in Kropivnitskaya *et al.* (2016).

The residuals for both events are shown with whisker diagrams for the different models in Figure 3, which display the distribution of the residuals based on a five‐number summary: minimum, first quartile, median, third quartile, and maximum. The quartiles are the three points that divide the data set into four equal groups, each comprising a quarter of the data. The first quartile is the middle number between the smallest value and the median of the data set; the median is the second quartile; and the third quartile is the middle value between the median and the highest value of the data set. The interquartile range is used here to characterize outliers that skew the data.
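The five-number summary and the interquartile-range outlier rule can be computed as follows; Python’s `statistics.quantiles` with its default exclusive method is one common quartile convention, and the exact convention used for Figure 3 is not stated:

```python
import statistics

def five_number_summary(values):
    """Minimum, first quartile, median, third quartile, maximum.
    Quartiles use the 'exclusive' convention of statistics.quantiles."""
    q1, med, q3 = statistics.quantiles(values, n=4)
    return min(values), q1, med, q3, max(values)

def iqr_outliers(values):
    """Points beyond 1.5 * IQR from the quartiles, the standard
    box-and-whisker rule for flagging outliers."""
    _, q1, _, q3, _ = five_number_summary(values)
    iqr = q3 - q1
    lo, hi = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [v for v in values if v < lo or v > hi]
```

Applied to a residual set with one extreme value, the rule isolates that value as an outlier.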

Despite the fact that the La Habra data (Fig. 3b) contain significantly more outliers in the residuals than the Ferndale data (Fig. 3a), the mean residuals for every model for the La Habra event are much closer to zero (the linear model mean residual is 0.37 MMI units; the exponential model mean residual is −0.25 MMI units; the two‐segment model mean residual is 0.28 MMI units) than for the Ferndale event (the linear model mean residual is 1.28 MMI units; the exponential model mean residual is 0.98 MMI units; and the two‐segment model mean residual is 1.57 MMI units).

A visual examination of the residuals has an advantage over numerical model validation methods because it illustrates complex aspects of the relationship between the model and the new data. Figure 4 shows the residuals from the fitted models. The residuals are not randomly distributed around zero, indicating that the linear assumption may not be reasonable (Fig. 4c,d). However, even for the exponential model (Fig. 4a,b), the residuals are not normally distributed around zero. In this case, the variances of the error terms at each MMI level are not equal, due to the different number of Twitter responses at each level. Moreover, some of the residuals stand out from the basic pattern, confirming that there are outliers in the data, as indicated by the whisker diagrams (Fig. 3). These data points have to be excluded from the calibration procedure detailed below.

Intensity attenuates with distance from the epicenter of an earthquake, and intensity prediction equations usually rely on some distance metric. For example, some models use epicentral distance (Bakun and Wentworth, 1997) or the closest distance to rupture, or some variant that considers extended fault sources, such as the Joyner–Boore distance (Joyner and Boore, 1993). For small‐magnitude earthquakes, for which an earthquake can be approximated by a point source, the difference between point and extended‐source distance metrics can be minimal. However, at larger magnitudes, for which source finiteness can be significant, prediction equations that use point‐source distance metrics may not be applicable (Cua *et al.*, 2010). In this evaluation, we consider the epicentral distance, a point‐source distance metric, as a potential predictor for the equations of Kropivnitskaya *et al.* (2016). The anticipated influence of the distance metric on the predictive equations is based on the assumption that the number of positive tweets and tweet rates are expected to attenuate with distance from the epicenter. As a result, the behavior of the ground‐motion components may vary with epicentral distance and result in a variation in the type and/or magnitude of structural damage. To assess this possibility, we plotted residuals versus epicentral distance (Fig. 4a,c,e). These plots do not exhibit a systematic structure and, as a result, the form of the function cannot be improved using that predictor.
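Epicentral distance between an epicenter and a site can be computed from their coordinates with the standard haversine formula; the mean Earth radius of 6371 km is an assumed constant:

```python
from math import radians, sin, cos, asin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an assumed constant

def epicentral_distance_km(epi_lat, epi_lon, site_lat, site_lon):
    """Great-circle distance between epicenter and site (haversine formula)."""
    lat1, lon1, lat2, lon2 = map(radians, (epi_lat, epi_lon, site_lat, site_lon))
    a = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2.0 * EARTH_RADIUS_KM * asin(sqrt(a))
```

One degree of longitude at the equator is roughly 111.2 km, which provides a quick sanity check on the implementation.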

To check for temporal variation in the data, the residuals are also plotted versus time (Fig. 4b,d,f). At each minute, the intensity map is updated according to the new values received during the last minute. However, at the same time, all values obtained earlier remain on the map and are incorporated into the overall intensity representation. Results show that the residuals plotted versus time do not show any drift in the errors and appear to behave randomly.

Both sets of residual plots do record the delay from zero distance and zero time for the Ferndale earthquake, an offshore event that occurred 78 km off the coast. The cumulative error over the 10 min interval is shown in Figure 5. The maximum error for the Ferndale earthquake registered at the third minute, and the minimum error occurred at the tenth minute. For the La Habra event, the maximum error registered at the first and seventh minutes, whereas the minimum error occurred at the fifth and tenth minutes. Results suggest that the original model of Kropivnitskaya *et al.* (2016) does not fit the validation data set as well as it fits the Napa earthquake data from which it was developed, and a more complete calibration process for the California region is warranted as more data become available.

Figure 6 shows the data for the Chile and Japan earthquakes from Table 1 plotted with the prediction models of Kropivnitskaya *et al.* (2016). The data distribution follows the same pattern as the models, but it is clear that the models have to be calibrated and shifted to the left to better represent the data. One of the Chile earthquakes has a significantly higher number of tweets after the event (Fig. 6a, upward triangles). Noting that all earthquakes generally show common characteristics, one possible reason could be that this earthquake happened the day after the **M** 6.7 event that occurred 64 km west‐northwest of Iquique on 16 March 2014; people were likely alert, and their reaction was stronger. The data from Japan do not allow for validation of the empirical relationships at intensity levels higher than MMI V. As a result, the three‐segment and two‐segment models are excluded from the calibration. Observed data for the Chile region represent intensity levels up to MMI VI; therefore, the three‐segment model is not taken into consideration.

## CALIBRATION OF EXISTING RELATIONSHIPS

In validating the earlier relationships in the previous step, discrepancies were noted in the spatial variation of intensity, resulting in the need for spatial model calibration. This calibration task involves systematic adjustment of model parameter estimates so that the model outputs more accurately reflect external benchmarks. The forward problem that relates the logarithmic tweet rate to intensity predictions may be represented with an equation of the following form:

MMI = *f*(**Θ**) + **e**, (1)

in which MMI is the aggregated instrumental intensity level observed, **Θ** is the vector of the observed tweet rates, *f* is the forward equation representing each mathematical model (linear, exponential, two‐segment, three‐segment), and **e** is the residuals vector, which describes the deviation between the measured and predicted values of intensity (in MMI units). Here,

**e** = *e*_{o} + *e*_{m}, (2)

in which *e*_{o} accounts for measurement error in the observations, and *e*_{m} is the model error.

Here, the calibration is an inverse procedure. An inverse estimator *C* connects the observations MMI to “good” estimates *g* of the parameters of interest:

*g* = *C*(MMI). (3)

We used the known relationship between the dependent variable MMI and the independent variable, the logarithmic tweet rate, to estimate parameter values for regions other than the Napa Valley using new observations from the earthquakes listed in Table 1. Linear regression is used to produce new regression coefficients for each region. We regress instrumental MMI against the logarithmic mean of the number of tweets per minute to obtain new predictive equations (Fig. 7) within a legitimate range of values for each model (see Table 2). We also regress the average ground‐motion values for specified MMI levels so that the result approximately follows the appropriate trend, instead of producing a relationship that is overly influenced by the greater statistical volume of data at lower intensities. We applied a least‐squares solution with 95% confidence bounds for each model.
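The core regression step can be sketched as a closed-form ordinary least-squares fit of MMI against log10 tweet rate; this minimal sketch omits the confidence bounds and the per-level averaging described above:

```python
def fit_linear(log_rates, mmi):
    """Ordinary least-squares fit of MMI = slope * log10(rate) + intercept.
    Returns (slope, intercept) minimizing the sum of squared residuals."""
    n = len(log_rates)
    mean_x = sum(log_rates) / n
    mean_y = sum(mmi) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(log_rates, mmi))
    sxx = sum((x - mean_x) ** 2 for x in log_rates)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x
```

On data lying exactly on a line, the fit recovers the line’s slope and intercept, which serves as a basic correctness check before applying it to noisy regional data.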

For the California region, the lowest root mean square (rms) error, 0.0029 MMI units, is observed for the two‐segment model. For the Napa earthquake (Kropivnitskaya *et al.*, 2016), the lowest error was achieved with the three‐segment model, which is excluded from validation here. The exponential model has an rms error of 0.0035 MMI units. For the Japanese data, the difference between the rms errors for the linear and exponential models is also not significant (0.0207 vs. 0.0186). This may be explained by the use of midrange intensity data in the regression (MMI III–V); for lower intensity levels (MMI I–III), the difference would be more significant, and for intensity levels higher than MMI V, it is less significant. For the Chile earthquakes, the lowest rms error, 0.0143 MMI units, is observed for the linear model, which is 0.0002 units lower than for the exponential model.

Because the rms error cannot determine whether the model estimates and predictions are biased, we also assessed the residual plots (Fig. 8). The residuals between the predicted and observed data are shown in Figure 8a for each model in the California region and demonstrate normality in every case. The residual distribution for Japan (Fig. 8c) is also nearly normal for both calibrated models. The distribution of the Chile data residuals (Fig. 8b) is skewed to the left for the two‐segment model and skewed to the right for the linear and exponential models. The condition that the error terms are normally distributed is not met in that case.

## CONCLUSIONS

Incorporation of social sensor data with traditional data sources using advanced computational processing methods can provide more complete and accurate coverage of damage and loss for rapid earthquake response. The prediction equations obtained in this work could be used for real‐time seismic‐hazard mapping and emergency management purposes in California, Japan, and Chile using the real‐time data‐streaming concept of Twitter data. However, because our previous work showed a higher level of uncertainty resulting from the use of Twitter data alone when compared with results streamed jointly with instrumental intensity, we propose to use the empirical equation for Twitter data together with the data from physical sensors as presented in Kropivnitskaya *et al.* (2016).

Here, the Twitter stream processing uses data recorded within 10 min following an earthquake. We confirm our earlier hypothesis that the logarithmic number of tweets can be used as a proxy for shaking intensity, not just in the California region, but also in other regions with large numbers of Twitter users, such as Japan and Chile. However, for actual real‐time implementations, it must be remembered that the calibrated relationships presented in this article are constrained by the number of tweets in the historical data set. Unlimited access to the Twitter data stream would allow for a new, improved calibration of regional relationships around the world. In many areas, this additional data source could be very significant, owing to the complete or partial lack of traditional instrumental data sources as a result of the high cost of their installation and ongoing operation.

## DATA AND RESOURCES

Instrumental intensity data used in this study were obtained from the U.S. Geological Survey National Earthquake Information Center at http://earthquake.usgs.gov/earthquakes/search/ (last accessed June 2016). An archived historical data set of Twitter records from March to April 2014 (downloaded from https://archive.org/details/twitterstream, last accessed January 2015) was used in this study. Figures were created using the Generic Mapping Tools (Wessel and Smith, 1991) and MATLAB plotting software (www.mathworks.com/products/matlab, last accessed July 2016). The streaming application implementation is based on International Business Machines (IBM) InfoSphere Streams, a cloud platform for real‐time analytics on big data (IBM, 2014).

## ACKNOWLEDGMENTS

The research of K. F. T. and Y. K. was made possible by a Mathematics of Information Technology and Complex Systems (MITACS) Accelerate Grant and a Natural Sciences and Engineering Research Council Discovery Grant and is the result of collaboration between the Western University Computational Laboratory for Fault System Modeling, Analysis, and Data Assimilation and Shared Hierarchical Academic Research Network (SHARCNET), a high performance computing consortium of Canadian academic institutions.