# Seismological Research Letters

- © 2016 by the Seismological Society of America

## ABSTRACT

The temporal epidemic‐type aftershock sequence (ETAS) model is widely used to describe seismicity. However, only a few programs for estimation of the temporal ETAS model parameters are publicly available, and it is difficult to routinely apply some of them to observed data, due to initial value dependence. A robust temporal ETAS estimation program is required to routinely process earthquake catalogs or to perform Monte Carlo simulations. In this study, we developed a new Fortran program, etas_solve, which is based on the steepest descent method and calculates exact gradient and Hessian using the automatic differentiation technique. We performed numerical tests for a real earthquake catalog and for synthetic catalogs. Through the numerical tests, we found that initial value dependence of etas_solve is several orders of magnitude smaller than that of statistical analysis of point processes (SAPP), which is an R wrapper of statistical analysis of seismicity. We also found that standard error estimated from a Hessian at a maximum‐likelihood estimate (MLE) is useful to quantify the uncertainty of the MLE. Computation time of etas_solve was comparable to that of SAPP on a single‐core machine and faster on a multicore machine (around eightfold speed up on a 16‐core machine). In addition, etas_solve provides rich features that may be useful in real‐world applications, such as options to use fixed parameters and an auxiliary window in time and magnitude.

## INTRODUCTION

The temporal epidemic‐type aftershock sequence (ETAS) model introduced by Ogata (1988) is widely used to describe seismicity (Ogata, 1992; Llenos *et al.*, 2009). However, only a few programs for the estimation of the temporal ETAS model parameters are publicly available, and it is difficult to routinely apply some of them to observed data, due to initial value dependence (e.g., Ogata, 2006). A robust and fast temporal ETAS estimation program is required to routinely process earthquake catalogs or perform Monte Carlo simulations.

We summarized the main features of the etas_solve program and some publicly available temporal ETAS estimation codes, statistical analysis of seismicity (SASeis2006; Ogata, 2006), statistical analysis of point processes (SAPP; Ogata, 2006), etasFLP (Chiodi and Adelfio, 2015), and PtProcess (Harte, 2010) in Table 1. The table allows the user to compare the functionality and characteristics of various software packages.

The SASeis2006 software (Ogata, 2006) provides a FORTRAN 77 code to estimate the temporal ETAS model parameters. The program uses the Davidon–Fletcher–Powell (DFP) algorithm (Davidon, 1959; Fletcher and Powell, 1963), which is a kind of quasi‐Newton method, to obtain the maximum‐likelihood estimate (MLE). The DFP algorithm uses an approximate Hessian, which is updated at each iteration using the DFP formula (Davidon, 1959; Fletcher and Powell, 1963). The user is able to fix the *p*‐value in equation (1) to 1.0 and perform optimization over the rest of the parameters. SASeis2006 supports an auxiliary window (Wang *et al.*, 2010) in time but not in magnitude. The software does not provide an estimate of the standard error of the MLE, and hence it is difficult to quantify the uncertainty of the MLE. The code is not parallelized and thus may not be suitable for analyzing large catalogs.

The SAPP software (Ogata, 2006) is an R (R Core Team, 2015) wrapper of SASeis2006 and has the same characteristics as SASeis2006.

The etasFLP software (Chiodi and Adelfio, 2015) is an R package used to estimate both the temporal ETAS model parameters and spatiotemporal ETAS model parameters. It uses built‐in optimization routines of R to obtain the MLE. The R’s built‐in optimization routines provide several optimization techniques, and the user is able to select one of them at runtime (R Core Team, 2015). The software provides standard error estimated from a finite‐difference Hessian. The etasFLP software applies the logarithmic transformation to all parameters. As noted by Veen and Schoenberg (2008), logarithmic transformation for *c* improves stability. The user is able to fix any parameters and performs optimization over the rest of the parameters. On the other hand, etasFLP does not support an auxiliary window in time and magnitude. This limitation may bias the result (Wang *et al.*, 2010). The etasFLP program does not support parallelization and thus may not be suitable for analyzing large catalogs.

The PtProcess software (Harte, 2010) is an R package that provides a function to calculate a negative log‐likelihood function for the temporal ETAS model. PtProcess is parallelized, supports fixed parameters, and supports an auxiliary window in time. PtProcess does not support an auxiliary window in magnitude and does not provide an optimization routine.

In this study, we developed a new program, etas_solve, which is based on the steepest descent method (Boyd and Vandenberghe, 2004) and calculates the exact Hessian, using the forward‐mode automatic differentiation technique (Griewank, 1989). The etas_solve program provides an estimate of the standard error from the exact Hessian at the MLE. It also is robust to the choice of initial values and fast enough to estimate the parameters for about 4000 events in, on average, 1 min on a standard notebook computer (single thread on 1.7 GHz Intel Core i7). As etas_solve is parallelized by OpenMP (Dagum and Menon, 1998), it should be applicable to large catalog datasets. The etas_solve program supports an auxiliary window in time and magnitude to mitigate the edge effect (Wang *et al.*, 2010). The software also supports fixed parameters. The code is written in Fortran and available for free (see Data and Resources). In the following sections, we will describe its implementation and basic usage and show results of numerical tests for a real earthquake catalogs and synthetic catalogs.

## IMPLEMENTATION

According to Ogata (1988), the intensity function of earthquake production given seismicity data {(*t*_{j},*m*_{j})} and parameters *μ*, *K*, *c*, *α*, and *p* is written as (1)in which *m*_{r} is a reference magnitude. The unit of *K* in equation (1) is day^{p−1}, and its physical interpretation is unclear. Sometimes, the temporal distribution in equation (1) is normalized by (2)to make *K* equal to the expected number of aftershocks for an event with magnitude *m*_{r} as follows (3)(e.g., Ogata, 1998), in which *K* is dimensionless. We followed their approach in a slightly different manner because normalization with equation (2) is not applicable when *p*≤1. Instead of equation (2), we apply normalization by (4)in which 0<*τ*<∞ is a normalized time interval. Equation (4) converges if *p* is finite. This normalization makes *K* dimensionless while allowing *p*≤1. The choice of *τ* does not affect stability of etas_solve as long as it does not cause numerical instability. For this purpose, we recommend *τ*=1 day for most applications. We parameterize the intensity function as (5)

Subscript es is an abbreviation of etas_solve. Following equation (5), background seismicity is expected to produce *μ*_{es} earthquakes per *τ* time interval, and an **M**=*m*_{r} earthquake is expected to produce *K*_{es} direct aftershocks in the *τ* time interval since its origin time. Magnitude *m*_{r} does not need to be the smallest magnitude in the data (Ogata, 2006). The negative log‐likelihood function of equation (5) is written as (6)in which **θ**=(*μ*_{es},*K*_{es},*c*,*α*,*p*)^{T}, *t*_{begin}, and *t*_{end} are start and end times of a target time window, respectively. To estimate the temporal ETAS model parameters, we minimize equation (6) using the steepest descent method. At the *k*th iteration, the current solution **θ**_{k}, the gradient **g**_{k}, and the Hessian **H**_{k} are known. Exact values of **g**_{k} and **H**_{k} are calculated by the forward‐mode automatic differentiation technique. The algorithm finds a new solution using the steepest descent method for the quadratic norm as follows (7)in which **z** is a dummy variable, ‖·‖_{2} is L2 norm, |·|_{elem} is an elementwise absolute values operation, , **D**_{k} is a diagonal matrix whose diagonal elements are the eigenvalues of **H**_{k}, and **P**_{k} is an orthonormal matrix whose columns are eigenvectors of **H**_{k}. Thus, a search direction for the *k*th iteration is (8)

Δ**θ**_{k} corresponds to a search direction of Newton’s method if **H**_{k} is positive definite. Inexact line search (Boyd and Vandenberghe, 2004) with quadratic interpolation is performed along the search direction Δ**θ**_{k} and the (*k*+1)st solution (9)which satisfies *f*(**θ**_{k+1})<*f*(**θ**_{k}), is obtained. Iteration terminates when both stopping criteria (10)and positive definiteness of **H**_{k} are satisfied. The latter criterion assures that the final solution is not on a saddle nor peak point. Following Veen and Schoenberg (2008), etas_solve performs optimization over log*c* instead of *c* to improve stability. Also, we optimize log*μ*_{es} and log*K*_{es} instead of *μ*_{es} and *K*_{es}, because they can vary several orders of magnitude depending on data. In preliminary analyses, etas_solve with the exact Hessian still had smaller initial value dependence without the logarithmic transformations compared to that of SAPP. However, the logarithmic transformations approximately halved the number of iterations required for convergence of the iterative solver and reduced the condition number of the Hessian at the MLE about three orders of magnitude for the Miyagi earthquake catalog described in the Numerical Test Results and Discussion section.

## USAGE

We will introduce basic usage of etas_solve; `$` indicates the shell prompt. First, clone a repository as follows $ git clone https://github.com/kshramt/fortran_lib.git

Second, build etas_solve: $ cd fortran_lib$ make

Then, prepare a data file, the first column of which is time and the second column of which is magnitude. Time should be sorted in an ascending order $ head ‐n5 time_magnitude.dat0.53129398148093 0.83.997903703705030.04.403340046295411.97.16115138888911 0.77.16381481481371 1.1

Finally, run etas_solve: $ bin/etas_solve.sh \ ‐‐t_normalize_len=1 \ ‐‐m_fit_min=2 \ ‐‐m_for_K=6 \ ‐‐t_pre=2 \ ‐‐t_begin=4 \ ‐‐t_end=12 \ ‐‐mu=1 \ ‐‐K=1 \ ‐‐c=0.01 \ ‐‐alpha=0.5 \ ‐‐p=1.1 \ ‐‐data_file=time_magnitude.dat |release/bin/etas_solve.exe

In the example, initial value is(11)

The terms ‐‐mu and ‐‐K correspond to *μ*_{es} and *K*_{es}, respectively. The MLE for (*μ*_{es},*K*_{es},*c*,*α*,*p*), corresponding to *μ* and *K* of equation (1) for the MLE, gradient, and Hessian at the MLE will be printed. By default, etas_solve applies box constraints as follows (12)

The user is able to change the lower and upper bounds through command‐line options. The terms `t_normalize_len` and `m_for_K` correspond to *τ* and *m*_{r}, respectively. A target window covers and , and an auxiliary window covers and −∞<*m*<∞, excluding the target window (Fig. 1). Events in the auxiliary window and target window contribute to the intensity function, whereas optimization is performed to fit events within the target window. A utility script to calculate intensity function is also provided.

## NUMERICAL TEST RESULTS AND DISCUSSION

To demonstrate robustness of the developed program, we tested the dependence of estimated parameters on the choice of initial value by running the program from randomly chosen initial values and then compared the results with that of SAPP. We used etas_solve Git revision `ab50b83f3db064870458466ca31b82a4001f83c2` and SAPP v.1.0.6 for the numerical tests in this section (see Data and Resources). We used aftershock data of the 26 July 2003 **M** 6.2 northern Miyagi, Japan, earthquake, which is shipped together with SAPP and contains 553 **M**≥2.5 events, as a testing data. We set an auxiliary window empty and a target window to cover 0≤*t*≤18.68 (day) and 2.5≤*m*<∞. The end time 18.68 day and the minimum magnitude 2.5 were set based on an example in SAPP manual. The values *ε*, *τ*, and *m*_{r} were 10^{−6}, 1 day, and 6, respectively. First, we randomly drew 1024 initial values from a uniform distribution on 10^{−7}≤*μ*_{es}≤10, 10^{−7}≤*K*_{es}≤200, 10^{−7}≤*c*≤1 (day), 0≤*α*≤4, and 0≤*p*≤2. Then, we ran both etas_solve and SAPP from the initial values and obtained estimates. To conduct the numerical test in a reasonable time, we imposed 360 s running‐time limitation. We ran etas_solve without parallelization to impose the same time limitation for both the programs. Most trials converged in less than 10 s, and no trials for etas_solve were terminated by the time limitation.

Figure 2 shows the initial values, SAPP results, and etas_solve results. As shown in Figure 2, estimation values with etas_solve were independent of the initial value for the testing data, whereas those with SAPP show variations, depending on the initial value.

Inverse of the Hessian of the negative log‐likelihood function at the MLE provides error estimates of the MLE. Table 2 shows MLE obtained with etas_solve and its standard errors. Table 3 shows correlation coefficients of the estimation errors. There are relatively large trade‐offs between *p* and both *c* and *μ*_{es} for the testing data. There are relatively small trade‐offs between *μ*_{es} and both *α* and *K*_{es} for the testing data.

Although there was initial value dependence in the SAPP results, estimated values with small gradient (‖**g**‖_{2}≤10^{−5}) coincided with the MLE obtained with etas_solve. Figure 3 shows a comparison of the intensity function, which corresponds to the MLE obtained with etas_solve, and the testing data. As shown in Figure 4, all trials from the 1024 random initial values have converged within 25 iterations. On a single‐core machine, etas_solve took longer computation time per iteration than SAPP, due to calculation of the Hessian, but total execution time was comparable to that of SAPP, because fewer iterations for convergence was required. In addition, etas_solve was faster than SAPP on multicore machines (around eightfold speed up on a 16‐core machine), because etas_solve is parallelized by OpenMP.

We additionally checked performance of etas_solve and SAPP for synthetic catalogs with known temporal ETAS model parameters. First, we randomly generated 1024 synthetic catalogs with known temporal ETAS model parameters (, , *c*^{true}=0.01 day, *α*^{true}=1.5, and *p*^{true}=1.1 with and *m*_{r}=6). We assumed no earthquake for *t*<0; the magnitude distribution follows the Gutenberg–Richter law (Gutenberg and Richter, 1944) with *b*=1 and magnitudes of all earthquakes being within [1.5,7). We used essentially the same algorithm as Zhuang (2004) to generate the synthetic catalogs (see Data and Resources). We set an auxiliary window to cover 0≤*t*<50 and −∞<*m*<∞; and a target window to cover 50≤*t*≤500 and 1.5≤*m*≤7. Figure 5 shows a histogram of the number of events in the target window for the synthetic catalogs. There were 1037 events in the target window, on average. Then, we ran both etas_solve and SAPP from 32 fixed starting points (*μ*_{es}∈{0.1,10}×*K*_{es}∈{10,1000}×*c*∈{0.001,0.1}(day)×*α*∈{1,2}×*p*∈{0.5,1.5}) for each catalog. To conduct the numerical test in a reasonable time, we imposed a 360 s running‐time limitation. We ran etas_solve without parallelization to impose the same time limitation for both programs. No trials were terminated by the imposed time limitation for etas_solve, whereas about 10% of the 32,768 trials were terminated by the time limitation for SAPP.

Figure 6 shows histograms of MLEs for the numerical test with the synthetic catalogs. MLEs obtained with SAPP distributed in broader regions compared to that of etas_solve, suggesting larger initial value dependence.

For each of the 1024 synthetic catalogs, we ran etas_solve from 32 fixed starting points. Thus, there are at most 32 MLEs for each catalog, and they may vary due to initial value dependence. Figure 7 shows histograms of the difference between the maximum and minimum of MLEs for each catalog. The initial value dependence for etas_solve was smaller than 1×10^{−6} for all the catalogs and was more than five orders of magnitude smaller than that of SAPP.

Figure 8 shows closer look at the results obtained with etas_solve. As the initial value dependence was negligible for MLEs obtained with etas_solve, we took only one MLE out of 32 MLEs for each catalog to plot the histograms. Upper panels show histograms of MLEs. Peaks around the true values suggest correctness of etas_solve. Middle panels of Figure 8 show histograms of standard errors estimated from Hessians at the MLEs. Means of standard errors estimated from Hessians were comparable to standard errors estimated from standard deviations of MLEs. This result also indicates correctness of etas_solve. Bottom panels of Figure 8 show distances between the MLE and the true value normalized by the standard error at the MLE. Possibility of having the true value within two standard errors from an MLE was more than 90%, suggesting usefulness of the Hessian for the estimation of the standard error.

## CONCLUSION

We developed etas_solve to estimate the temporal ETAS model parameters based on the steepest descent method and the automatic differentiation technique. We performed numerical tests for a real catalog and synthetic catalogs and found that etas_solve outperforms SAPP in both robustness to the choice of the initial value and computation time. From the numerical test with the synthetic catalogs, initial value dependence of etas_solve was at least five orders of magnitude smaller than that of SAPP. Also, computation speed was more than eight times faster than that of SAPP on a 16‐core machine. We also found that standard errors estimated from the exact Hessian are useful to quantify the uncertainty of the MLE. The etas_solve program supports an auxiliary window in time and magnitude. The user is able to fix any parameters and perform optimization over the rest of the parameters. Source code of etas_solve is available under GNU General Public License v.3.

## DATA AND RESOURCES

The etas_solve program is available at https://github.com/kshramt/fortran_lib (last accessed October 2015) and https://bitbucket.org/kshramt/fortran_lib (last accessed October 2015). A Python script that contains essentially the same code as that used to generate the synthetic catalogs is also available in the repositories. Statistical analysis of point processess (SAPP) v.1.0.6 was downloaded from http://jasp.ism.ac.jp/ism/sapp/index_e.html (last accessed July 2015).

## ACKNOWLEDGMENTS

We thank two anonymous reviewers and Associate Editor Jeremy Douglas Zechar for their useful comments. We thank Editor Zhigang Peng for his careful guidance. A seminar presentation by Jiancang Zhuang was useful. This study was supported by Grant‐in‐Aid for The Japan Society for the Promotion of Science (JSPS) Fellows (No. 25·653) to A. K., Grant‐in‐Aid for Scientific Research 24101011 of the Japan Ministry of Education, Culture, Sports, Science, and Technology to Y. Y., and JSPS KAKENHI Grant 26240004 to B. E.