On the likelihood of future eruptions in the Chilean Southern Volcanic Zone: interpreting the past century’s eruption record based on statistical analyses

. A sequence of 150 explosive eruptions recorded during the past century at the Chilean Southern Volcanic Zone (SVZ) is subjected to statistical time series analysis. The exponential, Weibull, and log-logistic distribution functions are fit to the eruption record, separately for literature-assigned volcanic explosivity indices (VEI) ≥2 and VEI ≥3. Since statistical tests confirm the adequacy of all the fits to describe the data, all models are used to estimate the likelihood of future eruptions. Only small differences are observed between the different distribution functions with regard to the eruption forecast, whereby the log-logistic distribution predicts the lowest probabilities. There is a 50% probability for VEI ≥2 eruptions to occur in the SVZ within less than a year, and 90% probability to occur within the next 2-3 years. For the larger VEI ≥3 eruptions, the 50% probability is reached in 3-4 years, while the 90% level is reached in 9-11 years.


Introduction
The Chilean Southern Volcanic Zone (SVZ) is the volcanologically most active part of the Chilean Andes, stretching from Tupungatito (33.40°S, 69.80°W) in the wider Santiago region (Región Metropolitana) southwards to Cerro Hudson (45.90°S, 72.97°W) in central Patagonia.Several population centres used for living, agriculture, industry and commerce are in the vicinity of these volcanoes.Coastal lowlands to mountainous regions are touristically attractive and frequently visited for recreation throughout the year.Larger cities, in particular Santiago de Chile, while not in the immediate vicinity of the volcanic chain, are located inside drainage channels from the Andes and would, in the event of a volcanic eruption, be threatened by possible lahars and pyroclastic flows.The entire region is endangered by fallout tephra from explosive eruptions, which may cover large areas at long distances from the eruptive source, also reaching Argentina to the east.Estimating the risk posed by this volcanic arc segment is thus of paramount importance for the most densely populated region of Chile and Argentina.
Continuous monitoring, i.e. the surveillance of seismic activity, gas release and composition, ground deformation, scientific constraints on the regional tectonics and geochemistry are the most powerful tools for assessing the current state of a volcano.Recent efforts by the ʻObservatorio Volcanológico de los Andes del Surʼ (OVDAS, part of SERNAGEOMIN, the Servicio Nacional de Geología y Minería), in Chile have greatly improved the monitoring of these volcanoes by the installation of a large number of seismometers on the Chilean volcanoes considered as most active (Fernando Gil, personal communication, 2010).However, in-detail monitoring requires substantial funds for staff and instrumentation and can therefore only be performed for selected volcanoes.Monitoring and scientific investigation in the Andes are additionally complicated by the still poor accessibility of some volcanoes at high altitudes in rugged terrains.Moreover, the unexpected 2008 violent eruption of Chaitén volcano should be taken as an alert that not only those volcanoes with a recent (historical) eruptive record should be recognised in hazard considerations.
Since all these volcanoes are located in the same volcanic arc, formed by subduction of the Nazca Plate beneath the South American Plate, they share a collective susceptibility to the tectonic driving forces, which may potentially affect the system on a large scale.Although individual volcanoes may not be matter of statistical analysis because of too few eruptions, a compiled investigation will pay credit to their importance for relaxing the system by volumetric magma release and stress changes, and may reveal changes in the eruption rates of the entire arc.Also in the context of an interdisciplinary hazard and risk estimation, it is fundamental to evaluate the overall eruption hazard in addition to the individual analyses of single eruption sites.Furthermore, few cases of registered eruptions were revealed where an eruption was initially assigned to a wrong source volcano -such a problem becomes inconsequential when regarding compiled data sets.
An additional advantage of combining all SVZ eruptions is the possibility to regard larger eruptions (VEI ≥3), which are usually too infrequent to be assessed for each volcano individually.Although this collective approach does not allow us to specify where the next eruption is most likely to occur, the exact site becomes less important by the potential of a larger eruption to be more devastating for a much wider area.
This paper therefore aims at describing the typical temporal eruption behaviour of the SVZ as a coherent subduction system by means of several statistical distributions.These models are then used to infer an estimate for the probability of future eruptions.With these results we provide a probabilistic part of volcanic hazard assessment, to be combined with results of the various continuous monitoring techniques.

Tectonic setting and volcanic activity at the SVZ
Volcanism in the SVZ is driven by the subduction of the Nazca Plate beneath the South American Plate, which takes place in an oblique direction of approximately 80-82°ENE at a rate of 70-90 mm/ year.The incoming plate dips uniformly at about 25° to a depth of 90-100 km (Stern, 2004).At the northern boundary, the dip angle at depth shallows to less than 5°, considered as the reason for the lack of volcanism in the North-adjacent Pampean flat slab segment.The southern end of the SVZ is defined by the subducting Chile Ridge.
In the active arc over a length of 1,400 km, several calderas and more than 60 large volcanic complexes tower over numerous smaller cones (Fig. 1; López-Escobar et al., 1995;Stern et al., 2007).Of those, twenty have erupted not only in the Pleistocene/ Holocene, but also produced a series of eruptions in historical times.
The historical records investigated in this study are taken from the Global Volcanism Program webpage (www.volcano.si.edu, Siebert and Simkin, 2002-) and literature compilations, in particular from Petit-Breuilh (2004).Only eruptions from a minimum volcanic explosivity index (VEI, defined by Newall and Self, 1982;Simkin and Siebert, 1994) upwards are considered in this study.The eruption chronologies from the different sources are largely in agreement with each other.In the cases where discrepancies exist on eruption occurrence and VEI-assignment, the eruptions were included in the analysis if at least one source states them to be large enough (VEI ≥2), and no other source explicitly discredited them.The eruption records were converted into time series by calculating the repose times between successive eruptions.Following the convention established by Klein (1982), the repose time is defined as the interval from the onset of one eruption to the onset of the next one, thereby neglecting the duration of the eruptions.The repose times are resolved on a yearly scale.This results from the precision of the available source data, which sometimes do not provide exact dates, and is sufficient for most purposes.For those years in which several eruptions took place, repose times of ʻzero yearsʼ are accumulated and as such included in the statistical analysis.In several cases, volcanoes showed eruptive cycles with variably strong activity and variable interruptions therein over two or several years.For example, an eruption might commence with weak activity in one year, and then culminate to a degree of explosivity sufficient for being considered in this study in the following year/s.The decision on whether to treat such eruptive activity as one longer-lasting complex eruption, or as a sequence of individual events, is not within the scope of this study.We therefore adopted the eruption demarcation as given in the available literature.As such, they form the data base for the statistical processing, and are provided with the literature-assigned VEI in table 1.
Due to the limited historical record, our study will only include eruptions from the year 1900    2004) Given by Naranjo and Polanco (2004) and in the GVP as 2000, by Petit-Breuilh (2004) as 2001 onwards.Although the statistical method would be more reliable if based on a longer eruption record, in particular since short-term fluctuations in activity over time scales of centuries might bias the analysis, the older part of the record becomes progressively more unreliable.For those volcanoes analysed individually by Dzierma and Wehrmann (2010a), the year 1900 presents an acceptable threshold after which the eruption record can be reasonably assumed to be complete based on statistical tests.The eruption filtering by a minimum magnitude is performed here because very small eruptions are more prone not to be registered in the eruption record.For eruptions of higher explosivity index (VEI ≥4), the eruption record is too scarce for statistical analysis, although their devastating energy is obviously much larger.Our measure here is eruption occurrence; the events are not weighted by their magnitude.The analysis is twofold performed here, (1) including all eruptions with a VEI ≥2, and (2) for all eruptions with VEI ≥3.The collective eruption record is visualised as a cumulative frequency plot over time in figure 2.

3.Method
The method employed here is a simple application of the standard statistical lifetime and failure analysis, presented elsewhere in detail (Dzierma and Wehrmann, 2010a, b).Several studies have used this technique or parts of it to characterise the eruption behaviour and quantify the eruption hazard of volcanoes around the world (e.g., Wickman, 1966;Klein, 1982;Ho, 1990;Bebbington and Lai, 1996a;De la Cruz-Reyna, 1996; Scarpa and Tilling, 1996; Connor et al., 2003;Mendoza-Rosas and De la Cruz-Reyna, 2008, 2009, 2010).
In brief, to determine whether the eruptions can be modelled as simple stochastic point processes, it is assessed to what extent successive eruptions are independent of each other by calculating the correlation between successive repose times.Whether or not the time series can be considered stationary is checked using a 5-point moving average of the repose times and assessing the deviations from the average.
Given the assumption of a stationary stochastic point process, several standard models are available which can be used to represent the repose time distribution function:

F(t) = probability of a repose time T≤t = P(T≤t)
or the corresponding survival function:

S(t) = P(T ˃ t) = 1 -F (t)
The most widely-used statistical models are: a. the exponential distribution; b. the Weibull distribution and c. the log-logistic distribution function.
The exponential distribution mathematically represents the most simple scenario where the eruptions can be described by a Poisson process (for an introduction, see, e.g., Cox and Lewis, 1966;Cox and Oakes, 1984).In this case, the rate of eruptions λ is constant with time, and the survival function is given by the functional form: As an extension of this model, the Weibull distribution takes into account a shape parameter d which facilitates to describe systems with a temporally changing hazard rate, usually attributed to one dominant process acting in the system: Depending on whether the shape parameter d is larger or smaller than one, the hazard rate of the system is increasing or decreasing with time, respectively (Ho, 1991;Bebbington and Lai, 1996a, b;Watt et al., 2007).
As additional conceptual adjustment, the loglogistic model (Connor et al., 2003(Connor et al., , 2006;;Dirksen, 2006) accommodates the inclusion of competing processes, where some factors increase the probability of an eruption with time, whereas others counteract them, decreasing the eruption probability: All three survival functions are fit to the data, optimised by a least-squares method, and the quality and adequateness of the fits evaluated using several statistical measures, the Kolmogorov-Smirnov-test (KS-test, Gibbons, 1976), c² test and corrected Akaike Information Criterion (AIC c , Akaike, 1973;Sigiura, 1978;Burnham and Anderson, 1998).
The fit functions can then be used to estimate the probability of at least one VEI ≥2 or VEI ≥3 eruption within a given time span t in the future as: The green line corresponds to an overall constant eruption rate, which presents a relatively good approximation to the data.
where x is the time that has elapsed since the last eruption (Marshall and Olkin, 2007).In the present case, since the year 2011 accommodates the onset of the Puyehue eruption, the statistical forecast is calculated as seen from the beginning of 2012.

Results
The first visual impression of the chronological eruptive sequence is obtained from the cumulative number of eruptions versus time plot (Fig. 2).For the VEI ≥3 limit, the cumulative number of eruptions increases approximately linearly with time.The plot displaying the VEI ≥2 eruptions shows an overall increase over time that can generally be described as narrow scattering bent over a linear increase, indicating that the series of eruption occurrence has remained relatively stable since 1900, with maybe a slight decrease in eruption rate over time.If the time series was to be described in much greater detail, a stepwise consideration could be invoked, distinguishing several short regimes of higher and lower activity levels.This is, however, beyond the scope of this study.When regarding the subduction system on a large scale and in simplified terms, the long-term driving forces involved in provoking volcanic activity have not undergone any drastic changes during the short eruption record considered here, which in part allows us to assume near-stationarity for the overall eruption rate.
Statistically testing the time series for stationarity, this first visual impression is confirmed by the outcome from the correlation and moving-average analyses (Figs. 3,4), although sensu stricto the requirements of no simultaneously occurring events is not entirely fulfilled at the chosen time resolution, as the data set contains several years in which more than one eruption took place.The correlation coefficients of R=-0.111 and R=-0.181 for VEI ≥2 and VEI ≥3 eruptions, respectively, do not indicate a significant correlation at the 5% level of significance (p-values are 0.180 and 0.408, respectively).The stationarity checks are equally satisfied, with variations in the 5-point average repose times remaining below 2 standard deviations from the mean.While there appears to be a slight trend towards increasing repose times in the recent past, this has not yet reached the point of statistical significance.From a mathematical point of view it is therefore justified to treat both datasets as stationary for the following analyses.
The life time distributions can be fit stably in all cases, and all fits pass the goodness-of-fit test (Table 2, Fig. 5).For the VEI ≥2 eruptions, the exponential and Weibull distributions give the best fit to the data according to the AIC c .The Weibull model differs from the exponential distribution in that it can take into account a shape parameter, which for both data sets is slightly larger than one, indicating a weakly increasing hazard rate.For the VEI ≥3 eruptions, the AIC c attributes the best fit quality to the log-logistic and exponential distributions.Both the VEI ≥2 and VEI ≥3 eruptions can be reasonably described by a Poisson process, which agrees well with the approximately linear shape of the cumulative eruption curve (Fig. 2).More complicated models (the Weibull distribution for VEI ≥2 eruptions and ).Each repose time is plotted as a function of the preceding repose time.Since many repose times occur several times and dots would overlap, jitter is applied to the data before plotting.In this way, each data point is displaced randomly a small distance around its coordinate in the plot.This is done for better visual inspection of the plot and was not included in the correlation/regression analysis.Because of the yearly resolution, points appear as being clustered on a grid instead of showing an irregular distribution.The red line gives a linear fit of the data points, the green line would be expected for perfect correlation between successive repose times.the log-logistic model for VEI ≥3 eruptions) are also in agreement with the data, but do not significantly improve the fit of the exponential distribution.Since all distribution functions pass the goodnessof-fit test, the predictions for the future eruption probabilities are calculated for all distribution functions (Fig. 6), although it appears on the basis of the AIC c that the Weibull or exponential model should be preferred for the VEI ≥2 eruptions and the exponential or log-logistic behaviour for the VEI ≥3 eruptions.The differences in the predictions for the three distribution functions are slight.All of them reach over 50% probability for at least one VEI ≥2 eruption in less than one year, and a probability of 90% in 2-3 years.These predictions stochastically underline the high hazard posed by the SVZ volcanoes, as also directly observed by the short recurrence intervals in the order of no more than a few years and sometimes several eruptions occurring in the same year.
For the VEI ≥3 eruptions, the probabilities are somewhat smaller.The analysis gives at most 20% probability of at least one VEI ≥3 eruption within one year, and the 50% probability is reached after 3-4 years.With over 90% probability, it can be expected that at least one VEI ≥3 eruption will occur over the next 9-11 years, depending on the distribution function.

Conclusions
Based on time series analysis of 150 eruptions in the historical record since 1900, we provide the likelihood for future eruptions of VEI ≥3 and VEI ≥2 to occur in the SVZ.The models predict a statistical probability of 65-72% of a VEI ≥3 eruption within the next 5 years, and 96-99% of a VEI ≥2 eruption within this time.For obvious reasons, hazard evaluation should never build up on statistical analyses alone.Volcanic settings are known to be able to produce catastrophic eruptions at short time scales, which cannot always be reliably predicted based on the average behaviour.Statistical eruption analysis as presented here can only estimate the likelihood that eruptions will take place within a given time interval.As a main outcome with regard to the high probability for future eruptions, the estimate provided here underpins the necessity for continuous volcano surveillance in the SVZ.Monitoring of the volcanoseismic signals, gas emissions, inflation/deflation, and understanding magmatic and tectonic systems are indispensable for successful short-term disaster prevention, management and mitigation.A joint assessment of these physical parameters combined with insight from statistical analyses yields the potential to reliably indicate signs of an impending eruption at short notice.

FIG. 1 .
FIG. 1. Overview SRTM image of the Chilean Southern Volcanic Zone(SVZ) displaying the location of the volcanoes considered in this study (red triangles), other major volcanic centres within the SVZ (yellow triangles), the plate boundary (dashed blue line), major cities (white circles).Inset shows the location of the SVZ on the South American continent.Image courtesy SRTM Team NASA/JPL/NIMA, modified.

FIG. 2 .
FIG. 2. Cumulative number of eruptions as a function of time for all volcanoes of the SVZ combined; left: VEI ≥2, right: VEI ≥3.The green line corresponds to an overall constant eruption rate, which presents a relatively good approximation to the data.

FIG. 4 .
FIG. 4. Stationarity check by plotting the 5-point moving repose time averages as a function of time.The red line gives the overall mean 5-point repose time average, the green dashed line denotes two standard deviations from the mean.
FIG.6.Probability (in %) that at least one eruption with VEI ≥2 (left) or with VEI ≥3 (right) will occur in the near future, based on the three different distribution function fits.

Table 1 . continued.
There are several additional eruptions mentioned in different sources, but the some conflict occurs between the sources.These eruptions are listed in the following table, but they were not included in the analysis.