MLOC Station Data

Station Data

Management of information on seismograph stations, especially the geographic coordinates to be associated with a specific station code, is a critical aspect of all earthquake location algorithms, but it is often especially complex for the kinds of relocation studies undertaken with mloc. Routine earthquake location is typically conducted (e.g., by a local or regional network) with a small, stable set of seismograph stations. The U.S. Geological Survey’s National Earthquake Information Center (NEIC) uses data from thousands of stations in its global monitoring, but they are all real-time digital stations that report their metadata (including station coordinates) automatically. Even the International Seismological Centre (ISC), which collects the most complete and diverse set of arrival time data globally, uses data only from stations that have been registered in the International Registry of Seismograph Stations (IR), which it maintains.

In contrast, mloc is designed with the philosophy that all possible data should be usable. A successful calibration analysis often depends on combining data from multiple sources, including temporary deployments and data from local and regional networks that are not normally made available to any openly accessible database. In such datasets, station information may come in a variety of formats and, most importantly, station code conflicts are common. mloc incorporates tools, reporting and logic to assist in managing such issues, but the user must become familiar with a number of subtle aspects of the subject in order to ensure full and correct use of the available arrival time data. Some important aspects are:

  • Station codes may have up to 5 characters
  • Case-sensitive station codes are supported (but not recommended)
  • Operational epoch is supported
  • Authorship, i.e., the source of the coordinates, is supported
  • The concepts of “agency” and “deployment” are supported to help resolve station code conflicts

The most important rule to appreciate regarding station code information in mloc is this:

The master station file contains entries only for seismograph stations that have been registered with the International Registry of Seismograph Stations (IR). Station information for unregistered stations is carried in one or more supplemental station files.

The master station file is always read by mloc. Supplemental files are usually referenced in a command file with the sstn command, but they can be declared interactively as well.

A limit (currently 24,000) for the total number of stations that can be defined is carried in mloc.inc. The master station file presently contains 21,614 entries, leaving 2,386 entries for supplemental station files. A maximum of 8 supplemental station files can be referenced.

The remainder of this section deals with the master station file. Supplemental file formats are described elsewhere. See also the discussion of how to deal with various problems associated with station codes.

The New IASPEI Station Coding Standard (ADSLC)

In 2013 IASPEI endorsed a new standard for identifying seismograph stations. A document describing that standard is available at the ISC website. This standard is often referred to as the ADSLC standard, after the initials of its five defined fields:

agency.deployment.station.location.channel

that take the place of the traditional station code. The MNF data format that is used for event data files in mloc has fields for all five elements of the ADSLC standard, but the agency, deployment and station fields are the most important for relocation analysis.

It is important to understand that a given station can have multiple ADSLC codes, governed by a complex system of aliases and umbrella designations, some of which are mentioned below. This unfortunate complexity is necessary to handle the multiple ownership and operational affiliations that exist for many seismograph stations.

Unfortunately, little progress has been made in actually supporting the ADSLC standard at the ISC or most other seismological agencies. Specifically, neither arrival time datasets downloaded from the ISC Bulletin nor station information datasets downloaded from the IR carry information concerning agency or deployment. On the other hand, by definition, any arrival time data downloaded from the ISC Bulletin may be legitimately characterized with the agency code “ISC” and deployment code “IR”.

The ADSLC standard provides for a maximum of five characters for the agency and station fields, and eight characters for the deployment field. These field lengths are supported in the MNF format. Although earlier versions of mloc supported 6-character station codes as a mechanism for resolving station code conflicts, with v10.4.0 station codes are limited to 5 characters and conflicts must be resolved with another mechanism, e.g., agency and deployment codes.
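
The field structure and length limits described above can be illustrated with a short sketch. This is a hypothetical Python helper, not part of mloc; the names Adslc and parse_adslc are invented for illustration, and writing a code as a single dot-separated string is an assumption based on the field listing above. Only the agency, deployment and station limits are stated in the text, so the location and channel fields are not checked.

```python
from dataclasses import dataclass

# Maximum lengths stated in the text. Limits for location and channel are
# not given there, so this sketch leaves those two fields unchecked.
MAX_LEN = {"agency": 5, "deployment": 8, "station": 5}


@dataclass(frozen=True)
class Adslc:
    """Hypothetical container for the five ADSLC fields."""
    agency: str
    deployment: str
    station: str
    location: str = ""
    channel: str = ""


def parse_adslc(code: str) -> Adslc:
    """Split 'agency.deployment.station.location.channel' into its fields."""
    parts = code.split(".")
    if len(parts) != 5:
        raise ValueError(f"expected 5 dot-separated fields, got {len(parts)}")
    fields = Adslc(*parts)
    for name, limit in MAX_LEN.items():
        value = getattr(fields, name)
        if len(value) > limit:
            raise ValueError(f"{name} '{value}' exceeds {limit} characters")
    return fields


# Station ALE with the ISC/IR characterization; location and channel are blank.
example = parse_adslc("ISC.IR.ALE..")
print(example.agency, example.deployment, example.station)
```

A 6-character station code, such as those accepted by earlier versions of mloc, would be rejected by this check.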

It is also important to understand that, although MNF format supports the ADSLC formulation, it is usually best not to encode arrival time data files (.mnf files) with that information as a matter of course, even if it is available. mloc processes most arrival time datasets more efficiently if the agency and deployment fields in the phase lines of the .mnf files are left blank. In other words, situations requiring the use of agency and deployment fields are rare and matching stations solely by station code works fine most of the time. Therefore it is recommended to use agency and deployment codes only when they actually solve a problem and then they should be applied only to the phase readings that require them.

NEIC Station Metadata

One organization that does support the ADSLC, in a sense, is the NEIC, which maintains a database of station metadata that includes the FDSN network codes. As can be seen in the above-referenced defining document, stations belonging to FDSN-affiliated networks may be legitimately characterized in the ADSLC formulation with agency code “FDSN” and a deployment code taken from the FDSN network code, a two-character alphanumeric code. NEIC processing identifies stations by the combination of station code and FDSN network code.

Complexities arise, however, for several reasons:

  • Some stations in the NEIC metadata database are not registered with the IR
  • Some of those unregistered stations in the NEIC metadata database have station codes that conflict with the codes of stations (in other places) that are registered at the IR
  • Some stations in the NEIC metadata database have been registered at the IR, but with station codes that have been modified, typically by adding an extra character to the code

A recent download of the NEIC station metadata file is included in the mloc installation, and the command nsmd can be used to recover data from it for a supplemental station file for stations that fail to match the master station file.

Operational Epoch

The period of time during which a set of coordinates is valid for a given station code is called the operational epoch. The mloc master station file format supports operational epoch, as do several of the supplemental station file formats. It is defined by two variables, date_on and date_off, each given as a 7-digit integer composed of the year (left-most four digits) and Julian date, i.e., day of the year (right-most three digits). For example, 1982032 denotes day 032 of 1982 (February 1). Either one or both may be left blank, meaning that the epoch is open-ended on that side.
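
The date encoding and the epoch test can be sketched in a few lines of Python. This is an illustration only, not code taken from mloc: the function names parse_epoch_date and in_epoch are hypothetical, and treating both date_on and date_off as inclusive bounds is an assumption.

```python
from datetime import date, datetime
from typing import Optional


def parse_epoch_date(field: str) -> Optional[date]:
    """Convert a yyyyddd field to a date; a blank field returns None."""
    field = field.strip()
    if not field:
        return None
    if len(field) != 7 or not field.isdigit():
        raise ValueError(f"expected 7 digits (yyyyddd), got '{field}'")
    # %Y = four-digit year, %j = day of the year (001-366)
    return datetime.strptime(field, "%Y%j").date()


def in_epoch(when: date, date_on: str, date_off: str) -> bool:
    """True if 'when' lies within the epoch (bounds assumed inclusive)."""
    start = parse_epoch_date(date_on)
    end = parse_epoch_date(date_off)
    if start is not None and when < start:
        return False
    if end is not None and when > end:
        return False
    return True


print(parse_epoch_date("1982032"))                     # 1982-02-01
print(in_epoch(date(1982, 2, 15), "1982032", "1982060"))  # True
print(in_epoch(date(2020, 1, 1), "1990050", ""))          # True, open-ended
```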

Seeding of the operational epochs for many codes in the master station file was based on an analysis of data holdings at Lawrence Livermore National Laboratory by Steve Myers, supplemented by information acquired by Bob Engdahl. I have found that there are many errors in these epoch data, especially in the date_off field, such that mloc often reports a “date range” failure when processing actual arrival time data against the master station file. In the case of erroneous date_off entries, the problem appears to be that the field was filled with the date of the most recent data holding at that time, although most stations have continued to operate past that point; the correct course of action is therefore to delete the date_off entry completely. In the case of violations of the date_on entry, the preferred course of action is usually to update it to the earliest date of the available data.

The definition of the SEISAN station file format (isstn = 2) has been extended to include optional date_on and date_off fields, because this format is often used for temporary networks that had a limited period of operation, and the station codes for such deployments often conflict with registered stations.

Authorship

The notion of authorship (of station information) has been added to the master station data file to help document discrepancies in reported station coordinates that are sometimes encountered. The author of a set of coordinates is identified in a character variable with up to 8 characters. This use of “authorship” is distinct from authorship for hypocenters and authorship of phase readings in the MLOC Native Format for arrival time data. There are no standard definitions for these authorship entries, but I have used “IR” as the code for entries downloaded from the International Registry, the base set of coordinate information for the master station file. Definitions of authorship codes are carried at the beginning of the master station file in lines that begin with ‘#’ in column 1 (comment lines).

A standard variation on authorship codes in the master station file is the use of the ‘+’ character to indicate a different author for the information on station elevation and depth of burial (if given) than the author appropriate for the latitude and longitude. This is useful because the IR in many cases does not carry any information on station elevation, and in many other cases it is quite inaccurate. An easy way to rectify such lapses is to enter the coordinates in Google Earth and take the elevation from that, in which case the authorship code would be amended from “IR” to “IR+GE”. In some cases, where the seismic vault can be clearly identified in Google Earth, I have used the authorship code “EAB+GE” to indicate that I specified all coordinates of the station using Google Earth.
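
The ‘+’ convention can be unpacked mechanically, as in the following minimal Python sketch. The function name split_author is hypothetical, and treating a code without ‘+’ as a single author for all coordinates is an assumption consistent with the description above.

```python
from typing import Tuple


def split_author(author: str) -> Tuple[str, str]:
    """Return (author of latitude/longitude, author of elevation/depth)."""
    horizontal, plus, vertical = author.strip().partition("+")
    # Without a '+', a single author is taken to cover all coordinates.
    return (horizontal, vertical if plus else horizontal)


print(split_author("IR+GE"))  # ('IR', 'GE')
print(split_author("IR"))     # ('IR', 'IR')
```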

Master Station File

The default set of station information for mloc is contained in the master station list, with the pathname /tables/stn/master_stn.dat relative to the mloc working directory. The filename can be changed in the mloc configuration file. The master list contains only stations that have been registered at the IR, but the coordinates in some cases have been revised, for two main reasons:

  • In some cases stations have been moved without changing the station code. The format includes the concept of epochs that define when a station occupied the given location. Therefore it is necessary to check the date for which coordinates of a given station code are needed to ensure that the correct coordinates are used.
  • The coordinates carried in the IR are sometimes found to be in error, for example, by investigation with Google Earth.

When coordinates are modified, the original entry (i.e., the data from the IR) is not deleted. The new entry is placed above the original one in the list; mloc selects the first instance it encounters. The format used for the master station list includes the concept of an author for the station coordinates (eight characters) and this is used to annotate the source of the preferred coordinates.
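
The first-match rule, combined with the operational epoch, can be sketched as follows. This is a simplified Python illustration and not mloc's actual logic, which also takes agency and deployment into account; the names Entry, yyyyddd and first_match are hypothetical, and the epoch bounds are assumed to be inclusive. The two ALE entries are taken from the master station file extract in the Format section.

```python
from datetime import date
from typing import NamedTuple, Optional, Sequence


class Entry(NamedTuple):
    code: str
    lat: float
    lon: float
    author: str
    date_on: Optional[int]   # yyyyddd, or None if blank
    date_off: Optional[int]  # yyyyddd, or None if blank


def yyyyddd(d: date) -> int:
    """Encode a date as year * 1000 + day of the year."""
    return d.year * 1000 + d.timetuple().tm_yday


def first_match(entries: Sequence[Entry], code: str, when: date) -> Optional[Entry]:
    """Return the first entry, in file order, valid for 'code' on 'when'."""
    key = yyyyddd(when)
    for entry in entries:
        if entry.code != code:
            continue
        if entry.date_on is not None and key < entry.date_on:
            continue
        if entry.date_off is not None and key > entry.date_off:
            continue
        return entry
    return None


# Entries in file order: the preferred (revised) entry comes first.
entries = [
    Entry("ALE", 82.4833, -62.4000, "ERE", 1961272, 1990049),
    Entry("ALE", 82.5033, -62.3500, "IR", 1990050, None),
]

print(first_match(entries, "ALE", date(1985, 6, 1)).author)  # ERE
print(first_match(entries, "ALE", date(2000, 1, 1)).author)  # IR
print(first_match(entries, "ALE", date(1950, 1, 1)))         # None
```

Because the search stops at the first valid entry, the order of entries in the file determines which coordinates are preferred when epochs overlap.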

New stations are regularly registered with the IR, so the master station file must be updated manually when a dataset includes IR-registered stations that are not in the current master list. This seldom involves more than a handful of stations for any one cluster.

Format

The format used for the master station file has changed several times during development of mloc. The current master station file format was introduced with v10.0.0, released on April 28, 2014. mloc identifies station file formats by an integer (isstn) in the first column of the first row of the file. The rest of the first row can be used for a comment about the dataset. The master station file format is identified by isstn = 0. This format can be used for supplemental station files too.

Master Station File Format (isstn = 0)
Columns  Description          Format
1:5      Station code         a5
7:15     Latitude             f9.5
17:26    Longitude            f10.5
28:32    Elevation, m         i4
34:37    Depth of burial, m   i4
39:46    Author               a8
48:52    Agency               a5
54:61    Deployment           a8
63:64    Location             a2
66:72    Date_on              i7
74:80    Date_off             i7
82:      Station name         no limit

Any line with a hash character (#) in column 1 is treated as a comment line. The following extract from the master station file illustrates several aspects of the format.

0 MLOC master station list
# ERE = Bob Engdahl
# GE = Google Earth
# IR = International Registry of Seismograph Stations <http://www.isc.ac.uk/registries/>
ALCN   40.55       0.48      177      IR       ISC  .IR      .                   Alcanar, Spain
ALCS   37.25417   -3.54389  1489      IR+GE    ISC  .IR      .   1982032 1982060 Alfacar, Spain
ALCT   47.6475  -122.0370     55      IR       ISC  .IR      .   2000188         Alcott Elementary School, Washington, U.S.A.
ALD    45.8194  -120.0670    427      IR       ISC  .IR      .   1975305         Alter Ridge, Washington, U.S.A.
ALDR   58.61000  125.40944   682      IR       ISC  .IR      .   1970001         Aldan, Sakha, Russia
ALE    82.4833   -62.4000     65      ERE      ISC  .IR      .   1961272 1990049 Alert, Northwest Territories, Canada
ALE    82.5033   -62.3500     65      IR       ISC  .IR      .   1990050         Alert, Northwest Territories, Canada

The elevation for station ALCS has been determined from Google Earth. The station coordinates and operational epoch in the IR entry for station ALE have been superseded by information provided by Bob Engdahl (ERE). Dots are added in the appropriate columns to distinguish the fields for agency, deployment and location for legibility.
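
The column layout of the isstn = 0 format can be read by simple slicing, as in the following minimal Python sketch. It is an illustration under stated assumptions, not mloc's own reader: the names FIELDS, parse_station_line and build_line are hypothetical, all fields are returned as stripped text, and the caller is assumed to skip the first row of the file (the isstn line). The legibility dots fall in the separator columns 53 and 62, so slicing ignores them.

```python
from typing import Dict, Optional

# (name, first column, last column), 1-indexed and inclusive, following the
# column ranges of the master station file format (isstn = 0).
FIELDS = [
    ("code", 1, 5), ("lat", 7, 15), ("lon", 17, 26), ("elev", 28, 32),
    ("depth", 34, 37), ("author", 39, 46), ("agency", 48, 52),
    ("deployment", 54, 61), ("location", 63, 64),
    ("date_on", 66, 72), ("date_off", 74, 80),
]


def parse_station_line(line: str) -> Optional[Dict[str, str]]:
    """Slice one record into stripped text fields; None for comment lines."""
    if line.startswith("#"):
        return None
    record = {name: line[first - 1:last].strip() for name, first, last in FIELDS}
    record["name"] = line[81:].strip()  # column 82 onward
    return record


def build_line(code, lat, lon, elev, author, agency, deployment,
               date_on, date_off, name):
    """Write a record in the same columns (depth and location left blank)."""
    return (f"{code:<5} {lat:9.5f} {lon:10.5f} {elev:5d} {'':4} {author:<8} "
            f"{agency:<5}.{deployment:<8}.{'':2} {date_on:>7} {date_off:>7} {name}")


line = build_line("ALE", 82.4833, -62.4, 65, "ERE", "ISC", "IR",
                  "1961272", "1990049", "Alert, Northwest Territories, Canada")
record = parse_station_line(line)
print(record["code"], float(record["lat"]), float(record["lon"]),
      record["author"], record["date_on"], record["date_off"])
```

Building the test record with build_line rather than pasting a line from the file keeps the sketch independent of how whitespace survives copying.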