%%{init: { 'logLevel': 'debug', 'theme': 'base' } }%% graph LR waveform[Get Data]:::active --> correlate(Correlation) correlate -->|save| corrdb[(CorrDB/hdf5)] corrdb --> monitor monitor[Measure dv] -->|save| dv{{DV}} click waveform "../trace_data.html" "trace_data" click correlate "../correlate.html" "correlate" click monitor "../monitor.html" "monitor" click corrdb "../corrdb.html" "CorrDB" click dv "../monitor/dv.html" "DV" classDef active fill:#f666, stroke-width:4px, stroke:#f06;
Module to Download and Read Waveform Data#
The waveform module contains functions and classes that allow to read waveforms from a local database
or download waveform data from FDSN servers by using the Obspy Mass Downloader.
The downloaded data will then be written in a seiscom data structure (SDS).
Both data download and management of the raw database are handled using a Store_Client
object.
Download Data#
The recommended way to download continuous data for noise correlations is the method
download_waveforms_mdl()
. To use this method, we first need to initialise a Store_Client
object.
The code block below shows and example for downloading data from the IU.HRV station.
1from obspy.clients.fdsn import Client
2from obspy import UTCDateTime
3
4from seismic.trace_data.waveform import Store_Client
5
6root = '/path/to/project/'
7starttime = UTCDateTime(year=1990, julday=1)
8endtime = UTCDateTime.now()
9network = 'IU'
10station = 'HRV'
11
12# Note that you coul initiate the client object with
13# an Eida token if you should wish to download
14# restricted data.
15c = Client()
16sc = Store_Client(c, root, read_only=False)
17sc.download_waveforms_mdl(
18 starttime, endtime, clients=[c], network=network,
19 station=station, location='*', channel='*')
Note
Instead of defining station and network codes, we could have used geographical coordinates
to decide which stations to download. To do that, define the parameters minlon
, maxlon
,
minlat
, and maxlat
.
Note
download_waveforms_mdl()
requires a list as input
of clients.
Access Data#
Note
If you just want to compute some noise correlations (i.e., Green’s function estimations), you might not need this section.
Now, that we downloaded some waveform data, we might want to access it. This can as well be realised
using Store_Client
. In general, seismic data comes in two parts:
The waveform data in .mseed format and the Station Response information in XML format.
Store_Client
comes with several methods to fetch the two. There
are methods that will only read local data and methods that will check whether data is available locally
and, if not, download them from remote (i.e., the FDSN server that we defined when initialising the
Store_Client
object).
- To access only already available data use:
response information:
read_inventory()
to read all available response information.inventory
to access the pythonic inventory object (see obspy’s documentation for available methods)
waveform data:
_load_local()
to load local data
- If you would like to download missing data, use the following methods:
Response Information:
waveform data:
Getting an overview of your database#
Once your database grows in size, you might not exactly know anymore, which data you have available. There are a couple of functions to help you:
get_available_stations()
returns all codes of stationsfor which data is available (a network may or may not be defined).
_get_times()
returns the earliest and latest availablestarttimes for a certain station.
Store_Client_compute_spectrogram()
can be used to examine the spectral content of thedata. Check out the tutorial for details on how to do that.
Feed in Data in a Different Way#
There may be scenarious, in which you will not need or won’t be able to download waveform data from an FDSN server. In such cases, it is easy to use your own mseed data to “mimic” SeisMIC’s file system structure.
You will need daily mseed files for each component of the seismometer. If you need to convert/merge/split, you files we recommend using PyRocko.
Now you will have to sort your mseed files in the following way:
path/to/project/mseed/{year}/{network}/{station}/{channel}.{sds_type}/{network}.{station}.{location}.{channel}.{sds_type}.{year}.{doy:03d}
Where sds_type
almost always corresponds to D (i.e., data). doy:03d
is the “day of year”/Julian day in three digits.
For example path/to/project/mseed/2010/IU/HRV/BHZ.D/IU.HRV.00.BHZ.D.2010.001 is the mseed file corresponding to a waveform recording from station IU.HRV channel BHZ on January 1st 2010.
When using PyRocko, you can use the following command to convert your data to the correct directory structure, daily length and format:
$ squirrel jackseis --add /path/to/your/input/waveforms/ --out-sds-path myproject/mseed/ --out-format mseed
# use this command to open the help page and see other options
$ squirrel jackseis --help
Note
The mseed files do not have file endings.
Note
Files that do not correspond to this format won’t be found by SeisMIC. Pay particular attention to always saving the mseed files for days 1 to 99 with leading zeros (i.e., day 1 has the file ending 001)!
Station Inventories#
You might require response files either to provide station coordinates or to provide reponse information. SeisMIC reads StationXML files these are saved in /path/to/project/inventory/{network}.{station}.xml. There should be exactly one file per station! If you use SeisMIC’s preimplemented methods to download data, station information will be downloaded automatically. Otherwise, add it manually.