Real-time and near-real-time data are transferred to the EBAS database as NASA-Ames data files (text files with a .nas extension). Detailed instructions on how to submit data to EBAS are given on the EBAS website, and more specifically for bioaerosols here. An overview of the general submission procedure can be found here.
To simplify this procedure, AutoPollen will request 1) the EBAS station code and 2) the EBAS platform code, which you will need for the metadata in your submission. This will be done after the site has been AutoPollen-certified.
Once these codes have been received, the data transfer protocol must be set up. This page offers a walk-through of what a NASA-Ames file looks like. Data submission to EBAS is mandatory for a site to be AutoPollen-certified. The fields shown in bold in the example are those that should be adapted for each device.
NASA-Ames files contain metadata and bioaerosol concentrations for one instrument over time, for all taxa measured.
A Python package for producing NASA-Ames files can be found here.
A template file can be downloaded here, and the standard script for producing a file fitting AutoPollen requirements can be found here.
NASA-Ames example
Line number | Field | Explanation |
1 | 66 1001 | Total number of lines in the header (66; here, concretely, the number of rows in this table) and the NASA-Ames format number (1001; always 1001 for our use). The two numbers must be separated by a space. |
2 | Holmes, Sherlock | Name of the data originator |
3 | LO0038D, Holmes, S., 221B Baker Street, London, England | Station code, Name and Address of the organisation |
4 | Watson, John | Name of the operator sending data |
5 | AutoPollen_NRT | Name of the project |
6 | 1 1 | Volume number and total number of volumes |
7 | 2025 01 01 2025 02 12 | File reference date and revision date. The file reference date indicates the start point of the time axis in the file. The time axis is always stated in days and begins at 00 UTC on the file reference date. For example, a time stamp for 5 January 2008, 0 UTC may be stated with a reference date of 2008 01 05 in line 7 and a day of 0 in the data section. The revision date is the date when the file was created or last updated. Both dates are space separated, and stated in the format YYYY MM DD. |
8 | 0.041667 | Time interval between measurement start points: for hourly averaged data, it is 1/24 = 0.041667, for 3-hourly data 0.125, and for daily averaged data, it is 1. |
9 | days from file reference point | days from file reference point |
10 | 6 | Number of dependent variables (data columns, excluding the starttime column), i.e. the number of columns for bioaerosol taxa concentrations, uncertainty, pressure, humidity, numflag, etc. |
11 | 1 1 1 1 1 1 | Scaling factor for each variable (one per dependent variable) |
12 | 99.999999 999.99 9 9.999 9 9.999 | Missing value tags. The value for "end_time of measurement" (first value) must be 99.999999; the value for "pollen_taxa arithmetic mean" must be 999.99; the value for "pollen_taxa uncertainty" must be 999.99; the value for "numflag" must be 999. |
13 | end_time of measurement, days from the file reference point | Dependent variable column containing the end time of the measurement period, expressed as the number of days from the file reference point |
14 | pollen_alnus, 1/m3, Statistics=arithmetic mean | Dependent variable column: depends on which taxa the algorithm detects. To be selected from this list, depending on what your system can measure. Will be one of the columns in the last line before the measurements. |
15 | pollen_alnus, 1/m3, Statistics=uncertainty | Dependent variable column: depends on which taxa the algorithm detects. To be selected from this list, depending on what your system can measure. Will be one of the columns in the last line before the measurements. |
16 | pollen_artemisia, 1/m3, Statistics=arithmetic mean | Dependent variable column: depends on which taxa the algorithm detects. To be selected from this list, depending on what your system can measure. Will be one of the columns in the last line before the measurements. |
17 | pollen_artemisia, 1/m3, Statistics=uncertainty | Dependent variable column: depends on which taxa the algorithm detects. To be selected from this list, depending on what your system can measure. Will be one of the columns in the last line before the measurements. |
18 | flow_rate, l/min, Location=sheath, Matrix=instrument | Flow, pressure, temperature and related flag columns are optional. |
19 | numflag flow_rate, no unit, Location=sheath, Matrix=instrument | Flow, pressure, temperature and related flag columns are optional. |
20 | numflag flow_rate, no unit, Location=inlet, Matrix=instrument | Flow, pressure, temperature and related flag columns are optional. |
21 | flow_rate, l/min, Location=inlet, Matrix=instrument | Flow, pressure, temperature and related flag columns are optional. |
22 | pressure, hPa, Location=instrument internal, Matrix=instrument | Flow, pressure, temperature and related flag columns are optional. |
23 | numflag pressure, no unit, Location=instrument internal, Matrix=instrument | Flow, pressure, temperature and related flag columns are optional. |
24 | temperature, K, Location=instrument internal, Matrix=instrument | Flow, pressure, temperature and related flag columns are optional. |
25 | numflag temperature, no unit, Location=instrument internal, Matrix=instrument | Flow, pressure, temperature and related flag columns are optional. |
26 | numflag, no unit | Dependent variable column. Flags are not implemented yet. |
27 | 0 | Number of special comment lines |
28 | 38 | Number of comment lines, i.e. the number of lines after this line, including the column names. |
29 | Data definition: EBAS_1.1 | Data definition |
30 | Data license: "https://creativecommons.org/licenses/by/4.0/" | Data license |
31 | Citation: Holmes, S. AutoPollen_NRT , data hosted by EBAS at NILU | Citation (keep the AutoPollen_NRT , data hosted by EBAS at NILU part) |
32 | Set type code: TU | The dataset type code describes whether the time spacing of the dataset is strictly homogeneous (code "TU", meaning time series, uniform), or whether the user has to expect gaps and shifts in the timestamps (code "TI", meaning time series, irregular). |
33 | Timezone: UTC | Timezone |
34 | File name: FI0038U.20250112160000.20250212141402.nas | File name. To be created for each file following this pattern. |
36 | File creation: 20250313140823885149 | File creation date and time. To be created for each file. Generated automatically by the EBAS Python library. |
37 | Startdate: 20200101000000 | Measurement start date and time. To be created for each file. Generated automatically by the EBAS Python library. |
38 | Revision date: 20250313140823 | Date and time of revision by EBAS. |
39 | Data level: 1.5 | Data level |
40 | Period code: 1mo | Time span covered by the time series contained in the file |
41 | Resolution code: 1h | Interval between start times of samples, e.g. 1h or 3h. |
42 | Sample duration: 1h | Time between start and end of a sample, e.g. 1h or 3h. |
43 | Station code: LO0038D | The station code is a unique identifier of your station in the EBAS database that hosts the WDCA. After registration with GAWSIS, new stations should write an e-mail to ebas@nilu.no to request their station code. |
44 | Platform code: LO0038S | Some stations use several platforms, e.g. a ground station and a boundary layer tower. The platform code is used to distinguish these if present. The "S" in the example stands for "surface", which applies to most stations. After registration with GAWSIS, new stations should write an e-mail to ebas@nilu.no to request their platform code. |
45 | Station name: Detective & Co | Station name, internal |
46 | Station other IDs: BackerStreet, Anything | Station other IDs, internal |
47 | Station land use: Urban park | Station land use, following https://ebas-submit.nilu.no/templates/comments/sl_station_landuse |
48 | Station setting: Suburban | Keyword for station biogeographical zone, following www.eea.europa.eu/data-and-maps/figures/biogeographical-regions-in-europe-2 |
49 | Station latitude: 51.51777926823324 | Station latitude |
50 | Station longitude: -0.15526278686654174 | Station longitude |
51 | Station altitude: 25.0 m | Station altitude above sea level |
52 | Measurement latitude : 51.51777926823324 | Instrument latitude |
53 | Measurement longitude : -0.15526278686654174 | Instrument longitude |
54 | Measurement altitude : 35.0 m | Instrument altitude above sea level (station altitude plus instrument height above ground) |
55 | Regime: IMG | Regime code |
56 | Component: | Component names are fixed and identify the type of the reported data. Leave empty. |
57 | Unit: 1/m3 | Unit of data for each dependent variable. |
58 | Matrix: aerosol | Atmospheric component sampled |
59 | Laboratory code: | Station code provided by EBAS. |
60 | Instrument type: pollen_monitor | Instrument type |
61 | Instrument name: Poleno_Jupiter | Instrument name: manufacturer_model |
62 | Instrument serial number: P200 | Instrument serial number (e.g. code of the Poleno) |
63 | Method ref: ModelA | Name of the algorithm used to produce level-1 data from level-0 data |
64 | Originator: Holmes Sherlock, 221B Baker Street, London, England, ORCID=0000-0001-9542-4578 | Contact of the data originator |
65 | Submitter: Watson, John, 221B Baker Street, London, England, ORCID=0000-0001-9542-4579 | Contact of the submitter |
66 | starttime endtime pollen_alnus_mean pollen_alnus_uncertainty pollen_artemisia_mean pollen_artemisia_uncertainty flag | Column names. Start time and end time are mandatory and must be given in UTC. The data columns pollen_taxa_mean and pollen_taxa_uncertainty vary depending on the identification algorithm used; only the taxa name should be changed. If more than two taxa are detected, add the corresponding columns and update the number of data columns, excluding the starttime column (line 10). Also add one header line for each new data column, and update the total number of header lines (line 1). |
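The time-axis rules from lines 7 and 8 of the table, and the header count declared on line 1, can be sketched in Python as follows. This is only a minimal illustration using the standard library, not the EBAS Python package: the function names days_from_reference, interval_in_days and split_header are hypothetical, and the dates are example values.

```python
from datetime import datetime, timedelta, timezone

SECONDS_PER_DAY = 86400


def days_from_reference(moment, reference_date):
    """Offset of `moment` from 00:00 UTC on the file reference date, in days."""
    return (moment - reference_date).total_seconds() / SECONDS_PER_DAY


def interval_in_days(step):
    """Value for header line 8: the sampling step in days, to 6 decimals."""
    return round(step.total_seconds() / SECONDS_PER_DAY, 6)


def split_header(lines):
    """Split a file into header and data using the count declared on line 1."""
    declared = int(lines[0].split()[0])  # first number on line 1, e.g. 66
    return lines[:declared], lines[declared:]


# Assumed example: file reference date 2025 01 01, hourly averaged data.
reference = datetime(2025, 1, 1, tzinfo=timezone.utc)
start = datetime(2025, 1, 1, 1, 0, tzinfo=timezone.utc)
end = start + timedelta(hours=1)

# Time interval for line 8 of the header.
print(interval_in_days(timedelta(hours=1)))

# Start and end time columns of the first data line, in days.
print(f"{days_from_reference(start, reference):.6f} "
      f"{days_from_reference(end, reference):.6f}")
```

Using timezone-aware datetimes in UTC throughout avoids accidental local-time offsets when computing the day fractions.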