NASA-ames data file production

Real-time and near-real-time data are transferred to the EBAS database through NASA-ames data files (text files with a .nas extension). Extensive information on how to submit data to EBAS is available on the EBAS website, and more specifically for bioaerosols here. An overview of the general submission procedure can be found here.

To simplify this procedure, AutoPollen will request 1) the EBAS station code and 2) the EBAS platform code, which you will need for the metadata in your submission. This will be done after the site has been AutoPollen-certified.

Once these codes have been received, the data transfer protocol must be set up. This page walks through what a NASA-ames file looks like. Data submission to EBAS is mandatory for a site to be AutoPollen-certified. In the example, the fields that should be adapted for each device are shown in bold.

NASA-ames files contain metadata and bioaerosol concentrations for one instrument over time for all taxa measured. 

A Python package for producing the NASA-ames files can be found here.

A template file can be downloaded here, and the standard script for producing a file fitting AutoPollen requirements can be found here.

NASA-ames example

Line number Field Explanation
1 66 1001 Total number of lines in the header (66; here, the number of rows in this table) and NASA-Ames format number (1001; always 1001 for our use). The two values must be separated by a space.
2 Holmes, Sherlock Name of the data originator
3 LO0038D, Holmes, S., 221B Baker Street, London, England Station code, Name and Address of the organisation
4 Watson, John Name of the operator sending data
5 AutoPollen_NRT Name of the project
6 1 1 Volume number and total number of volumes
7 2025 01 01 2025 02 12 File reference date and revision date. The file reference date indicates the start point of the time axis in the file. The time axis is always stated in days and begins at 00 UTC on the file reference date. For example, a time stamp for 5 January 2008, 0 UTC may be stated with a reference date of 2008 01 05 in line 7 and a day of 0 in the data section. The revision date is the date when the file was created or last updated. Both dates are space separated, and stated in the format YYYY MM DD.
8 0.041667 Time interval between measurement start points: for hourly averaged data, it is 1/24 = 0.041667, for 3-hourly data 0.125, and for daily averaged data, it is 1.
9 days from file reference point days from file reference point
10 6 Number of dependent variables (data columns, excluding the starttime column), i.e. the number of columns for bioaerosol taxa concentrations, uncertainties, pressure, humidity, numflag, etc.
11 1 1 1 1 1 1 Scaling factor for each variable (one per dependent variable)
12 99.999999 999.99 9 9.999 9 9.999 Missing value tags. The value corresponding to "end_time of measurement" (first value) must be 99.999999; the value corresponding to "pollen_taxa arithmetic mean" must be 999.99; the value corresponding to "pollen_taxa uncertainty" must be 999.99; the value corresponding to the "numflag" must be 999.
13 end_time of measurement, days from the file reference point Dependent variable column containing the end time of the measurement period, expressed as the number of days from the reference point
14 pollen_alnus, 1/m3, Statistics=arithmetic mean Dependent variable column: depends on which taxa the algorithm detects. To be selected from this list, depending on what your system can measure. Appears as one of the columns in the last header line before the measurements.
15 pollen_alnus, 1/m3, Statistics=uncertainty Dependent variable column: depends on which taxa the algorithm detects. To be selected from this list, depending on what your system can measure. Appears as one of the columns in the last header line before the measurements.
16 pollen_artemisia, 1/m3, Statistics=arithmetic mean Dependent variable column: depends on which taxa the algorithm detects. To be selected from this list, depending on what your system can measure. Appears as one of the columns in the last header line before the measurements.
17 pollen_artemisia, 1/m3, Statistics=uncertainty Dependent variable column: depends on which taxa the algorithm detects. To be selected from this list, depending on what your system can measure. Appears as one of the columns in the last header line before the measurements.
18 flow_rate, l/min, Location=sheath, Matrix=instrument Flow, Pressure, temperature and related flag columns are optional.
19 numflag flow_rate, no unit, Location=sheath, Matrix=instrument Flow, pressure, temperature and related flag columns are optional.
20 numflag flow_rate, no unit, Location=inlet, Matrix=instrument Flow, pressure, temperature and related flag columns are optional.
21 flow_rate, l/min, Location=inlet, Matrix=instrument Flow, pressure, temperature and related flag columns are optional.
22 pressure, hPa, Location=instrument internal, Matrix=instrument Flow, pressure, temperature and related flag columns are optional.
23 numflag pressure, no unit, Location=instrument internal, Matrix=instrument Flow, pressure, temperature and related flag columns are optional.
24 temperature, K, Location=instrument internal, Matrix=instrument Flow, pressure, temperature and related flag columns are optional.
25 numflag temperature, no unit, Location=instrument internal, Matrix=instrument Flow, pressure, temperature and related flag columns are optional.
26 numflag, no unit Dependent variable column. Flags are not implemented yet.
27 0 Number of special comment lines
28 38 Number of comment lines, i.e. the number of lines after this one, up to and including the column names line.
29 Data definition: EBAS_1.1 Data definition
30 Data license: "https://creativecommons.org/licenses/by/4.0/" Data license
31 Citation: Holmes, S. AutoPollen_NRT , data hosted by EBAS at NILU Citation (keep the AutoPollen_NRT , data hosted by EBAS at NILU part)
32 Set type code: TU The dataset type code describes whether the time spacing of the dataset is strictly homogeneous (code "TU", meaning time-series, uniform), or whether the user has to expect gaps and shifts in the timestamps (code "TI", meaning time-series, irregular).
33 Timezone: UTC Timezone
34 File name: FI0038U.20250112160000.20250212141402.nas File name. To be created for each file following this pattern. 
36 File creation: 20250313140823885149 File creation date and time. To be created for each file. Generated automatically by the EBAS Python library.
37 Startdate: 20200101000000 Measurement start date and time. To be created for each file. Generated automatically by the EBAS Python library.
38 Revision date: 20250313140823 Date and time of revision by EBAS.
39 Data level: 1.5 Data level
40 Period code: 1mo Time span covered by the time series contained in the file
41 Resolution code: 1h Interval between start times of samples, e.g. 1h or 3h.
42 Sample duration: 1h Time between start and end of a sample, e.g. 1h or 3h.
43 Station code: LO0038D The station code is a unique identifier of your station in the EBAS database that hosts the WDCA. After registration with GAWSIS, new stations should write an e-mail to ebas@nilu.no to request their station code.
44 Platform code: LO0038S Some stations use several platforms, e.g. a ground station and a boundary layer tower. The platform code is used to distinguish these if present. The "S" in the example stands for "surface", which applies to most stations. After registration with GAWSIS, new stations should write an e-mail to ebas@nilu.no to request their platform code.
45 Station name: Detective & Co Station name, internal
46 Station other IDs: BackerStreet, Anything Station other IDs, internal
47 Station land use: Urban park Station land use, following https://ebas-submit.nilu.no/templates/comments/sl_station_landuse
48 Station setting: Suburban Keyword for station biogeographical zone, following www.eea.europa.eu/data-and-maps/figures/biogeographical-regions-in-europe-2
49 Station latitude: 51.51777926823324 Station latitude
50 Station longitude: -0.15526278686654174 Station longitude
51 Station altitude: 25.0 m Station altitude above sea level
52 Measurement latitude : 51.51777926823324  Instrument latitude
53 Measurement longitude : -0.15526278686654174 Instrument longitude
54 Measurement altitude : 35.0 m Instrument altitude above sea level + above ground
55 Regime: IMG Regime code
56 Component:  Component names are fixed and identify the type of the reported data. Leave empty.
57 Unit: 1/m3 Unit of data for each dependent variable.
58 Matrix: aerosol Atmospheric component sampled
59 Laboratory code:  Station code provided by EBAS.
60 Instrument type: pollen_monitor Instrument type
61 Instrument name: Poleno_Jupiter Instrument name: manufacturer_model
62 Instrument serial number: P200 Instrument serial number (e.g. code of the Poleno)
63 Method ref: ModelA Name of the algorithm used to produce level-1 data from level-0 data
64 Originator: Holmes, Sherlock, 221B Baker Street, London, England, ORCID=0000-0001-9542-4578 Contact of the data originator
65 Submitter: Watson, John, 221B Baker Street, London, England, ORCID=0000-0001-9542-4579 Contact of the submitter
66 starttime endtime pollen_alnus_mean pollen_alnus_uncertainty pollen_artemisia_mean pollen_artemisia_uncertainty flag Column names. Start time and end time are mandatory and must be given in UTC. The data columns pollen_taxa_mean and pollen_taxa_uncertainty vary depending on the identification algorithm used; only the taxon name should be updated. If more than two taxa are detected, add columns and update the number of dependent variables (line 10). Additionally, add a header line for each new data column and update the total number of header lines (line 1).
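The time axis described for header lines 7 and 8 can be sketched in Python as follows. This is only an illustration and not part of the EBAS package: the function names `days_since_reference` and `interval_in_days` are hypothetical, and rounding to six decimals is an assumption chosen to match the 0.041667 value used in the example.

```python
from datetime import date, datetime, timezone

SECONDS_PER_DAY = 86400.0


def days_since_reference(timestamp: datetime, reference: date) -> float:
    """Offset of `timestamp` from 00 UTC on the file reference date, in days."""
    if timestamp.tzinfo is None:
        raise ValueError("timestamp must be timezone-aware (UTC expected)")
    # The time axis starts at 00 UTC on the reference date (header line 7).
    origin = datetime(reference.year, reference.month, reference.day,
                      tzinfo=timezone.utc)
    # Six decimals is an assumption, matching 1/24 = 0.041667 in the example.
    return round((timestamp - origin).total_seconds() / SECONDS_PER_DAY, 6)


def interval_in_days(hours: float) -> float:
    """Time interval for header line 8, expressed in days."""
    return round(hours / 24.0, 6)


reference = date(2025, 1, 1)
start = datetime(2025, 1, 1, 0, 0, tzinfo=timezone.utc)
end = datetime(2025, 1, 1, 1, 0, tzinfo=timezone.utc)

print(days_since_reference(start, reference))  # 0.0
print(days_since_reference(end, reference))    # 0.041667
print(interval_in_days(3))                     # 0.125
```

The start and end values computed this way correspond to the starttime and endtime columns of one hourly sample in the data section.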
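Several header fields are counts that must agree with the rest of the header: the total number of header lines (line 1), the number of dependent variables (line 10), the scaling factors (line 11), the missing value tags (line 12), and the numbers of special and normal comment lines. The sketch below checks these counts on a header given as a list of strings. It assumes the usual NASA-Ames 1001 layout, in which the header holds 12 fixed lines, one line per dependent variable, and two comment counts each followed by its comment lines. The helper `check_header_counts` and the shortened `EXAMPLE_HEADER` are hypothetical and are not part of the EBAS package.

```python
def check_header_counts(header: list[str]) -> list[str]:
    """Return a list of problems; an empty list means the counts agree."""
    if len(header) < 14:
        return ["header is too short to be a NASA-Ames 1001 header"]
    problems = []
    first = header[0].split()
    if len(first) != 2:
        return ["line 1 must hold two space-separated integers"]
    nlhead, ffi = int(first[0]), int(first[1])
    if ffi != 1001:
        problems.append(f"format number is {ffi}, expected 1001")
    if nlhead != len(header):
        problems.append(f"line 1 declares {nlhead} lines, found {len(header)}")
    nv = int(header[9])  # line 10: number of dependent variables
    for line_no, label in ((11, "scaling factors"), (12, "missing value tags")):
        count = len(header[line_no - 1].split())
        if count != nv:
            problems.append(f"line {line_no} has {count} {label}, expected {nv}")
    try:
        # Variable names occupy lines 13 .. 12 + nv; the counts follow them.
        nscoml_idx = 12 + nv
        nscoml = int(header[nscoml_idx])
        nncoml_idx = nscoml_idx + 1 + nscoml
        nncoml = int(header[nncoml_idx])
    except (IndexError, ValueError):
        problems.append("comment line counts not found where line 10 implies")
        return problems
    expected = nncoml_idx + 1 + nncoml
    if expected != nlhead:
        problems.append(f"counts imply {expected} header lines, line 1 says {nlhead}")
    return problems


# Shortened, made-up header: one taxon column and two comment lines.
EXAMPLE_HEADER = [
    "18 1001",
    "Holmes, Sherlock",
    "LO0038D, Holmes, S., 221B Baker Street, London, England",
    "Watson, John",
    "AutoPollen_NRT",
    "1 1",
    "2025 01 01 2025 02 12",
    "0.041667",
    "days from file reference point",
    "2",
    "1 1",
    "99.999999 999.99",
    "end_time of measurement, days from the file reference point",
    "pollen_alnus, 1/m3, Statistics=arithmetic mean",
    "0",
    "2",
    "Data definition: EBAS_1.1",
    "starttime endtime pollen_alnus_mean",
]

print(check_header_counts(EXAMPLE_HEADER))  # []
```

Running such a check after adding or removing a taxon column is a quick way to catch a count that was not updated.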