NASA - National Aeronautics and Space Administration
NASA Tropospheric Chemistry Integrated Data Center

 

ICARTT Data Format

Jim Crawford; LaRC/NASA
Ali Aknan; LaRC/NASA
Eric Williams; ESRL/NOAA
Hans Schlager; DLR

(Last Modified: 05 May 2009)




ICARTT Data File Format

In large studies such as ICARTT 2004 and MILAGRO 2006, there are many different types of data collected. Many data sets are simply straight time series with one or a number of parameters being measured sequentially (and simultaneously) in time. However, there are some data sets that are truly multi-dimensional in that a sample will be taken by an instrument at a single point in time and a number of parameters will be measured on that sample simultaneously. An example is wind profiler data in which a 30-minute averaged sample taken at some time period will be binned into height information and at each height will be wind speed, wind direction, and temperature. Another, more extreme, example is output from 3-dimensional models. Data such as these clearly cannot be represented as a single time series. Sections 1-2 below outline the ICARTT format for all types of data, with an emphasis on standard time-series types of data. Section 3 is specific to standard time-series data, and Section 4 covers multi-dimensional data. Section 5 offers guidance for non-standard time-series data.

Though adapted from the NASA Ames data format, the ICARTT data format will have no restriction on the number of characters per line or on the number of characters per record. The file name will be limited to 127 characters in length.

1. Requirements for data files

A. Time information

The philosophy here is that the data in the files must possess at least the minimum amount of accompanying information to uniquely identify each data point - this generally means time and location information. Moreover, the format must be able to handle all forms of timing configurations, including data that are irregularly spaced in time. For example, there are instruments that integrate a measurement over time until a certain signal-to-noise threshold has been reached. The integration period varies according to atmospheric conditions so that the resulting data have both variable integration times and are irregularly spaced in time. There is absolutely no way to represent these data with a single time point. The most efficient way of representing these data is with two time points: starting time and stopping time. This is the first requirement for the data file structure.

In those cases when many data sets are used or merged, a convenient single time reference point is the mid-point of the sampling period(s). Generally, this is the average of the start time and the stop time, but this is not always the case. As an example, there are measurements that integrate over a certain time but because of sample airflow changes (e.g., changing altitude during aircraft sampling) the sampling volume mid point does not correspond to the sampling time mid-point. In this case, the actual time mid-point must be specified by the investigator. Thus in order to encompass all of the possible diversity in sampling, three times need to be specified for each data point: start time, mid-point time, and stop time.

There are different views on what format should be used to represent time. In current measurement practice it is typical to find 1-second sampling intervals regardless of the platform (e.g., aircraft). Measurements at 1 Hz generally capture most of the important variability in air quality data, and, while longer intervals are commonly reported, shorter intervals are not. The Ames format shows time as seconds from the start of the day defined in the file header and in the file name (see below). The ICARTT file format will adopt this structure. Recognizing the need in some cases for >1 Hz sampling, the ICARTT format will allow data in fractional seconds though the default will be integer seconds. This does not mean that data MUST be shown in 1-second increments; whether it be 1 minute or some other increment, this decision is left to the principal investigator. In all cases, though, all times are explicitly accounted for in the period (day) specified by the header and file name. If no data are available for any time period, then that is represented by the missing data identifier. There are two exceptions to this policy. One exception is when no sampling takes place from the start of a day to some point during the day when the data begin. This might occur because of, for example, aircraft take-off. The other exception is for satellite data where significant time gaps may exist due to cloud interference, mode changes, etc. This exception is made due to the extremely large volume of data acquired from these platforms (see Section 5 below). All times are in UTC.
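
As an illustration of the time convention described above, the following Python sketch converts between seconds from the start of the UTC day and a calendar timestamp. It is not part of the format definition; the function names are assumptions made for this example only.

```python
from datetime import datetime, timedelta, timezone

def seconds_to_utc(file_date, seconds):
    """Convert seconds from 0000 UTC on file_date (year, month, day) to a datetime."""
    year, month, day = file_date
    midnight = datetime(year, month, day, tzinfo=timezone.utc)
    # timedelta accepts fractional seconds, so sub-second data are handled as well.
    return midnight + timedelta(seconds=seconds)

def utc_to_seconds(file_date, when):
    """Convert a timezone-aware UTC datetime to seconds from 0000 UTC on file_date."""
    year, month, day = file_date
    midnight = datetime(year, month, day, tzinfo=timezone.utc)
    return (when - midnight).total_seconds()

stamp = seconds_to_utc((2004, 8, 30), 43200)
print(stamp.isoformat())  # 2004-08-30T12:00:00+00:00
```

The date tuple stands in for the "UTC date when data begin" that a real file carries in its header and file name.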

B. Location information

The specification of this information is straightforward. All data points in the files need to have latitude (lat), longitude (lon), and either altitude (for aircraft, lidars, sondes) or elevation (for surface data). The lat/lon system used here will be strictly numeric: decimal degrees (to five decimal places) with south latitudes and west longitudes represented as negative numbers (i.e., no N, E, W, S identifiers). Elevations will be in integral meters. Altitudes must be explicitly defined since many types of altitude measurements are in use (pressure alt; GPS alt; geopotential alt; etc.).
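
The sign convention above can be sketched in a few lines of Python. This is only an illustration: the helper names are hypothetical, and the coordinates are arbitrary sample values.

```python
def signed_degrees(value, hemisphere):
    """Return decimal degrees with S and W as negative, rounded to five places."""
    hemisphere = hemisphere.upper()
    if hemisphere not in ("N", "S", "E", "W"):
        raise ValueError(f"unknown hemisphere: {hemisphere!r}")
    sign = -1.0 if hemisphere in ("S", "W") else 1.0
    return round(sign * abs(value), 5)

def format_position(lat, lon):
    """Format a latitude/longitude pair with five decimal places and no N/E/W/S letters."""
    return f"{lat:.5f}, {lon:.5f}"

lat = signed_degrees(43.45678, "N")
lon = signed_degrees(66.0, "W")
print(format_position(lat, lon))  # 43.45678, -66.00000
```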

Because this information is required to uniquely identify any given data point, ideally it is included in the file with those data. However, it is sometimes advantageous to have location information consolidated and uniquely identified in a separate file (e.g., an aircraft parameter file). If this is done, then information about that parameter file must be included in the data file header information. This will be specified below.

C. Measurements

In general, each file contains data of one parameter or species. Multiple variables per file are allowed only if all were measured on exactly the same time base, as, for example, by the same instrument (e.g., GC/MS; PILS/IC). The numeric representation of a variable will be defined by the units in which it was measured. The ICARTT format contains the NASA Ames provision for a data scaling factor. However, we recommend that all scale factors be 1 unless it is grossly inconvenient to do so. If very large or very small numbers are required, then they can be represented with exponential notation, as in 1.01e9 or 5.23e-6.

i. Uncertainties

Every data point should have a corresponding total uncertainty (or error) which has the same units as the measurement. This uncertainty in the measurement is indicated as a TOTAL uncertainty to include all systematic and random effects. Ideally, these uncertainties are tabulated as the next (and separate) column after the data column in the file. However, this requirement can be relaxed if the uncertainty data can be reproduced by information in the header of the file. For example, if all uncertainties can be calculated by a function that has any given data point as input, then the formula can be included as header information.
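
For instance, a header-documented uncertainty function might combine a relative part and a constant part. The Python sketch below assumes such a form; the coefficients (5% and 0.005, in the units of the measurement) are illustrative assumptions, not values prescribed by the format.

```python
def total_uncertainty(value, relative=0.05, absolute=0.005):
    """Total uncertainty in the same units as value: a relative part plus a constant.

    The default coefficients are illustrative assumptions only.
    """
    return relative * abs(value) + absolute

for measurement in (0.5, 10.0):
    print(measurement, total_uncertainty(measurement))
```

A user reading a file whose header gives the function in this way can regenerate the uncertainty column rather than having it stored with every record.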

ii. Missing data

Missing data are just that - missing. It makes no difference what the reason, whether it be a calibration period, a system crash, instrument maintenance, etc. Missing data are represented by negative numbers large enough to never be construed as actual data. For the ICARTT file format the value is -9999 (or -99999, etc.). Note that this is different from the Ames data exchange format in that Ames requires missing data flags to be numbers larger than any “good” data value. This somewhat arbitrary standard breaks down for measurements in urban areas where “good” data values can exceed reasonable expectation. For example, it is not uncommon in these areas for NO, NO2, or CO data to be in the parts per million range which are very large numbers for the standard units of measure (ppbv) for these species. On the other hand, there is no conceivable situation in which large negative numbers (e.g., -9999) can be construed as “good” data. Therefore, we specify for the ICARTT format that the primary missing data flag be -9999.

On the other hand, data below (or above) the limit of detection (LOD) are not actually “missing” but do convey some information. While some investigators choose to tabulate all of their quantifiable data, including negative values, others choose not to show these data points, but rather indicate the value is less than (or greater than) some quantifiable limit. These conditions will be indicated by two additional missing data flags that are substituted for the missing data values. The flag for data values GREATER THAN some UPPER LOD (ULOD) will be -7777 (or -77777, etc.), and the flag for data values LESS THAN some LOWER LOD (LLOD) will be -8888 (or -88888, etc.). These flags (if used) and the values of the upper and lower LOD are documented at specific locations in the file header (see below).
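
A reader of such files has to separate the three flag families from real values. The Python sketch below does this by pattern, under the assumption that a flag is a minus sign followed by four or more 9s, 8s, or 7s; a stricter reader would instead compare each field with the missing data indicator declared for that variable in the header. The names used here are hypothetical.

```python
import re

FLAG_KINDS = {"9": "missing", "8": "below_llod", "7": "above_ulod"}
# A minus sign, then four or more identical 9s, 8s, or 7s, optionally written as "-9999.0".
FLAG_PATTERN = re.compile(r"-(9{4,}|8{4,}|7{4,})(\.0*)?")

def classify(token):
    """Return 'missing', 'below_llod', 'above_ulod', or 'value' for one data field."""
    match = FLAG_PATTERN.fullmatch(token.strip())
    if match is None:
        return "value"
    return FLAG_KINDS[match.group(1)[0]]

print([classify(t) for t in ["0.555", "-9999", "-8888", "-77777"]])
# ['value', 'missing', 'below_llod', 'above_ulod']
```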

iii. Data delimiter characters

Previous versions of the ICARTT data format specified that spaces be used to delimit data fields within records (lines) of data in a file. Beginning with this version of the ICARTT format, only commas will be allowed. This change is made at the recommendation of data producers and data users and is meant to provide improved readability of header files and data records. For legacy purposes the format will accommodate the space character as delimiter in ICARTT data files produced prior to the date of this format version. However, as of this date the file-checking software will not allow spaces as delimiters in the data records.
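
A reading program might handle the two delimiter conventions as in the following Python sketch. The function name and the allow_legacy_spaces switch are assumptions for illustration; they do not describe the actual file-checking software.

```python
def split_record(line, allow_legacy_spaces=False):
    """Split one data record into stripped fields.

    Commas are the delimiter; whitespace is accepted only when the caller
    opts in for files that predate the comma-only rule.
    """
    if "," in line:
        return [field.strip() for field in line.split(",")]
    if allow_legacy_spaces:
        return line.split()
    raise ValueError("record is not comma delimited")

print(split_record("43200, 0.555, 2.509"))  # ['43200', '0.555', '2.509']
print(split_record("43200 0.555 2.509", allow_legacy_spaces=True))
```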

2. File names

Features of different file naming conventions (including Ames) have been adapted here. File names for the ICARTT data format, limited to 127 characters or less, are defined as follows:

dataID_locationID_YYYYMMDD[hh[mm[ss]]]_R#[_L#][_V#][_comments].extension,

where the only allowed characters are: a-zA-Z0-9_.- (that is, upper case and lower case alphanumeric, underscore, period, and hyphen). All fields not in square brackets are required and are described as follows:

dataID: short description of measured parameter/species, instrument, or model (e.g., O3; RH; VOC; PTRMS; MM5)
locationID: short description of site; station; platform; laboratory or institute
YYYY: four-digit year
MM: two-digit month
DD: two-digit day
hh: optional two-digit hour
mm: optional two-digit minute
ss: optional two-digit second
R: revision number of data
L: optional launch number
V: optional volume number
comments: optional additional information
extension: file type descriptor

The underscore is used ONLY to separate the different fields of the file name; it has special significance for file-checking software. To separate characters within a field for readability, use lower and upper case letters. The use of the hyphen, though allowed, is discouraged since this character in file names may cause problems with some older operating systems and network software. The square brackets “[ ]” enclose optional parameters but are not shown in the file name. Dates and times in file names are always UTC. The date and time in the file name give the date/time at which the data within the file begin (data files), or date/time at which the image applies (image files). For aircraft and sonde data files, the date always refers to the UT date of launch.

The dataID is a short string of characters used to identify the parameters in the file. For files that contain one or two variables those variable names can be used in the file name. For files in which many variables are represented, it may be best to indicate in the file name a class of compounds (e.g., VOC; PhotolysisRates) or an abbreviation of the instrument used to make the measurements (e.g., PTRMS).

The locationID is used to identify the measurement platform, site, station, or source (laboratory or institute) of the information within a data file. Some examples could be: DC8, BAE146, RHBrown, GOME (satellite), IoS (Appledore Island site), ChebPt (Chebogue Point site), and others. It may be useful to have a standardized set of abbreviations used for a given field mission. These should be decided upon by the mission Science Team.

The R parameter will not be optional in the ICARTT data format. We must specify a data revision code that will track changes in data and document why those changes occurred. For this we specify a revision number counter “_R#” where the underscore is a required element to separate the fields (this is needed for certain file checking software). The revision number "#" must match the revision number specified in the Normal Comments section of the file header (see below).

During and immediately after the campaign, “field” data files will be available. Data exchanged during the field study are considered a special case since these data are typically “first look” and, due to time constraints, are not likely to have undergone the full scrutiny of the PI. In order to reflect this fact the file names will be modified slightly with respect to the convention stipulated above in that the data revision code (R#) will be a "letter" (e.g., RA, RB, etc.) instead of a numeric code. This will be the flag to indicate to the user that these are "field" data to be used only during the field study. These files should be deleted as soon as possible after the study and replaced with preliminary data files which will have some QA/QC performed.

The optional parameters “_L#” and “_V#” may be needed in some special cases. If the contents of the file pertain to a second or third aircraft launch on the indicated date, then a launch counter "_L#" (i.e. L2, L3, etc.) must appear after the "R" identifier but before a volume counter, if present (see below). Launch number one is implied when "_L#" is omitted from the file name. If a data file is one volume of a multi-volume dataset, then a volume counter "_V#" (i.e. V1, V2, V3, etc.), must appear after the "R" parameter (and the “L” parameter, if present) separated by an underscore from the rest of the identifier. The volume number (the "#" in "V#") must match the volume number in the file header. When "_V#" is missing from the file name a one-volume dataset is implied.

The optional comments parameter is for additional information required by the PI (or Data Manager) to identify the file contents but that does not fit into the other fields of the file name. This should be used sparingly.

The file extension is a 2-4 character parameter that identifies the file type. The principal file type for the ICARTT data format will be “.ict” and describes time series data in a file formatted to ICARTT standards.
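
The naming convention in this section can be checked mechanically. The Python sketch below is one possible reading of the pattern, not the official file-checking software. It assumes that periods appear only before the extension and that a field revision is a single letter; the second file name in the usage lines is a made-up example.

```python
import re
from datetime import datetime

FILE_NAME = re.compile(
    r"(?P<data_id>[A-Za-z0-9-]+)_"
    r"(?P<location_id>[A-Za-z0-9-]+)_"
    r"(?P<date>\d{8})(?P<time>(?:\d{2}){0,3})_"   # YYYYMMDD[hh[mm[ss]]]
    r"R(?P<revision>\d+|[A-Za-z])"                # numeric, or a letter for field data
    r"(?:_L(?P<launch>\d+))?"
    r"(?:_V(?P<volume>\d+))?"
    r"(?:_(?P<comments>[A-Za-z0-9-]+))?"
    r"\.(?P<extension>[A-Za-z0-9]{2,4})"
)

def parse_file_name(name):
    """Split an ICARTT-style file name into its fields, or raise ValueError."""
    if len(name) > 127:
        raise ValueError("file name exceeds 127 characters")
    match = FILE_NAME.fullmatch(name)
    if match is None:
        raise ValueError(f"not an ICARTT-style file name: {name!r}")
    fields = match.groupdict()
    datetime.strptime(fields["date"], "%Y%m%d")  # rejects impossible dates
    # Letter revisions (RA, RB, ...) mark "field" data.
    fields["is_field_data"] = fields["revision"].isalpha()
    # Launch number one is implied when _L# is omitted.
    fields["launch"] = int(fields["launch"] or 1)
    fields["volume"] = int(fields["volume"]) if fields["volume"] else None
    return fields

info = parse_file_name("NOx_RHBrown_20040830_R0.ict")
print(info["data_id"], info["location_id"], info["date"], info["revision"])
field = parse_file_name("O3_DC8_20040715_RA_L2_V1.ict")  # hypothetical name
print(field["is_field_data"], field["launch"], field["volume"])
```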

3. Recommended file format specification for ICARTT time-series data files

A. Structure

We recommend that, whenever possible, ICARTT time series data files conform to the following Ames file format:

FFI = 1001; one real, unbounded independent variable; primary variables are real; no auxiliary variables; independent and primary variables are recorded in the same record.

What this means in English is that there is one time (independent) variable and that all other data depend on that variable. Any number of other variables can be defined, but they all depend on the one. In the typical case the fundamental variable is the start time of the measurement and others can be defined as in the following example, where the variable names refer to columns in the data file:

start time
stop time
mid-point time
latitude
longitude
altitude/elevation
data variable1
variable1 uncertainty
data variable2
variable2 uncertainty
<etc.>

This format accounts for most time series data measured anytime, over any arbitrary integration period, and at any place on or above the planet (within reason for air quality data). Obviously, the format can be condensed. For example, if measurements are reported as 1 second (or sub-second) intervals, then stop time and mid-point time need not be included as data columns provided all time intervals in the measurement period are accounted for by inclusion of the missing data flag(s). Similarly, if the measurements are made at a fixed location then latitude, longitude, and elevation are fixed and these data would be included in the header information (see below). As pointed out above, if the location data (latitude, etc.) are included in a separate file, then these columns can be excluded provided the location data file name is included in the header information for the data file. Similarly, if uncertainty is defined as some function that is the same for all data points then that function can be included in the header information and the user can then calculate uncertainties. Variations in the way the format is used, based on the needs of the data provider, are accounted for in the file header information. As an example, some PIs may wish to report the END time of the measurement period as the independent variable. The ICARTT format allows this provided that the time variable is clearly labeled as such (e.g., End_UTC) and that additional information describing this (non-standard) situation be provided in the Normal Comments section of the file header. If the data periods are not of a constant duration, then the start time and mid-point time of each period must be included as an additional column and the Data Interval value set to 0 (see below). The header specifications are described below.

B. File header information

The basic structure of the ICARTT file header is similar to the Ames exchange format. For the ICARTT data format we recommend some additional information that will be included in the comments sections. The most general header is shown below as an example; more specialized headers will be described as modifications to the general form. Delimiters are commas only (spaces for legacy purposes only) and cannot be used anywhere else in the file. For delimiters to separate text, use underscores.

  • Number of lines in header, file format index (most files use 1001) - comma delimited
  • PI name: last name, first name/initial
  • Organization/affiliation of PI
  • Data source description: e.g., instrument name; platform name; model name, etc.
  • Mission name: usually the acronym for your mission (e.g., ARCTAS or a compound name such as ICARTT_ followed by your project; e.g., NEAQS, INTEX, etc.)
  • File volume number, number of file volumes (these integer values are used when the data require more than one file per day; for data that require only one file these values are set to 1, 1) - comma delimited
  • UTC date when data begin, UTC date of data reduction or revision - comma delimited
  • Data Interval: This value describes the time spacing (in seconds) between consecutive data records. It is the (constant) interval between values of the independent variable. For 1 Hz data the data interval value is 1; for 1 minute data the value is 60; for 10 Hz data the value is 0.1. All intervals longer than 1 second (the exception is 1 minute intervals) must be reported as Start and Stop times (and optional Mid-point times), and the Data Interval value is set to 0. For additional information see Section 5 below.
  • Description or name of independent variable: This will be the name chosen for the start time or in some cases the mid-point time or end time of the data stream. It always refers to the number of seconds from the UTC start of the day.
  • Number of variables: Integer value showing the number of dependent variables (the total number of columns of data will be this value plus one).
  • Scale factors (1 for most cases, except where grossly inconvenient) - comma delimited
  • Missing data indicators: This will be -9999 (or -99999, etc.) for any missing data condition (except for the main time (independent) variable, which is never missing) - comma delimited
  • Variable names and units: Short variable name, units, and optional long descriptive name, in that order, and separated by commas (or semicolons). If the variable is unitless, enter the keyword "none" for its units. Each short variable name and units (and optional long name) are entered on one line. The short variable name must correspond exactly to the name used for that variable as column header (i.e., the last header line prior to start of data).
  • Number of SPECIAL comment lines: Integer value indicating the number of lines of special comments, NOT including this line.
  • Special comments: Notes of problems or special circumstances unique to this file. An example would be comments/problems associated with a particular flight.
  • Number of Normal comments (i.e., number of additional lines of SUPPORTING information): Integer value indicating the number of lines of additional information, NOT including this line.
  • Normal comments (SUPPORTING information): This is the place for investigators to more completely describe the data and measurement parameters. The supporting information structure is described below as a list of key word: value pairs. Specifically include here information on the platform used, the geo-location of data, measurement technique, and data revision comments. Note the non-optional information regarding uncertainty, the upper limit of detection (ULOD) and the lower limit of detection (LLOD) for each measured variable. The ULOD and LLOD are the values, in the same units as the measurements, that correspond to the flags -7777 and -8888 within the data, respectively. The last line of this section should contain all the variable names on one line. The key words in this section are written in BOLD for clarity below. The actual file will not have special formatting codes. The key word must be typed followed by a colon then followed by your text (information). When more than one value (or information) is to be written on the same line, separate the values using a semicolon. For lines where information is not needed or applicable, simply enter N/A. The scanning program will look for these key words (case insensitive) when the file is submitted.

    PI_CONTACT_INFO: Phone number, mailing address, email address and/or fax number.
    PLATFORM: Platform or site information.
    LOCATION: including lat/lon/elev if applicable.
    ASSOCIATED_DATA: File names with associated data: location data, aircraft parameters, ship data, etc.
    INSTRUMENT_INFO: Instrument description, sampling technique and peculiarities, literature references, etc.
    DATA_INFO: Units and other information regarding data manipulation.
    UNCERTAINTY: Uncertainty information, whether a constant value or function, if the uncertainty is not given as separate variables.
    ULOD_FLAG: -7777 (Upper LOD flag, always -7's).
    ULOD_VALUE: Upper LOD value (or function) corresponding to the -7777's flag in the data records.
    LLOD_FLAG: -8888 (Lower LOD flag, always -8's).
    LLOD_VALUE: Lower LOD value (or function) corresponding to the -8888's flag in the data records.
    DM_CONTACT_INFO: Name, affiliation, phone number, mailing address, email address and/or fax number.
    PROJECT_INFO: Study start & stop dates, web links, etc.
    STIPULATIONS_ON_USE: (self explanatory)
    OTHER_COMMENTS: Any other relevant information.
    REVISION: R# (see file names discussion)
    R#: comments specific to this data revision. The revision numbers and the associated comments are cumulative in the data file. This is required in order to track the changes that have occurred to the data over time. Pre-pend the information to this section so that the latest revision number and comments always start this part of the header information. The latest revision date should correspond to the revision date on Line 7 of the main file header. Note that FIELD data files have revision LETTERS, not numbers.
    Indep_Var, VarName_1, VarName_2, VarName_3, … VarName_n

    The formula for the total number of lines in the header for FFI=1001 files:
    14 + ( # of dependent variables, given in line 10) + (# lines of special comments) + (# lines of normal comments)
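
    The formula can be checked with a few lines of Python. This is a minimal sketch with a hypothetical function name; the counts passed in are those of Examples 1 and 2 in Section 3C.

```python
def header_line_count(n_dependent, n_special, n_normal):
    """Total number of header lines for an FFI=1001 file.

    The constant 14 covers the 12 fixed lines that precede the variable
    definitions plus the two comment-count lines.
    """
    return 14 + n_dependent + n_special + n_normal

print(header_line_count(9, 0, 18))  # 41
print(header_line_count(2, 1, 19))  # 36
```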

C. Examples

Below are three examples of (similar) time series data using different forms of header information. Be aware that the automatic word-wrap feature in word processing programs gives the appearance that there are more lines of text than are really there. In these examples any continuation of lines from directly above has been indented for clarity.


Example 1. All required data columns are shown explicitly.


File name: NOx_RHBrown_20040830_R0.ict

41, 1001
Williams, Eric
Earth System Research Laboratory/NOAA
Nitric oxide and nitrogen dioxide mixing ratios from R/V Ronald H. Brown
ICARTT_NEAQS
1, 1
2004, 08, 30, 2004, 12, 25
0
Start_UTC, number_of_seconds_from_0000_UTC, seconds
9
1, 1, 1, 1, 1, 1, 1, 1, 1
-9999, -9999, -9999, -9999, -9999, -9999, -9999, -9999, -9999
Stop_UTC, seconds
Mid_UTC, seconds
DLat, deg_N
DLon, deg_E
Elev, meters
NO, ppbv
NO_1sig, ppbv
NO2, ppbv
NO2_1sig, ppbv
0
18
PI_CONTACT_INFO: 325 Broadway, Boulder, CO 80305; 303-497-3226; email: eric.j.williams@noaa.gov
PLATFORM: NOAA research vessel Ronald H. Brown
LOCATION: Latitude, longitude and elevation data is included in the data records
ASSOCIATED_DATA: N/A
INSTRUMENT_INFO: NO: chemiluminescence; NO2: narrow-band photolysis/chemiluminescence
DATA_INFO: All data with the exception of the location data is in ppbv. All one-minute averages contain at least 35 seconds of data, otherwise missing.
UNCERTAINTY: included in the data records as variables with a _1sig suffix
ULOD_FLAG: -7777
ULOD_VALUE: N/A
LLOD_FLAG: -8888
LLOD_VALUE: N/A, N/A, N/A, N/A, N/A, 0.005, N/A, 0.025, N/A
DM_CONTACT_INFO: N/A
PROJECT_INFO: ICARTT study; 1 July-15 August 2004; Gulf of Maine and North Atlantic Ocean
STIPULATIONS_ON_USE: Use of these data requires PRIOR OK from the PI
OTHER_COMMENTS: N/A
REVISION: R0
R0: No comments for this revision.
Start_UTC, Stop_UTC, Mid_UTC, DLat, DLon, Elev, NO, NO_1sig, NO2, NO2_1sig
43200, 43259, 43229, 41.00000, 71.00000, 15, 0.555, 0.033, 2.220, 0.291
43260, 43319, 43289, 41.01234, 71.01234, 15, 10.333, 0.522, 31.000, 0.375


Example 2. This example is similar to Example 1. Differences include the elimination of the variables stop time, mid time, lat, lon, elev, and the uncertainties; the inclusion of a special comment; the inclusion of DM info; and a second revision comment.

File name: NOx_RHBrown_20040830_R1.ict

36, 1001
Williams, Eric
Earth System Research Laboratory/NOAA
Nitric oxide and nitrogen dioxide mixing ratios from R/V Ronald H. Brown
ICARTT_NEAQS
1, 1
2004, 08, 30, 2004, 12, 25
60
Start_UTC, seconds
2
1, 1
-9999, -9999
NO, ppbv
NO2, ppbv
1
Lightning struck the ship at ~ 14:00:23 UTC, or at 50423 seconds after midnight UTC. The 43 minute section of missing data from 14:00 to 14:43 (50400 through 52980 of Start_UTC) reflects the period when the instrument was checked out and the computer rebooted.
19
PI_CONTACT_INFO: 325 Broadway, Boulder, CO 80305; 303-497-3226; eric.j.williams@noaa.gov
PLATFORM: NOAA research vessel Ronald H. Brown; sampling through high-flow manifold (res. time ~ 1 s) at 15 m above waterline
LOCATION: Ship location data in file ShipData_RHBrown_20040830_R0.ict
ASSOCIATED_DATA: ShipData_RHBrown_20040830_R0.ict
INSTRUMENT_INFO: NO: chemiluminescence; NO2: narrow-band photolysis/chemiluminescence, See Williams et al., BigScience, 42, p. 50-51, 2001
DATA_INFO: Units are ppbv. All one-minute averages contain at least 35 seconds of data, otherwise missing. Midpoint time is 29 seconds after the minute. One second data are available, contact the PI.
UNCERTAINTY: NO: +/-(5%+0.005 ppbv); NO2: +/-(12%+0.025 ppbv)
ULOD_FLAG: -7777
ULOD_VALUE: N/A
LLOD_FLAG: -8888
LLOD_VALUE: 0.005, 0.025
DM_CONTACT_INFO: Ken Aikin, ESRL/NOAA, kenneth.c.aikin@noaa.gov; data manager for data within ShipData_RHBrown_20040830_R0.ict is Jim Johnson with PMEL, James.Q.Johnson@noaa.gov
PROJECT_INFO: ICARTT study; 1 July-15 August 2004; Gulf of Maine and North Atlantic Ocean
STIPULATIONS_ON_USE: Use of these data requires PRIOR OK from the PI
OTHER_COMMENTS: N/A
REVISION: R1, R0
R1: NO2 data have been increased by 13% based on calibration standard recheck.
R0: No comments for this revision.
Start_UTC, NO, NO2
43200, 0.555, 2.509
43260, 10.333, 35.030


Example 3. This example is similar to examples 1 and 2. Here the platform is a ground site with a locationID of ChebPt.

File name: NOx_ChebPt_20040830_R2.ict

36, 1001
Williams, Eric
Earth System Research Laboratory/NOAA
Nitric oxide and nitrogen dioxide mixing ratios from Chebogue Point, Nova Scotia
ICARTT_NEAQS
1, 1
2004, 08, 30, 2004, 12, 25
60
Start_UTC, seconds
2
1, 1
-9999, -9999
NO, ppbv
NO2, ppbv
0
20
PI_CONTACT_INFO: Address: 325 Broadway, Boulder, CO 80305; email: eric@al.noaa.gov; 303-497-3226
PLATFORM: 10 m tower at the Chebogue Point ICARTT research site.
LOCATION: Chebogue Point, Nova Scotia, Canada; lat: 43.45678; lon: -66.00000; elev: 30 m.
ASSOCIATED_DATA: Met_ChebPt_20040830_R2.ict
INSTRUMENT_INFO: NO: chemiluminescence; NO2: narrow-band photolysis/chemiluminescence.
DATA_INFO: All data is in units of ppbv.
UNCERTAINTY: NO: +/-(5%+0.005 ppbv); NO2: +/-(12%+0.025 ppbv)
ULOD_FLAG: -7777
ULOD_VALUE: N/A
LLOD_FLAG: -8888
LLOD_VALUE: 0.005,0.025
DM_CONTACT_INFO: Ken Aikin; ESRL/NOAA; kenneth.c.aikin@noaa.gov
PROJECT_INFO: ICARTT study; 1 July-15 August 2004
STIPULATIONS_ON_USE: Use of these data requires PRIOR OK from the PI
OTHER_COMMENTS: N/A
REVISION: R2, R1, R0
R2: NO data have been decreased by 13% based on operator ineptitude.
R1: NO2 data have been increased by 13% based on calibration standard recheck.
R0: No comments for this revision.
Start_UTC,NO,NO2
43200,0.483,2.509
43260,8.990,35.030
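
The FFI 1001 layout used in the examples above can be read with a short program. The Python sketch below is a minimal illustration under stated assumptions, not a full reader: the embedded sample is abbreviated (its normal comments omit most of the required key words, so it is not a compliant file), the PI and organization entries are placeholders, and the LOD flags (-7777, -8888) are not handled.

```python
import io

SAMPLE = """\
20, 1001
LastName, FirstName
Hypothetical Institute
Hypothetical NO and NO2 instrument
ICARTT_NEAQS
1, 1
2004, 08, 30, 2004, 12, 25
60
Start_UTC, seconds
2
1, 1
-9999, -9999
NO, ppbv
NO2, ppbv
0
4
PLATFORM: N/A
REVISION: R0
R0: No comments for this revision.
Start_UTC, NO, NO2
43200, 0.555, 2.509
43260, -9999, 35.030
"""

def read_ffi1001(stream):
    """Return the data records of an FFI 1001 file as a list of dictionaries."""
    lines = stream.read().splitlines()
    n_header, ffi = (int(x) for x in lines[0].split(","))
    if ffi != 1001:
        raise ValueError("this sketch handles FFI 1001 only")
    independent = lines[8].split(",")[0].strip()   # header line 9
    n_vars = int(lines[9])                         # header line 10
    scales = [float(x) for x in lines[10].split(",")]
    missing = [float(x) for x in lines[11].split(",")]
    names = [lines[12 + i].split(",")[0].strip() for i in range(n_vars)]
    records = []
    for line in lines[n_header:]:                  # data start after the header
        if not line.strip():
            continue
        fields = [float(x) for x in line.split(",")]
        row = {independent: fields[0]}
        for name, raw, scale, flag in zip(names, fields[1:], scales, missing):
            # Missing values become None; everything else is scaled.
            row[name] = None if raw == flag else raw * scale
        records.append(row)
    return records

records = read_ffi1001(io.StringIO(SAMPLE))
print(records[0])  # {'Start_UTC': 43200.0, 'NO': 0.555, 'NO2': 2.509}
print(records[1]["NO"])  # None
```

Because the first header line gives the number of header lines, the reader can skip directly to the data without interpreting the comment sections.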


4. Recommended file format specification for ICARTT multi-dimensional data files

See also the "Amended FFI 2110" and "Amended FFI 2310" documents for more details on these file types.

A. Structure

We recommend the standard Ames file formats FFI=2110 and FFI=2310 for exchange of most multidimensional data files. The FFI's descriptors are:

FFI 2110; two real independent variables, one unbounded and one bounded with its values recorded in the data records, primary variables are real; the first auxiliary variable is NX(m,1) (or, primary variables' ArrayDimension), all other auxiliary variables are real.

FFI 2310; two real independent variables, one unbounded and one bounded with its number of constant increment values, base value, and increment defined in the auxiliary variable list; primary variables are real; auxiliary variables are real.

For a complete description of these file types, please see the Ames file format document. The following are examples of the FFI 2110 and FFI 2310 formats. Text in curly braces { } marks comments that are not in the file but are added here for clarity. The normal comments section mimics the FFI 1001 format described above.
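
In the FFI 2110 example in this section, each block of data starts with one line holding the unbounded independent variable and the auxiliary variables, the first of which gives the number of rows that follow; each following row holds the bounded independent variable and the primary variables. Assuming that layout, a block reader can be sketched in Python as follows. The function name is hypothetical, the sample lines are shortened made-up values, and scale factors and missing data flags are not applied.

```python
def read_ffi2110_blocks(lines, n_aux):
    """Yield (unbounded_value, aux_values, rows) for each block of data records."""
    pending = iter([line for line in lines if line.strip()])
    for first in pending:
        fields = [float(x) for x in first.split(",")]
        if len(fields) != 1 + n_aux:
            raise ValueError("unexpected number of auxiliary values")
        count = int(fields[1])  # first auxiliary variable gives the row count
        rows = []
        for _ in range(count):
            line = next(pending, None)
            if line is None:
                raise ValueError("file ends before the block is complete")
            rows.append([float(x) for x in line.split(",")])
        yield fields[0], fields[1:], rows

# Made-up sample: two auxiliary variables (row count, latitude) and
# rows of (altitude, temperature, ozone).
sample = [
    "54000,2,42.308",
    "9154,-9999,212",
    "9304,-9999,2250",
    "54060,1,42.278",
    "10118,-9999,3205",
]
blocks = list(read_ffi2110_blocks(sample, n_aux=2))
print(len(blocks), [len(rows) for _, _, rows in blocks])  # 2 [2, 1]
```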


Example: FFI 2110

File name: AR_DC8_20050203_R0.ict

54,2110
PI LastName, First Name
Code 916, Goddard Space Flight Center, Greenbelt, MD 20771
AROTAL
PAVE Mission
1, 1
2005, 02, 03, 2006, 01, 18
60
Altitude[], meters, Altitude_array
UTC, XX.XXXX_hours_from_0_hours_on_flight_date
7 {Number of PRIMARY variables}
0.1, 0.0001, 0.1, 0.01, 0.0001, 0.1, 0.0001
-9999, -999999, -999999, -999999, -999999, -99999, -999999
TempK[], K, Temperature_array
Log10_NumDensity[], part/cc, Log10_NumDensity_array
TempK_Err[], K, Temperature_error_array
AerKlet[], Klet, Aerosol_array
Log10_O3NumDensity[], part/cc, Log10_Ozone_NumDensity_array
O3_MR[], ppb, Ozone_mixing_ratio_array
Log10_O3NumDensity_Err[], part/cc, Log10_NumDensity_error_array
11 {Number of AUXILIARY variables}
1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0
-9999, -9999, -9999, -9999, -9999, -9999, -9999, -9999, -9999, -9999, -9999
NumAlts, none, Number_of_altitudes_reported
Year, UT
Month, UT
Day, UT
AvgTime, xxx.x_minutes, Averaging_time_of_presented_data
Latitude, degrees
Longitude, degrees
PAlt, meters, pressure_altitude
GPSAlt, meters, GPS_altitude
SAT, K, Static_air_temperature
SZA, degrees
0
18
PI_CONTACT_INFO: Enter PI Address here
PLATFORM: NASA DC8
LOCATION: Lat, Lon, and Alt included in the data records
ASSOCIATED_DATA: N/A
INSTRUMENT_INFO:
DATA_INFO:
UNCERTAINTY:
ULOD_FLAG: -7777
ULOD_VALUE: N/A
LLOD_FLAG: -8888
LLOD_VALUE: N/A
DM_CONTACT_INFO: Enter Data Manager Info here
PROJECT_INFO: PAVE MISSION: Jan-Feb 2005
STIPULATIONS_ON_USE: Use of these data should be done in consultation with the PI
OTHER_COMMENTS:
REVISION: R0
R0: Version 2005-0: AROTAL T & O3 Rayleigh Retrievals. Further revisions may be needed to fine-tune aerosol characterization.
UTC, NumAlts, Year, Month, Day, AvgTime, Latitude, Longitude, PAlt, GPSAlt, SAT, SZA, Altitude[], TempK[], Log10_NumDensity[], TempK_Err[], AerKlet[], Log10_O3NumDensity[], O3_MR[], Log10_O3NumDensity_Err[]
54000,9,2005,2,3,0,42.308,-70.582,6910,6979,242.5,65.5
     9154,-9999,-999999,-9999,-9999,113178,212,-999999
     9304,-9999,-999999,-9999,-9999,123353,2250,-999999
     9454,-9999,-999999,-9999,-9999,123008,2116,-999999
     9604,-9999,-999999,-9999,-9999,120933,1337,-999999
     9754,-9999,-999999,-9999,-9999,119675,1019,-999999
     9904,-9999,-999999,-9999,-9999,122655,2061,-999999
     10054,-9999,-999999,-9999,-9999,124384,3126,-999999
     10204,-9999,-999999,-9999,-9999,124632,3371,-999999
     10354,-9999,-999999,-9999,-9999,121341,1609,-999999
54060,8,2005,02,03,0,42.278,-70.613,6978,7043,241.7,65.5
     10118,-9999,-999999,-9999,-9999,124458,3205,-999999
     10268,-9999,-999999,-9999,-9999,123160,2421,-999999
     10418,-9999,-999999,-9999,-9999,121221,1582,-999999
     10568,-9999,-999999,-9999,-9999,120950,1523,-999999
     10718,-9999,-999999,-9999,-9999,117339,680,-999999
     10868,-9999,-999999,-9999,-9999,122751,2423,-999999
     11018,-9999,-999999,-9999,-9999,124230,3491,-999999
     11168,-9999,-999999,-9999,-9999,124039,3424,-999999

{Note the use of scale factors in this example.}
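The record layout in the example above can be read with a short routine. The following Python sketch is illustrative only and is not part of the ICARTT tools: the function name parse_ffi2110_records and its arguments are hypothetical. It assumes that the header has already been stripped, that the first auxiliary variable gives the number of rows that follow, that missing-value flags are compared against the recorded (unscaled) values, and that recorded values are multiplied by the scale factors.

```python
def parse_ffi2110_records(lines, n_aux, primary_scales, primary_missing):
    """Group FFI 2110 data lines into records (hypothetical helper).

    Each record is one auxiliary line (unbounded independent variable
    followed by n_aux auxiliary values) and then NumAlts rows, where
    NumAlts is the first auxiliary value. Each row holds the bounded
    independent variable followed by the primary variables.
    """
    records = []
    i = 0
    while i < len(lines):
        head = [float(v) for v in lines[i].split(",")]
        if len(head) != 1 + n_aux:
            raise ValueError(f"line {i}: expected {1 + n_aux} values, got {len(head)}")
        n_rows = int(head[1])
        block = lines[i + 1 : i + 1 + n_rows]
        if len(block) != n_rows:
            raise ValueError(f"record starting at line {i} is truncated")
        rows = []
        for raw in block:
            values = [float(v) for v in raw.split(",")]
            bounded, primaries = values[0], values[1:]
            if len(primaries) != len(primary_scales):
                raise ValueError("unexpected number of primary variables")
            # Assumption: compare with the missing flag first, then scale.
            scaled = [
                None if p == miss else p * scale
                for p, scale, miss in zip(primaries, primary_scales, primary_missing)
            ]
            rows.append((bounded, scaled))
        records.append({"time": head[0], "aux": head[1:], "rows": rows})
        i += 1 + n_rows
    return records
```

For the file above, the call would use n_aux=11 together with the seven scale factors and seven missing-value flags listed in the header. A value that does not exactly match its declared missing flag is scaled like any other value.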


Example: FFI 2310

File name: LidarO3_WP3_20040830_R0.ict

46, 2310
Williams, Eric
NOAA/Earth System Research Laboratory
Ozone number density profile from WP3 aircraft lidar
ICARTT_ITCT
1, 1
2004, 08, 30, 2009, 09, 04
60.0
Geo_Alt, meters, Geometric_altitude_of_observation
UT_Time, seconds, Elapsed_time_from_0_hours_on_day_given_by_date
1 {Number of PRIMARY variables}
1.0e9
-9999
O3_NumDensity[], #/cc, Ozone_NumDensity_array
9 {Number of AUXILIARY variables}
1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0
-9999, -9999, -9999, -9999, -9999, -9999, -9999, -9999, -9999
Num_altitudes, number, number_of_altitudes_at_current_time_mark
geo_alt_begin, meters, geometric_altitude_at_which_data_begin
alt_increment, meters, altitude_increment
geo_alt_aircraft, meters, geometric_altitude_of_aircraft
UT_hour, hours
UT_min, minutes
UT_sec, seconds
Lon_aircraft, degrees_E
Lat_aircraft, degrees_N
0
18
PI_CONTACT_INFO: 325 Broadway, Boulder, CO 80305; 303-497-3226; eric.j.williams@noaa.gov
PLATFORM: NOAA WP3
LOCATION: Lat, Lon, and Alt included in the data records
ASSOCIATED_DATA: N/A
INSTRUMENT_INFO: Differential absorption lidar. See Williams et al., BigScience, 42, p. 50-51, 2001
DATA_INFO: The units are number density (#/cc). The vertical averaging interval is 975 m at 1-7 km above the aircraft and 2025 m > 7 km above the aircraft. Horizontal averaging interval: 60 km.
UNCERTAINTY: N/A
ULOD_FLAG: -7777
ULOD_VALUE: N/A
LLOD_FLAG: -8888
LLOD_VALUE: N/A
DM_CONTACT_INFO: N/A
PROJECT_INFO: ICARTT study; 1 July-15 August 2004
STIPULATIONS_ON_USE: Use of these data requires PRIOR OK from the PI
OTHER_COMMENTS: N/A
REVISION: R0
R0: No comments for this revision.
UT_TIME, Num_altitudes, geo_alt_begin, alt_increment, geo_alt_aircraft, UT_hour, UT_min, UT_sec, Lon_aircraft, Lat_aircraft, O3_NumDensity[]
30300,26,12819,75,10389,8,25,35,-133.24,-9.45
     1340,1519,1660,1779,1868,1939,1973,1992,1989,1955,1934,1897,1817,1721,1619,1514,1434,1343,1258,1203,1140,1088,1037,956,892,878
30360,22,12819,75,10383,8,26,0,-133.22,-9.93
     1351,1523,1658,1774,1860,1930,1962,1974,1966,1932,1909,1877,1803,1706,1600,1493,1407,1310,-9999,-9999,1094,1045


Note that this file uses a scale factor (1e9) for the number density data, since adding exponential notation to every value would be very cumbersome. Also, this example was adapted from the NASA document and did not have uncertainty or flag values associated with the data.
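In FFI 2310 the bounded independent variable is not written in the data rows; it is rebuilt from the count, base value, and increment given on each auxiliary line. The Python sketch below shows one way to do this for the example above. It is illustrative only: the function name parse_ffi2310_records is hypothetical, and the sketch assumes that the header has been stripped, that the first three auxiliary values are the count, base value, and increment (as in the example), that auxiliary scale factors are 1.0, and that each primary variable occupies one line per record.

```python
def parse_ffi2310_records(lines, primary_scales, primary_missing):
    """Group FFI 2310 data lines into records (hypothetical helper).

    Each record is one auxiliary line (time, count, base, increment,
    remaining auxiliary values) followed by one line of `count` values
    for each primary variable.
    """
    n_primary = len(primary_scales)
    records = []
    i = 0
    while i < len(lines):
        head = [float(v) for v in lines[i].split(",")]
        time, count, base, step = head[0], int(head[1]), head[2], head[3]
        # Rebuild the bounded independent variable (altitude in the example).
        levels = [base + k * step for k in range(count)]
        profiles = []
        for p in range(n_primary):
            raw = [float(v) for v in lines[i + 1 + p].split(",")]
            if len(raw) != count:
                raise ValueError(f"line {i + 1 + p}: expected {count} values, got {len(raw)}")
            # Assumption: compare with the missing flag first, then scale.
            profiles.append(
                [None if v == primary_missing[p] else v * primary_scales[p] for v in raw]
            )
        records.append({"time": time, "aux": head[1:], "levels": levels, "profiles": profiles})
        i += 1 + n_primary
    return records
```

For the file above, the call would pass primary_scales=[1.0e9] and primary_missing=[-9999], so a recorded value of 1340 becomes 1.34e12 and the two -9999 entries in the second record are returned as None.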


5. File formats for other data

Data collected during a mission for which a standard ICARTT time-series format does not apply can be formatted according to standards common to the user community and agreed to by the mission Science Team. For example, many modeling data sets store data in netCDF format, which is a de facto standard for that community. However, the multi-dimensional data format defined above can accommodate these data sets and we leave this as an optional format. For some instruments (e.g., lidars), data are available as image files, usually in standard formats such as GIF or JPEG. Not all software for reading and writing these formats allows additional text information (e.g., as a header), so the file names for these files must be defined to include as much information as possible. If necessary, the Data Management team will work with these PIs to achieve a mutually acceptable solution.

Data acquired by sensors on satellites are not conveniently incorporated into the ICARTT format. The data protocol allows each data record to be identified with a single timestamp only if data are reported continuously with a constant time interval (e.g., 1 second). Otherwise, start and stop times must be reported, and a Data Interval of 0 is entered on line 8 of the file header. Satellite data are unique in that, while they are recorded on a constant data interval, significant gaps in the data may exist. These gaps may be due to cloud interference, changes in viewing mode (e.g., nadir versus limb), or other considerations. Given the sheer volume of data and the file sizes associated with satellite observations, it is not sensible to populate these data gaps with missing data values. It is also unreasonable to report start and stop times, since data are typically collected on short (sub-second) timescales such that integration time is not an issue. Instead, satellite data files will report a Data Interval of -1 on line 8 of the file header. This signifies that each data record is identified by a single timestamp, but the actual timeline is discontinuous. The ICARTT format does not support a Data Interval of -1 for any measurements other than those from satellite instruments.
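The three cases for the Data Interval on line 8 of the header can be summarized in a few lines of Python. This is a minimal sketch rather than part of any ICARTT software: the function name classify_data_interval, the is_satellite argument, and the returned labels are all hypothetical and chosen only for illustration.

```python
def classify_data_interval(interval, is_satellite=False):
    """Interpret the Data Interval from header line 8 (hypothetical helper).

    Returns a short label describing how each data record is timed.
    """
    if interval > 0:
        # Continuous reporting at a constant interval: one timestamp per record.
        return "constant_interval"
    if interval == 0:
        # Irregular timing: each record reports start and stop times.
        return "start_stop_times"
    if interval == -1:
        # Single timestamp per record on a discontinuous timeline.
        if not is_satellite:
            raise ValueError("a Data Interval of -1 is supported for satellite data only")
        return "discontinuous_single_timestamp"
    raise ValueError(f"unsupported Data Interval: {interval}")
```

For instance, the value 60 used in the examples of the previous section falls in the first case, while a satellite file reporting -1 falls in the third.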

In general, if problems or difficulties arise, the Data Management team will deal with them on a case-by-case basis. We want to ensure that all data collected during field studies are made available to all participants as quickly and as seamlessly as possible. Please feel free to send your enquiries, questions, comments, or suggestions to Ali Aknan or Jim Crawford.

 

6. File scanning software

View the "Help FScan" document for more details.

Two versions of the scanning software are available for ICARTT-formatted files:

1. Web-based: http://www-air.larc.nasa.gov/cgi-bin/fscan

2. Standalone Version (Windows Only)

 

7. Responsibilities of data access

A major goal of this data management plan is to facilitate the free exchange of data between and among various teams of researchers. The intention of this data sharing is to broaden the interpretation of observations and to exploit complementary data collected by different research teams. While this level of access is desirable, there are clear responsibilities that come with this access. It is appropriate and expected that researchers may browse all data unfettered; however, once earnest research is pursued, it is essential that relevant Principal Investigators be made aware that their data are being used. It is also expected that they will be offered co-authorship and the opportunity to comment on the content of manuscripts prior to submission for publication. It is imperative that Principal Investigators be consulted when suspicious data are encountered or when interpretation of data becomes dependent upon understanding the underlying technique.

Last Updated: December 14, 2009

Curator: Ali Aknan and Clyde Brown
NASA Official: Dr. Gao Chen

Last updated: December 16, 2009