
Catherine Maillard, First Training Session, Ostende, February 12-17, 2007: Introduction to Oceanographic Data Management


Data management

[Diagram: data management chain between OBSERVING SYSTEMS and ANALYSIS & MODELLING SYSTEMS. Labels: Data Compilation; Data Formatting; Quality control checks; Safeguarding; CDI - Data indexing in local archiving system; Catalogues; Data discovery; Data sets aggregation; Product generation]

1. Data Compilation

The data never go directly to the data centres. The data centre therefore needs to:
- Locate the data sets not yet archived
- Request and get a copy of the missing data sets from the source laboratory/scientist
- Check that each data set is properly documented

COMPILATION 1.1: Locate the data sets which are not yet archived

- Search in the cruise report (CSR) catalogue
- Or in the observation system catalogue (EDIOS)
- Or in EDMED or EDMERP
=> A data set should be identified in one of these catalogues
+ Maintain regular direct contacts

COMPILATION 1.2: Get a copy of the missing data sets from the source laboratory/scientist

- Request a copy of the missing data sets identified as not archived, in any format
- Emphasize the importance of:
  - long-term archiving to follow up the environmental changes
  - integration in long time series of data of the same type: the availability of global/regional/thematic databases depends on all contributions
- Facilitate the use of these databases
- Get and safeguard the electronic file
- Digitization is sometimes necessary (GODAR)

COMPILATION 1.3: The mandatory meta-data

- Check that each data set is properly documented with the mandatory fields described
- A minimum of meta-data should be included in the data files, e.g.:
  - reference to cruise or observation system and source laboratory
  - sensor type
  - parameter names and units, etc.
- Complete the missing information by asking the originator
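The check described above can be sketched in a few lines of Python. This is only an illustration: the field names in MANDATORY_FIELDS and the example header are hypothetical, and the real list of mandatory fields is the one defined by the agreed exchange format.

```python
# Hypothetical mandatory fields, chosen to mirror the examples on the slide.
MANDATORY_FIELDS = ("cruise_or_system", "source_laboratory", "sensor_type", "parameters")


def missing_metadata(header):
    """Return the mandatory items that are absent or empty in a data set header."""
    missing = [name for name in MANDATORY_FIELDS if not header.get(name)]
    # Every parameter must carry both a name and a unit.
    for parameter in header.get("parameters") or []:
        if not parameter.get("name") or not parameter.get("unit"):
            missing.append("parameter name/unit")
            break
    return missing


# Made-up header, as it might be read from a file received from an originator.
header = {
    "cruise_or_system": "EXAMPLE-CRUISE-01",
    "source_laboratory": "",
    "sensor_type": "CTD",
    "parameters": [
        {"name": "temperature", "unit": "degC"},
        {"name": "salinity", "unit": ""},
    ],
}

# The returned list is what the data manager would ask the originator to complete.
print(missing_metadata(header))
```

The function only reports what is missing. Completing the information remains a question to the originator, as the slide states.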

2 - Data Reformatting

In general the original formats of the data files cannot be used in data management:
- Incomplete or non-standardized meta-data
- Incompatibility with the input format of QC and other processing tools
- Need for a unique archiving format for safeguarding the data sets of the same type
The data management format, the archiving format and the dissemination/exchange format(s) may be, but are not necessarily, the same.

2 - Different Data Formats used

- Archiving format: can be one of the existing exchange formats, or a local format designed according to rules that ensure sustainability
- Exchange/Dissemination format(s): joint projects and interoperability require common exchange format(s)
- Data management/processing format(s)

2.1 : General rules for sustainability of an archiving format

The archiving format should:
- be independent from the computer (and libraries): RDBS are not appropriate
- ensure that any isolated data item includes enough meta-data to be processed (e.g. location and date)
- be compatible with, and include at least the mandatory fields (meta-data) requested for, the agreed exchange format(s)
- include additional textual or standardized "history" or "comment" fields to prevent any loss of information
- provide a similar structure and meta-data for different data types such as vertical profiles and time series
These rules are normally also followed for exchange formats.
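As a rough illustration of these rules, the sketch below stores one record as a single line of plain text that carries its own location, date, units and history, so that it can be read back without a database. The ProfileRecord class, its field names and the choice of JSON text are assumptions made for this example only; they are not the Medatlas, ODV or any other format named in this presentation.

```python
import json
from dataclasses import dataclass, asdict, field


@dataclass
class ProfileRecord:
    # Meta-data carried by every record so that it can be processed in isolation.
    latitude: float
    longitude: float
    date_utc: str    # ISO 8601 text, for example "2007-02-12T10:30:00Z"
    data_type: str   # same structure for "vertical profile" and "time series"
    parameter: str
    unit: str
    values: list
    history: list = field(default_factory=list)  # free-text comments


def to_archive_line(record):
    """Serialise one record as one line of plain text."""
    return json.dumps(asdict(record), sort_keys=True)


def from_archive_line(line):
    """Rebuild a record from its text line, with no other file needed."""
    return ProfileRecord(**json.loads(line))


record = ProfileRecord(
    43.5, 7.2, "2007-02-12T10:30:00Z", "vertical profile",
    "temperature", "degC", [13.4, 13.1, 12.9],
    ["converted from the originator's spreadsheet"],
)
line = to_archive_line(record)
restored = from_archive_line(line)
```

Because each line is self-describing, a single record extracted from the archive still states where and when it was measured.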

2.2 - SeaDataNet Data transport Formats

Obligatory formats:
- NetCDF (binary) for gridded data and 3D observation data such as ADCP
- (Modified) ODV spreadsheet for other data types (vertical profiles and time series)
Optional format:
- ASCII Medatlas, as the standard exchange format for the Mediterranean and Black Sea community
BODC leads the task of modifying the present ODV and NetCDF formats for SeaDataNet use (QC flags, parameter semantics, etc., and conformity with the international standards).
Formatting exercises are used to assess the coherence and compatibility of the exchange formats.

2.3 – Processing Formats

For data management (QC, cataloguing, selection, extraction, visualisation) the data can be:
- in the archiving format
- in a relational database system (RDBS); at present the most used RDBS in the community are ORACLE and MySQL
Note: an interface is needed between the software input format and the local data management system.

3 - Quality Checks

What they do:
- Detect missing mandatory information
- Detect errors made during the transfer or reformatting
- Detect remaining outliers
- Detect duplicates
- Attach a quality flag to each numerical value
What they don't do:
- The preliminary data calibration and validation made by the expert scientists
- Modify the data points
General rule:
- The tools for data QC are not unique (e.g. ODV and other local systems), but the procedures are compatible
- Any QC of a data set should be reported to the originator to give feedback and ask questions
How they are performed:
- See the next presentation by Sissy
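A minimal sketch of two of these checks follows: attaching one flag per value with a simple range test, and detecting duplicate stations. The flag values, the valid range and the duplicate key (date plus rounded position) are assumptions for illustration, not the procedures presented in the training session. The data values are never modified, only flagged.

```python
GOOD, BAD = 1, 4  # illustrative flag values, assumed for this sketch


def flag_values(values, valid_min, valid_max):
    """Return one quality flag per value; the values themselves are left untouched."""
    return [GOOD if valid_min <= v <= valid_max else BAD for v in values]


def find_duplicates(stations):
    """Return index pairs (first, repeat) of stations sharing date and position."""
    first_seen, pairs = {}, []
    for index, station in enumerate(stations):
        key = (station["date"], round(station["lat"], 3), round(station["lon"], 3))
        if key in first_seen:
            pairs.append((first_seen[key], index))
        else:
            first_seen[key] = index
    return pairs


temperatures = [13.4, 13.1, 99.9, 12.9]          # made-up values, degC
flags = flag_values(temperatures, -2.5, 40.0)     # assumed valid range

stations = [
    {"date": "2007-02-12", "lat": 43.5000, "lon": 7.2000},
    {"date": "2007-02-13", "lat": 43.6000, "lon": 7.3000},
    {"date": "2007-02-12", "lat": 43.5001, "lon": 7.2001},  # same station, re-sent
]
duplicates = find_duplicates(stations)
```

The flags and duplicate pairs would then be reported back to the originator rather than acted on silently.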

4 - Safeguarding

- The QCed data sets should be safeguarded in a perennial system for further use:
  - keep 2 copies
  - follow up the backups when the system or the technology changes
- It is recommended to use the common computer infrastructure of the institutes, so that backups are regular and automatic
- The original data sets, not standardized and not QCed, should also be safeguarded for possible further checks by the data manager or the source scientists, but they are not to be disseminated
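The "2 copies" rule can be sketched as a copy followed by a checksum comparison, so that a silent corruption of either copy is noticed. The function names and directory layout are hypothetical; in practice the institute's own backup infrastructure would do this automatically, as recommended above.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 checksum of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def safeguard(source, backup_dirs):
    """Copy the file into each backup directory and verify every copy."""
    source = Path(source)
    expected = sha256_of(source)
    copies = []
    for directory in backup_dirs:
        directory = Path(directory)
        directory.mkdir(parents=True, exist_ok=True)
        target = directory / source.name
        shutil.copy2(source, target)  # copy2 also preserves timestamps
        if sha256_of(target) != expected:
            raise OSError(f"backup copy differs from the source: {target}")
        copies.append(target)
    return copies
```

Re-running the checksum comparison after a change of system or technology is one simple way to follow up the backups.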

5 - Data Dissemination and service

- National data sets, according to the national rules
- Aggregated data sets, with other data sources
- Export the data in a unique exchange format
- With the appropriate documentation on:
  - the format and codes
  - the QC performed on the data
  - the source of the data and the conditions of use (license)

5 - Data aggregation

Data aggregation represents both a service and a product. To answer data requests related to a geographical area or other selection criteria, independently from the source:
- Interrogate the local data centre
- Complete with other sources
- Eliminate the duplicates
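The three steps above can be sketched as follows. The record layout, the bounding-box selection and the duplicate key are assumptions made for this example; real duplicate detection between data centres usually needs more careful matching than a rounded position and a date.

```python
def in_area(record, south, north, west, east):
    """True if the record lies inside the requested bounding box."""
    return south <= record["lat"] <= north and west <= record["lon"] <= east


def aggregate(local_records, other_sources, area):
    """Select records in the area, local data centre first, skipping duplicates."""
    seen, result = set(), []
    for source in [local_records, *other_sources]:
        for record in source:
            if not in_area(record, *area):
                continue
            key = (record["date"], round(record["lat"], 3),
                   round(record["lon"], 3), record["type"])
            if key not in seen:
                seen.add(key)
                result.append(record)
    return result


# Made-up records: the second source repeats one local station.
local = [{"date": "2007-02-12", "lat": 43.5, "lon": 7.2, "type": "CTD", "origin": "local"}]
regional = [
    {"date": "2007-02-12", "lat": 43.5, "lon": 7.2, "type": "CTD", "origin": "regional"},
    {"date": "2007-02-14", "lat": 42.0, "lon": 6.0, "type": "XBT", "origin": "regional"},
    {"date": "2007-02-15", "lat": 60.0, "lon": 3.0, "type": "CTD", "origin": "regional"},
]
area = (40.0, 45.0, 0.0, 10.0)  # south, north, west, east
selection = aggregate(local, [regional], area)
```

Processing the local data centre first means that, when a duplicate is found, the locally archived copy is the one kept.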

Other data sources

- The other data centres of the consortium
- Regional and project databases:
  - ICES: North-East Atlantic
  - Medatlas 2002; Mater 1996-1999 (but some of its data are included in Medatlas); MFS/MOON for real time (RT)
- The World Ocean Atlas: delayed mode data
- The Coriolis/Argo server: real time data
- The satellite data

The consortium data

The Common Data Index (CDI) shows what is presently available in the data centres. It will be continuously updated during the project: http://www.sea-search.net/cdi/ (also reachable from the SeaDataNet website).
During the development phase (2006-2007) of the interoperable system by the Technical Task Team, each data centre is interrogated separately to get access to the data.
Several data centres provide on-line tools for data search and access, including geographical selection and web services.

Regional Databases

- ICES: http://www.ices.dk/ocean/
  - ICES format
- Medatlas 2002: www.ifremer.fr/medar + CD-ROM + ftp site
  - Developed in the frame of the EU Medar project (a regional DAR)
  - Data selection tools according to various criteria, including geographical search, are available on the CD-ROM
  - Also available on line from several partner data centres
  - Medatlas format

World Ocean Atlas 2005 http://www.nodc.noaa.gov/OC5/WOD05/pr_wod05.html

- Developed by US/NODC - WDC Washington - Ocean Climate Laboratory in the frame of the IOC/GODAR project, with the contribution of the other data centres
- Data, mainly delayed mode data, are available through an on-line selection tool or on DVD (on request)
- All the fields can be interrogated for data selection
- The possibility to select countries by group is commonly used, for example to get all data except those of one's own country, or all except those of the SDN consortium

Data Types in WOA 2005

Type of observations:
- Ocean Station Data (OSD) [Bottle, low resolution CTD/XCTD, plankton data]
- High Resolution CTD/XCTD (CTD)
- Expendable (XBT) and Mechanical (MBT) Bathythermographs
- Autonomous Pinniped Bathythermographs (APB)
- Profiling Floats (PFL)
- Drifting Buoys (DRB)
- Moored Buoys (MRB) [TAO, PIRATA, others]
- Undulating Oceanographic Recorder (UOR) [Towed CTD]
- Glider data (GLD)
- Surface-Only (SUR) [Bucket, Thermosalinograph]
Parameters:
- Pressure, temperature, salinity
- 23 bio-geochemical parameters
- Biological taxa



