EDG Server Decommissioning and Alation: Frequently Asked Questions
- What is the Environmental Dataset Gateway (EDG)?
- The EDG is EPA’s metadata management system and contains EPA’s enterprise data inventory. Under federal Open Data initiatives and the Evidence Act, federal agencies are required to increase transparency and expand the amount of federal data that is made available to the public. Agencies must also implement and maintain a comprehensive enterprise data inventory and describe datasets using metadata. The EDG currently fulfills these requirements.
- Why is the EDG Server being decommissioned?
- Since the EDG’s implementation in 2005, capabilities of metadata management solutions have matured and evolved. EDG software is now at end-of-life, will soon be unsupported by the vendor, and requires modernization.
- What is replacing the EDG Server?
- EPA is implementing Alation’s metadata management tool so that the Agency continues to meet federal mandates for an enterprise data management inventory and can use Alation’s full data management capabilities to improve the way EPA manages data assets (i.e., the data sources represented by the metadata) and makes them discoverable.
- What is Alation?
- Alation is a commercial-off-the-shelf (COTS) metadata management tool and data catalog that helps organizations identify, understand, and manage their data assets. A data catalog is a repository of metadata about information sources. Using Alation offers the following benefits to EPA and the public:
- Benefits for EPA Employees – Alation modernizes the legacy EDG solution, enabling EPA to move to an enterprise data catalog. Refer to Question 6 for more information on Alation’s capabilities.
- Benefits for the Public – Alation enables EPA to continue to support federal Open Data initiatives and to increase transparency and access to data resources by making unrestricted EPA information assets available online through Data.gov.
- What is Data.gov?
- Data.gov is the United States government's open data website. It provides access to datasets published by agencies across the federal government. Data.gov is intended to provide access to government open data to the public, achieve agency missions, drive innovation, fuel economic activity, and uphold the ideals of an open and transparent government. Accessing EPA’s data through Data.gov enables the following:
- Streamlined public search and discovery of EPA datasets and EPA program and regional office geospatial tools by directing users to Data.gov.
- A more user-friendly search capability that allows users to search by additional criteria (geographic locations, topics, etc.).
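For readers who want to script a search rather than use the website, the following is a minimal Python sketch of building a Data.gov catalog query. It assumes the Data.gov catalog exposes the standard CKAN `package_search` action at `https://catalog.data.gov/api/3/action/package_search` and that EPA's organization identifier is `epa-gov`. Neither is stated in this FAQ, so confirm both against current Data.gov documentation. The helper name `build_search_url` is hypothetical, and the code only builds the URL; it does not send a request.

```python
from urllib.parse import urlencode

# Assumption: Data.gov's catalog offers the standard CKAN search action here.
CATALOG_SEARCH_URL = "https://catalog.data.gov/api/3/action/package_search"


def build_search_url(query, organization=None, tags=(), rows=10):
    """Build a catalog search URL (hypothetical helper; no network call)."""
    filters = []
    if organization:
        # Restrict results to one publishing organization.
        filters.append(f"organization:{organization}")
    for tag in tags:
        # Restrict results to datasets carrying a given topic tag.
        filters.append(f'tags:"{tag}"')

    params = {"q": query, "rows": rows}
    if filters:
        # All filters must match, so they are joined with AND.
        params["fq"] = " AND ".join(filters)
    return f"{CATALOG_SEARCH_URL}?{urlencode(params)}"


# Assumption: "epa-gov" is EPA's organization identifier in the catalog.
print(build_search_url("air quality", organization="epa-gov", tags=["ozone"]))
```

The resulting URL can be opened in a browser or passed to any HTTP client. The additional criteria are kept in a separate filter parameter so the free-text search term stays unchanged.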
- What are some of Alation’s key features?
- Alation provides the following features:
- Terms and Definitions – Users may add terms and definitions to objects (i.e., fields) stored in the catalog. Objects that exist in multiple datasets can have terms and definitions applied across all datasets at one time to support standardization efforts.
- Articles – Users may define business and technical processes, display entity relationship diagrams (ERDs), schemas, etc.
- Policies – The system applies established policies governing the use of data assets.
- Workflows – Data owners control who can edit the catalog by implementing workflows that require review or approval before changes are made.
- Connectors – Connect to data assets and display columns and other attributes within data tables to give users more information about the data.
- Data Lineage – Help users understand where data came from (source, etc.) and the downstream impacts of changes to the data.
- Conversations – Ask questions of data stewards. Conversations are archived and searchable so other users can learn from questions that were already asked.
- How will the EDG Server decommissioning and transition to Alation take place?
- The transition to Alation will occur in phases:
- Phase 1 will be completed no later than July 31, 2023, and includes decommissioning the EDG server.
- Phase 2 will be completed by January 31, 2024, and improves the way EPA staff manage data assets and make them discoverable (refer to Question 5).
- What changes will take place during Phase 1?
- At the completion of Phase 1, all EDG users will see the following changes:
- EPA’s Inventory Search Tool on EPA.gov will be retired and users will be redirected to Data.gov for searching the enterprise inventory.
- EPA guidance and user support information for the enterprise inventory will transition from edg.epa.gov to epa.gov/data/environmental-dataset-gateway.
- EPA Data Officers and Data Stewards for the Enterprise Inventory will experience the following changes:
- Users who previously uploaded metadata to the EDG server for harvesting will instead email updates to edg@epa.gov.
- Users who require automated Data.gov harvest reports will need to submit a request to edg@epa.gov. The reports will not be available by default.
- EPA Data Officers and Data Stewards for the Enterprise Inventory will experience no change with respect to the following:
- Use of the EPA Open Data Metadata Editor. Data stewards will continue to submit EPA metadata to the Enterprise Inventory.
- Geospatial metadata will continue to be harvested or copied from existing regional and program sources and made available to Data.gov. The edg@epa.gov team will facilitate and maintain these processes.
- The EPA’s Clip & Ship site will not be affected by the decommissioning of the EDG server. It will continue to allow users to preview and explore data before choosing a subset to extract and download in various open formats.
- What changes will take place during Phase 2?
- Phase 2 improvements include the following:
- A more user-friendly search capability that allows users to search by additional criteria (geographic location, topics, etc.).
- Ability to connect to data sources to automatically collect metadata, information about the structure of the dataset, its origin, and how it has changed over time.
- Governance tools for improving data management/stewardship.
- Ability to identify relationships between data assets.
- Customizable interface and robust APIs to enable data exchange.
- Where can I find more information?
- Additional information about Alation and the decommissioning process is available at the new EDG Landing page within EPA’s Open Data site. You may also contact Janet Kremer (Kremer.janet@epa.gov), Alation Implementation and EDG Project Lead.