Introduction to Geographic Information Systems in Forest Resources
Syllabus Schedule Class Meetings Assignments Course Data
Contact Us CFR 520 Lab Locations Software Collect It Page


Finding Data on the Net

Discussion:

The web abounds with GIS data. Finding the specific data you need may not be easy, but the journey is part of the adventure.

Web Search Engines

Geospatial Data Clearinghouse (FGDC-Compliant Metadata)

USGS GeoData

Washington State Geospatial Data Archive (WAGDA)

The Geography Network

ArcIMS Sites

Important note: If you obtain data from the web, or from any source, make sure you do your best to know or find out its ultimate source. For example, many of the datasets at the Washington State Geospatial Data Archive are originally from King County GIS or the City of Seattle data. When referring to these datasets, give credit to the original developer, not the UW Map Library Geospatial Data Archive! When using data that you found on the Clearinghouse, give credit to the ultimate source, not the Clearinghouse node!


Web Search Engines

The single best tool for finding the various types of data on the net is the Web Search Engine.

Search engines allow you to look across the web for keywords. Use the search engines to look up keywords like "GIS Data."

Here is a list of some common search engines (which I found by searching a search engine on the term "search engine").

My current personal favorite is Google, but there's also Yahoo, Bing, and WolframAlpha. Some older ones are Ixquick, Dogpile, and Metacrawler, which simultaneously search multiple search engines.

Here are a few references on search engines. These sources can teach you techniques to help narrow your searches down so that you limit superfluous hits.

Lynch, Clifford. March 1997. Searching the Internet, Scientific American.
Phelps, Alan. 1999. Looking for Answers. Smart Computing in Plain English. 7(11).
The Spider's Apprentice: A Helpful Guide to Web Search Engines
How search engines work.
How search engines rank web pages.
Search engine features chart
How big are the search engines?
Power Searching 101: a tutorial.

Web search engines are good at finding data in general, but they are poor at finding context-specific content. If you do search for "GIS Data" using a search engine, you will get numerous hits. However, you will need to sort through those many hits to find data for the place you are interested in. The reason for this poor performance is that the search engine spiders crawl across pages and save keywords. The keywords may or may not be associated with a page that really contains the data you want. The search is just for a word, but the word is not placed in a specific context. Many matches for "GIS Data" will be pages that do not contain any links to real GIS data!
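The keyword problem described above can be shown with a small sketch. The pages, URLs, and the "offers data" flag below are all made up for illustration, and real search engines rank and index in far more elaborate ways. The point is only that a bare keyword index matches any page containing the words, whether or not that page leads to real data.

```python
from collections import defaultdict

# Hypothetical pages: URL -> (page text, does the page really offer GIS data?)
PAGES = {
    "example.org/course-notes": ("lecture notes about gis data formats", False),
    "example.org/county-downloads": ("download gis data for parcels and roads", True),
    "example.org/job-ad": ("analyst wanted, experience with gis data required", False),
}


def build_index(pages):
    """Map each word to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, (text, _offers_data) in pages.items():
        for word in text.lower().replace(",", " ").split():
            index[word].add(url)
    return index


def search(index, query):
    """Return URLs containing every query word; context is ignored."""
    words = query.lower().split()
    if not words:
        return set()
    hits = set(index.get(words[0], set()))
    for word in words[1:]:
        hits &= index.get(word, set())
    return hits


index = build_index(PAGES)
hits = search(index, "GIS Data")
with_data = {url for url in hits if PAGES[url][1]}
print(len(hits), "keyword matches;", len(with_data), "actually offer data")
```

All three made-up pages match the keywords, but only one of them offers anything to download.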

Enhanced GIS data searching is done using the Geospatial Data Clearinghouse.


Geospatial Data Clearinghouse

The primary method of searching for GIS data on the Net is to use the National Spatial Data Infrastructure's series of servers known collectively as the Geospatial Data Clearinghouse. These servers generally do not house spatial datasets, but they contain metadata, which are files that describe geospatial datasets, giving various pieces of information, such as spatial extent, layer keywords, points of contact, etc. Once you find metadata for a dataset you are interested in, you can use the contact information contained within the metadata record to find out how to get a copy of the actual data. Presently, all datasets created using US federal funds are required to also have FGDC-compliant metadata records.

Our Washington State Node of the NSDI Clearinghouse has an on-line system for contributing metadata. If you generate any unique data that are geospatial in nature, you can submit metadata to the server.

Metadata are the files created by data developers that contain detailed descriptions of the datasets. Metadata records will generally contain information such as the title, data format, developer, time of creation, source scale, projection & coordinate description for a GIS dataset. A specific content standard exists, which requires all compliant metadata to be formatted in the same way, with a core set of values, as well as an extended optional set of values. The standard has been developed by the Federal Geographic Data Committee (FGDC), so these metadata records are known as FGDC-Compliant metadata.

The content standard is the reason the Clearinghouse is effective as a search engine. Because the file structure and content of metadata records are standardized, searchable databases can be built from the metadata records. This allows for true database queries on the metadata.

For example, you may be searching for raster digital elevation data. Looking for "elevation" in a web search engine will result in a wide variety of hits, most of which will have little to do with your needs. Using the Clearinghouse, it is possible to search only metadata records (rather than all web pages in general).

It is possible to limit the search to a geographic extent by place name or by bounding coordinates:

It is possible to specify a time period which the dataset represents, and to search for the keyword "grid" anywhere in the metadata record, and to limit the search to records that contain the string "elevation" in the title of the dataset:
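As a rough illustration of why standardized fields make this kind of query possible, here is a minimal sketch of a fielded search over metadata records. It is not the Clearinghouse's actual software. The record structure, the field names, and the three sample records are assumptions invented for this example.

```python
from dataclasses import dataclass


@dataclass
class MetadataRecord:
    """A simplified, hypothetical stand-in for one metadata record."""
    title: str
    text: str        # full text of the record
    west: float      # bounding coordinates in decimal degrees
    east: float
    south: float
    north: float
    begin_year: int  # time period the dataset represents
    end_year: int


def boxes_overlap(rec, west, east, south, north):
    """True if the record's bounding box intersects the search box."""
    return (rec.west <= east and rec.east >= west
            and rec.south <= north and rec.north >= south)


def search(records, bbox, years, title_word, any_word):
    """Combine extent, time period, title, and full-text criteria."""
    west, east, south, north = bbox
    first, last = years
    return [
        r for r in records
        if boxes_overlap(r, west, east, south, north)
        and r.begin_year <= last and r.end_year >= first
        and title_word.lower() in r.title.lower()
        and any_word.lower() in r.text.lower()
    ]


# Made-up records for illustration only.
RECORDS = [
    MetadataRecord("10 m Elevation Model, Western Washington",
                   "raster grid of elevation values",
                   -124.5, -120.0, 46.0, 49.0, 1998, 2000),
    MetadataRecord("Elevation Contours, Florida",
                   "vector contour lines",
                   -87.0, -80.0, 25.0, 31.0, 1999, 1999),
    MetadataRecord("Road Centerlines, Puget Sound",
                   "roads draped on an elevation grid",
                   -123.5, -121.5, 46.8, 48.5, 2001, 2002),
]

results = search(RECORDS, bbox=(-123.0, -121.0, 47.0, 48.0),
                 years=(1995, 2005), title_word="elevation", any_word="grid")
for rec in results:
    print(rec.title)
```

Only the first record passes every test. The Florida record fails on extent, and the roads record mentions "elevation" and "grid" in its text but not in its title, which is exactly the kind of false hit a plain keyword search would return.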

Because a series of servers exists, each one housing metadata for a particular organization or geographic region, it is possible to narrow down the search for "local" datasets:

Finally, the search is submitted, and after a few moments, the results are displayed:

The individual records can be viewed:

And then the full record for each dataset can be browsed. When you see metadata in this format, it should be FGDC-compliant and contain values for specific fields, such as projection, coordinate system, units, data format, and a person to contact in order to obtain the dataset. Sometimes the metadata record also contains an online linkage that allows you to download the data from a link in the record itself. What is displayed below is an FGDC-compliant metadata record.

If you are interested in getting copies of the datasets, you follow the link to Distribution Information:

Here is a link to the complete metadata record for this dataset.

If you find metadata for a geospatial dataset, and the metadata record does not look like this, with this particular format and these particular fields, you do not have FGDC-compliant metadata.
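If a metadata record is available as XML, a quick structural check along these lines can hint at whether it follows the FGDC layout. This is only a sketch under assumptions: the element names below are the short tag names of the FGDC content standard as best I recall them, the handful of fields checked is my own selection, and the sample record is invented. Passing this check does not prove a record is fully compliant.

```python
import xml.etree.ElementTree as ET

# A few fields to look for, as paths below the <metadata> root element.
# Tag names are assumed from the FGDC content standard's short names.
EXPECTED_PATHS = {
    "title": "idinfo/citation/citeinfo/title",
    "west bounding coordinate": "idinfo/spdom/bounding/westbc",
    "point of contact": "idinfo/ptcontac",
    "distribution information": "distinfo",
    "metadata standard name": "metainfo/metstdn",
}

# Invented sample record that deliberately lacks distribution information.
SAMPLE = """
<metadata>
  <idinfo>
    <citation><citeinfo><title>Sample Elevation Grid</title></citeinfo></citation>
    <spdom><bounding>
      <westbc>-123.0</westbc><eastbc>-121.0</eastbc>
      <northbc>48.0</northbc><southbc>47.0</southbc>
    </bounding></spdom>
    <ptcontac>Hypothetical Data Steward</ptcontac>
  </idinfo>
  <metainfo><metstdn>FGDC Content Standard for Digital Geospatial Metadata</metstdn></metainfo>
</metadata>
"""


def missing_fields(xml_text):
    """Return the labels of expected fields that the record does not contain."""
    root = ET.fromstring(xml_text)
    return [label for label, path in EXPECTED_PATHS.items()
            if root.find(path) is None]


missing = missing_fields(SAMPLE)
print("Missing fields:", missing)
```

For the sample above, the only field reported missing is the distribution information, which is the part of the record you would need in order to obtain the data.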


USGS GeoData

Since its foundation in 1879, the United States Geological Survey has been a leader in developing map data for the USA. Much of the map data developed by the USGS has been converted to digital formats that can be imported into GIS software. Nearly all of these datasets are available for free via the Net. Some datasets are not available online, but may be available at low cost by ordering directly from the USGS, or the datasets may be available off-line for free at a depository library such as the University of Washington Map Library.

Datasets that the USGS provides online at no cost can be found at the USGS GeoData web site.

A very large variety of data at many different source scales is available here. Most of the data are vector versions of the same data shown on the USGS quadrangle maps (1:24,000; 1:100,000; 1:2,000,000; etc.). In addition to the vector datasets, a large assortment of DEMs is available.

Many of the datasets that are available need special translation software in order to be imported into GIS software. Most GIS software comes with import and export utilities for these formats.

There is a tool for importing SDTS data to ArcGIS.


Washington State Geospatial Data Archive ("WAGDA")

The U of W Map Library has a large collection of GIS data, including City of Seattle data and statewide data such as DEMs, digital orthophotos (DOQQs), and digital raster graphics (DRGs, which are scanned and georeferenced versions of USGS quadrangle maps).

Some of the datasets, including the City of Seattle data, are restricted to Internet connections originating within the U of W's IP addresses, and are not available for general public download. Other datasets, such as the 10 m DEMs, are available for free without restriction.


The Geography Network

The Geography Network is a group of servers based on ESRI software that share a common search interface. It is possible to search the Geography Network for downloadable data based on bounding coordinates, content themes, and keywords.





Some of the data sets available for download from the Geography Network are free; others must be purchased. There are many more providers of online data connections using ArcIMS (Internet Map Server), a technology that essentially streams data sets across the web from a server or group of servers to your ArcGIS session. Here is an example of some Geography Network data (ESRI provided soils) added to an ArcMap session:






ArcIMS Sites

ArcIMS is a relatively new technology (origins in the mid to late 1990s) that relies on server technologies developed by ESRI for streaming map data across the web using very little bandwidth. Data sets are stored on servers but displayed in your ArcMap session. Connecting to ArcIMS sites is very easy, and the growing list of servers provides a wealth of data you can use for free as basemaps to make excellent maps with a minimum of time and effort.

Connections to ArcIMS servers are managed in ArcCatalog. The following images show a connection to the Geography Network's ArcIMS site.





Here is a connection to a server at the University of Washington:



 




 

The University of Washington Spatial Technology, GIS, and Remote Sensing Page is supported by the School of Forest Resources