Introduction to Geographic Information Systems in Forest Resources
Syllabus Schedule Class Meetings Assignments Course Data
Contact Us CFR 520 Lab Locations Software Collect It Page


Project and Data Management

Discussion:

One of the greatest strengths of GIS is the ability to integrate large amounts of data in different formats from different sources. However, this frequently creates data management nightmares. For any GIS project, no matter how simple, effective data management is absolutely essential.

Some common concerns are:

To master data management, you must be able to answer these types of questions definitively for every project, regardless of its size.


The major data management problem with ArcGIS is that it uses two basic types of system files (not data model types, such as raster and vector, but the actual files on disk). The two basic file types are:
  1. ArcInfo datasets
    coverages
    grids
    TINs
    INFO tables

  2. non-ArcInfo datasets
    ESRI Geodatabases
    ESRI shapefiles
    Image data (e.g., TIFF, BMP, BIL, JPEG)
    CAD drawings (DXF, DWG, DGN)
    SDE data
    StreetMap data (If StreetMap is installed)
    VPF data
    dBASE tables
    delimited ASCII files

ArcInfo datasets cause many data management problems because of the file structure of the source data. The files that make up ArcInfo datasets cannot be arbitrarily moved or copied around the file system with Windows Explorer or other OS file management tools. If files or directories that comprise ArcInfo datasets are moved, renamed, deleted, or otherwise altered, the result may be corrupt, unusable, and irretrievable data. Individual ArcInfo datasets should not be moved or altered at the system level; they should be managed only with ArcGIS or ArcInfo.

Non-ArcInfo datasets generally do not cause the same kind of problems as ArcInfo datasets. They can be moved around the file system without corruption using system-level file management tools, such as command interfaces and GUI file managers, as long as all related files are moved as a unit.


Managing non-ArcInfo datasets
Setting the default output directory for geoprocessing
Copying
Renaming
Archiving
Managing ArcInfo datasets
Copying & Renaming
Archiving
Dealing with ArcInfo coverages
Copying and moving map documents


Managing non-ArcInfo datasets

Non-ArcInfo datasets (with the exception of Geodatabases) can be managed using system tools (e.g., DOS commands, Windows Explorer). However, some datasets are composed of multiple files. If you copy, rename, move, or delete such a dataset, make sure that every one of its files is handled. If you neglect to copy or rename a single file in a multi-file data source, you could be left with corrupt data.

Setting the default output directory for geoprocessing

From the GIS dictionary in ArcGIS Help:

Geoprocessing is "a GIS operation used to manipulate data stored in a GIS workspace. A typical geoprocessing operation takes an input dataset, performs an operation on that dataset, and returns the result of the operation as an output dataset. Common geoprocessing operations are geographic feature overlay, feature selection and analysis, topology processing, and data conversion. Geoprocessing allows for definition, management, and analysis of information used to form decisions."

Because geoprocessing can create new files, you will always want to know and control where the new output files are placed. Setting a default location for output datasets makes them much easier to manage, because you always know where new files end up. If you take a minute or two at the start of a project to create and set a working directory, you can save yourself hours of cleanup later. It also saves time while working, as you will not need to navigate through the file system as frequently.
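
The working directory itself can be created with a short script before the project starts. The following is a minimal Python sketch, not part of ArcGIS: the function name and the subfolder names (source, output, maps) are assumptions chosen for illustration, and the default output location still has to be set within ArcGIS.

```python
from pathlib import Path

# Hypothetical subfolder names; adjust them to your own project conventions.
SUBFOLDERS = ("source", "output", "maps")


def make_project_dirs(root, project_name):
    """Create a project folder with fixed subfolders and return the output folder."""
    project = Path(root) / project_name
    for name in SUBFOLDERS:
        # exist_ok=True makes the call safe to repeat on an existing project.
        (project / name).mkdir(parents=True, exist_ok=True)
    return project / "output"
```

The returned path is the folder you would then choose as the default output location.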

Copying

Non-ArcInfo datasets can be copied or moved freely across the file system. You can use the DOS copy, xcopy, or move commands, or a GUI-based file manager such as Windows Explorer, to copy these types of files from one place on the disk to another, or from a fixed drive to removable media.
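
When copying with a script instead of a file manager, the main risk is leaving one of the related files behind. The sketch below is one possible way to copy a multi-file dataset as a unit in Python. It assumes that all related files sit in the same directory and share the part of the file name before the first dot; the function name is hypothetical.

```python
import shutil
from pathlib import Path


def copy_dataset(source_file, dest_dir):
    """Copy source_file and every sibling file that shares its base name."""
    source = Path(source_file)
    base = source.name.split(".")[0]  # name up to the first dot
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    copied = []
    for part in source.parent.iterdir():
        # roads.shp, roads.dbf, and roads.shp.xml all have the base "roads".
        if part.is_file() and part.name.split(".")[0] == base:
            shutil.copy2(part, dest / part.name)  # copy2 keeps timestamps
            copied.append(part.name)
    return sorted(copied)
```

Matching is case-sensitive here, so files whose base names differ only in case would be missed.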

In ArcGIS, it is advisable to use ArcCatalog to copy, rename, or delete shapefiles, images, coverages, grids, TINs, or other supported data sources. This is an easier and more reliable way to manage these types of files, because ArcCatalog handles all necessary file maintenance. ArcCatalog's menus include choices for managing data sources, such as copying, renaming, and deleting.

Even if the data source is composed of multiple files, ArcCatalog will make sure all necessary files are copied, moved, renamed, or deleted.

For example, a shapefile named shape is composed of the following files:

    1. shape.shp
    2. shape.shx
    3. shape.dbf
    4. shape.sbn (may be present)
    5. shape.sbx (may be present)
    6. shape.ain (may be present)
    7. shape.aih (may be present)
    8. shape.prj (may be present)

Files 4-7 are index files (spatial and attribute indexes) that are created automatically for particular needs, and file 8 stores the coordinate system definition. None of these is strictly required: a shapefile will be fully functional with files 1-3.
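
After copying or restoring a shapefile, it can be worth confirming that the three required files are all present. This is a minimal Python sketch of such a check; the function name is hypothetical, and it assumes lowercase file extensions.

```python
from pathlib import Path

# The three files a shapefile needs in order to be functional.
REQUIRED_EXTENSIONS = (".shp", ".shx", ".dbf")


def missing_shapefile_parts(shp_path):
    """Return the required extensions that are missing next to shp_path."""
    shp = Path(shp_path)
    # with_suffix swaps only the last extension, e.g. shape.shp -> shape.shx
    return [ext for ext in REQUIRED_EXTENSIONS if not shp.with_suffix(ext).exists()]
```

An empty list means files 1-3 are all there; the optional files are not checked.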

An image data source may be composed of the following files:

    1. the image file itself
    2. the image world file (for georeferencing)
    3. the image header file (may be present)
    4. the image statistics file (may be present)

See ArcGIS help on Images for a description of what each file does.

Renaming

Renaming datasets follows the same rules as copying them. For data that are composed of multiple files, make sure to rename all associated files, or use ArcCatalog.
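
If you rename with a script, every associated file has to receive the new base name in the same pass. The following Python sketch shows one way to do that; the function name is hypothetical, and it assumes the related files share the part of the name before the first dot and live in one directory.

```python
from pathlib import Path


def rename_dataset(source_file, new_base):
    """Rename every file that shares source_file's base name to new_base."""
    source = Path(source_file)
    old_base = source.name.split(".")[0]
    parts = [p for p in source.parent.iterdir()
             if p.is_file() and p.name.split(".")[0] == old_base]
    # Keep everything after the old base name (for example ".shp" or ".shp.xml").
    targets = [p.with_name(new_base + p.name[len(old_base):]) for p in parts]
    clashes = [t.name for t in targets if t.exists()]
    if clashes:
        # Stop before changing anything rather than overwrite existing files.
        raise FileExistsError(f"target files already exist: {clashes}")
    for part, target in zip(parts, targets):
        part.rename(target)
    return sorted(t.name for t in targets)
```

Checking for name clashes before the first rename avoids ending up with a half-renamed dataset.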

Archiving

Archive files are generally created with programs such as WinZip, PKZip, or StuffIt, or with the UNIX command-line utilities tar, zip, and gzip. An archive stores multiple files in a single file, which is usually compressed. Archives can be kept as backups or records, or used to copy large numbers of files from one machine to another or from one platform to another (such as PC to UNIX or Mac).

The same concern that applies to copying and renaming data sources applies to archiving them. Make sure that all associated data files are archived; otherwise, when the archive is restored, some necessary files will be missing and you may end up with a nonfunctional dataset.
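
As an illustration, the sketch below uses Python's standard zipfile module to archive a multi-file dataset and then lists the archive contents so you can confirm nothing was left out. The function name is hypothetical, and the sketch assumes that related files share the part of the name before the first dot.

```python
import zipfile
from pathlib import Path


def archive_dataset(source_file, archive_path):
    """Zip every file sharing source_file's base name; return the archived names."""
    source = Path(source_file)
    archive = Path(archive_path)
    base = source.name.split(".")[0]
    parts = sorted(
        p for p in source.parent.iterdir()
        if p.is_file()
        and p.name.split(".")[0] == base
        and p.resolve() != archive.resolve()  # never add the archive to itself
    )
    with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
        for part in parts:
            zf.write(part, arcname=part.name)  # store without the directory path
    # Re-open the finished archive and report what it actually contains.
    with zipfile.ZipFile(archive) as zf:
        return sorted(zf.namelist())
```

Comparing the returned list with the files you expect is a cheap check to run before deleting the originals.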

If you are using Geodatabase data, use only ArcCatalog to manage these data sources. A Geodatabase is a special kind of database file. In ArcGIS Desktop, the personal Geodatabase is stored in a Microsoft Access .mdb file. It is strongly advised not to open the Geodatabase in Access, because the data structure could easily be corrupted if you are not careful or do not know what you are doing.


Managing ArcInfo datasets

The following image is a schematic for how ArcInfo coverages and other ArcInfo datasets are stored. The workspace (top-level data container) in this example is the directory jasper. Within jasper there are several other directories, including a special directory called info. The other directories (water, soil, elevatin, and vegatatn) all contain files for each of their particular datasets. The info directory contains all of the attribute tables for each of the data directories. The file structure must be maintained exactly in this format. Moving, renaming, or deleting any files will cause problems.

In general, when managing ArcInfo datasets in ArcGIS, there are a few rules to follow.

  1. Never copy, move, delete, or rename any coverage, grid, or TIN directory using system tools (such as the Windows Explorer) unless you move the entire parent directory, including the info directory.

  2. Always use the ArcCatalog when renaming, copying, or moving coverage, grid, or TIN data sources.

  3. Close any open map documents (but leave the ArcGIS application open) before making any alterations. Even if datasets are removed from the map document, the document may still hold references to them that prevent you from moving or renaming them. Sometimes it is necessary to close ArcMap completely before datasets can be managed with ArcCatalog.
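
Rule 1 can be turned into a quick pre-check before touching a directory with system tools. The Python sketch below only tests whether a directory sits next to an info directory, which suggests it belongs to an ArcInfo workspace. The function name is hypothetical, and the check is a heuristic, not an ArcGIS function.

```python
from pathlib import Path


def in_arcinfo_workspace(dataset_dir):
    """Return True if dataset_dir has a sibling directory named 'info'."""
    parent = Path(dataset_dir).resolve().parent
    return any(
        child.is_dir() and child.name.lower() == "info"
        for child in parent.iterdir()
    )
```

If the function returns True, leave the directory alone and use ArcCatalog, or move the entire workspace as one unit.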

Copying & Renaming

Use ArcCatalog for copying and renaming ArcInfo datasets (coverages, grids, or TINs).

Archiving

To archive single or multiple data source files, first create a new directory using the operating system (or with ArcCatalog), and then use ArcCatalog to copy the data sources to the new directory. Use one of the archiving utilities to create an archive of the entire directory, including the info directory. If you leave out the info directory, you will be left with corrupt data.
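
The final archiving step can also be scripted. The following Python sketch archives a whole workspace directory with the standard shutil module and refuses to run if the info directory is missing. The function name is hypothetical, and the archive should be written outside the workspace being archived.

```python
import shutil
from pathlib import Path


def archive_workspace(workspace, archive_base):
    """Zip an entire workspace directory, including its info directory.

    archive_base is the archive path without the .zip extension.
    Returns the path of the archive that was created.
    """
    ws = Path(workspace)
    if not (ws / "info").is_dir():
        # Without info, the restored datasets would lose their attribute tables.
        raise FileNotFoundError(f"no info directory found in {ws}")
    return shutil.make_archive(
        str(archive_base), "zip",
        root_dir=str(ws.parent),  # archive paths are stored relative to this
        base_dir=ws.name,         # the workspace directory itself is included
    )
```

Because the workspace directory is the top-level entry in the archive, restoring it recreates the same structure in one step.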


Copying and moving map documents

If you have worked with an ArcMap document and attempted to open it on a different computer, you have most likely faced the orphan dataset problem. As you should know by now, ArcGIS map documents do not contain the datasets themselves, but merely contain pointers to where data are on the file system.

If a map document opens and cannot find these files in the exact location stated in the map document file, the layers will not display, and a red exclamation mark will be displayed next to the layer name in the table of contents. Most people who have used ArcGIS will have dealt with this frustration.
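
Before opening a map document on a new machine, you can check whether the data it expects are in place. The Python sketch below does not read the map document itself; it assumes you keep your own list of the data source paths the document uses, and the function name is hypothetical.

```python
from pathlib import Path


def find_missing_sources(source_paths):
    """Return the paths from source_paths that do not exist on this machine."""
    return [str(path) for path in source_paths if not Path(path).exists()]
```

Every path returned corresponds to a layer that would be unable to find its data.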

One way to reduce some of this frustration is to store relative pathnames for layer data sources within ArcMap.
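
To see why relative pathnames help, consider how a relative path is resolved: it is always interpreted from the location of the map document, so it stays valid when the whole project folder moves. The Python sketch below illustrates the idea with the standard os.path module; it is not how ArcMap stores paths internally, and the function names are hypothetical.

```python
import os


def to_relative(data_path, map_dir):
    """Express data_path relative to the directory holding the map document."""
    return os.path.relpath(data_path, start=map_dir)


def resolve(relative_path, map_dir):
    """Turn a stored relative path back into a full path from map_dir."""
    return os.path.normpath(os.path.join(map_dir, relative_path))
```

For a project with maps and data folders side by side, the stored path looks like ../data/roads.shp, which resolves correctly wherever the project folder is placed. On Windows, os.path.relpath raises an error when the two locations are on different drives, so this approach assumes the data and the map document share a drive.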

To move or copy map documents from one system to another, there are several steps:

  1. Copy all data from one machine to the other, or make sure that the data exist on the other system.
  2. Place the data in the exact same directory structure on the target system.
  3. Open the map document.

or

  1. Copy all data from one machine to the other, or make sure that the data exist on the other system.
  2. Place the data anywhere on the target system, but with the same relative path.
  3. Use the Set Data Source control by right-clicking the layer name, or set the data source in the layer's properties.

Before making any changes to map document files, always make a backup copy!
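
Making the backup can be a one-line habit if it is scripted. This Python sketch copies a map document to a timestamped file in the same directory; the function name and the naming pattern are assumptions for illustration.

```python
import shutil
from datetime import datetime
from pathlib import Path


def backup_copy(document_path):
    """Copy a file to <name>_backup_<timestamp><ext> beside the original."""
    source = Path(document_path)
    stamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    target = source.with_name(f"{source.stem}_backup_{stamp}{source.suffix}")
    shutil.copy2(source, target)  # copy2 also preserves the file timestamps
    return target
```

The original file is left untouched, and the timestamp keeps successive backups from overwriting each other unless two are made within the same second.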

If projects are moved to systems with different ArcGIS versions or extensions, they may not open at all.

 


The University of Washington Spatial Technology, GIS, and Remote Sensing Page is supported by the School of Forest Resources