Introduction to Geographic Information Systems in Forest Resources
One of the greatest strengths of GIS is the ability to integrate large amounts of data in different formats from different sources. However, this frequently creates data management nightmares. For any GIS project, no matter how simple, effective data management is absolutely essential.
Some common concerns are:
- Where are the source files?
- What types of files are they?
- What projection/coordinate system are they stored in?
- Are there multiple copies of the files?
- If so, which are the most current or correct?
- Where are new files automatically placed?
- How do I move ArcGIS projects between computers?
To master data management, you must be able to answer these types of questions definitively for every project, regardless of its size.
- ArcInfo datasets
  - coverages
  - grids
  - TINs
  - INFO tables
- non-ArcInfo datasets
  - ESRI Geodatabases
  - ESRI shapefiles
  - Image data (e.g., TIFF, BMP, BIL, JPEG)
  - CAD drawings (DXF, DWG, DGN)
  - SDE data
  - StreetMap data (if StreetMap is installed)
  - VPF data
  - dBASE tables
  - delimited ASCII files
ArcInfo datasets cause a lot of data management problems due to the file structure of the source data. The files that make up ArcInfo datasets cannot be arbitrarily moved or copied around the file system with the Windows Explorer or other OS file management tools. If files or directories that comprise ArcInfo datasets are moved, renamed, deleted, or otherwise altered, the result may be completely corrupt, unusable, and unretrievable data. ArcInfo datasets cannot be moved or altered in any way at the system level; they can only be managed by ArcGIS or ArcInfo.
Non-ArcInfo datasets generally do not cause the same kinds of problems as ArcInfo datasets. They can be moved around the file system without corruption using system-level file management tools, such as command interfaces and GUI file managers, as long as all related files are moved as a unit.
- Setting the default output directory for geoprocessing
- Managing ArcInfo datasets
- Copying
- Renaming
- Archiving
- Copying & Renaming
- Copying and moving map documents
- Archiving
- Dealing with ArcInfo coverages
Non-ArcInfo datasets (with the exception of Geodatabases) can be managed using system tools (e.g., DOS commands, Windows Explorer). However, some datasets are composed of multiple files. If you copy, rename, move, or delete such a dataset, make sure that every one of its files is handled. If you neglect to copy or rename even a single file in a multi-file data source, you could be left with corrupt data.
From the GIS dictionary in ArcGIS help:
Geoprocessing is "a GIS operation used to manipulate data stored in a GIS workspace. A typical geoprocessing operation takes an input dataset, performs an operation on that dataset, and returns the result of the operation as an output dataset. Common geoprocessing operations are geographic feature overlay, feature selection and analysis, topology processing, and data conversion. Geoprocessing allows for definition, management, and analysis of information used to form decisions."
Because geoprocessing can create new files, you will always want to know and control the location of the new output files. Setting the default location for output datasets makes these datasets easier to manage, because you always know where new files are placed. If you take a minute or two at the start of a project to create and set a working directory, you can save yourself hours of cleanup later. It also saves time while working, as you will not need to navigate through the file system as frequently.
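The file-system side of this advice can be sketched in a few lines of Python. This is only an illustration using the standard library: the function names `make_output_dir` and `output_path` and the folder name `working` are hypothetical, and the sketch does not set the ArcGIS geoprocessing environment itself, which still has to be pointed at the directory from within ArcGIS.

```python
from pathlib import Path


def make_output_dir(project_root, name="working"):
    """Create the project's output directory if needed and return its path.

    The default folder name "working" is an assumption for illustration.
    """
    out_dir = Path(project_root) / name
    out_dir.mkdir(parents=True, exist_ok=True)  # harmless if it already exists
    return out_dir


def output_path(out_dir, dataset_name):
    """Build the full path for a new output dataset inside out_dir."""
    return Path(out_dir) / dataset_name
```

Building every output path through one helper keeps new datasets in a single, known place instead of scattered across the file system.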
Non-ArcInfo datasets can be copied or moved freely across the file system. You can use the DOS copy, xcopy, or move commands, or a GUI-based file manager such as Windows Explorer, to copy these types of files from one place on the disk to another, or from a fixed drive to removable media.
In ArcGIS, it is best to use ArcCatalog to copy, rename, or delete shapefiles, images, coverages, grids, TINs, or other supported data sources. This is an easier and more foolproof way to manage these types of files, because ArcCatalog handles all necessary file maintenance. ArcCatalog provides menu choices for copying, renaming, and deleting data sources.
Even if the data source is composed of multiple files, ArcCatalog will make sure all necessary files are copied, moved, renamed, or deleted.
For example,
- Shapefile layer sources are composed of at least 3 data files. Consider a shapefile called shape:
- shape.shp
- shape.shx
- shape.dbf
- shape.sbn (may be present)
- shape.sbx (may be present)
- shape.ain (may be present)
- shape.aih (may be present)
- shape.prj (may be present)
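The last five files are not required for the dataset to function. The .sbn and .sbx files are spatial indexes and the .ain and .aih files are attribute indexes, all created automatically for particular needs; the .prj file stores the coordinate system definition. A shapefile will be fully functional with only the first three files (.shp, .shx, and .dbf).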
- Image layer sources are composed of at least two files
- the image file itself
- the image world file (for georeferencing)
- the image header file (may be present)
- the image statistics file (may be present)
See ArcGIS help on Images for a description of what each file does.
Renaming datasets uses the same rules as copying files. For data that are composed of multiple files, make sure to rename all associated files, or use ArcCatalog.
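If ArcCatalog is not available, the "handle all associated files" rule can be sketched in Python. This is a minimal illustration for non-ArcInfo, multi-file datasets such as shapefiles, not a replacement for ArcCatalog. The function names `related_files` and `copy_dataset` are hypothetical, and the sketch assumes that every file belonging to the dataset sits in the same folder and shares the dataset's base name.

```python
import shutil
from pathlib import Path


def related_files(source):
    """Return every file in the same folder that shares the base name.

    For shape.shp this matches shape.shx, shape.dbf, shape.prj, and so on.
    Assumption: anything named "<base>.<something>" belongs to the dataset.
    """
    source = Path(source)
    prefix = source.stem + "."
    return sorted(p for p in source.parent.iterdir()
                  if p.is_file() and p.name.startswith(prefix))


def copy_dataset(source, dest_dir, new_name=None):
    """Copy all files of a dataset to dest_dir, optionally renaming them."""
    source = Path(source)
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    base = source.stem
    target_base = new_name or base
    copied = []
    for path in related_files(source):
        extension = path.name[len(base):]  # e.g. ".shp" or ".dbf"
        target = dest_dir / (target_base + extension)
        shutil.copy2(path, target)  # copy2 also preserves timestamps
        copied.append(target)
    return copied
```

For example, `copy_dataset("shape.shp", "backup", new_name="roads")` would copy every `shape.*` file into `backup` as `roads.*`, so that no component file is left behind.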
Archive files are generally created with programs such as WinZip, PKZip, StuffIt, or the UNIX command-line utilities tar, zip, and gzip. An archive stores multiple files in a single file, which is usually compressed. Archives can be stored as backups or records, or used to copy large numbers of files from one machine to another or from one platform to another (such as PC to UNIX or Mac).
The same concern for copying and renaming data sources applies to archiving data. Make sure that all associated data files are archived. Otherwise, when the archive is restored, some necessary files will be missing, and you may end up with a nonfunctional dataset.
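As a sketch of this check, the following Python function builds a zip archive of a shapefile and refuses to do so if one of the three required files is missing. The function name `archive_shapefile` is hypothetical, and the sketch assumes all component files sit next to the .shp file and share its base name.

```python
import zipfile
from pathlib import Path

# The three files a shapefile cannot function without.
REQUIRED_EXTENSIONS = (".shp", ".shx", ".dbf")


def archive_shapefile(shp_path, archive_path):
    """Zip every file of a shapefile; fail if a required file is missing."""
    shp_path = Path(shp_path)
    archive_path = Path(archive_path)
    prefix = shp_path.stem + "."
    # Collect all component files, skipping the archive itself if it
    # happens to live in the same folder under the same base name.
    members = sorted(p for p in shp_path.parent.iterdir()
                     if p.is_file()
                     and p.name.startswith(prefix)
                     and p.resolve() != archive_path.resolve())
    present = {p.suffix.lower() for p in members}
    missing = [ext for ext in REQUIRED_EXTENSIONS if ext not in present]
    if missing:
        raise FileNotFoundError(
            f"{shp_path.stem} is missing required files: {missing}")
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in members:
            archive.write(path, arcname=path.name)
    return [p.name for p in members]
```

Checking before archiving means an incomplete dataset is caught when the archive is made, rather than when it is restored on another machine.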
If you are using Geodatabase data, use only ArcCatalog to manage these data sources. A Geodatabase is a special kind of database file. In ArcGIS Desktop, the personal Geodatabase is stored in a Microsoft Access .mdb file. It is strongly advised not to open the Geodatabase in Access, because the data structure could easily be corrupted if you are not careful or do not know what you are doing.
ArcInfo coverages and other ArcInfo datasets are stored in a fixed directory structure. In this example, the workspace (top-level data container) is the directory jasper. Within jasper there are several other directories, including a special directory called info. The other directories (water, soil, elevatin, and vegatatn) each contain the files for their particular dataset. The info directory contains the attribute tables for all of the data directories. The file structure must be maintained exactly in this format. Moving, renaming, or deleting any files will cause problems.
In general, when managing ArcInfo datasets in ArcGIS, there are a few rules to follow.
Use ArcCatalog for copying and renaming ArcInfo datasets (coverages, grids, or TINs).
To archive single or multiple data source files, first create a new directory using the operating system (or with ArcCatalog), and then use ArcCatalog to copy the data sources to the new directory. Use one of the archiving utilities to create an archive of the entire directory, including the info directory. If you leave out the info directory, you will be left with corrupt data.
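The final archiving step can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `archive_workspace` is hypothetical, the datasets are assumed to have already been copied into the workspace directory with ArcCatalog, and the script only packages the directory as a whole without touching any file inside it.

```python
import shutil
from pathlib import Path


def archive_workspace(workspace, archive_base):
    """Zip an entire ArcInfo workspace directory, including info.

    archive_base is the archive path without the ".zip" extension.
    Returns the path of the archive that was created.
    """
    workspace = Path(workspace)
    if not (workspace / "info").is_dir():
        # Without the info directory the archived datasets would be corrupt.
        raise FileNotFoundError(
            f"{workspace} has no info directory; refusing to archive")
    # Archive the workspace folder as a unit so its structure is preserved.
    return shutil.make_archive(str(archive_base), "zip",
                               root_dir=workspace.parent,
                               base_dir=workspace.name)
```

Archiving the whole directory as one unit, rather than picking individual files, keeps the dataset directories and the info directory together.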
If you have worked with an ArcMap document and attempted to open it on a different computer, you have most likely faced the orphan dataset problem. As you should know by now, ArcGIS map documents do not contain the datasets themselves, but merely contain pointers to where data are on the file system.
If a map document opens and cannot find these files in the exact location stated in the map document file, the layers will not display, and a red exclamation mark will be displayed next to the layer name in the table of contents. Most people who have used ArcGIS will have dealt with this frustration.
One way to reduce some of this frustration is to store relative pathnames for layer data sources within ArcMap.
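To illustrate why relative pathnames help, the sketch below uses Python's `ntpath` module, which applies Windows path rules on any operating system. It assumes a relative pathname is interpreted relative to the folder that holds the map document. The function names and the example paths are hypothetical.

```python
import ntpath  # Windows path rules, usable on any operating system


def to_relative(data_source, map_document):
    """Express a data source path relative to the map document's folder."""
    return ntpath.relpath(data_source, start=ntpath.dirname(map_document))


def resolve(relative_source, map_document):
    """Resolve a stored relative path against the map document's folder."""
    return ntpath.normpath(
        ntpath.join(ntpath.dirname(map_document), relative_source))
```

With a map document at `C:\gis\jasper\maps\jasper.mxd` and data at `C:\gis\jasper\data\roads.shp`, `to_relative` gives `..\data\roads.shp`. If the whole `jasper` folder is later copied to `D:\students\jasper`, resolving that same relative path against the new map document location gives `D:\students\jasper\data\roads.shp`, so the layer is still found. An absolute path would still point at drive `C:` and the layer would appear broken.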
To move or copy map documents from one system to another, there are several steps:
- Copy all data from one machine to the other, or make sure that the data exist on the other system.
- Place the data in the exact same directory structure on the target system.
- Open the map document.
or
- Copy all data from one machine to the other, or make sure that the data exist on the other system.
- Place the data anywhere on the target system, but keep the same relative paths.
- Use the Set Data Source control by right-clicking the layer name, or set the data source in the layer's properties.
Before making any changes to map document files, always make a backup copy!!
If projects are moved to different systems with different ArcGIS versions or extensions, projects may not even open.
The University of Washington Spatial Technology, GIS, and Remote Sensing Page is supported by the School of Forest Resources