🗺️ Geospatial Engineering

Fundamental Geospatial Data Types

Why This Matters

Geospatial data types form the foundation of everything you'll do in geospatial engineering, from building 3D city models to analyzing flood risk across watersheds. You're being tested on more than definitions; examiners want to see that you understand when to use vector versus raster, why point clouds capture detail that DEMs cannot, and how attribute data transforms raw geometry into actionable intelligence.

The key principle is fitness for purpose: every data type exists because it solves a specific representation problem. Vector data excels at discrete boundaries; raster data captures continuous phenomena; 3D formats model the vertical dimension that flat maps ignore. Don't just memorize what each format contains. Know what problem it solves and when you'd choose it over alternatives.


Spatial Representation Models

The most fundamental decision in geospatial work is how to represent real-world features digitally. Discrete features with clear boundaries call for vector models; continuous phenomena that vary gradually across space demand raster grids.

Vector Data

  • Represents geography through points, lines, and polygons. Each feature is a distinct geometric object defined by precise coordinate pairs (or triplets in 3D). A fire hydrant is a point, a road centerline is a line, and a land parcel is a polygon.
  • Ideal for discrete features where edges and exact positions matter, such as road networks, cadastral boundaries, and building footprints.
  • Supports rich attribute association. Every feature can carry descriptive data in linked tables, enabling complex queries like "select all parcels zoned commercial with an area greater than 500 m²."
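
The parcel query quoted above can be sketched in plain Python. This is only a minimal illustration, assuming planar coordinates in meters; the `Parcel` class, its field names, and the sample parcels are hypothetical, and a real GIS would normally run the same filter through SQL or its own selection tools.

```python
from dataclasses import dataclass


@dataclass
class Parcel:
    """Hypothetical polygon feature: geometry plus linked attributes."""
    parcel_id: int
    zoning: str
    ring: list  # exterior ring as (x, y) pairs in meters, not closed


def polygon_area(ring):
    """Planar area of a simple polygon via the shoelace formula."""
    total = 0.0
    for i, (x1, y1) in enumerate(ring):
        x2, y2 = ring[(i + 1) % len(ring)]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def select_parcels(parcels, zoning, min_area_m2):
    """Attribute filter (zoning) combined with a geometric filter (area)."""
    return [
        p for p in parcels
        if p.zoning == zoning and polygon_area(p.ring) > min_area_m2
    ]


# Illustrative data only.
parcels = [
    Parcel(1, "commercial", [(0, 0), (30, 0), (30, 20), (0, 20)]),   # 600 m^2
    Parcel(2, "commercial", [(0, 0), (20, 0), (20, 20), (0, 20)]),   # 400 m^2
    Parcel(3, "residential", [(0, 0), (40, 0), (40, 40), (0, 40)]),  # 1600 m^2
]
selected = select_parcels(parcels, "commercial", 500)
print([p.parcel_id for p in selected])
```

Only parcel 1 passes both conditions: parcel 2 is too small and parcel 3 has the wrong zoning.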

Raster Data

  • Organizes space as a regular grid of cells (pixels). Each cell stores a single value representing a measurement or classification at that location.
  • Optimized for continuous phenomena such as elevation, temperature, precipitation, and spectral reflectance from satellite sensors. These are fields that have a value everywhere, not just at discrete locations.
  • Resolution determines detail and file size. A 1-meter raster has 100 times as many cells as a 10-meter raster covering the same area, because cell count grows with the square of the linear refinement factor. Storage and processing demands grow accordingly.
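
The resolution arithmetic can be checked with a few lines of Python. This is a rough sketch: the 10 km by 10 km extent and the 4 bytes per cell (one uncompressed 32-bit value) are assumptions chosen for illustration, and `cell_count` is a hypothetical helper.

```python
import math


def cell_count(width_m, height_m, cell_size_m):
    """Number of cells needed to cover an extent at a given cell size."""
    cols = math.ceil(width_m / cell_size_m)
    rows = math.ceil(height_m / cell_size_m)
    return rows * cols


# Assumed extent: 10 km x 10 km.
coarse = cell_count(10_000, 10_000, 10)  # 10-meter cells
fine = cell_count(10_000, 10_000, 1)     # 1-meter cells
ratio = fine // coarse

# Assumed storage: 4 bytes per cell, single band, no compression.
fine_megabytes = fine * 4 / 1_000_000

print(coarse, fine, ratio, fine_megabytes)
```

A tenfold refinement in cell size gives a hundredfold increase in cells, which is why resolution is usually chosen to match the analysis rather than maximized.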

Compare: Vector vs. Raster. Both represent spatial features, but vector preserves precise boundaries while raster captures gradual variation. If asked to model a watershed boundary, use vector; if asked to model rainfall intensity across that watershed, use raster.


Three-Dimensional Surface Models

When the vertical dimension matters for terrain analysis, volumetric calculations, or line-of-sight modeling, you need data types designed to capture elevation. The choice between these formats depends on whether you prioritize uniform sampling or adaptive detail.

Digital Elevation Models (DEMs)

  • Raster grids where each cell value represents surface elevation. They're typically derived from satellite stereo imagery, radar interferometry (InSAR), or LiDAR.
  • Foundation for terrain analysis including slope, aspect, hillshade, and hydrological flow direction calculations. Most GIS hydrology tools expect a DEM as input.
  • Regular grid structure simplifies processing but can miss fine details in complex terrain. A cliff face that falls between grid posts may be smoothed out entirely. Note the distinction: a Digital Surface Model (DSM) includes buildings and vegetation tops, while a Digital Terrain Model (DTM) represents the bare earth.
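
To make the terrain-analysis bullet concrete, here is a minimal sketch of slope computed from a DEM stored as a list of rows. It uses simple central differences on an interior cell, which is an assumption made for brevity; production GIS tools generally use estimators over the full 3x3 neighborhood and handle edges and NoData cells. The function name and sample grid are hypothetical.

```python
import math


def slope_degrees(dem, row, col, cell_size):
    """Slope at an interior cell from central differences of elevation."""
    dz_dx = (dem[row][col + 1] - dem[row][col - 1]) / (2 * cell_size)
    dz_dy = (dem[row + 1][col] - dem[row - 1][col]) / (2 * cell_size)
    # Gradient magnitude is rise over run; atan converts it to an angle.
    return math.degrees(math.atan(math.hypot(dz_dx, dz_dy)))


# Illustrative 3 x 3 DEM (meters) that rises 10 m per 10 m cell to the east.
dem = [
    [100.0, 110.0, 120.0],
    [100.0, 110.0, 120.0],
    [100.0, 110.0, 120.0],
]
center_slope = slope_degrees(dem, 1, 1, 10.0)
print(round(center_slope, 3))
```

Because elevation here rises one meter per meter of horizontal distance, the slope at the center cell comes out at about 45 degrees.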

Triangulated Irregular Networks (TINs)

  • Vector-based surface model using irregularly spaced triangles. Vertices (called mass points) concentrate where terrain is complex and spread out where it's uniform.
  • More efficient than DEMs for variable terrain. TINs capture cliff edges and ridgelines precisely without wasting storage on flat areas, because triangle density adapts to surface complexity.
  • Preferred for engineering applications like flood modeling, cut-and-fill volume calculations, and viewshed analysis requiring high local accuracy. Breaklines can be enforced along features like stream channels and road edges to preserve sharp transitions.
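
A TIN defines the surface as a set of planar triangles, so elevation at any location is a linear interpolation between the three vertices of the enclosing triangle. The sketch below shows that interpolation for a single triangle using barycentric weights; the function name and sample vertices are hypothetical, and locating the enclosing triangle within a full TIN is left out.

```python
def interpolate_z(p, a, b, c):
    """Linearly interpolate z at p = (x, y) inside triangle a, b, c.

    Each vertex is (x, y, z). Returns None when p lies outside the triangle.
    """
    x, y = p
    (x1, y1, z1), (x2, y2, z2), (x3, y3, z3) = a, b, c
    det = (y2 - y3) * (x1 - x3) + (x3 - x2) * (y1 - y3)
    if det == 0:
        raise ValueError("degenerate triangle: vertices are collinear")
    # Barycentric weights: each is the share of the opposite sub-triangle.
    w1 = ((y2 - y3) * (x - x3) + (x3 - x2) * (y - y3)) / det
    w2 = ((y3 - y1) * (x - x3) + (x1 - x3) * (y - y3)) / det
    w3 = 1.0 - w1 - w2
    if min(w1, w2, w3) < 0:
        return None
    return w1 * z1 + w2 * z2 + w3 * z3


# Illustrative triangle lying on the plane z = 10 + x + 2y.
a, b, c = (0.0, 0.0, 10.0), (10.0, 0.0, 20.0), (0.0, 10.0, 30.0)
inside = interpolate_z((2.0, 2.0), a, b, c)
outside = interpolate_z((10.0, 10.0), a, b, c)
print(inside, outside)
```

The interior query returns a value close to 16 m, matching the plane through the three vertices, while the exterior query returns None.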

Point Clouds

  • Dense collections of 3D coordinates (x, y, z) typically generated by LiDAR scanning or dense-image-matching photogrammetry. A single airborne LiDAR survey can produce tens of points per square meter.
  • Captures unprecedented surface detail. Millions of points can represent building facades, vegetation canopy structure, and ground topography simultaneously. LiDAR point clouds also commonly carry intensity and return number values, which help distinguish ground from vegetation.
  • Raw format requiring processing. Points must be classified (ground, building, vegetation, noise), filtered, and often converted to DEMs or TINs before most analyses can run. Common storage formats include LAS and its compressed variant LAZ.
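
The classify-then-convert step can be sketched as follows: keep only ground-classified points and average their elevations per grid cell to produce a bare-earth surface. This is a simplified illustration, assuming points arrive as (x, y, z, classification) tuples and that class 2 denotes ground as in the ASPRS LAS classification scheme; the function name is hypothetical, cells are indexed upward from the coordinate origin, and real workflows also fill empty cells and filter noise.

```python
from collections import defaultdict

GROUND = 2  # ASPRS LAS classification code for ground


def rasterize_ground(points, cell_size):
    """Average the z of ground points per cell; returns {(row, col): mean z}."""
    sums = defaultdict(lambda: [0.0, 0])
    for x, y, z, classification in points:
        if classification != GROUND:
            continue  # skip vegetation, buildings, noise
        key = (int(y // cell_size), int(x // cell_size))
        sums[key][0] += z
        sums[key][1] += 1
    return {key: total / count for key, (total, count) in sums.items()}


# Illustrative points only.
points = [
    (0.2, 0.3, 100.0, 2),
    (0.8, 0.6, 102.0, 2),
    (0.5, 0.5, 115.0, 5),  # high vegetation, ignored
    (1.4, 0.2, 101.0, 2),
]
dem_cells = rasterize_ground(points, 1.0)
print(dem_cells)
```

Note what is lost: the canopy return at 115 m disappears entirely, and the two ground returns in the first cell collapse into a single averaged value.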

Compare: DEMs vs. TINs. Both model terrain surfaces, but DEMs use uniform grids while TINs adapt triangle density to surface complexity. For a flat agricultural region, a DEM works fine; for mountainous terrain with sharp ridges, a TIN preserves critical detail that a DEM at the same data volume would miss.


Data Description and Context

Spatial geometry alone tells you where something is, but attribute data tells you what it is, and metadata tells you whether you can trust it. These descriptive layers transform raw coordinates into meaningful information.

Attribute Data

  • Tabular information linked to spatial features. It's stored in database tables with a unique identifier (often called a feature ID or primary key) connecting each row to its geometry.
  • Enables querying, classification, and thematic mapping. Without attributes, a polygon is just a shape; with attributes, it becomes a land parcel with an owner name, assessed value, and zoning designation.
  • Supports both qualitative and quantitative fields: text categories (land use type), numeric measurements (area in m²), dates (last inspection), and coded domains that constrain entries to valid values.
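
The link between geometry and attributes, and the role of a coded domain, can be illustrated with plain dictionaries. This is a minimal sketch under assumed names: the `fid` key, the zoning codes, and the sample rows are all hypothetical, and a real system would enforce these rules inside the database.

```python
# Hypothetical coded domain: only these zoning codes are valid.
ZONING_DOMAIN = {"RES": "Residential", "COM": "Commercial", "IND": "Industrial"}

# Geometry keyed by feature ID.
geometries = {
    101: [(0, 0), (10, 0), (10, 10), (0, 10)],
    102: [(10, 0), (20, 0), (20, 10), (10, 10)],
}

# Attribute table rows; "fid" links each row to its geometry.
attribute_rows = [
    {"fid": 101, "owner": "A. Example", "zoning": "COM"},
    {"fid": 102, "owner": "B. Example", "zoning": "XYZ"},  # not in the domain
    {"fid": 103, "owner": "C. Example", "zoning": "RES"},  # no matching geometry
]


def join_features(geometries, attribute_rows, domain):
    """Attach attributes to geometry by ID, rejecting invalid rows."""
    features, rejected = {}, []
    for row in attribute_rows:
        fid = row["fid"]
        if fid not in geometries or row["zoning"] not in domain:
            rejected.append(fid)
            continue
        features[fid] = {"geometry": geometries[fid], **row}
    return features, rejected


features, rejected = join_features(geometries, attribute_rows, ZONING_DOMAIN)
print(sorted(features), rejected)
```

Row 102 fails the domain check and row 103 has nothing to join to, so only feature 101 ends up with both geometry and attributes.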

Metadata

  • Documents data lineage, accuracy, and limitations. It answers critical questions: What is the source? When was it collected? What coordinate reference system does it use? What processing has been applied?
  • Essential for fitness-for-purpose assessment. A 30-meter DEM from 2005 is likely unsuitable for a project that needs 1-meter resolution and present-day ground conditions. Without metadata, you can't make that judgment.
  • Enables interoperability and data sharing. Standardized metadata schemas (like ISO 19115 for geographic information) let systems automatically discover, evaluate, and integrate datasets across organizations.
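
A fitness-for-purpose check is, at its simplest, a comparison between what the metadata records and what the project requires. The sketch below assumes a flat dictionary with hypothetical keys (`source`, `acquired_year`, `crs`, `resolution_m`); real ISO 19115 records are far richer and are not structured this way.

```python
REQUIRED_KEYS = ("source", "acquired_year", "crs", "resolution_m")


def fitness_issues(metadata, max_resolution_m, min_year, required_crs):
    """Return a list of reasons the dataset does not meet project needs."""
    issues = [
        f"missing metadata field: {key}"
        for key in REQUIRED_KEYS if key not in metadata
    ]
    if issues:
        return issues  # cannot judge fitness without the basics
    if metadata["resolution_m"] > max_resolution_m:
        issues.append("resolution too coarse")
    if metadata["acquired_year"] < min_year:
        issues.append("data too old")
    if metadata["crs"] != required_crs:
        issues.append("coordinate reference system differs; reprojection needed")
    return issues


# Illustrative record for a 30-meter DEM from 2005 (values are assumptions).
dem_metadata = {
    "source": "hypothetical national survey",
    "acquired_year": 2005,
    "crs": "EPSG:32633",
    "resolution_m": 30.0,
}
issues = fitness_issues(
    dem_metadata, max_resolution_m=1.0, min_year=2020, required_crs="EPSG:32633"
)
print(issues)
```

An undocumented dataset fails at the first step, which is the practical meaning of "without metadata, you can't make that judgment."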

Compare: Attribute data vs. Metadata. Attributes describe the features themselves (what is this parcel?), while metadata describes the dataset as a whole (how accurate is this parcel layer? when was it last updated?). Both are essential but serve different purposes in data quality assessment.


Storage and Exchange Formats

Data types must be stored in file formats that balance portability, functionality, and software compatibility. Your format choice affects what analysis is possible and who can use your data.

Shapefiles

  • Industry-standard vector format comprising multiple component files: .shp for geometry, .shx for the shape index (record offsets into the .shp file), .dbf for attributes, and usually .prj for the coordinate reference system. All component files must share the same base name and stay together in the same directory.
  • Near-universal GIS compatibility. Virtually every mapping application can read and write shapefiles, making them the lingua franca of geospatial data exchange.
  • Legacy limitations persist. Field names are capped at 10 characters, each component file is limited to 2 GB, there's no support for topology rules, and each shapefile can hold only one geometry type (points or lines or polygons, not a mix). Newer open formats like GeoPackage (.gpkg) address many of these shortcomings while maintaining broad compatibility.
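
Two of these practical issues, missing component files and truncated field names, are easy to check before sharing data. The sketch below uses only the standard library; the helper names and the `parcels.shp` path are hypothetical, and the truncation shown is a plain cut at 10 characters, whereas real export tools may also rename fields to keep them unique.

```python
from pathlib import Path

REQUIRED_EXTENSIONS = (".shp", ".shx", ".dbf")  # mandatory components
RECOMMENDED_EXTENSIONS = (".prj",)              # needed to know the CRS


def missing_components(shp_path):
    """List expected component files that are absent next to the .shp."""
    path = Path(shp_path)
    wanted = REQUIRED_EXTENSIONS + RECOMMENDED_EXTENSIONS
    return [ext for ext in wanted if not path.with_suffix(ext).exists()]


def truncated_field_names(names, limit=10):
    """Map each over-long field name to what a 10-character cap leaves."""
    return {name: name[:limit] for name in names if len(name) > limit}


print(missing_components("parcels.shp"))
print(truncated_field_names(["owner", "assessed_value", "last_inspection"]))
```

Here `assessed_value` would become `assessed_v` and `last_inspection` would become `last_inspe`, which is why shapefile schemas tend to use terse, abbreviated field names.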

GeoTIFF

  • Georeferenced raster format that embeds coordinate system information and spatial extent directly in the TIFF file header through standardized tags.
  • Self-describing and portable. No sidecar files are needed; any GIS can place the image correctly on a map without manual georeferencing.
  • Supports diverse raster content: satellite imagery, classified land cover, and continuous surfaces like elevation or temperature. Cloud Optimized GeoTIFF (COG) is a variant structured for efficient streaming from cloud storage, which is increasingly important for large datasets.
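
Georeferencing boils down to an affine mapping between pixel indices and map coordinates. The sketch below assumes the six-coefficient ordering used by GDAL's geotransform (origin x, pixel width, row rotation, origin y, column rotation, pixel height); GeoTIFF itself stores this information in its own tags, and the sample coefficients are illustrative. The inverse function assumes a north-up image with no rotation.

```python
def pixel_to_map(gt, col, row):
    """Map coordinates of the upper-left corner of pixel (col, row)."""
    x = gt[0] + col * gt[1] + row * gt[2]
    y = gt[3] + col * gt[4] + row * gt[5]
    return x, y


def map_to_pixel(gt, x, y):
    """Pixel (col, row) containing a map coordinate; north-up images only."""
    if gt[2] != 0 or gt[4] != 0:
        raise ValueError("rotated rasters need a full affine inverse")
    col = int((x - gt[0]) // gt[1])
    row = int((y - gt[3]) // gt[5])
    return col, row


# Illustrative transform: 30 m cells, upper-left corner at (500000, 4200000).
# Pixel height is negative because rows count downward from the top edge.
gt = (500000.0, 30.0, 0.0, 4200000.0, 0.0, -30.0)

corner = pixel_to_map(gt, 2, 3)
pixel = map_to_pixel(gt, 500075.0, 4199900.0)
print(corner, pixel)
```

Because the transform travels inside the file, any reader can perform this mapping without a separate world file or manual control points.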

Geodatabases

  • Integrated spatial database systems (like Esri's file geodatabase, or PostgreSQL with the open-source PostGIS extension) that store vector, raster, and attribute data together in a unified structure.
  • Supports advanced data structures such as topology rules, relationship classes, attribute domains, and subtypes that enforce data integrity at the database level.
  • Enables multi-user editing and versioning. Enterprise geodatabases are essential for workflows where multiple team members edit the same datasets concurrently.

Compare: Shapefiles vs. Geodatabases. Shapefiles maximize portability and compatibility, while geodatabases maximize functionality and data integrity. Use shapefiles (or GeoPackages) for sharing data externally; use geodatabases for internal production workflows where data quality enforcement matters.


Quick Reference Table

| Concept | Best Examples |
|---|---|
| Discrete feature representation | Vector data, Shapefiles, Geodatabases |
| Continuous phenomenon representation | Raster data, GeoTIFF, DEMs |
| 3D surface modeling | DEMs, TINs, Point clouds |
| High-resolution 3D capture | Point clouds, TINs |
| Feature description | Attribute data |
| Data quality documentation | Metadata |
| Maximum software compatibility | Shapefiles, GeoTIFF, GeoPackage |
| Enterprise data management | Geodatabases (file or enterprise) |

Self-Check Questions

  1. You need to model both building footprints and land surface temperature for an urban heat island study. Which data types would you use for each, and why?

  2. Compare DEMs and TINs: what terrain characteristics would make you choose one over the other for a flood inundation model?

  3. A colleague sends you a shapefile with no documentation. What specific metadata would you need before using it in a professional analysis?

  4. Point clouds, DEMs, and TINs all represent 3D surfaces. Rank them from rawest to most processed, and explain what information might be lost at each processing step.

  5. If you needed to share elevation data with an organization using different GIS software than yours, would you choose a geodatabase or GeoTIFF format? Justify your choice based on the properties of each format.
