The Unique Characters of Four Projections
Q1: What information can be found in the Source tab?
Three fields of information can be found when accessing the Source tab: 1) How the data has been defined via a geographic coordinate system and a projection as well as the corresponding datum being used, 2) the data’s geographic extent in reference to the geographic coordinate system it resides in, and 3) the location of the file on the local server being accessed.
Q2: In what coordinate system is this layer?
The continental United States data layer is saved in the GCS North American 1983 coordinate system.
Q3: How does shape of the continental US change with each projection?
The Mercator projection produces a relatively balanced U.S. with similar height and width. The U.S. appears flatter than its real-world counterpart.
The Robinson projection produces a stretched shape of the U.S., with the Southwest states elongated toward the southwest and the New England states elongated toward the northeast.
The Plate Carrée projection produces the widest U.S. shape of the four projections, with the greatest distortion occurring across the width of the nation’s landmass. The coastlines are farther apart than in the real-world counterpart.
The Albers Equal Area Conic appears to be the most accurate in shape when compared to the real-world U.S. landmass. The top border is curved southward, representing the Earth’s curvature.
Q4: How do the positions of cities in relation to each other appear to change between projections?
Major cities such as Seattle and Chicago, for instance, appear farther apart (East-West) in both the Mercator and Plate Carrée projections. Meanwhile, the distance between cities such as Los Angeles and Seattle appears slightly greater in both the Albers Equal Area Conic and the Robinson projections, suggesting a North-South distortion in distance between these projections and the former two discussed.
Q5: Which properties of the continental United States are distorted with each projection?
The Mercator projection: Northern ends of the coastline appear standardized which could introduce a distortion for direction due to the fact that Maine and Washington may not be exactly due East or West of one another.
The Robinson projection: appears to be perceived from a datum different from the other three projections with its stretched shape (Northeast-Southwest). Direction and shape of states are most likely distorted most in this projection.
Both display a tendency to elongate distance running East-West.
The Plate Carrée projection: all states are stretched from East to West, distorting the property of absolute distance from one location to another.
The Albers Equal Area Conic projection: State shapes appear condensed but with an Equal Area projection, and thus, the property of area has been preserved. Due to the conic nature of the projection, distortion may be taking place in the Southern-most states of the nation as they are furthest from the point of tangency. In this case, shape, distance and direction may be slightly compromised.
Both display a tendency to elongate distance running North-South.
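The East-West stretching described above can be checked against the standard scale factors for the two cylindrical projections. The following is a minimal sketch, assuming a spherical Earth and a Plate Carrée with the equator as its standard parallel; the three latitudes are assumed rough values for the southern, middle, and northern parts of the continental U.S. and are not taken from the lab data.

```python
import math

def sec(lat_deg):
    """Secant of a latitude given in degrees."""
    return 1.0 / math.cos(math.radians(lat_deg))

def mercator_scale(lat_deg):
    """Return (E-W, N-S) scale factors; Mercator stretches both by sec(lat)."""
    return sec(lat_deg), sec(lat_deg)

def plate_carree_scale(lat_deg):
    """Return (E-W, N-S) scale factors; only E-W is stretched, by sec(lat)."""
    return sec(lat_deg), 1.0

# Assumed approximate latitudes spanning the continental U.S.
for lat in (25, 37, 49):
    m_ew, m_ns = mercator_scale(lat)
    p_ew, p_ns = plate_carree_scale(lat)
    print(f"lat {lat}: Mercator E-W x{m_ew:.2f}, N-S x{m_ns:.2f}; "
          f"Plate Carree E-W x{p_ew:.2f}, N-S x{p_ns:.2f}")
```

Because the Mercator stretches both directions equally, local shape is kept while size grows toward the north; the Plate Carrée stretches only East-West, which is consistent with the widened appearance noted in Q3.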
Q6: How does the distance between major cities change between projections?
The straight-line distance between Seattle and Chicago varies considerably among the four projections (the greatest difference being approximately 800 miles).
For the Mercator and Plate Carrée projections respectively, the East-West distances were 2,462 miles and 2,430 miles.
For the Robinson and Albers Equal Area Conic projections respectively, the East-West distances were 1,675 miles and 1,738 miles.
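To see why the measured distances differ so much, it can help to compare a great-circle distance with straight-line distances measured on the projected plane. The sketch below assumes a spherical Earth with a radius of 3,958.8 miles and uses approximate city coordinates that are assumptions, not values from the lab data. Under these assumptions the two planar results should come out close to the Mercator and Plate Carrée figures reported above, and the great-circle result close to the Albers figure.

```python
import math

R_MILES = 3958.8  # assumed mean Earth radius in miles (spherical model)

# Assumed approximate coordinates as (latitude, longitude) in degrees.
SEATTLE = (47.6062, -122.3321)
CHICAGO = (41.8781, -87.6298)

def great_circle(p, q):
    """Haversine distance on the sphere, in miles."""
    lat1, lon1 = map(math.radians, p)
    lat2, lon2 = map(math.radians, q)
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * R_MILES * math.asin(math.sqrt(a))

def plate_carree_xy(p):
    """Plate Carree: x and y are simply longitude and latitude scaled by R."""
    lat, lon = map(math.radians, p)
    return R_MILES * lon, R_MILES * lat

def mercator_xy(p):
    """Spherical Mercator: y grows faster than latitude away from the equator."""
    lat, lon = map(math.radians, p)
    return R_MILES * lon, R_MILES * math.log(math.tan(math.pi / 4 + lat / 2))

def planar_distance(project, p, q):
    """Straight-line distance between two points after projecting them."""
    (x1, y1), (x2, y2) = project(p), project(q)
    return math.hypot(x2 - x1, y2 - y1)

print("Great circle:", round(great_circle(SEATTLE, CHICAGO)))
print("Plate Carree:", round(planar_distance(plate_carree_xy, SEATTLE, CHICAGO)))
print("Mercator:    ", round(planar_distance(mercator_xy, SEATTLE, CHICAGO)))
```

Both cylindrical projections measure longitude differences as if they were at the equator, which is why their planar distances exceed the great-circle distance by several hundred miles at these latitudes.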
Three Choropleth Maps, Three Data Class Schemes
Q7: What variables does this additional dataset contain?
The newly joined data table contains two variables. The first is the State abbreviation, or ‘STUSPS’, which is the common attribute facilitating the table join. The second variable is titled ‘PERCH’, standing for Percent Change in farm population sums from 2007 to 2012, represented as either a positive or negative whole number.
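The join on the common ‘STUSPS’ attribute can be illustrated with a small sketch. Everything in it is hypothetical: the rows, the ‘NAME’ field, the PERCH values, and the attribute_join helper are made up for illustration and do not come from the lab data or from the GIS software.

```python
# Hypothetical rows standing in for the state layer's attribute table.
states = [
    {"STUSPS": "IA", "NAME": "Iowa"},
    {"STUSPS": "KS", "NAME": "Kansas"},
    {"STUSPS": "TX", "NAME": "Texas"},
]

# Hypothetical rows standing in for the farm population table (values made up).
farm_change = [
    {"STUSPS": "IA", "PERCH": -4},
    {"STUSPS": "TX", "PERCH": 2},
]

def attribute_join(left_rows, right_rows, key):
    """Append the right table's fields to each left row that shares the key.

    Every left row is kept; rows without a match get None for the new fields.
    """
    lookup = {row[key]: row for row in right_rows}
    extra_fields = sorted({f for row in right_rows for f in row if f != key})
    joined = []
    for row in left_rows:
        match = lookup.get(row[key], {})
        merged = dict(row)
        for field in extra_fields:
            merged[field] = match.get(field)
        joined.append(merged)
    return joined

for row in attribute_join(states, farm_change, "STUSPS"):
    print(row)
```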
Q8: Which classification methods were used? How does each classification method bias the interpretation of the data?
The Farm population data was treated with a quantile distribution which resulted in four data classes. In doing so, the percent change values were color-coded by the following parameters: (-14) – (-7), (-6) – (-3), (-2) – (2), and (3) – (32).
First, the map reader may notice that one color class, (-2) – (2), represents both negative and positive values. Lumping these values together is not helpful when the reader needs to know whether a farm population increased or decreased in order to solve a geographic problem.
Second, there is one color class which represents positive percent changes in farm population ranging from (3) – (32). A reader may want to further divide these respective states in order to establish a gradient of farm ‘births’ across most, if not all, agriculturally progressive states.
The Quantile classification scheme is capable of distorting value gradients and disguising important data anomalies across a dataset, because it forces the values into four classes of roughly equal count regardless of where the gaps in the data fall.
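How a quantile scheme arrives at its class limits can be sketched in a few lines. This is a minimal illustration, not the GIS software's implementation, which may treat ties and rounding differently. The quantile_breaks helper and the twelve PERCH values are hypothetical; the values were chosen only so that the upper class limits match the ones reported above.

```python
def quantile_breaks(values, k):
    """Return the upper limit of each of k classes holding equal counts."""
    data = sorted(values)
    n = len(data)
    breaks = []
    for i in range(1, k + 1):
        last_index = (i * n + k - 1) // k - 1  # ceil(i * n / k) - 1
        breaks.append(data[last_index])
    return breaks

# Hypothetical percent-change values, not the real attribute table.
perch = [-14, -9, -7, -6, -4, -3, -2, 0, 2, 3, 10, 32]

print(quantile_breaks(perch, 4))  # prints [-7, -3, 2, 32]
```

Each class here holds three values, so the top class must stretch from 3 to 32 to collect its share, which is the lumping effect described above.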
When the Farm population data was treated with a Jenks (Natural Breaks) distribution, four data classes were also created. In doing so, the percent change values were color-coded by the following parameters: (-14) – (-7), (-6) – (-1), (0) – (8), (9) – (32).
Once again, because of the data’s wide range in value, large color-coded groups have been created. However, this classification method separates the negative values from the positive values (including zero). Yet the map still overgeneralizes because of the range of the dataset’s values: in the field of national agricultural productivity, the difference between, say, a 1% increase and a 30% increase may be extremely relevant and telling of economic or political conditions in the respective states.
Although the issue of a negative-positive identifier within the color-coding scheme has been solved with the Jenks distribution, the range in specifically positive data values is still obscured from the map reader.
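The idea behind natural breaks is to place class limits where they minimize the variation inside each class. The sketch below does this by exhaustive search, which is only practical for small datasets and is not the optimized Jenks algorithm the GIS software uses. The natural_breaks and within_class_sse names and the sample values are hypothetical.

```python
from itertools import combinations

def within_class_sse(group):
    """Sum of squared deviations of a class's values from the class mean."""
    mean = sum(group) / len(group)
    return sum((v - mean) ** 2 for v in group)

def natural_breaks(values, k):
    """Return upper class limits minimizing total within-class variation."""
    data = sorted(values)
    n = len(data)
    best_cost, best_bounds = None, None
    # Try every way of cutting the sorted values into k contiguous classes.
    for cuts in combinations(range(1, n), k - 1):
        bounds = (0,) + cuts + (n,)
        cost = sum(within_class_sse(data[a:b])
                   for a, b in zip(bounds, bounds[1:]))
        if best_cost is None or cost < best_cost:
            best_cost, best_bounds = cost, bounds
    return [data[b - 1] for b in best_bounds[1:]]

# Hypothetical percent-change values, not the real attribute table.
perch = [-14, -9, -7, -6, -4, -3, -2, 0, 2, 3, 10, 32]
print(natural_breaks(perch, 4))
```

Unlike the quantile approach, the class counts are free to differ, so an outlier such as a single very large increase can end up in a class of its own.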
The same data was then fitted with the Defined Interval distribution method. The desired interval was then set to (13) in hopes of subdividing the dataset into five data classes. The resulting parameters were: (-14) – (-13), (-12) – (0), (1) – (13), (14) – (26), and (27) – (39).
In using the Defined Interval of 13 for this particular dataset, color classes were assigned uneven ‘weights’ in terms of the range in value which they represent, and values were created in order to accommodate the distribution. Four of the five classes each span 13 percentage points of population change, even though not every class contains real values throughout its range. The first class, by contrast, does not deserve its own color because it represents only 1 or 2 real values. Also, the final and highest positive percent change class extends to values which do not exist in the accompanying attribute table.
When deciding to treat your dataset with a defined interval distribution, you need to understand the uniqueness of your data. Where do anomalies (large gaps) in data values occur? Does emphasizing these regions of the dataset add to or subtract from the story the map must tell? Ranges of 13 represented the real distribution of my dataset’s ‘value spectrum’ neither accurately nor precisely, and thus have no appropriate use in displaying the data of this topic.
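The Defined Interval class limits reported above are consistent with breaks placed at multiples of the interval. The following is a minimal sketch under that assumption; the defined_interval_breaks helper is hypothetical, and the minimum of -14 and maximum of 32 are taken from the class limits reported for the other two schemes.

```python
import math

def defined_interval_breaks(minimum, maximum, interval):
    """Return upper class limits at multiples of interval covering the data."""
    upper = math.ceil(minimum / interval) * interval
    breaks = [upper]
    while upper < maximum:
        upper += interval
        breaks.append(upper)
    return breaks

print(defined_interval_breaks(-14, 32, 13))  # prints [-13, 0, 13, 26, 39]
```

The first break falls just above the minimum and the last overshoots the maximum, which reproduces the nearly empty first class and the top class extending past any real value.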