<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <channel>
    <title>Hocodata on frankhecker.com</title>
    <link>https://frankhecker.com/tags/hocodata/</link>
    <description>Recent content in Hocodata on frankhecker.com</description>
    <generator>Hugo -- 0.156.0</generator>
    <language>en</language>
    <lastBuildDate>Wed, 09 Oct 2019 12:00:00 -0500</lastBuildDate>
    <atom:link href="https://frankhecker.com/tags/hocodata/index.xml" rel="self" type="application/rss+xml" />
    <item>
      <title>Population density in Ellicott City, Maryland</title>
      <link>https://frankhecker.com/2019/10/09/population-density-in-ellicott-city-maryland/</link>
      <pubDate>Wed, 09 Oct 2019 12:00:00 -0500</pubDate>
      <guid>https://frankhecker.com/2019/10/09/population-density-in-ellicott-city-maryland/</guid>
      <description>A look at the numbers at the census block group and census block levels.</description>
      <content:encoded><![CDATA[<p><a href="/assets/images/ec-density-2010-cbg-map.jpg"><img loading="lazy" src="/assets/images/ec-density-2010-cbg-map-embed.jpg"></a></p>
<p>UPDATE 2024/01/04: This post originally appeared on my <em>Civility and Truth</em> Substack newsletter.  I’ve moved it to my main site in an effort to collect all of my writing in one place.</p>
<p>I’m back again with more population density maps, this time for Ellicott City&mdash;or, more precisely, the Ellicott City Census Designated Place or CDP.</p>
<p>The map above shows population density variations as of the 2010 census for the various census block groups that are wholly or mostly within the Ellicott City CDP.  There are 34 such census block groups, compared to 54 for Columbia and 154 within the county as a whole; Ellicott City thus accounts for about a fifth of Howard County’s census block groups.  (The 2010 population of 64,245 for these census block groups is also a little over a fifth of the county’s total 2010 population.)</p>
<p>The main take-away from the map above is that most of Ellicott City has pretty much the same population density: it has less multi-unit housing than Columbia, and fewer large lots with single-family homes than western Howard County.  (The major exception is the area east of US 29 between US 40 and I-70, which contains a number of apartment complexes.)</p>
<p><a href="/assets/images/ec-density-2010-cbg-graph.png"><img loading="lazy" src="/assets/images/ec-density-2010-cbg-graph-embed.png"></a></p>
<p>This is confirmed by the histogram above, which shows the distribution of density among the various Ellicott City census block groups in 2010.  Most of the CDP was between 1,000 and 4,000 people per square mile, with only two census block groups out of 34 having higher density.  The overall population density for Ellicott City in 2010 was 2,234 people per square mile, about a third less than the overall population density for Columbia.</p>
<p><a href="/assets/images/ec-density-changes-map.jpg"><img loading="lazy" src="/assets/images/ec-density-changes-map-embed.jpg"></a></p>
<p>The next map shows changes in population density in Ellicott City between the 2010 census and the 2017 American Community Survey 5-year estimates (which reflect surveys done in 2013 through 2017).  Even over this relatively short period we can see a significant decrease in density in the area between US 40 and I-70 west of Rogers Avenue (which includes a good chunk of Patapsco Valley State Park).</p>
<p><a href="/assets/images/ec-density-2013-2017-cbg-map.jpg"><img loading="lazy" src="/assets/images/ec-density-2013-2017-cbg-map-embed.jpg"></a></p>
<p>There was also almost a doubling of population density in the area north of old Ellicott City and south of US 40.  I don’t know whether that increase is due to new development or to changes in household size.  However, it’s worth noting that, comparing the 2010 population density map with the 2013-2017 map immediately above, that area was originally less densely populated than most of Ellicott City, and whatever changes occurred brought it up to a “typical” density for the area.</p>
<p><a href="/assets/images/ec-density-2010-cb-map.jpg"><img loading="lazy" src="/assets/images/ec-density-2010-cb-map-embed.jpg"></a></p>
<p>As I did for Columbia, I also mapped density variations at the level of census blocks.  This uses block-level population data, which is available for the 2010 census but not for the American Community Survey.</p>
<p>This map is most notable for showing areas of Ellicott City that have very low population densities.  These include retail areas like the Long Gate shopping center north of MD 100 and east of US 29, park areas including Patapsco Valley State Park, Meadowbrook Park, and Centennial Park, and golf courses like the one at Turf Valley.</p>
<p>Note that as with Columbia some areas appear on this map that were not on the prior maps.  These reflect census blocks that are in the Ellicott City CDP, but that are in census block groups that are mostly not in Ellicott City.  The most notable example of this is the Turf Valley resort.</p>
<p>There is also a set of Ellicott City census blocks west of Centennial Lane that appear to be almost disconnected from the rest of the CDP.  These appear to be associated with the soccer fields and church at Covenant Park, with Centennial High School and Burleigh Manor Middle School another “low-density” area just north of there.  (Recall that “low-density” in this context refers to the size of the residential population in a given area, not how built-up the area is.)</p>
<p>You can find the code and data behind this post, as well as more Ellicott City population statistics, in the document “<a href="https://rpubs.com/frankhecker/519881">Ellicott City, Maryland, population density</a>.”</p>
<p>That’s all for this week.  A reminder: if you find these posts interesting and useful please tell other people about them and encourage them to subscribe to the Civility and Truth mailing list.  Having readers who care enough to subscribe helps motivate me to send these posts out on a regular basis, and the more readers I have the more motivated I’ll be.  In the meantime, thanks for reading this post!</p>
]]></content:encoded>
    </item>
    <item>
      <title>Population density in Columbia, Maryland</title>
      <link>https://frankhecker.com/2019/10/02/population-density-in-columbia-maryland/</link>
      <pubDate>Wed, 02 Oct 2019 12:00:00 -0500</pubDate>
      <guid>https://frankhecker.com/2019/10/02/population-density-in-columbia-maryland/</guid>
      <description>Zooming in on the Columbia CDP at the census block and block group levels</description>
      <content:encoded><![CDATA[<p><a href="/assets/images/columbia-density-2010-cbg-map.jpg"><img loading="lazy" src="/assets/images/columbia-density-2010-cbg-map-embed.jpg"></a></p>
<p>UPDATE 2024/01/04: This post originally appeared on my <em>Civility and Truth</em> Substack newsletter.  I’ve moved it to my main site in an effort to collect all of my writing in one place.</p>
<p>Hey!  I finally managed to figure out how to get a list of census block groups or census blocks for the Columbia CDP.  (As a reminder, “CDP” or “Census Designated Place” is US Census Bureau jargon for a population center that’s unincorporated.)  So now I can bring you some density maps that zoom in to focus on just Columbia as opposed to all of Howard County.</p>
<p>The map above shows population density variations as of the 2010 census for the various census block groups that are wholly or mostly within the Columbia CDP.  (Some census block groups contain only a small portion of the Columbia CDP.  I omitted them from the map.) There are 54 such census block groups, as compared to 154 census block groups within the county.  Thus from this point of view Columbia accounts for about a third of Howard County.</p>
<p>As I mentioned previously, census block groups are a nice “not too large, not too small” subdivision.  In 2010 the least populated census block group in Columbia contained 645 people, while the most populated census block group contained 3,632 people.  The smallest Columbia census block group covered an area of 0.12 square miles (about 79 acres), while the largest block group covered an area of 3.11 square miles.</p>
<p>A typical census block group is considerably smaller than a Columbia village: since there are nine Columbia villages, each village would contain about 6 census block groups on average if they were equally distributed.</p>
<p>In the map above I’ve shown more roads than in my previous maps, to help orient you vis-a-vis the various parts of Columbia.  I thought about also superimposing the boundaries for the various Columbia villages (data that’s available on the Howard County GIS site), but ran out of time to make this work.</p>
<p>The main take-away from the map above is the areas of Columbia that are relatively high-density vs. relatively low-density.  Relatively high-density areas include portions of Harpers Choice and Owen Brown villages, presumably due to apartment complexes there.  In contrast the Columbia Gateway area is primarily office space and thus low-density in terms of residential population.</p>
<p><a href="/assets/images/columbia-density-2010-cbg-graph.png"><img loading="lazy" src="/assets/images/columbia-density-2010-cbg-graph-embed.png"></a></p>
<p>The histogram shows the distribution of density among the various Columbia census block groups in 2010.  Density varied from a low of 761 people per square mile to a high of 13,285 people per square mile, a difference of over an order of magnitude.  Overall population density for Columbia in 2010 was 3,187 people per square mile.</p>
<p>As a comparison, the lowest-density Howard County census block group in 2010 had 151 people per square mile, and the highest-density block group had 15,181 people per square mile.  Overall population density for the county in 2010 was 1,144 people per square mile.  Thus the least-dense Columbia census block group was over five times as dense as the lowest-density Howard County block group, and Columbia as a whole was nearly three times as densely populated as the county as a whole.</p>
<p><a href="/assets/images/columbia-density-changes-map.jpg"><img loading="lazy" src="/assets/images/columbia-density-changes-map-embed.jpg"></a></p>
<p>The next map shows changes in population density in Columbia between the 2010 census and the 2017 American Community Survey 5-year estimates (which reflect surveys done in 2013 through 2017).  Even over this relatively short period we can see significant decreases in density in the area of Harpers Choice village off Harpers Farm Road in northwest Columbia, and an increase in density in the area just south of there, north of Little Patuxent Parkway and east of Cedar Lane.  I don’t know if there’s been any change in the total number of housing units in those areas, so my initial guess is that these changes are due to changes in household size: older children moving out on the one hand, and more young children on the other.</p>
<p><a href="/assets/images/columbia-density-2010-cb-map.jpg"><img loading="lazy" src="/assets/images/columbia-density-2010-cb-map-embed.jpg"></a></p>
<p>For my final map I decided to map density variations at the level of census blocks.  This uses block-level population data, which is available for the 2010 census but not for the American Community Survey.</p>
<p>Census blocks are very small: there are 1,605 census blocks in the Columbia CDP, with an average area of 0.02 square miles (about 13 acres).  In 2010 over half of the census blocks contained no people at all, and the average population of a block was only 62 people; the largest block contained 2,696 people.</p>
<p>Because of the small size of census blocks the corresponding population density can be very high if the block mainly contains apartment complexes.  In 2010 there were several census blocks in Columbia with population densities over 50,000 people per square mile, and a few over 100,000 people per square mile.  At the other end of the spectrum the majority of census blocks in Columbia contain no people and thus have a population density of zero.</p>
<p>Because of this wide distribution of population densities I don’t think the block-level map is that useful for looking at population densities in residential areas.  However on this map it’s very easy to pick out the parts of Columbia that are devoted to office, retail, or industrial uses (in dark blue).  This includes the Columbia Gateway and Dobbin Road areas on either side of Snowden River Parkway, the areas along Broken Land Parkway, the Mall at Columbia and the future Merriweather District, and the area around the hospital and Howard Community College.</p>
<p>UPDATE: I was able to resolve the problems that were preventing me from publishing the underlying code and data for this post.  See the document “<a href="http://rpubs.com/frankhecker/518294">Columbia, Maryland, population density</a>.”</p>
]]></content:encoded>
    </item>
    <item>
      <title>More on Howard County population density</title>
      <link>https://frankhecker.com/2019/09/19/more-on-howard-county-population-density/</link>
      <pubDate>Thu, 19 Sep 2019 12:00:00 -0500</pubDate>
      <guid>https://frankhecker.com/2019/09/19/more-on-howard-county-population-density/</guid>
      <description>Looking at density variations in a slightly different way.</description>
      <content:encoded><![CDATA[<p><a href="/assets/images/hocomd-pop-density-quintiles-2010.png"><img loading="lazy" src="/assets/images/hocomd-pop-density-quintiles-2010-embed.png"></a></p>
<p>UPDATE 2024/01/04: This post originally appeared on my <em>Civility and Truth</em> Substack newsletter.  I’ve moved it to my main site in an effort to collect all of my writing in one place.</p>
<p>This is a brief follow-up to last week’s post.  I had hoped to be able to take a closer look at density variations in Columbia and Ellicott City.  However I haven’t yet found a good way (at least via the Census API) to get a list of census block groups or census blocks for the Columbia and Ellicott City CDPs.  (“CDP” or “Census Designated Place” is US Census Bureau jargon for a population center that’s unincorporated.)</p>
<p>I also happened to think about whether the census block group boundaries had changed from 2010 to 2017.  After looking at this I concluded that they probably had not, but this took a while to nail down.</p>
<p>The one new thing I did was to produce a map of density variations based on the quintiles the various census block groups fell into.  In other words, in the map above the census block groups in dark blue (quintile 1) are the 20% of all census block groups with the lowest population density in 2010, while the census block groups in yellow (quintile 5) are the 20% of all census block groups with the highest population density in 2010.  The other areas in light blue, green, and orange (quintiles 2, 3, and 4 respectively) are intermediate between those two groups.</p>
<p>I think this map does a slightly better job of letting you tell at a glance which are the most dense and least dense parts of the county, as well as which areas are roughly in the middle in terms of density.</p>
<p><a href="/assets/images/hocomd-pop-density-quintiles-2013-2017.png"><img loading="lazy" src="/assets/images/hocomd-pop-density-quintiles-2013-2017-embed.png"></a></p>
<p>Here’s a similar map using the 2017 ACS 5-year estimates, which cover the 2013-2017 timeframe.  At first glance I can’t see any differences between this map and the prior map.  This means that even though some areas of the county may have experienced changes in population density between 2010 and 2013-2017, the changes weren’t large enough to make any real difference in the overall density picture.</p>
<p>I’m going to try again to do density maps for Columbia and Ellicott City.  In the meantime, see the revised version of “<a href="http://rpubs.com/frankhecker/513490">Howard County density trends by census block groups</a>” for the code behind the maps above.</p>
]]></content:encoded>
    </item>
    <item>
      <title>Howard County: divided by density?</title>
      <link>https://frankhecker.com/2019/09/12/howard-county-divided-by-density/</link>
      <pubDate>Thu, 12 Sep 2019 08:00:00 -0400</pubDate>
      <guid>https://frankhecker.com/2019/09/12/howard-county-divided-by-density/</guid>
      <description>Some areas of Howard County are over a hundred times more densely populated than others.</description>
      <content:encoded><![CDATA[<figure><a href="/assets/images/hocomd-pop-density-map-2010.png">
    <img loading="lazy" src="/assets/images/hocomd-pop-density-map-2010-embed.png"
         alt="Map of Howard County population density based on 2010 census"/> </a><figcaption>
            <p>A map of population density of each of the 154 census block groups in Howard County, Maryland. Click for a higher-resolution version.  Map by Frank Hecker, made available under the <a href="https://creativecommons.org/publicdomain/zero/1.0/">CC0 1.0 Universal Public Domain Dedication</a>.</p>
        </figcaption>
</figure>

<p><em>tl;dr: Some areas of Howard County are over a hundred times more densely populated than others.</em></p>
<p>A long-time theme in writings about Howard County is the distinction between the more densely populated suburban and semi-urban areas like Columbia and the less densely populated rural areas in the western part of the county.  This has implications for issues from political affiliations to school redistricting, and of course for affordable housing as well.</p>
<p>In this post I’m going to ignore those issues though, and just look at the simple facts about density variation across the county.  The map above shows density variations as of the 2010 census&mdash;a data source I chose because it contains accurate population counts at a fairly fine-grained level.<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup>  The map shows population densities (people per square mile) for each of the census block groups within the county.</p>
<p>A census block group is a geography defined by the US Census Bureau that is one level below a census tract.  There are currently 154 census block groups in Howard County, compared to 55 census tracts. The smallest geography in US Census data is the census block, one level below the census block group.  There are currently 4,845 census blocks defined for Howard County.</p>
<p>Census tracts are relatively large, with about 6,000 people or so on average in Howard County.  On the other hand, census blocks are too small: more than half of all census blocks in Howard County contained no people at all in the 2010 census.</p>
<p>Census block groups are a nice “not too large, not too small” subdivision of the county’s overall area.  In 2010 the least populated census block group contained 645 people, while the most populated census block group contained 3,632 people.  The smallest block group covered an area of 0.1 square miles (about 64 acres), while the largest block group covered an area of 13.4 square miles.</p>
<p>A typical census block group is thus comparable in both size and population to a traditional village or small town.  It is large enough to be a recognizable “place,” but small enough to have its own identity distinct from that of other places in the county.</p>
<p>What about density?  One of the most surprising things to me in doing this analysis was the wide variation in population density across the county.  Population density in 2010 varied from a low of 151 people per square mile to a high of 15,181 people per square mile, a difference of two orders of magnitude.  In comparison, overall population density for the county in 2010 was 1,144 people per square mile (287,085 people divided by the county land area of 251 square miles).</p>
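<p>To make the arithmetic concrete, here is a minimal Python sketch of the density figures quoted above.  It is not the R code behind the maps; the function and variable names are made up for illustration, and the inputs are just the numbers from this paragraph.</p>

```python
import math

# Figures quoted in the post (2010 census, Howard County).
county_population = 287_085
county_land_area_sq_mi = 251
lowest_density = 151      # least dense block group, people per square mile
highest_density = 15_181  # most dense block group, people per square mile


def population_density(people, area_sq_mi):
    """Return people per square mile (illustrative helper, not from the post)."""
    return people / area_sq_mi


overall_density = population_density(county_population, county_land_area_sq_mi)
density_ratio = highest_density / lowest_density
orders_of_magnitude = math.log10(density_ratio)

print(f"Overall density: {overall_density:,.0f} people per square mile")
print(f"Highest vs. lowest block group: {density_ratio:.0f}x "
      f"(about {orders_of_magnitude:.1f} orders of magnitude)")
```

<p>The ratio of the highest to the lowest density is what “two orders of magnitude” refers to: its base-10 logarithm is roughly 2.</p>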
<figure><a href="/assets/images/hocomd-pop-density-quintiles-2010.png">
    <img loading="lazy" src="/assets/images/hocomd-pop-density-quintiles-2010-embed.png"
         alt="Map of Howard County population density quintiles based on 2010 census"/> </a><figcaption>
            <p>The 154 census block groups in Howard County, Maryland, divided into five different groups based on their population density in the 2010 census. Click for a higher-resolution version.  Map by Frank Hecker, made available under the <a href="https://creativecommons.org/publicdomain/zero/1.0/">CC0 1.0 Universal Public Domain Dedication</a>.</p>
        </figcaption>
</figure>

<p>The map above is a variation on the first map.  It is based on the same 2010 census data, but divides the census block groups into five groups (or “quintiles”) of 31 block groups each (or 30, for the highest quintile).  This map shows much more clearly that almost all of the census block groups with the highest population density are in Columbia and eastern Howard County, and almost all of the census block groups with the lowest population density are in western Howard County.</p>
<p>(Some of the major exceptions are areas like Columbia Gateway and the light industrial districts east of I-95 that have little or no residential construction.  Here “low population density” is not the same as “not built up.”)</p>
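<p>The quintile grouping can be sketched roughly as follows.  This is a Python illustration under stated assumptions, not the R code used for the map: the helper <code>assign_quintiles</code> and the example densities are hypothetical, and the rank-based split is just one simple way to get four groups of 31 and one of 30 out of 154 block groups.</p>

```python
from collections import Counter


def assign_quintiles(densities):
    """Map each entry in `densities` to a quintile from 1 (lowest) to 5 (highest)."""
    n = len(densities)
    # Indices of the block groups, ordered from least to most dense.
    order = sorted(range(n), key=lambda i: densities[i])
    quintiles = [0] * n
    for rank, index in enumerate(order):
        # Integer arithmetic spreads the ranks as evenly as possible over 5 groups.
        quintiles[index] = rank * 5 // n + 1
    return quintiles


# Hypothetical densities for 154 block groups (not the real census values).
example_densities = [150 + 97 * i for i in range(154)]
quintiles = assign_quintiles(example_densities)
group_sizes = Counter(quintiles)
print([group_sizes[q] for q in range(1, 6)])  # prints [31, 31, 31, 31, 30]
```

<p>Because 154 is not divisible by 5, one group has to be smaller than the others; with this split it is the highest quintile, matching the group sizes described above.</p>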
<figure><a href="/assets/images/hocomd-pop-density-histogram-2010.png">
    <img loading="lazy" src="/assets/images/hocomd-pop-density-histogram-2010-embed.png"
         alt="Howard County population density histogram based on 2010 census"/> </a><figcaption>
            <p>A histogram showing the number of census block groups in Howard County, Maryland, that fall into certain ranges of population density.  Click for a higher-resolution version.  Graph by Frank Hecker, made available under the <a href="https://creativecommons.org/publicdomain/zero/1.0/">CC0 1.0 Universal Public Domain Dedication</a>.</p>
        </figcaption>
</figure>

<p>As shown in the above histogram, and also in the first map above, there are many census block groups in Howard County that had a population density of less than 500 people per square mile in 2010. At the other end of the spectrum a few census block groups had population densities of well over 10,000 people per square mile.</p>
<p>The largest number of census block groups fell into the range of 2,000&ndash;2,500 people per square mile; the typical (median) block group had a population density of about 2,400 people per square mile.  (This is different from the overall population density of Howard County of 1,144 people per square mile quoted above.)</p>
<p>To help think about what these numbers mean, consider a square mile, about 640 acres.  Suppose we have a few neighborhoods of single-family houses on 3- to 4-acre lots, with three or four people per house. With 50&ndash;100 total houses we have a total of 150&ndash;400 acres and 150&ndash;400 people.  Throw in two or three 100&ndash;150 acre farms plus road surfaces and open spaces and you’d have a typical Howard County rural census block group with a population density of 200&ndash;500 people per square mile.</p>
<p>Now suppose instead we have 0.1 square miles, about 64 acres, occupied by five or six apartment buildings with 50&ndash;100 units each, with two to three people per unit.  Now we have 500&ndash;1,800 people total in that 0.1 square mile area, for a total of 5,000 to 18,000 people per square mile&mdash;in other words, a typical semi-urban setting in Columbia or eastern Howard County.</p>
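<p>The apartment scenario can be checked with a few lines of Python.  This is only a back-of-the-envelope sketch using the round numbers from the paragraph above; the names are made up for illustration.</p>

```python
ACRES_PER_SQUARE_MILE = 640


def density_range(people_low, people_high, area_sq_mi):
    """Return the (low, high) people-per-square-mile range for an area."""
    return people_low / area_sq_mi, people_high / area_sq_mi


# Apartment scenario from the text: 5-6 buildings, 50-100 units each,
# 2-3 people per unit, on 0.1 square miles.
apartment_people_low = 5 * 50 * 2
apartment_people_high = 6 * 100 * 3
apartment_area_sq_mi = 0.1

low, high = density_range(apartment_people_low, apartment_people_high,
                          apartment_area_sq_mi)
print(f"{apartment_area_sq_mi * ACRES_PER_SQUARE_MILE:.0f} acres, "
      f"{apartment_people_low}-{apartment_people_high} people, "
      f"{low:,.0f}-{high:,.0f} people per square mile")
```

<p>The same helper works for the rural scenario: divide the number of people by the area in square miles, keeping in mind that a square mile is 640 acres.</p>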
<figure><a href="/assets/images/hocomd-pop-density-changes-2010-2017.png">
    <img loading="lazy" src="/assets/images/hocomd-pop-density-changes-2010-2017-embed.png"
         alt="Map of Howard County population density changes based on 2010 census and 2013-2017 ACS estimates"/> </a><figcaption>
            <p>A map showing estimated changes in population density of census block groups in Howard County, Maryland, between the 2010 census and the 2013&ndash;2017 timeframe.  Click for a higher-resolution version.  Map by Frank Hecker, made available under the <a href="https://creativecommons.org/publicdomain/zero/1.0/">CC0 1.0 Universal Public Domain Dedication</a>.</p>
        </figcaption>
</figure>

<p>How is population density in the various parts of Howard County changing over time?  It’s hard to get a good picture of this in between censuses, because the available population estimates are from surveys taken over multiple years (from 2013 through 2017 for the latest available data) and have fairly high margins of error at the level of census block groups (up to 30% or more above or below the estimates themselves).</p>
<p>The map above is an attempt to show density changes from the 2010 census forward, using the American Community Survey 2017 5-year estimates.  As with population density itself, there is wide variation in population density changes.</p>
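<p>One way to keep the margins of error in mind when reading the map is to check whether an estimated change is larger than the margin of error on the later estimate.  The sketch below is a simplified illustration, not part of the analysis behind the map: it treats the 2010 census count as exact (an assumption), and the block group figures are invented for the example.</p>

```python
def change_with_margin(census_count, acs_estimate, acs_margin_of_error):
    """Compare a census count with a later ACS estimate and its margin of error.

    Returns the estimated change and whether it exceeds the margin of error,
    treating the census count as exact (an assumption of this sketch).
    """
    change = acs_estimate - census_count
    exceeds_margin = abs(change) > acs_margin_of_error
    return change, exceeds_margin


# Hypothetical block groups: (2010 count, ACS estimate, ACS margin of error).
hypothetical_block_groups = {
    "A": (2000, 2300, 600),
    "B": (1500, 2900, 700),
}

for name, (count, estimate, margin) in hypothetical_block_groups.items():
    change, exceeds = change_with_margin(count, estimate, margin)
    verdict = "larger than" if exceeds else "within"
    print(f"Block group {name}: change of {change:+d} is {verdict} the margin of error")
```

<p>In this made-up example the apparent growth in block group A could be survey noise, while the change in block group B is large enough that it probably reflects a real increase.</p>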
<p>A few areas stand out as having significant increases in population density, from 50 to 100%.  These appear to include the Maple Lawn Farms development in Fulton as well as adjacent neighborhoods south of MD 216, areas along US 1 south and north of MD 175, and areas near downtown Ellicott City.</p>
<p>Other areas apparently experienced decreases in population density. Assuming that the number of housing units did not decrease in those areas, this likely was caused by the number of people per household decreasing, for example due to children leaving families and “empty nesters” remaining.  (Additional census data, for example on household size and the ages of household members, should be able to confirm or refute this idea.)</p>
<p>To sum up: we may argue about how the density divide in Howard County came about and what it all means, but I don’t think there’s any dispute that it exists.  It is especially clear in western Columbia, where the drop-off in density west and north of MD 108 and (to a lesser extent) south of MD 32 is particularly dramatic.  It’s also apparent that new development and demographic changes in family size are having disparate impacts across the county.</p>
<p>Combining this density data with information on socioeconomic and political variables could uncover some interesting patterns. Hopefully I’ll have time in the future to look at this.</p>
<h2 id="further-exploration">Further exploration</h2>
<p>For more on how I created the maps and histogram above, see the following:</p>
<ul>
<li>“<a href="http://rpubs.com/frankhecker/513490">Howard County density trends by census block groups</a>” shows the R code used to produce these and other graphs.</li>
<li>My <a href="https://gitlab.com/frankhecker/hocodata">hocodata code repository</a> includes copies of the R Markdown files for this and other analyses.  (Look in the “affordability” subdirectory.)</li>
<li>If you sign up for a free account on the <a href="https://rstudio.cloud/">Rstudio.cloud</a> service you can open and make a copy of my <a href="https://rstudio.cloud/project/353602">hocodata project</a> for this and other analyses, and try your hand at it yourself. (Again, look in the “affordability” subdirectory, and check out the <a href="https://rstudio.cloud/learn/primers">RStudio primers</a> to learn how to use the system.)</li>
</ul>
<p>I also did two other articles focusing specifically on population density in <a href="/2019/10/02/population-density-in-columbia-maryland/">Columbia</a> and <a href="/2019/10/09/population-density-in-ellicott-city-maryland/">Ellicott City</a> for my (now deprecated) <em>Civility and Truth</em> newsletter.</p>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>I included some major Howard County highways on the map to help readers orient themselves: interstates, US highways, Maryland numbered routes, and roads with “Parkway” in their name.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content:encoded>
    </item>
    <item>
      <title>Which areas of Howard County are most and least affluent?</title>
      <link>https://frankhecker.com/2019/07/07/which-areas-of-howard-county-are-most-and-least-affluent/</link>
      <pubDate>Sun, 07 Jul 2019 19:00:00 -0400</pubDate>
      <guid>https://frankhecker.com/2019/07/07/which-areas-of-howard-county-are-most-and-least-affluent/</guid>
      <description>I look at median household income within Howard County, Maryland, and how it has changed.</description>
      <content:encoded><![CDATA[<figure><a href="/assets/images/hocomd-mhi-quintiles-2010.png">
    <img loading="lazy" src="/assets/images/hocomd-mhi-quintiles-2010-embed.png"
         alt="Howard County median household income quintiles for 2006-2010"/> </a><figcaption>
            <p>Click for a higher-resolution version.  Graph by Frank Hecker, made available under the <a href="https://creativecommons.org/publicdomain/zero/1.0/">CC0 1.0 Universal Public Domain Dedication</a>.</p>
        </figcaption>
</figure>

<p><em>tl;dr: I look at median household income within Howard County, Maryland, and how it has changed.</em></p>
<p>I’m continuing my look at median household income, in pursuit of my ultimate goal of learning more about the issues around housing affordability in Howard County.</p>
<p>After my <a href="/2019/06/02/how-affluent-is-howard-county-really/">previous post</a> about Howard County median household income compared to other local jurisdictions, I now turn my attention to looking at median household income within the different parts of Howard County.  Unfortunately data at the census tract level (the next level below county level as far as the US Census Bureau is concerned) doesn’t go back very far, and has some issues that make analysis more difficult.</p>
<p>First, median household income estimates for census tracts are not available before 2009, at least for Howard County.  Second, the boundaries for Howard County census tracts changed from 2009 to 2010 (presumably as part of the work on the 2010 census).  (Fortunately, the census tract boundaries have been stable since then.)</p>
<p>Also, figures for median household income for census tracts are available only in the American Community Survey 5-year estimates. There are no 1-year estimates available as there are for counties and cities.  This means that the 2010 median household income estimates published for Howard County census tracts actually reflect income surveys in the years 2006&ndash;2010, the 2011 estimates reflect surveys in the years 2007&ndash;2011, and so on.  For this reason the US Census Bureau recommends not comparing 5-year estimates from overlapping sets of years, for example comparing 2011 5-year estimates to 2010 5-year estimates.</p>
<p>I’ve therefore chosen to compare only the 2010 and 2017 5-year estimates, in order to get both the earliest and latest comparable data.</p>
<p>Next, the margins of error for median household income at the census tract level are very high, typically 10&ndash;20% or even higher relative to the base income figures.  That means that ranking individual census tracts based on their median household income doesn’t really make sense, given that the relative positions of tracts on the list will in large part reflect random measurement errors.</p>
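<p>One way to make this concrete is the standard test for whether two survey estimates differ by more than sampling error. The sketch below is in Python rather than the R used for the analysis in this post, and the function name and all dollar figures are made up for illustration. It assumes the two estimates are independent, and it relies on ACS margins of error being published at the 90% confidence level, so that a standard error is the margin of error divided by 1.645.</p>

```python
import math

Z_90 = 1.645  # ACS margins of error are published at the 90% confidence level


def differ_significantly(est_a, moe_a, est_b, moe_b):
    """Return True if two ACS estimates differ at the 90% confidence level.

    Hypothetical helper for illustration; assumes independent estimates.
    """
    se_a = moe_a / Z_90  # convert each margin of error to a standard error
    se_b = moe_b / Z_90
    z = (est_a - est_b) / math.sqrt(se_a ** 2 + se_b ** 2)
    return abs(z) > Z_90


# Made-up figures: two tracts $8,000 apart, each with a 15% margin of error.
close_call = differ_significantly(100_000, 15_000, 108_000, 16_200)
# Made-up figures: two tracts $60,000 apart, again with 15% margins of error.
clear_gap = differ_significantly(80_000, 12_000, 140_000, 21_000)
```

<p>With these made-up inputs <code>close_call</code> is false and <code>clear_gap</code> is true: an $8,000 gap between two tracts is well within the noise, which is why a tract-by-tract ranking would mostly be ordering measurement error.</p>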
<p>Finally, there are 55 census tracts in Howard County, one for every 6,000 people on average.  (The 2018 ACS population estimate for Howard County is about 323,000 people.)  Graphing information on individual census tracts makes for a cluttered graph, unless you either plot data on a map or group census tracts together in some way.</p>
<p>In the map above I did both: I took the 55 census tracts and divided them into 5 groups or “quintiles” of 11 census tracts each, based on their median household income 5-year estimates for 2010.  Quintile 1 contains the 11 census tracts with the lowest median household incomes and quintile 5 contains the 11 census tracts with the highest median household incomes, with the other quintiles containing tracts with income values intermediate between the lowest and highest.</p>
<p>I then plotted the census tracts on a map of Howard County, with each tract colored according to the quintile it’s in.<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup>  The results are as one might expect: the census tracts with the lowest median household incomes tend to be in eastern Howard County and/or in Columbia, while the tracts with higher median household incomes tend to be in western Howard County.</p>
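<p>The quintile grouping described above can be sketched in a few lines. This is an illustrative Python version, not the R code linked at the end of this post; the function name, tract IDs, and income values are all hypothetical, and it assumes the number of tracts divides evenly by 5, as 55 does.</p>

```python
def assign_quintiles(incomes):
    """Map each tract ID to a quintile from 1 (lowest income) to 5 (highest).

    `incomes` is a dict of tract ID -> median household income estimate.
    Assumes the number of tracts divides evenly by 5.
    """
    ranked = sorted(incomes, key=incomes.get)  # tract IDs, lowest income first
    group_size = len(ranked) // 5
    return {tract: position // group_size + 1
            for position, tract in enumerate(ranked)}


# Made-up incomes for 55 hypothetical tracts (not real ACS figures).
example = {f"tract-{n:02d}": 60_000 + 2_500 * n for n in range(55)}
quintiles = assign_quintiles(example)
```

<p>Each of the five groups ends up with 11 tracts, and the resulting tract-to-quintile mapping is what would drive the coloring of a map.</p>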
<figure><a href="/assets/images/hocomd-mhi-quintile-changes-2010-2017.png">
    <img loading="lazy" src="/assets/images/hocomd-mhi-quintile-changes-2010-2017-embed.png"
         alt="Howard County inflation-adjusted average median household income changes by quintile from 2006-2010 to 2013-2017"/> </a><figcaption>
            <p>Click for a higher-resolution version.  Graph by Frank Hecker, made available under the <a href="https://creativecommons.org/publicdomain/zero/1.0/">CC 1.0 Universal Public Domain Dedication</a>.</p>
        </figcaption>
</figure>

<p>How did the median household income for the various census tracts in Howard County change from 2010 to 2017&mdash;or, more correctly, from the 2006&ndash;2010 timeframe to the 2013&ndash;2017 timeframe?</p>
<p>The graph above shows one way to look at this: I took the income quintiles from above, computed the averages of the inflation-adjusted median household incomes for the census tracts in each quintile, and graphed the results for 2017 compared to 2010.  Thus, for example, for the census tracts in quintile 1 (the lowest-income quintile) I computed the average from the 2010 5-year estimates of inflation-adjusted median household income for all of those 11 census tracts.  I computed similar averages for the other quintiles, and then repeated the process using the 2017 5-year estimates.</p>
<p>As a side note, averaging the median incomes in this way is not strictly speaking correct.  The correct method would be to aggregate all the individual household incomes for all the tracts in a quintile, and then compute a median household income for the quintile overall. However this is not possible because the US Census Bureau does not release data for individual households within a census tract. Averaging the median household incomes for the tracts is the next-best thing.</p>
<p>There’s another subtlety here as well: Many graphs showing changes between income quintiles over time actually reflect different sets of people, households, or geographies between the different timeframes. For example, if in the graph the 2010 figures were to reflect the quintiles as they exist in 2010, and the 2017 figures were to reflect the quintiles as they exist in 2017, then we would not necessarily be comparing like to like.  That’s because a given census tract may have moved from one quintile to another over time.</p>
<p>To avoid this problem I had the 2017 values in the graph reflect the income values for the same sets of quintiles as for the 2010 values: if a census tract were in, say, quintile 3 in 2010 then it was counted as part of the same set of tracts for 2017.</p>
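<p>The fixed-membership comparison can be sketched as follows. This is a minimal Python illustration rather than the R code linked at the end of this post; the function name, tract labels, and dollar figures are invented. The key point is that the same 2010 quintile membership is used to group both years’ incomes.</p>

```python
from statistics import mean


def quintile_averages(membership, incomes):
    """Average the tract medians within each quintile.

    `membership` maps tract ID -> quintile (fixed from the base year);
    `incomes` maps tract ID -> inflation-adjusted median household income.
    """
    grouped = {}
    for tract, quintile in membership.items():
        grouped.setdefault(quintile, []).append(incomes[tract])
    return {quintile: mean(values) for quintile, values in sorted(grouped.items())}


# Made-up numbers for four hypothetical tracts in two quintiles.
membership_2010 = {"A": 1, "B": 1, "C": 2, "D": 2}
income_2010 = {"A": 70_000, "B": 80_000, "C": 100_000, "D": 110_000}
income_2017 = {"A": 85_000, "B": 79_000, "C": 95_000, "D": 103_000}

avg_2010 = quintile_averages(membership_2010, income_2010)
avg_2017 = quintile_averages(membership_2010, income_2017)  # same 2010 groups
change = {q: avg_2017[q] - avg_2010[q] for q in avg_2010}
```

<p>In this toy example tract A overtakes tract B by 2017, but both are still counted in quintile 1 because membership was fixed using the 2010 figures.</p>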
<p>Take-aways from this graph (and the underlying data, not shown here) are as follows:</p>
<ul>
<li>The Howard County census tracts in the lowest quintile in 2010 (i.e., those with the lowest median household income) had a substantial increase in average (real) median household income between the 2006&ndash;2010 and 2013&ndash;2017 time frames: about $7,000 in real terms, or 9%. This was actually the largest increase of any of the quintiles in both absolute and percentage terms.</li>
<li>On the other hand, the census tracts in the next-to-lowest quintile in 2010 (quintile 2, just above quintile 1) experienced the largest decline in average (real) median household income between the two time frames in both absolute and percentage terms, about $6,000 or 5%.</li>
<li>The census tracts in the highest quintile in 2010 (quintile 5, the 20% of census tracts with the highest median household income) had a slight decrease in average (real) median household income between the two time frames, about $5,000 or 3%.</li>
<li>Finally, the census tracts in the other higher-income quintiles (quintiles 3 and 4) had slight decreases in average (real) median household income.</li>
</ul>
<p>The first item above is somewhat counterintuitive: does it mean that relatively less affluent people in Howard County actually did better income-wise between 2006&ndash;2010 and 2013&ndash;2017 than more affluent people? That’s one possible interpretation, but not the only one, and I suspect not the most likely one.</p>
<p>Remember that even though the census tracts may not have changed between the 2010 and 2017 5-year estimates graphed above, the people living within those census tracts did not necessarily remain the same.  In particular, it’s possible that many people living in the least affluent census tracts experienced a decline in household income so severe that they could no longer afford to live in Howard County.</p>
<p>Under this hypothetical scenario the people who left would presumably then be replaced by people with slightly higher incomes who <em>could</em> afford to live in Howard County.  As a result of this turnover the median household income of these least affluent census tracts would then increase, because the least affluent residents of those tracts would have moved to other counties.</p>
<p>This hypothetical scenario would also explain the difference between the experiences of the quintile 1 census tracts vs. the quintile 2 census tracts: Unlike those in quintile 1, the people in the quintile 2 tracts were presumably not “on the bubble” in terms of their being able to afford to live in Howard County.  They may have suffered declines in household income between the two time frames, but those declines were not so large as to force them to leave the county.</p>
<figure><a href="/assets/images/hocomd-mhi-tract-changes-2010-2017.png">
    <img loading="lazy" src="/assets/images/hocomd-mhi-tract-changes-2010-2017-embed.png"
         alt="Howard County median household income changes by census tract from 2006-2010 to 2013-2017"/> </a><figcaption>
            <p>Click for a higher-resolution version.  Graph by Frank Hecker, made available under the <a href="https://creativecommons.org/publicdomain/zero/1.0/">CC 1.0 Universal Public Domain Dedication</a>.</p>
        </figcaption>
</figure>

<p>This next map shows changes in median household income between the 2010 and 2017 5-year estimates for each census tract, expressed in percentage terms.  Again the calculated changes are based on median household incomes in constant 2017 dollars, so they reflect real changes rather than changes due to inflation.</p>
<p>Harking back to the discussion above, note that two of the census tracts with increases in median household income were those in Elkridge east of I-95 and north of MD 100.  These were in quintile 1 (lowest 20% of all tracts by income) in 2010.  On the other hand, the census tract east of I-95 between MD 32 and MD 175, which was in quintile 2 in 2010, experienced one of the largest declines in median household income.</p>
<figure><a href="/assets/images/hocomd-mhi-quintiles-2017.png">
    <img loading="lazy" src="/assets/images/hocomd-mhi-quintiles-2017-embed.png"
         alt="Howard County median household income quintiles for 2013-2017"/> </a><figcaption>
            <p>Click for a higher-resolution version.  Graph by Frank Hecker, made available under the <a href="https://creativecommons.org/publicdomain/zero/1.0/">CC 1.0 Universal Public Domain Dedication</a>.</p>
        </figcaption>
</figure>

<p>This final map, like the first map above, shows Howard County census tracts in the different income quintiles, this time based on the 2017 ACS 5-year estimates for median household income instead of the 2010 estimates.</p>
<p>Note that of the three census tracts discussed above, the first two moved up from quintile 1 (as measured by the 2010 estimates) to quintile 2 (as measured by the 2017 estimates), while the third moved down from quintile 2 to quintile 1.</p>
<p>I’m done with looking at median household income, at least for now. In my next post I’ll turn my attention to median home values.</p>
<h2 id="further-exploration">Further exploration</h2>
<p>For more on how I created the graphs above, see the following:</p>
<ul>
<li>“<a href="https://rpubs.com/frankhecker/hocomd-census-tract-median-household-income-trends">Howard County median household income trends by census tracts</a>” shows the R code used to produce these and other graphs.</li>
<li>My <a href="https://gitlab.com/frankhecker/hocodata">hocodata code repository</a> includes copies of the R Markdown files for this and other analyses.  (Look in the “affordability” subdirectory.)</li>
<li>If you sign up for a free account on the <a href="https://rstudio.cloud/">RStudio.cloud</a> service you can open and make a copy of my <a href="https://rstudio.cloud/project/353602">hocodata project</a> for this and other analyses, and try your hand at it yourself. (Again, look in the “affordability” subdirectory, and check out the <a href="https://rstudio.cloud/learn/primers">RStudio primers</a> to learn how to use the system.)</li>
</ul>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>I included some major Howard County highways on the map to help readers orient themselves. However to reduce clutter I included only highways that are also census tract boundaries along part or all of their length. That’s why highways like US 1 and MD 97 are not displayed.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content:encoded>
    </item>
    <item>
      <title>How affluent is Howard County, really?</title>
      <link>https://frankhecker.com/2019/06/02/how-affluent-is-howard-county-really/</link>
      <pubDate>Sun, 02 Jun 2019 09:22:00 -0400</pubDate>
      <guid>https://frankhecker.com/2019/06/02/how-affluent-is-howard-county-really/</guid>
      <description>Looking at median household income in Howard County, Maryland, over time compared to other local jurisdictions. [UPDATED]</description>
      <content:encoded><![CDATA[<figure><a href="/assets/images/hocomd-mhi-trends-adjusted.png">
    <img loading="lazy" src="/assets/images/hocomd-mhi-trends-adjusted-embed.png"
         alt="Howard County median household income vs. other local jurisdictions, 1-year estimates"/> </a><figcaption>
            <p>Click for a higher-resolution version.  Graph by Frank Hecker, made available under the <a href="https://creativecommons.org/publicdomain/zero/1.0/">CC 1.0 Universal Public Domain Dedication</a>.</p>
        </figcaption>
</figure>

<p>UPDATE: I’ve corrected my comments about ranking of counties to make it clear that the rankings reflect only counties and county-equivalents with populations over 65,000.</p>
<p><em>tl;dr: Looking at median household income in Howard County, Maryland, over time compared to other local jurisdictions.</em></p>
<p>I’m continuing my look at median household income, in pursuit of my ultimate goal of learning more about the issues around housing affordability in Howard County.</p>
<p>After my <a href="/2019/05/27/how-affluent-is-maryland-really/">previous post</a> about Maryland median household income I now turn my attention to looking at Howard County specifically. Unfortunately US Census Bureau data on median household income at the county level does not go back nearly as far as state-level data. The earliest county-level data I can find dates from 2005 and the beginning of the American Community Survey.</p>
<p>The graph above shows all the data I could find on Howard County median household income, compared to a select set of other jurisdictions.  All values are in 2017 dollars.  The gains and losses thus represent gains and losses in real terms after adjusting for inflation.</p>
<p>I chose the other jurisdictions as follows:</p>
<ul>
<li>
<p>Howard County has traditionally been compared with Loudoun County, Virginia, as the most affluent counties in Maryland and Virginia respectively.  For this graph I also added Stafford County, Virginia, a rapidly growing county that straddles I-95 south of D.C. just as Howard County straddles I-95 north of D.C.</p>
</li>
<li>
<p>I paired Montgomery County and Fairfax County, the largest and most affluent of the close-in suburban jurisdictions.</p>
</li>
<li>
<p>I paired D.C. and Baltimore city as the respective urban jurisdictions of the Washington-Baltimore metro area.</p>
</li>
<li>
<p>Finally, I added Anne Arundel County as one of Howard County’s most affluent neighbors in Maryland.</p>
</li>
</ul>
<p>(I would have also added Baltimore County and perhaps Frederick County, but I ran out of colors and didn’t want to make the graph more cluttered than it already is.)</p>
<p>Here are some immediate takeaways from the graph above:</p>
<ul>
<li>
<p>Howard County experienced a significant drop in median household income from 2016 to 2017.</p>
</li>
<li>
<p>Northern Virginia continues to outpace central Maryland when it comes to median household income, with Loudoun County still way out in front, Fairfax County continuing to lead Montgomery County, and Stafford County having caught up to Howard County.</p>
</li>
<li>
<p>Similarly the District of Columbia is widening the income gap between itself and Baltimore city, and narrowing the gap between itself and its suburbs.</p>
</li>
</ul>
<p>As to why these trends are occurring, I haven’t done enough research to have a solid opinion.  However I will note that median household income for the Northern Virginia suburbs is increasing even as median household income for Virginia as a whole is stagnant or decreasing. This is presumably due to the rest of Virginia suffering economic problems to which Northern Virginia is immune.</p>
<figure><a href="/assets/images/hocomd-mhi-trends-relative.png">
    <img loading="lazy" src="/assets/images/hocomd-mhi-trends-relative-embed.png"
         alt="Howard County median household income vs. other local jurisdictions, 1-year estimates"/> </a><figcaption>
            <p>Click for a higher-resolution version.  Graph by Frank Hecker, made available under the <a href="https://creativecommons.org/publicdomain/zero/1.0/">CC 1.0 Universal Public Domain Dedication</a>.</p>
        </figcaption>
</figure>

<p>This graph repeats the previous graph in comparing Howard County to other jurisdictions, except that here the measure is median household income for Howard County, etc., relative to US median household income.  My takeaways here are as follows:</p>
<ul>
<li>
<p>Loudoun County has separated itself from the pack in the last ten years, with a median household income now 225% or more of US median household income.</p>
</li>
<li>
<p>D.C. continues its growth in median household income, and is now above 130% of US median household income.  Given that Baltimore city median household income is stagnant at about 75-80% of US median household income, the next few years could see D.C. have almost double the median household income of Baltimore city.</p>
</li>
<li>
<p>Howard County and the other jurisdictions are trending steadily at 150-200% of US median household income.</p>
</li>
</ul>
<figure><a href="/assets/images/hocomd-mhi-trends-ranking.png">
    <img loading="lazy" src="/assets/images/hocomd-mhi-trends-ranking-embed.png"
         alt="Howard County median household income rank vs. other local jurisdictions"/> </a><figcaption>
            <p>Click for a higher-resolution version.  Graph by Frank Hecker, made available under the <a href="https://creativecommons.org/publicdomain/zero/1.0/">CC 1.0 Universal Public Domain Dedication</a>.</p>
        </figcaption>
</figure>

<p>The final graph shows the ranking of Howard County over the years versus the most affluent local jurisdictions.  The story is similar to that from the graphs above.</p>
<p>UPDATE: While the relative rankings below are correct, the absolute rank numbers do not account for any affluent counties or county-equivalents with populations under 65,000.  That’s because they are based on the American Community Survey 1-year estimates, and such estimates are not done for smaller counties.</p>
<ul>
<li>
<p>Loudoun County has maintained its position for the last ten years as the most affluent US county as measured by median household income, with Fairfax County also consistently in the top ten.  They have now been joined by Stafford County, although the margins of error for Stafford County estimates are so large that it’s likely a matter of sheer randomness whether Stafford County is in the top ten or just outside it.</p>
</li>
<li>
<p>Howard County has dropped out of the top five <em>and</em> the top ten, and Montgomery County has dropped out of the top ten.  (Again, due to margins of error Howard County is probably in reality roughly tied with Stafford County.)  Anne Arundel County was never in the top ten counties by income, and now sits at #30.</p>
</li>
</ul>
<p>Looking at the rankings for other jurisdictions, Virginia has four jurisdictions in the top ten (with Arlington County joining Loudoun, Fairfax, and Stafford), with the remaining top ten counties in California and New Jersey (with three each).  Looking beyond the top ten, ranks #11-30 include two more Virginia jurisdictions (Prince William County and Alexandria city) and five Maryland counties (with Calvert and Charles counties joining Howard, Montgomery, and Anne Arundel).</p>
<p>Caveats aside, overall this reinforces the story of northern Virginia’s economic success and suburban Maryland’s relative economic decline.</p>
<p>In my next post I’ll turn my attention to median household income within Howard County itself, looking at Census data by census tract.</p>
<h2 id="further-exploration">Further exploration</h2>
<p>For more on how I created the graphs above, see the following:</p>
<ul>
<li>“<a href="http://rpubs.com/frankhecker/howard-county-median-household-income-trends">Median household income trends for Howard County, Maryland</a>” shows the R code used to produce these and other graphs.</li>
<li>My <a href="https://gitlab.com/frankhecker/hocodata">hocodata code repository</a> includes copies of the raw data files and R Markdown files for this and other analyses.  (Look in the “affordability” subdirectory.)</li>
<li>If you sign up for a free account on the <a href="https://rstudio.cloud/">RStudio.cloud</a> service you can open and make a copy of my <a href="https://rstudio.cloud/project/353602">hocodata project</a> for this and other analyses, and try your hand at it yourself. (Again, look in the “affordability” subdirectory, and check out the <a href="https://rstudio.cloud/learn/primers">RStudio primers</a> to learn how to use the system.)</li>
</ul>
]]></content:encoded>
    </item>
    <item>
      <title>How affluent is Maryland, really?</title>
      <link>https://frankhecker.com/2019/05/27/how-affluent-is-maryland-really/</link>
      <pubDate>Mon, 27 May 2019 10:00:00 -0400</pubDate>
      <guid>https://frankhecker.com/2019/05/27/how-affluent-is-maryland-really/</guid>
      <description>Looking at median household income in Maryland over time compared to DC and Virginia.</description>
      <content:encoded><![CDATA[<figure><a href="/assets/images/maryland-median-household-income-3ya.png">
    <img loading="lazy" src="/assets/images/maryland-median-household-income-3ya-embed.png"
         alt="Maryland median household income vs. DC and Virginia, 3-year averages"/> </a><figcaption>
            <p>Click for a higher-resolution version.  Graph by Frank Hecker, made available under the <a href="https://creativecommons.org/publicdomain/zero/1.0/">CC 1.0 Universal Public Domain Dedication</a>.</p>
        </figcaption>
</figure>

<p><em>tl;dr: Looking at median household income in Maryland over time compared to DC and Virginia.</em></p>
<p>With so much attention paid recently to the issue of housing affordability in Howard County, together with related concerns about county taxes, developer impact fees, the school system budget, adequate public facilities, and so on, I decided to take a look at these issues myself.</p>
<p>Instead of throwing some opinions out there without any support, I’m first taking a look at what data I can find that’s relevant to the issue, starting with data on household income and how that’s changed over time.  After all, household income, along with house prices, is a key factor in how affordable Howard County is, both to existing residents and those hoping to move here.</p>
<p>I’d really like to look at income data specifically for Howard County, but unfortunately I can’t find any US Census Bureau income data at the county level until the 2000s.  I’m therefore starting with data on median household income at the state level.</p>
<p>Recall that “median” means half of all households have less income than this value, and half more.  Median income is a better measure than average income, which can be skewed by great inequality, especially at the high end of the income scale.  And household income is a better measure than per capita income, since almost by definition it’s households that buy houses.</p>
<p>Above you can see the data over the last thirty years or so for Maryland, DC, and Virginia, as well as for the US as a whole. These values are inflation-adjusted, expressed in 2017 dollars.  I also used the 3-year average to smooth out spikiness in the values, whether due to sampling issues or year-to-year economic fluctuations. Thus the figure for (say) the year 2000 is actually the average for 1998, 1999, and 2000, and the latest value for 2017 also reflects income levels in 2015 and 2016.</p>
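<p>The 3-year smoothing described above amounts to a trailing average. Here is a minimal Python sketch of the idea (not the R code linked at the end of this post); the function name and income figures are invented for illustration.</p>

```python
def trailing_average(values_by_year, window=3):
    """Average each year's value with the preceding `window - 1` years.

    Years without a full window of preceding data are omitted.
    """
    result = {}
    for year in sorted(values_by_year):
        span = range(year - window + 1, year + 1)
        if all(y in values_by_year for y in span):
            result[year] = sum(values_by_year[y] for y in span) / window
    return result


# Made-up inflation-adjusted incomes, not real Census figures.
income = {1998: 60_000, 1999: 63_000, 2000: 66_000, 2001: 64_500}
smoothed = trailing_average(income)
```

<p>Here the smoothed value for 2000 is the average of the 1998, 1999, and 2000 figures, and there is no smoothed value for 1998 or 1999 because the earlier years needed to fill the window are missing.</p>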
<p>Here are some quick take-aways from the above graph:</p>
<p>First, at least for the past thirty years Maryland median household income has always been higher than that for the US, as well as higher than median household income for both DC and Virginia (although DC has been quickly closing that gap).</p>
<p>Second, US median household income has not increased very much in real terms over time: less than $10,000 in 2017 dollars over thirty years, or about $300 per year.  Virginia’s median household income has risen even more slowly than that.</p>
<p>Using median household income as a measure of household affluence has been criticized because it does not include non-salary benefits like employer-paid health insurance.  This may make the relative stagnation of US median household income a less serious problem than it otherwise appears.  However there’s no question that Virginia households overall have done less well than those in Maryland and DC over the last thirty years.</p>
<figure><a href="/assets/images/maryland-median-household-income-pct.png">
    <img loading="lazy" src="/assets/images/maryland-median-household-income-pct-embed.png"
         alt="Maryland median household income vs. DC and Virginia, as a percentage of US median household income"/> </a><figcaption>
            <p>Click for a higher-resolution version.  Graph by Frank Hecker, made available under the <a href="https://creativecommons.org/publicdomain/zero/1.0/">CC 1.0 Universal Public Domain Dedication</a>.</p>
        </figcaption>
</figure>

<p>To get a better feel for how the local jurisdictions have fared relative to one another and to the US as a whole, here’s another graph showing median household income for Maryland, DC, and Virginia as a percentage of the US median household income (the horizontal line at 100%).  Again, this uses the 3-year averages for median household income.</p>
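<p>The calculation behind a graph like this is a simple ratio for each year. The following Python sketch (not the R code linked at the end of this post) uses a hypothetical function name and made-up dollar figures to show how a jurisdiction’s income can rise in real terms while its percentage of the US figure falls.</p>

```python
def percent_of_us(jurisdiction_income, us_income):
    """Express jurisdiction incomes as a percentage of the US figure, by year."""
    return {year: 100 * jurisdiction_income[year] / us_income[year]
            for year in jurisdiction_income if year in us_income}


# Made-up figures for illustration only.
us = {2016: 60_000, 2017: 62_000}
state = {2016: 78_000, 2017: 79_050}
relative = percent_of_us(state, us)
```

<p>In this toy example the state’s income rises by $1,050, but because the US figure rises faster its relative standing drops from 130% to 127.5%.</p>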
<p>My take-aways from this graph:</p>
<p>First, Maryland median household income has been between 120 and 130% of US median household income for most of the past thirty years.  It reached its high point in the early part of this decade, but has suffered a relative decline in the past few years.  In looking at the first graph above, it appears that the relative decline has occurred because recently US median household income has been rising faster than Maryland median household income.</p>
<p>Second, Virginia median household income has been between 110 and 120% of US median household income for most of the past thirty years. Like Maryland, it reached its high point in the early part of this decade, but has suffered an even sharper relative decline in the past few years.</p>
<p>Finally, DC has undergone a startling change from the 1990s, when its median household income was less than 90% of US median household income.  Today its median household income rivals the median household income of Maryland, and if we look at the 1-year estimates (not shown above) actually surpasses it.</p>
<p>In general median household income in a jurisdiction can rise either because existing households become more affluent, or because less affluent households leave the jurisdiction and are replaced by more affluent ones.  I haven’t done any research on what’s happening in DC with respect to in- and out-migration, so I can’t say which factor might have been more important.</p>
<figure><a href="/assets/images/maryland-median-household-income-rank.png">
    <img loading="lazy" src="/assets/images/maryland-median-household-income-rank-embed.png"
         alt="Maryland median household income rank vs. DC and Virginia"/> </a><figcaption>
            <p>Click for a higher-resolution version.  Graph by Frank Hecker, made available under the <a href="https://creativecommons.org/publicdomain/zero/1.0/">CC 1.0 Universal Public Domain Dedication</a>.</p>
        </figcaption>
</figure>

<p>People love to see how they’re doing in national rankings, so here’s a graph showing how Maryland, DC, and Virginia have ranked against the other US states<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup> over the past thirty years.  For consistency I have used the same 3-year averages for median household income, although note that news articles about income rankings will almost always be based on the 1-year estimates.</p>
<p>Maryland has been in the top five states for almost all of the past thirty years, and has been number one in the rankings in recent years. Virginia was long in the top ten states by median household income, although it has recently dropped out of the top ten.</p>
<p>Finally, DC has gone from the bottom ten to the top ten in the span of twenty years.  In the rankings based on 1-year estimates it is now ranked number one, higher than any US state.</p>
<h2 id="further-exploration">Further exploration</h2>
<p>For more on how I created the graphs above, see the following:</p>
<ul>
<li>“<a href="https://rpubs.com/frankhecker/499524">Maryland median household income over time</a>” shows the R code used to produce these and other graphs.</li>
<li>My <a href="https://gitlab.com/frankhecker/hocodata">hocodata code repository</a> includes copies of the raw data files and R Markdown files for this and other analyses.  (Look in the “affordability” subdirectory.)</li>
<li>If you sign up for a free account on the <a href="https://rstudio.cloud/">RStudio.cloud</a> service you can open and make a copy of my <a href="https://rstudio.cloud/project/353602">hocodata project</a> for this and other analyses, and try your hand at it yourself. (Again, look in the “affordability” subdirectory, and check out the <a href="https://rstudio.cloud/learn/primers">RStudio.cloud primers</a>.)</li>
</ul>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>Yes, I’m counting DC as a state here.  In US census data it typically is included in state-level datasets, sometimes with Puerto Rico as well.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content:encoded>
    </item>
    <item>
      <title>Getting down with data</title>
      <link>https://frankhecker.com/2019/05/22/getting-down-with-data/</link>
      <pubDate>Wed, 22 May 2019 12:00:00 -0400</pubDate>
      <guid>https://frankhecker.com/2019/05/22/getting-down-with-data/</guid>
      <description>Everyone can now work with data and visualize it. Should you?</description>
      <content:encoded><![CDATA[<figure><a href="/assets/images/rstudio-cloud-screenshot.png">
    <img loading="lazy" src="/assets/images/rstudio-cloud-screenshot-embed.png"
         alt="Screenshot of RStudio Cloud, showing the available tutorials"/> </a><figcaption>
            <p>The RStudio Cloud service provides a web-based alternative to installing the RStudio desktop software, and includes tutorials for using the R statistical language and the Tidyverse set of functions. (Click for a higher-resolution version.)</p>
        </figcaption>
</figure>

<p><em>tl;dr: Everyone can now work with data and visualize it. Should you?</em></p>
<p>NOTE: This article was originally published in my <em>Civility and Truth</em> Substack newsletter. I have republished it here without changes.</p>
<p>I haven’t had time yet to look at the detailed median household income data for Howard County (for which I’m going to try to do some maps of income by census tract), so that will have to wait for a future post. In the meantime I wanted to talk a bit about how I do these visualizations, how you can do them too if you have the time and interest, and what I’ve learned in the process.</p>
<h2 id="easy-data-analysis-and-visualization-for-free">Easy data analysis and visualization for free</h2>
<p>Once upon a time anyone wanting to do serious statistical analysis and graphic visualization of data needed to purchase a license for proprietary software products like <a href="https://en.wikipedia.org/wiki/SAS_(software)">SAS</a> or <a href="https://en.wikipedia.org/wiki/SPSS">SPSS</a> that cost hundreds or even thousands of dollars per user. The traditional alternative for most users was Microsoft Excel, which included at least a basic set of statistical functions and graphing operations. However it was still not exactly cheap, especially for home users, and given its origin in accounting spreadsheets it was not really that suitable for advanced statistical and data visualization work.</p>
<h3 id="r-and-the-tidyverse">R and the tidyverse</h3>
<p>What has changed from then until now? First, noncommercial alternatives to SAS, SPSS, and similar products arose, most notably the <a href="https://en.wikipedia.org/wiki/R_(programming_language)">R statistical programming language</a> and its associated runtime environment. Unlike SAS and SPSS, R was developed through an open collaborative process in which anyone could participate, and the resulting software was distributed in both binary and source form at no charge. R relatively quickly gained many users, and today it is, along with Python, one of the most popular languages for so-called “data science” projects.</p>
<p>Unfortunately as a programming environment R is relatively difficult to use, especially for people coming to it as a first language. The second advance was to simplify the use of R by dictating a particular way of programming in it.  This was accomplished by the statistician Hadley Wickham and his colleagues, who developed a set of R extensions or “packages” known colloquially as the “Hadleyverse” and now renamed as the “tidyverse.”</p>
<p>The <a href="https://www.tidyverse.org/">tidyverse packages</a> implement a simplified philosophy for working with data, basically treating all data as sets of tables whose rows and columns can be manipulated in various ways, with the output of each manipulation producing a new table used as input to the next manipulation. The tidyverse packages also include an accompanying set of functions (“ggplot” and others) to graph data in various ways, again adhering to a particular philosophy of how to transform data into visuals.</p>
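<p>To make the table-in, table-out idea concrete, here is a minimal sketch. It is written in plain Python rather than R, purely for illustration, and it is not how the tidyverse itself is implemented; the tract labels, figures, and function names are all made up.</p>
<pre><code class="language-python">from functools import reduce

# Hypothetical data: each row is a dict, and a "table" is a list of rows.
tracts = [
    {"tract": "A", "households": 1200, "population": 3000},
    {"tract": "B", "households": 0, "population": 0},
    {"tract": "C", "households": 800, "population": 2400},
]

def drop_empty(table):
    """Return a new table without the rows that have no households."""
    return [row for row in table if row["households"] != 0]

def add_household_size(table):
    """Return a new table with an extra people-per-household column."""
    return [{**row, "size": row["population"] / row["households"]} for row in table]

def sort_by_size(table):
    """Return a new table ordered from largest to smallest household size."""
    return sorted(table, key=lambda row: row["size"], reverse=True)

def pipeline(table, *steps):
    """Feed the output of each step into the next, like a chain of pipes."""
    return reduce(lambda current, step: step(current), steps, table)

result = pipeline(tracts, drop_empty, add_household_size, sort_by_size)
for row in result:
    print(row["tract"], row["size"])
</code></pre>
<p>Each step receives a table and returns a new one, leaving the original untouched, which is what lets the steps be chained one after another.</p>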
<h3 id="data-analysis-and-visualization-as-a-service">Data analysis and visualization as a service</h3>
<p>So-called “free and open source” software products like R and the tidyverse packages are a godsend for people like me who can’t or don’t want to pay for expensive proprietary software. But to paraphrase <a href="https://en.wikipedia.org/wiki/Jamie_Zawinski">a former colleague of mine</a>, free software is only free if your time has no value: the time and effort spent downloading, installing, and configuring software can be daunting, especially for a casual user who just wants to do a basic data plot.  This is especially true if you want to do more advanced things, like displaying data on maps.</p>
<p>To address this issue Hadley Wickham and his colleagues founded a startup, <a href="https://www.rstudio.com/about/">RStudio</a>, to lower the barriers to widespread use of R and the tidyverse packages. Their first product, also called <a href="https://www.rstudio.com/products/rstudio/">RStudio</a>, provides a web-based interactive development environment (IDE) to simplify creating R-based data analyses. In its RStudio Server version it allows an organization to stand up a central web site to which users can connect and use R, the tidyverse packages, and other R-based capabilities without having to install software on their own PCs.</p>
<p>However, RStudio Server removed a burden from end users only to place it on the people charged with standing up the server system with all its necessary software. That was fine for larger organizations, but a problem for small businesses, not to mention individual users.</p>
<p>To address that issue RStudio is now developing a new service, <a href="https://rstudio.cloud/">RStudio.cloud</a>, currently being made available for testing by the public.  With RStudio.cloud all you need is a browser: the R and RStudio software is already pre-installed for you, with additional packages easily installable on the service if and when you need them. RStudio.cloud also includes a full set of interactive tutorials (see the graphic above), so that anyone who’s familiar with (say) working with Excel spreadsheets, formulas, and macros can learn to do basic data analyses and visualizations.</p>
<p>(If you want to try RStudio.cloud yourself, you can <a href="https://login.rstudio.cloud/register?redirect=https%3A%2F%2Fclient.login.rstudio.cloud%2Foauth%2Flogin%3Fshow_auth%3D0%26show_login%3D1%26show_setup%3D1">sign up for a free account</a> and work through some of the <a href="https://rstudio.cloud/learn/primers">interactive tutorials</a>. If you want to explore a non-trivial project, I’ve shared a version of my <a href="https://rstudio.cloud/project/353602">“hocodata” project</a> on RStudio.cloud for others to access.)</p>
<h2 id="public-data-for-public-use">Public data for public use</h2>
<p>Of course, it’s not enough to know how to do data analyses and visualizations.  You also need some actual data to work with. Here, as in other areas, government (local, state, and Federal) has come to the rescue&mdash;though not always, and not always completely.</p>
<h3 id="governments-data-exhaust">Government’s “data exhaust”</h3>
<p>Governments by their nature generate a lot of data about the jurisdictions over which they hold sway. The most notable (and ancient) example of this is the census, which has gone from being a simple count of people to collecting all sorts of relevant demographic, economic, and other data about populations.</p>
<p>Governments also collect a lot of other data in the course of their operations, for example about crimes both serious and petty, building permits and zoning decisions, the locations of fire hydrants and streetlights, and so on. Traditionally this data was generated and kept as paper documents, but now it is almost always generated and stored as digital files or as entries in a digital database&mdash;a sort of “data exhaust” that is emitted by the day-to-day running of governments.</p>
<p>Having generated this data, it’s natural for governments to consider giving citizens access to it. In some cases this is part of an overarching strategy to improve visibility into the workings of government. A good example is the <a href="https://www.baltimoresun.com/news/maryland/howard/ellicott-city/ph-ho-cf-political-notebook-0911-20140911-story.html">“HoCoStat” system</a> proposed by former Howard County Executive Allan Kittleman during his successful 2014 campaign.</p>
<p>In other cases government just takes data and makes it available without an overall strategy&mdash;after all, the data is being produced in digital form already, whether that be as Excel spreadsheets or in some other form, and the incremental work to make it publicly available may not be that large. For example, although the full HoCoStat system was never deployed, under Allan Kittleman Howard County did <a href="https://www.howardcountymd.gov/News/ArticleID/156/05-08-15-Executive-Kittleman-launches-open-data-portal-to-increase-government-transparency">stand up</a> a new <a href="https://opendata.howardcountymd.gov/">OpenHoward site</a> that collected data produced by various Howard County agencies. Somewhat confusingly, there is also a separate site <a href="https://data.howardcountymd.gov/">data.howardcountymd.gov</a> that also hosts a variety of data provided by the Howard County GIS division&mdash;another project that appears to have been done as an incremental effort.</p>
<p>However, governments do not always make data available, or make it available only in inconvenient ways, for a variety of reasons. For example, some government agencies release data only in the form of PDF documents, the electronic equivalent of traditional paper reports. These can be relatively difficult to extract data from. In other cases data may be displayable on a public web site, but with no way to download it in a more convenient form.</p>
<p>But even here people have created automated ways to access data even in odd formats, whether that be extracting tables from PDF files or “scraping” it off of web sites. The result is yet more data to add to that available from more convenient sources.</p>
<h3 id="the-downside-of-data">The downside of data</h3>
<p>So with all this data available, and free ways to analyze it, are we living in utopia (at least as far as data analysis and visualization are concerned)? I don’t really think so: there are downsides to having lots of data to analyze just as there are downsides to not having it.</p>
<p>First, we tend to think that data is more accurate and reflective of reality than it actually is. For example, take the median household income estimate for Howard County and comparable estimates for other counties. In 2017 the estimate for Howard County median household income was $111,473 while the estimate for Stafford County, Virginia, was $112,795, or $1,322 more. This difference was enough to propel Stafford County into the list of top ten counties by median household income, and knock Howard County out of it.</p>
<p>But the margins of error for these estimates were $2,666 for Howard County and $5,081 for Stafford County. There’s therefore a good possibility that Howard County and Stafford County had pretty much equal median household incomes for 2017, and a fair chance that Howard County’s median household income was actually higher than Stafford County’s.</p>
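<p>Here is a rough check of that claim, as a sketch rather than a rigorous test. It treats the two estimates as independent, assumes both margins of error are stated at the same confidence level, and uses the standard root-sum-of-squares approximation for the margin of error of a difference.</p>
<pre><code class="language-python">import math

# Estimates and margins of error quoted in the text above.
howard_income, howard_moe = 111_473, 2_666
stafford_income, stafford_moe = 112_795, 5_081

gap = stafford_income - howard_income

# Assumption: the estimates are independent, so the margin of error of their
# difference is approximately the square root of the sum of the squared margins.
gap_moe = math.sqrt(howard_moe**2 + stafford_moe**2)

# A ratio below 1 means the gap is smaller than its own margin of error.
ratio = gap / gap_moe

print(f"gap: {gap}, margin of error: {gap_moe:.0f}, ratio: {ratio:.2f}")
</code></pre>
<p>Under those assumptions the combined margin of error comes out to about $5,738, more than four times the $1,322 gap, so the data cannot really say which of the two counties ranks higher.</p>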
<p>This failure to take margins of error into account is ubiquitous in people’s treatment of data (and I’ve been guilty of it myself). It’s not that significant an issue with respect to median household income estimates, but it can be a big deal indeed when it comes to data measurements that drive funding and personnel decisions, as with student test scores. It’s quite possible that many if not most of the reported test score increases and decreases that are alternately lauded or derided are actually just random year-to-year fluctuations that don’t reflect any underlying change in students’ ability to learn or teachers’ ability to teach.</p>
<p>School test scores provide another reason not to put too much faith in data: When data measurements are used to drive rewards and punishments, the temptation to game the measurements in various ways can be irresistible. With school test scores such gaming can range from “teaching to the test” up to outright fraud, as shown by scandals around the US.  We now have to account not only for the possibility of random fluctuations, which are relatively benign in origin, but also for the degree to which the data might be fraudulently measured or reported.</p>
<p>Finally, in many cases we should question ourselves as to whether some data is actually useful, or should be used. For example, do student test scores actually tell us anything useful? Would it be better not to do student testing at all, or to restrict it to certain narrow purposes? One benefit of working directly with raw data, as opposed to consuming pre-cooked graphs and tables prepared by others, is that it can give you a good sense of the limits to what data can tell us.</p>
]]></content:encoded>
    </item>
    <item>
      <title>Get your county government data at the OpenHoward portal</title>
      <link>https://frankhecker.com/2015/05/11/get-your-county-government-data-at-the-openhoward-portal/</link>
      <pubDate>Mon, 11 May 2015 08:00:00 -0400</pubDate>
      <guid>https://frankhecker.com/2015/05/11/get-your-county-government-data-at-the-openhoward-portal/</guid>
      <description>&lt;p&gt;&lt;em&gt;tl;dr: Howard County government ups its game in providing data with a new web site opendata.howardcountymd.gov.  Next stop, HoCoStat?&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;I’ve &lt;a href=&#34;https://frankhecker.com/2015/01/19/howard-county-government-by-the-numbers/&#34; title=&#34;Howard County government by the numbers&#34;&gt;previously written&lt;/a&gt; about Howard County’s initial foray into publishing government data, the &lt;a href=&#34;https://data.howardcountymd.gov&#34;&gt;data.howardcountymd.gov&lt;/a&gt; web site created by the Howard County GIS division.  As &lt;a href=&#34;http://www.howardcountymd.gov/News050815.htm&#34; title=&#34;Executive Kittleman launches open data portal to increase government transparency&#34;&gt;announced by the county&lt;/a&gt; and &lt;a href=&#34;http://www.baltimoresun.com/news/maryland/howard/ellicott-city/ph-ho-cf-open-howard-story.html&#34; title=&#34;Howard launches government transparency site&#34;&gt;reported by Amanda Yeager at the &lt;em&gt;Baltimore Sun&lt;/em&gt;&lt;/a&gt;, Howard County has launched a new site &lt;a href=&#34;https://opendata.howardcountymd.gov&#34;&gt;opendata.howardcountymd.gov&lt;/a&gt; to provide access to government data.  This new site, also known as the OpenHoward portal,&lt;sup id=&#34;fnref:1&#34;&gt;&lt;a href=&#34;#fn:1&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;1&lt;/a&gt;&lt;/sup&gt; can be considered as a concrete implementation of open data practices mandated by the Howard County Council (see &lt;a href=&#34;https://apps.howardcountymd.gov/olis/LegislationDetail.aspx?LegislationID=839&#34;&gt;Council Bill 32-2014&lt;/a&gt;) and as a down payment on County Executive Allan Kittleman’s &lt;a href=&#34;https://web.archive.org/web/20141013202423/http://kittleman.com/hocostat/&#34; title=&#34;HoCoStat: It’s Time for Citizens to Have a Platform to Hold Government Accountable&#34;&gt;campaign promise&lt;/a&gt; to create an automated system (“HoCoStat”) to “help government increase responsiveness, improve efficiency and heighten accountability.”&lt;/p&gt;</description>
      <content:encoded><![CDATA[<p><em>tl;dr: Howard County government ups its game in providing data with a new web site opendata.howardcountymd.gov.  Next stop, HoCoStat?</em></p>
<p>I’ve <a href="/2015/01/19/howard-county-government-by-the-numbers/" title="Howard County government by the numbers">previously written</a> about Howard County’s initial foray into publishing government data, the <a href="https://data.howardcountymd.gov">data.howardcountymd.gov</a> web site created by the Howard County GIS division.  As <a href="http://www.howardcountymd.gov/News050815.htm" title="Executive Kittleman launches open data portal to increase government transparency">announced by the county</a> and <a href="http://www.baltimoresun.com/news/maryland/howard/ellicott-city/ph-ho-cf-open-howard-story.html" title="Howard launches government transparency site">reported by Amanda Yeager at the <em>Baltimore Sun</em></a>, Howard County has launched a new site <a href="https://opendata.howardcountymd.gov">opendata.howardcountymd.gov</a> to provide access to government data.  This new site, also known as the OpenHoward portal,<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup> can be considered as a concrete implementation of open data practices mandated by the Howard County Council (see <a href="https://apps.howardcountymd.gov/olis/LegislationDetail.aspx?LegislationID=839">Council Bill 32-2014</a>) and as a down payment on County Executive Allan Kittleman’s <a href="https://web.archive.org/web/20141013202423/http://kittleman.com/hocostat/" title="HoCoStat: It’s Time for Citizens to Have a Platform to Hold Government Accountable">campaign promise</a> to create an automated system (“HoCoStat”) to “help government increase responsiveness, improve efficiency and heighten accountability.”</p>
<p>But enough marketing speak, what is this thing really?  Briefly, the opendata.howardcountymd.gov site, like the original data.howardcountymd.gov site, is a web site that allows you to view and download various datasets relating to Howard County government activities and Howard County in general.  However in other respects the new OpenHoward site goes well beyond what the previous site offers.  First, the new site includes many types of data not previously available on the older site, including (to take but two examples) datasets relating to county budgets and police reports.</p>
<p>Second, the new site has a search facility that is extremely handy when trying to find data and datasets of interest.  For example, since the renovation of Merriweather Post Pavilion has been in the news I decided to <a href="https://opendata.howardcountymd.gov/en/browse?q=+merriweather">search for “Merriweather”</a>.  The search returned (among other things) datasets and records relating to police reports, reports from the <a href="http://www.howardcountymd.gov/tellhoco.htm">Tell HoCo</a> web site and mobile app used to report potholes, broken street lamps, and other problems, and a list of payments the county made relating to Wine in the Woods.  I also tried searching for the name of the street I live on, and got a similar mix of results.  I predict that this will be a popular use of the site.</p>
<p>Finally, the new site offers an <a href="https://opendata.howardcountymd.gov/developers" title="Developer Resources">application programming interface</a> (API) by which independent developers can create applications that access the data in real-time.  Most people won’t care about this, but (among other things) it offers local Howard County businesses and motivated individuals a way to create their own applications to add value to the underlying county data.</p>
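<p>As a sketch of what such a request might look like, here is some Python that builds (but does not send) a query. It assumes the portal follows the standard conventions of Socrata’s SODA API (Socrata is the service behind the site, as described below); the dataset identifier and application token are placeholders, not real values.</p>
<pre><code class="language-python">from urllib.parse import urlencode
from urllib.request import Request

PORTAL = "https://opendata.howardcountymd.gov"
DATASET_ID = "abcd-1234"       # hypothetical placeholder, not a real dataset
APP_TOKEN = "YOUR_APP_TOKEN"   # placeholder for a token issued on registration

def build_request(search_text, limit=5):
    """Build (but do not send) a request for rows matching a search term."""
    # $q is SODA's full-text search parameter and $limit caps the row count.
    query = urlencode({"$q": search_text, "$limit": limit}, safe="$")
    url = f"{PORTAL}/resource/{DATASET_ID}.json?{query}"
    return Request(url, headers={"X-App-Token": APP_TOKEN})

request = build_request("merriweather")
print(request.full_url)
</code></pre>
<p>With a real dataset identifier in place, passing the request to <code>urllib.request.urlopen</code> should return the matching rows as JSON, though I’d treat the details as something to confirm against the portal’s own developer documentation.</p>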
<p>The opendata.howardcountymd.gov site was not built from scratch, but was instead deployed using the online service provided by <a href="http://www.socrata.com">Socrata</a>, a Seattle-based private company specializing in helping governments to implement open data initiatives.  Socrata’s is a “cloud-based” or “software as a service” (SaaS) offering, meaning that Howard County did not purchase software and hardware to run the site, but instead pays an ongoing subscription fee to host its data on Socrata’s servers running Socrata’s software.  We’ll see in future exactly how much Howard County is paying Socrata for this service (since presumably it will show up in the “Payments to Vendors” database), but based on an <a href="https://thomaslevine.com/!/socrata-products/" title="Semi-open data about pricing of open data">independent analysis of Socrata pricing</a> it’s likely that the cost to the county is on the order of several thousand dollars per month.</p>
<p>That may sound like a lot, but you have to compare it to the fully-burdened cost (i.e., including salaries, health care, and pensions) of having Howard County employees build the site, or the cost of having a contractor develop a custom site.<sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup>  Socrata appears to be a market leader in the open data space, is <a href="http://www.socrata.com/newsroom-article/socrata-continues-to-catapult-data-driven-government-forward-with-robust-q1-2015-results/" title="Socrata Continues to Catapult Data-Driven Government Forward with Robust Q1 2015 Results">growing rapidly</a>, and has a <a href="http://www.govtech.com/data/Open-Data-Goes-Mainstream-Accelerates-Success-for-Socrata.html" title="Open Data Goes Mainstream, Accelerates Success for Socrata">coherent vision</a> for future product offerings.  Socrata also has other customers in Maryland at both the state and local levels, with Socrata powering the <a href="https://data.maryland.gov">Open Data Portal</a> used in the <a href="http://www.statestat.maryland.gov">StateStat</a> system, as well as open data portals and related applications for <a href="https://data.baltimorecity.gov">Baltimore City</a> and <a href="https://data.montgomerycountymd.gov">Montgomery County</a>.<sup id="fnref:3"><a href="#fn:3" class="footnote-ref" role="doc-noteref">3</a></sup></p>
<p>In general I think going with Socrata was a good decision for the county.  The site looks pretty functional from the point of view of both beginners and more advanced users, Socrata appears to have good mechanisms for getting new datasets into their system, and the provision of an API is a plus for advanced usage.  Plus Socrata also has a separate <a href="http://www.socrata.com/products/open-performance-govstat/">Open Performance (GovStat)</a> product that looks as if it would be a good base on which to build the HoCoStat system.</p>
<p>In comparison to the pluses my concerns about OpenHoward thus far are relatively minor.  First, the site could use more datasets, and more data in existing datasets.  (For example, there’s no police or fire and rescue data for 2015.)  However the press release is upfront about this being a “beta” site at present, so presumably more data is on the way.  One major potential lack is data on Howard County schools; I presume the Board of Education and Superintendent Foose would need to cooperate to get that done, and it’s an open question as to whether such cooperation will be forthcoming.</p>
<p>Second, I think the conditions for access to the site and its data need to be spelled out a bit more clearly.  The original County Council bill CB32-2014 stated that “All accessible data . . . shall be made available without copyright, patent, trademark, or trade secret, or similar regulation other than reasonable privacy, security, and privilege restrictions.” In other words, all data published on the site is presumably in the public domain with no restrictions on its use.  However it would be nice if that could be spelled out more explicitly.  The terms of use for the API are somewhat unclear as well: There’s a basic level of API access available by default, and more intensive usage is possible by registering and getting an “application token.”  These are both provided at no charge, but it’s not clear whether there is some level or type of API access that would incur a charge to the application developer or to application users.  Again, this is worth spelling out.</p>
<p>Finally, what will happen to the existing data.howardcountymd.gov site?  Will its data be folded into the OpenHoward portal and the original site decommissioned, or will it continue to operate?  I confess to a personal interest in this, since I’ve previously published <a href="http://rpubs.com/frankhecker/">analyses</a> that pull datasets from the older site, and if the old site goes away I’d like the web links I used to be redirected to the new site.</p>
<p>Leaving these relatively minor concerns aside, overall the launch of the OpenHoward portal is a very welcome event, and I’m looking forward to seeing how it and the larger HoCoStat initiatives evolve.  Our thanks should go to all those who made this possible, including to Greg Fox, Jen Terrasa, and the other members of the Howard County Council for pushing Howard County to provide open accessible data, to Allan Kittleman for his work thus far to fulfill his campaign pledges around open access, and, most importantly, to those who did the real work, Chris Merdon’s staff in the Department of Technology and Communication Services.</p>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>Although both the county press release and the <em>Baltimore Sun</em> article reference the OpenHoward name, the actual web site doesn’t use that name.  Maybe they’re still finalizing the logo and related branding?&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:2">
<p>Based on the figures on page 190 of the <a href="http://www.howardcountymd.gov/Budget2016.pdf">Howard County FY2016 proposed operating budget</a> [PDF], personnel costs for the Department of Technology and Communication Services (the county’s IT department) appear to be almost $100,000 per employee on average.  So a hypothetical subscription fee of $8,000 a month would be equivalent to hiring one new employee.&#160;<a href="#fnref:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:3">
<p>The Maryland connection goes beyond what I mentioned: <a href="https://www.linkedin.com/pub/beth-blauer/53/838/10">Beth Blauer</a>, who headed up the Maryland StateStat project, subsequently worked at Socrata for a couple of years before leaving to head up the <a href="http://hub.jhu.edu/2015/04/20/what-works-cities">Center for Government Excellence</a> at Johns Hopkins University.&#160;<a href="#fnref:3" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content:encoded>
    </item>
    <item>
      <title>How politicians see Howard County</title>
      <link>https://frankhecker.com/2015/03/21/how-politicians-see-howard-county/</link>
      <pubDate>Sat, 21 Mar 2015 13:00:35 -0400</pubDate>
      <guid>https://frankhecker.com/2015/03/21/how-politicians-see-howard-county/</guid>
      <description>&lt;figure&gt;&lt;a href=&#34;https://frankhecker.com/assets/images/hocomd-precinct-cartogram.png&#34;&gt;
    &lt;img loading=&#34;lazy&#34; src=&#34;https://frankhecker.com/assets/images/hocomd-precinct-cartogram-embed.png&#34;
         alt=&#34;Howard County, Maryland precinct cartogram&#34;/&gt; &lt;/a&gt;&lt;figcaption&gt;
            &lt;p&gt;Howard County, Maryland precinct cartogram.  Precinct area is proportional to the number of registered voters as of the 2014 general election.  Click for higher-resolution version.&lt;/p&gt;
        &lt;/figcaption&gt;
&lt;/figure&gt;

&lt;p&gt;&lt;em&gt;tl;dr: The map of Howard County looks very different if you’re looking for votes.  Cartograms help you see like a politician.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;There are 118 election precincts in Howard County, Maryland, varying both in geographic area and in the number of voters they contain.  Precincts in western Howard County tend to be larger, because the population density in western Howard is lower.  Precincts in more densely populated areas of the county (including Columbia) tend to be smaller.  If we’re interested in how voters behave across the county a conventional map can be misleading because the larger area of western Howard precincts causes us to overrate the importance and impact of those precincts.  (This is similar to the US electoral map being visually dominated by large states like Montana, Wyoming, and the Dakotas that have fewer voters than small states like Connecticut and Rhode Island.)&lt;/p&gt;</description>
      <content:encoded><![CDATA[<figure><a href="/assets/images/hocomd-precinct-cartogram.png">
    <img loading="lazy" src="/assets/images/hocomd-precinct-cartogram-embed.png"
         alt="Howard County, Maryland precinct cartogram"/> </a><figcaption>
            <p>Howard County, Maryland precinct cartogram.  Precinct area is proportional to the number of registered voters as of the 2014 general election.  Click for higher-resolution version.</p>
        </figcaption>
</figure>

<p><em>tl;dr: The map of Howard County looks very different if you’re looking for votes.  Cartograms help you see like a politician.</em></p>
<p>There are 118 election precincts in Howard County, Maryland, varying both in geographic area and in the number of voters they contain.  Precincts in western Howard County tend to be larger, because the population density in western Howard is lower.  Precincts in more densely populated areas of the county (including Columbia) tend to be smaller.  If we’re interested in how voters behave across the county a conventional map can be misleading because the larger area of western Howard precincts causes us to overrate the importance and impact of those precincts.  (This is similar to the US electoral map being visually dominated by large states like Montana, Wyoming, and the Dakotas that have fewer voters than small states like Connecticut and Rhode Island.)</p>
<p>The figure above is actually a map of Howard County electoral precincts, not as they exist in reality but as they might appear if their size were proportional to the number of voters they contain.  More specifically, this is a <em><a href="http://en.wikipedia.org/wiki/Cartogram">cartogram</a></em> in which the precinct map is distorted to make precinct areas proportional to the number of registered voters in each precinct as of the 2014 general election.</p>
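<p>The arithmetic behind the resizing is simple even though the map distortion itself is not. As a sketch with made-up precinct labels, areas, and voter counts, each precinct’s target area is its share of all registered voters multiplied by the total area of the map:</p>
<pre><code class="language-python"># Hypothetical precincts: the labels, areas (square miles), and voter counts are made up.
precincts = {
    "western": {"area": 30.0, "voters": 2000},
    "columbia": {"area": 1.5, "voters": 3000},
}

total_area = sum(p["area"] for p in precincts.values())
total_voters = sum(p["voters"] for p in precincts.values())

def target_area(precinct):
    """Area the precinct should occupy if area is proportional to voters."""
    return total_area * precinct["voters"] / total_voters

for name, precinct in precincts.items():
    print(name, precinct["area"], "to", round(target_area(precinct), 1))
</code></pre>
<p>In this toy example the large western precinct shrinks from 30 to 12.6 square miles while the small Columbia precinct grows from 1.5 to 18.9. Working out shapes that hit those targets while keeping neighboring precincts attached is the hard part, and that is the job of cartogram software.</p>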
<figure><a href="/assets/images/kittleman-2014-vote-margins-choropleth.png">
    <img loading="lazy" src="/assets/images/kittleman-2014-vote-margins-choropleth-embed.png"
         alt="Allan Kittleman’s victory margins by precinct."/> </a><figcaption>
            <p>Conventional map of Allan Kittleman’s election-day margin of victory in each precinct in the 2014 general election for Howard County Executive.  Click for a higher-resolution version.</p>
        </figcaption>
</figure>

<p>Let’s look at a real-life example of how cartograms can present a more accurate picture of election results.  The next map shows Republican Allan Kittleman’s election-day margin of victory in each precinct in his 2014 race for Howard County Executive against Democrat Courtney Watson.  (The margin of victory is expressed as votes per precinct, not as a percentage.  Thus a value of 100 means that Kittleman received 100 more votes in a precinct on election day than Watson.  The map does not include absentee and early voting results because they are not reported per precinct.)</p>
<p>Each precinct is colored from bright red (large Kittleman margin) to bright blue (large Watson margin) and all shades in between.  (Incidentally, this type of colored map is known as a <em><a href="http://en.wikipedia.org/wiki/Choropleth_map">choropleth map</a></em>.)  Since precincts in western Howard County are both large and heavily Republican the conventional map exaggerates the extent of Kittleman’s election-day victory margin over Watson.</p>
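<p>As a sketch of the measure being mapped, with made-up precinct labels and vote counts: the margin is a simple difference in votes, which can tell a different story from the percentage margin.</p>
<pre><code class="language-python"># Hypothetical election-day counts; the precinct labels and numbers are made up.
results = {
    "precinct-1": {"kittleman": 650, "watson": 550},
    "precinct-2": {"kittleman": 90, "watson": 60},
}

def vote_margin(votes):
    """Kittleman votes minus Watson votes; a negative value means Watson led."""
    return votes["kittleman"] - votes["watson"]

def percent_margin(votes):
    """The same lead expressed as a percentage of the two-candidate vote."""
    return 100 * vote_margin(votes) / (votes["kittleman"] + votes["watson"])

for name, votes in results.items():
    print(name, vote_margin(votes), round(percent_margin(votes), 1))
</code></pre>
<p>The first made-up precinct has the smaller percentage margin (8.3 versus 20.0) but contributes more than three times as many net votes (100 versus 30), and net votes are the quantity these maps show.</p>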
<figure><a href="/assets/images/kittleman-2014-vote-margins-cartogram.png">
    <img loading="lazy" src="/assets/images/kittleman-2014-vote-margins-cartogram-embed.png"
         alt="Cartogram of Allan Kittleman victory margins by precinct"/> </a><figcaption>
            <p>Cartogram of Allan Kittleman’s election-day margin of victory in each precinct in the 2014 general election for Howard County Executive.  Click for a higher resolution version.</p>
        </figcaption>
</figure>

<p>To address this perceptual problem we can instead represent the exact same data in the form of a cartogram, as seen in the next map.  Here the precincts of western Howard shrink in size to reflect their true contribution to the overall registered voter population.  In particular Howard County Council District 5 now appears to be roughly equal in size to the other districts&mdash;which makes sense since county council redistricting had as one of its goals making the districts contain roughly equal numbers of voters.  On this map Kittleman’s margin of victory still appears to be significant, but we can better identify precincts (like those in Columbia) in which Watson polled strongly on election day.</p>
<p>Cartograms can be used in place of conventional maps in any context in which each geographic subdivision has associated with it some common variable of interest.  For example, suppose we want to look at elementary school overcrowding in Howard County.  Looking at a conventional map (like the <a href="http://www.hcpss.org/f/schoolplanning/map-es201415.pdf">elementary school attendance area map</a> provided by the Howard County Public School System) we might say, “Gee, there are a lot of elementary schools in eastern Howard.  How could they possibly be overcrowded?” It would make much more sense to show school attendance areas as a cartogram in which the size of each attendance area was proportional to the number of students in that area.  Each of the attendance areas could then be colored according to the extent of overcrowding at that school.</p>
<p>This sounds like a possible future project for me if and when I have time.  Or if anyone out there would like to try this yourself, I’ve provided more detailed information on how to create maps like those shown above.  See my three-part series “Creating Howard County Precinct Cartograms Based on 2014 Registered Voters” (<a href="http://rpubs.com/frankhecker/63528" title="Creating Howard County Precinct Cartograms Based on 2014 Registered Voters, Part 1">part 1</a>, <a href="http://rpubs.com/frankhecker/63529" title="Creating Howard County Precinct Cartograms Based on 2014 Registered Voters, Part 2">part 2</a>, and <a href="http://rpubs.com/frankhecker/64539" title="Creating Howard County Precinct Cartograms Based on 2014 Registered Voters, Part 3">part 3</a>) and my second three-part series “Allan Kittleman’s Election-Day Victory Margins in the Howard County 2014 General Election” (<a href="http://rpubs.com/frankhecker/60538" title="Allan Kittleman’s Election-Day Victory Margins in the Howard County 2014 General Election, Part 1">part 1</a>, <a href="http://rpubs.com/frankhecker/63458" title="Allan Kittleman’s Election-Day Victory Margins in the Howard County 2014 General Election, Part 2">part 2</a>, and <a href="http://rpubs.com/frankhecker/63561" title="Allan Kittleman’s Election-Day Victory Margins in the Howard County 2014 General Election, Part 3">part 3</a>).</p>
]]></content:encoded>
    </item>
    <item>
      <title>Useful datasets for Howard County election analysis</title>
      <link>https://frankhecker.com/2015/03/01/useful-datasets-for-howard-county-election-analysis/</link>
      <pubDate>Sun, 01 Mar 2015 07:00:17 -0500</pubDate>
      <guid>https://frankhecker.com/2015/03/01/useful-datasets-for-howard-county-election-analysis/</guid>
      <description>&lt;p&gt;&lt;em&gt;tl;dr: I release two useful Howard County election datasets in preparation for future posts.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;In the coming days and weeks I’ll be posting some analyses of Howard County election results.  Unfortunately the data released by the &lt;a href=&#34;http://www.howardcountymd.gov/Departments.aspx?id=4294968268&#34;&gt;Howard County Board of Elections&lt;/a&gt; and the &lt;a href=&#34;http://www.elections.state.md.us&#34;&gt;Maryland State Board of Elections&lt;/a&gt; is not always in the most useful form for analysis.  In particular I was looking for per-precinct turnout statistics for the 2014 general election in Howard County, along with some way to match up precincts with the county council district of which they’re a part.  That data is available in the &lt;a href=&#34;http://www.howardcountymd.gov/WorkArea/linkit.aspx?LinkIdentifier=id&amp;amp;ItemID=6442477038&amp;amp;libID=6442477030&#34;&gt;2014 general election results per precinct/district&lt;/a&gt; published by the Howard County Board of Elections, but only as a PDF document.&lt;/p&gt;</description>
      <content:encoded><![CDATA[<p><em>tl;dr: I release two useful Howard County election datasets in preparation for future posts.</em></p>
<p>In the coming days and weeks I’ll be posting some analyses of Howard County election results.  Unfortunately the data released by the <a href="http://www.howardcountymd.gov/Departments.aspx?id=4294968268">Howard County Board of Elections</a> and the <a href="http://www.elections.state.md.us">Maryland State Board of Elections</a> is not always in the most useful form for analysis.  In particular I was looking for per-precinct turnout statistics for the 2014 general election in Howard County, along with some way to match up precincts with the county council district of which they’re a part.  That data is available in the <a href="http://www.howardcountymd.gov/WorkArea/linkit.aspx?LinkIdentifier=id&amp;ItemID=6442477038&amp;libID=6442477030">2014 general election results per precinct/district</a> published by the Howard County Board of Elections, but only as a PDF document.</p>
<p>PDF files are great for reading by humans, but lousy for reading by machines.  They violate guideline 8 in the <a href="http://sunlightfoundation.com/opendataguidelines/">Open Data Policy Guidelines</a> published by the <a href="http://sunlightfoundation.com/about/">Sunlight Foundation</a>:</p>
<blockquote>
<p>For maximal access, data must be released in formats that lend themselves to easy and efficient reuse via technology.  … This means releasing information in open formats (or “open standards”), in machine-readable formats, that are structured (or machine-processable) appropriately.  … While formats such as HTML and PDF are easily opened for most computer users, these formats are difficult to convert the information to new uses.</p>
</blockquote>
<p>Since the data I wanted wasn’t in a format I could use, I manually extracted the data from the PDF document and converted it into a useful format (comma-separated values, or CSV) myself.  Then, since someone else might find a use for them, I published the files online in a <a href="https://github.com/frankhecker/hocodata/tree/master/datasets">datasets area</a> of my <a href="https://github.com/frankhecker/hocodata">GitHub hocodata repository</a>.  The first two files are as follows:</p>
<ul>
<li><a href="https://raw.githubusercontent.com/frankhecker/hocodata/master/datasets/hocomd-2014-precinct-council.csv">hocomd-2014-precinct-council.csv</a>.  This dataset maps the 118 Howard County election precincts to the county council districts in which those precincts are included.</li>
<li><a href="https://raw.githubusercontent.com/frankhecker/hocodata/master/datasets/hocomd-2014-general-election-turnout-by-precinct.csv">hocomd-2014-general-election-turnout-by-precinct.csv</a>.  This dataset contains turnout statistics for each of the 118 Howard County precincts in the 2014 general election, including the number of registered voters and ballots cast in each precinct on election day.</li>
</ul>
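<p>As a rough sketch of how the two files fit together, the Python below joins a precinct-to-district table with a turnout table and computes turnout per council district.  The column names (<code>Precinct</code>, <code>Council.District</code>, <code>Registered.Voters</code>, <code>Ballots.Cast</code>) and the three sample rows are assumptions made up for illustration; check the headers in the actual CSV files before adapting it.</p>

```python
import csv
import io
from collections import defaultdict

# Made-up stand-ins for the two CSV files; the column names and rows are
# illustrative assumptions, not the actual headers or figures in the files.
PRECINCT_COUNCIL_CSV = """Precinct,Council.District
01-001,1
01-002,1
02-001,2
"""

TURNOUT_CSV = """Precinct,Registered.Voters,Ballots.Cast
01-001,1000,400
01-002,2000,1000
02-001,1500,600
"""


def turnout_by_district(council_text, turnout_text):
    """Join the two tables on precinct and return turnout per district."""
    # Build a lookup from precinct ID to council district.
    district_of = {
        row["Precinct"]: row["Council.District"]
        for row in csv.DictReader(io.StringIO(council_text))
    }
    registered = defaultdict(int)
    ballots = defaultdict(int)
    # Sum registered voters and ballots cast within each district.
    for row in csv.DictReader(io.StringIO(turnout_text)):
        district = district_of[row["Precinct"]]
        registered[district] += int(row["Registered.Voters"])
        ballots[district] += int(row["Ballots.Cast"])
    return {d: ballots[d] / registered[d] for d in registered}


rates = turnout_by_district(PRECINCT_COUNCIL_CSV, TURNOUT_CSV)
for district in sorted(rates):
    print(f"District {district}: {rates[district]:.1%} turnout")
```

<p>To run this against the real data, replace the two sample strings with the contents of the downloaded files and adjust the column names to match.</p>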
<p>Stay tuned for some interesting ways to use this data.</p>
<hr>
<h4 id="268ded72-001">Walter Carson (wcarson@columbiaunion.net) - 2015-03-01 14:38</h4>
<p>Thank you. As always, of interest. How might such data be used to look at the state legislative districts, if at all? Best wishes. WEC</p>
<h4 id="268ded72-002"><a href="/">hecker</a> - 2015-03-01 19:50</h4>
<p>See my future posts for some ideas on how this data might be used. Probably the first thing I&rsquo;ll do is look at different county council districts to see if there seems to be any real difference in 2014 general election turnout between the districts. A similar analysis could be done for legislative districts, or at least those portions of the districts within Howard County. (A more complete analysis would need data from Carroll County, Baltimore County, etc.)</p>
]]></content:encoded>
    </item>
    <item>
      <title>Fun with Howard County building permit data</title>
      <link>https://frankhecker.com/2015/02/16/fun-with-howard-county-building-permit-data/</link>
      <pubDate>Mon, 16 Feb 2015 18:53:59 -0500</pubDate>
      <guid>https://frankhecker.com/2015/02/16/fun-with-howard-county-building-permit-data/</guid>
      <description>&lt;p&gt;&lt;em&gt;tl;dr: I have fun creating graphs and maps with building permit data from data.howardcountymd.gov.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;I’ve &lt;a href=&#34;https://frankhecker.com/2015/01/19/howard-county-government-by-the-numbers/&#34; title=&#34;Howard County government by the numbers&#34;&gt;written previously&lt;/a&gt; about the cornucopia of interesting data sets that Howard County government has made available at the &lt;a href=&#34;http://data.howardcountymd.gov/&#34;&gt;data.howardcountymd.gov&lt;/a&gt; site.  I had some spare time over a long weekend and decided to try analyzing some of that data, including making use of the various map files on the site (under the “Spacial Data (GIS)” tab).&lt;/p&gt;</description>
      <content:encoded><![CDATA[<p><em>tl;dr: I have fun creating graphs and maps with building permit data from data.howardcountymd.gov.</em></p>
<p>I’ve <a href="/2015/01/19/howard-county-government-by-the-numbers/" title="Howard County government by the numbers">written previously</a> about the cornucopia of interesting data sets that Howard County government has made available at the <a href="http://data.howardcountymd.gov/">data.howardcountymd.gov</a> site.  I had some spare time over a long weekend and decided to try analyzing some of that data, including making use of the various map files on the site (under the “Spacial Data (GIS)” tab).</p>
<p>The particular data set I decided to start with was for building permits issued for residential and commercial construction&mdash;not because I have a burning interest in building permits but because I mentioned this type of data in my last post and thought it would be a relatively easy data set to analyze.  The particular question I decided to look at was how many residential building permits were issued in each zip code within Howard County in 2014&mdash;basically to get a feel for where the most construction was occurring in the county.  (It’s only an approximate measure because some permits cover multiple units.)</p>
<p><a href="/assets/images/hoco-residential-permits-2014-graph.png"><img alt="bar chart showing Howard County residential building permits per zip code" loading="lazy" src="/assets/images/hoco-residential-permits-2014-graph-embed.png"></a></p>
<p>To do the analysis I used the skills and the tools I learned in the courses that are part of the <a href="https://www.coursera.org/specialization/jhudatascience/1?utm_medium=courseDescripTop">Johns Hopkins data science specialization</a> series on Coursera.  (See my <a href="/tag/coursera/">Coursera-related posts</a> for more on my experiences in these classes.)  I won’t go over the process here since I’ve separately published full details on <a href="http://rpubs.com/frankhecker">my RPubs page</a>, with the source code available in <a href="https://github.com/frankhecker/hocodata">my hocodata GitHub repository</a>.</p>
<p>I first created a simple table of the top zip codes for residential permits issued.  This was sort of boring so I won’t reproduce it here; you can find it in the <a href="http://rpubs.com/frankhecker/59553">first example analysis</a> I did.  More interesting is the bar chart I created as part of the <a href="http://rpubs.com/frankhecker/59591">second example</a>.  It’s clear from the chart that there’s wide variation among Howard County zip codes in terms of residential construction.  The two Ellicott City zip codes combined (21042 and 21043) accounted for the largest fraction of residential building permits in 2014; in contrast there were almost no permits issued for east Columbia (21045).</p>
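<p>The counting step behind a table like this is simple.  The following Python is a minimal sketch of it, not the published analysis itself; the field names and the handful of records are made up for illustration and do not reflect the actual layout of the county data.</p>

```python
from collections import Counter

# Made-up permit records; the field names ("type", "zip", "year") and the
# values are assumptions for illustration, not the county's actual schema.
permits = [
    {"type": "Residential", "zip": "21043", "year": 2014},
    {"type": "Residential", "zip": "21042", "year": 2014},
    {"type": "Commercial", "zip": "21043", "year": 2014},
    {"type": "Residential", "zip": "21043", "year": 2014},
    {"type": "Residential", "zip": "21045", "year": 2013},
]


def residential_permits_by_zip(records, year):
    """Count residential permits issued in the given year, per zip code."""
    return Counter(
        r["zip"]
        for r in records
        if r["type"] == "Residential" and r["year"] == year
    )


counts = residential_permits_by_zip(permits, 2014)
# most_common() sorts from highest to lowest count, like a "top zip codes" table.
for zip_code, n in counts.most_common():
    print(zip_code, n)
```

<p>The same counts can feed a bar chart directly, with one bar per zip code.</p>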
<p><a href="/assets/images/hoco-residential-permits-2014-map.png"><img alt="Howard County map showing residential building permits per zip code" loading="lazy" src="/assets/images/hoco-residential-permits-2014-map-embed.png"></a></p>
<p>However, what I really wanted to create was a map showing exactly where permits were being issued across the county.  The Howard County GIS division provides on data.howardcountymd.gov a set of map data for zip codes within Howard County.  After doing a bit of research and experimentation, in <a href="http://rpubs.com/frankhecker/59816">my third example</a> I was able to use this in conjunction with the building permit data to produce a map that is a nice alternative to the bar chart.</p>
<p>I have to stop here and ask the unspoken question: What’s the point of all this?  I’d answer as follows:</p>
<p>First, this shows that releasing government data empowers people to do interesting things with it, especially when combined with free software and easily available online information and training.  Maybe not everybody is interested in building permit data or any other individual government data set, but I suspect that there are a fair number of people out there who are, including small businesses, nonprofit organizations, or just individual activists and interested citizens.</p>
<p>Second, I did all this in a way that is completely reproducible by anyone else.  How often have you seen a graph or map in a newspaper or government report and wondered, where exactly did that data come from?  Wonder no longer: In my examples I start with the raw data as released by Howard County and show all my work in analyzing the data and creating the tables, charts, and maps.</p>
<p>Finally, this is all reusable and adaptable.  For example, suppose you have a better source of data on construction activity, perhaps one that gives the actual numbers of residential units, commercial square footage, and so on.  You can easily plug that modified data into the analysis steps I’ve documented, and create better versions of the charts and maps in my examples.</p>
<p>You can also reuse the overall technical approach for any type of data tied to a geographic area within Howard County.  For example, in addition to zip code areas the data.howardcountymd.gov site contains map data for Howard County school districts, election precincts, census tracts, and many other subdivisions of the county.  If you have data sets that are based on those subdivisions (for example, vote totals or turnout percentages for precincts) then you can adapt the code I wrote (all of which is in the public domain) to create your own maps showing how that data varies across the county.</p>
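<p>One common way to turn per-area values into a map is to sort each area into a small number of classes and give each class a color.  The Python below sketches that step using equal-width classes; the precinct IDs, turnout percentages, and palette are all hypothetical, and a mapping library would still be needed to draw the shapes.</p>

```python
# Hypothetical per-precinct turnout percentages; the precinct IDs and values
# are made up for illustration.
turnout_pct = {"01-001": 38.0, "01-002": 52.5, "02-001": 45.0, "02-002": 61.0}

# A light-to-dark palette with one color per class (an arbitrary choice).
PALETTE = ["#deebf7", "#9ecae1", "#3182bd"]


def equal_interval_colors(values, palette):
    """Map each area's value to a color using equal-width classes."""
    lo, hi = min(values.values()), max(values.values())
    # Fall back to a width of 1.0 if every area has the same value.
    width = (hi - lo) / len(palette) or 1.0
    colors = {}
    for area, value in values.items():
        # The maximum value would land one past the last class, so clamp it.
        index = min(int((value - lo) / width), len(palette) - 1)
        colors[area] = palette[index]
    return colors


colors = equal_interval_colors(turnout_pct, PALETTE)
for area in sorted(colors):
    print(area, turnout_pct[area], colors[area])
```

<p>Equal-width classes are only one option; quantiles or hand-picked breakpoints may suit skewed data better.</p>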
<p>The bottom line is that the data is out there for the picking, as are the tools to make sense of it.  You just need to spend some time learning how to use them or (if you don’t feel up to the task yourself) finding someone who can.  Have fun!</p>
]]></content:encoded>
    </item>
  </channel>
</rss>
