Monday, July 21, 2014


A while back at Ordinary Times, there was an interesting comment thread on the subject of defining the Midwest region of the US.  One of the thoughts that occurred to me while reading that was whether it was possible to define regions based on inter-state migration patterns.  The idea grew, I suppose, out of my own experience.  I lived and worked in New Jersey for ten years, but never really felt like I fit in there.  Eventually my wife and I moved to Colorado, to the suburbs of Denver, where we immediately felt right at home.  Most people, I thought, might have been brighter than we were and not moved to someplace so "different."

I've also encountered a variety of nifty data visualization tools that look at inter-state migration in the US, like this one and this one from Forbes.  State-level data for recent years turns out to be readily available from the Census Bureau.  We can define a simple distance measure: two states are close if a relatively large fraction of the population of each moves between them each year.  "Relatively" because states with large populations have large absolute migration numbers in both directions.  For example, large numbers of people move between California and Texas -- in both directions -- because those states have lots of people who could move.  From Wyoming, not so many.  Given a distance measure, defining regions becomes a statistical problem in cluster analysis: partition the states into groups so that states within a group are close to each other.  Since there's only a distance measure, and no coordinates to average, hierarchical clustering seems like a reasonable choice.
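A minimal sketch of a distance measure of this kind, in Python. The exact formula (the reciprocal of the summed per-capita flows in each direction) is an assumption made for illustration, not necessarily the one behind the maps, and the state names, flows and populations below are made-up placeholders rather than Census figures.

```python
def migration_distance(flows, population, a, b):
    """Distance between states a and b; smaller means closer.

    flows[(x, y)] is the number of people moving from x to y in a year.
    The formula is an illustrative assumption: add the fraction of a's
    population that moves to b and the fraction of b's population that
    moves to a, then take the reciprocal.
    """
    frac_a = flows.get((a, b), 0) / population[a]
    frac_b = flows.get((b, a), 0) / population[b]
    closeness = frac_a + frac_b
    return float("inf") if closeness == 0 else 1.0 / closeness


# Hypothetical toy data, not real migration numbers.
population = {"Big": 2000, "Medium": 1000, "Small": 100}
flows = {
    ("Medium", "Big"): 10, ("Big", "Medium"): 20,
    ("Medium", "Small"): 1, ("Small", "Medium"): 1,
}

d_big = migration_distance(flows, population, "Medium", "Big")
d_small = migration_distance(flows, population, "Medium", "Small")
d_none = migration_distance(flows, population, "Big", "Small")
```

Dividing by population is what keeps big states from looking close to everybody simply because they have more people available to move.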

The map to the left shows the results of partitioning the 48 contiguous states into seven clusters.  The first thing I noticed about the partition is that states are grouped into contiguous blocks, without exception.  While that might be expected as a tendency [1], I thought there would be at least a couple of exceptions.  The resulting regions are more than a little familiar: there's the Northeast, the Mid-Atlantic, the Southeast, the Midwest (in two parts), the West, and "Greater Texas".  There are a couple of other surprises after reading the discussion at Ordinary Times: Kentucky is grouped with the Midwest, and Missouri and Kansas with Greater Texas.  New Mexico clustered with Texas isn't surprising, but New Mexico with Louisiana and Arkansas?  Hierarchical clustering is subject to a chaining effect: New Mexico may be very close to Texas, and Louisiana also close to Texas, and they get put into the same cluster even though New Mexico and Louisiana aren't very close at all.

One way to test that possibility is to remove Texas from the set of states.  The result of doing that is shown to the left. As expected, New Mexico is now clustered with the other Rocky Mountain states and Louisiana with the Southeast.  Perhaps less expected is that the other four states -- Arkansas, Kansas, Missouri, and Oklahoma -- remain grouped together.  None of them is split off to go to other regions; the four are close to one another on the basis of the measure I'm using here.

Answers to random anticipated questions... I used seven clusters because that was the largest number possible before there was some cluster with only a single state in it [2].  The Northeast region has the greatest distance between it and any of the other regions.  If the country is split into two regions, the dividing line runs down the Mississippi River.  If into three, the Northeast gets split off from the rest of the East.  There are undoubtedly states that should be split, e.g., western Missouri (dominated by Kansas City) and eastern Missouri (dominated by St. Louis); a future project might be to work with county-level data.
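The rule for picking the number of clusters can be sketched in a few lines of Python. The function name and the toy partitions are hypothetical; the sketch only assumes that there is some way to ask for the partition at each cluster count.

```python
def largest_k_without_singletons(partition_for, max_k):
    """Return the largest cluster count reached before a singleton appears.

    partition_for(k) must return the clusters (as sets) when the
    hierarchy is cut into k groups.  In a hierarchical clustering a
    singleton never goes away as k grows, so the search can stop at
    the first k that produces one.
    """
    best = None
    for k in range(1, max_k + 1):
        if any(len(cluster) == 1 for cluster in partition_for(k)):
            break
        best = k
    return best


# Hypothetical nested partitions of five items, for illustration only.
toy_partitions = {
    1: [{"a", "b", "c", "d", "e"}],
    2: [{"a", "b", "c"}, {"d", "e"}],
    3: [{"a", "b"}, {"c"}, {"d", "e"}],
    4: [{"a", "b"}, {"c"}, {"d"}, {"e"}],
}

chosen_k = largest_k_without_singletons(toy_partitions.get, 4)
```

With these toy partitions the search stops at three clusters, where "c" is on its own, and settles on two.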

[1]  My implementation of hierarchical clustering works from the bottom up, starting with each state being its own cluster and merging clusters that are close.  Using the particular measure I defined, close pairs of states include Minnesota/North Dakota, California/Nevada, Massachusetts/New Hampshire, and Kansas/Missouri.  These agree with my perception of population flows.
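A rough Python sketch of bottom-up clustering of this sort. This is not my actual implementation: the rule for measuring the gap between two clusters (single linkage, the distance between their closest members) is an assumption chosen for the example, and the five labels and their positions on a line are invented.

```python
def agglomerate(items, dist, k):
    """Merge the two closest clusters repeatedly until only k remain.

    The gap between clusters is the smallest distance between any pair
    of members (single linkage, an assumption for this sketch).
    """
    clusters = [frozenset([item]) for item in items]

    def gap(c1, c2):
        return min(dist(a, b) for a in c1 for b in c2)

    while len(clusters) > k:
        # Find the pair of clusters separated by the smallest gap.
        i, j = min(
            ((i, j)
             for i in range(len(clusters))
             for j in range(i + 1, len(clusters))),
            key=lambda pair: gap(clusters[pair[0]], clusters[pair[1]]),
        )
        merged = clusters[i] | clusters[j]
        clusters = [c for n, c in enumerate(clusters) if n not in (i, j)]
        clusters.append(merged)
    return clusters


# Invented positions on a line; the distance is just the difference.
position = {"P": 0, "Q": 1, "R": 2, "X": 10, "Y": 11}
regions = agglomerate(list(position), lambda a, b: abs(position[a] - position[b]), 2)
```

In the toy data, P and R end up in the same cluster because each is next to Q, even though they are farther from each other than either is from Q. That is the chaining effect in miniature.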

[2]  The singleton when eight clusters are used is New Mexico.  When ten clusters are used, Michigan also becomes a singleton, and Ohio/Kentucky a stand-alone pair.

Sunday, July 6, 2014

An Update on the War on Coal

[A longer version of this post appeared at Ordinary Times.]

It's been a tough year for coal in the United States. I generally dislike the use of war-on-this and war-on-that. But if the intended meaning is "make it much more difficult and/or expensive to continue burning large quantities of coal to produce electricity," then the phrase is accurate. Where most people who use it are wrong, though, is about just who is fighting the war. It's the federal courts, and to a lesser degree some of the individual states. The EPA is just the tool through which the courts are acting. Well, also the ghosts of Congresses past, who left us with various environmental protection statutes in their current form. Since the SCOTUS hammered the coal side of the fight twice in this just-concluded term, it seems like a good time to write a little status report.

Not all the constituents of coal are combustible. Anywhere from 3% on up is not combustible and is left behind as ash, and even 3% of a billion tons is 30 million tons of ash. A bit more than 40% of coal ash is typically reused in various ways: some of it can replace Portland cement in the right circumstances, some of it can be used as fill for roadbeds, etc. The remainder winds up in landfills or ash ponds. Ash ponds contain an ash/water slurry; the wet ash stays where it's put rather than being blown away by the wind. Ash pond spills are becoming more common. The federal EPA has not regulated ash ponds in the past; in January this year the DC District Court accepted a consent decree between the EPA and several plaintiffs that requires the EPA to issue final findings on ash pond problems by December. The expectation is that the findings will lead to significant new regulation, and increased spending on both existing and future ash ponds. Things are also happening at the state level. The North Carolina Senate unanimously approved a bill last week that would require the closure of all coal ash ponds in the state over the next 15 years. NC's not exactly one of your liberal Northeastern or Pacific Coast states.

Most of the visible pollutants that go up the flue at coal-fired plants have been eliminated. The picture to the left is the Intermountain generating station near Delta, Utah. The visible white stuff escaping from the stack is steam. Not visible are things like mercury compounds, sulfur and nitrogen oxides, and extremely small particles of soot. Those are all precursors to haze, smog, low-level ozone and acid rain, as well as being direct eye, nose, throat and lung irritants. Some of these pollutants can travel significant distances in the open air. In April this year, a three-judge panel of the DC Circuit upheld a tougher rule for emissions of this type of pollutant (the MATS rule). Also in April, the SCOTUS approved the EPA's Cross-State Air Pollution Rule (CSAPR), which will result in tighter controls on this type of emission. Approval of the cross-state rule has been a long time coming, as EPA rules that would regulate cross-state sources made multiple trips up and down the court system. The courts have always held that the EPA should regulate cross-state pollutants; the problem has been finding a technical approach that would satisfy the courts. In EPA v. EME Homer City Generation in April, the SCOTUS reversed the DC Circuit, and the CSAPR will now go into effect.

Finally, last week the Supreme Court issued its opinion in the case of Utility Air Regulatory Group v. EPA. This opinion confirmed the Court's 2007 opinion in Massachusetts v. EPA that the EPA must regulate greenhouse gases. Massachusetts was a suit brought by several states against the Bush EPA, which had decided that carbon dioxide was not harmful. I think Utility is an odd opinion, cobbled together out of three different factions on the court (more about that in a moment). The opinion has three conclusions: (a) the EPA can and must regulate greenhouse gas emissions from stationary sources, (b) the EPA can only regulate greenhouse gas emissions from stationary sources if those sources would have been regulated for non-greenhouse emissions anyway, and (c) the somewhat controversial approach the EPA is taking to the regulation is acceptable. The last one seems to me to have been sort of an afterthought. OTOH, it's likely that we'll see a number of cases about it later when the states make the details of their individual plans known.

The results of the various court decisions are going to have very different effects on different states. Compare California and North Carolina, to pick two (not exactly at random). North Carolina has 43 coal ash ponds; California has none. North Carolina, despite being a much smaller state, generates more than 30 times as much electricity from coal as California; the MATS rule will require much more effort to meet in North Carolina. The CSAPR does not apply to California; but North Carolina power plants will be required to make reductions to improve air quality in downwind states. North Carolina has to reduce the CO2 intensity of its generating plants by more than the national average; California's required reduction is much less than the average, and decisions that California has already made at the state level will probably be sufficient to meet the EPA requirements. North Carolina's electricity rates are likely, it seems to me, to be noticeably higher in the future; California's rates will remain high and perhaps go higher, but aren't going to be driven by these decisions.