The Physical Basis of Proxy Development

Background

Michael Tobis (mt for short), a colleague of mine who works at the University of Texas Institute for Geophysics, runs a very interesting online climate/energy/science magazine called Planet 3.0. Before that, he posted on the science blog Only in it for the Gold (which has now moved to Planet 3.0). mt was kind enough to feature my article refuting Fred Singer's claims (Proxy Evidence for Recent Warming) on Planet 3.0.

A person by the name of Pat Frank responded to this post on WUWT. He claims that there is no physical theory backing proxy reconstructions, that the paleoclimate variables thus obtained are not physically real, and that paleoclimatologists are guilty of "statistical hokum" for scaling a measurement to a trend and calling it temperature. This post is motivated by that accusation.

First of all, let me point out the irony of this situation. Fred Singer claimed that proxies do not show 20th-century warming, and he used this (false) hypothesis to argue that global warming was not happening. Clearly, he puts faith in proxy reconstructions, since he uses them to argue his point. Now we have another denier claiming that proxy reconstructions have no physical meaning, which would nullify what Singer said in the first place! Oh, the irony… In any case, let me bring you some scientific snippets (aka truth) on the topic.

Understanding Proxies

As you are all aware, a paleoclimate proxy is a tool used to infer geophysical variables of the past. Generalizing this concept, a proxy could be anything that reveals past information. For example, wet grass on your front lawn on a clear, cloudless, sunny morning tells you that it rained the night before. Despite not seeing or hearing the rain yourself, believing that it rained is not a long shot; you can even infer that the previous night was a cloudy one. It is logical to subscribe to this stance because we have seen grass become wet and clouds fill the sky when it rains. But how can we be so sure that rain caused the grass to be wet? (What if a neighbor accidentally watered your lawn? What if a pack of dogs peed all over it last night?) There are ways to test this hypothesis, physically and statistically: Is the grass in the backyard wet too? What about the other houses in the neighborhood? How likely is it that a pack of dogs could urinate uniformly over all the grass in the neighborhood?

The philosophy behind proxy-based reconstructions, just as in geology, is rooted in uniformitarianism - the present is the key to the past. A chemical or physical measurement on a proxy variable (say, stable oxygen isotopes in a coral head that has grown for centuries) reveals a significant amount of information about past geophysical parameters, as long as we know how the variable is affected by the relevant geophysical process (e.g., the controls of temperature and salinity on the isotopes). Different proxy variables respond to different physical parameters, and this can be tested, verified and validated by experiment. This procedure is rooted in physics and the scientific method.

The Physics of Proxies: Foraminifera & Stable Isotopes

Let me focus on a proxy that I am familiar with and well within my realm as a researcher to talk about: oxygen-18 in the calcium carbonate shells of planktic foraminifera. Foraminifera are small marine organisms that secrete calcium carbonate shells. Oxygen-18 is a stable isotope of oxygen (it does not undergo radioactive decay) that contains two more neutrons than the far more abundant oxygen-16 (i.e., its atomic mass is greater). The change in the ¹⁸O/¹⁶O ratio of any system undergoing a physical or chemical process is termed isotopic fractionation. We use mass spectrometers to measure this ¹⁸O/¹⁶O ratio in the calcium carbonate of the small shells (reported as δ¹⁸O, in ‰, relative to a standard). We can pin this measurement down to very high precision (error ≈ 0.05‰, i.e., 0.005% – an order of magnitude less than 0.05%, mind you).
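For readers unfamiliar with the δ notation: it is simply the per-mil deviation of a sample's ¹⁸O/¹⁶O ratio from that of an agreed-upon standard (VPDB for carbonates, VSMOW for waters):

\[
\delta^{18}\mathrm{O} \;=\; \left[\frac{(^{18}\mathrm{O}/^{16}\mathrm{O})_{\mathrm{sample}}}{(^{18}\mathrm{O}/^{16}\mathrm{O})_{\mathrm{standard}}} - 1\right] \times 1000\ \text{‰}
\]

Positive values mean the sample is enriched in ¹⁸O relative to the standard; negative values mean it is depleted.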

Nobel laureate Harold Urey, in 1947, explained how the chemical and physical properties of the rarer stable isotope (¹⁸O) depart from those of the more abundant isotope (¹⁶O), owing to their difference in atomic mass, in his landmark paper, The Thermodynamic Properties of Isotopic Substances (Journal of the Chemical Society, 1947). That paper essentially pinpoints how temperature is the dominant control on stable oxygen isotopic fractionation.

As a simple analogy, consider the oxygen you are breathing in right now - it is not pure ¹⁶O₂. It is a mixture of the molecules ¹⁸O-¹⁶O, ¹⁸O-¹⁸O and ¹⁶O-¹⁶O, quantified by a certain ¹⁸O/¹⁶O ratio or δ¹⁸O. If you isolated it (a closed system) and subjected it to a physical process, say liquefaction, isotopic fractionation would occur: the lighter isotope preferentially stays in the vapor phase, so you would measure one δ¹⁸O for the oxygen vapor and a different δ¹⁸O for the liquid oxygen (much as in elementary vapor-liquid equilibrium studies), with the liquid enriched in ¹⁸O and the vapor depleted in it (or more enriched in ¹⁶O). Now, suppose you wanted to change these ratios - how could that be achieved? Urey and others showed that the degree of this partitioning depends on temperature: as the system warms, the preference weakens and the δ¹⁸O values of the two phases draw closer together. Of course, one could also change the ratio by introducing a stream of pure ¹⁸O-¹⁸O vapor or liquid (or some other mixture), but then the system would no longer be closed.
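The strength of this partitioning is captured by a fractionation factor, α. The general form below is standard equilibrium thermodynamics; the constants a and b are system-specific and must be determined by experiment:

\[
\alpha_{l\text{-}v} \;=\; \frac{(^{18}\mathrm{O}/^{16}\mathrm{O})_{\mathrm{liquid}}}{(^{18}\mathrm{O}/^{16}\mathrm{O})_{\mathrm{vapor}}}, \qquad 10^{3}\ln\alpha \;\approx\; \frac{a \times 10^{6}}{T^{2}} + b
\]

Since α is slightly greater than 1 (the liquid is the enriched phase) and the leading term scales as 1/T², α approaches 1 as the absolute temperature T rises - exactly the weakening preference described above.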

Amazingly, Urey predicted that paleotemperatures could be teased out of stable isotopic measurements of old carbonates using this same principle. In the 50s, his student, Cesare Emiliani, carried out isotopic experiments on foraminifera shells and established quantifiable controls for this proxy in the form of a physical transfer function. When CaCO₃ is deposited by these creatures, the resultant δ¹⁸O is a function of the temperature at the time of fractionation. However, since the system is not closed, the δ¹⁸O of seawater must also factor in - i.e., how much ¹⁸O is available to the organism in the first place? Foraminiferal δ¹⁸O is thus a function of the temperature and the δ¹⁸O of the seawater at the time the shell was deposited:

δ¹⁸O_foram = f(T_seawater, δ¹⁸O_seawater)

In other words, ONLY a change in temperature or a change in seawater δ¹⁸O can alter the δ¹⁸O of foraminiferal calcite. If temperature and seawater δ¹⁸O stayed constant through time, the measured δ¹⁸O of the shells would be constant too. Of course, this is not the case. When we measure isotopes on foraminifera shells in a marine sediment core and see that they are not all the same, we can infer that there must have been a change in sea temperature or in seawater δ¹⁸O (which is related to seawater salinity and ice volume).
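To make the transfer function concrete: the classic carbonate paleotemperature equations are quadratic in form. As an illustration (the coefficients vary between calibrations and species, so treat the numbers as representative), the well-known Epstein et al. (1953) calibration is approximately:

\[
T\,(^{\circ}\mathrm{C}) \;\approx\; 16.5 \;-\; 4.3\,(\delta_{c} - \delta_{w}) \;+\; 0.14\,(\delta_{c} - \delta_{w})^{2}
\]

where δ_c is the measured δ¹⁸O of the carbonate and δ_w that of the water it grew in. Note the negative linear term: warmer water yields isotopically lighter calcite.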

Since then, there have been thousands of experiments (laboratory-based, culture experiments, sediment traps) to quantify these estimates accurately and to pin down uncertainties - 60 years is a long time! Even though the quantitative estimates are refined every now and then, thanks to progress in mass spectrometry and in understanding the biology of these creatures, the qualitative inferences (trends, variability) of foraminiferal proxy records from as far back as the 50s still hold true (Milankovitch cycles, ice ages etc.).

In summary, a measurement in a geological artifact (speleothem isotopes, fossil content, paleosol composition, tree-ring width, ice-core bubble makeup etc.) known to respond to a climatic parameter (temperature, humidity, precipitation, pCO2 etc.) in the present is utilized as a proxy for the past. These proxy measurements are independently verified and statistically validated by robust methods of comparison with instrumental data, and there should be a sound physical reason as to why they change with the aforementioned climate parameter (correlation does not imply causation); only then are proxy reconstructions, and their inherent quantitative and qualitative implications, accepted by the community. Nobody merely matches trends and principal components of empirical orthogonal functions to a random measurement in an unknown fossil, as was accused.

The Physics of Proxies: The Literature

There are plenty of articles in the literature that describe the physical basis of each proxy in great detail. Here I have provided a few links to such articles as examples of the scientific scrutiny to which a proxy is subjected before it is used for reconstructing geophysical parameters. Note: I have only included a few proxies off the top of my head. Feel free to include your favorites in the comments.

Take-Home Message

Climatic proxies (including stable isotopes, trace metals and organic biomarkers) are based on sound, well-established, well-understood thermodynamic and physical principles. With respect to isotopic reconstructions, what I have just explained in this post has been known for over 65 years! Stable isotopes play a huge role in the natural sciences today; these principles are even used in oil exploration and the petroleum industry! It is a shame that deniers cannot even perform a cursory Google search before making non-scientific claims. Granted, there are proxies, such as faunal assemblages, where species diversity responds to more than one parameter, complicating the transfer functions, and there are newer proxies, such as TEX86 paleothermometry, whose biological underpinnings aren't fully understood. However, the real strength of proxies lies in how reproducible and repeatable the measurements are. So, you have reconstructed sub-annual sea surface temperatures from a coral head - what does another coral from another colony indicate? OK, you have estimated paleotemperatures from isotopes in a marine core - how do TEX86 measurements from the same core correlate with those?

To state that paleoclimatologists don't understand the fidelity of proxies is to be in denial. In fact, paleoclimatologists themselves are the most critical of proxy measurements and their translation into reconstructed variables. With advancing scientific progress in instrumentation and analytical techniques, new proxies are being developed as we speak. Harry Elderfield has an amusing graph on the confidence placed in newly proposed proxies over their lifetime.

We paleoclimatologists are well within our rights as scientists to state that proxies do indeed show a 20th-century warming, and this with sound physical reasoning and not mere 'statistical hokum'.

Proxy Evidence for Recent Warming

Dr. Fred Singer visited the UT campus last week and gave a talk containing the usual climate denial yarns, during which, I hear, he most artfully (not!) dodged scientific questions. I wish I could've attended, but unfortunately I had a class at the same time. This post was motivated by the following claim of his in a WUWT post (in response to the BEST results being publicized, which were unfavorable to his interests), which he rehashed in the talk as well:

And finally, we have non-thermometer temperature data from so-called proxies: tree rings, ice cores, lake and ocean sediments, stalagmites. Most of these haven’t shown any warming since 1940!

To put it simply: this is false.

Here, I have compiled a (very short) list of scientific articles in which the authors report recent warming in various proxy data. Ice cores, foraminifera, diatoms, stalagmites, corals and lacustrine & marine sediment cores are among the proxies listed. Not only do these different proxies from around the world show a pronounced warming in the late 20th century, they are also useful in revealing the fossil-fuel source signature of the carbon dioxide recently accumulating in the atmosphere (the Suess effect, see here and here).

Proxy evidence for recent warming:

This is a very small subset of papers in which authors report late-20th-century warming via non-tree-ring proxies. Coincidentally (or not), the marine sediment cores that I am currently working on show a large 20th-century warming signal as well. In fact, I would place more trust in proxies than in early instrumental (pre-1950) or reanalysis data for accurately recording temperature and other climatic variables.

Summary: Dr. Singer's claim is false.

MATLAB: Handling NetCDF files & HadISST data

This is a rehashed post from my old blog, where it proved popular. It is a set of basic instructions for handling NetCDF files in MATLAB - something that can be very handy in climate science. There are various instrumental records (thermometer- and satellite-based measurements) of global temperature variability, specified by different parameters (sea-surface temperatures, marine air temperatures, land temperatures, combined land-sea, 5°x5° gridded, 1°x1° gridded and so on). Most of these are open for public use and require citation in scientific publications. Careful consideration is required in choosing the data set you want to work with (each has its own inherent errors), depending on the question you want to answer. Recently, I've been working with the Hadley Centre Sea Ice and Sea Surface Temperature (HadISST) data set. This data set gives you global, 1°x1° gridded, sea-surface temperature (SST) data from 1870 to the present (updated on the 2nd of every month). The provided array has three dimensions (longitude, latitude and time) storing the SSTs.

For my use, I needed the complete SST time series from 1870 up to the present, but for only one 1°x1° grid point. You can see that this requires (basic) manipulation of the given data set.

The problem is that many data sets are distributed as ASCII characters in .txt files. These are tough to work with on a non-Linux system, and it takes a long time to edit and optimize such files for statistical/computational use in most software. The good thing is that most of the data sets are also provided in NetCDF, or .nc, format. Network Common Data Form (netCDF) is an open standard - a set of software libraries and data formats that support the creation, access, and sharing of array-oriented scientific data. The project was initiated by the University Corporation for Atmospheric Research (UCAR). I couldn't find a simple method online for manipulating these big files (~400 MB-4 GB in size), be it in .txt or .nc form.

Without going into the intricacies of netCDF libraries and formats, here is the easiest way of manipulating netCDF (and hence, global temperature) data sets in basic MATLAB (no fancy toolboxes required!):

  • Download the netCDF version of the data set (the .nc format).
  • If you have a later version of MATLAB, there are built-in functions capable of reading netCDF files (the low-level netcdf package; newer releases also provide ncread and ncinfo); otherwise you can download the required functions/libraries here.
  • Create a netCDF file handle for the data file using the netcdf.open function. Use the 'NC_NOWRITE' mode to specify read-only access (you typically don't want to tamper with the original .nc file).
  • Figure out the contents of the file using the netcdf.inq function, which tells you how many dimensions and variables the creator of the file used; netcdf.inqVar then gives each variable's name and dimensions (if available, you can also use ncinfo).
  • Assign the complete data set of the particular variable you want (usually the last variable in the netCDF file - the one holding every data point in the array) to a MATLAB array.
  • This new array takes the dimensions of the complete data set.
  • Now you are ready to go - you can manage the huge data set through simple array manipulation.

For example:

had = netcdf.open('HadISST.nc','NC_NOWRITE');                % open the file read-only
[varname, xtype, varDimIDs, varAtts] = netcdf.inqVar(had,4)  % 4 is a variable ID (here, SST); no semicolon, so the result is displayed
varid = netcdf.inqVarID(had,varname);                        % look up the ID from the variable name
data = netcdf.getVar(had,varid);                             % this is the full data set

In the case of the HadISST data set, changing the variable ID (i.e., the 4 in the second line) yields different parameters (0 - longitude, 1 - latitude, 2 - time, 3 - specific months, 4 - SST). However, since you ultimately want to work with the SSTs, variable ID 4 yields the complete SST data set. Now it is a matter of simple array manipulation to obtain the subset you require, be it a particular time slice, a particular range of latitudes, a single spatial point or a single month's global data. To key in on a particular parameter, it is useful to run netcdf.inqVar on the variables other than SST (or, in a broader sense, other than the main data variable). Once you gather more experience with this basic method, you can look at netCDF handling toolboxes (I would recommend mexcdf - particularly, nc_dump comes in handy).
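As a concrete illustration of that last step, here is a minimal sketch of how one might pull the full time series for a single grid point. It assumes the array returned by netcdf.getVar is ordered longitude x latitude x time (as described above) and that the coordinate variables are named 'longitude' and 'latitude' - verify both with netcdf.inqVar on your copy of the file. The target location is hypothetical:

lon = netcdf.getVar(had, netcdf.inqVarID(had,'longitude')); % assumed variable name
lat = netcdf.getVar(had, netcdf.inqVarID(had,'latitude'));  % assumed variable name
[~, i] = min(abs(lon - (-95.5)));   % nearest grid cell to a hypothetical 95.5°W...
[~, j] = min(abs(lat - 27.5));      % ...and 27.5°N
sst_point = squeeze(data(i, j, :)); % full 1870-present SST series at that cell
sst_point(sst_point < -100) = NaN;  % mask land/ice fill values (large negative numbers; check the file's _FillValue attribute)
netcdf.close(had);                  % close the file when done

From here, plot(sst_point) gives the complete series for that grid point.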

The specific data set, finally obtained as an array, can easily be written out in any required format (.xls, .xlsx, .dat, .xml etc.) through MATLAB. This is probably the easiest way of extracting data from a global temperature database, fit for use in programs such as Excel or SigmaPlot. Any corrections/suggestions for improving this method are welcome.
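For instance, a minimal sketch of the export step (the file name is arbitrary; writetable needs a fairly recent MATLAB release, while older versions can use csvwrite or xlswrite instead):

months = (1:numel(sst_point))';                        % simple index; replace with real dates if needed
T = table(months, sst_point, 'VariableNames', {'month_index','sst'});
writetable(T, 'hadisst_point.csv');                    % .txt, .xls and .xlsx extensions also work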