Back at sea!

The Trap (Picture courtesy Eric Tappa)

Last week I was out in the Gulf of Mexico aboard the R/V Pelican for a short research cruise. Our main intent was to find and redeploy a long-running sediment trap. A sediment trap is an instrument used in oceanographic studies to "trap" sediment formed in the column of water above it. Sediment traps are extremely useful for quantifying fluxes of marine sediment and for constraining the variability in sediment production over time. Mainly, we were interested in quantifying the flux of planktic foraminifera: which foram species grow throughout the year, which species prefer warmer or cooler waters, how accurately their shell chemistry reflects environmental conditions (temperature, salinity), etc. In essence, we are trying to ground-truth the down-core variability we observe in the chemistry of forams preserved in marine sediment cores, which is used to reconstruct ancient water conditions. We can take the chemistry of the shells obtained from the sediment trap and use known, instrumental temperature and salinity conditions to build transfer functions for ancient, down-core chemical variations in the shells. Remember, these planktic forams live in the upper water column of the ocean and build their shells with a chemistry dependent on the environmental conditions under which they grew. After they die, the shells fall towards the seafloor. Our sediment trap catches these shells and preserves them in cups. The trap is programmed to automatically close a cup every 7 or 14 days and open a new one. As the cups fill up over a couple of months, we need to go out to sea, retrieve the trap, put in new cups, perform routine maintenance, and redeploy the instrument.

My journey started with a flight to St. Petersburg, Florida. Our lab collaborates extensively with the USGS Coastal and Marine Science Center located in St. Pete. Here, I was invited to give a talk on my master's work on single forams by Julie Richey, who studied the Little Ice Age and Medieval Climate Anomaly in the Gulf of Mexico for her PhD work and now oversees the center's paleoceanography program. St. Pete is a cool little town and I greatly enjoyed chatting with the folks at USF and USGS. After packing all the equipment and material needed for our research cruise, thanks to the meticulous work of Caitlin Reynolds (a USGS co-author on my AGU presentation who has made the sediment trap "her baby"), we were off to New Orleans, Louisiana - an ~11 hr drive!

We stayed overnight in NOLA and picked up more material for the cruise from Brad Rosenheim's lab at Tulane University. Brad's recent master's graduate, Matt Pendergraft (who has an excellent paper and video abstract out), would join us for the cruise. Next, we drove to LUMCON (Louisiana Universities Marine Consortium) in Cocodrie, LA with all our equipment to set sail on the Pelican.

The R/V Pelican is a ~120 ft. boat with a wide A-frame capable of deploying a variety of oceanographic instruments. The crew were an excellent bunch, very knowledgeable about our scientific operation, and included a great cook (always good for morale out at sea). At Cocodrie, we were joined by Eric Tappa, a research associate and sediment trap expert from the University of South Carolina. He brought two USC students, Natalie Umling and Jessica Holm, along for this cruise (the more hands the better!)

The Crew (Picture courtesy Eric Tappa)

At around 7 PM on Thursday, the 21st of November, we were off! It took around 12 hrs to get to the sediment trap site. Fortunately, the weather was great and the seas were calm. After we reached the vicinity of where the trap was last deployed (thanks to GPS), we sent out an acoustic ping to make sure it was nearby. Thankfully, we "heard" the sed. trap ping back. The sediment trap is maintained at a depth of ~700 m by two strategically chosen buoys that give it buoyancy and an anchor that holds it down. The anchor is attached to the sediment trap via an acoustic release. At the site, we send out a signal commanding the release to detach itself from the anchor, thereby enabling the buoys to push the trap up to the surface ocean.

Seeing the buoys surface is a big relief! The sediment trap setup has survived for six months without going awry! Next, we pick up the sediment trap, install new cups, perform maintenance, redeploy it with a new anchor, and hope that it survives until we're back.

While we were out there, Julie and I wanted to get some core-top material (the topmost portion of the seafloor). Core-tops are another means through which paleoceanographers can ground-truth down-core variability. For this operation, we turned to a multicorer (here's a neat underwater video). After getting successful core recovery (a total of 4 casts), we had to extrude and sub-sample all the core material at 0.5 cm/sample (conventional sampling resolution). Mind you, there were 8 multicores per cast, each ~45 cm long, which equates to a lot of extruding!

The Cores (Picture courtesy Eric Tappa)

The journey back to Cocodrie was largely uneventful and, much to our liking, the seas stayed calm. It had been almost a year since I was last out at sea, and going back only reminded me how much I like it out there!

Sticky Statistics: Getting Started with Stats in the Lab

Courtesy: xkcd

A strong grasp of statistics is something every analytical laboratory worker should possess. I think it is immensely important to understand the limitations of the process by which any data are measured, and the associated precision and accuracy of the instruments used to measure said data. Apart from analytical constraints, the samples from which data are measured aren't perfect indicators of the true population (true values), and hence sampling uncertainty must be dealt with carefully as well (e.g. sampling bias).

Analytical (or measurement) uncertainty and sampling uncertainty can both influence the outcome of a hypothesis test. In certain cases, analytical uncertainty may be the more pivotal of the two; in others, sampling uncertainty may prove more influential. Regardless, both must be accounted for when testing (and conceiving) a hypothesis.

Consider a paleoclimate example where we measure stable oxygen isotopes in planktic foraminiferal shells with a mass spectrometer whose precision is 0.08‰ (that's 0.08 parts per 1,000), based on known standards. With foraminifera, we take a certain number of shells (say, n) from a discrete depth in a marine sediment core and obtain a single δ18O number for that particular depth interval. This depth interval represents Y years, where Y can range from decades to millennia depending on the sedimentation rate at the site where the core was collected. The lifespan of foraminifera is about a month (Spero, 1998). Therefore, the measurement represents the mean of n months within those Y years; it does not give you the mean of the continuous δ18O signal over that time interval (the true value). Naturally, as n increases and/or Y decreases, the sampling uncertainty decreases. There may be several additional sampling complications, such as the productivity and habitat of the analyzed species, which may bias the data towards, say, summer months (as opposed to a mean annual measurement) or deeper-water δ18O (as opposed to sea-surface water). Hence, both foraminiferal sampling uncertainty (first introduced by Schiffelbein and Hills, 1984) and analytical uncertainty must be considered while testing a hypothesis (e.g. "mean annual δ18O signal remains constant from age A to age D"; the signal-to-noise ratio invoked by your hypothesis will determine which uncertainty plays a bigger role).
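To make that shell-counting intuition concrete, here is a minimal Monte Carlo sketch in Python/NumPy. All the numbers are illustrative assumptions, not measurements (the true mean, the seasonal amplitude, and Y are invented; only the 0.08‰ precision comes from above). It draws n random "shell months" from a synthetic monthly δ18O series, adds analytical noise, and shows how the spread of repeated measurements shrinks as n grows:

import numpy as np

rng = np.random.default_rng(42)

# Assumed, illustrative numbers -- not measurements:
Y = 100                      # years spanned by one sampled depth interval
true_mean = -1.0             # "true" mean d18O (per mil)
seasonal_amp = 0.5           # seasonal cycle amplitude (per mil)
analytical_sd = 0.08         # instrument precision from the text (per mil)

# Synthetic "true" monthly d18O series with a seasonal cycle
months = np.arange(12 * Y)
monthly_d18O = true_mean + seasonal_amp * np.sin(2 * np.pi * months / 12)

def one_measurement(n_shells):
    """One d18O 'measurement': the mean of n shells (n random months),
    plus analytical noise from the mass spectrometer."""
    shells = rng.choice(monthly_d18O, size=n_shells)
    return shells.mean() + rng.normal(0, analytical_sd)

for n in (5, 20, 60):
    reps = [one_measurement(n) for _ in range(10000)]
    print(f"n = {n:2d} shells: 1-sd spread of repeated measurements = {np.std(reps):.3f} per mil")

Note that even at large n the spread never drops below the 0.08‰ analytical floor, which is exactly the signal-to-noise trade-off invoked above.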

Here are two recent papers that are great starting points for working with experimental statistics in the laboratory (shoot me an email if you want pdf copies):

  1. Know when your numbers are significant - David Vaux

  2. Importance of being uncertain - Martin Krzywinski and Naomi Altman

Both first authors have backgrounds in biology, a field in which, I am led to believe, heinous statistical crimes are committed on a weekly (journal-issue) basis. Nonetheless, statistical crimes also occur in paleoclimatology and the geosciences (and a myriad of other fields, I'm sure). The first paper urges experimentalists to use error bars on independent data only:

Simply put, statistics and error bars should be used only for independent data, and not for identical replicates within a single experiment.

What does this mean? Arvind Singh, a friend and co-author at GEOMAR (whom I have to thank for bringing these papers to my attention), and I had an interesting discussion that I think highlights what Vaux is talking about:

Arvind: On the basis of Vaux's article, error bars should be the standard deviation of 'independent' replicates. However, it is difficult (almost impossible) to do this for my work; e.g., I take 3 replicates from the same Niskin bottle for measuring chlorophyll, but then they would be dependent replicates, so I cannot have error bars based on those samples. And as per Vaux's statistics, it appears to me that I should've taken replicates from different depths or from different locations, but then those error bars would be based on the variation in chlorophyll due to light, nutrients, etc., which is not what I want. So tell me, how would I take true replicates of independent samples in such a situation? I've discussed this with a few colleagues of mine who do similar experiments and they also have no clue.

Me: I think when Vaux says "Simply put, statistics and error bars should be used only for independent data, and not for identical replicates within a single experiment," he is largely talking about the experimental, hypothesis-driven, laboratory-based bio community, where errors such as analytical error may or may not be significant in altering the outcome of the result. In the geo/geobio community, at least, we have to quantify how well we think we can measure parameters, especially field-based measurements, which easily have the potential to alter the outcome of an experiment. In your case, first, what is the hypothesis you are trying to put forth with the chlorophyll and water samples? Are you simply trying to see how well you can measure it at a certain depth/location such that an error bar may be obtained, which will subsequently be used to test certain hypotheses? If so, I think you are OK in measuring the replicates and obtaining a standard deviation. However, even here, what Vaux says applies to your case, because a 'truly independent' measurement would be a chlorophyll measurement on a water sample from another Niskin bottle at the same depth and location. This way, you remove the codependent measurement error/bias that could arise from sampling the same bottle. So, in my opinion, putting an error bar on the chlorophyll mean from a particular depth/location can be done using x measurements of water samples from each of n Niskin bottles, where x can be as low as 1.
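To illustrate the point of that exchange, here's a hedged toy simulation in Python/NumPy. The chlorophyll value and both spreads are invented numbers, chosen only so that bottle-to-bottle variability dwarfs analytical variability. Three aliquots from one bottle recover only the analytical spread, while one aliquot from each of three bottles recovers the full sampling spread:

import numpy as np

rng = np.random.default_rng(0)

# Invented, illustrative numbers:
true_chl = 1.0         # "true" chlorophyll at this depth/location (mg m^-3)
bottle_sd = 0.15       # bottle-to-bottle (sampling) variability
analytical_sd = 0.03   # within-bottle (measurement) variability

dep_sds, ind_sds = [], []
for _ in range(10000):
    # Dependent replicates: 3 aliquots from ONE bottle share its offset
    bottle = true_chl + rng.normal(0, bottle_sd)
    dependent = bottle + rng.normal(0, analytical_sd, size=3)
    # Independent replicates: 1 aliquot from each of 3 separate bottles
    independent = (true_chl + rng.normal(0, bottle_sd, size=3)
                   + rng.normal(0, analytical_sd, size=3))
    dep_sds.append(np.std(dependent, ddof=1))
    ind_sds.append(np.std(independent, ddof=1))

print("typical sd, 3 aliquots from one bottle:", round(np.mean(dep_sds), 3))  # ~analytical only
print("typical sd, 1 aliquot from 3 bottles  :", round(np.mean(ind_sds), 3))  # includes bottle spread

An error bar built from the same-bottle replicates would badly understate the uncertainty in the chlorophyll mean, which is Vaux's point in a nutshell.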

While Vaux's article focuses on analytical uncertainty, the second paper details the importance of sampling uncertainty and the central limit theorem. The Krzywinski and Altman article introduced me to the Monty Hall game show problem, which highlights that statistics can be deceptive at first glance!

Always keep in mind that your measurements are estimates, which you should not endow with “an aura of exactitude and finality”. The omnipresence of variability will ensure that each sample will be different.
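And since statistics can indeed deceive at first glance, the Monty Hall problem mentioned above is a fun one to settle by brute force. Here's a quick simulation in plain Python (the trial count is an arbitrary choice):

import random

def play(switch, trials=100_000):
    wins = 0
    for _ in range(trials):
        car = random.randrange(3)     # door hiding the car
        pick = random.randrange(3)    # contestant's first pick
        # Host opens a door that is neither the pick nor the car
        opened = next(d for d in range(3) if d != pick and d != car)
        if switch:
            # Switch to the one remaining unopened door
            pick = next(d for d in range(3) if d != pick and d != opened)
        wins += (pick == car)
    return wins / trials

print("win rate if you stay  :", play(switch=False))  # ~1/3
print("win rate if you switch:", play(switch=True))   # ~2/3

Switching really does win about two-thirds of the time, counterintuitive as it feels.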

In closing, another paper that I would highly recommend for beginners is David Streiner's 1996 paper, Maintaining Standards: Differences between the Standard Deviation and Standard Error, and When to Use Each, which has certainly proven handy many times for me!
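As a parting sketch of Streiner's distinction (Python/NumPy, on an arbitrary standard-normal sample): the standard deviation describes the spread of the data and stabilizes as n grows, while the standard error (SD/√n) describes the uncertainty of the sample mean and keeps shrinking:

import numpy as np

rng = np.random.default_rng(7)
for n in (10, 100, 1000):
    x = rng.normal(loc=0.0, scale=1.0, size=n)  # arbitrary standard-normal data
    sd = x.std(ddof=1)           # spread of the data (stabilizes near 1)
    se = sd / np.sqrt(n)         # uncertainty of the mean (shrinks with n)
    print(f"n = {n:4d}: sd = {sd:.3f}, se of mean = {se:.4f}")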

One of the more bizarre papers I have come across...

...is written by Oleg McNoleg, published in the peer-reviewed journal Computers & Geosciences. Oleg is affiliated with the prestigious Brigadoon University of Longitudinal Learning, School of Holistic Information Technology, situated in Noplace, Neverland. The title of the paper: The Integration Of GIS, Remote Sensing, Expert Systems And Adaptive Co-Kriging for Environmental Habitat Modeling of the Highland Haggis using Object-Oriented, Fuzzy-Logic and Neural-Network Techniques (phew).

So, what does Oleg McNoleg have to say about the habitat of the Highland Haggis? But first, what is a Haggis? A Haggis is a mythological Scottish creature that vaguely reminds me of the misconceptions associated with lemmings. McNoleg writes:

The Highland Haggis is unique amongst all mammals in that it has a pair of legs (either left or right) that are shorter (longer) than the other pair... It is a sad consequence that each year, many fledgling Haggis die whilst attempting to move upslope...

McNoleg then dives into the theoretical aspects of incorporating various geographical techniques to model the habitat of the Highland Haggis. This, of course, includes the insertion of data from a digital elevation model (DEM) that is hierarchically decomposed (?) into a Polymorphic Euclidean Adaptive Region tree (PEARtree - see figure) - of course. Then, McNoleg provides a mathematical framework for modeling Haggis habitats using geophysical data, because "It has become customary for papers to contain copious quantities of gratuitous mathematics (Heckbert, 1987; Well and Du, 1993; Rull, 1993)" - where Heckbert (1987) is titled 'Ray tracing in jello brand gelatin', Rull (1993) is titled 'BARRY: An autonomous train-spotter', and there is no reference for Well and Du (1993).

After the theory has been established, what are the results of this study?

Honestly, this may be the most glorious Academia Bizarro entry thus far. Hats off to Oleg McNoleg for this wonderfully entertaining paper, chock-full of ridiculous and bizarre references/ideas. You must read it in its entirety to fully grasp the depth of this article. Also, hats off to the editor(s?) of Computers & Geosciences for okaying publication (full disclosure - I was rejected from this journal!). Funnily enough, the Haggis paper has been cited 29 times, and I have a hunch as to whom the nom de plume Oleg McNoleg belongs. Hat tip to Lars Beierlein for bringing this article to my attention!