planetwater

groundwater, geostatistics, environmental engineering, earth science

Days 2&3 at #spatialstatistics2017


It became increasingly difficult to post updates on the spatial statistics conference: the icebreaker, another day full of diverse and interesting talks, the dinner, and a final day that closed the conference with an interesting session honouring the achievements of Peter Diggle. Former and current colleagues such as Paulo Ribeiro and Emanuel Giorgi gave enlightening talks that stressed both the scientific achievements and the great kindness and humanity of Peter Diggle. CHICAS, the Centre for Health Informatics, Computing and Statistics, is the current culmination of his efforts.

It’s hard to pick topics that stood out during the last two days of the conference, just because there were many great talks on a large variety of topics. Here is an attempt.

Point Processes

There were a number of talks covering Point Processes, notably the keynotes by Thordis Thorarinsdottir and Rasmus Waagepetersen. Thordis had a variety of interesting quotes including this one by Frank H Bigelow from 1905:

There are three processes that are generally essential for the complete development of any branch of science, and they must be accurately applied before the subject can be considered to be satisfactorily explained. The first is the discovery of a mathematical analysis, the second is the discussion of numerous observations, and the third is a correct application of the mathematics to the observations, including a demonstration that these are in agreement.

Thordis stressed the need for more and better inference methods. It might be worth pointing out that Bigelow went on to state that

Often a good theory is misapplied to good observations, or good observations are explained by a poor theory.

In summary, these thoughts are not too far away from Peter Diggle’s triangle, pictured above.

Copulas

There were two nice talks that employed copulas for multivariate spatial models and one that I missed, unfortunately:

  • Jonathan Tawn from Lancaster University presented on “Modelling Spatial Extreme Events”; he takes great care with the marginal distributions and with how to reasonably include extremes there, for a better joint representation in copula space;
  • Fakhereh Alidoost and Alfred Stein from the University of Twente presented on “Interpolation of Daily Mean Air Temperature Data via Spatial and Non-Spatial Copulas”;
  • the talk that I missed was entitled “Hierarchical Copula Regression Models for Areal Data”, presented by D. Musgrove, J. Hughes, and L. Eberly.

Various

Written by Claus

July 12th, 2017 at 3:48 pm


Day 1 at #spatialstatistics2017


Peter Atkinson opened the conference by pointing out its broad scope: “one health” (e.g., CDC, UC Davis), which relates to human, veterinary, and environmental health. I was glad that my talk on interpolating groundwater quality data fit right into that scope.

I saw too many interesting talks and met too many interesting and nice people to list everything here. Instead, here is a small selection.

Connections

First off, it’s nice to encounter similarly minded work. Particularly, I was happy to see the following presentations:

  • Emilie Chautru presented a poster entitled “Cokriging of Nonnegative Data on the L1 Sphere”, on Cokriging compositional data;
  • Svenia Behm from the University of Passau presented a talk entitled “Statistical Inference in the RIO Model – the Detrending Step Revisited”. She calculates something similar to my “locally mixed distributions”;
  • A. Lawson pointed out the importance of properly taking censored measurements and true zeros into account, both in his keynote (“One Health: Spatial Statistics at the Border of Human and Veterinary Health”) and in his talk (“Bayesian Cure-Rate Survival Model With Spatially Structured Censoring”). I didn’t talk about it at this conference, but it is dear to my heart;

Cool Stuff

  • M. Pereira showed cool images of road crash density estimates based on data from Paris, France. Benedikt Gräler showed a poster on the Envirocar initiative: data related to driving patterns and fuel consumption are collected while driving, analysed, and can be viewed online.
  • Samir Bhatt gave a great keynote presentation on mapping malaria endemicity. Besides the interesting issues related directly to malaria, this talk raised some interesting questions on modelling philosophies. Samir Bhatt proposed “richer models” as a way forward beyond his current practice of using multivariate models. Alternatively, he phrased it as models that “include mechanisms”. Peter Diggle asked how his approach relates to the concept of parsimony. It is interesting to me that Samir Bhatt suggests including mechanistic models in his data-driven models, whereas for the groundwater quality mapping project I am working on, I have moved to a stochastic model. At the scale of the state, I see that deterministic, PDE-based models are not feasible (too many unknown parameters and processes).

Written by Claus

July 6th, 2017 at 8:44 am


New Papers!


I published two new papers recently! Find the titles and the links to more information below. Happy reading!

  1. “Detecting and Modelling Structures on the Micro and the Macro Scales: Assessing Their Effects on Solute Transport Behaviour” – This paper sheds light on a tricky issue: is a spatial data-set stationary or not? It presents a method that helps to delineate a boundary (“macroscale”) between regions that are at least somewhat more stationary than the entire domain. Furthermore, this paper
    • validates the algorithm based on a data-set where a boundary layer has previously been delineated;
    • demonstrates the effects of the macro structure and of the smaller-scale heterogeneity (“micro structure”) on solute transport behaviour; the micro structure is modelled by multivariate Gaussian and multivariate non-Gaussian structures.
  2. “Estimating a Representative Value and Proportion of True Zeros for Censored Analytical Data with Applications to Contaminated Site Assessment” – True zeros, such as no precipitation, occur frequently in nature. This is one of the very few studies I know of that treats those values in a statistically meaningful way and that is based on a real-world data-set. We applied the methodology to a data-set related to contaminated sites, but the approach has implications elsewhere as well (a rough sketch of the general idea follows after this list).
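
To illustrate the general idea behind the second paper (this is not the method from the paper, just a hedged sketch): censored analytical data can be viewed as a mixture of true zeros and a continuous concentration distribution that is only partially observed because of a detection limit. A minimal maximum-likelihood fit of such a zero-inflated, left-censored lognormal model could look as follows; the data, the single detection limit, and the lognormal assumption are all made up for the example.

    import numpy as np
    from scipy.optimize import minimize
    from scipy.stats import norm

    # Toy data: detected concentrations plus a count of non-detects below a
    # single detection limit DL (illustrative numbers only).
    detected = np.array([0.8, 1.2, 2.5, 0.6, 3.1, 1.9, 0.9])
    n_censored = 12
    DL = 0.5

    def neg_log_lik(params):
        """Zero-inflated lognormal with left-censoring at DL.
        p: proportion of true zeros; mu, sigma: log-scale parameters."""
        p, mu, sigma = params
        if not (0.0 < p < 1.0 and sigma > 0.0):
            return np.inf
        # a non-detect is either a true zero or a lognormal value below DL
        ll_cens = n_censored * np.log(p + (1 - p) * norm.cdf((np.log(DL) - mu) / sigma))
        # detected values follow the lognormal density, weighted by (1 - p)
        z = (np.log(detected) - mu) / sigma
        ll_det = np.sum(np.log(1 - p) + norm.logpdf(z) - np.log(sigma * detected))
        return -(ll_cens + ll_det)

    res = minimize(neg_log_lik, x0=[0.3, 0.0, 1.0], method="Nelder-Mead")
    p_hat, mu_hat, sigma_hat = res.x
    print(f"estimated proportion of true zeros: {p_hat:.2f}")
    # one possible "representative value": the mean of the fitted mixture
    print(f"mixture mean: {(1 - p_hat) * np.exp(mu_hat + sigma_hat ** 2 / 2):.2f}")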

Written by Claus

June 29th, 2017 at 9:03 pm


Own Your Writing


I just posted on claus-haslauer.de about “Own Your Writing!”.

In this post, I

  • discuss how important it is to own what you write, even if it comes at a cost: knowing the technology, and money. Also, publishing with open solutions seems to have become more complicated than it needs to be.
  • play with the new JSON Feed format (in Python); a minimal sketch of reading such a feed follows below
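
For context, a JSON Feed is just a JSON document with a few required keys. The snippet below is a minimal, generic sketch of reading one in Python; it is not the code from the linked post, and the feed URL is a placeholder.

    import json
    from urllib.request import urlopen

    # placeholder URL; any feed following https://jsonfeed.org works the same way
    FEED_URL = "https://example.com/feed.json"

    with urlopen(FEED_URL) as response:
        feed = json.load(response)

    # the spec requires "title" and "items"; each item has at least an "id"
    print(feed["title"])
    for item in feed["items"]:
        print(item.get("date_published", "n.d."), "-", item.get("title", item["id"]))
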
Written by Claus

    June 21st, 2017 at 9:48 am


    Thresholds


    In my work on spatial dependence, I see that the type of dependence can differ between different ranges of quantiles. More generally, this means that thresholds are an important characteristic of environmental systems.

    This is why I think this video that I noticed on kottke.org is so inspiring: sometimes something small leads to a big change — a “threshold” is “jumped over”:

    Written by Claus

    June 7th, 2017 at 3:37 pm


    Years of Blogging


    • Manton Reece celebrated 15 years of blogging in March. It turns out that planetwater.org does not go back quite that far, but almost 11 years of blogging is not nothing either. The details are actually blurry in my mind. Dayf had definitely started the “boardinger” some time around 2003, we used that for a while, and I guess I moved to WordPress in 2006
    • In fact, the WordPress installation at sysprovide has been running and has been upgraded constantly ever since – until last Thursday, when the first outage occurred (at least to my knowledge)
    • If there is one thing certain, it’s that things do change. As does the blogging frequency. I guess there are certain cycles
    • Recently, I started to take my professional site, claus-haslauer.de, more seriously, and set up a static site generator (Pelican) that supports Jupyter notebooks. While that is pretty exciting, I do not want to neglect planetwater! This is the first post that I am writing and posting with Ulysses rather than with my traditional TextMate / MarsEdit combination

    Yay to Independent Blogging!

    I will use the occasion of looking back, following Gabe Weatherhead’s spirit, to give a shout out to independent blogging.

    The following three blogs have been in my queue since the beginning of time; none of them is particularly related to water:

    I also admit that I follow a few blogs that are on Gabe’s list, and Gabe himself:

    Written by Claus

    April 9th, 2017 at 8:46 pm


    Update on “Learning and Playing”


    Over a year ago I wrote a post on “Learning and Playing”. In the meantime, three important things happened that led me to update the original post:

    • yesterday, Apple announced the release of “Swift Playgrounds”
      • this is Rene Ritchie’s review at iMore. He says: “It’s one of the finest things Apple has ever done, and it’s going to change the way coding is done for the next generation.”
    • Lorena A. Barba published a blog post on “Computational Thinking”
    • “s/buy/make/” published a wonderful empirical statistical analysis that demonstrates how the complexity of Legos is increasing

    An old note for my original post read:

    From a big-picture view, it seems to me that it is easier now to get into computer programming than it was at the time Mindstorms was published. In contrast to this development (there are more high-level programming languages and simpler plotting APIs now), most (young) people today regard “the computer” more as a consumption and communication device than as a device for inventing and trying things out. More and more people use computers, but relatively fewer people use them for creation – and I think creation involves some form of programming.

    I expect that Swift Playgrounds, once it is released, will offer the best platform since the original Logo to learn and play. This is the announcement that made me happiest among all recent Apple announcements. Lorena A. Barba reframed my “learn and play” phrase into “the essence is what we can do while interacting with computers, as extensions of our mind, to create and discover”. I expect Swift Playgrounds to be a wonderful tool for just that.

    Screenshot of the Swift Playgrounds demo site.

    Let’s end with some good news: Lego is still holding onto its original values:

    So what happened with Legos? They made a lot more of them. In doing so, they made a lot of new, specialized bricks, but they made even more general purpose bricks. This trend is easily obscured by the opposite trend in the number of brick types, but from a ‘creative play’ standpoint the bricks you actually end up with are more important than the bricks you could have ended up with.

    And: “yeah… you can do this! Ba… Ba.. Baaa…!” (watch until the end!)

    Written by Claus

    June 14th, 2016 at 11:49 am


    automation: getting papers into papersapp from OmniFocus tasks


    I am using OmniFocus to organize my life and in an attempt to get a few things done. My academic life also involves staying on top of publications. I rely on Papers as my reference manager. In the last little while, I have used OmniFocus to keep track of the papers that I found and wanted to get and / or read. This has been a quite reliable workflow, but it involved quite a bit of manual clicking.


    This post describes how I improved that workflow with a little AppleScript that goes more directly from OmniFocus to Papers:

    • whenever I see a paper that interests me, I capture it into OmniFocus;
      • the title of the task is the title of the paper or whatever else helps me to identify the paper;
      • the note contains the URL to the pdf of the paper, and nothing else;
      • the task gets assigned one particular project, whose sole purpose is to collect papers I want to get / download;
      • the capturing works mostly via Omni’s “clipotron“, the share sheet extension in iOS, or from within Reeder;
    • regularly, I check that project (on its review date) and get the papers. Until very recently, this involved a few steps for each paper: open the note in OmniFocus, click on the link, the relevant page would open in Safari, and I would click on the bookmark tool that I had set up so that the webpage would open in Papers. I replaced this with a very simple AppleScript (see the code listing below)
      • make sure that I am connected to the University’s network to ensure that most papers are accessible;
      • select the perspective that focusses on the project that contains the papers I want to get;
      • select the papers that I want to get, hit a keyboard shortcut associated to the applescript in Keyboard Maestro;
      • tada! Papers opens the links, tries to retrieve the pdf (which works in most cases) and the bibliographic information. All that is left for me to do is add some meta-data and read;

    Below you can find a listing of the AppleScript code. It is fairly simple and contains only a few lines. I can see a few areas where it could be expanded (parse the URL from the note if the note contains more than just the URL; handle the case where the pdf and / or the bibliographic information cannot be retrieved by Papers). But for most of my use cases it works remarkably well. Hence, I would readily agree with John D. Cook’s line of thought that yes, it is a bit about the time being saved, and it is also about not being derailed. It is also about accuracy (as followed up by Dr. Drang), and about knowledge transfer and improved processes, as Mike Croucher points out.

    Here’s the code snippet:

    tell application "OmniFocus"
        -- target the content of the front window
        tell content of front window
            -- get the selected entries
            set theTasks to value of every selected tree
            -- loop over each selected task
            repeat with aTask in theTasks
                tell aTask
                    -- the task name identifies the paper; the note contains only the URL
                    set theTaskName to name
                    set theNote to note
                    -- show the URL that is about to be opened
                    display dialog theNote
                    -- open the URL in Papers; it retrieves the pdf and
                    -- the bibliographic information automatically (mostly)
                    tell application "Papers"
                        open location theNote
                    end tell
                end tell
            end repeat
        end tell
    end tell

    Relevant links regarding AppleScript and the two main software packages used:

    Written by Claus

    December 30th, 2015 at 10:39 am


    Mindstorms: Playing and Learning from Mistakes


    Seymour Papert lays down the foundations of his philosophy about programming and about learning in his book Mindstorms: Children, Computers, and Powerful Ideas. Papert was heavily involved in the creation of the programming language “Logo” (downloadable here). The main use of this programming language is education in general, with learning to program as a side effect. It seems like no coincidence that the name “Lego Mindstorms” is related to the title of Papert’s book, given that the ancestor of the Lego kit was programmed in Logo.

    “Logo” in Python

    There is a Python module called “turtle” that mimics the capabilities of Logo. This is what I used to create the figure below; a small sketch follows after the figure. There are many more examples online, for example here. The point of turtle and Logo is to provide an interface that is easy to grasp but that allows you to build more and more complex things step by step. I have seen an implementation of Tetris using turtle.

    “Das Haus vom Nikolaus” drawn with turtle in Python.
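
    The following is a minimal sketch of how such a figure can be drawn with turtle; it is not the exact script behind the figure above, and the side length and coordinates are arbitrary. The classic rule is that the house must be drawn in a single stroke of eight edges.

        import turtle

        S = 100  # side length of the square part of the house

        # corner coordinates: bottom-left, bottom-right, top-right, top-left, roof apex
        bl, br = (-S / 2, -S / 2), (S / 2, -S / 2)
        tr, tl = (S / 2, S / 2), (-S / 2, S / 2)
        apex = (0, S)

        # the classic single-stroke sequence of eight edges
        path = [bl, br, tl, bl, tr, tl, apex, tr, br]

        t = turtle.Turtle()
        t.penup()
        t.goto(bl)
        t.pendown()
        for corner in path[1:]:
            t.goto(corner)

        turtle.done()  # keep the drawing window open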

    Learning and Playing

    One of Papert’s first points is that the modern car was not created based on an analysis of the shortcomings of horse-drawn coaches. Rather, it happened because some people played around; “experimented” is probably a nicer word. Directly related to playing is the fact that you make mistakes, which according to him is nothing bad, as long as the mistakes are recognized and fixed. Unfortunately, some people are afraid of making mistakes. Papert calls this “mathophobia”.

    In that sense, Papert argues that it is a good thing if a (young) learner follows an epistemology built on two approaches:

    1. take “the novel”, the things to be learned, and provide context by connecting them with something already known
    2. take “the novel”, embrace it and adopt it, and make something new based on it.

    The computer, the turtle, or any programming language offers a valuable tool for playing, because it is very easy to fix mistakes! Hence, computers and programming can take away the fear of making mistakes! This sounds like such an awesome thing, but then, debugging can be quite painful! 😉

    Mistakes in Hydro-Geo-Logy

    Unfortunately, I am aware of only a very limited number of reported failures in hydro-geo-logy. “The Court of Miracles of Hydrology: can failure stories contribute to hydrological science?” is one example. The authors build on Popper instead of Papert, but argue along a similar vein. And in that issue of the journal there is a list of papers that deal with hydro-geo-logical mistakes. One example is models that are “right for the wrong reasons”.

    Incentive

    Keeping all this in mind will hopefully make me a better teacher in the coming term, but it will also hopefully keep me constantly reminded to try things out, if nowhere else than in the code editor! When using computers, I look a lot at data, process data, and analyze the results. Sometimes it is quite astounding, it almost seems like magic to me, that this entire process works. Inevitably, errors occur, and they need to be found and fixed.

    Written by Claus

    September 24th, 2015 at 9:51 am


    Presenting at Heterogeneity Conference in Valencia (MADE site)


    I will be presenting at the AGU Chapman Conference “The MADE Challenge for Groundwater Transport in Highly Heterogeneous Aquifers: Insights from 30 Years of Modeling and Characterization at the Field Scale and Promising Future Directions” (see the conference website).

    Details of my talk at the Chapman Conference:

    Date: Monday, October 5, 2015
    Time: 04:00 PM – 07:00 PM
    Location: Blue Auditorium at the Research Park, located on the campus of the Universitat Politècnica de València
    Title of talk: “Modelling Non-Linear Spatial Dependence with Applications to MADE Hydraulic Conductivity Data”
    Authors: Claus Haslauer, Geoff Bohling

    The MADE site is one of three sites worldwide where very detailed measurements of aquifer properties and of solute transport within these aquifers were taken. The other two sites are in Borden, Canada, and Cape Cod, Massachusetts. The monitoring was so detailed for hydrogeologic applications, e.g. via test well installations used to sample aqueous geochemical parameters, that some people argue that those wells influence the properties of the aquifer (see the figure below).

    The MADE site in 1995; image by Geoff Bohling.

    A review of the first 25 years at the MADE site is given by

    C. Zheng, M. Bianchi, and S. M. Gorelick, “Lessons Learned From 25 Years of Research at the MADE Site,” Ground Water, vol. 49, no. 5, pp. 649–662, Sep. 2011.

    Written by Claus

    September 9th, 2015 at 10:17 am
