`%timeit`

I am happy with Twitterrific, both on the Mac (before and after its revival) and on iOS. I have never used a native Twitter client on any OS. I am not sure how long it has been known that the end of third-party clients could be near. Version 5 of Twitterrific has been out since October 2017. Was it known then?

Now, as I have posted before, I appreciate the free web. The existence of this website is evidence of this. I guess a lot of things can happen until June. It would be nice if open alternatives (e.g., micro.blog, mathstodon) gained more users. On the other hand, on work-related topics, it seems like Twitter has recently crossed a critical mass threshold, and I do enjoy the conversations there. Yet again, I know people who are leaving Twitter because of trolling and because it is not open. As they say, the future remains interesting!

Both apps are enabled via Apple’s ARKit:

- GeoGebra has an app called “GeoGebra Augmented Reality” that allows you to plot functions of two variables on a surface that you can pick, like my coffee table. You can then rotate, walk around it, look on top of it and explore in other ways those functions. Great fun!
- The WWF Free Rivers app puts a simple watershed on a surface you can define (like my sofa). Then clouds move in, and you can paddle down the river. Maybe more for kids. Still fun.

Great to see such nice use cases, and let’s get ready for integrated hydrosystem modelling **#hymod18**!

A typical example where censored measurements play an important role is solute concentrations in groundwater. The measured concentration of a solute depends on the analytical method that was used to quantify it. Sometimes, the concentration is so small that we cannot be certain about its value, and we assume that the true value is somewhere between zero and the analytical detection limit.

In a recent example, my coauthors and I demonstrated the importance of including censored measurements to derive a representative concentration of chlorinated solutes in a hydrogeological layer at two boreholes within a fractured sandstone. Due to the fractured nature of the sandstone, the concentrations at most depths were fairly small and frequently below the detection limit, whereas large concentrations were typically encountered in the fractures. Taking the censored measurements (the concentrations below the detection limit) into account in a statistically meaningful way led to an estimate of representative concentrations that corresponded to the conceptual hydrogeological site model at the upstream and downstream boreholes, which can be important for site assessment.

Related to censored measurements, but different, are true zeros. An example of a true zero is a rain gauge that records no precipitation because it did not rain. The distinction between a true zero and a measurement below the detection limit can be tricky, because both are small values. A true zero means that the value is exactly zero and not somewhere in the interval between zero and the detection limit. If you’re interested in how to include true zeros in this approach, please continue to read here.

If you are interested in a statistically reasonable treatment of censored measurements, you can find the related publication in Environmental Science and Technology.

I’ll explain the basic underlying theory below.

I wrote about conditional probabilities quite some time ago. This can be viewed as an extension.

A crisp condition is something like “what is the probability of event A occurring, given that event B has occurred”. This is how conditional probabilities are typically taught. Compared to a univariate density, a conditional density has a smaller variance and is shifted towards the condition. So far so good.

It turns out that there is also a “not-crisp” condition. This is something like “the probability of event A given that ‘event’ B is somewhere between zero and b”. The funny thing is that the uncertainty under such an interval condition is still smaller than that of the corresponding normally distributed univariate variable.

When looking at the figure below, this means:

- the yellow line indicates a standard normal (Gaussian) density (variance = 1)
- two crisp conditional densities are shown by the solid and the dashed lines. Both of these densities have a smaller uncertainty (variance) than the univariate standard normal
- two interval-based conditional densities are shown in red and blue. The interval-based densities have the same location as the crisp conditionals. Their uncertainties are smaller than that of the corresponding univariate density, but larger than those of the crisp conditionals.
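The ordering of the variances in the list above can be checked numerically. The following is a minimal sketch, not the code behind the figure: it assumes a standard bivariate normal for (X, Y) with a correlation `rho` that I picked arbitrarily, a crisp condition Y = 0.25, and an interval condition 0 < Y < 0.5. All of these values are assumptions for illustration.

```python
import math

def phi(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def Phi(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def crisp_conditional_pdf(x, y, rho):
    """Density of X given Y = y for a standard bivariate normal."""
    s = math.sqrt(1.0 - rho * rho)
    return phi((x - rho * y) / s) / s

def interval_conditional_pdf(x, a, b, rho):
    """Density of X given a < Y < b for the same bivariate normal."""
    s = math.sqrt(1.0 - rho * rho)
    # probability that Y falls into (a, b) given X = x
    weight = Phi((b - rho * x) / s) - Phi((a - rho * x) / s)
    return phi(x) * weight / (Phi(b) - Phi(a))

def moments(pdf, lo=-8.0, hi=8.0, n=4001):
    """Mean and variance of a density by trapezoidal integration on [lo, hi]."""
    h = (hi - lo) / (n - 1)
    xs = [lo + i * h for i in range(n)]
    ws = [h * (0.5 if i in (0, n - 1) else 1.0) for i in range(n)]
    ps = [pdf(x) for x in xs]
    mass = sum(w * p for w, p in zip(ws, ps))
    mean = sum(w * x * p for x, w, p in zip(xs, ws, ps)) / mass
    var = sum(w * (x - mean) ** 2 * p for x, w, p in zip(xs, ws, ps)) / mass
    return mean, var

rho = 0.8          # assumed correlation between X and Y
y_crisp = 0.25     # assumed crisp condition Y = 0.25
a, b = 0.0, 0.5    # assumed interval condition 0 < Y < 0.5

mean_c, var_c = moments(lambda x: crisp_conditional_pdf(x, y_crisp, rho))
mean_i, var_i = moments(lambda x: interval_conditional_pdf(x, a, b, rho))
print(f"crisp:    mean={mean_c:.3f}, variance={var_c:.3f}")
print(f"interval: mean={mean_i:.3f}, variance={var_i:.3f}")
```

Under these assumptions the crisp conditional variance equals 1 - rho², and the interval-based variance lies between that value and the univariate variance of 1.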

tl;dr: The statistics for both teams seem to suggest that the series is very close. Guess what, this is also what I saw when I watched it. Despite this similarity, the numbers favour Cologne slightly but consistently. Granted, the analysis relies mostly on averages and does not discriminate in much depth.

For fun, I linked AppleScript (which digs into my database on macOS) with Python, which processes the data (creates a histogram).

The process worked nicely, and being able to debug AppleScript is wonderful.

More info at claus-haslauer.de

Barry’s impact on the assembled Goddard employees was immediate; from the moment she arrived, she insisted on abandoning all electronic devices. “They were really flipped out about it,” says Barry. “The phone gives us a lot but it takes away three key elements of discovery: loneliness, uncertainty and boredom. Those have always been where creative ideas come from.”

At the time of writing, the Süddeutsche Zeitung insists that social media (WhatsApp) “belong in classrooms”.

**update 2017-Oct-11**

- the Tagesschau reports that 14- to 29-year-old Germans are online for about 4.5 hours per day
- The Guardian has a longer report on how smartphones are hijacking our minds. The text warns about a much more severe consequence: “Drawing a straight line between addiction to social media and political earthquakes like Brexit and the rise of Donald Trump, they contend that digital forces have completely upended the political system and, left unchecked, could even render democracy as we know it obsolete.” The article goes on to explain how certain hooks are built into smartphone-related technology, designed to keep you there and to earn advertising dollars for the companies.

It’s hard to pick topics that stood out during the last two days of the conference, just because there were many great talks on a large variety of topics. Here is an attempt.

There were a number of talks covering point processes, notably the keynotes by Thordis Thorarinsdottir and Rasmus Waagepetersen. Thordis had a variety of interesting quotes, including this one by Frank H. Bigelow from 1905:

There are three processes that are generally essential for the complete development of any branch of science, and they must be accurately applied before the subject can be considered to be satisfactorily explained. The first is the discovery of a mathematical analysis, the second is the discussion of numerous observations, and the third is a correct application of the mathematics to the observations, including a demonstration that these are in agreement.

Thordis urged the need for more and better inference methods. It might be worth pointing out that Bigelow went on to state that

Often a good theory is misapplied to good observations, or good observations are explained by a poor theory.

In summary, these thoughts are not too far away from Peter Diggle’s triangle, pictured above.

There were two nice talks that employed copulas for multivariate spatial models and one that I missed, unfortunately:

- Jonathan Tawn from the University of Lancaster presented on “*Modelling Spatial Extreme Events*”; he takes great care of marginal distributions and how to reasonably include extremes there for a better joint representation in copula space;
- Fakhereh Alidoost and Alfred Stein from the University of Twente presented on “*Interpolation of Daily Mean Air Temperature Data via Spatial and Non-Spatial Copulas*”;
- the talk that I missed was entitled “*Hierarchical Copula Regression Models for Areal Data*”, presented by D. Musgrove, J. Hughes and L. Eberly.

- Denis Allard presented on weather generators and the issues related to the different dependence structures in the variables that are typically included, and advertised a workshop on stochastic weather generators coming up in Berlin.
- Ricardo Carrizo Vergara, a student of Denis Allard, is investigating the relationship between SPDEs and geostatistics.
- Pierre Goovaerts showed his insight into the Flint water crisis, which is published in three papers (1, 2, 3).

I saw too many interesting talks and met too many interesting and nice people to list everything here. Instead, this is a small selection.

First off, it’s nice to encounter similarly minded work. Particularly, I was happy to see the following presentations:

- Emilie Chautru presented a poster entitled “Cokriging of Nonnegative Data on the L1 Sphere”, on cokriging compositional data;
- Svenia Behm from the University of Passau presented a talk entitled “Statistical Inference in the RIO Model – the Detrending Step Revisited”. She calculates something similar to my “locally mixed distributions”;
- A. Lawson pointed out the importance of properly taking censored measurements and true zeros into account, both in his keynote (“One Health: Spatial Statistics at the Border of Human and Veterinary Health”) and in his talk (“Bayesian Cure-Rate Survival Model With Spatially Structured Censoring”). I didn’t talk about it at this conference, but it is dear to my heart;

- M. Pereira showed cool images of road crash density estimates based on data from Paris, France. Benedikt Gräler showed a poster on the Envirocar initiative. Data related to driving patterns and fuel consumption are collected while driving, analysed, and can be viewed online.
- Samir Bhatt gave a great keynote presentation on mapping malaria endemicity. Besides the interesting issues related directly to malaria, this talk raised some interesting questions on modelling philosophies. Samir Bhatt proposed “richer models” as a way forward beyond his current practice of using multivariate models. Alternatively, he phrased it as models that “include mechanisms”. Peter Diggle asked how his approach relates to the concept of parsimony. It is interesting to me that Samir Bhatt suggests including mechanistic models in his data-driven models, whereas for the groundwater quality mapping project I am working on, I have moved to a stochastic model. On the scale of the state, I see that deterministic, PDE-based models are not feasible (too many unknown parameters and processes).