Wastholm.com

In this post I will describe one small but important part of the theory of causal inference, a causal calculus developed by Pearl. This causal calculus is a set of three simple but powerful algebraic rules which can be used to make inferences about causal relationships. In particular, I’ll explain how the causal calculus can sometimes (but not always!) be used to infer causation from a set of data, even when a randomized controlled experiment is not possible. Also in the post, I’ll describe some of the limits of the causal calculus, and some of my own speculations and questions.

Without the support of two major browsers and major websites most internet users are missing out on the security benefits of perfect forward secrecy. Without the protection of PFS, if an organisation were ever compelled — legally or otherwise — to turn over RSA private keys, all past communication over SSL is at risk. Perfect forward secrecy is no panacea, however; whilst it makes wholesale decryption of past SSL connections difficult, it does not protect against targeted attack on individual sessions. Whether or not PFS is used, SSL remains an important tool for web sites to use to secure data transmission across the internet to protect against (perhaps all but the most well-equipped) eavesdroppers.

Sitespeed.io is an open source tool that helps you analyze and optimize your website speed and performance based on performance best practices. It collects data from multiple pages on your website, analyzes the pages using performance best practice rules, and outputs the result as HTML files or JUnit XML.

We study fifteen months of human mobility data for one and a half million individuals and find that human mobility traces are highly unique. In fact, in a dataset where the location of an individual is specified hourly, and with a spatial resolution equal to that given by the carrier's antennas, four spatio-temporal points are enough to uniquely identify 95% of the individuals. We coarsen the data spatially and temporally to find a formula for the uniqueness of human mobility traces given their resolution and the available outside information. This formula shows that the uniqueness of mobility traces decays approximately as the 1/10 power of their resolution. Hence, even coarse datasets provide little anonymity. These findings represent fundamental constraints to an individual's privacy and have important implications for the design of frameworks and institutions dedicated to protect the privacy of individuals.
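To get a feel for how slow a 1/10-power decay is, here is a rough arithmetic sketch. It assumes the claim can be read as uniqueness scaling like resolution^(-1/10), where a larger resolution value means coarser data. The function name and the single fixed exponent are illustrative assumptions, not the paper's fitted formula.

```python
def relative_uniqueness(coarsening_factor, exponent=0.1):
    """Uniqueness relative to the original data after coarsening.

    Assumes uniqueness ~ resolution ** (-exponent), with exponent = 1/10
    taken from the abstract's "1/10 power" statement (illustrative only).
    """
    if coarsening_factor <= 0:
        raise ValueError("coarsening_factor must be positive")
    return coarsening_factor ** (-exponent)


if __name__ == "__main__":
    # Print the relative uniqueness left after making the data 10x, 100x
    # and 1000x coarser under the assumed power law.
    for factor in (10, 100, 1000):
        print(factor, round(relative_uniqueness(factor), 3))
```

Under this reading, making the data ten times coarser leaves roughly four fifths of the original uniqueness, which is why coarse datasets still provide little anonymity.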

Biases in how data are collected, a lack of context, gaps in what’s gathered, artifacts of how data are processed and the overall cognitive biases that lead even the best researchers to see patterns where there are none mean that “we may be getting drawn into particular kinds of algorithmic illusions,” said MIT Media Lab visiting scholar Kate Crawford. In other words, even if you have big data, it’s not something that Joe in the IT department can tackle—it may require someone with a PhD, or the equivalent amount of experience. And when they’re done, their answer to your problem might be that you don’t need “big data” at all.

For the software delivery process, the most important global metric is cycle time. This is the time between deciding that a feature needs to be implemented and having that feature released to users. As Mary Poppendieck asks, "How long would it take your organization to deploy a change that involves just one single line of code? Do you do this on a repeatable, reliable basis?" This metric is hard to measure because it covers many parts of the software delivery process, from analysis, through development, to release. However, it tells you more about your process than any other metric.

Explore analyses of popular songs, or contribute an analysis of your own.

That Facebook fans of "Barack Obama" might be Democrats, or that people who liked the "No H8" campaign were more likely to be gay, seems obvious, but other correlations were far less intuitive. "Curly Fries" and "Thunderstorms" seem to be surprisingly linked with a high IQ, while "That Spider is More Scared Than U" happens to draw a non-smoking fan base. Predictors of male heterosexuality include "Being Confused After Waking Up From Naps." An appreciation of "Hello Kitty" tended to be associated with people who were more open and less emotionally stable. [Sounds like overtraining to me, but surely they wouldn't make such a fundamental mistake? Right?]

Many people who have managed projects with hours have a hard time understanding why story points are better. They have failed to understand some fundamental data that has been published for over 20 years in the industry literature as well as the latest research.

First, let's look at the latest data on project failures. Failure rates are increasing for IT projects during the current disruption of the global financial system. The latest Standish group analysis shows that agile projects have three times the success rate of traditional projects. Jim Johnson now recommends agile practice be used universally on all projects.

PROBLEM: You are a web programmer. You have users. Your users rate stuff on your site. You want to put the highest-rated stuff at the top and lowest-rated at the bottom. You need some sort of "score" to sort by.

...

CORRECT SOLUTION: Score = Lower bound of Wilson score confidence interval for a Bernoulli parameter.
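A minimal sketch of that score, assuming each rating is a simple positive/negative vote and using the standard Wilson score interval formula. The function name, the parameter names, and the default z = 1.96 (roughly 95% confidence) are choices made for this illustration, not taken from the linked article.

```python
import math


def wilson_lower_bound(positive, total, z=1.96):
    """Lower bound of the Wilson score interval for a Bernoulli parameter.

    positive: number of positive ratings
    total:    total number of ratings
    z:        normal quantile for the desired confidence (1.96 ~ 95%)
    """
    if total == 0:
        return 0.0  # no ratings yet, so sort the item last
    phat = positive / total          # observed fraction of positive ratings
    z2 = z * z
    centre = phat + z2 / (2 * total)
    margin = z * math.sqrt((phat * (1 - phat) + z2 / (4 * total)) / total)
    return (centre - margin) / (1 + z2 / total)


if __name__ == "__main__":
    # Hypothetical items given as (positive votes, total votes).
    items = {"a": (1, 1), "b": (90, 100), "c": (5, 10)}
    ranked = sorted(items, key=lambda k: wilson_lower_bound(*items[k]), reverse=True)
    print(ranked)
```

The point of the lower bound is that an item with a single positive rating does not outrank one with 90 positive ratings out of 100: little evidence means a wide interval and therefore a low score.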
