I tend to be highly suspicious when I perform statistical tests and they come out significant. Permutation sampling (or bootstrapping distributions to empirically calculate an estimate of the p-value) tends to allay my fears that the test I have chosen is in some way inappropriate. This page gives a great overview of using bootstrapped distributions to calculate confidence intervals around a statistic, and p-values, in Python. I have used the permutation_resampling function from that page as the basis for what follows.
```python
import numpy as np
import numpy.random as npr
import pylab

def permutation_resampling(case, control, num_samples, statistic):
    observed_diff = abs(statistic(case) - statistic(control))
    num_case = len(case)
    combined = np.concatenate([case, control])
    diffs = []
    for i in range(num_samples):
        # Shuffle the combined data and re-split it into
        # pseudo-"case" and pseudo-"control" groups
        xs = npr.permutation(combined)
        diff = np.mean(xs[:num_case]) - np.mean(xs[num_case:])
        diffs.append(diff)
    # Convert to an array so the comparisons below are elementwise
    diffs = np.array(diffs)
    pval = (np.sum(diffs > observed_diff) +
            np.sum(diffs < -observed_diff)) / float(num_samples)
    return pval, observed_diff, diffs
```
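To make the mechanics concrete, here is a single permutation resample on synthetic data (the sample sizes and distributions here are made up purely for illustration): shuffling the combined data and re-splitting it gives one draw from the null distribution of the difference in means.

```python
import numpy as np
import numpy.random as npr

npr.seed(0)
case = npr.normal(1.0, 1.0, 30)      # hypothetical "case" sample
control = npr.normal(0.0, 1.0, 40)   # hypothetical "control" sample

combined = np.concatenate([case, control])
# One null resample: the case/control labels are shuffled away
xs = npr.permutation(combined)
null_diff = np.mean(xs[:len(case)]) - np.mean(xs[len(case):])
# Under the null hypothesis the labels are exchangeable, so null_diff
# is one draw from the permutation distribution of the mean difference
```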
In the particular case I was working on, my two original tests – an independent two-sample t-test and the Mann-Whitney U – both gave p-values less than 10^-7, so I would need at least 10 000 000 samples for my empirical p-value to resolve anything. For the simple case I was working in this was not entirely intractable, but I wanted to take advantage of all the computing power I had available to me.
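As a quick sketch of why so many samples are needed: with N permutation samples, the smallest non-zero empirical p-value that can be reported is 1/N (the helper function below is just for illustration).

```python
# With N permutation samples, the smallest non-zero empirical p-value
# is 1/N, so resolving p-values around some target level requires
# at least 1/target samples.
def min_samples_for(p_target):
    return int(round(1.0 / p_target))

n_needed = min_samples_for(1e-7)  # 10 000 000 samples, as in the text
```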
Fortunately, as well as providing an interactive, graphical environment for Python, the IPython notebook also makes it very easy to launch parallel computing engines. Although it seems like the for loop in the above function could be easily parallelized, the use of non-encapsulated functions and variables actually makes it difficult to parallelize directly with IPython’s native machinery.
IPython uses the Python pickle library to serialize objects for distribution to each of the engines – unfortunately, pickle does not properly support serialization of a range of objects, including functions containing closures. To get around this, a drop-in replacement for pickle, dill, has been developed that greatly extends pickle’s serialization capabilities. This article gives an overview of the steps required to set up dill for use within IPython – the ones required here are shown below:
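A quick illustration of the limitation (the `make_adder` function is just a toy example): standard pickle cannot serialize a function that closes over local state, which is exactly what our nested sampling function will need to do.

```python
import pickle

def make_adder(n):
    def add(x):
        # add is a closure over n - standard pickle cannot serialize it
        return x + n
    return add

add_two = make_adder(2)
try:
    pickle.dumps(add_two)
    pickled_ok = True
except (pickle.PicklingError, AttributeError):
    pickled_ok = False
# pickled_ok is False - this is the failure that dill works around
```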
```python
import pickle
import dill
from types import FunctionType
from IPython.utils.pickleutil import can_map

# Remove the special-case handling for functions, so that they are
# serialized via pickle (which dill has now extended)
can_map.pop(FunctionType, None)

from IPython.kernel.zmq import serialize
serialize.pickle = pickle
```
This code snippet replaces standard IPython serialization with the dill module. This allows us to change the permutation_resampling function above – it now calls a nested function to create random resamples of the data, which can be farmed out independently to all the parallel computing engines.
```python
import numpy as np
import pylab
from IPython.parallel import Client

def permutation_resampling(case, control, num_samples, statistic):
    """
    Returns p-value that statistic for case is different
    from statistic for control.
    """
    # Set up the parallel computing client
    c = Client()
    # Create a reference to all the parallel computing engines
    dview = c[:]
    # Block execution while processing
    dview.block = True

    observed_diff = abs(statistic(case) - statistic(control))
    num_case = len(case)
    combined = np.concatenate([case, control])

    # Do permutation sampling as a callable function
    def sample_it(i):
        import numpy as np
        import numpy.random as npr
        xs = npr.permutation(combined)
        return np.mean(xs[:num_case]) - np.mean(xs[num_case:])

    # Create the list of differences in means by mapping the function
    # across the engines - one call per required sample
    diffs = np.array(dview.map(sample_it, range(num_samples)))

    pval = (np.sum(diffs > observed_diff) +
            np.sum(diffs < -observed_diff)) / float(num_samples)
    return pval, observed_diff, diffs
```
And there it is! Parallelized bootstrapping of distributions – with 10 000 000 samples able to be created in a short amount of time (barring overheating of the CPU).
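As a sanity check on the parallel version, the same p-value can be estimated serially on synthetic data (the distributions, sample sizes and seed below are arbitrary choices for illustration):

```python
import numpy as np
import numpy.random as npr

def permutation_pvalue(case, control, num_samples, statistic=np.mean):
    # Serial version of the resampling loop, for checking results
    observed_diff = abs(statistic(case) - statistic(control))
    num_case = len(case)
    combined = np.concatenate([case, control])
    diffs = np.empty(num_samples)
    for i in range(num_samples):
        xs = npr.permutation(combined)
        diffs[i] = np.mean(xs[:num_case]) - np.mean(xs[num_case:])
    return (np.sum(diffs > observed_diff) +
            np.sum(diffs < -observed_diff)) / float(num_samples)

npr.seed(42)
case = npr.normal(1.0, 1.0, 50)     # distribution shifted by one s.d.
control = npr.normal(0.0, 1.0, 50)
pval = permutation_pvalue(case, control, 1000)
# A real shift of one standard deviation should give a very small p-value
```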
This post is the result of a relatively extended Twitter conversation between myself, @teachingofsci and @informed_edu. The initial impetus for the discussion was the announcement by the Education Secretary, Michael Gove, that schools should be allowed to push their brightest students to skip GCSEs entirely and move on to A-Levels at 14.
The consensus was that, rather than pushing students into harder sets of exams, it would be better for students to deepen and broaden their areas of strength. This could be done by giving students opportunities to engage with global and community issues (perhaps using sites like Interrobang or We Are What We Do) rather than simply pushing them down the same road faster – incidentally, surely encouraging such broader participation would dovetail nicely with the promotion of a ‘Big Society’?
The main roadblock to opportunities for such exploratory learning is the pervasive mindset that everything that is taught and learned must be assessed, and that the majority of that assessment (in order to be valid and unbiased) must take place through public exams. The current scheme of modular exams poses a particular problem: rather than giving students opportunities to explore issues more deeply, the emphasis is on a cycle of learn, prep, and test.
My suggestion to help with this was to set up computer-based testing that is taken when ready, rather than at the set time of a public examination. This would also deal with one of the often-quoted issues with computerized testing – the need for schools to have hundreds of dedicated machines that are available for the examination period but unused the rest of the time. On an examine-when-ready basis, a school could have one suite of examination computers, and students would book their testing time when ready to take the tests for a particular module of study. In some ways, it is surprising that the government has not considered this possibility, as this scheme is already used to test the Literacy, Numeracy and ICT skills of those seeking Qualified Teacher Status in England and Wales.
@informed_edu‘s suggestion was then to use portfolios to complement examination testing, in order to assess a broader range of skills and competencies than can be assessed in an exam. He expands on his ideas for the scheme in this blog post. To summarize briefly: not only do exams limit the range of competencies that can be assessed, but the ultimate assessment of student skills comes down to the students’ ability to perform in exams, rather than the particular skills and competencies that are supposed to be assessed. In thinking about how this plays out in the real world, he considers the chain of trust and vouching that takes place as people progress in their careers. To your first employer, your examination grades may be of interest, and perhaps to a second employer – by the time you reach your third employer, however, the chances are that your previous employers’ vouching for your competencies and skills will be of far more interest than any examination results you accrued. For students, then, instead of examining their abilities, teachers would vouch for their students’ skills and competencies, as demonstrated through a student’s portfolio of work. Teachers in turn can be vouched for by colleagues, professional development institutions, mentors and the like.
@teachingofsci pointed out a potential difficulty: some teachers would then be faced with in excess of one hundred GCSE student portfolios to assess (a portfolio being far more work to assess than the current coursework requirement of Science GCSEs). My response to this is that, firstly, the portfolios should be published online, and secondly, the assessment of these portfolios should be crowdsourced. Good portfolios will be vouched for by many professional educators, in addition to other students. The value of a ‘vouch’ can be determined in several ways: firstly, through the social metrics of the network – vouching for people at a long social distance from you would carry far more weight than vouching for people close to you (this should prevent abuse by closed vouching loops); secondly, the quality of an individual’s work (as assessed by other people’s vouches) could increase the weight given to that person’s vouches.
The transparency and openness of the system is important, not just to ensure that everyone is able to contribute to the network, but also so that any abuses of the system readily become apparent and can be rectified in order to produce the ideal outcomes. In many ways, this system replicates features of Cory Doctorow‘s Whuffie in Down and Out in the Magic Kingdom: an attempt to provide an inexhaustible metric for vouching for people, while also building an online portfolio that allows a student to be appraised on their best work, not on what they were able to squeeze onto a page in somewhere between 45 minutes and 3 hours.
Finally, some concerns that could arise:
How could you prevent a student from being penalized just for being bad at social networking? I would say that this is where the professionalism of teachers would have to play a role – part of the system could ensure that portfolios that have generally been overlooked are flagged and promoted for review, in order to ensure that everyone has the opportunity to receive the vouches they deserve.
How is a teacher meant to keep track of all his/her students’ portfolios and offer formative feedback in order to help them improve? This could wind up being a massive burden, but by requiring portfolios to be available online, the opportunities for formative feedback from a wide range of sources become much greater (for example, mentors from outside teaching – non-profit volunteers, business people, university faculty – could be engaged to act as portfolio mentors; that’s the ‘Big Society’, right?) In addition, shifting part of the burden of assessment away from exams would free up time from revision classes, grading mock exams and the like for ongoing formative assessment of student portfolios.
I think it is important to return to the point that started this whole discussion – the best way to challenge and stretch bright students is not to make them go the same route they would go anyway, only faster. While many bright students will enjoy the challenge of more complex material, we do them a great disservice if all that they ever learn is more content, and not a broader range of skills and competencies to help them to use that knowledge to great effect within society.
Consequentialism requires the maximization of some singular good – pleasure and preference satisfaction are frequently cited as candidates for that singular good.
Ben Mackay asked me to consider two people A and B. A is very rich and gives, for example, 500,000 pounds to charity. B has an average income and gives 1000 pounds to charity. In addition, it is assumed that A is rich enough that the remaining disposable income he still has available after his generous donation is sufficient for him to have a millionaire’s lifestyle.
Ben argues that because the consequences of A’s actions will produce more happiness (or some other good), A is a morally better person than B. (The underlying assumption, here and in what follows, is that giving more money will produce more good.)
What we seek, however, is the maximization of the good in question – not simply an increase in it. When A donates his money, while he donates significantly more than B, the amount that he could have donated but did not is far larger than the amount that B could have donated but did not. As such, A fell far further short of maximizing the good than B did.
Now, it may be argued that without the incentive of his millionaire’s lifestyle, the rich man, A, will not continue to earn as much, and hence will produce no additional good over the average person, B, if he gives away nearly all his earnings. This may be the case, but there are counter-examples (some of which are detailed in Peter Singer’s ‘The Life You Can Save‘). Even if it is practically the case, for A to act in this way is to act in a way that explicitly fails to maximize the good, and hence his actions are even less moral than B’s.
There is a corollary question of whether publicly holding people to this high moral standard will actually produce the results that we want – and the answer is probably not – but that does not mean that the actual moral value of their actions is any different; rather, in the face of practical concerns we should moderate the blame and praise we apportion in order to maximize the good.
Intelligent Design has failed to get much press lately as far as I have seen – but the pressure to have it edged into school curricula (particularly in Texas, where it can begin a domino effect due to the size of that state’s textbook market) has not ceased.
As I see it, Intelligent Design can be appraised in one of two ways: it is either an earnest attempt to present an alternative scientific theory to evolution that happens to involve something akin to the Christian God, or it is a cynical attempt to keep the teaching of the scientific theory of evolution out of the classroom by any means necessary, as a precursor to backsliding all the way to Young Earth Creationism (in order to maintain a literal reading of the Bible).
If it is earnest – and I find it hard to have so little faith in humanity as to suspect every proponent of Intelligent Design of being a cynical, two-faced liar – I believe there are significant problems with the basis of the theory, both as a scientific position and as a credible theological position. The specific biological points have been rebutted time and again, and far better than I could do them justice (the NCSE have a quick primer, and there are many others available) – however, there are matters of physical science that have a bearing as well.
Firstly, Intelligent Design proponents are carefully agnostic in their proclamations about the exact nature of the intelligent designer responsible for some of the irreducible complexity of life. This means either that there is some other intelligent life form who acts as the designer, or that it is some God-like entity. In the case of an intelligent life form, it is conceivable that they might physically be able to travel to our planet and carry out their design work – but what we then find lacking is an explanation of how those life forms came to exist (as presumably they would possess some level of irreducible complexity themselves). An additional problem for these life forms, however, is the difficulty of interplanetary or interstellar travel (as one assumes they are either no longer here, or came here from somewhere else).
In the case of a God-like entity, the physics becomes more puzzling. How exactly do we explain, in terms of physics, the intercession of a being who is not part of the universe? I am sure someone arguing for the intercession of God would claim that it was simply a miracle, and hence not explicable by physical laws – so, in essence, what the Intelligent Design proponent is asking is that not only the teaching of Biology as it is understood and agreed upon by working scientists be altered, but also the teaching of Physics. Physical laws are no longer immutable and universally applicable; they are instead changeable on the whim of a creator, regardless of evidence to the contrary. To undermine our most successful scientific theories for the sake of finding a mechanism to support a theory in Biology seems somewhat absurd. If an evolutionary biologist demanded that the laws of physics be suspended in order to account for their theory, the scientific community would rightly condemn their work.
For the earnest proponent, however, I feel that the theological point is most telling. Assuming that we are talking of a designer who is in fact God, we are asked to imagine an all-powerful, all-knowing being who creates an immensely complex physical universe. Not only is this universe immensely complex, its complexity derives from what appears to be a small set of physical laws, constants and fundamental particles (if Grand Unified Theories of physics pan out, then this universe is ultimately the expression of an even more limited set of interacting fundamental entities). To think that the God who created this mind-numbing complexity from a highly limited set of raw ingredients would need to come back some ten billion years later (as the first bacteria began to emerge on Earth) and tweak the system to add a flagellum for motion is to limit the power and foresight of a glorious omnipotent God. It simply makes no sense to think that God could create the universe through simple interacting parts, yet fail to set up the universe in such a way as to ensure the evolution of all organisms from the simple rules he set in place.
So, we either have God as the creator of the ultimate Rube Goldberg machine – with the universe unfolding from his initial conditions, including all of physics and the evolution of all living things – or he is an ineffective tinker, setting off the universe and having to continually return to it to set it right, and to fix his broken design.
Finally, if it is a cynical ploy to ultimately put Young Earth Creationism into the science classroom, then take up the virtue of honesty and say so straight out, rather than hiding behind a lie.
In this article Tim Harford puts forward an interesting reflection on the value of money over time. Roughly summarized: because of the much higher base level of goods and services available today (e.g. central heating, cell phones), $7 today could, for many people, be worth more than $7 one hundred years ago, in spite of what inflation indices tell us.
This led me to think about the size of the gap between rich and poor. If it is preferable to be given an extra $7 for goods and services now rather than 100 years ago, then it is more of a detriment to be $7 worse off than someone else now than it was 100 years ago. This means that today the gap between rich and poor is effectively far larger than it was 100 years ago. The main assumption here, of course, is that you can use all that additional money to buy additional goods and services – I think this is reasonable, as there are goods and services enough to spend almost any amount of money on.
There are two ways to analyze these scenarios. Firstly, a utilitarian conception – in this case, as Tim Harford suggests, the greater purchasing power of $7 today, due to the sheer availability of goods and services, greatly increases the general happiness. Secondly, we can think in terms of a preferred distribution of wealth and opportunities – in this case, the situation preferred by a neutral observer (as per Rawls’ idea of a just society) would be one where wealth was more equally distributed, and given the lesser effective disparity that follows from lower purchasing power, the situation 100 years ago would be preferable. While we may be happier now, we may live in a less just world. It is perhaps our perception of injustice in this way that prevents us from using rose-tinted spectacles to look at the present.