A detailed analysis of a new infographic on falsification in scientific research.... Read more »
Moonesinghe, R., Khoury, M., & Janssens, A. (2007) Most Published Research Findings Are False—But a Little Replication Goes a Long Way. PLoS Medicine, 4(2). DOI: 10.1371/journal.pmed.0040028
Young, N.S., Ioannidis, J.P., & Al-Ubaydli, O. (2008) Why current publication practices may distort science. PLoS Medicine, 5(10). PMID: 18844432
Fanelli, D. (2009) How many scientists fabricate and falsify research? A systematic review and meta-analysis of survey data. PLoS ONE, 4(5). PMID: 19478950
17 is the minimum number of clues required to give a unique sudoku solution -- but how did mathematicians prove this? ... Read more »
McGuire, G., Tugemann, B., & Civario, G. (2012) There is no 16-Clue Sudoku: Solving the Sudoku Minimum Number of Clues Problem. arXiv: 1201.0749v1
Brownian motion is a very accommodating mathematical object, well suited to numerical simulation. Plenty of simulators exist for any platform and software, so one can check very rapidly whether a given hypothesis holds. For these purposes, I have found the demonstration site by Wolfram very helpful, and [...]... Read more »
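Such a simulation really is only a few lines. Here is a minimal Python sketch (my own, assuming only NumPy, and unrelated to the Wolfram demonstrations mentioned above):

```python
import numpy as np

# Minimal simulation of standard Brownian motion on [0, T]:
# increments are independent Gaussians with variance equal to the time step.
T, n_steps, n_paths = 1.0, 1_000, 5_000
dt = T / n_steps
rng = np.random.default_rng(3)

increments = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n_steps))
W = np.cumsum(increments, axis=1)  # W(t) sampled on the grid, with W(0) = 0

# Sanity checks against theory: E[W(T)] = 0 and Var[W(T)] = T.
print(f"mean of W(T) over {n_paths} paths: {W[:, -1].mean():+.3f} (theory: 0)")
print(f"variance of W(T) over {n_paths} paths: {W[:, -1].var():.3f} (theory: {T})")
```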
After having fixed the definition of the extended Itō integral, I have posted a revised version of my paper on arXiv (see here). The idea has been described here. A full account of this story is given here. The interesting aspect from a physical standpoint is the space that is fluctuating both for a Wiener [...]... Read more »
Two years ago, neuroscientists were shaken by the appearance of a draft paper showing that half of the published work in a particular field had fallen prey to a major statistical error.

Originally called "Voodoo Correlations in Social Neuroscience", it ended up with the less snappy name of Puzzlingly high correlations in fMRI studies of emotion, personality, and social cognition. I prefer the old title.

The error in question is now known variously as the "circular analysis problem", "non-independence problem" or "double-dipping", although I still call it the "voodoo problem". In a nutshell, it arises whenever you take a large set of data, search for data points which are statistically significantly different from some baseline (null hypothesis), and then go on to perform further statistics only on those significant data points.

The problem is that when you picked out the statistically significant observations, you selected the data points that were especially "good", so if you then do some more analyses only on those data, you are almost guaranteed to find something "good". To avoid this you need to make sure that your second analysis is truly independent of your first one.

Anyway, Vul and Pashler, the main authors of the original voodoo article, have just written a short piece in NeuroImage offering some reflections on the paper and the aftermath. They don't make any major new arguments but it's a good read. Particularly fun is their explanation of what inspired them to look into the voodoo problem:

In early 2005 a speaker in our department reported that BOLD activity in a small region of the brain can account for the great majority of the variance in speed with which subjects walk out of the experiment several hours later (this finding was never published as far as we know). The implications of this result struck us as puzzling, to say the least: Are walking speeds really so reliable that most of their variability can be predicted? Does a focal cortical region determine walking speeds? Are walking speeds largely predetermined hours in advance? These implications all struck us as far-fetched...

But they reveal that it was one paper in particular that set them off voodoo-hunting:

Our interest in probing the matter was further whetted by an episode occurring a short while later: Grill-Spector et al. (2006) reported that individual voxels in face selective regions have a variety of stable stimulus preferences; in a critical commentary, Baker et al. (2007) found that the analysis used to ascertain this fact implicitly built these conclusions into the method, such that the same analysis applied to noise data (voxels from the nasal cavity) revealed a similar variety of stable preferences. It occurred to us that a similar circularity might underlie the puzzlingly high correlations.

To their credit, Grill-Spector et al quickly accepted Baker et al's criticism and admitted that some of their original conclusions had been wrong.

Vul, E., & Pashler, H. (2012) Voodoo and circularity errors. NeuroImage. DOI: 10.1016/j.neuroimage.2012.01.027... Read more »
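To see the trap in action, here is a minimal Python sketch of double-dipping (my own illustration, assuming only NumPy; it is not code from Vul and Pashler). Ten thousand "voxels" of pure noise are screened for significant correlation with a behavioral score, and statistics are then computed only on the survivors:

```python
import numpy as np

rng = np.random.default_rng(42)
n_subjects, n_voxels = 20, 10_000

# Pure noise: no voxel truly correlates with behavior.
behavior = rng.standard_normal(n_subjects)
voxels = rng.standard_normal((n_subjects, n_voxels))

# Pearson correlation of every voxel with the behavioral score.
bz = (behavior - behavior.mean()) / behavior.std()
vz = (voxels - voxels.mean(axis=0)) / voxels.std(axis=0)
r = (vz * bz[:, None]).mean(axis=0)

# Step 1 (selection): keep voxels whose correlation passes a threshold.
selected = np.abs(r) > 0.5

# Step 2 (the non-independent analysis): report statistics on those voxels only.
print(f"voxels selected: {selected.sum()} of {n_voxels}")
print(f"mean |r| among selected voxels: {np.abs(r[selected]).mean():.2f}")
# The data contain no real effect, yet the reported correlations look strong,
# because the selection step and the final statistic used the same data.
```

Rerunning the second step on an independent half of the subjects would send those reported correlations back toward zero, which is the point of the independence requirement.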
Lecture notes and video for a talk on an old result by Hindman and Pym on groups in the Stone–Čech compactification of the natural numbers.... Read more »
There is a very good reason why I was silent in the past days. The reason is that I was involved in one of the most difficult articles to write since I started doing research (and that is more than twenty years now!). This paper arose from a very successful collaboration with two colleagues of mine: [...]... Read more »
Frasca, M. (2012) Quantum mechanics is the square root of a stochastic process. arXiv: 1201.5091v1
Farina, A., Giompapa, S., Graziano, A., Liburdi, A., Ravanelli, M., & Zirilli, F. (2011) Tartaglia-Pascal’s triangle: a historical perspective with applications. Signal, Image and Video Processing. DOI: 10.1007/s11760-011-0228-6
Grabert, H., Hänggi, P., & Talkner, P. (1979) Is quantum mechanics equivalent to a classical stochastic process?. Physical Review A, 19(6), 2440-2445. DOI: 10.1103/PhysRevA.19.2440
If you are new to climate science, you might be wondering what, exactly, this ‘temperature anomaly’ thing is that you keep hearing about. I know I was a bit confused at first! This post explains the concept, using a real-world example. Cities tend to be warmer than their surrounding countrysides, a fact known as the [...]... Read more »
Jones, P., Lister, D., & Li, Q. (2008) Urbanization effects in large-scale temperature records, with an emphasis on China. Journal of Geophysical Research, 113(D16). DOI: 10.1029/2008JD009916
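In the meantime, here is a bare-bones Python sketch of the anomaly idea (my own illustration, with made-up numbers): an anomaly is a reading minus the station's own long-term baseline, so a constant offset such as urban warmth cancels out.

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up June mean temperatures (deg C) for one station, 1981-2010, plus 2011.
baseline_junes = rng.normal(loc=15.0, scale=0.5, size=30)
june_2011 = 16.1

# The anomaly is the departure from the station's own baseline average.
baseline = baseline_junes.mean()
print(f"baseline mean: {baseline:.2f} C, 2011 anomaly: {june_2011 - baseline:+.2f} C")

# A constant urban-warmth offset cancels out: warm every reading by 2 C
# (baseline and 2011 alike) and the anomaly is unchanged.
print(f"with a +2 C offset: {(june_2011 + 2) - (baseline_junes + 2).mean():+.2f} C")
```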
This post is based on a book review I recently wrote on The Mathematics of Life, by Ian Stewart. A final version of the review will appear in a future issue of SIGACT News. Please feel free to download a … Continue reading →... Read more »
Stewart, I. (2011) The Mathematics of Life. Book. ISBN: 0465022383
Just before Christmas I was asked to talk to our molecular biologists about multivariate analyses. I was reminded of this on Thursday afternoon, when I saw that I had to talk to them on Friday. "Ah, no problem", I thought....... Read more »
Gower, J.C. (2005) Principal Coordinates Analysis. Encyclopedia of Biostatistics. DOI: 10.1002/0470011815.b2a13070
Warton, D., Wright, S., & Wang, Y. (2011) Distance-based multivariate analyses confound location and dispersion effects. Methods in Ecology and Evolution. DOI: 10.1111/j.2041-210X.2011.00127.x
This year 134 suspect new journals have appeared from the abyss, all published by the same clandestine company “Scientific & Academic Publishing, USA”. Scientists have been quick to raise the alarm and ruthless in their response.... Read more »
Morrison, H. (2012) Scholarly Communication in Crisis. Freedom for Scholarship in the Internet Age. Simon Fraser University School of Communication.
In 2006 two nations took to the field in Berlin, Germany in front of a worldwide audience of 715 million people. Italy were to play France in the final of the FIFA World Cup. The match itself would later become famous for that “head butt” by France’s Zinédine Zidane. But despite being eclipsed by a [...]... Read more »
Yamamoto, Y., & Yokoyama, K. (2011) Common and Unique Network Dynamics in Football Games. PLoS ONE, 6(12). DOI: 10.1371/journal.pone.0029638
A research group at Indiana University has developed a program called Truthy that allows anyone to track cases of "astroturfing" on Twitter. Any search term can be entered into Truthy, and the program will query the Twitter API and build a model of how the search term originated. ... Read more »
Ratkiewicz, J., Conover, M., Meiss, M., Gonçalves, B., Patil, S., Flammini, A., & Menczer, F. (2011) Truthy: Mapping the Spread of Astroturf in Microblog Streams. World Wide Web Conference Committee (IW3C2).
Averages should be one of the easiest kinds of information to show, but they are surprisingly tricky to compare.

Most people know that when they show an average, there should be an indication of how much smear there is in the data. It makes a huge difference to your interpretation of the information, particularly when glancing at the figure. For instance, I'm willing to bet most people looking at this...

Would say, "Wow, the treatment is making a big difference compared to the control!"

I'm likewise willing to bet most people looking at this (which plots the same averages)...

Would say, "There's so much overlap in the data, there might not be any real difference between the control and the treatments."

The problem is that error bars can represent at least three different measurements (Cumming et al. 2007):

Standard deviation
Standard error
Confidence interval

Sadly, there is no convention for which of the three one should add to a graph. There is no graphical convention to distinguish these three values, either. Here's a nice example of how different these three measures look (Figure 4 from Cumming et al. 2007), and how they change with sample size. I often see graphs with no indication of which of those three things the error bars are showing!

And the moral of the story is: identify your error bars! Put it in the Y axis label or in the caption for the graph.

Reference

Cumming, G., Fidler, F., & Vaux, D. (2007) Error bars in experimental biology. The Journal of Cell Biology, 177(1): 7-11. DOI: 10.1083/jcb.200611141

A different problem with error bars is here.... Read more »
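For concreteness, here is a short Python sketch (mine, not from the post, using made-up measurements and assuming NumPy and SciPy are available) showing how different the three quantities come out for one and the same sample:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
sample = rng.normal(loc=10.0, scale=2.0, size=12)  # made-up measurements
n = sample.size

sd = sample.std(ddof=1)                    # standard deviation: spread of the data
sem = sd / np.sqrt(n)                      # standard error: precision of the mean
ci95 = stats.t.ppf(0.975, df=n - 1) * sem  # half-width of the 95% confidence interval

print(f"mean = {sample.mean():.2f}")
print(f"SD   = {sd:.2f}  (bar spans mean +/- {sd:.2f})")
print(f"SEM  = {sem:.2f}  (bar spans mean +/- {sem:.2f})")
print(f"95% CI half-width = {ci95:.2f}")
# Three very different bar lengths from the same data, which is exactly why a
# figure must say which one its error bars show.
```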
Ask the average person on the street if men and women are wired differently and you'll more often than not get an affirmative response. Not overly surprising, given the knowledge that men are from Mars and women are from Venus. Am I right? But dive a little deeper and chances are you'll find that the vast majority of people would be relying heavily on deeply ingrained stereotypes, such as the "mythically superior 'multitasking' abilities" of women or men who just don't listen, rather than any scientifically verified information (although in fairness the bit about men not listening is probably true).

Nonetheless, the fact that we rely on such stereotypes is not generally an issue; after all, the human brain is a master at creating these categorical shortcuts in an effort to conserve its resources. However, when these shortcuts are being used to endorse segregation in schools or distinct parenting styles based on gender, those of us who can spot the neuroscience from the neurononsense have a responsibility to take action.

Sum differences aren't what they seem

There is no denying that differences do actually exist between the male and female brain. For example, whilst global cerebral blood flow is higher in the female brain, the male brain is on average 11% larger and consists of a higher proportion of white matter than its female counterpart. However, it can also be said that males are, on average, 9% taller and 18% heavier than females, which suggests that the larger brain size is merely another manifestation of the readily observable sexual dimorphism between men and women, rather than an indication that the male brain is more suited to such non-emotive skills as spatial relations and mathematics.

But if the differences in underlying neuronal connections between the sexes aren't to blame for the fact that over 70% of maths PhDs are men, who is? As it would turn out, we are. Or more specifically, it's society's fault!

A recent study by husband-and-wife team Jonathan Kane and Janet Mertz investigated gender differences in mathematics performance and participation rates using scores from the internationally standardised OECD Programme for International Student Assessment math test (2003 and 2009) and the Trends in International Mathematics and Science Study (2003 and 2007). These data sets gave Kane and Mertz access to data from over 80 countries, with a 31-country overlap, and enabled them to rule out low living standards, coeducational environments and innate variability among boys as potential causes for gender bias. Instead, the study pointed to prevailing societal views and gender equity as the root of the problem (maths pun intended).

Put simply, the data show that overall girls and boys perform equally well when it comes to maths, so there is no evidence of biological variability there. But perhaps more importantly, both girls and boys from cultures with a higher level of gender equity performed better in the tests. Or as Kane puts it, "Women doing better end up raising their kids better."

But if both boys and girls perform equally, what's with the lack of female mathematicians? For starters, it would appear that the gender gap doesn't rear its ugly head until young women start thinking about their future careers. Insert a steady drone of societal tut-tutting about women and numbers in the background, and it's little wonder that most women choose a career outside of maths (and science and engineering). The gap is essentially formed by the self-fulfilling prophecy that is this stereotype. Women are told that they can't do maths, so they don't do maths. Thus the small number of women who choose a career in maths acts as proof that women can't do maths. And so the farce continues.

From stereotype to societal change

On paper, getting women back in the maths class is as straightforward as giving them equal rights and pay before saying "Hey, turns out we're all great at maths." Sadly, in reality it's not quite so simple. Firstly, as it turns out, the aforementioned benefits which stereotypes bestow on us regarding our cognitive resources ensure that they are deeply ingrained and so incredibly hard to shake. Secondly, any apparent differences between the genders, no matter how carefully reported, are often distorted and propagated by the media (see The Female Brain as a great example of such neurononsense, or The Gender Delusion for an eloquent debunking of such myths). And they do this for the simple reason that biological gender differences fascinate us. And so they should.

As scientists, reporters, or simply those who know better (here's that responsibility to act I was talking about earlier), we cannot ignore the possibility that gender differences exist. Nor should we. We should continue to look for them through our proverbial microscopes with a fervour that verges on mania. But, to paraphrase Lise Eliot, we must also be mindful. Mindful to communicate the true magnitude and intricacy of these differences, in an effort to avoid more widespread misuse of such research. ... Read more »
If non-human great apes were coaching more football games, you could expect to see fewer extra points being kicked. We risk-averse humans usually prefer kicking an easy extra point after a touchdown, rather than attempting a more difficult 2-point conversion. But chimps and other great apes, after considering their odds, usually opt for the greater risk and the bigger reward.
By "reward," I mean banana.
Researchers at the Max Planck Institute in Germany tested a group of chimpanzees, bonobos, gorillas and orangutans on their risk-taking strategies using chunks of banana. They wanted to know whether the apes' likelihood to go hunting for banana pieces hidden under cups, rather than taking a smaller banana piece already in front of them, depended on the "expected value" of their choices. Expected value is simply an item's worth, multiplied by your odds of getting it. If a 2-point conversion attempt is successful exactly half the time, then its expected value is 1 point.
The 22 apes each sat through a series of experiments involving banana bits in cups. On one side of a table, they saw a small piece of banana placed under a yellow cup. Next to that was a row of blue cups, anywhere from one to four of them. Under one of the blue cups was a larger piece of banana.
The apes knew the larger piece of banana was hidden under one of the blue cups, but unless there was only one blue cup, they didn't know exactly where the banana was. (They understood the setup because there was also a series of trials in which the apes watched the banana being placed under one of the blue cups.) In each trial, an ape could point to just one cup and get the reward--if there was any--underneath.
The yellow cup was a guaranteed small reward. The blue cups were a gamble. And the size of the gamble (in other words, its expected value) depended on how many blue cups were on the table. It also depended on the difference in size between the two banana chunks. The "safe" piece of banana in the yellow cup ranged from one-sixth to two-thirds the size of the large piece.
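In expected-value terms the gamble is easy to tabulate. A quick Python sketch (using the size ratios quoted above, and assuming, as the setup seems to imply, that the large piece is equally likely to be under any blue cup):

```python
# Expected values for the cup game, in units of the large banana piece.
# Assumes the large piece is equally likely to be under any blue cup
# (my reading of the setup, not a detail spelled out above).
large = 1.0
for n_cups in (1, 2, 3, 4):
    ev_gamble = large / n_cups  # the piece's worth times the odds of finding it
    for safe in (1 / 6, 1 / 3, 2 / 3):
        pick = "gamble" if ev_gamble > safe else "safe cup"
        print(f"{n_cups} blue cup(s), safe piece = {safe:.2f}: "
              f"EV(gamble) = {ev_gamble:.2f}, value-maximizing pick: {pick}")
```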
The researchers found that the apes' decisions did correlate to the expected value of their options. Overall, as the expected value of picking a blue cup increased--there were fewer blue cups on the table, or the safe piece of banana was small and untempting--apes opted more often to try a blue cup. When the expected value of the gamble was lower--because there were a lot of blue cups to choose between, or the safe banana piece was large to begin with--they were more likely to stick with the yellow cup.
Adjusting choices based on the expected value of each option is similar to how humans would decide. But the apes were less human-like in their general propensity for risk. Even at the lowest possible expected values, apes chose to gamble on a blue cup more than 50% of the time.
In other words, apes acted more like humans playing the lottery than humans kicking an extra point after a touchdown. These apes, of course, didn't have their coaching jobs on the line. They might have just enjoyed playing the cup game. And in a human football game, there are plenty of situations in which a kicked extra point is better than going for 2--even though its expected value, with a success rate of about 50%, is the same.
But even outside of football games, humans are known by psychologists for being risk averse, especially when it comes to potential gains. We'd rather take a small guaranteed reward than a larger and riskier one. (For losses, though, we tend to feel the opposite way.)
When the researchers broke down their results by species, they found that while all four species were risk prone, bonobos were a little more conservative in their choices than chimps were. With only a small number of ape subjects, it's hard to draw any serious conclusions. But it's interesting to speculate about the differences between us and our two closest living relatives. Have chimps evolved to take more risks, always gambling on finding something better, because in the wild they must search for fresh fruit year-round? Can bonobos afford to be more conservative because their diet in the wild is more flexible? What factor in our past put risk-averse humans at an evolutionary advantage?
Next time your favorite football team takes an overly conservative extra point, don't blame the coach for his evolutionary history. You could always call up the owners, though, and suggest they hire a chimpanzee instead.
... Read more »
Haun, D., Nawroth, C., & Call, J. (2011) Great Apes' Risk-Taking Strategies in a Decision Making Task. PLoS ONE, 6(12). DOI: 10.1371/journal.pone.0028801
According to the New England Journal of Medicine, after thirty years of silence, the authors of a standard clinical psychiatric bedside test have issued takedown orders against new medical research.... Read more »
A few months ago, I turned 27. Had I been a famous musician, I may well have dreaded this moment and gone into hibernation for a year, because 27 is the fabled age of the rock star death.
The member list of the ’27 Club’ – those musicians who met an untimely end at the age of 27 – reads like a Who’s Who of influential rock stars: Jimi Hendrix, Jim Morrison, Kurt Cobain, Janis Joplin, Brian Jones, and so the list goes on.
So why do so many musicians seem to crash and burn at the age of 27?... Read more »
Wolkewitz, M., Allignol, A., Graves, N., & Barnett, A. (2011) Is 27 really a dangerous age for famous musicians? Retrospective cohort study. BMJ, 343(d7799). DOI: 10.1136/bmj.d7799
The Journal of Computer-Aided Molecular Design is having a smorgasbord of accomplished modelers reflecting upon the state and future of modeling in drug discovery research, and I would definitely recommend that anyone interested in the role of modeling, especially experimentalists, take a look at the articles. Many of the articles are extremely thoughtful and balanced and take a hard look at the lack of rigorous studies and results in the field; if there was ever a need to make journal articles freely available it was for these kinds, and it's a pity they aren't. But here's one that is open access, and it's by some researchers from Simulations Inc. who talk about three beasts (or in the authors' words, "Lions and tigers and bears, oh my!") in the field that are either unsolved or ignored or both.

1. Entropy: As they say, entropy, taxes and death are the three constant things in life. In modeling both small molecules and proteins, entropy has always been the elephant in the room, blithely ignored in most simulations. At the beginning there was no entropy. Early modeling programs then started extracting a rough entropic penalty for freezing certain bonds in the molecule. While this approximated the loss of ligand entropy on binding, it did nothing to account for the conformational entropy loss that results from the compression of a panoply of diverse conformations in solution into a single bound conformation.

But we were just getting started. A very large part of the entropy of binding a ligand to a protein comes from the displacement of water molecules in the active site, essentially their liberation from being constrained prisoners of the protein to free-floating entities in the bulk. A significant advance in trying to take this factor into account was an approach that explicitly and dynamically calculated the enthalpy, entropy and therefore the free energy of bound waters in proteins. We have now reached the point where we can at least think of doing a reasonable calculation on such water molecules. But water molecules are often ill-localized in protein crystal structures because of low resolution, inadequate refinement and other reasons, and it's not easy to perform such calculations for arbitrary proteins without crystal structures.

However, a large piece of the puzzle that's still missing is the entropy of the protein, which is extremely difficult to calculate on many fronts. Firstly, the dynamics of the protein is often not captured by a static x-ray structure, so any attempt to calculate protein entropy in the presence and absence of ligands would have to shake the protein around. Currently the favored process for doing this is molecular dynamics (MD), which suffers from its own problems, most notably the accuracy of what's under the hood, namely force fields. Secondly, even if we can calculate the total entropy changes, what we really need to know is how the entropy is distributed among various modes, since only some of these modes are affected upon ligand binding. An example of the kind of situation in which such details would be important is the case of slow, tight-binding inhibitors illustrated in the paper. The example is of two different prostaglandin synthase inhibitors which demonstrate almost identical binding orientations in the crystal structure, yet one is a weakly binding inhibitor that dissociates rapidly and the other is slow and tight-binding. Only a dynamic treatment of entropy can explain such differences, and we are still quite far from being able to do this in the general case.

2. Uncertainty: Of all the hurdles facing the successful application and development of modeling in any field, this might be the most fundamental. To reiterate, almost every kind of modeling starts by using a training set of molecules for which the data is known and then proceeds to apply the results from this training set to a test set for which the results are unknown. Successful modeling hinges on the expectation that the data in the test set is sufficiently similar to that in the training set. But problems abound. For one thing, similarity is in the eye of the beholder, and what seems to be a reasonable criterion for assuming similarity may turn out to be irrelevant in the real world. For another, overfitting is a constant issue, and results that look perfect for the training set can fail abysmally on the test set.

But as the article notes, the problems go further and the devil's in the details. Modeling studies very rarely try to quantify the exact differences between the two sets and the error resulting from that difference. What's needed is an estimate of predictive uncertainty for single data points, something which is virtually non-existent. The article notes the seemingly obvious but often ignored fact when it says that "there must be something that distinguishes a new candidate compound from the molecules in the training set". This 'something' will often be a function of the data that was ignored when fitting the model to the training set. Outliers which were thrown out because they were...outliers might return with a vengeance in the form of a new set of compounds that are enriched in the very properties that were ignored.

But more fundamentally, the very nature of the model used to fit the training set may be severely compromised. In its simplest incarnation, for instance, linear regression may be used to fit data points to a set of relationships that are inherently non-linear. In addition, descriptors (such as molecular properties supposedly related to biological activity) may not be independent. As the paper notes, "The tools are inadequate when the model is non-linear or the descriptors are correlated, and one of these conditions always holds when drug responses and biological activity are involved". This problem penetrates every level of drug discovery modeling, from basic molecular-level QSAR to higher-level clinical or toxicological modeling. Only a judicious and high-quality application of statistics, constant validation, and a willingness to wait (for publication, press releases etc.) until the entire analysis is available will keep erroneous results from seeing the light of day.

3. Data Curation: This is an issue that should be of enormous interest not just to modelers but to all kinds of chemical and biological scientists concerned about information accuracy. The well-known principle of Garbage In, Garbage Out (GIGO) is at work here. The bottom line is that there is an enormous amount of chemical data on the internet that is flawed. For instance, there are cases where incorrect structures were inferred from correct names of compounds: "The structure of gallamine triethiodide is a good illustrative example where many major databases ended up containing the same mistaken datum. Until mid-2011, anyone relying on an internet search would have erroneously concluded that gallamine triethiodide is a tribasic amine. The error resulted from mis-parsing the common name at some point as meaning that the co... Read more »
Clark, R., & Waldman, M. (2011) Lions and tigers and bears, oh my! Three barriers to progress in computer-aided molecular design. Journal of Computer-Aided Molecular Design. DOI: 10.1007/s10822-011-9504-3
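On the uncertainty point, a toy example makes the danger concrete. The following Python sketch (my own construction, not code from the paper, assuming only NumPy) fits a linear model to data from a non-linear relationship and then watches the error grow as test compounds drift away from the training domain:

```python
import numpy as np

rng = np.random.default_rng(7)

# Toy QSAR-style data: the true response is non-linear in the descriptor.
def true_activity(x):
    return np.sin(x) + 0.1 * x

x_train = rng.uniform(0, 3, size=30)  # descriptor values of the training compounds
y_train = true_activity(x_train) + rng.normal(0, 0.05, size=30)

# Fit a straight line, as is often done even when the relationship is not linear.
slope, intercept = np.polyfit(x_train, y_train, deg=1)

def rmse(x):
    """Root-mean-square error of the linear model against the true relationship."""
    return np.sqrt(np.mean((slope * x + intercept - true_activity(x)) ** 2))

# Test sets at increasing "distance" from the training domain.
for lo, hi in [(0, 3), (3, 5), (5, 8)]:
    x_test = rng.uniform(lo, hi, size=200)
    print(f"test descriptors in [{lo}, {hi}]: RMSE = {rmse(x_test):.2f}")
# The error grows as the test compounds become less like the training set, which
# is why a per-compound estimate of predictive uncertainty matters.
```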
When Amy Winehouse’s death was reported in July of 2011, conspiracy theorists immediately declared that her talent and her age, 27, had doomed her to being yet another member of the “27 club”, a club composed of famous musicians who all … Continue reading →... Read more »
Wolkewitz, M., Allignol, A., Graves, N., & Barnett, A. (2011) Is 27 really a dangerous age for famous musicians? Retrospective cohort study. BMJ, 343(d7799). DOI: 10.1136/bmj.d7799