The experimental community corrupts the researcher

Experimentation is a valuable tool in computer science. Indiscriminate experimentation is, however, a danger. This post examines current research practices and casts doubt on their benefits. Inspired by the French writer Rousseau (1972) and his claim that society corrupts the individual, my conviction is that the experimental community corrupts the researcher.

The scientific method has evolved throughout the centuries, and philosophers have played a distinguished role in that change: questioning the beliefs used to guide discoveries and challenging new ways of thinking, sometimes without providing any answers, for the sheer pleasure of asking questions (Russell 1997). This passion and awakened mind are missing in the current computer science community. Empiricists, especially, churn out large numbers of plain technical reports tracing experiments, oblivious to the beauty of essays and the excitement of sharing outlandish ideas. Going through the literature has often become a mechanical skimming and scanning task, hunting for the decimal points by which the proposed techniques outperform their competitors. These results sustain a kind of research based on mere parameter tuning of established algorithms.
The purpose of this viewpoint is to show the disenchantment that would-be researchers, and even tenured researchers, suffer and to denounce the proliferation of questionable practices that are killing innovation.
We first review the effects of the modern obsession with publishing and the extent to which academic research has distorted experimental science. Next, we examine what calls the current methodology into question and why some approaches are simply ignored while others are incomprehensibly revived.

The perversion of the community
Pure research is becoming less attractive nowadays. Many research lines are abandoned because investors are more interested in applications, despite the relevance of fundamental investigation. Hence, the groups that survive do so either because their research leads in applied domains or because their volume of publications is high. What is behind these numbers? Parnas (2007) makes a strong point about them and dissects every perversion of the community: authorship pacts, monthly instalments, tailor-made conferences. These have encouraged superficial research produced by overly large groups, repetition, insignificant studies, half-baked ideas… Publish or perish has wreaked havoc on day-to-day research, at least in the halls of European academia.
Eventually, impact factors, the h-index, and the g-index have become the fallacious indicators of good research and researchers, and have fired up the paper factory. Fresh Ph.D. students are turned into junk writers as soon as they learn that their career will be measured by these statistics. The pressure is intensified by supervisors, assessed by the same yardstick, who need to keep their CVs up to date or repay colleagues in the favour chain, which promotes quantity over substance.
This compulsive publishing has flooded conferences and journals with so many papers that it is getting difficult to track innovative ideas. The more one reads, the more one bumps into similar attempts and déjà vus, which slow down the learning curve and discourage further reading. Showing off the abilities of standard methods to non-technical audiences and cherry-picking results from much wider experimentation are the most common schemes. The raison d’être of empiricism has been abused and now amounts to repeated preliminary results with no further continuation.

Experimental computer science
Experimental computer science, defined by Denning (1980) as “an apparatus to be measured, a hypothesis to be tested, and systematic analysis of the data (to see whether it supports the hypothesis)”, is recurrent in machine learning, algorithm development, and software engineering. Nevertheless, experimental methodology has been twisted; instead of sustaining conjectures, experiments are run to provide material from which theories are decided retroactively, built a posteriori.
Machine learning, for instance, is based on trials with performance measures, learners, and data. The combination of these elements led Langley (1988) to encourage practitioners to embrace empirical testing as a process of theory formation. Competition testing, a term coined by Hooker (1995) in relation to heuristics, has been the chaotic outcome of that call. Many years later, no new learning paradigm has been introduced, some progress in standards has been made, and micro-tuning of existing techniques is the trendy research, the latter being the gold mine for publications. Superiority of techniques is usually claimed following a three-step procedure: selection of a few data sets, selection of referenced learners to compare with, and extraction of performance conclusions supported by erroneous statistical tests. With a pessimistic but very realistic description of the scene, Demsar (2008) warned of the misuse of such experimentation. Conventional statistical models are designed to test single learners in isolation; they are ill-suited to multiple comparisons.
Hypothesis testing is useful to say whether the apparent accuracy of a learner is due to chance, but its power goes down as the number of data sets examined increases. It is therefore worth determining the ideal size of the test set and which problems should be involved, and strengthening the testing methodology with sufficient data analysis. These are old claims, and the things one hopes to be delighted with when reading papers. Yet they are complicated milestones, and many negative results are derived from such studies. Although negative results are also meaningful for guiding progress, the community does not consider them, which forces researchers to fall back on classical developments. In addition, groundless rejections cause frustration in researchers, which is reflected in their subsequent reviews. In turn, after being taught that going against the mass culture is not profitable, they will unwittingly stop promising ideas, frustrating new generations again.
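To make the multiple-comparison problem concrete, here is a minimal sketch (mine, not taken from the cited papers) of the rank-based Friedman test, the kind of procedure Demsar advocates over repeated pairwise t-tests when comparing several learners across several data sets. The accuracy figures are invented, and ties are ignored for simplicity:

```python
# Friedman test for comparing k learners over N data sets.
# Ranks are assigned within each data set, so no assumption of
# commensurable accuracies across data sets is needed.

def friedman_statistic(scores):
    """scores: N rows, one per data set, each holding k accuracies."""
    n, k = len(scores), len(scores[0])
    rank_sums = [0.0] * k
    for row in scores:
        # Rank 1 = best (highest accuracy); ties ignored for simplicity.
        order = sorted(range(k), key=lambda j: -row[j])
        for rank, j in enumerate(order, start=1):
            rank_sums[j] += rank
    avg_ranks = [s / n for s in rank_sums]
    # Friedman chi-square: (12N / k(k+1)) * (sum R_j^2 - k(k+1)^2 / 4)
    chi2 = (12 * n / (k * (k + 1))) * (
        sum(r * r for r in avg_ranks) - k * (k + 1) ** 2 / 4
    )
    return avg_ranks, chi2

# Hypothetical accuracies of three learners on four data sets.
scores = [[0.81, 0.78, 0.75],
          [0.79, 0.80, 0.72],
          [0.84, 0.77, 0.74],
          [0.80, 0.76, 0.73]]
ranks, chi2 = friedman_statistic(scores)
print(ranks, round(chi2, 2))  # → [1.25, 1.75, 3.0] 6.5
```

The statistic is compared against a chi-square distribution with k − 1 degrees of freedom (here 6.5 exceeds the 5.99 critical value at the 0.05 level for 2 degrees of freedom), and only when it is significant should pairwise post-hoc comparisons follow.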

Gaming the system in lieu of research
The clout of journals and reviewers, and the inertia of the scientific community as a society, have a great deal to do with how incoming contributions are validated.
Current research is like politics: each tendency has its own press. No matter how thorough the content, if the work submitted to a journal is not aligned with the thinking of its staff, it will never get the green light. This results in contributions focused more on pre-empting reviewers’ opinions than on disseminating the work. Demsar (2008) suggests the web-to-peer review. This unlikely idea, which appears to enable critical and fair evaluations of “correctness, interestingness, usefulness, beauty, novelty”, also evidences the urge to adopt other measures of productivity and recognition, and to end the false pretence of rigour and the biased opinions. The new peer-review process should give credibility back to publications, and researchers should not be able to game it.
Indeed, references play a crucial role in the shallow statistics above. Everyone knows they provide the input for the productivity computation. Thus, self-citations, citations to friends and the community clique, or citations to particular journals are some of the mechanisms used to climb. Citing has lost its purpose: guiding the reader to the background necessary to understand the paper.
A reinterpreted experimental science and a deep knowledge of the system have been the means by which academic researchers satisfy a demanding productivity. Unfortunately, this praxis is learnt by the new generation of researchers, who will mistake research for poor scientific journalism and scientific patter. Publication should be the recognition of mature work, and its pace should slow down to gain in quality.

Demsar, J. “On the appropriateness of statistical tests in machine learning.” Proceedings of the 3rd Workshop on Evaluation Methods for Machine Learning. 2008.
Denning, P.J. “What is experimental computer science?” Communications of the ACM 23, no. 10 (1980): 543-544.
Hooker, J. N. “Testing heuristics: We have it all wrong.” Journal of Heuristics 1, no. 1 (1995): 33-42.
Langley, P. “Machine learning as an experimental science.” Machine Learning 3, no. 1 (August 1988): 5-8.
Parnas, D.L. “Stop the numbers game.” Communications of the ACM 50, no. 11 (2007): 19-21.
Rousseau, J.J. Les confessions. Paris: Librairie Générale Française, 1972.
Russell, B. The problems of philosophy. New York: Oxford University Press, 1997.

Data complexity in supervised learning

My thesis, Data complexity in supervised learning: A far reaching implication, is finally available online.

This thesis takes a close look at data complexity and its role in shaping the behaviour of machine learning techniques in supervised learning, and explores the generation of synthetic data sets through complexity estimates. The work has been built upon four principles which naturally follow one another: (1) a critique of the current methodologies used by the machine learning community to evaluate the performance of new learners sparks (2) an interest in alternative estimates based on the analysis of data complexity. However, both the early stage of the complexity measures and the limited availability of real-world problems for testing inspire (3) the generation of synthetic problems, which becomes the backbone of this thesis, and (4) the proposal of artificial benchmarks resembling real-world problems.

The ultimate goal of this research flow is, in the long run, to provide practitioners (1) with some guidelines to choose the most suitable learner given a problem and (2) with a collection of benchmarks to either assess the performance of the learners or test their limitations.

DCoL: New release v1.1

A new version of the data complexity library (DCoL) in C++ is available at

DCoL provides the implementation of a set of measures designed to characterize the apparent complexity of data sets for supervised learning, originally proposed by Ho and Basu (2002). More specifically, the implemented measures focus on the complexity of the class boundary and estimate (1) the overlaps in the feature values from different classes, (2) the class separability, and (3) the geometry, topology, and density of manifolds. In addition, two complementary functionalities, (4) stratified k-fold partitioning and (5) routines to transform m-class data sets (m > 2) into m two-class data sets, are included in the library. The source code can be compiled across multiple platforms (Linux, Mac OS X, and MS Windows) and can be easily configured and run from the command line.
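To give a flavour of what such measures compute, here is a minimal sketch (mine, not DCoL code) of Fisher's discriminant ratio, the simplest of the Ho and Basu feature-overlap measures, for a single feature of a two-class problem. The feature values are invented:

```python
# Fisher's discriminant ratio for one feature of a two-class problem:
# F1 = (mu1 - mu2)^2 / (var1 + var2).
# Larger values mean less overlap between the classes along this feature.

def mean(xs):
    return sum(xs) / len(xs)

def variance(xs):
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

def fisher_ratio(class1, class2):
    numerator = (mean(class1) - mean(class2)) ** 2
    denominator = variance(class1) + variance(class2)
    return numerator / denominator

# Hypothetical feature values for two well-separated classes.
c1 = [1.0, 1.2, 0.8, 1.1]
c2 = [3.0, 3.2, 2.8, 3.1]
print(round(fisher_ratio(c1, c2), 2))  # → 91.43
```

DCoL computes this and the other measures over every feature of a data set; the sketch only illustrates the arithmetic behind one of them.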

Practitioners are encouraged to consider using this software in the analysis of their data. A closer reading of data complexity can help them understand the behavior of machine learning techniques on their problems.

Universitat d’Estiu d’Andorra

After a first-rate opening in May with the talk given by Prof. Cirac, the 27th edition of the Universitat d’Estiu d’Andorra officially starts today, Aug 30, with a promising agenda:

6:00 pm: Equilibri climàtic del planeta Terra (Climate balance on planet Earth), presented by Josefina Castellví Piulachs, oceanographer specialized in marine bacteriology (Barcelona).
7:30 pm: La bellesa és dins el cervell? (Is beauty in the mind?), presented by Jean-Pierre Changeux, doctor in biology and pioneer of modern neurobiology (Paris).

For five days, Andorra will offer, under the interesting title Del cosmos a l’àtom passant per la vida (From the cosmos to the atom, passing through life), a series of talks focused on science and society.

ICPR 2010 – Contest: Extended Deadline May 26

Call for Contest Participation – Classifier domains of competence: The landscape contest (ICPR 2010)

Classifier domains of competence: The landscape contest is a research competition aimed at finding out the relation between data complexity and the performance of learners. Comparing your techniques with those of other participants on targeted-complexity problems may help enrich our understanding of the behavior of machine learning techniques and open further research lines.

The contest will take place on August 22, during the 20th International Conference on Pattern Recognition (ICPR 2010) in Istanbul, Turkey.

We encourage everyone to participate and share your work with us! For further details about dates and submission, please see


The landscape contest involves the running and evaluation of classifier systems over synthetic data sets. Over the last two decades, the pattern recognition and machine learning communities have developed many supervised learning techniques. Nevertheless, the competitiveness of such techniques has always been claimed over a small and repetitive set of problems. This contest provides a new and configurable testing framework, reliable enough to test the robustness of each technique and detect its limitations.


Contest participants are allowed to use any type of technique. However, we highly encourage and appreciate the use of novel algorithms.

Participants are required to submit the results by email to the organizers.
Submission e-mail:
Submission deadline: Wednesday, May 26, 2010

The contest is divided into two phases: (1) offline test and (2) live test. For the offline test, participants should run their algorithms over two sets of problems, S1 and S2. However, the real competition, the live test, will take place during the conference. Two more collections of problems, S3 and S4, will be presented.

S1: Collection of data sets spread along the complexity space to train the learner. All the instances will be duly labeled.

S2: Collection of data sets spread along the complexity space with no class labeling to test the learner performance.

S3: Collection of data sets with no class labeling, like S2, to be run within a limited period of time.

S4: Collection of data sets with no class labeling covering specific regions of the complexity space to determine the neighborhood dominance.

For the offline test, the results report consists of:

1. Labeling the data sets of the collection S2.

The procedure is the following:

  1. Train the learner using Dn-trn.arff in S1.
  2. Provide the rate of correctly classified instances over a 10-fold cross-validation.
  3. Label the corresponding data set Dn-tst.arff in S2.
  4. Store the n models generated for each data set to perform the live contest on August 22. Be ready to load them on this day.
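For illustration only, the steps above can be sketched as follows. The toy in-memory data stands in for the Dn-trn.arff/Dn-tst.arff files, and a nearest-centroid classifier stands in for the participant's learner; none of this is part of the contest kit:

```python
# Sketch of the offline test: train a learner on labeled data, report
# 10-fold cross-validation accuracy, then label an unlabeled test set.

def centroids(X, y):
    """Fit a nearest-centroid model: mean point of each class."""
    by_class = {}
    for xi, yi in zip(X, y):
        by_class.setdefault(yi, []).append(xi)
    return {c: [sum(col) / len(rows) for col in zip(*rows)]
            for c, rows in by_class.items()}

def predict(model, x):
    def dist2(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return min(model, key=lambda c: dist2(model[c], x))

def cross_val_accuracy(X, y, k=10):
    """k-fold cross-validation with simple interleaved folds."""
    n = len(X)
    correct = 0
    for fold in range(k):
        test_idx = set(range(fold, n, k))
        Xtr = [X[i] for i in range(n) if i not in test_idx]
        ytr = [y[i] for i in range(n) if i not in test_idx]
        model = centroids(Xtr, ytr)
        correct += sum(predict(model, X[i]) == y[i] for i in test_idx)
    return correct / n

# Toy stand-in for a training set in S1: two 2-D classes.
X = [(i * 0.1, i * 0.1) for i in range(10)] + \
    [(2 + i * 0.1, 2 + i * 0.1) for i in range(10)]
y = [0] * 10 + [1] * 10
print("10-fold CV accuracy:", cross_val_accuracy(X, y))  # → 1.0

# Label a stand-in for the corresponding test set in S2 using the model
# trained on all of the training data.
model = centroids(X, y)
labels = [predict(model, x) for x in [(0.05, 0.1), (2.5, 2.4)]]
print("test labels:", labels)  # → [0, 1]
```

In the real contest, the trained models would also be stored so they can be reloaded for the live test on August 22.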

2. Describing the techniques used.

A brief summary (1–2 pages) of the machine learning technique(s) used in the experiments must be submitted. We expect details such as the learning paradigm, configuration parameters, strengths and limitations, and computational cost.


* May 26, 2010: Deadline for submission of the results and technical report

* May 29, 2010: Notification of participation

* Aug 22, 2010: Release of S3 and S4

* Aug 22, 2010: ICPR 2010 – Interactive Session


Dr. Tin Kam Ho – tkh at
Núria Macià – nmacia at
Prof. Albert Orriols Puig – aorriols at
Prof. Ester Bernadó Mansilla – esterb at

ICPR 2010 – Contest

Classifier domains of competence: The landscape contest is a research competition aimed at finding out the relation between data complexity and the performance of learners. Comparing your techniques with those of other participants may help enrich our understanding of the behavior of machine learning and open further research lines. Contest participants are allowed to use any type of technique. However, we highly encourage and appreciate the use of novel algorithms.

The contest will take place on August 22, during the 20th International Conference on Pattern Recognition (ICPR 2010) in Istanbul, Turkey.

We are planning to hold a one-day workshop during ICPR 2010, so that participants will be able to present and discuss their results.

We encourage everyone to participate and share your work with us! For further details about dates and submission, please visit The landscape contest webpage.

AI: Reality or fiction?

It seems that the artificial intelligence portrayed in science fiction is not as far from reality as we used to think. The main character of the film AI, a little boy belonging to a series of robots capable of emulating human behavior, is now a model that current scientific projects aim to reach by providing machines with consciousness, thoughts, and emotions to interact with human beings. Thus, the world described in Blade Runner, a world where humans and robots coexist and cannot be distinguished with the naked eye, may be just around the corner.

The advances in the AI field, however, are starting to raise serious concerns about robot autonomy and its social status, as well as how to face this social disruption, and the three Laws elaborated by Asimov to protect humans from machines are starting to make sense for people other than computer geeks. Scientists are concerned about the “loss of human control of computer-based intelligences”, and last February the Association for the Advancement of Artificial Intelligence organized a conference in Asilomar (not a casual choice of place) to discuss the limits of research in this field. The development of machines that come close to killing autonomously is worth a discussion by those involved in creating the brains of such devices. The news of this event appeared in Markoff’s article in the New York Times.

On the other hand, who will be responsible for damage caused by these autonomous friends? They themselves, or the corresponding designer? In this sense, philosophy should play a leading role in the design and integration of these “future citizens”, since they should have a moral system allowing them to learn ethics from experience and from people, and also find their place in our society. The latter implies creating a legal framework that defines machines’ civic rights and duties, a proposal currently under study (see the news published by “El Periódico”, in Spanish).

Finally, one may ask whether or not we are ready to live with human emulators. In my view, we are not. Although in past years we have been skillful at adapting to new and challenging situations, and our experience with immigrant integration and racial conflict should help us welcome these new electronic neighbors, I tend to think that coexistence with robots will be one of the greatest challenges mankind has ever faced. In any case, we will need to figure out a way to overcome it, because the individualism and loneliness ruling our current society are leading us unrelentingly toward a future with custom-made roommates.

GECCO 2009: A binary pre-teenager

GECCO, one of the most relevant conferences on evolutionary computation, starts its 10th edition today in Montréal (Canada). The organizing committee has prepared plenty of surprises within a tight agenda. From July 8 to July 12, full days of tutorials, workshops, poster sessions, talks, competitions, awards, the birthday celebration, and the star talk by John H. Holland promise an immersion into the emergent world of evolutionary computation. I hope all of them give rise to the “Chronicles of GECCO”.

For further information, please see the program.

The mysterious world of quantum physics

Three weeks ago I read the post Prof. Cirac interviewed about quantum physics and information theory, where you can find the link to the video of Cirac’s interview on Catalan TV. It is amazing how Prof. Cirac introduces some basics of quantum physics using simple words and a pair of dice. Basically, he explains the existence of two worlds: the macroscopic world (the real world, as we know it) and the microscopic world (the world of tiny things, such as particles). Quantum physics lives in the latter, a tailored world governed by its own laws, which open the doors to parallel universes and allow paradoxical phenomena. The highlight of the interview was the discussion of quantum computers and a revealing cryptographic method for transmitting information in an indecipherable way. How can the reliability of such a secure transmission be ensured? Because there is no transmission over a channel; the information just appears in the right place at the right moment.

This nice introduction helped me follow the exciting talk Quantum computer compilers, given by Prof. Al Aho, a computer science celebrity and one of the authors of the AWK programming language and of the so-called Dragon Book, Compilers: Principles, Techniques, and Tools.

The talk focused on the following six questions:

1. Why is there so much excitement about quantum computing?
2. How is a quantum computer different from a classical computer?
3. What is a good programming model for a quantum computer?
4. What would make a good quantum programming language?
5. What are the issues in making quantum computer compilers?
6. When are we likely to see scalable quantum computers?

Prof. Aho presented, with a clear explanation and a touch of humor, the fascinating field of quantum computers: describing how “computation is just a particle dancing around others”, enumerating the four postulates of quantum mechanics, mentioning that quantum teleportation is information transmission based on changes that take place instantly, envisaging programming without a copy operation… However, despite the wonders of quantum computers, it seems that we will have to wait a little longer to be able to solve NP-hard problems.

Finally, take a glance at The Blog of Scott Aaronson, an unusual and interesting blog on this topic.

HAIS 2009

The special session Knowledge extraction based on evolutionary learning, organized by Salvador García, Albert Orriols, and José Otero, was one of the opening sessions of the 4th international conference on hybrid artificial intelligent systems (HAIS 2009).

Its program, full of interesting talks discussing new trends in knowledge extraction by means of evolutionary algorithms, included two contributions from the GRSI: Multiobjective evolutionary clustering approach to security vulnerability assessments and Beyond homemade artificial data sets, presented by Guiomar Corral and Albert Orriols, respectively.

Guiomar Corral introduced an evolutionary multiobjective approach to cluster the devices of a network with similar vulnerabilities. This technique provides analysts with a map that is helpful for detecting malicious attacks or unauthorized changes in the network.

Albert Orriols, in turn, addressed a hot topic in machine learning: the generation of artificial data sets. He explained the importance of working under a controlled experimental framework and pointed out some ideas for building one.

For one more day, Salamanca will be a forum for exchanging new ideas and presenting recent developments in the field of artificial intelligence.