Monthly Archives: January 2011

The two ends of Open Data

OKD or Open Knowledge Data is a term increasingly used in connection with the Semantic Web. It follows the philosophy of openness as used in Open Source or Open Access, whereby an artefact is free to retrieve, use, reuse, and share.

We have lived with open knowledge for thousands of years. It’s when a tribal elder told a Neanderthal youngster how to hunt, or when your granny passed on the recipe of her magnificent apple pie.

Now, things are about to change. It is a sad fact that formalisation of openness is the only response we have to the digitisation of knowledge. In the digital world, giving ‘verbal’ advice or collecting data leaves traces and timestamps that trigger copyright and intellectual property rights. Digital knowledge also comes in the widest possible variety of formats. This leads to a dilemma for research, in that it becomes enormously difficult to reproduce findings – a fundamental requirement for testing hypotheses.

For learning analytics there are further obstacles. It’s somewhat bizarre that the default position of commercial social networks is the complete opposite of that in education. Whereas in the commercial world user data is treated as public by default (unless actively protected by the user), in education the data default is set to private, and special permission needs to be sought to use that data at all. Since, at least in Higher Education, people can be seen as voluntarily opting into that system, in the same way as they opt into a social network, it would be sensible to change the system default.

When it comes to personal or private data, there is a further dilemma: ownership. When data are collected, e.g. by a commercial company, their use may be unethical and even unlawful. Lawyers in Germany have recently started legal proceedings against Google’s Gmail reading your e-mail messages in order to provide personalised advertisements. While users of Gmail signed an agreement that allows Google to scan their messages, the recipients of those messages did not. It’s a privacy infringement.

In conclusion, there are at least two end parties involved in data gathering: the data gatherer and the data subject. It is unresolved who has ownership of the data, the party who produces it through enacting a particular behaviour, or the party that collects this data and makes it explicit and exploitable. Whatever the bottom line of this, in order to protect privacy, it is not enough to have an open data agreement with just one of them.

Is Connectivism a Cyborg pedagogy?

In the Captain Picard era of the Star Trek series, the Borg (cyborgs – cybernetic organisms) were portrayed as the humans’ worst enemies. A relentless swarm intelligence race invading our galaxy (and threatening to invade Earth) whose most noteworthy message to the outer world was: “resistance is futile – you will be assimilated”! This is because the Borg don’t thrive as a mass of individuals, but as a single collective.

Connectivism is a nascent pedagogic theory and this is my first critical contribution in order to maybe improve on it. My main point here – does Connectivism negate individuality?

I have expressed before some views on how knowledge resides in an inter-subjective space. Inter-subjective here meaning across a number of individuals, i.e. in networks. However, this is the abstract notion of ‘knowledge’ and we have not identified the boundaries to the very similar concept of ‘belief’, which occupies the very same space.

Be that as it may, knowledge is also present in individual instantiations, or supra-individual instantiations in the case of organisational knowledge, and it is enhanced not only by connectivist activities, but also by one’s own experiences and decision making processes. In a comment to George Siemens’ elaboration of Connectivism, Lanny Arvan describes large parts of learning as ‘verification’. I’d agree with that. This leads to an important perspective on knowledge and its creation, namely that “knowledge is conflict”.

As with the Borg, the unanswered question of Connectivism is where does steering of knowledge development and application come from if not from individuals? Does Connectivism support learning objectives, and where do these come from, intrinsic motivation or external drive? Are we led by individuals pulling and pushing every which way, is it the context that rules (e.g. democracy vs. dictatorship), or is there a dedicated network of experts which we may randomly call the education sector? Or all of them? Much of this would depend on whether connectivists see themselves as descriptive or prescriptive.

As far as the importance of the network over its content goes, I do see a deficiency in Stephen Downes’ “empty pipes” response to Tony Forster:

[Connectivism] denies that there are bits of knowledge or understanding, much less that they can be created, represented or transferred

Maybe I misunderstand, but to me, this lacks purpose and does not allow intentional behaviour. Without purpose knowledge cannot grow or even exist. Intrinsic motivation or external dictation, what makes networks emerge and grow? More recently, Stephen does acknowledge the existence of content in a connectivist world. This is important, for even in a data network, the bits and bytes that flow through the channels give the network meaning and purpose.

In questioning existing learning theories, George asks:

What adjustments need to be made with learning theories when technology performs many of the cognitive operations previously performed by learners (information storage and retrieval).

and later:

How do learning theories address moments where performance is needed in the absence of complete understanding?

I sincerely hope that this does not express a ‘dumb terminal’ view of a cyborgised knowledge society where people are meant to act according to knowledge external to them. In such a connectivist world, Robinson has no ability to survive a new island situation.

Other observations:

Connectivism and tools:
In connectivist theory, technology takes a leading role, not as a mere medium or facilitator, but as part of the core without which knowledge cannot exist. At the same time, we observe that the newest social networking technologies are enormously low in dialogistic nature. Twitter, FB, YouTube, blogs, etc. are information technologies, not communication technologies, and their dialogistic functions (comment and sharing facilities) are in fact secondary to their monologistic broadcasting design. Because responses are not a precondition, this is more favourable to cognitive and constructivist use than to Connectivism. Hence, it can be argued that we are still lacking appropriate tools that support a connectivist approach.

The supposed know-where dimension that Connectivism adds is far from new. No lawyer or doctor would have learnt the legal code or the medical literature by rote; they would have been trained where to find the required knowledge and updates when needed. By contrast, our technology-enhanced way of life requires less of that as we continue to blindly trust in search engine algorithms or GPS navigation. A recent survey has shown that students base their study work largely on the first result returned by Google. Elsewhere, people have lost the skill of finding answers in the yellow pages. Connectivism, it would seem to me, has to acknowledge that this dimension of where to find knowledge is about to disappear.

Small world networks:
One point worth noting with respect to small world networked learning is that people often don’t listen to the nodes closest to them (children – parent, wife – husband). Connectivism would have to explain this phenomenon in its own way.

Is there such a thing as digital scholarship?

Around Martin Weller’s talk on the Connectivism and Connective Knowledge course (CCK11) an interesting debate arose on what ‘scholarship’ means. As always, people followed the typical rituals to find a narrow definition for the term. And, also as always, the term kept evading any pinning down. I have done this exercise in a previous life with students, making them reflect on what “work” is, or “art”. The beauty of human language is that the meaning very much depends on your point of view. With respect to scholarship, Martin mentioned Boyer’s criteria and they sounded good enough for me to describe the role.

How has scholarship changed in the digital age? And especially in a fragmented media landscape, where it is no longer a privilege of a few university professors to publish to the world. Debate arose about blog postings (like this one) being a scholarly artefact. But how can we be sure? – short answer: we cannot.

The popularity of a web posting is no reliable sign as to whether it contains scholarship. A nice provocative question in the course was whether Sarah Palin’s digital versatility exhibits scholarship. The first question I’d ask on this one is: does she write it herself? But again, this is no way to answer the question of what digital scholarship is.

From the discussion (and I poked it a bit) it emerged that the participants seemed to loathe the connection of scholarship with financial value. Yet, most of them I’d argue work for money and in the so-called knowledge economy in a highly competitive environment. And this leads me to my take on what’s scholarship today – digital or not:

Looking back a bit, it seems to me there was a strong divide between researchers working in a non-profit academic institution and researchers working in industry (who would not be called scholars). Hence, scholarship is tightly connected with academia as a system in society. What has dramatically changed recently is that the two sectors have moved closer together, in that the education sector introduced competitiveness and financial viability as defining factors. This, together with the industrialisation of mass education, led to institutions being measured by market value (student numbers, graduate employability, research output, etc.). New technologies are one way in which competitiveness is lived out between institutions (cf. giveaways of e-book readers etc.). Technology may, therefore, only be the battle ground for a larger, more substantial change in redefining the educational and scholarly landscape. And, most importantly: scholars are not exempt from these forces!

Semantic Web and Language Technologies

Ok, it seems like a good time to reflect on semantic technologies a bit, since this week’s topic in the Learning and Knowledge Analytics MOOC (LAK11) is the Semantic Web.

What are the differences between semantic web technologies and language technologies? Following the keynote presentation by Dragan Gasevic, I’d say the key difference is that the Semantic Web is about a technology that allows computers to communicate in an enhanced way, whereas language technologies are about human language. Semantic web technologies like OWL, RDF and ontologies identify relationships between entities. Language technologies are about analysing and understanding natural human language; natural language processing (NLP) is the catch-all term for this.

What semantic web technologies can do is relatively simple to show. Take this example: You know that your friend John has a brother living in South America, but you can’t remember his name. Typing “brother of John” into a traditional search engine won’t work. All it will return is documents that contain the words ‘brother’ and ‘John’ or the exact phrase ‘brother of John’. The Semantic Web “knows” about relations, hence it would return a result saying ‘brother of John’ = ‘Kendon’. It works in exactly the same way for ‘capital of France’ = ‘Paris’; or ‘other words for red’ = ‘crimson’, ‘ruby’, etc. Semantic search engines can do this, based on a vocabulary of relations. This not only stores the words themselves, but also the way in which they relate to each other, i.e. ‘goose’ is a sub-item to ‘bird’.
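To make the difference between keyword matching and a vocabulary of relations a bit more tangible, here is a minimal sketch in Python. It is not how RDF or OWL are actually implemented; it only assumes that facts are stored as (subject, relation, object) triples. The relation names, the sample documents and the helper functions are all made up for illustration.

```python
# Toy illustration only: facts stored as (subject, relation, object) triples.
# The facts mirror the examples in the text; the relation names are assumptions.
TRIPLES = [
    ("Kendon", "brother of", "John"),
    ("Paris", "capital of", "France"),
    ("crimson", "other word for", "red"),
    ("ruby", "other word for", "red"),
    ("goose", "sub-item of", "bird"),
]

# Hypothetical documents that a traditional search engine might index.
DOCUMENTS = [
    "John met his brother at the airport.",
    "Kendon lives in South America.",
]


def keyword_search(documents, words):
    """Return the documents that contain every given word (plain matching)."""
    return [d for d in documents if all(w.lower() in d.lower() for w in words)]


def relation_query(relation, obj, triples=TRIPLES):
    """Return every subject that stands in `relation` to `obj`."""
    return [s for (s, r, o) in triples if r == relation and o == obj]


if __name__ == "__main__":
    # Keyword matching finds a document, but not the brother's name.
    print(keyword_search(DOCUMENTS, ["brother", "John"]))
    # The relation lookup returns the name directly.
    print(relation_query("brother of", "John"))
    print(relation_query("other word for", "red"))
```

The keyword search can only hand back text that happens to contain the words, while the relation lookup answers the question itself because the relation is stored explicitly.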

In contrast to this, natural language processing tries to understand human language. Here is an example: When you put to someone the polite question “may I ask you, how old you are?”, the answer “I’m 42” is a perfectly acceptable response. Not to a computer! A computer does not understand politeness and can only respond with ‘yes’ or ‘no’.

How can we use natural language processing in learning and knowledge analytics? Here are two examples of possible use:

  • (1) conceptual coverage – conceptual development
    Language technologies also use vocabularies and ontologies. But in addition they also refer to grammars and corpora. This gives them the ability to identify synonyms. With language technologies like Latent Semantic Analysis distances between words or terms can be mapped out against their context. By comparing artefacts with a large body of documents, a new language item can be mapped in how closely it relates to them and the domain they cover. It also identifies related concepts and through specific techniques like disambiguation can exclude homonyms with different meaning, e.g. Java (the island) from Java (the programming language).

    Taking the assumption that a learner attempts to progressively adopt more and more subject specific expressions and terminology, their conceptual coverage can be identified in the textual artefacts they produce (e.g. entries to their learning diary or blog), and, longitudinally, their conceptual development. The hypothesis simply is: “if it quacks like a duck…“. By analysing concept coverage in learner texts, a tutor (or support system) can also identify omissions and provide relevant intervention.

  • (2) dialogue analysis
    Chat and forum are widespread in education. However, using them with a large group of learners is challenging and carries a high cognitive load for the participants as well as the tutor. This is because the natural text language is confused in several ways: Firstly, the discussion threads are often wildly intertwined and hard to follow (in fora this is often shown as tree-structured indents). Secondly, some people are slow typists or make mistakes. A computer system analysing such text must understand typos. Thirdly, an abbreviated form of language (btw, omg, lol), emoticons, or grammatically incomplete sentences are used.

    Sophisticated techniques like anaphora resolution, utterance categories, etc. can go a long way to analyse dialogues. In a learning context, this can support discussion moderators, but also the participants themselves in identifying strengths and weaknesses in textual conversations. The system can, for example, show who posed most questions, and who provided answers. It can also find out who disengaged and talked about something else (concept coverage as above), or who did not connect to others in the discussion. This knowledge analysis may help a tutor to support learners in staying on task.
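The kind of overview described for dialogue analysis (who asked, who answered, who did not connect) can be sketched very roughly in Python. This is a simplification under loud assumptions: the chat log is invented, each message is assumed to already carry an explicit addressee, and a trailing question mark stands in for real utterance categorisation. No anaphora resolution or typo handling is attempted.

```python
from collections import Counter

# Hypothetical chat log: (author, addressed_to or None, text).
MESSAGES = [
    ("anna", None, "How do we measure engagement?"),
    ("ben", "anna", "Count the posts per week."),
    ("anna", "ben", "Is that enough?"),
    ("cara", None, "btw the weather is great today"),
    ("ben", "anna", "Probably not, quality matters too."),
]


def is_question(text):
    """Crude stand-in for utterance categorisation: ends with '?'."""
    return text.strip().endswith("?")


def questions_per_author(messages):
    """Count how many questions each participant posed."""
    return Counter(a for a, _, t in messages if is_question(t))


def answers_per_author(messages):
    """Count addressed, non-question messages as answers."""
    return Counter(
        a for a, to, t in messages if to is not None and not is_question(t)
    )


def unconnected(messages):
    """Participants who neither addressed anyone nor were addressed."""
    authors = {a for a, _, _ in messages}
    connected = set()
    for author, to, _ in messages:
        if to is not None:
            connected.update((author, to))
    return authors - connected


if __name__ == "__main__":
    print(questions_per_author(MESSAGES))
    print(answers_per_author(MESSAGES))
    print(unconnected(MESSAGES))
```

Even such a crude count gives a moderator a first hint of who is carrying the conversation and who is drifting off, which is the kind of support meant above.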

NLP and the Semantic Web are complementary technologies. The latter is very important for finding, sharing and exchanging resources, while the former is more geared towards human interaction. NLP still has a long way to go, but the signs are promising.
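Going back to example (1), the idea of conceptual coverage can be sketched in a few lines of Python. Note that this is plain term matching against a hand-made vocabulary, not Latent Semantic Analysis, and it does no disambiguation or synonym detection. The domain terms and the diary entries are invented for illustration.

```python
import re

# Hypothetical domain vocabulary a tutor might expect learners to pick up.
DOMAIN_TERMS = {"ontology", "rdf", "owl", "triple", "vocabulary"}


def tokens(text):
    """Lower-case the text and return its set of alphabetic word tokens."""
    return set(re.findall(r"[a-z]+", text.lower()))


def coverage(text, domain_terms=DOMAIN_TERMS):
    """Return (share of domain terms used, sorted list of omitted terms)."""
    used = tokens(text) & domain_terms
    missing = domain_terms - used
    return len(used) / len(domain_terms), sorted(missing)


if __name__ == "__main__":
    # Two invented diary entries by the same learner, in chronological order.
    early_entry = "I read about RDF today."
    later_entry = "An ontology in OWL builds on RDF triple statements."
    print(coverage(early_entry))
    print(coverage(later_entry))
```

Comparing the share across entries over time hints at conceptual development, and the list of omitted terms is what a tutor or support system could use for an intervention.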

    PS: more info at

    Does Big Data threaten personalisation?

    This is a reflection on week 2 of the Learning and Knowledge Analytics course (LAK11). I watched two interesting presentations from Ryan Baker and Ian Ayres at Google on number crunching and prediction algorithms. With more and more data available to us – the so-called Big Data revolution – more analytics can be done, with better and further-reaching applications.

    Ryan Baker elaborated on educational data mining (EDM), especially on how it can be used to detect students off-task or cheating the system (he calls it ‘gaming’). They developed a model that relatively accurately identifies such behaviour as it happens.

    While it is pretty impressive to hear how stats can outperform experts in predicting certain outcomes (e.g. elections), there is something of a bitter aftertaste in all of this that tells me it’s not all nice and dandy. Big Data analysis calculates mean behaviour – it does not calculate or predict outliers. This produces a measure of how distant a behaviour is from the “norm”. Results can be clustered, but they will still focus around a statistical mean. While this is fine to impose mainstream thinking, it could pose a danger to individuality, I believe.
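The "distance from the norm" mentioned above can be illustrated with a standard score (z-score): how many standard deviations a value lies from the mean. This is only a sketch of the general idea, not the model used in the presentations, and the sample numbers are made up.

```python
from statistics import mean, pstdev


def distance_from_norm(values):
    """Return each value's z-score: (value - mean) / standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # Everyone behaves identically, so nobody deviates from the norm.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


if __name__ == "__main__":
    # Invented data: minutes on task for six learners; the last is unusual.
    minutes_on_task = [30, 32, 29, 31, 30, 90]
    for minutes, z in zip(minutes_on_task, distance_from_norm(minutes_on_task)):
        print(minutes, round(z, 2))
```

The one unusual learner ends up far from the mean and is flagged as deviant, although the number alone says nothing about why they took longer. That is the individuality problem in a nutshell.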

    Personal treatment has already lost much ground in society due to computer technology. Whereas people are able to treat you individually and flexibly – taking note of your personal situation – forms and algorithms do not. Gone are the days when one could ask the waiter for separate bills or to redefine a bill item (e.g. turn beer into coke). Computer tills make this impossible. Similarly, late submissions to calls for proposals are unacceptable to a computer, the clock of which tells it relentlessly when time is up. It is a human virtue to be flexible beyond a set rule – and this helps us support each other in personal ways.

    With computers identifying mainstream thinking, they are only a step away from moving from supporting humans to patronising us. We see this already happening in advertising, where ads are selected by what fits me or people similar to me. Compare this to the discovery of an advertisement in the papers that I would have never chosen to look at. And here I see the strange contradiction: while data analytics aim to make things more personal to us, they block personal development and personal treatment by ‘mainstreaming’ us and preventing discovery. After all, “personal” does not necessarily mean “average”.

    Gravity rules the MOOC LAK11

    Lots of things happen on the MOOC on Learning Analytics 2011 (LAK11). Discussions spread every which way. Participants migrate between discussions and platforms (or shall we say “bounce”?). The closest analogy I have for this is the ‘open space’ conference format. There, too, there is a central theme (in this case “Learning Analytics”) and then people are asked to set up their own talk shops, deciding on their own take on the theme and inviting others to join in. A MOOC follows the same principles but is entirely virtual.

    Obviously, and we have been warned about this, it’s impossible to follow every discussion. So there are certain gravitational fields that people follow, and where they group together – some more popular, others less so (but nevertheless interesting). Centres of gravity are: platforms (Facebook, Netvibes, Moodle, Twitter, Diigo, and many more), topics, and people (certain people attract a greater following simply by being there). What I have not seen so far is that connections matter much, unless one interprets the fluidity of people between nodes as an indication of this. What holds it all together and drives these centres of gravity are the fixed points of stimulating weekly keynote presentations. They are in my opinion vital in that they focus the mind and the discussions, provide a sort of basis for selection (do you continue the discussion from last week or switch to something new?), and generally ensure that the network does not wear out too soon.

    What I have not found yet for myself, though maybe others have, is a strategy for accessing the knowledge spread across the course. There is, inevitably, the feeling that I might miss an important discussion and not be in the right place. I think I’m getting less worried as the MOOC goes on, since internet technologies will preserve most of the ideas expressed for later access. So if I don’t get it now, I can catch up later, except for the opportunity of contributing. Alas, that’s the same in open space conferences, and if you only have one TV at home.

    Learning and knowledge analytics

    Last week’s keynote of the Learning and Knowledge Analytics course, LAK11, by John Fritz addressed data gathered by institutional LMS. John rightly questioned the openness of such tracking data, as these are typically first and foremost accessible to system administrators. An experiment of openness in his university shows some interesting insights into how allowing students access to this data can enable reflection on their learning.

    Even more interesting is the resulting discussion with other participants. Gillian Palmer indicated some serious concerns, which I share. More generally, I see the danger of reducing students’ interactions to mouse clicks and number crunching. Since pulling together stats from log files is as easy as clicking a button, whereas assessing the quality of students’ work is not, educators may be tempted to come to the wrong conclusions and rate system data higher than they deserve.

    One of the challenges for learning and knowledge analytics is that with increased use of social media and personalisation, learner data becomes increasingly fragmented and it becomes harder and more time consuming to get a comprehensive picture, which in turn makes real-time support more difficult.

    This triggered some comments on the demand for qualitative learning analytics, looking at the artefacts directly rather than the interactions that produced them. This can be done using language technologies, but these are still in their infancy. However, there is already enough potential there to support certain elements of learner support.

    MOOCs – from micro to macro

    I just joined the Massive Open Online Course (MOOC) on Learning Analytics and listened to the introduction session by George Siemens. An impressive following of some 500 “learners” has been showing an interest according to the latest figures George mentioned. This in itself is already a pleasant surprise to me, and certainly worth a “well done, guys”!

    The first impressions I have of a venture like this are positive but not without hesitation. I won’t conceal it from you that it is less the topic of “learning analytics” that’s of interest to me (although I am ready to learn something about this too), but the course itself. This is also where my hesitation lies, but we shall talk more about this later in the course.

    The first thing I learned through the introduction was that it is probably best to throw out any terminology that we are used to. It carries too much semantic load from a different way of education, and there is little resemblance here to traditional courses or learners. To me it would be better to call it a “gathering” or “pow-wow”, or to use the Zulu word “imbizo” – “communities coming together to celebrate friendship, soul & spirit”. As George pointed out, the key component of the MOOC is that people decide on a set period of time to interact in more or less any way they wish, but there are no learning goals or outcomes – people decide themselves what they want to get out of it.

    A learner is also not a learner in this context, since this bears too much of the learner-teacher dichotomy. There is no teacher as such in this imbizo, just facilitators, but anyone can become their own facilitator if they wish. So the MOOC is more of a SIG of equals than a classroom.

    In the very same way “content” does not exist. What I mean is there is no mandatory content to read or follow. Most of the content elements as Stephen Downes pointed out are created on the fly by participants. This does not exclude pre-existing texts, of course, but it is the reflection, communication and interactions that are the core elements of the imbizo.

    Now to my hesitations: Learning in a group (whatever you may call it) requires one thing: commonality. If it is not the curriculum, or the teacher, it has to be something else. In the introduction the greatest emphasis was placed on shared technology, or should I say sharing technologies (technologies for sharing), like RSS, Moodle, FB, Diigo, Twitter, SlideShare, etc. This is what bothers me slightly: it was not the shared interest/topic that was laid out as a basis for coming together, but the how-to of technology. To me this sounds like focusing traditional courses around a blackboard or the OHP, which I personally find off target, since to me technology is a tool, not a purpose.

    Stephen mentioned the four dimensions of connectivist learning: autonomy, diversity, openness, and interactivity. Relating more closely to the course structure, George had three dimensions of activity zones: structured, exploratory, and free-range/chaotic. Hmm, both approaches may not be useful to distinguish the MOOC imbizo from what we are doing already in everyday practice. In my work, when I do the “learning-on-the-job” thing, I am acting autonomously, in an exploratory and chaotic way. I am also open about it and share this interactively via a variety of blogs and social tools on the net.

    A more general curiosity to me is that politics around the education system (incl. HE) is intensely focused on smaller class sizes, yet, here in the MOOC we go large – very large indeed. Is learning in small groups out?

    As I said above, I am curious to see how this develops. Since I am interested in the MOOC itself, I won’t be disappointed by missing the topic, but I am optimistic that topics will be covered and once participants find their strategy, they can and will benefit from it.