Category Archives: Technology & Innovation

Data assumptions can be deluding

Here’s a slightly critical add-on to my previous post about Learning Analytics and EDM. This year’s Learning and Knowledge Analytics course (LAK12), once again, brings up some highly valuable perspectives and opportunities for developing new insights, models, and harvesting possibilities for learning in general. However, this should not stop us from being aware of delusions and assumptions that are somehow orphaned in the ongoing discussions. In particular, I want to mention three potential pitfalls that are perhaps too much taken for granted:

  • data cleanliness
  • weighting
  • meaningful traces

When developers talk about their products, everything looks shiny and wonderful. All the examples shown work smoothly and give meaningful results. This makes me pause, for while the ideation of new analytics technology is a wonderful thing and anticipated with much enthusiasm, the data isn’t always as clean as it is presented to be in the theoretical context. Most, if not all, data processes have to undergo a cleansing process to get rid of datasets that are “contaminated”. A good example is the typical “teacher test” content found in virtually every VLE database. It’s not always clearly indicated as “test” either, so in many cases extensive manual processes of eliminating nonsensical data have to be conducted. It should, therefore, be standard practice to report on how much data was actually “thrown away” and on what basis. Not that this would in any way discredit the usefulness of the remaining dataset, but it would indicate the amount of automated or manual selection that has gone into it.
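To make the reporting idea concrete, here is a minimal sketch of a cleansing step that keeps count of what it discards and why. The record fields, the sample rows, and the two exclusion rules are all assumptions made up for illustration; a real VLE export will look different.

```python
from collections import Counter

def cleanse(records, rules):
    """Split records into kept rows and a tally of exclusion reasons."""
    kept, reasons = [], Counter()
    for record in records:
        # The first matching rule supplies the reason for exclusion.
        reason = next((name for name, rule in rules.items() if rule(record)), None)
        if reason is None:
            kept.append(record)
        else:
            reasons[reason] += 1
    return kept, reasons

# Hypothetical VLE log rows; the field names are assumptions.
records = [
    {"user": "s001", "course": "Biology 101"},
    {"user": "t042", "course": "teacher test"},
    {"user": "s002", "course": "Biology 101"},
    {"user": "", "course": "Chemistry 1"},
]

# Hypothetical exclusion rules, each named so it can appear in the report.
rules = {
    "test content": lambda r: "test" in r["course"].lower(),
    "missing user": lambda r: not r["user"],
}

kept, reasons = cleanse(records, rules)
dropped = sum(reasons.values())
print(f"kept {len(kept)} of {len(records)} records, dropped {dropped}")
for reason, count in reasons.items():
    print(f"  {reason}: {count}")
```

The point of the sketch is only that the tally of reasons travels with the cleaned dataset, so the amount and basis of the selection can be published alongside the results.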

This necessarily leads to questioning the weighting of data. By which mechanism are some datasets selected as being meaningful, and at which priority level over others? Very often, the rationale behind the selection of variables is not exposed, and neither is the priority relationship to other variables in the same dataset. Still, it must be transparent whether e.g. the timing, the duration, or the location of an event is given more weight when predicting a required pedagogic intervention (or not). After all, a young person’s future may depend on it.
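One way to picture what transparent weighting could look like is a score whose weights are stated openly and whose result can be broken down per variable. This is only a sketch under assumptions: the weights, the 0-to-1 feature scaling, and the function names are hypothetical and not taken from any existing analytics product.

```python
# Hypothetical weights; publishing them is the point, not their values.
WEIGHTS = {"timing": 0.5, "duration": 0.3, "location": 0.2}

def risk_score(event, weights=WEIGHTS):
    """Weighted sum of event features, each assumed to be scaled to 0..1."""
    return sum(weights[name] * event[name] for name in weights)

def explain(event, weights=WEIGHTS):
    """Return each feature's contribution to the score, largest first."""
    parts = {name: weights[name] * event[name] for name in weights}
    return sorted(parts.items(), key=lambda item: item[1], reverse=True)

# Hypothetical event with made-up feature values.
event = {"timing": 0.9, "duration": 0.2, "location": 0.5}
score = risk_score(event)
print(f"score: {score:.2f}")
for name, contribution in explain(event):
    print(f"  {name}: {contribution:.2f}")
```

With a breakdown like this, anyone affected by a predicted intervention could at least see which variable tipped the balance.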

From the above two limitations results a third, which concerns the question of what constitutes a meaningful trace that a user leaves on the system. We know users leave data behind when conducting an electronic activity. These can be sequenced by time, but it is far from clear where the useful cut-off points of a sequence or ‘trace’ are. Say you had a string of log data A-B-C-D-E-F-G-H. Does it make more sense to assume that BCD constitutes meaning, or would CDEF perhaps be better – and why would it be better?
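The size of the problem shows up as soon as one enumerates the candidate traces. The sketch below lists every contiguous sub-sequence of the example log within given length bounds; the function name and the length limits are assumptions chosen for illustration, and nothing in it decides which window is the meaningful one.

```python
def windows(trace, min_len=2, max_len=None):
    """Yield every contiguous sub-sequence of trace within the length bounds."""
    max_len = max_len or len(trace)
    for length in range(min_len, max_len + 1):
        for start in range(len(trace) - length + 1):
            yield trace[start:start + length]

trace = "ABCDEFGH"

# Restrict to windows of three or four events, as in the BCD / CDEF example.
candidates = list(windows(trace, min_len=3, max_len=4))
print(len(candidates), candidates)

# Without an upper bound, every window of two or more events is a candidate.
print(len(list(windows(trace))))
```

Even for eight events there are 28 windows of length two or more, so any choice of cut-off points is a modelling decision that deserves to be stated.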

I realise that these questions could be interpreted as destructive criticism, but we have one other possibility, which is to just take the results conjured up in a black box at face value and see if they look plausible no matter how they were derived. This we could call the Google approach.

The twinning of EDM and Learning Analytics

After listening to Ryan Baker’s presentation on Educational Data Mining (EDM), I am more convinced than ever that EDM and Learning Analytics are actually the same side of the same coin. Despite attempts being made to explain them into different zones of influence or different missions, I fail to see such differences, and from reading other LAK12 participants’ reflections, I am not alone in this. Baker’s view that Learning Analytics are somewhat more “holistic” can be refuted with a simple “depends”. What is more, historically, EDM and LA don’t even originate from different scientific communities, such as is the case with metadata communities versus librarians, or with electric versus magnetic force physics – now of course known as electromagnetism.

Both approaches (if there are indeed two) are based on examining datasets to find ‘invisible’ patterns that can be translated into information useful to improve the success and efficiency of the learning processes. A good example Baker mentioned was the detection of students that digress, misunderstand, game the system, or disengage. It’s all in the data.

I would also like to believe that predicting the future leads to changing the future; at the very least, it could give users the air of being in control of their destination. As a promotional message this has quite some power. But even in support of reflection the same can be postulated: knowing past performance can help your future performance! So, once again, there is a strong overlap between the predictive and reflective applications of data analytics.

For me, all of this can only lead one way: instead of using efforts and energies to differentiate the two domains, which would only lead to reduced communities at both ends, and friction in between, we need to think big and marry them into one large community and domain: Let’s twin EDM and LA!

A near complete history of EC funded research

More transparency about where EU-funded research is going has always been desirable, as has knowing who is active in it and to what extent. ResearchRanking (beta) is an interesting attempt at rating European research institutions by participation in EU-funded projects. Total funding has been relatively steady over the past two decades:

[Image: EU funding stats]

The site allows search by institution to see how successful they have been in getting funding, whether as participants or as coordinators. When looking at my own institution, the data is still incomplete, only covering Framework Programmes, but since it is beta, I expect more to come. Still, it’s a good start and judging from how our project activities from the past are identified, it looks representative for the work we are doing.

Interesting to inspect are the ranking tables, where the usual suspects CNRS (France) and Fraunhofer (Germany) are leading the 2010 table.

In summary, the site provides opportunities for interesting browsing and it’s worth spending a few moments on it. Finally, it seems, the idea of Open Data has reached the European Commission and we can expect more insights into the workings of the ivory tower in Brussels.


Skype offers Wifi for pennies

This could be a game changer in the data roaming business! Data roaming in Europe has been a pain to say the least. I live in a border area between Belgium, the Netherlands and Germany and frequently cross over for all kinds of activities. Every time I have to take special care that my phone does not dial into a foreign data connection, else it would quickly become very expensive. Ubiquitous mobile learning, I hear you ask? – Forget it! This is a problem that I believe the US doesn’t have, where you can roam from state to state and sea to sea with the same data tariff the mobile telecom provides.

Skype now offers 1,000,000 Wifi access points at low cost from 5c – 16c per hour with unlimited data. This is a much better deal than the data transfer rates other telcos offer. The EU has long argued that data roaming is not competitively priced and the same telco charges the user x-times as much for the same service when they cross the border between EU countries. Now this offer from Skype may be the competitive offer that we all have been waiting for, and it will turn up the heat for telcos to provide a decent pricing structure. This could be a real enabler for ubiquitous mobile learning!

**Oops, I stand corrected: the Skype pricing is per minute not per hour, which, sadly, does not make the Skype offer as competitive as I mentioned! Sorry folks.
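To see how much the correction matters, the quoted rates can simply be multiplied out. This is a back-of-the-envelope sketch using only the 5c to 16c range mentioned above; the currency unit is not stated in the offer as quoted, so the figures are left unit-free.

```python
# Rates quoted in the post, in cents per minute (currency unit not specified).
LOW_CENTS, HIGH_CENTS = 5, 16

def session_cost(cents_per_minute, minutes=60):
    """Cost in cents for a session of the given length in minutes."""
    return cents_per_minute * minutes

low, high = session_cost(LOW_CENTS), session_cost(HIGH_CENTS)
print(f"one hour costs {low / 100:.2f} to {high / 100:.2f}")
```

An hour online at per-minute pricing therefore costs sixty times what the per-hour reading suggested.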

Jumping ship to Google+

While MOOC-ing about, I realised how most people I know already are hooked into Google web apps. Practically everyone I encounter on eduMOOC has a Gmail account, I receive invitations to Google Docs almost daily, and now everyone is jumping at Google+. Have people forgotten the black hat Google is wearing?!

My view is that people are so keen on Google+ not because it’s so perfect and so private, but because of the arrogant way Facebook has dealt with their customers. But this is perhaps a naive way of dealing with Google, which, not so long ago, was the bad guy of the Internet.

To give credit where it belongs, G+ does have its merits and the company has learned a lot from past failure, like Wave. There is also great appeal in the integration of the highly usable and good quality productivity services Google provides. Still, I do get the feeling that the lock-in gets tighter and the circle that Google is drawing around us gets narrower and narrower. It is certainly more and more difficult to “escape”.

Could we see Google emerge as the first virtual state, taking over the rule of what we are doing online? Will we see a Google virtual Prime Minister soon? With the identity infrastructure they mention in their plans it is certainly feasible. Where will this leave the rest of the Web – will there be anarchic outcasts, outposts of unregulated (un-googlified) web users?

I realise this sounds quite sci-fi for now, but wait and see…!

Identity infrastructure for the Web

In a post in early 2009, I anticipated the coming of a new Internet. Unlike people who thought Web 3.0 would give way to the Semantic Web, I long held that by maturing as a virtual society, the Internet would inevitably require identities around which this society would be structured, or would structure itself. Hence, I firmly believed and still do that we are going to see a Personal Web emerge over time. By this I do not so much mean that the web experience is being personalised, but that we have a singular accredited identity, just like we have in real life with our ID cards, social security numbers, etc.

Signs are that Google is going to lead the way there. In this interview, Google’s Eric Schmidt admits that they are working on an identity infrastructure for the web. So this may finally be the plot behind Google+. And certainly Chrome’s brand new browser identity management is a big step in this direction. Schmidt unmistakably talks about unique identities, perhaps with multiple personas and perhaps personalities. There are of course countless advantages in terms of convenience, personal safety, child protection, identity theft and fraud prevention. There are numerous disadvantages too, in terms of policing, tracking, or spying.

So far Google’s long term plans are still kept quiet, and while anonymous browsing and chatting might still stay around for a while, in the end this development might mean that someone who claims to be under the age of thirteen, might indeed be under age.

Facebook competition from Google+

Finally, there looks to be a real challenge to Facebook, coming – surprise surprise – from Google. Google have tried for years to get into the social networking business, but did not quite manage. Maybe the Google+ project can be more successful than Buzz or Orkut.

It actually looks quite promising. First and foremost, it doesn’t call everyone you meet a “friend”. Google recognises that we share different things with different sets of people we know (or don’t know). The name ‘circles’ is quite appropriate for this and reflects in my mind social reality much better than Facebook.

I agree with all three points of their assessment:

* We only want to connect with certain people at certain times, but online we hear from everyone all the time.
* Every online conversation (with over 100 “friends”) is a public performance.
* We all define “friend” and “family” differently—in our own way, on our own terms.

Google+ is still by invitation only, so we have to wait for what the sharing of content that they promise looks like, but at present at least it sounds good. I also expect the much bemoaned search and archive functionality in Facebook would be easily outperformed. Google+ will be available for Android, as an Apple App and as mobile web app, so plenty of mobility for connecting is guaranteed.

Since everyone on the eduMOOC learning network is already in Google with Google groups or other tools, I think a Google+ circle might just be the thing to do – provided Google+ opens for business anytime soon.

Mobile devices boost social activities

Social networks are, of course, available on PCs and laptops as well as handheld devices and tablets. Anything with a browser really. However, in a short survey we conducted with our students, 700 respondents confirmed what we suspected all along: mobile devices (and mobile apps in particular) boost social network activities. What is more, the more devices a person owns, the more active they are likely to be (see graphs below).

[Graphs: Twitter use, Facebook use, LinkedIn use]

Frequent use (blue) in the graphs above represents daily or weekly use of the service; rare use (red) stands for less than once a week; while no use of a service (green) is shown on top. Although explainable and expected, the conclusion may be somewhat skewed, because there are more people with only one mobile device than there are with 4 devices.

Some research questions for mobile learning

Mobile technologies change the ways we learn, work, and play. One day, they may fully replace stationary computer systems, at least in everyday activities. If we see mobile developments as a trajectory for the transition from stationary computing to fully flexible, nomadic, mobile computing, a number of challenges present themselves. These challenges lead to research questions we need to address:

Fragmentation is one of the challenges across all three perspectives. It encompasses the management and orchestration of fragmented infoscapes, learning networks, pedagogic strategies, and technical devices. The management of these environments is typically driven by user preferences, either individually, or by inter-personal consensus. The research question we derive from this is how we can better bridge fragmented mobile environments to achieve more effective learning.

A consequence of fragmentation is distraction and interruption. This is typically caused by having too many devices or activities on standby and alert. Monitoring a variety of information channels, receiving alerts, and the constant anxiety of missing the all-important information lead to information overload and distress. We need to ask ourselves what filter mechanisms can be developed and used to reduce this cognitive attention load.

Mobile devices encompass an increasing number of data sensors that allow for environmental perception never before experienced. There is great opportunity in this data, but also a number of issues (mainly relating to privacy and ownership). Exploiting the data produced by mobile devices and applications for learning analytics should become a priority for investigation. This would include automated context analysis and interaction monitoring. In my view this could lead to innovative approaches for personalisation and prediction.

Dummy computing and intelligent households

Two announcements at the Google developer conference made the rounds today: Android@home and Chromebooks are the Google vision of the future.

Chromebooks are basically dummy portable computers or tablets that do not have an operating system but run on the Chrome browser. This means you turn on the computer and it’s immediately on the net. According to Google this makes security issues, antivirus updates, data loss etc. a thing of the past. With thin-client technology like Citrix one could have a full desktop environment in a browser and everyone will be learning, working and playing in the Cloud.

So this sounds interesting, but! A natural born skeptic, I am waiting to see what this means in reality. For one thing, there is the anticipated cost of cloud computing: doing everything on the web weighs heavily on the charges for data transfer (add to these the extortionate roaming costs). There are payments for cloud services, and there is no guarantee that online office offerings like Zoho or Google Docs will stay free – especially when users are trapped on the Web without alternative. Finally, you would have to hand over your content, e.g. your digicam photos and videos, to some company or other in order to have access to it.

Google is probably right that it will bring down the cost of computing dramatically, and this would benefit learners from economically challenged backgrounds and the Third World. But does it also mean that when students use such computers in education, all their data is first passed to Google HQ, before it reaches the teacher? In the protective and fragile privacy environment that universities and schools operate in, this would be a stark violation of principles, which, at present, I cannot see happen.

A similar line of argument can be made for Android@home, which aims to equip ordinary household goods with internet-enabled intelligence and control. According to the news, this would lead to being able to operate the heating or lighting systems and other appliances remotely. The coolness factor aside, again privacy is the main concern: imagine the sheer amount of data Google and others would be able to gather from every household and its inhabitants’ behaviour. And who do you call when the dishwasher isn’t working – the plumber or the IT hotline?