If you haven’t noticed lately, there is a growing area of concern surrounding the field of learning analytics (also sometimes combined with artificial intelligence).
Of course, there has always been some backlash against analytics in general, but I definitely noticed at the recent Learning Analytics and Knowledge (LAK) conference that it was more than just a random concern raised here and there that you usually get at any conference. There were several voices loudly pointing out problems both online and in the back channel, as well as during in-person conversations at the conference.
Many of those questioning what they saw were people with deep backgrounds in learning theory, psychology, and the history of learning research. But it’s not just people pointing out how these aspects are missing from so much of the Learning Analytics field – it is also people like Dr. Maha Bali questioning the logic of how the whole idea is supposed to work in blog posts like Tell Me, Learning Analytics…
I have been known to level many of the current concerns at the Learning Analytics (LA) field myself, so I probably should spell out what exactly it is that I want from this field as far as improvement goes. There are many areas to touch on, so I will cover them in no particular order. This is just what comes to mind off the top of my head (probably formed by my own particular bias, of course):
Mandatory training for all LA researchers in the history of educational research, learning theory, educational psychology, learning science, and curriculum & instruction. Most of the concerns I heard voiced at any LAK I have attended were that these areas are sorely missing in several papers and posters. Some papers were even noticed as “discovering” basic educational ideas, like the idea that students who spend more time in a class perform better. We have known this from research for decades, so… why was this researched in the first place? And why was none of this earlier research cited? But you see this more than you should in papers and posters in the LA field – little to no theoretical backing, very few practical applications, no connection to psychology, and so on. This is a huge concern, because the LAK Conference Proceedings is in the Top 10 Educational Technology journals as ranked by Google. But so many of the articles published there would not even make it past peer review in many of the other journals in the Top 10 because of their lack of connection to theory, history, and practice. This is not to say these papers are lacking rigor for what they include – it is just that most journals in Ed-Tech require deep connections to past research and existing theory to even be considered. Other fields do not require that, so it is important to note this. Also, as many have pointed out, this is probably because of the Computer Science connection in LA. But we can’t forego a core part of what makes human education, well… human… just because people came from a background where those aspects aren’t as important. They are important to what makes education work, so just like a computer engineer that wants to get into psychology would have to learn the core facets of psychology to publish in that area, we should require LA researchers to study the core educational topics that the rest of us had to study as well.
This is, of course, something that could be required to change many areas in Education itself as well – just having an education background doesn’t mean one knows a whole lot about theory and/or educational research. But I have discussed that aspect of the Educational world in many places in the past, so now I am just focusing on the LA field.
Mandatory training for all LA researchers in structural inequalities and the role of tech and algorithms in creating and enforcing those inequalities. We have heard the stories about facial recognition software not recognizing black faces. We know that algorithms often contain the biases of their creators. We know that even the perfect algorithms have to ingest imperfect data that will contain the biases of those that generated it. But it’s time to stop treating equality problems as an afterthought, to be fixed only when they get public attention. LA researchers need to be trained in recognizing bias by the people that have been working to fight the biases themselves. Having a white male instructor mention the possibility of bias here and there in LA courses is not enough.
Require all LA research projects to include instructional designers, learning theorists, educational psychologists, actual instructors, real students, people trained in dealing with structural inequalities, etc. as part of the research team from the very beginning. Getting trained in all of the fields I mentioned above does not make one an expert. I have had several courses on educational psychology as part of my instructional design training, but that does not make me an expert in educational psychology. We need a working knowledge of other fields to inform our work, but we also need to collaborate with experts as well. People with experience in these fields should be a required part of all LA projects. These don’t all have to be separate people, though. A person that teaches instructional design would possibly have experience in several areas (practical instruction, learning theory, structural inequality, etc.). But you know whose voice is incredibly rare in the LA research? Students. Their data traces DO NOT count as their voice. Don’t make me come to a conference with a marker and strike that off your poster for you.
Be honest about the limitations and bias of LA. I read all kinds of ideas for what data we need in analytics – from the idea that we need more data to capture complex ways learning manifests itself after a course ends, to the idea that analytics can make sense of the world around us. The only way to get more (or better) data is to increase surveillance in some way or form. The only way to make more sense is to get more data, which means… more surveillance. We should be careful not to turn our entire lives into one mass of endless data points. Because even if we did, we wouldn’t be capturing enough to really make sense of the world. For example, we know that click stream data is a very limited way to determine activity in a course. A click in an online course could mean hundreds of different things. We can’t say that this data tells us what learners are doing or watching or learning – only what they are clicking on. Every data point is just that – a click or contact or location or activity with very little context and very little real meaning by itself. Each data point is limited, and each data point has some type of bias attached to it. Getting more data points will not overcome limitations or bias – it will collect and amplify them. So be realistic and honest about those limitations, and expose the bias that exists.
Commit to creating realistic practical applications for instructors and students. So many LA projects are really just ways to create better reports for upper level admin. Either that, or ways to try and decrease drop-outs (or increase persistence across courses as the new terminology goes). The admin needs their reports and charts, so you can keep doing that. But educators need more than drop-out/persistence stuff. Look, we already have a decent to good idea what causes those issues and what we can do to improve them. Those solutions take money, and throwing more data at them is not going to decrease the need for funding once a more data-driven problem (which usually looks just like the old problems) is identified. Please: don’t make “data-driven” become a synonym for “ignore past research and re-invent the wheel” in educators’ eyes. Look for practical ways to address practical issues (within the limitations of data and under the guiding principle of privacy). Talk to students, teachers, learning theorists, psychologists, etc. while you are just starting to dig into the data. See what they say would be a good, practical way to do something with the data. Listen to their concerns. Stop pushing for more data when they say stop pushing.
Make protecting privacy your guiding principle. Period. So much could be said here. Explain clearly what you are doing with the data. Opt-in instead of opt-out. Stop looking for ways to squeeze every bit of data out of everything humans do and say (it’s getting kind of gross). Remember that while the data is incomplete and biased, it is still a part of someone else’s self-identity. Treat it that way. If the data you want to collect were actual physical parts of a person in real life – would you walk around grabbing it off of them the way you are collecting data digitally now? Treat it that way, then. Or think of it this way: if data was the hair on our heads, are you trying to rip or cut it off of people’s heads without permission? Are you getting permission to collect the parts that fall to the floor during a haircut, or are you sneaking into hair cutting places to try and steal the stuff on the floor when no one is looking? Or even worse – are you digging through the trash behind the hair salon to find your hair clippings? Also – even when you have permission – are you assuming that just because the person who got the hair cut is gone, that this means the identity of each hair clipping is protected… or do you realize that there are machines that can identify DNA from those hair clippings still?
Openness. All of what I have covered here will require openness – with the people you collect data from, with the people you report the analytical results to, with the general public about the goals and results, etc. If you can’t easily explain the way the algorithms are working because they are so complex, then don’t just leave it there. Spend the time to make the algorithms make sense, or change the algorithm.
There are probably more that I am missing, or ways that I failed to explain the ones I covered correctly. If you are reading this and can think of additions or corrections, please let me know in the comments.
Author: Matt Crosslin
Matt is currently an Instructional Designer II at Orbis Education and a Part-Time Instructor at the University of Texas Rio Grande Valley. Previously he worked as a Learning Innovation Researcher with the UT Arlington LINK Research Lab. His work focuses on learning theory, heutagogy, and learner agency. Matt holds a Ph.D. in Learning Technologies from the University of North Texas, a Master of Education in Educational Technology from UT Brownsville, and a Bachelor of Science in Education from Baylor University. His research interests include instructional design, learning pathways, sociocultural theory, heutagogy, virtual reality, and open networked learning. He has a background in instructional design and teaching at both the secondary and university levels and has been an active blogger and conference presenter. He also enjoys networking and collaborative efforts involving faculty, students, administration, and anyone involved in the education process.