Index for AI research blogs / Geoff Davis London
I am doing research into computer text generation, specifically machine-learning-based text generation, in the field generally known as AI (artificial intelligence), at UAL CCI Camberwell, London. My research supervisor is Professor Mick Grierson (see credits at bottom). I’m also making other practice-based artworks for an exhibition in 2021.
I previously programmed ‘story generators’ in the days of home micros. The first computer poems were generated in 1953, so text and computing have a long history.
2020 research question: what happens when writers use a computer text generator to help write various types of articles? What are their experiences when doing this hybrid activity? This is open-ended (no hypothesis) and multi-modal research.
Apart from the practicalities of using a text generator, I also addressed plagiarism and ‘fake news’ (see below), and examined how the writer’s occupation (poet, copywriter) affected response.
Study 1 – August 2020
First I experimented with several AI text generators, new and old, large and small. These included, running locally, the famous Karpathy 100-line system, and new Google Colab GPT-2 Transformer systems (see refs at bottom). At the end of my first academic year, I decided to do practical research into the usability of text generation tools in an experimental study. This would gather feedback from actual writers (stakeholders, in the jargon) who would be affected by these techniques.
Text generators aren’t new (data-driven sports and financial article generators have existed since 2016, and my own story generator dates from 1985), but this is one of the first studies of their effects on writers, which are ethical, political and economic.
The study had three text generation and editing experiments, and included many feedback questions. In these blogs I’m presenting some early findings in an informal way.
Most of these writers (89%) had never used a text generator before. It is easy, when working in the computing domain, to assume everyone knows about advances, but they do not.
This has a plan of the study pages, and more detail.
Blogs on topics
Overall emotions for the three different text experiments
What were the emotional effects of using the text generator?
- The actual text – what was generated (soon)
Examples of edited generated text – the hybrid works – are illuminating, and these are examined. Some people put my name into the generator, or added it later, to give comical effects.
- Overall effects – relationships between experiments (soon)
There are various relationships in the data, summarised here.
- Engagement with hybrid CST writing tool – devise a single rating (soon)
Questions after the three text experiments
- Enjoyment of text generation Q1
- Emotions and feeling when using text generator Q2
- Ownership and Plagiarism of generated texts Q3 and Q5
(The Others’ Work and Sell as Own Work questions; note that Sell was a reverse-scored Likert item, and was also out of sequence to check attention.)
- Use in a word processor or creativity support tool Q4
- Final feedback comments Q6/1
- ‘Fake news’ – what do real writers think about it? Q6/2
Even defining fake news is not simple: if you look for examples in the USA or Syria, each side accuses the other of using it. See below for a quick summary of ‘fake news’. It is very topical, being a new version of propaganda, an age-old subject.
Blogs – related
For more information on ‘fake news’ research
These are the most recent papers on fake news:
What is ‘fake news’?
Generally, fake news is false or distorted information used as propaganda for immediate or long-term political gain. This definition separates it from advertising, which uses similar techniques but serves brand promotion rather than matters of life and death. The falsity can be anything from a distorted opinion to a website that appears to be proper news but is actually run to spread disinformation.
This has led to ‘reality checks’, where claims are checked against reality (which still exists). The problem is that fake news spreads very quickly, because it appeals to the emotions, often fear or hate, while corrections (fact checks) take time to produce and are very dull. Hardly anyone reads them: they have no emotional content and arrive days after the original hot news item.
People usually think they can tell fake news from real news. Whatever appears in their preferred channels, which are mostly mainstream liberal, left-wing or right-wing, must be correct, and ‘fake news’ appears only in the opposing side’s channels. This is how it plays out, with each side of any debate accusing the other of believing or producing ‘fake news’.
By the time a fact check appears, the people who reacted to the fake item (if it is proven false) have already moved on to the next one, while the people who would be outraged by it often only hear about it in the fact-check rebuttal.
This study was devised and the study site programmed by Geoff Davis for post-graduate research at University of London UAL CCI 2019-2020. Research supervisor is Professor Mick Grierson.
A publicly available text generator was used in the study experiments, as this is the sort of system people might use outside of the study.
It was also not practical to recreate (program, train, fine-tune, host) a large-scale text generation system for this usability pre-study.
Fabrice Bellard, coder of Text Synth:
Text Synth is built using the GPT-2 language model released by OpenAI. It is a neural network of 1.5 billion parameters based on the Transformer architecture.
GPT-2 was trained to predict the next word on a large database of 40 GB of internet texts. Thanks to myriad web writers for the training data and OpenAI for providing their GPT-2 model.
Permission was granted to use Text Synth in the study by Fabrice Bellard July 7 2020.
Visit OpenAI’s blog for more information on OpenAI’s text generation research.
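The ‘predict the next word’ objective described above can be illustrated at toy scale. The sketch below is not GPT-2 (which uses a 1.5-billion-parameter neural network); it is a minimal bigram (Markov chain) word predictor in Python, with an invented one-line training text, showing the same basic principle of sampling each next word from what followed it in the training data.

```python
import random
from collections import defaultdict

# Toy next-word predictor: a bigram Markov chain.
# GPT-2 performs the same task (predict the next token) with a large
# neural network instead of a lookup table of observed continuations.

def train(text):
    """Record which words followed which in the training text."""
    words = text.split()
    model = defaultdict(list)
    for current, nxt in zip(words, words[1:]):
        model[current].append(nxt)
    return model

def generate(model, start, length=8, seed=0):
    """Sample a chain of next words, starting from `start`."""
    rng = random.Random(seed)
    out = [start]
    for _ in range(length):
        choices = model.get(out[-1])
        if not choices:  # dead end: no observed continuation
            break
        out.append(rng.choice(choices))
    return " ".join(out)

# Invented miniature 'corpus' for illustration only.
corpus = "the dog saw the man and the man saw the dog and the dog ran"
model = train(corpus)
print(generate(model, "the"))
```

With a table of counts this small, the output quickly becomes repetitive nonsense; scaling the same predict-the-next-word idea up to 40 GB of internet text and a Transformer network is what makes GPT-2’s output fluent.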
Image prompt man and dog photo
The image is from public sources under a free license; it is an old image. See the Man and Dog blog.
All material on this website is copyright Geoff Davis London 2020/21.