Index of AI research


The writing task asked respondents to generate text, then copy it into a text editor and rewrite it to create a hybrid work. The text generator allowed regeneration but no fine-grained editing, so by observing what happened in the editing window it was possible to see whether any editing had taken place.
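The detection idea reduces to a simple comparison: if the text a respondent saved differs from the text they generated, some editing took place. A minimal sketch, with illustrative function and variable names (not the study's actual tooling):

```python
def was_edited(generated: str, saved: str) -> bool:
    """Return True if the respondent changed the generated text before saving."""
    # Collapse whitespace so that only real wording changes count as edits.
    return " ".join(generated.split()) != " ".join(saved.split())

generated = "The moon hung low over the silver lake."
saved = "The moon hung low over the silver, silent lake."
print(was_edited(generated, saved))  # True: words were added
```

In practice the comparison was done by replaying the session recordings rather than diffing strings, but the criterion is the same.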

Results summary

One third of respondents edited the generated text; two thirds did not. In other words, two thirds of people ignored the task instructions, which were to generate, then edit, then save.

Perhaps they found the generated text sufficient without editing. Generate-then-edit is the method used in previous studies of writers working with generated text (Calderwood et al., 2020; Gero and Chilton, 2019).

Respondents who didn’t edit the text were less actively engaged in the experiments, as shown by the three factors below: time on study, depth (number of experiments completed, out of three), and feedback ratings (Likert).

Creative occupations (e.g. poets) were over a third more likely to edit than non-creative ones (e.g. copywriters).

Details – Method

I used an analytics service (FullStory 2020, see references) for live cursor tracking on the web page. These page sequences can then be recorded for later inspection. This remains anonymous, as there was no direct link to the work, but connections could be made by reading the written text and searching for it in the data. (‘No data’ means the record did not have full page recording; some records were deleted by the free service before logging.)


Edited or not edited


One third of respondents edited the generated text, two thirds did not.

Time on Study

Edited-not edited time on study

People who generated text but didn’t subsequently edit were less engaged in the experiments, as shown in the graph below: non-editors spent only about 40% as much time on the study as editors.
Average time for editors: 39m 38s; for non-editors: 15m 42s.
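As a sanity check on the time figures, durations in the ‘MMm SSs’ format can be parsed and averaged directly. The per-respondent times below are hypothetical, chosen only so that their mean matches the reported editor average of 39m 38s:

```python
import re
from statistics import mean

def parse_minutes(s: str) -> float:
    """Convert an 'MMm SSs' duration string into minutes as a float."""
    m = re.fullmatch(r"(\d+)m (\d+)s", s)
    return int(m.group(1)) + int(m.group(2)) / 60

# Hypothetical per-respondent times whose mean is 39m 38s (39.63 minutes).
editor_times = ["45m 10s", "34m 06s"]
print(round(mean(parse_minutes(t) for t in editor_times), 1))  # → 39.6
```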

Depth – number of experiments completed

Edited how many experiments

This clearly shows that nearly three-quarters of respondents who did not edit the generated text (red) completed only the first experiment, while most who edited the text (blue) completed all three.

Likert feedback scales

A simple average of all nine Likert scales (the ‘Selling as Own’ question was reverse-scored). 1 is positive, 5 is negative, 3 is neutral.

Average for editing respondents: 2.5
Average for non-editing respondents: 2.4

Both are about the same, slightly positive.
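The averaging step, including the reverse-scored item, can be sketched as follows. The question labels other than ‘Selling as Own’ are placeholders, and the responses are hypothetical:

```python
from statistics import mean

def average_likert(scores: dict, reversed_items: set) -> float:
    """Average 1-5 Likert responses, flipping reverse-keyed items (x -> 6 - x)."""
    adjusted = [6 - v if k in reversed_items else v for k, v in scores.items()]
    return mean(adjusted)

# Hypothetical responses; only 'Selling as Own' is reverse-scored.
responses = {"Q1": 2, "Q2": 3, "Selling as Own": 4}
print(round(average_likert(responses, {"Selling as Own"}), 2))  # → 2.33
```

On a 1-5 scale, reversing means mapping 1↔5 and 2↔4, i.e. replacing a score x with 6 − x, so agreement with the reversed question counts toward the negative end.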



This compares the numbers identified as Creative vs Non-creative occupations (e.g. poets vs copywriters). There is a preference for editing amongst Creative writers: over a third (38%) more likely to edit.
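“Over a third more” here is a relative increase in editing rates between the two groups. A small sketch, with hypothetical rates chosen so the relative increase comes out at the reported 38%:

```python
def relative_increase(rate_creative: float, rate_other: float) -> float:
    """Fractional increase of the creative group's editing rate over the other group's."""
    return (rate_creative - rate_other) / rate_other

# Hypothetical editing rates: 46% of creatives vs one third of non-creatives.
print(round(relative_increase(0.46, 1 / 3), 2))  # → 0.38
```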

Engagement – Note

Overall study engagement, combining various measures of positive interaction and regard, is assessed in another blog post.


FullStory. 2020. FullStory Analytics.

Alex Calderwood, Vivian Qiu, Katy Ilonka Gero, and Lydia B. Chilton. 2020. How Novelists Use Generative Language Models: An Exploratory User Study. In IUI ’20 Workshops, March 17, 2020, Cagliari, Italy. ACM, New York, NY, USA, 5 pages.

Katy Ilonka Gero and Lydia B. Chilton. 2019. Metaphoria: An Algorithmic Companion for Metaphor Creation. In CHI Conference on Human Factors in Computing Systems Proceedings (CHI 2019).