When and how can ChatGPT help us save hand-labeling effort?

We spend a lot of time and money on students who annotate text data. Some of this data is used to train ML text classification models. Can ChatGPT save some of these resources?

We have a lot of annotated data from previous projects. It can be used to investigate whether, and under which conditions, the annotation task could have been done with ChatGPT.
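
One possible way to run that comparison is to score ChatGPT's labels against the existing human labels. The sketch below is only an illustration: the function name `agreement_report` and the two input lists are assumptions, and it reports raw accuracy plus Cohen's kappa (agreement corrected for chance) rather than any metric fixed by the project.

```python
from collections import Counter


def agreement_report(human_labels, model_labels):
    """Compare model labels with human labels for the same texts.

    Hypothetical helper: both arguments are equal-length lists of labels,
    one entry per annotated text.
    """
    if len(human_labels) != len(model_labels) or not human_labels:
        raise ValueError("need two non-empty label lists of equal length")

    n = len(human_labels)

    # Share of texts where the model picked the same label as the human.
    observed = sum(h == m for h, m in zip(human_labels, model_labels)) / n

    # Agreement expected by chance, from each side's label frequencies.
    human_freq = Counter(human_labels)
    model_freq = Counter(model_labels)
    expected = sum(human_freq[c] * model_freq[c] for c in human_freq) / (n * n)

    # Cohen's kappa; defined as 1.0 when both sides use one identical label.
    kappa = 1.0 if expected == 1.0 else (observed - expected) / (1.0 - expected)
    return {"accuracy": observed, "kappa": kappa}


report = agreement_report(["a", "a", "b", "b"], ["a", "a", "b", "a"])
print(report)
```

Breaking this report down by project, label, or text length would be one way to look for the conditions under which ChatGPT labels are good enough.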

Some playful ideas to investigate:

  • Is ChatGPT temporally reliable? When it is prompted again after some time (months), do the answers vary?
  • Can we use the temperature parameter to estimate the (hidden) certainty of an answer?
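
The temperature idea could be explored by asking for the same label several times at a temperature above zero and measuring how much the answers disagree. The following is a minimal sketch under assumptions: `ask_model` is a hypothetical callable that you would write around whichever chat API is used, and the agreement share and entropy are only proxies for certainty, not calibrated probabilities.

```python
import math
from collections import Counter
from typing import Callable


def label_certainty(ask_model: Callable[[str], str], text: str, n_samples: int = 10):
    """Sample a label n_samples times and summarise how stable the answer is.

    ask_model is a hypothetical wrapper (not a real library function) that
    sends `text` to the model with temperature > 0 and returns one label.
    """
    if n_samples < 1:
        raise ValueError("n_samples must be at least 1")

    votes = Counter(ask_model(text) for _ in range(n_samples))
    label, count = votes.most_common(1)[0]

    # Shannon entropy of the sampled labels: 0 bits means every sample agreed.
    entropy = sum(
        (c / n_samples) * math.log2(n_samples / c) for c in votes.values()
    )
    return {
        "label": label,                  # majority label
        "agreement": count / n_samples,  # share of samples voting for it
        "entropy_bits": entropy,
        "votes": dict(votes),
    }


# Usage with a stand-in for the real model call:
canned = iter(["pos", "pos", "neg", "pos"])
result = label_certainty(lambda text: next(canned), "some text", n_samples=4)
print(result)
```

Comparing low-agreement texts with the texts where ChatGPT and the student annotators disagree would show whether this signal is useful. The same tally could also be applied to answers collected months apart to look at stability over time.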
Max Callaghan
Researcher