Practical text generation and writing assistants

January 8, 2019 — March 23, 2023

faster pussycat
language
machine learning
NLP
real time
signal processing
stringology
UI

Friendly end-user interfaces for text generation with large language models.

NB, since I wrote this, ChatGPT came online and is changing the world. I will not be commenting on that, because it is a phenomenon with independent momentum, and sufficient input from the internet commentariat.

Worth making a listicle here, because this domain is awash with spam: many writing-assistant tools are written by spammers, to assist spammers. Secondary markets serve grey-area applications like university essay generators. As someone with a stated position against university essays, these purposes do not seem morally distinct to me; they are all spam. Nonetheless, as with any taboo-adjacent market, it is hard to filter its entrants for quality.

To be clear, I do not want to write spam or essays (although, what isn’t spam? Surely history is the judge of that), but I do want a writing assistant with a smooth and helpful UI.


1 Hacks

See LLM hacks for some workarounds for the limitations of the current state of the art.

2 Text generation tools

Spammy recommendations from Reddit, all essentially obsolete now.

3 For science in particular

Colleagues recommend the generic tool QuillBot AI to ease the writing of papers.

There is, as far as I know, only one science-specific text-generating large language model, and it has occasioned a public furore. See the Galactica saga.

A lot of the coverage has been negative. Perhaps I am missing something, but I do not get why.

Here is some stridently negative coverage: Why Meta’s latest large language model only survived three days online.

Galactica was supposed to help scientists. Instead, it mindlessly spat out biased and incorrect nonsense.

I guess the presumption here is that large language models should do science, rather than help us write science. I must have missed the memo announcing that a sufficiently large neural network would deduce the laws of physics, sociology, and so on.

Unfortunately, real scientists spit out biased and incorrect nonsense all the time, as do people impersonating real scientists, and we already spend a lot of time addressing that. Lowering the cost of producing mindless bigotry might be a problem, I suppose, if we are concerned about conference and journal reviewers being overwhelmed by low-quality pseudo-research… but I cannot really imagine that being a huge problem: if a given researcher regularly produces crap, they can easily be blacklisted via their institutional email address etc. What am I missing?

Is it that members of the public might be confused by spammy science-impersonating PDFs? I suppose that is a concern, in which case it is another argument for reducing the importance of journal publisher paywalls. After all, academic publishers justify their existence largely in terms of the importance of their gatekeeping and verification functions.

4 Running LLMs at home
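For concreteness, here is a minimal sketch of local generation using the Hugging Face transformers library. The gpt2 model is illustrative only, chosen because it is small enough to run on a laptop CPU; substitute whatever fits your hardware.

```python
# Minimal local text-generation sketch using Hugging Face transformers.
# The model name is illustrative; swap in whatever fits your hardware.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

prompt = "A good writing assistant should"
result = generator(prompt, max_new_tokens=40, do_sample=True)
print(result[0]["generated_text"])
```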

5 Grammar and style checking
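Grammar checking is scriptable locally too; here is a minimal sketch using the open-source LanguageTool engine via its language_tool_python wrapper. The example sentence is my own; the wrapper downloads a local LanguageTool server on first use.

```python
# Minimal grammar-checking sketch with the open-source LanguageTool
# engine via the language_tool_python wrapper (assumed installed;
# it fetches a local LanguageTool server on first use).
import language_tool_python

tool = language_tool_python.LanguageTool("en-US")
text = "Academic publisher justify their existence in term of gatekeeping."
for match in tool.check(text):
    print(match.ruleId, "->", match.message)
print(tool.correct(text))  # text with top suggestions applied
tool.close()
```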

6 References

Stiennon, Ouyang, Wu, et al. 2020. “Learning to Summarize with Human Feedback.” In Advances in Neural Information Processing Systems.
Taylor, Kardas, Cucurull, et al. 2022. “Galactica: A Large Language Model for Science.”