Comparing local large language models for alt-text generation

February 28, 2025

The use of LLMs to generate alt text is somewhat contentious, with concerns about the accuracy of the descriptions generated. But automating a task that has proved challenging to get humans to do, for whatever reason, may give us far more alt text than we get at present, particularly on social media.

At Conffab we use LLMs to provide text descriptions of images in presentation slides for our accessible slides feature. We originally did this by hand, but when a conference might have many thousands of slides, a reasonable percentage of which contain images of some sort, this is a very costly and time-consuming exercise.

An LLM-based approach sped this up very significantly. We do check the output to ensure the descriptions are good, but we have found they are often better than those written by humans, especially for complex diagrams and charts. Here Dries Buytaert, the founder of Drupal, tested 10 LLMs on this task, and concluded in a follow-up post:

Trusting AI to describe my photos wasn’t easy. But after 9,000 images, I had to admit: it often did the job better than me, and at a fraction of the cost.