LLM-generated labels for topic classification

Nick Hagar
6 min read · Mar 17, 2024

Read it first on my Substack

In a recent post, I trained a model to classify news articles by topic using just their URLs as input features. This approach, I argued, demonstrated the power of a lightweight dataset, provided the signal in the data was strong enough.

But even with this kind of model, there’s a limit to how lightweight the data can be. It still…

Nick Hagar

PhD student @ Northwestern University. I worked in digital media, now I study it.