Show HN: I built a tiny LLM to demystify how language models work

(github.com)

173 points | by armanified 4 hours ago

14 comments

ordinarily 1 hour ago
It's genuinely a great introduction to LLMs. I built my own a while ago, based on Milton's Paradise Lost: https://www.wvrk.org/works/milton
NyxVox 44 minutes ago
Hm, I can actually try the training on my GPU. One of the things I want to try next. Maybe a bit more complex than a fish :)
martmulx 58 minutes ago
How much training data did you end up needing for the fish personality to feel coherent? Curious what the minimum viable dataset looks like for something like this.
gnarlouse 58 minutes ago
I... wow, you made an LLM that can actually tell jokes?
cbdevidal 1 hour ago
> you're my favorite big shape. my mouth are happy when you're here.
Laughed loudly :-D
- vunderba 24 minutes ago
  This is a direct output from the synthetic training data, though. I wonder whether there is a bit of overfitting going on, or whether it's just a natural limitation of a much smaller model.
nullbyte808 2 hours ago
Adorable! Maybe a personality that speaks in emojis?
SilentM68 2 hours ago
Would have been funny if it were called "DORY", given the fish's memory problems and LLMs' similar recall issues :)
AndrewKemendo 2 hours ago
I love these kinds of educational implementations.
I really want to praise the (unintentional?) nod to Nagel: by limiting capabilities to the representation of a fish, the user is immediately able to understand the constraints. It can only talk like a fish because it's very simple.
Especially compared to public models, that's a really simple correspondence to grok intuitively (small LLM > only as verbose as a fish, larger LLM > more verbose), so kudos to the author for making that simple and fun.
- dvt 2 hours ago
  > the user is immediately able to understand the constraints
  Nagel's point was quite literally the opposite[1] of this, though. We can't understand what it must "be like to be a bat" because their mental model is so fundamentally different than ours. So using all the human language tokens in the world can't get us to truly understand what it's like to be a bat, or a guppy, or whatever. In fact, Nagel's point is arguably even stronger: there's no possible mental mapping between the experience of a bat and the experience of a human.
  [1] https://www.sas.upenn.edu/~cavitch/pdf-library/Nagel_Bat.pdf
  - AndrewKemendo 1 hour ago
    Different argument.
    I'm not going to argue, other than to say that you need to view the point from a third-party perspective evaluating "fish" vs "more verbose thing," such that the composition is the determinant of the complexity of interaction (which has a unique qualia, per Nagel).
    Hence why it's an "unintentional nod," not an instantiation.
LeonTing1010 35 minutes ago
[flagged]
- secabeen 30 minutes ago
  Training data is here:
  https://huggingface.co/datasets/arman-bd/guppylm-60k-generic
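
For anyone who wants to check vunderba's point about a sample being a direct output from the training data, here is a minimal sketch of a memorization check. It assumes you already have the training lines as a list of strings; the function names (`normalize`, `ngrams`, `memorization_report`) and the 4-gram window are illustrative choices, not anything taken from the project, and it does not load the dataset linked above.

```python
import re


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so formatting differences don't hide a match."""
    return re.sub(r"\s+", " ", text.strip().lower())


def ngrams(tokens, n):
    """Return the set of n-token windows in a token list."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def memorization_report(output: str, training_texts, n: int = 4) -> dict:
    """Report whether `output` appears verbatim in any training text, and what
    fraction of its n-grams also occur somewhere in the training texts."""
    norm_out = normalize(output)
    if not norm_out:
        return {"verbatim": False, "ngram_overlap": 0.0}

    out_grams = ngrams(norm_out.split(), n)
    verbatim = False
    seen = set()
    for text in training_texts:
        norm = normalize(text)
        if norm_out in norm:
            verbatim = True
        seen |= ngrams(norm.split(), n) & out_grams

    overlap = len(seen) / len(out_grams) if out_grams else 0.0
    return {"verbatim": verbatim, "ngram_overlap": overlap}


if __name__ == "__main__":
    # Hypothetical stand-in lines; replace with the real training texts.
    training = [
        "you're my favorite big shape. my mouth are happy when you're here.",
        "the water is warm today.",
    ]
    sample = "You're my favorite big shape. My mouth are happy when you're here."
    print(memorization_report(sample, training))
```

A verbatim hit, or an n-gram overlap close to 1.0 across many samples, would point toward memorization. Low overlap with fish-like phrasing would suggest the model is generalizing within its small vocabulary.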