The long tail of LLM-assisted decompilation

(blog.chrislewis.au)

27 points | by knackers 4 hours ago

6 comments

nemo1618 1 minute ago

IMO this is one of the best use cases for AI today. Each function is like a separate mini problem with an explicit, easy-to-verify solution, and the goal is (essentially) to output text that resembles what humans write -- specifically, C code, which the models have obviously seen a lot of. And no one is harmed by this use of AI; no one's job is being taken. It's just automating an enormous amount of grunt work that was previously impossible to automate.
I'm part of the effort to decompile Super Smash Bros. Melee, and a fellow contributor recently wrote about how we're doing agent-based decompilation: https://stephenjayakar.com/posts/magic-decomp/
bri3d 11 minutes ago

Claude is doing the decompilation here, right? Has this been compared against using a traditional decompiler with Claude in the loop to improve decompilation and ensure matched results? I would think that Claude’s training data would include a lot more pseudo-C <-> C knowledge than MIPS assembler from GCC 2.7 and C pairs, and even if the traditional decompiler was kind of bad at N64 it would be more efficient to fix bad decompiler C than assembler.
decidu0us9034 54 minutes ago

"Claude struggles with large functions and more or less gives up immediately on those exceeding 1,000 instructions." Well, yeah, that's the thing, an n64 game, that's C targetting an architecture where compiler optimizations are typically lacking, the idomatic style is lots of small tightly-scoped functions and the system architecture itself is a lot simpler than say a modern amd64 pc... These days I often just feel like, why is this person telling me how easy my job is now when they seemingly don't know much about it. I just find it arrogant and insulting... Perpetually demo season.
OptionOfT 56 minutes ago

I'm really excited about this, especially for games for which the source code was lost like Red Alert 2.
amelius 33 minutes ago

Does this technique limit the LLM to correctness-preserving transforms?
roelljr 13 minutes ago

[dead]