5 comments

  • nycerrrrrrrrrr 5 hours ago
    This might be orthogonal to the TLB miss overhead you found, but have you looked at using P2PDMA to transfer directly from the NVMe SSDs to the NIC? Not sure how the CRC calculation would play into that.
    • jeffbee 7 hours ago
      It seems that you could have reached this conclusion faster by elaborating on your use of the profiler. Don't assume that cycles are spent on instructions. Look at your IPC and drill down into what CPU-bound means for your workload. In your case I think a standard top-down analysis would have made the virtual memory management cost jump right out.
      • MrCroxx 4 days ago
        Author here. This post is a write-up of a performance-debugging rabbit hole I hit while trying to saturate NICs with NVMe reads using io_uring and RDMA.

        The short version: READ_FIXED fixed the obvious per-I/O GUP overhead in a small demo, but the larger deployment still got stuck at roughly half of line rate. After ruling out io-wq backlog, request splitting, fd lookup, and CRC arithmetic, the actual wall turned out to be dTLB misses from scanning 1,028 KiB buffers backed by 4 KiB pages. Moving the read arena to hugepages brought the system close to NIC saturation.
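
        For a rough sense of scale, the page-count arithmetic behind that paragraph can be sketched as follows. This is only an illustration, not code from the post: the helper name `pages_touched` is hypothetical, and the 2 MiB hugepage size is an assumption (the common x86-64 default), since the size actually used is not stated here.

```python
KIB = 1024
BUFFER_BYTES = 1028 * KIB       # buffer size quoted in the comment above
BASE_PAGE = 4 * KIB             # 4 KiB base pages, as in the comment above
HUGE_PAGE = 2 * 1024 * KIB      # ASSUMPTION: 2 MiB hugepages, not stated in the source


def pages_touched(buffer_bytes, page_bytes, offset=0):
    """Count distinct pages covered by a buffer that starts `offset` bytes into a page.

    Hypothetical helper for illustration: each distinct page needs its own
    address translation, so this is the number of TLB entries one linear
    scan of the buffer would need.
    """
    first = offset // page_bytes
    last = (offset + buffer_bytes - 1) // page_bytes
    return last - first + 1


small = pages_touched(BUFFER_BYTES, BASE_PAGE)
huge = pages_touched(BUFFER_BYTES, HUGE_PAGE)
print(f"4 KiB pages per buffer: {small}")
print(f"2 MiB pages per buffer: {huge}")
```

        Under these assumptions, one pass over a buffer needs 257 translations with 4 KiB pages, versus one with an aligned 2 MiB hugepage (two if the buffer straddles a hugepage boundary). How many of those translations actually miss depends on the TLB sizes of the specific CPU.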

        The funny part is that an AI agent suggested hugepages early and got the optimization right, but its explanation was wrong. This post is mostly about reconstructing the evidence for why it worked.

        I’d be very interested in feedback from people who have used AI to debug performance issues in a complex system.

        • ozgrakkurt 6 hours ago
          I disagree with the AI part, because hugepages are one of the things anyone could guess would improve performance when working with a substantial amount of data.

          So anyone familiar with the space could have suggested something like that without knowing the details of the problem. Hence it is not useful advice IMO.

          That aside, the blog post was really cool to read and an instant favorite. I wish there were more English posts on the blog.

          I especially liked the hardware-limit-based expectations, the detailed measurements, and the writing style.

        • rnio 8 hours ago
          [flagged]
          • serious_angel 7 hours ago
            [flagged]
            • dang 1 hour ago
              You crossed into personal attack here, and that's not allowed on HN. It's not what this site is for, and destroys what it is for.

              If you'd please review the site guidelines and stick to them when posting, we'd appreciate it.

              https://news.ycombinator.com/newsguidelines.html

              • modslieulie 55 minutes ago
                Lies lies lies. Mods routinely break their own policies. You delete accounts that do not break any guidelines; you delete accounts because they say things you don't agree with. Lies lies lies.

                You allow government agents to do whatever they want on this site. Lies lies lies. You're so full of yourself, too; it's gross. Liar.

              • ozgrakkurt 6 hours ago
                It is obvious that the blog is of good quality if you have moderate knowledge of the subject and have read the post.
                • MasterScrat 6 hours ago
                  This sounds like a strong statement with little backing. The author does infra at DeepSeek if his LinkedIn is to be trusted, and is the author of Foyer.
                  • flipped 4 hours ago
                    [flagged]
                    • dang 1 hour ago
                      We've banned this account for repeatedly breaking the site guidelines and ignoring our request to stop.

                      If you don't want to be banned, you're welcome to email hn@ycombinator.com and give us reason to believe that you'll follow the rules in the future. They're here: https://news.ycombinator.com/newsguidelines.html.