11 comments

  • cpard 14 hours ago
    Servethehome[1] does a bit of a better job describing what maverick-2 is and why it makes sense.

    [1]https://www.servethehome.com/nextsilicon-maverick-2-brings-d...

    • phkahler 3 hours ago
      That's a fairly specialized chip, and it requires a bunch of custom software. The only way it can run apps unmodified is if the math libraries have been customized for this chip. If the performance is there, people will buy it.

      For a minute I thought maybe it was RISC-V with a big vector unit, but it's way different from that.

      • ac29 1 hour ago
        The article says they are also developing a RISC-V CPU.
        • stogot 2 hours ago
          The quote at the end of the posted Reuters article (not the one you’re responding to) says that it doesn’t require extensive code modifications. So is the “custom software” standard for the target customers of NextSilicon?
          • jll29 1 hour ago
            Companies often downplay the amount of software modifications necessary to benefit from their hardware platform's strengths because quite often, platforms that cannot run software out of the box lose out compared to those that can.

            In the past, by the time such special chips were completed and mature, the developers of "mainstream" CPUs had typically caught up on speed, which is why we do not see any "transputers" (e.g. Inmos T800), LISP machines (Symbolics XL1200, TI Explorer II), or other odd architectures like the Connection Machine CM-2 around anymore.

            For example, when Richard Feynman was hired to work on the Connection Machine, he had to write a parallel version of BASIC first before he could write any programs for the computer they were selling: https://longnow.org/ideas/richard-feynman-and-the-connection...

            This may also explain failures like Bristol-based CPU startup Graphcore, which was acquired by Softbank, but for less money than the investors had put in: https://sifted.eu/articles/graphcore-cofounder-exits-company...

        • klooney 1 hour ago
          They've got a "Mill Core" in there; is the design related to the Mill Computing design?
          • damageboy 25 minutes ago
            Yeah, it's an unfortunate overlap. In NextSilicon terminology, the Mill-Core is, so to speak, the software-defined "configuration" of the chip: it represents the swaths of the application deemed worthy of acceleration, as expressed on the custom HW.

            So really, the Mill-Core is in a way the expression of the customer's code.

            • jecel 33 minutes ago
              They are completely different designs, but the name is inspired by the same source: the Mill component in Charles Babbage's Analytical Engine.
          • dlcarrier 13 hours ago
            https://archive.is/6j2p4

            I can't access the page directly, because my browser doesn't leak enough identifying information to convince Reuters I'm not a bot, but an actual bot is perfectly capable of accessing the page.

          • wood_spirit 12 hours ago
            The other company I can think of focusing on FP64 is Fujitsu with its A64FX processor. This is an ARM64 chip with really meaty SIMD that gets about 3 TFLOPS of FP64.

            I guess it is hard to compare chip for chip, but the question is: if you are building a supercomputer (and we ignore pressure to buy sovereign), which is the better bang for the buck on representative workloads?

            • wmf 2 hours ago
              If Fujitsu only releases one processor every 8 years, they're going to be behind most of the time.
            • dajonker 5 hours ago
              Even if the hardware is really good, the software needs to be even better if they want to succeed.

              Support for operating systems, compilers, programming languages, etc.

              This is why a Raspberry Pi is still so popular even though there are a lot of cheaper alternatives with theoretically better performance. The software support is often just not as good.

              • wood_spirit 5 hours ago
                Their customers are building supercomputers?
                • daveguy 4 hours ago
                  The implication wasn't to use the Raspberry Pi toolchain, just that toolchains are required and are a critical part of developing for new hardware. The Intel/AMD toolchain they will be competing with is even more mature than the rpi's. And toolchain availability and ease of use make a huge difference whether you are developing for supercomputers or embedded systems. From the article:

                  "It uses technology called RISC-V, an open computing standard that competes with Arm Ltd and is increasingly being used by chip giants such as Nvidia and Broadcom."

                  So the fact that rpi tooling is better than the imitators' and that it has maintained a significant market-share lead is relevant. Market share isn't just about performance and price; it's also about ease of use and the network effects that come with popularity.

              • shrubble 13 hours ago
                Curious if the architecture is similar to what is called “systolic” as in the Anton series of supercomputers: https://en.wikipedia.org/wiki/Anton_(computer)
                • fentonc 46 minutes ago
                  I was an architect on the Anton 2 and 3 machines - the systolic arrays that computed pairwise interactions were a significant component of the chips, but there were also an enormous number of fairly normal looking general-purpose (32-bit / 4-way SIMD) processor cores that we just programmed with C++.
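
                  For anyone unfamiliar with the workload: the all-pairs pattern those arrays streamed through looks roughly like the plain-C sketch below. This is a toy 1-D illustration of the access pattern only, not Anton's actual fixed-function pipelines.

                      #include <stddef.h>

                      /* Toy sketch of the O(N^2) pairwise-interaction pattern
                         (e.g. nonbonded forces in molecular dynamics). Every
                         (i, j) pair is independent, which is what lets the work
                         stream through a fixed pipeline of ALUs. */
                      void all_pairs_1d(size_t n, const float x[], float f[]) {
                          for (size_t i = 0; i < n; i++)
                              for (size_t j = 0; j < n; j++)
                                  if (i != j) {
                                      float d = x[j] - x[i];
                                      /* toy softened "force"; not a real potential */
                                      f[i] += d / (d * d + 1e-6f);
                                  }
                      }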
                  • damageboy 22 minutes ago
                    Not really. I work for NextSilicon. It's a data-flow oriented design. We will gradually make more details available that explain this.
                    • le-mark 5 hours ago
                      I spent a lot of time on systolic arrays to compute cryptocurrency PoW (BLAKE2 specifically). It’s an interesting problem and I learned a lot, but I made no progress. I’ve often wondered if anyone has done the same?
                    • gdiamos 11 hours ago
                      I find it helpful to read a saxpy and GEMM kernel for a new accelerator like this - do they have an example?
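
                      For reference, here's what plain-C saxpy looks like before it's mapped onto any accelerator (just the textbook baseline; NextSilicon hasn't published kernels as far as I can tell):

                          #include <stddef.h>

                          /* saxpy: y = a*x + y, the usual "hello world" of
                             accelerator programming models. Each iteration is
                             independent, so it's a natural first porting target. */
                          void saxpy(size_t n, float a, const float *x, float *y) {
                              for (size_t i = 0; i < n; i++)
                                  y[i] = a * x[i] + y[i];
                          }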
                      • shusaku 14 hours ago
                        If there really is enough market demand for this kind of processor, it seems like someone like NEC, who still makes vector processors, would be better positioned than a startup rolling RISC-V.
                        • damageboy 14 hours ago
                          I work at NS. The RISC-V CPU was the "one more thing" aspect of the "reveal".

                          The main product/architecture discussed has nothing to do with vector processors or RISC-V.

                          It's a new, fundamentally different data-flow processor.

                          Hopefully we will get better at explaining what we do and why people may want to care.

                          • joha4270 10 hours ago
                            So, a Systolic Array[1] spiced up with a pinch of control flow and a side of compiler cleverness? At least that's the impression I get from the servethehome article linked upthread. I wasn't able to find non-marketing, better-than-sliced-bread technical details in 3 minutes of poking at your website.

                            [1]: https://en.wikipedia.org/wiki/Systolic_array

                            • damageboy 7 minutes ago
                              I can see why systolic arrays come to mind, but this is different. While there are indeed many ALUs connected to each other in both a systolic array and a data-flow chip, data-flow is usually more flexible (at the cost of complexity) and the ALUs can be thought of as residing on some shared fabric.

                              Systolic arrays often (always?) have a predefined communication pattern and are often used in problems where data that passes through them is also retained in some shape or form.

                              For NextSilicon, the ALUs are reconfigured and rewired to express the application (or parts of it) on the parallel data-flow accelerator.
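
                              A toy sketch of the firing-rule difference (nothing to do with our actual implementation, just the textbook distinction):

                                  /* Systolic: a FIXED pipeline; every step, data marches
                                     one hop to a predefined neighbor. */
                                  void systolic_step(float stage[4], float in) {
                                      for (int i = 3; i > 0; i--)
                                          stage[i] = stage[i - 1] + 1.0f; /* fixed op, fixed wiring */
                                      stage[0] = in;
                                  }

                                  /* Data-flow: a node on a shared fabric fires whenever its
                                     operands have arrived, wherever they came from; op and
                                     wiring are reconfigurable. */
                                  typedef struct { float a, b; int has_a, has_b; } node_t;

                                  int dataflow_fire(node_t *n, float *out) {
                                      if (n->has_a && n->has_b) {   /* firing rule: operands ready */
                                          *out = n->a * n->b;
                                          n->has_a = n->has_b = 0;
                                          return 1;                 /* fired */
                                      }
                                      return 0;                     /* stall until data arrives */
                                  }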

                              • CheeseFromLidl 8 hours ago
                                Are the GreenArray chips also systolic arrays?
                                • ripe 7 hours ago
                                  My understanding is no, assuming I know what people mean by systolic arrays.

                                  GreenArray processors[1] are complete computers with their own memory and running their own software. The GA144 chip has 144 independently programmable computers with 64 words of memory each. You program each of them, including external I/O and routing between them, and then you run the chip as a cluster of computers.

                                  [1] https://greenarraychips.com

                                  • rkagerer 2 hours ago
                                    Reminds me a bit of the Parallax Propeller chip.
                              • slwvx 14 hours ago
                                Text on the front page of the NS website* leads me to think you have a fancy compiler: "Intelligent software-defined hardware acceleration". Sounds like Cerebras to my non-expert ears.

                                * https://www.nextsilicon.com

                                • damageboy 16 minutes ago
                                  No real overlap with Cerebras. Have tons of respect for what they do and achieve, but unrelated arch / approach / target-customers.
                              • pezezin 8 hours ago
                                NEC doesn't really make vector processors anymore. My company installed a new supercomputer built by NEC, and the hardware itself is actually Gigabyte servers running AMD Instinct MI300A, with NEC providing the installation, support, and other services.

                                https://www.nec.com/en/press/202411/global_20241113_02.html

                              • yyyk 13 hours ago
                                Sounds like an idea that would really benefit from a JIT-like approach to basically all software.
                                • damageboy 18 minutes ago
                                  You can, and indeed should, assume there is a heavy JIT component to it. At the same time, it is important to note that this is geared toward already highly parallel code.

                                  In other words, while the JIT can be applied to all code in principle, the nature of accelerator HW is that it makes sense where embarrassingly parallel workloads are involved.

                                  Having said that, NextSilicon != GPU, so it takes a different approach to accelerating said parallel code.
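
                                  To make "embarrassingly parallel" concrete, the kind of loop such a JIT can profitably lift is the first one below, not the second (a generic C illustration, not our toolchain's input format):

                                      #include <stddef.h>

                                      /* Embarrassingly parallel: no iteration depends on another,
                                         so iterations can be spread across parallel hardware. */
                                      void scale_all(size_t n, float k, const float *in, float *out) {
                                          for (size_t i = 0; i < n; i++)
                                              out[i] = k * in[i];
                                      }

                                      /* Loop-carried dependence: each step needs the previous
                                         result, so it does not parallelize the same way. */
                                      float running_sum(size_t n, const float *in) {
                                          float s = 0.0f;
                                          for (size_t i = 0; i < n; i++)
                                              s += in[i];
                                          return s;
                                      }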

                                • znpy 5 hours ago
                                  I definitely expect this to be a big hit.

                                  In a way, this is not new; it’s pretty much what Annapurna did: they took ARM and got serious with it, creating the first high-performance ARM CPUs. Then they got acqui-hired by Amazon and the rest is history ;)

                                  • zawaideh 5 hours ago
                                    I don’t want my electronics to contribute to genocide and apartheid and possibly the next pager exploding terror attack. No thanks.
                                    • AtlasBarfed 19 minutes ago
                                      I'd be fascinated to know who your "good guys" list is.
                                      • danielxt 3 hours ago
                                        It's not yours; you don't have to buy it.
                                        • flyinglizard 2 hours ago
                                          Stop using Apple, or Google, or Amazon, or Intel, or Broadcom, or Nvidia then. All have vast hardware development activities in that one country you don't like.