by Analytics Insight
April 14, 2022
Compilers are programming language translators: put simply, just as humans translate and understand natural languages, compilers translate programming languages into instructions that hardware can understand and execute. PolyMage Labs offers both products and services for building compilers for the programming languages and models used in artificial intelligence and machine learning computations.
Please tell us about the company, its specialization and the services your company offers.
PolyMage Labs is a deep-tech software startup specializing in high-performance compiler and code generation systems for artificial intelligence computing. Compilers are programming language translators: put simply, just as humans translate and understand natural languages, compilers translate programming languages into instructions that hardware can understand and execute. PolyMage Labs offers both products and services for building compilers for the programming languages and models used in artificial intelligence and machine learning computations. Its target customers are companies building new hardware to accelerate AI, companies building AI algorithms that depend on those algorithms running faster, and companies offering services (cloud computing providers) that in turn give their users a platform for running AI computations quickly.
With what mission and goals was the company founded? Tell us briefly about your journey since the company was founded.
PolyMage Labs was founded with two goals: (1) to translate technology from advanced research into compilers and automatic code generators that help industry solve complex programming challenges in AI software development, and (2) to build world-class expertise in automatic code generation and high-performance AI systems in India.
Most of the technology development and the founding of PolyMage Labs in pursuit of its current goals began in May 2019. We were soon able to secure our first customer. (Please watch the explainer video linked above for a testimonial.) The team was just me for the first 9-12 months, then about three members over the next six months, and has now grown to a strong team of eight. A note on our journey appears towards the end of the first video, which highlights the challenges involved in founding a deep-tech startup in the computer systems space.
Please tell us about the products/services/solutions you offer your customers and how they benefit from them.
PolyBlocks are compiler building blocks developed by PolyMage Labs: they enable the rapid creation of new compilers and code generators for several domains served by dense tensor computations, including deep learning, image processing pipelines, and stencil computations used in science and engineering. These building blocks come in the form of MLIR operations (explained below) and their transformation utilities. Highly optimized code for these operations is generated using a number of advanced research techniques. The same building blocks are intended to be reusable across a variety of programming models and target hardware.
What is your biggest USP that differentiates the company from the competition?
Our unique selling proposition is the polyhedral compiler technology that powers PolyBlocks. Further details can be found in the middle part of the explainer video. This technology draws on expertise that has long been available mostly in academic research communities and had not been fully translated into a form suitable for widespread production use. This changed when MLIR was built and made open source in April 2019. The founder of PolyMage Labs was also a founding member of the MLIR project while he was a visiting researcher at Google in 2018.
MLIR stands for Multi-Level Intermediate Representation, and the MLIR project is an open source compiler infrastructure project. MLIR was announced and released as open source by Google in April 2019, but is now a community-driven compiler infrastructure project that is part of the LLVM project. The MLIR project was initiated to provide next-generation optimizing compiler infrastructure with a focus on meeting the computational needs of AI and machine learning programming models. At Google itself, one of the goals of the project was to address the compiler challenges associated with the TensorFlow ecosystem. MLIR is a new intermediate representation designed to provide a unified, modular, and extensible infrastructure for progressively lowering dataflow compute graphs, possibly through loop nests, to target-specific, high-performance code. MLIR shares similarities with traditional three-address static single assignment (SSA) representations based on control flow graphs (including LLVM IR or the Swift Intermediate Language), but also introduces notions from the polyhedral compiler framework as first-class concepts to enable powerful analyses and transformations in the presence of loop nests and multidimensional arrays. MLIR supports multiple front ends and back ends and uses LLVM IR as one of its primary code generation targets. It is therefore a very useful infrastructure for developing new compilers, particularly for solving the compilation challenges involved in mapping new programming languages and models for AI and machine learning onto the plethora of specialized accelerator chips.
All PolyMage Labs technology is based on (i.e., built upon) the MLIR infrastructure. This also allows us to benefit from and contribute to the open source community. We believe that certain pieces of infrastructure can only thrive in the open, where they are readily available for reuse by all stakeholders.
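To give a flavor of the kind of loop-nest transformation such compiler frameworks reason about, here is a minimal sketch of loop tiling applied to a matrix multiplication. It is plain Python written for illustration only, not MLIR or PolyBlocks code; the function names and the tile size are assumptions made for this example.

```python
def matmul_naive(a, b, n):
    """Reference n x n matrix multiplication as a plain three-deep loop nest."""
    c = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                c[i][j] += a[i][k] * b[k][j]
    return c


def matmul_tiled(a, b, n, tile=2):
    """Same computation with all three loops tiled.

    The outer loops walk over tile origins, the inner loops walk inside one
    tile. Only the order of the iterations changes, not the set of iterations.
    """
    c = [[0] * n for _ in range(n)]
    for ii in range(0, n, tile):
        for jj in range(0, n, tile):
            for kk in range(0, n, tile):
                # min(...) handles partial tiles when n is not a multiple of tile
                for i in range(ii, min(ii + tile, n)):
                    for j in range(jj, min(jj + tile, n)):
                        for k in range(kk, min(kk + tile, n)):
                            c[i][j] += a[i][k] * b[k][j]
    return c


n = 5
a = [[i + j for j in range(n)] for i in range(n)]
b = [[i * j + 1 for j in range(n)] for i in range(n)]
same_result = matmul_naive(a, b, n) == matmul_tiled(a, b, n, tile=2)
```

Both versions compute the same result; tiling is typically applied to improve data reuse in caches. A compiler performs such rewrites automatically and has to prove that the reordering is legal, which is where analyses over loop nests and multidimensional arrays come in.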
What are the key trends driving growth in Big Data Analytics/AI/Machine Learning?
I will list the most important ones from my point of view. While the first three are often cited, the fourth is also crucial.
1) Innovations in computer hardware towards massive parallelism, as well as custom-built specialized accelerator units on chips for the computations used in these areas.
2) The generation and availability of data, which in turn is due to the widespread use of information technology, smartphones, and data centers.
3) Innovation in and development of programming models, libraries, packages, compilers, code generation tools, visualizers, and the surrounding software ecosystem that supports software development in these areas. (This resulted from (1) and (2) and was partly an effect rather than a cause.)
4) The underlying computational patterns emerging in several recently successful areas of AI (especially deep learning) are simple, regular, easy to tune, and already widely studied and optimized: they lend themselves readily to acceleration on parallel hardware. They are typically dominated by matrix-matrix multiplication (matmul) or matmul-like patterns, or by other fully “data-parallel” computations (meaning they can be performed in parallel on different data items). This also makes it easier to design new hardware and software that runs them even faster, which has led to a self-reinforcing feedback cycle.
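As a small illustration of the data-parallel pattern described in point 4, the sketch below computes each row of a matrix product as an independent task. It is plain Python using only the standard library; the function names are made up for this example, and the thread pool is there to show that the rows are independent, not to demonstrate a real speedup (CPython threads generally do not speed up pure-Python arithmetic).

```python
from concurrent.futures import ThreadPoolExecutor


def matmul_row(args):
    """Compute one row of C = A x B.

    The row depends only on one row of A and on B, so no two rows of C
    share any intermediate state.
    """
    a_row, b = args
    cols = len(b[0])
    return [sum(a_row[k] * b[k][j] for k in range(len(b))) for j in range(cols)]


def matmul_serial(a, b):
    """Compute the rows of C one after another."""
    return [matmul_row((row, b)) for row in a]


def matmul_parallel(a, b, workers=4):
    """Compute the rows of C as independent tasks on a thread pool."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in the order of the inputs
        return list(pool.map(matmul_row, ((row, b) for row in a)))


A = [[1, 2], [3, 4], [5, 6]]
B = [[7, 8, 9], [10, 11, 12]]
C_serial = matmul_serial(A, B)
C_parallel = matmul_parallel(A, B)
```

Because the rows never depend on each other, the serial and parallel versions produce the same matrix. Specialized hardware exploits the same independence at a much finer granularity.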
What challenges is the industry facing today?
In general, the ML/AI/data analytics systems industry can be broadly divided into software and hardware. The challenge for the hardware industry is to make its chips easy for programmers to use. This is one of the areas where PolyMage Labs invents, innovates, and solves tough problems. The ML/AI systems software industry, on the other hand, struggles with building and maintaining high-performance software and constantly adapting it to evolving hardware. A great deal of expert human effort is required to write, rewrite, reoptimize, and retune code for newer generations of hardware. Providing the right tools, programming models, libraries, and automation is critical here, and this is the second audience for PolyMage Labs technology.
Can you highlight your company’s recent innovations in the AI/ML/Analytics space?
PolyBlocks. Please watch the videos. Our automatic code generation systems are competitive with the best hand-tuned expert code in several cases. In some cases, they provide 5x to 25x improvements over widely used modern programming systems.
Do you also think that finding the right talent is a challenge in the industry?
Building an AI software “systems” company requires a very different talent pool from that required for an AI applications company. The former requires strong computer science skills, and we often find that the quality of CS engineering education in India is not up to par here. Hence, finding the right kind of technology talent is one of the biggest challenges facing the deep-tech software systems industry in India. At PolyMage Labs, we spend a lot of time onboarding and training our engineers, and they can develop these skills over time.
What is your technology and business roadmap for the rest of the year?
We have grown rapidly over the last 1.5 years, and the coming year will be crucial. Technology development in this area is a multi-year effort. We plan to build more technology for PolyBlocks over the next year and move toward an increasingly scalable business. In our space, AI software and compiler systems, a business must always be a combination of products and services, often with a significant service component. As the technology becomes more stable and mature, we expect to further expand the product component in the coming year. We are also planning to significantly expand the scope of our projects with our existing customers.