The pace and scale of software supply chain attacks ramped up significantly over the past couple of years, imposing enormous time and financial costs, and in some cases jeopardizing the security of important data. "Chainguard is here to make securing the software supply chain part of the software development workflow, making the secure way to do things the easiest way to do things," said the company’s head of developer education, Lisa Tagliaferri.
Deeply rooted in the open source ecosystem, the remote-first startup aims to drive adoption of open source software like Sigstore, SLSA, and Tekton to assess and validate software integrity throughout the development and implementation lifecycle. Founded and led by core maintainers of some of these projects, the company is finding early commercial traction with large enterprises in regulated industries such as financial services and healthcare while continuing to promote and contribute to the open source software ecosystem. The company is investor-backed, with institutional capital coming from Amplify Partners and an array of individual angel investors.
Not so long ago, federated machine learning was the purview of academic AI researchers and software skunkworks in big tech companies. A technique that facilitates ML model training and tuning across distributed data sources, federated machine learning is a privacy-preserving methodology capable of delivering accurate, explainable ML models without the need for large centralized datasets and computing resources, or for the cumbersome pre-processing otherwise required to maintain compliance with privacy regulations and internal best practices.
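To make the idea concrete, here is a minimal sketch of federated averaging, one widely used way to train across distributed data sources. It is an illustration only, not any particular vendor's method: the one-parameter model, the function names `local_update` and `federated_average`, and the three toy datasets are all assumptions invented for this example.

```python
from typing import List, Tuple

Dataset = List[Tuple[float, float]]  # (x, y) pairs held privately by one client


def local_update(w: float, data: Dataset, lr: float = 0.01, epochs: int = 5) -> float:
    """Run a few gradient-descent steps for the model y = w * x on one client's data."""
    for _ in range(epochs):
        # Gradient of the mean squared error with respect to w.
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w


def federated_average(w: float, clients: List[Dataset], rounds: int = 50) -> float:
    """Each round, every client trains locally and the server averages the
    resulting weights, weighted by sample count. Only weights are shared;
    the raw (x, y) rows never leave the client."""
    total = sum(len(d) for d in clients)
    for _ in range(rounds):
        updates = [(local_update(w, d), len(d)) for d in clients]
        w = sum(w_k * n_k for w_k, n_k in updates) / total
    return w


# Hypothetical data: three sites whose records all follow y = 3x.
clients = [
    [(1.0, 3.0), (2.0, 6.0)],
    [(3.0, 9.0)],
    [(0.5, 1.5), (1.5, 4.5), (2.5, 7.5)],
]
w_global = federated_average(0.0, clients)
```

In this toy setup the shared weight converges toward 3 even though no site ever sees another site's records. Real deployments typically add safeguards such as secure aggregation, which this sketch leaves out.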
Founded in 2020 by CEO Kartik Chopra, a former undercover CIA technical intelligence officer, Devron aims to deliver the benefits of federated machine learning techniques to enterprises and government agencies that seek to harness data that may be scattered throughout various parts of their organizations. Prolific investment firm Tiger Global led Devron's Series A round in February 2022, which saw participation from prior investors FinTech Collective, Afore Capital, and Essence Venture Capital.
When it comes to ML model performance, context can sometimes make all the difference. Most production ML models are trained on a dataset that's unique to the task at hand, and enriching that proprietary data with contextual external data can often result in more robust, reliable ML models. But as anyone who works with data, from first-timers to PhDs, can tell you, connecting external data to internal data is rarely as simple as importing a couple of CSVs and writing a pd.merge() call.
OpenBlender is a San Diego, CA-based startup building a toolkit that enables data science teams to enrich their internal datasets with publicly available geospatial and time series data from any source without the hassle of building data extraction, processing, reconciliation, and integration workflows just to access a couple of features. Named a 2021 "Cool Vendor" in data for AI and machine learning by Gartner, OpenBlender indexes and updates an ever-expanding library of external datasets which can be pulled, updated, and blended by time or location via Pandas and R dataframes.
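A small sketch shows why a plain merge often falls short when blending by time. It uses only standard pandas calls, not OpenBlender's own API, and the `orders` and `weather` frames are hypothetical data made up for the example.

```python
import pandas as pd

# Hypothetical internal data: one row per order, timestamped to the minute.
orders = pd.DataFrame(
    {
        "timestamp": pd.to_datetime(
            ["2022-03-01 09:15", "2022-03-01 13:40", "2022-03-02 08:05"]
        ),
        "units_sold": [12, 7, 20],
    }
)

# Hypothetical external data: one temperature reading per day at midnight.
weather = pd.DataFrame(
    {
        "timestamp": pd.to_datetime(["2022-03-01", "2022-03-02"]),
        "temp_c": [14.5, 11.0],
    }
)

# A plain merge needs identical keys, so no order matches a weather reading.
exact = pd.merge(orders, weather, on="timestamp", how="left")

# merge_asof attaches the most recent reading at or before each order and
# ignores readings more than one day old. Both frames must be sorted by key.
blended = pd.merge_asof(
    orders.sort_values("timestamp"),
    weather.sort_values("timestamp"),
    on="timestamp",
    direction="backward",
    tolerance=pd.Timedelta("1D"),
)
```

Here `exact` ends up with an empty `temp_c` column, while `blended` carries a temperature for every order. Real external sources also bring mismatched time zones, units, and granularities, which is the reconciliation work described above.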
One of the more tedious aspects of managing complex software systems is configuration: the process of flipping virtual switches and turning virtual knobs to first get a system up and running, and then to maximize performance and reliability of that system. Databases are a key element of most software architectures, and configuring them properly is important, but manual tuning can be costly and time-consuming.
OtterTune, a startup spun out of the Carnegie Mellon Database Group and led by CMU Professor Andy Pavlo, builds upon academic research into a machine learning-powered framework for automatically configuring database management systems (DBMSs) for each type of workload they're tasked with. The scope and complexity of modern systems are beyond the comprehension of any individual person, the company says, so machine learning techniques are helpful in discovering and setting the hundreds of virtual knobs where they need to be to achieve optimal system performance. OtterTune currently supports cloud-based MySQL and PostgreSQL databases provided by Amazon Aurora and Amazon RDS, as well as certain on-premises database implementations. As an academic project, OtterTune was partially funded by the National Science Foundation; as a startup, the company raised seed funding led by Accel.
Facilitating rapid exploration, manipulation, and processing of datasets, notebook-style environments have changed the way data scientists do their work. However, the ease of pasting a block of data-crunching code from one notebook to another can result in some serious maintenance headaches for those tasked with re-plumbing a collection of notebooks. Creating and maintaining one data processing pipeline that can be used across multiple notebooks and run in the cloud can streamline the workflows of data science teams.
Ploomber is building open source cloud infrastructure to help data scientists bring best practices from the software engineering world into their workflows. Ploomber helps them build and deploy repeatable, maintainable data processing pipelines that can be shared across local notebooks and executed in a cloud environment. The New York City-based company participated in the most recent Y Combinator accelerator batch. Its main GitHub repository has over 2,300 stars, and Ploomber's open source package is currently downloaded over 15,000 times per month.
The performance of a machine learning model is largely dependent on the quality of the data it was trained on, and the process of sourcing raw data and transforming it into something that's usable for ML is both time-consuming and tiresome.
Rasgo is a New York-based company that offers a suite of data prep power tools, both through a free Python package and a low-code web interface. The company also recently unveiled RasgoQL, an open source Python package that enables data teams to write Python locally while executing SQL on a cloud-hosted data warehouse. By automating aspects of data transformation and feature engineering, the company aims to help data science and machine learning teams accelerate time to value: from raw data to robust features for training ML models. The Rasgo team is emphatic in its support for the open source community and at the time of writing, its RasgoQL package has been downloaded over 8,000 times in less than 30 days since release. The company has raised over $25 million in venture capital, with backing from Unusual Ventures and Insight Partners, among others.
Containers changed the way a lot of software is built and used, especially in cloud environments. Boston-area startup Slim.AI aims to make the process of building and deploying cloud-native software just a little more seamless. Slim.AI users can discover and gain visibility into software containers, all while improving efficiency by pruning unused parts of those containers.
The company is led by Kyle Quest, the creator of DockerSlim (an open source package with over 13,100 stars on GitHub), and John Amaral, who previously led product for Cisco Cloud Security. Offering more than the container minification features found in the open source DockerSlim package, Slim.AI is building a more holistic SaaS solution for facilitating container-driven development workflows. Founded in January 2021, Slim.AI is backed by investors including Decibel, Insight Partners, and boldstart Ventures, among others.
Orchestrating machine learning workflows is a challenge for large data science and AI teams operating on high-scale projects. Flyte started at Lyft as an internal effort to build a machine learning orchestration platform. The intention of Flyte is to integrate and automate diverse workflows across software, data, and ML engineering disciplines through one platform. Flyte has been used at Lyft, Spotify, and Freenome, among others, and was incubated by the Linux Foundation after going open source in 2020. Union.ai is positioned as an infrastructure partner, helping enterprise customers access the benefits of Flyte.