PurePredictive, Inc. v. H2O.AI, Inc.: Northern District of California Invalidates Machine Learning Claims Under Section 101

Vlad Teplitskiy

Machine learning is one of the fastest-growing categories of granted patents[1].  However, few patent infringement lawsuits appear to have tested machine learning claims in court under the patent-eligibility framework set forth in Alice[2].

PurePredictive, Inc. v. H2O.AI, Inc.[3] is a new case that helps fill this void.  This case was filed in the Northern District of California and is currently pending appeal at the Federal Circuit.  PurePredictive[4], the patent owner, is a company that offers machine learning as a service and advertises its intellectual property on its website.  H2O.ai[5], the defendant, also provides machine learning as a service.  H2O.ai’s allegedly infringing product, H2O with AutoML[6], is an open-source tool for model selection and hyperparameter optimization.  Notably, PurePredictive’s complaint references the GitHub version control repository through which the source code for the H2O platform was made publicly available.  PurePredictive appears to have reviewed publicly available open-source code to identify potential infringers, which shows how publishing open-source code can expose developers to risk.

The patent at issue, U.S. 8,880,446[7] (“the ’446 patent”), concerns automatically generating an “ensemble” of machine learning models.  An “ensemble” is a collection of different models that work together to generate the end result (e.g., a prediction or a classification).  Machine learning models “learn” how to generate accurate predictions through repeated exposure to training data for which both the input and the expected output are known, so that the parameters of the model can be adjusted to generate the expected output when provided with the corresponding input.  Different types of machine learning models are conventionally considered more suitable for analyzing certain types of data than others.  For example, convolutional neural networks are typically used for analyzing image data, while recurrent neural networks are typically used for analyzing sequence data.
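To make the idea of an ensemble concrete, the following is a minimal Python sketch, not drawn from the ’446 patent or from any party’s product.  The three toy models and the averaging rule are hypothetical, chosen only to show several models contributing to one end result.

```python
from statistics import mean

# Three toy "models". Each is a fixed function standing in for a trained
# predictor; the names and formulas are hypothetical illustrations.
def model_a(x: float) -> float:
    return 2.0 * x

def model_b(x: float) -> float:
    return 2.0 * x + 1.0

def model_c(x: float) -> float:
    return 2.0 * x - 1.0

def ensemble_predict(models, x: float) -> float:
    """Combine the individual predictions by simple averaging."""
    return mean(m(x) for m in models)

# The individual predictions for x = 3.0 are 6.0, 7.0 and 5.0,
# so the ensemble's averaged prediction is 6.0.
prediction = ensemble_predict([model_a, model_b, model_c], 3.0)
```

Averaging is only one way to combine models.  Claim 1 of the ’446 patent, quoted below, instead recites routing different subsets of the data to different learned functions.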

The claimed technology in the ’446 patent seeks to help users generate an ensemble “with little or no input from a user or expert” by evaluating a number of different “learned functions” that are generated “without prior knowledge regarding suitability of the generated learned functions for the training data.”  Consider representative claim 1:


1. An apparatus for a predictive analytics factory, the apparatus comprising:

a receiver module configured to receive training data for forming a predictive ensemble customized for the training data;

a function generator module configured to pseudo-randomly generate a plurality of learned functions based on the training data without prior knowledge regarding suitability of the generated learned functions for the training data;

a function evaluator module configured to perform an evaluation of the plurality of learned functions using test data and to maintain evaluation metadata for the plurality of learned functions, the evaluation metadata comprising one or more of an indicator of a training data set used to generate a learned function and an indicator of one or more decisions made by a learned function during the evaluation; and

a predictive compiler module configured to form the predictive ensemble, the predictive ensemble comprising a subset of multiple learned functions from the plurality of learned functions, the multiple learned functions selected and combined based on the evaluation metadata for the plurality of learned functions, the predictive ensemble comprising a rule set synthesized from the evaluation metadata to direct data through the multiple learned functions such that different learned functions of the ensemble process different subsets of the data based on the evaluation metadata.

H2O.ai filed a motion to dismiss PurePredictive’s infringement claims on the ground that the asserted claims are invalid as patent-ineligible under 35 U.S.C. § 101.  Specifically, H2O.ai argued that the claims “are directed to an abstract mathematical process for testing and refining algorithms,” characterizing the ’446 patent as “an attempt to monopolize the use of basic mathematical manipulations without reference to any specific implementation, application, purpose, or use.”  PurePredictive countered that the claimed technology can “generate a predictive ensemble in an automated manner” with “little or no input from a user or expert,” likening its claims to those previously held patent-eligible by the Federal Circuit.
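For readers less familiar with the technology, the four modules recited in claim 1 can be pictured with a rough Python sketch.  This is a toy illustration under stated assumptions, not the patentee’s implementation, not H2O.ai’s product, and not a claim construction.  The threshold classifiers, the two-region split, the per-region accuracy used as a stand-in for “evaluation metadata,” and all function names are hypothetical.

```python
import random

def generate_learned_functions(training_data, count, rng):
    """Function generator: pseudo-randomly create threshold classifiers
    within the range of the training inputs, with no prior view on which
    ones will suit the data."""
    xs = [x for x, _ in training_data]
    lo, hi = min(xs), max(xs)
    return [
        {"id": i,
         "threshold": rng.uniform(lo, hi),
         "polarity": rng.choice([True, False])}
        for i in range(count)
    ]

def predict(fn, x):
    """Apply one learned function to one input."""
    above = x >= fn["threshold"]
    return above if fn["polarity"] else not above

def evaluate(functions, test_data, split):
    """Function evaluator: record, per function, its accuracy on the
    'low' (x < split) and 'high' regions of the test data."""
    metadata = {}
    for fn in functions:
        correct = {"low": 0, "high": 0}
        total = {"low": 0, "high": 0}
        for x, y in test_data:
            region = "low" if x < split else "high"
            total[region] += 1
            correct[region] += int(predict(fn, x) == y)
        metadata[fn["id"]] = {
            r: (correct[r] / total[r] if total[r] else 0.0) for r in total
        }
    return metadata

def compile_ensemble(functions, metadata):
    """Predictive compiler: build a rule set that routes each region to
    the function with the best recorded accuracy for that region."""
    by_id = {fn["id"]: fn for fn in functions}
    return {
        region: by_id[max(metadata, key=lambda i: metadata[i][region])]
        for region in ("low", "high")
    }

def ensemble_predict(rules, x, split):
    """Direct an input to the learned function chosen for its region."""
    return predict(rules["low" if x < split else "high"], x)

def accuracy(predict_fn, data):
    return sum(predict_fn(x) == y for x, y in data) / len(data)

# Receiver: toy labeled data in which the label is True when x >= 5.
training_data = [(float(x), x >= 5) for x in range(10)]
test_data = [(x + 0.5, x + 0.5 >= 5) for x in range(10)]
SPLIT = 5.0  # hypothetical boundary between the two regions

rng = random.Random(0)  # seeded so the pseudo-random run is repeatable
functions = generate_learned_functions(training_data, 20, rng)
metadata = evaluate(functions, test_data, SPLIT)
rules = compile_ensemble(functions, metadata)

ensemble_acc = accuracy(lambda x: ensemble_predict(rules, x, SPLIT), test_data)
best_single_acc = max(
    accuracy(lambda x, fn=fn: predict(fn, x), test_data) for fn in functions
)
```

Because the rule set picks the best performer separately for each region, the ensemble in this sketch scores at least as well on the test data as any single generated function.  The sketch also shows how generic the recited steps are, which bears on the abstract-idea dispute described above.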

Ultimately, the Court agreed with H2O.ai that the claims are invalid.  In reaching this conclusion, Judge Orrick turned to the patent specification’s description of the invention as a “brute force, trial-and-error approach” that could “generate a predictive ensemble regardless of the particular field or application.”  This led Judge Orrick to conclude “that this process is merely the running of data through a machine,” and that the claims “go to the general abstract concept of predictive analytics rather than any specific application.”  Therefore, Judge Orrick held that the claims are directed to an abstract idea under step one of the Alice analysis and lack “significantly more” under step two.  As this analysis illustrates, the way that software inventions are framed in their specifications can be an important component of the section 101 inquiry.  On the other hand, it is not clear even in hindsight that an alternative description could have changed the result, since the decision cites no authority that an invention must avoid “brute force” or be applicable only to a particular field or limited application to be patent-eligible.

PurePredictive has appealed the district court’s decision to the Federal Circuit.  Stay tuned to the Knobbe Section 101 blog to see how the appeal plays out.



[1] Forbes, Roundup of Machine Learning Forecasts and Market Estimates, 2018 (accessed September 12, 2018).

[2] Alice Corp. v. CLS Bank International, 134 S. Ct. 2347 (U.S. 2014).  The Alice framework is a two-part test, with step one requiring a determination of whether a claim is directed to an abstract idea.  If not, the claim is patent-eligible.  If the claim is directed to an abstract idea, step two requires determining whether the elements of the claim, considered both individually and as an ordered combination, transform the claim into a patent-eligible application of the abstract idea.

[3] PurePredictive, Inc. v. H2O.AI, Inc., Case No. 17-cv-03049-WHO (N.D. Cal. Aug. 29, 2017).





Editor: Philip Nelson