Four categories, four levels of validation effort

The GAMP 5 framework classifies pharmaceutical software into categories that determine the appropriate level of validation effort. The classification is based on two factors: the software’s configurability (how much the user can customise its behaviour) and its complexity (how much custom development is involved). Higher categories require more documentation and more thorough testing.

Category 2 (firmware) was removed in the current GAMP 5 edition — firmware is now classified under Category 1 (infrastructure) or Category 3 (non-configured products) depending on its role.

Category 1 — Infrastructure software

Infrastructure software provides the computing environment on which GxP applications run. This includes operating systems, database management systems, middleware, virtualisation platforms, and network firmware. Category 1 systems are not GxP applications themselves, but they support GxP applications.

Validation approach:
- Installation qualification (IQ): verify correct installation and configuration.
- No functional testing of the infrastructure software itself (the vendor is responsible for that).
- Document the version, patch level, and configuration settings.

Examples: Windows Server, Red Hat Enterprise Linux, Oracle Database, VMware ESXi, Docker runtime, NVIDIA CUDA drivers.

Category 3 — Non-configured products

Software used exactly as delivered by the vendor, without user configuration of business logic or workflow. The user installs it and uses it within its intended purpose.

Validation approach:
- Verify that the software is used within its intended scope.
- Review vendor documentation.
- Test critical functions relevant to the intended use.
- Do not test generic vendor functionality that is not relevant to your application.

Examples: Scientific calculators, standard analytical instruments with embedded firmware, simple data loggers, reference databases.
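The Category 1 IQ record (version, patch level, configuration settings) lends itself to structured capture. A minimal sketch in Python; the field names and the example component values are illustrative assumptions, not a prescribed IQ template:

```python
import json
import platform
from datetime import date

# Illustrative IQ record for a piece of Category 1 infrastructure.
# Field names are hypothetical -- use your site's approved IQ template.
def build_iq_record(component: str, version: str, patch_level: str,
                    config: dict) -> dict:
    return {
        "component": component,
        "version": version,
        "patch_level": patch_level,
        "configuration": config,          # documented settings; behaviour is not functionally tested
        "host_os": platform.platform(),   # environment the component was installed on
        "recorded_on": date.today().isoformat(),
    }

# Example values are invented for illustration.
record = build_iq_record(
    component="Oracle Database",
    version="19c",
    patch_level="19.21.0.0.231017",
    config={"character_set": "AL32UTF8", "archivelog": True},
)
print(json.dumps(record, indent=2))
```

The point of the structure is reproducibility: the same record format can be diffed against a later patch audit to show exactly what changed.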
Category 4 — Configured products

Commercial software configured by the user to meet specific business requirements. The software functionality exists in the product; the user enables, configures, or customises features to match their process.

Validation approach:
- Document the configuration.
- Test configured features against requirements.
- Verify that configuration changes produce the intended behaviour.
- Leverage vendor testing for core functionality; user testing focuses on the configured aspects.

Examples: SAP (GxP modules), LabWare LIMS, Emerson DeltaV, Siemens SIMATIC, Veeva Vault Quality.

Category 5 — Custom applications

Software developed specifically for the intended use. This includes both fully custom-built applications and significant customisations to commercial platforms that involve writing new code.

Validation approach:
- Full lifecycle validation: requirements specification, design documentation, code review, unit testing, integration testing, system testing, user acceptance testing.
- Complete traceability from requirements through test evidence.

Examples: Custom manufacturing control systems, bespoke analytical data processing tools, ML models trained on facility-specific data, custom integration middleware.

The AI classification challenge

ML models trained on company data are Category 5 custom applications — even if they use a commercial framework (TensorFlow, PyTorch). The framework itself is Category 1 infrastructure. The pre-trained model architecture may be Category 3 or 4. The training pipeline, training data, and resulting model weights are Category 5.

How these categories apply to AI/ML systems in detail is covered in the GAMP 5 classification guide for AI/ML software, which addresses the practical challenges of multi-category classification and continuous validation for non-deterministic systems.

The practical rule

Classify accurately. Validate proportionately.
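The multi-category breakdown of an ML system can be made concrete as a component inventory. A sketch under stated assumptions: the component names are invented examples, and a real inventory would come from your system architecture documentation, not from code:

```python
# Illustrative mapping of one hypothetical ML system's components to GAMP categories.
# Component names are examples only; category assignments follow the text above.
GAMP_CATEGORIES = {
    "PyTorch framework":          1,  # infrastructure: provides the computing environment
    "Pre-trained model weights":  3,  # vendor-supplied, used as delivered (may be 4 if configured)
    "Training pipeline":          5,  # custom code: full lifecycle validation
    "Training dataset curation":  5,
    "Trained model weights":      5,
    "Inference service":          5,
}

def validation_effort(component: str) -> str:
    """Map a component's GAMP category to the broad validation approach."""
    approach = {
        1: "IQ only: document version, patch level, configuration",
        3: "verify intended use; test critical functions",
        4: "document configuration; test configured features",
        5: "full lifecycle validation with traceability",
    }
    return approach[GAMP_CATEGORIES[component]]

print(validation_effort("Training pipeline"))
# -> full lifecycle validation with traceability
```

Writing the inventory down this explicitly is what makes "classify accurately, validate proportionately" auditable: every component carries its own justification for the effort applied to it.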
A Category 4 system validated as Category 5 wastes engineering resources. A Category 5 system validated as Category 4 creates regulatory exposure. The purpose of classification is to determine the right level of effort — not the maximum level of effort.

How do GAMP categories apply to AI and machine learning systems?

AI and ML systems present a classification challenge for the GAMP framework because they do not fit neatly into the existing categories. Traditional GAMP categorisation assumes that software behaviour is deterministic and fully specified — Category 5 (custom) software is validated by testing every specified requirement. ML models produce outputs that depend on training data and may vary with model updates, complicating the “test every requirement” approach.

The pragmatic classification approach we recommend: treat the ML model as a Category 5 component within a larger system. The training pipeline, the inference pipeline, and the decision logic surrounding the model are custom software that can be validated using standard GAMP Category 5 methods. The model itself is validated through performance qualification — demonstrating that it meets predefined acceptance criteria (accuracy, sensitivity, specificity) on a representative test dataset.

Change control for ML systems requires additional considerations. Model retraining produces a new model version that must be re-validated against the acceptance criteria. Automated retraining pipelines must include validation gates: the new model is deployed only if it passes the predefined performance criteria on the validation dataset.

This approach has been accepted by FDA inspectors in several audits we have supported. The key is documentation: clearly define what the model does, how it was trained, what data it was trained on, what performance criteria it must meet, and how ongoing performance is monitored.
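The validation gate just described, where a retrained model is deployed only if it meets predefined acceptance criteria on a held-out validation dataset, can be sketched in a few lines. The thresholds and metric names below are hypothetical examples; real acceptance criteria are defined in the validation plan, not in code:

```python
# Hypothetical acceptance criteria; real values come from the validation plan.
ACCEPTANCE_CRITERIA = {"accuracy": 0.95, "sensitivity": 0.90, "specificity": 0.90}

def evaluate(predictions, labels):
    """Compute accuracy, sensitivity, and specificity for a binary classifier."""
    tp = sum(1 for p, y in zip(predictions, labels) if p == 1 and y == 1)
    tn = sum(1 for p, y in zip(predictions, labels) if p == 0 and y == 0)
    fp = sum(1 for p, y in zip(predictions, labels) if p == 1 and y == 0)
    fn = sum(1 for p, y in zip(predictions, labels) if p == 0 and y == 1)
    return {
        "accuracy": (tp + tn) / len(labels),
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
        "specificity": tn / (tn + fp) if (tn + fp) else 0.0,
    }

def validation_gate(predictions, labels, criteria=ACCEPTANCE_CRITERIA):
    """Return (deploy_ok, metrics). Deploy only if every criterion is met."""
    metrics = evaluate(predictions, labels)
    deploy_ok = all(metrics[name] >= threshold for name, threshold in criteria.items())
    return deploy_ok, metrics

# Example: a candidate model's predictions on a toy validation dataset.
labels      = [1, 1, 1, 1, 0, 0, 0, 0]
predictions = [1, 1, 1, 0, 0, 0, 0, 0]
ok, metrics = validation_gate(predictions, labels)
# ok is False here: accuracy is 0.875, below the 0.95 criterion,
# so the automated pipeline would refuse to promote this model version.
```

The gate itself is ordinary deterministic software, which is exactly the point: wrapping the non-deterministic model in Category 5 custom code restores a testable, documentable control point.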
Inspectors are not opposed to AI in pharma — they require evidence that the AI system is understood, controlled, and monitored with the same rigour as any other GxP-critical system.