Clinical documentation of patient-reported medical cannabis use in primary care: Toward scalable extraction using natural language processing methods.
Most states have legalized medical cannabis, yet little is known about how medical cannabis use is documented in patients' electronic health records (EHRs). We used natural language processing (NLP) to calculate the prevalence of clinician-documented medical cannabis use among adults in an integrated health system in Washington State where medical and recreational use are legal. We analyzed EHRs of patients ≥18 years old screened for past-year cannabis use (November 1, 2017-October 31, 2018), to identify clinician-documented medical cannabis use. We defined medical use as any documentation of cannabis that was recommended by a clinician or described by the clinician or patient as intended to manage health conditions or symptoms. We developed and applied an NLP system that included NLP-assisted manual review to identify such documentation in encounter notes. Medical cannabis use was documented for 16,684 (5.6%) of 299,597 outpatient encounters with routine screening for cannabis use among 203,489 patients seeing 1,274 clinicians. The validated NLP system identified 54% of documentation and NLP-assisted manual review the remainder. Language documenting reasons for cannabis use included 125 terms indicating medical use, 28 terms indicating non-medical use and 41 ambiguous terms. Implicit documentation of medical use (e.g., "edible THC nightly ") was more common than explicit (e.g., "continues cannabis use"). Clinicians use diverse and often ambiguous language to document patients' reasons for cannabis use. Automating extraction of documentation about patients' cannabis use could facilitate clinical decision support and epidemiological investigation but will require large amounts of gold standard training data.