Chemistry
Reinforcing the supply chain of umifenovir and other antiviral drugs with retrosynthetic software
Y. Lin, Z. Zhang, et al.
The COVID-19 pandemic exposed vulnerabilities in global supply chains, including those for pharmaceutical starting materials. As numerous antiviral and anti-inflammatory drugs were investigated for repurposing, the potential scale of demand risked overwhelming existing inventories and raw material supplies. The authors hypothesized that modern computer-assisted retrosynthetic planning could rapidly identify alternate, cost-competitive starting materials and routes to mitigate supply-chain stress if a therapeutic required rapid scale-up. Focusing on 12 investigational COVID-19 drugs, with experimental emphasis on umifenovir, the study aimed to design routes that avoid known starting materials while maintaining competitive step counts, feasibility, and scalability, leveraging the SYNTHIA retrosynthesis platform. A key goal was to validate that excluding published starting materials during searches can reveal distinct, affordable raw material supply chains and to experimentally realize select proposed routes.
The study situates itself within prior work on automated retrosynthesis and drug repurposing. Earlier efforts used automated planning to design contingency syntheses for hydroxychloroquine and remdesivir, but these generally began from known starting materials or accepted longer or costlier routes. Foundational and recent advances in computer-assisted synthesis planning (Corey and Wipke; deep neural networks with symbolic AI; robotic flow synthesis; expert–ML synergy; patented-route navigation; transformer models; and planning of complex natural products) are cited as enabling technologies. The authors also note umifenovir’s prior use against SARS-CoV-1 and its in vitro activity against SARS-CoV-2, although clinical outcomes for COVID-19 have been mixed. The paper builds on this work by shifting to a starting-material-centric strategy: crowd-sourcing literature and patent routes, encoding them as SMILES so that known starting materials can be excluded during searches, and benchmarking predictions against published methods.
- Data collection and exclusion strategy: Lab members crowd-sourced all published and patented routes for each of 12 target drugs. Routes were encoded as SMILES, and the concatenated list of starting-material SMILES was used as an exclusion filter in SYNTHIA searches to enforce novel starting materials. An interactive route visualizer (covidroutes.cernaklab.com) was built to review and compare routes.
- Retrosynthetic search: SYNTHIA was configured to generate 50 route proposals per target. Heuristics prioritized minimal starting-material cost, with adjustments (e.g., relaxing cost, increasing beam width) when necessary to control step count. Manual review prioritized route brevity, synthetic feasibility, scalable conditions (minimizing cryogenics, pyrophorics), chemo-/regio-/stereoselectivity, and overall cost.
- Targets: Twelve drugs were examined: remdesivir, umifenovir, bromhexine, galidesivir, ritonavir, cobicistat, ribavirin, camostat, darunavir, nelfinavir, favipiravir, and baricitinib.
- Experimental validation: Four predicted routes to umifenovir and one route to bromhexine were executed with minor modifications when needed. High-throughput experimentation was used to optimize a copper-catalyzed coupling (ligand 54) in one umifenovir route. General experimental practices included inert atmosphere reactions (glovebox/Schlenk), purified solvents, silica gel chromatography, and TLC monitoring.
- Heuristic refinements: For umifenovir, the team deliberately excluded Nenitzescu reaction proposals by keyword when default criteria surfaced known starting-material routes. For bromhexine, a manually designed one-step C–H functionalization route using commodity reagents was implemented and added to the SYNTHIA database for future prediction.
- Scalability considerations: Manual curation emphasized catalytic operations, avoidance of expensive catalysts, and safer conditions for potential multikilogram production; software recommendations were adapted to substrate-specific needs.
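The exclusion and ranking logic described in the list above can be illustrated with a minimal sketch. This is not SYNTHIA's interface or the authors' code: the `Route` class, the `exclude_known` and `rank` helpers, and all SMILES strings and prices are hypothetical placeholders. It also assumes SMILES have already been canonicalized, so that plain string comparison is enough to recognize a known starting material.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    """Hypothetical record for one proposed route (not a SYNTHIA object)."""
    name: str
    starting_materials: frozenset  # SMILES strings, assumed already canonical
    steps: int
    material_cost: float  # hypothetical summed list price of inputs, $/g


def exclude_known(routes, known_smiles):
    """Keep only routes that use none of the known starting materials."""
    known = set(known_smiles)
    return [r for r in routes if known.isdisjoint(r.starting_materials)]


def rank(routes, max_steps=None):
    """Sort by starting-material cost, then step count; optionally cap steps."""
    kept = [r for r in routes if max_steps is None or r.steps <= max_steps]
    return sorted(kept, key=lambda r: (r.material_cost, r.steps))


# Toy placeholder data (not structures or prices from the study).
known = {"CCO", "c1ccccc1"}
candidates = [
    Route("r1", frozenset({"CCO", "CC(C)=O"}), 4, 0.50),
    Route("r2", frozenset({"CC(C)=O", "CC(=O)O"}), 6, 0.20),
    Route("r3", frozenset({"CCN", "CC(=O)O"}), 5, 0.90),
]

novel = exclude_known(candidates, known)
print([r.name for r in rank(novel)])               # ['r2', 'r3']
print([r.name for r in rank(novel, max_steps=5)])  # ['r3']
```

Capping `max_steps` mirrors the trade-off noted above: the cheapest inputs may come with a longer route, so relaxing the cost priority can be necessary to keep step counts competitive.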
- General: Across the 12 drugs, the software identified alternative starting materials distinct from those used in literature routes, often with equal or fewer steps. Excluding known starting materials steered the searches toward novel, cost-competitive inputs.
- Galidesivir (4): Predicted sequence included trans-hydroiodination, Evans auxiliary alkylation to form 18, Ullmann coupling to 19, enantioselective Heck coupling to 22, followed by in situ Boc deprotection and dihydroxylation to reach 4. The algorithm successfully avoided five established pyrrolopyrimidine starting materials to propose 21, cost-competitive with published analogs (e.g., 4-chloro-5H-pyrrolo[3,2-d]pyrimidine at $9.90/g vs 7-bromo-5H-pyrrolo[3,2-d]pyrimidin-4-ol used in reported syntheses at $280/g). Route favors robust, often catalytic chemistry.
- Umifenovir (2): Four routes were experimentally validated.
  - Route A (oxidative indole formation + Baeyer–Villiger): From 1-(4-aminophenyl)ethan-1-one (26, $1.15/g) and ethyl acetoacetate (27, $0.03/g) via indium-catalyzed formation of 31 and oxidative cyclization to indole 32 (optimized to 47% yield with MgSO4). N-methylation to 33 proceeded in 99% yield. Direct Baeyer–Villiger oxidation of 33 gave mixtures due to competing Prilezhaev oxidation. Introducing a directing chloro group (chlorination to 35) enabled selective Baeyer–Villiger to 36; bromination to 37, thioetherification with 38 and in situ saponification afforded 39; alkylation with 40 provided 2.
  - Route B (benzylic C–H oxidation): Starting from 41 gave 42 in 79% yield; N-methylation to 43 in 92% yield. Software-recommended Oxone/KBr conditions failed; Baran–Roček oxidation enabled selective C14 oxidation in 62% yield, intercepting the prior route.
  - Route C (Friedel–Crafts acylation with chloroacetyl chloride): From low-cost 45, oxidative indole formation to 46, methylation to 47, then AlCl3-catalyzed Friedel–Crafts acylation with chloroacetyl chloride (48, $0.13/g) directly installed the chloromethyl ketone (35), intercepting earlier routes. The acylation exhibited 2:1 regioselectivity requiring optimization. Overall, six C–H bonds were transformed into new functionalities over seven steps from 27 and 45 to 2.
  - Route D (Bamberger rearrangement): Starting from 2,5-dibromo-1-nitrobenzene (49, $1.73/g), reduction to the hydroxylamine and Bamberger rearrangement gave 50 (38% over two steps), methylation to 51, copper-catalyzed coupling with 53 (from 52 and 38) delivered 39 in 66% yield (with ligand 54), followed by alkylation with 40 to 2. This most convergent route limited the longest linear sequence to five steps.
- Bromhexine (3): A predicted route used benzylic C–H oxidation to enable reductive amination and N-methylation from novel starting material 55 (avoiding 56–59). The team designed and validated a one-step alternative: direct C–H functionalization coupling of 2,4,6-tribromoaniline (64, $0.51/g) with N,N-dimethylcyclohexylamine (65, $0.10/g) using tert-butyl peroxide to give 3 in 41% yield. The transformation was added to the SYNTHIA database and surfaced as a top hit in subsequent searches.
- Operational timeline: The work to identify and validate routes was completed over nine weeks in Spring 2020, demonstrating rapid response capability.
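The step yields and list prices quoted in the list above can be combined with simple arithmetic. The sketch below is an illustration only, assuming that isolated yields of consecutive steps in a linear sequence multiply; the helper names `overall_yield` and `price_ratio` are hypothetical. It uses the three reported Route B yields (79%, 92%, 62%) and the two pyrrolopyrimidine list prices quoted for the galidesivir comparison.

```python
from math import prod


def overall_yield(step_yields):
    """Cumulative fractional yield of a linear sequence of steps."""
    return prod(step_yields)


def price_ratio(expensive, cheap):
    """How many times more costly one input is than another (same units)."""
    return expensive / cheap


# Route B steps reported above: 41 -> 42 (79%), N-methylation (92%),
# Baran–Roček oxidation (62%).
route_b = overall_yield([0.79, 0.92, 0.62])
print(f"{route_b:.1%}")  # 45.1%

# List prices quoted above: $280/g vs $9.90/g.
print(f"{price_ratio(280.0, 9.90):.1f}x")  # 28.3x
```

The cumulative figure covers only the three Route B steps with reported yields up to the point where the route intercepts the earlier sequence; it is not an overall yield to umifenovir.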
The findings validate that a starting material-centric retrosynthetic strategy, enforced by excluding known starting materials, can yield alternative, affordable supply chains for therapeutics without sacrificing route length or feasibility. Experimental realization of four distinct umifenovir routes and a concise bromhexine synthesis demonstrates that software-guided proposals translate to the lab with limited modifications. Incorporating C–H functionalization logic often reduced starting-material costs by leveraging simpler feedstocks. Where software predictions encountered selectivity challenges (e.g., Baeyer–Villiger vs Prilezhaev oxidation), targeted adjustments (chlorine directing group) restored selectivity, highlighting the complementary roles of human expertise and automated planning. The approach is broadly applicable across diverse targets (12 drugs analyzed) and supports rapid contingency planning during supply disruptions. Practical considerations for large-scale production (avoiding cryogenics, pyrophorics, chromium waste, and hazardous peroxides) remain, but the strategy provides immediate, testable alternatives that can be further optimized for industrial deployment.
This study merges crowd-sourced literature curation with automated retrosynthetic planning to rapidly propose and validate alternative starting-material supply chains for 12 COVID-19-related drugs. Four experimentally validated routes to umifenovir and a one-step synthesis of bromhexine showcase how excluding known starting materials directs software to novel, cost-competitive inputs and concise routes, frequently employing C–H functionalization. The workflow, tooling (SYNTHIA and a public route visualizer), and validation demonstrate a scalable framework for supply-chain resilience in pharmaceuticals. Future work includes full process development for industrial scalability (safer oxidants, improved regioselectivity, catalyst and condition optimization), expanding databases with new transformations, and enhancing software selectivity prediction for challenging functional groups (e.g., chiral phosphorus).
- Software-predicted conditions at times required substantial adjustment (e.g., Oxone/KBr failed for the benzylic oxidation, and the Baran–Roček oxidation used in its place generates chromium waste unsuitable for scale).
- Selectivity limitations: Baeyer–Villiger vs undesired Prilezhaev oxidation necessitated installing a directing chloro group; Friedel–Crafts acylation showed only 2:1 regioselectivity, requiring optimization for manufacturing.
- Safety and scalability: Use of tert-butyl peroxide and chromium reagents raises hazards and waste concerns for large-scale production; alternatives would be needed.
- Scope limitations: Not all predicted routes were experimentally validated; some functionalities (e.g., chiral phosphorus) challenged the software.
- Cost estimates relied on list prices and may differ at production scale; competitiveness would depend on vendor bidding and procurement.
- Timeline constraints (nine weeks) limited full process optimization and comprehensive route benchmarking.