(Generated with DALL-E 3 ∙ 30 October 2023 at 1:48 pm)

We have some very exciting news to report: the new SureChEMBL is now available! Hooray!

What is SureChEMBL, you may ask. Good question! In our portfolio of chemical biology services, alongside our established database of bioactivity data for drug-like molecules ChEMBL, our dictionary of annotated small molecule entities ChEBI, and our compound cross-referencing system UniChem, we also deliver a database of annotated patents!

Almost 10 years ago, EMBL-EBI acquired the SureChem system of chemically annotated patents and made this freely accessible in the public domain as SureChEMBL. Since then, our team has continued to maintain and deliver SureChEMBL. However, this has become increasingly challenging due to the complexities of the underlying codebase. We were awarded a Wellcome Trust grant in 2021 to completely overhaul SureChEMBL, with a new UI, backend infrastructure, and new f
Comments
As to the source of the patent structures: there are a number of initiatives underway at the moment to text-mine chemical structures from patents. We're currently not free to say what some of these sources are, but one source could be the feed from the EPO team.
These structures would be loaded into UniChem (qv) and all the lookups done there.
A big problem with other ways of accessing chemical patent data is shown by your other comments: indirect access through semi-open resources, with a significant onus on the user to ensure they don't violate any explicit or ambiguous usage constraints or licenses.
One of the ideas of patent filings is explicitly to make things easy to find, so that researchers don't waste time recreating other people's IP and can also build on top of it. Current systems do not really allow this.