Chemical Structure/Substructure Searching of Patents
A Mash-up of ChemSpider and the SureChem Patent Database
Antony Williams

Introduction

ChemSpider integrates with a multitude of databases and has recently been integrated with the SureChem patent database. SureChem lets users search the full text of the US, European and WO patent collections online, by text and by chemical structure. ChemSpider has indexed the chemical structures associated with all patents in the SureChem patent database and provides a direct link between the two services. This technical note details how to perform searches on ChemSpider and access the related patent data. Users are encouraged to read the Manual for a deeper explanation of each point described here.

 

 

Accessing Searches

To access ChemSpider, enter www.chemspider.com into the address bar of an Internet browser. The Home Page is the starting point for navigating most of the website. The navigation interface is illustrated below with descriptive annotations.

ChemSpider Home

Click on the Search button to access the search capabilities of the ChemSpider database.

 

Performing Text-Based Searches

The most basic search screen is shown below:

 

Structures can be searched for by entering a trivial name, systematic name, SMILES string, InChI string or registry number. The identifier types are described below.

 

Identifier Type

Systematic Name – this includes two primary types of systematic nomenclature – IUPAC names and Index (CAS-type) names.

Trivial Name – this allows searching by one or more trivial or trade names associated with a chemical structure.

Registry Numbers – this allows searching by the various registry numbers associated with a chemical structure.

SMILES strings – the SMILES strings listed in the database are those generated using the OpenBabel software.

InChI strings – InChI strings are canonicalized by default. Since all InChI strings are generated using the InChI DLLs provided by IUPAC, searching should require only a simple copy and paste into the search box.
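The identifier types above differ enough in syntax that a script can often guess which one a given string is, for example to pre-sort a list of identifiers before pasting them into the search box. The following is a minimal Python sketch of such a guess. The function name guess_identifier_type and the patterns are assumptions made for illustration; they are not ChemSpider's own logic, and very short SMILES strings and short names can be confused with one another.

```python
import re

# Heuristic patterns (assumptions for illustration, not ChemSpider's own rules).
CAS_RE = re.compile(r"\d{2,7}-\d{2}-\d")  # CAS-style registry number
EC_RE = re.compile(r"\d{3}-\d{3}-\d")     # EINECS/EC-style registry number
# Unbracketed organic-subset atoms, bracket atoms, ring-closure digits,
# bond symbols and branch parentheses.
SMILES_RE = re.compile(
    r"(?:Cl|Br|[BCNOPSFI]|[bcnops]|\[[^\]]+\]|\d|[()=#$/\\.%-])+"
)


def guess_identifier_type(text: str) -> str:
    """Return a best-guess label for a search string."""
    s = text.strip()
    if s.startswith("InChI="):
        return "InChI"
    if EC_RE.fullmatch(s):
        return "EC (EINECS) number"
    if CAS_RE.fullmatch(s):
        return "CAS registry number"
    if SMILES_RE.fullmatch(s):
        return "SMILES"
    # Anything else is treated as a trivial or systematic name.
    return "name"
```

Registry-number patterns are tested before the SMILES pattern because a string of digits and hyphens is also built entirely from characters that are legal in SMILES.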

 

 

Some example identifiers for the same molecule are shown below:

Systematic names: 8-chloro-1-methyl-6-phenyl-4H-[1,2,4]triazolo[4,3-a][1,4]benzodiazepine

Trivial names: Xanax

Registry numbers: 249-349-2 (EINECS Number), 28981-97-7 (CAS Registry Number)

SMILES codes: Clc2cc3/C(=N\Cc1nnc(C)n1c3cc2)c4ccccc4

InChI strings: InChI=1/C17H13ClN4/c1-11-20-21-16-10-19-17(12-5-3-2-4-6-12)14-9-13(18)7-8-15(14)22(11)16/h2-9H,10H2,1H3
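The two registry numbers in the example above each carry a check digit: 28981-97-7 follows the CAS Registry Number pattern and 249-349-2 is an EINECS (EC) number. A mistyped registry number can therefore often be caught before it is submitted as a search. The sketch below shows the two standard check-digit calculations in Python; the function names are hypothetical, and nothing here is part of ChemSpider itself.

```python
def cas_check_digit_ok(cas: str) -> bool:
    """CAS rule: weight the digits before the check digit 1, 2, 3, ...
    from right to left; the weighted sum modulo 10 is the check digit."""
    digits = [int(c) for c in cas if c.isdigit()]
    if len(digits) < 5:
        return False
    body, check = digits[:-1], digits[-1]
    total = sum(w * d for w, d in enumerate(reversed(body), start=1))
    return total % 10 == check


def ec_check_digit_ok(ec: str) -> bool:
    """EC/EINECS rule: weight the first six digits 1..6 from left to
    right; the weighted sum modulo 11 is the seventh (check) digit."""
    digits = [int(c) for c in ec if c.isdigit()]
    if len(digits) != 7:
        return False
    total = sum(w * d for w, d in enumerate(digits[:6], start=1))
    return total % 11 == digits[6]


print(cas_check_digit_ok("28981-97-7"))  # True
print(ec_check_digit_ok("249-349-2"))    # True
```

A passing check digit only shows that the number is well formed; it does not confirm that the number is assigned to the intended compound.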

 

Entering any of these identifiers returns a results screen. An example results screen is shown below.

 

 

Click on any structure of interest to expand it into a single-record view, as shown below.

 

 

If patents are available, they are listed in the Data Sources under the name SureChem. The number in parentheses indicates how many patents in the SureChem database contain that structure. In the example above there is only one patent. Click on the SureChem name (or any of the other links in the Data Sources) and the links to the data sources will be shown, as indicated below.

 

 

To view the patent, simply click on the External Link ID, in this case the number 6287620. The patent is shown below. It may be necessary to create a trial account for free access.

 

To highlight the searched name in the patent text, simply click on the Query field, as shown below.

The text will then be highlighted as shown:

 

 

 

To see other chemical compounds in the patent that have been identified by SureChem, select one of the two “Markup” options at the top of the document. “All Compounds” includes every chemical name identified by SureChem, while “Compounds with Structures” shows only those chemical names that SureChem has identified and been able to convert to chemical structures.

 

Search filters and data export options are available on the right side of the screen, allowing you to narrow search results and then export the related chemical structures to your desktop. For more information about SureChem, visit www.surechem.org.

 

Performing Structure-Based Searches

 

Structure and substructure searching is described in Section 2.3.3 of the ChemSpider Manual, available at http://www.chemspider.com/ChemSpiderManual.aspx. Please review the manual for details on how to perform these searches and on linking out to patents from the results list.

 

Conclusion

 

The integration of ChemSpider and SureChem provides online access to text-based and structure/substructure-based searching of both American and international patent databases. This mash-up gives ChemSpider users a further route to information about molecules of interest.