Free Chemistry Databases on the Web: Creating a Comprehensive Guide

Posted by Rich Apodaca Mon, 07 May 2007 13:32:00 GMT

One of Depth-First's more popular articles is a summary of free databases titled Thirty-Two Free Chemistry Databases. Clearly there is a need to link the producers of free chemical databases (developers) with the potential users of these services (chemists). Chemistry is slowly emerging from a decades-long period of over-reliance on a single supplier of information. As new players enter, they'll need some way to have their message heard.

The Problem

As evidence of this need, I'm getting more requests to list additional services on the Thirty-Two Databases article - or to provide an updated review of a service already there. This is wonderful!

One approach would be for me to simply research and write an updated article reviewing the new additions myself. The problem is that thirty-two is already a very large number to deal with. My guess is that there must now be well over sixty or seventy free chemistry databases. That's far too many for one person to research properly on their own.

On the other hand, the Web is all about collaboration, so why not try to use it that way?

An Idea

Here's the idea: if you run a free database or other online chemistry service and would like to promote it, post a comment to this article containing a link and brief description of what makes your service different/useful. If you've used a free chemistry database, feel free to provide your thoughts on it. If there's a free database you wish existed but doesn't yet, feel free to write about that. Unlike the other articles on this site for which comments are closed after two weeks, this article's comments will remain open indefinitely.

After some period of time, I'll use these comments to write a new article highlighting the new material.

Notice the use of the word "free". A free database can be used by any member of the general public without fees or a lengthy registration process. This includes both free speech and free beer services. There are more restrictive definitions that could be applied, but let's not worry about those just yet. Free beer is better than no beer at all.

Links can either be in HTML or Markdown. Here's one example of each:

<a href="http://megamolecules.com">MegaMolecules</a> (HTML)

[MegaMolecules](http://megamolecules.com) (Markdown)
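
If you have several links to post, the two formats above can be generated from a name and a URL. What follows is only a minimal sketch in Python; the helper names `html_link` and `markdown_link` are made up for illustration and are not part of this site's comment system.

```python
from html import escape


def html_link(name: str, url: str) -> str:
    """Return an HTML anchor, escaping the link text and the href value."""
    return f'<a href="{escape(url, quote=True)}">{escape(name)}</a>'


def markdown_link(name: str, url: str) -> str:
    """Return a Markdown inline link (assumes no brackets or parentheses in the inputs)."""
    return f"[{name}]({url})"


if __name__ == "__main__":
    # Reproduces the two example links shown above.
    print(html_link("MegaMolecules", "http://megamolecules.com"))
    print(markdown_link("MegaMolecules", "http://megamolecules.com"))
```

Either output can be pasted directly into a comment.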

The Outcome

I have no idea what kind of response this experiment will generate. But if past experience is any guide, large numbers of chemists are keenly interested in free chemistry databases. All they need is a link.

Image Credit: Kate and Dave Hugh

Comments

  1. Bill Tue, 08 May 2007 13:40:03 GMT

    You may already know about this, but Peter Suber just linked a list of chem databases on the web, most of which are OA.

  2. Rich Apodaca Wed, 09 May 2007 12:57:46 GMT

    Bill, thanks for that link. Yes, I've seen that one and several other fine lists. It might even be worthwhile building a meta-list of Web chemistry databases.

    The collection I have in mind is a little different in that only free databases will be included. The one in your link lists both free and non-free databases.

  3. Antony Williams Tue, 15 May 2007 00:16:20 GMT

    ChemSpider is a chemistry prediction engine and a structure-based search engine. It has been built with the intention of aggregating and indexing chemical structures and their associated information into a single searchable repository and making it available to everybody, at no charge. The ChemSpider database offers many properties for each of the chemical structures within the database: structure identifiers such as SMILES, InChI, IUPAC and Index Names, as well as many physicochemical properties. We intend ChemSpider to offer the fastest chemical structure searches available online, delivered with the flexibility and usability necessary to encourage repeat usage.

    What problems will ChemSpider solve? There are tens if not hundreds of chemical structure databases and no single way to search across them. There are databases of curated literature data, chemical vendor catalogs, molecular properties, environmental data, toxicity data, analytical data and on and on. The only way to know whether a specific piece of information is available for a chemical structure is to have simultaneous access to all of these databases. Since many of these databases are for profit there is no way to easily determine the availability of information within these commercial or even in the open access databases. With ChemSpider the intention is to aggregate into a single database all chemical structures available within open access and commercial databases and to provide the necessary pointers from the ChemSpider search engine to the information of interest. This service will allow users to either access the data immediately via open access links or have the information necessary to continue their searches into commercially available systems. The question “is there specific information about my chemical” will be answered. Accessing the information may require a commercial transaction with the appropriate provider.

    Structure searching in ChemSpider is presently accessible via a ChemSketch add-in or via a Structure Drawing Applet. Identifier searching of chemical names, systematic names, InChI or SMILES strings is accessible via the ChemSpider interface or via Firefox and Internet Explorer Add-ins. The ChemSpider search add-in can also be embedded directly in a web page.

  4. Geoff Wed, 16 May 2007 16:10:22 GMT

    I'm sorry Anthony, but this seems like you're spamming Rich's blog. You're not really posting a comment to the "comprehensive guide" to free databases. Surely you must have a list for ChemSpider -- why don't you share them with Rich (and all of us) by posting links to this post?

    Otherwise, this just looks like you're trying to get free publicity.

  5. Rich Apodaca Wed, 16 May 2007 17:43:46 GMT

    Strange as it may sound, Geoff, I *am* looking for this kind of "spam." Let's not call it spam, though.

    With free services like ChemSpider proliferating, one of their biggest problems is visibility. Chemists won't use them if they can't conveniently find them. I looked around and found very little addressing this issue.

    So I thank both of you: you for raising the issue, and Antony for diving in.

    BTW, there was a problem with markdown comments, which now seems to be fixed.

    Any other takers?

  6. Dana Roth Mon, 21 May 2007 23:43:24 GMT

    Are you sure there are ACS articles on ChemRefer?

    I found RSC articles, but this is expected as they have made some of their 1997+ material OA.

    I think there is a 3+ year rolling window that requires subscription.

  7. Richard Apodaca Tue, 22 May 2007 01:02:41 GMT

    Dana,

    ChemRefer did at one time link to ACS articles. But none of them were hosted on the ChemRefer server. Rather, ChemRefer spidered the Web looking for pdfs that had been self-archived by authors on their personal websites.

    It now looks like links to those self-archived ACS articles are no longer there. But I can assure you that a few months ago, they were.

  8. Antony Williams Tue, 22 May 2007 16:03:40 GMT

    Geoff/Rich - just returned to Depth-First for the first time in a few days to read Geoff's post. I thought that the posting was exactly the type of information Rich was requesting so I am glad to see it confirmed. Phew. There certainly was no intention to spam, just to post back to his request "if you run a free database or other online chemistry service and would like to promote it, post a comment to this article containing a link and brief description of what makes your service different/useful." What surprises me is that no one else has posted and I will do what I can to expose Rich's request on the ChemSpider Blog.

  9. Joerg Kurt Wegner Wed, 23 May 2007 02:08:37 GMT

    Should those things not also be added to Wikipedia? http://en.wikipedia.org/wiki/Category:Chemical_databases

    And the PubChem article already provides some information about the content of the database: http://en.wikipedia.org/wiki/Pubchem

    So, might it not be useful to create a 'ChemistryDatabaseBox' template analogous to a ChemBox? http://en.wikipedia.org/wiki/Template:Chembox_new

    Why? DBPedia uses Infobox templates http://dbpedia.org/docs/

  10. Joerg Kurt Wegner Wed, 23 May 2007 03:56:10 GMT

    Rich,

    please also check the excellent compilation by Beetstra:
    http://chemistry.poolspares.com/wiki/Wikipedia:Chemical_sources
    http://chemistry.poolspares.com/wiki/Main_Page

    It is being used in this search: http://chemistry.poolspares.com/wiki/Special:Chemicalsources

    Example: http://chemistry.poolspares.com/index.php?title=Special:Chemicalsources&CAS=58-08-2

    Comment: this resolves deep-linking problems and the question of whether DBs are searchable with the provided information, which is part of this discussion ( http://en.wikipedia.org/wiki/Templatetalk:Drugbox#Substructuresearchin_eMoleculesandPubChemadded ) and affects the Wikipedia templates "DrugBox", "ChemBox", and "Chembox new".

    Joerg

  11. David Bradley Wed, 23 May 2007 09:30:32 GMT

    Chemspy.com may be worth a look. It needs a bit of a spring clean, but has various MSDS search tools built in and access to chemistry tutorials across the web. The search box at bottom right provides quick and easy access to various chem/science-related search engines.

    db

  12. Kevin Theisen Fri, 01 Jun 2007 13:18:06 GMT

    One database I haven't seen mentioned is SDBS, which contains over 100,000 spectra free to search (http://www.aist.go.jp/RIODB/SDBS/cgi-bin/cre_index.cgi). I use this one all the time.

    As for free chemical applications, I have been creating and distributing educational tools through my site www.ichemlabs.com. At the moment iChemLabs hosts 1H and 13C NMR virtual simulators and MolGrabber, a utility for producing molecule files and graphics (using PubChem). These applications are being used at universities across the country and I am looking for online means to reach students and interested scientists.

    I hope this post helps.

  13. Will Mon, 04 Jun 2007 12:42:28 GMT

    In answer to the above query.

    ChemRefer has largely scaled back its spidering of research group / departmental websites due to the fact that it annoys the publishers, some of whom say that most such papers are self-archived illegally by their authors.

    I was contacted by a publisher who said I might be pursued for contributory infringement if I linked to articles that are (allegedly) illegally self-archived by their authors on university websites. So, I decided to stop since I cannot verify publisher-to-author copyright agreements.

    Any author of an article that was de-indexed from ChemRefer that believes this was unfair can contact me.

    Nevertheless, ChemRefer passed the mark of 50,000 indexed articles a month ago. And ChemRefer on ChemSpider should soon pass the 100,000 article mark. Not bad, all things considered.

  14. Alex Trofimov Thu, 07 Jun 2007 04:33:46 GMT

    R&D Chemicals is a common and freely accessible chemicals catalog and directory of suppliers of products and services for research and development over the internet. Its search engine allows you to find a chemical by its molecular formula, IUPAC and trade name, part of name, common name, CAS number, catalog number, structure or substructure. We include only truly unique and rare chemicals in our database and aren't aiming for a large one.

  15. Richard Apodaca Sat, 09 Jun 2007 15:21:44 GMT

    Wiley's Encyclopedia of Reagents for Organic Synthesis (e-EROS) gives short single-compound summaries for a variety of organic molecules. Each summary has its own DOI, for example this one for quinoline. Search by structure (plugin required) or alphabetically by name.

    Bravo to Wiley for making this a free resource.

  16. Richard Apodaca Sat, 09 Jun 2007 15:36:50 GMT

    Clarification to the above on e-EROS. The abstracts are free, but viewing detailed content and references requires a paid subscription. Just creating an account apparently isn't enough.

    Still potentially useful, though...

  17. Richard Apodaca Sat, 09 Jun 2007 15:46:23 GMT

    Distributed Structure-Searchable Toxicity (DSSTox) is "a project of EPA's Computational Toxicology Program, helping to build a public data foundation for improved structure-activity and predictive toxicology capabilities. The DSSTox website provides a public forum for publishing downloadable, standardized chemical structure files associated with toxicity data."

    Various DSSTox datasets can be downloaded by FTP.

  18. Antony Williams Sun, 17 Jun 2007 23:31:07 GMT

    I'm just checking in to see how the construction of an updated list of Free Chemistry Databases is going. Based on the posts to this blog it appears that only ChemSpider has been publicly posted... maybe you have received a lot of information offline? Is there an estimated date for release of an update? I'm interested because, if there are some free resources, I'd like to know so I can approach them to see if they would be interested in submitting their data to ChemSpider.

  19. Antony Williams Sun, 17 Jun 2007 23:54:06 GMT

    Rich... as this collection is brought together, what I am interested in is the measure of quality of these free databases. I have blogged about this elsewhere in my question about "Zen and the Art of Chemical Structure Databases. What is the definition of “Quality”?" (http://www.chemspider.com/blog/?p=27). I'm interested in your thoughts and those of your readers. As we are assembling data we are of course finding different levels...

  20. Richard Apodaca Mon, 18 Jun 2007 13:31:20 GMT

    Antony - as you can see, the response has been rather quiet. I've continued to collect my own links, but I think the next summary is at least three months away.

    In the meantime, check out the CMLD-BU reaction database.
