InChI Spam

February 28, 2007

Do you remember when getting email - any email - was exciting? For me, that time was 1995 and I had just found the Internet. Of course, I remember looking forward to messages from people I knew. But I also remember being blown away by the idea that I could write to anyone with an email account, anywhere in the world, essentially for free - and that they could do the same. Back then, it was fun to get email, no matter what the source.

Today, spam is something that I, like millions of others, deal with on a daily basis. And it's not limited to email. Anyone who runs a blog knows about comment spam and how difficult it can be to eradicate. Even trackback is being used as a medium for blog spam. Of course, keyword spam on the Web has been a constant problem for search engines - eliminating it has in part led to more than a few fortunes earned at companies like Google.

Recently, I introduced a small Web application called InChIMatic. It lets you conveniently do exact-structure molecular queries through popular search engines like Google. Draw your structure, click "Search" and find your matches.
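
To give a rough idea of what an exact-structure query through a search engine can look like, here is a minimal sketch in Python. It is not InChIMatic's actual code: the function name inchi_search_url is made up for illustration, and it assumes you already have the InChI as a string (converting a drawn structure to an InChI is a separate step not shown here). The InChI is wrapped in double quotes so the engine treats it as an exact phrase, then URL-encoded into an ordinary Google search URL.

```python
from urllib.parse import urlencode


def inchi_search_url(inchi: str, base: str = "https://www.google.com/search") -> str:
    """Build a search URL that looks for an InChI as an exact phrase.

    Hypothetical helper for illustration only; not taken from InChIMatic.
    """
    # Double quotes ask the search engine to match the identifier exactly.
    query = f'"{inchi.strip()}"'
    # urlencode percent-escapes the quotes, "=" and "/" found in an InChI.
    return f"{base}?{urlencode({'q': query})}"


if __name__ == "__main__":
    # Methane, written as a standard InChI.
    print(inchi_search_url("InChI=1S/CH4/h1H4"))
```

Because an InChI is just text, nothing here is specific to one search engine; passing a different base URL is enough to try the same query elsewhere, provided that engine also uses a q parameter.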

There aren't a lot of InChIs visible to search engines now, as an InChIMatic query for even the most trivial molecule will reveal. Regardless of your views on InChI as a technology for bringing chemistry to the Web, it seems very likely that the number of InChIs visible to search engines will increase significantly over the next few years. And with this increase may come sites dedicated to nothing other than publishing a lot of irrelevant InChIs in the hope of attracting accidental advertising click-throughs.

Right now, searching the Web by InChI offers a very high signal-to-noise ratio - not unlike email in 1995. The shysters haven't yet discovered it and nobody is counting on the technology for mission-critical work. But if and when the idea of indexing chemical content on the Web through InChIs begins to catch on, filtering tools will become essential. If this scenario seems implausible, think back to your first experience with email and how concerned you were about spam then.

Photo Credit: cobalt123