<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet href="http://feeds.feedburner.com/~d/styles/rss2full.xsl" type="text/xsl" media="screen"?><?xml-stylesheet href="http://feeds.feedburner.com/~d/styles/itemcontent.css" type="text/css" media="screen"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:trackback="http://madskills.com/public/xml/rss/module/trackback/" version="2.0">
  <channel>
    <title>Depth-First comments</title>
    <link>http://depth-first.com</link>
    <language>en-us</language>
    <ttl>40</ttl>
    <description>Walking the Web of Chemical Informatics</description>
    <atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="self" href="http://feeds.feedburner.com/depth-first-comments" type="application/rss+xml" />
    <feedburner:emailServiceId xmlns:feedburner="http://rssnamespace.org/feedburner/ext/1.0">1649950</feedburner:emailServiceId>
    <feedburner:feedburnerHostname xmlns:feedburner="http://rssnamespace.org/feedburner/ext/1.0">http://www.feedburner.com</feedburner:feedburnerHostname>
    <item>
      <title>"SciFinder Web, Greasemonkey, and REST: Embracing Divergence in Chemical Information Systems" by Rajarshi Guha</title>
      <description>&lt;p&gt;Nice discussion. I certainly agree with REST interfaces - makes life &lt;em&gt;much&lt;/em&gt; easier for client code&lt;/p&gt;</description>
      <pubDate>Thu, 20 Nov 2008 04:21:46 +0000</pubDate>
      <guid isPermaLink="false">urn:uuid:1efa1290-0221-4ac2-a3b1-455c67e84eda</guid>
      <link>http://depth-first.com/articles/2008/11/19/scifinder-web-greasemonkey-and-rest-embracing-divergence-in-chemical-information-systems#comment-899</link>
    </item>
    <item>
      <title>"Building ChemWriter: What to Do When Requesting Applet Keyboard Focus Leads to Disappearing Popup Windows" by Rich Apodaca</title>
      <description>&lt;p&gt;It now appears that this approach does not work on Firefox 3 (both XP and Vista) with any version of Java. Different versions of the Java plugin lead to different reasons for failure.&lt;/p&gt;

&lt;p&gt;The fix does work with IE 7 under any version of the Java plugin.&lt;/p&gt;

&lt;p&gt;Google Chrome 0.3.154.9 using JRE 6 update 10 does not exhibit the unwanted popup behavior at all and so needs no fix.&lt;/p&gt;</description>
      <pubDate>Wed, 19 Nov 2008 18:46:32 +0000</pubDate>
      <guid isPermaLink="false">urn:uuid:aa1e84f0-ca4d-476f-bd32-4156dd0d8f8f</guid>
      <link>http://depth-first.com/articles/2008/11/06/building-chemwriter-what-to-do-when-requesting-applet-keyboard-focus-leads-to-disappearing-popup-windows#comment-898</link>
    </item>
    <item>
      <title>"One of These Things is Not Like The Others" by Andrew Dalke</title>
      <description>&lt;p&gt;There are too many non-HTML markup formats in the world: Markdown, wikitext, reStructuredText, BBCode, and Textile. I know HTML. I like Blogger because I can just enter a subset of HTML and be happy.&lt;/p&gt;

&lt;p&gt;I'm now working on the theory that Ullmann is a monomorphism algorithm. That is, you are correct about the names but that distinction wasn't so important back in the 1970s. I found &lt;a href="http://www.lsi.upc.es/~valiente/grammatica/Grammatica.m" rel="nofollow"&gt;this text&lt;/a&gt; which claims: "Current version of function 'AllMonoMorphisms' implements the graph monomorphism algorithm by J. R. Ullman, "An algorithm for subgraph isomorphism" (Journal of the ACM, 23(1):31-42, 1976)."&lt;/p&gt;

&lt;p&gt;The &lt;a href="http://www.algorithmic-solutions.com/pdf/graph_iso.pdf" rel="nofollow"&gt;LEDA documentation&lt;/a&gt; says that "Graph monomorphism is a weaker kind of subgraph isomorphism." That's what you've been saying. But the terminology that "X is a weaker form of Y" when "X is not a Y" is strange.&lt;/p&gt;

&lt;p&gt;&lt;a href="http://liris.cnrs.fr/Documents/Liris-2940.pdf" rel="nofollow"&gt;This paper&lt;/a&gt; says "A subgraph isomorphism problem between a pattern graph ..  is also called subgraph monomorphism problem or subgraph matching 
in the literature." &lt;a href="http://msdl.cs.mcgill.ca/people/hv/teaching/MSBDesign/COMP762B2004/presentations/040218.MarcProvost.pdf" rel="nofollow"&gt;This one&lt;/a&gt; says "(Partial) Subgraph Isomorphism. An injective (one-to-one) homomorphism. Also called monomorphism in the literature." Wonderful.&lt;/p&gt;

&lt;p&gt;Ooo! Look &lt;a href="http://edoc.bib.ucl.ac.be:81/ETD-db/collection/available/BelnUcetd-06262008-145353/unrestricted/zampelli-thesis-printed-version-june2008.pdf" rel="nofollow"&gt;here&lt;/a&gt;! "By constraining the function f , new types of morphism are defined: 1. isomorphism, if f is a bijection, 2. epimorphism, if f is a surjection,  3. monomorphism, if f is an injection."&lt;/p&gt;

&lt;p&gt;Okay, I've definitely been thinking that Ullmann finds monomorphisms. It might be labeled "isomorphism" but not mean "isomorphism" in the sense these papers use. I just need to wait some weeks before I get to some place where I can look up Ullmann, and probably more before I actually try to implement it myself.&lt;/p&gt;

&lt;p&gt;Still, the people I know of who implemented SMARTS matchers refer to Ullmann as the source algorithm.&lt;/p&gt;</description>
      <pubDate>Sun, 16 Nov 2008 20:31:36 +0000</pubDate>
      <guid isPermaLink="false">urn:uuid:43e2447f-c842-4ca0-98b0-7f7763a84297</guid>
      <link>http://depth-first.com/articles/2008/11/13/one-of-these-things-is-not-like-the-other#comment-897</link>
    </item>
    <item>
      <title>"One of These Things is Not Like The Others" by Rich Apodaca</title>
      <description>&lt;p&gt;Andrew, I found &lt;a href="http://www.info.ucl.ac.be/~pdupont/pdupont/pdf/symcon06.pdf" rel="nofollow"&gt;this article&lt;/a&gt; discussing subgraph monomorphism. Page 2 defines "subgraph monomorphism" as:&lt;/p&gt;

&lt;blockquote&gt;
    &lt;p&gt;A subgraph monomorphism (or subgraph matching) between
    Gp and Gt is a total injective function f : Np → Nt
    respecting the monomorphism constraint: (u, v) ∈ Ep ⇒
    (f(u), f(v)) ∈ Et. Figure 1 shows an example of subgraph
    monomorphism.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Figure 1 is a direct analog of CCC mapping onto C1CC1.&lt;/p&gt;
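
&lt;p&gt;The quoted definition can be checked by brute force on this exact case. What follows is only a minimal sketch, not anything taken from the article or from an Ullmann implementation: the graph encoding (an atom count plus a set of unordered bonds) and the names PROPANE, CYCLOPROPANE, is_monomorphism, and find_monomorphisms are assumptions made up for illustration.&lt;/p&gt;

```python
from itertools import permutations

# Hypothetical encoding: atoms are indices 0..n-1, bonds are unordered pairs.
PROPANE = {"atoms": 3, "bonds": {frozenset((0, 1)), frozenset((1, 2))}}  # CCC
CYCLOPROPANE = {
    "atoms": 3,
    "bonds": {frozenset((0, 1)), frozenset((1, 2)), frozenset((0, 2))},
}  # C1CC1


def is_monomorphism(mapping, pattern, target):
    """True if every pattern bond (u, v) lands on a target bond (f(u), f(v))."""
    return all(
        frozenset(mapping[atom] for atom in bond) in target["bonds"]
        for bond in pattern["bonds"]
    )


def find_monomorphisms(pattern, target):
    """Try every injective atom map and keep those that preserve pattern bonds."""
    return [
        mapping
        for mapping in permutations(range(target["atoms"]), pattern["atoms"])
        if is_monomorphism(mapping, pattern, target)
    ]


# CCC maps onto C1CC1 under this definition; the reverse direction does not.
print(len(find_monomorphisms(PROPANE, CYCLOPROPANE)))
print(len(find_monomorphisms(CYCLOPROPANE, PROPANE)))
```

&lt;p&gt;Only the bonds of the query are tested, so the extra ring-closing bond in the target is never looked at. That is why propane matches cyclopropane under the monomorphism constraint.&lt;/p&gt;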

&lt;p&gt;Unfortunately, this document doesn't define "subgraph isomorphism" for comparison.&lt;/p&gt;

&lt;p&gt;&lt;a href="http://spdp.dti.unimi.it/papers/tissec05.pdf" rel="nofollow"&gt;This article&lt;/a&gt; has this to say about "subgraph monomorphism" (page 20):&lt;/p&gt;

&lt;blockquote&gt;
    &lt;p&gt;The problem of finding a mapping between each vertex of L(GA) and each vertex
    of L(GI) is thus a graph-subgraph monomorphism, that is, the problem of finding
    whether L(GA) is a (partial, or not-induced) subgraph of L(GI).&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Both seem consistent with the view in my previous comment. In other words, I don't think that because we're talking about "subgraph" versus "graph" iso/monomorphism that the edge-induction constraints are any different.&lt;/p&gt;

&lt;p&gt;Then again, this could be an example of mathematicians having the same problems with terminology that &lt;a href="http://depth-first.com/articles/2007/11/28/smiles-and-aromaticity-broken" rel="nofollow"&gt;chemists have with aromaticity&lt;/a&gt; ;-).&lt;/p&gt;

&lt;p&gt;Still, the only way to answer your previous question about Ullmann giving edge-induced mappings is to &lt;a href="http://doi.acm.org/10.1145/321921.321925" rel="nofollow"&gt;read the paper&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;BTW, if you'd like to include hyperlinks in your comments, you can do it with this notation (&lt;a href="http://daringfireball.net/projects/markdown/" rel="nofollow"&gt;Markdown&lt;/a&gt;):&lt;/p&gt;

&lt;p&gt;[anchor text](href)&lt;/p&gt;</description>
      <pubDate>Sun, 16 Nov 2008 16:27:04 +0000</pubDate>
      <guid isPermaLink="false">urn:uuid:e45366f7-8ed9-43bf-92ac-979d02845ecb</guid>
      <link>http://depth-first.com/articles/2008/11/13/one-of-these-things-is-not-like-the-other#comment-896</link>
    </item>
    <item>
      <title>"One of These Things is Not Like The Others" by Andrew Dalke</title>
      <description>&lt;p&gt;I read further down in Dörr and the bit about isomorphism is about graph isomorphism and not subgraph isomorphism. The Wikipedia link for "Graph isomorphism" (&lt;a href="http://en.wikipedia.org/wiki/Graph_isomorphism" rel="nofollow"&gt;http://en.wikipedia.org/wiki/Graph_isomorphism&lt;/a&gt; ) says "A generalization of the problem, the subgraph isomorphism problem, is known to be NP-complete."&lt;/p&gt;

&lt;p&gt;The page for "Subgraph isomorphism" (&lt;a href="http://en.wikipedia.org/wiki/Subgraph_isomorphism_problem" rel="nofollow"&gt;http://en.wikipedia.org/wiki/Subgraph_isomorphism_problem&lt;/a&gt; ) references the Ullmann paper and also links to  the "Induced subgraph isomorphism problem".&lt;/p&gt;

&lt;p&gt;The page for "Induced subgraph isomorphism problem" (&lt;a href="http://en.wikipedia.org/wiki/Induced_subgraph_isomorphism_problem" rel="nofollow"&gt;http://en.wikipedia.org/wiki/Induced_subgraph_isomorphism_problem&lt;/a&gt; ) says "This is different from the subgraph isomorphism problem in that the absence of an edge in G1 implies that the corresponding edge in G2 must also be absent. In subgraph isomorphism, these "extra" edges in G2 may be present."&lt;/p&gt;
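
&lt;p&gt;The difference described in that quote is small enough to sketch in code. This is a brute-force illustration under assumed names (bonds, PATH, TRIANGLE, matches), not the Ullmann algorithm and not code from any of the tools mentioned here. G1 is the pattern and G2 is the target; the induced flag switches on the extra rule that a missing edge in G1 must also be missing in G2.&lt;/p&gt;

```python
from itertools import combinations, permutations


def bonds(*pairs):
    """Build a set of unordered edges from (u, v) pairs."""
    return {frozenset(pair) for pair in pairs}


# Hypothetical encoding: (vertex count, edge set).
PATH = (3, bonds((0, 1), (1, 2)))  # CCC
TRIANGLE = (3, bonds((0, 1), (1, 2), (0, 2)))  # C1CC1


def matches(pattern, target, induced):
    """Return every injective vertex map from pattern (G1) into target (G2).

    Every G1 edge must map to a G2 edge. With induced=True, a vertex pair
    that is not an edge in G1 must also not be an edge in G2.
    """
    n_pattern, e_pattern = pattern
    n_target, e_target = target
    found = []
    for f in permutations(range(n_target), n_pattern):
        ok = True
        for u, v in combinations(range(n_pattern), 2):
            in_pattern = frozenset((u, v)) in e_pattern
            in_target = frozenset((f[u], f[v])) in e_target
            if in_pattern and not in_target:
                ok = False  # a G1 edge has no counterpart in G2
            if induced and in_target and not in_pattern:
                ok = False  # induced case: "extra" G2 edge is forbidden
        if ok:
            found.append(f)
    return found


print(len(matches(PATH, TRIANGLE, induced=False)))
print(len(matches(PATH, TRIANGLE, induced=True)))
```

&lt;p&gt;With the flag off, CCC matches C1CC1; with it on, the ring-closing bond rules out every mapping. Substructure search as chemists expect it corresponds to the first behavior.&lt;/p&gt;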

&lt;p&gt;This again suggests to me that Ullmann is for the subgraph isomorphism problem, and the one you point out comes from a different but related problem.&lt;/p&gt;

&lt;p&gt;I'm also feeling out of my field. Once upon a time I got a bachelor's degree in math but that was, umm, 18 years ago and the last time I worked deeply on graph algorithms was 9 years ago when I did some MCS work.&lt;/p&gt;</description>
      <pubDate>Sun, 16 Nov 2008 10:56:19 +0000</pubDate>
      <guid isPermaLink="false">urn:uuid:3b02f5d2-58ed-45bc-9ce6-6d4ffaa3fee4</guid>
      <link>http://depth-first.com/articles/2008/11/13/one-of-these-things-is-not-like-the-other#comment-895</link>
    </item>
    <item>
      <title>"One of These Things is Not Like The Others" by Rich Apodaca</title>
      <description>&lt;p&gt;Andrew, if I follow, your interpretation is:&lt;/p&gt;

&lt;p&gt;&lt;em&gt;monomorphism&lt;/em&gt;: CCC cannot be mapped onto C1CC1&lt;/p&gt;

&lt;p&gt;&lt;em&gt;isomorphism&lt;/em&gt;: CCC can be mapped onto C1CC1&lt;/p&gt;

&lt;p&gt;My understanding is the opposite.&lt;/p&gt;

&lt;p&gt;Reading &lt;a href="http://books.google.com/books?id=DSTdyqOJcoUC&amp;amp;pg=PA12&amp;amp;vq=isomorphism&amp;amp;dq=Efficient+Graph+Rewriting+and+Its+Implementation&amp;amp;source=gbs_search_s&amp;amp;cad=0" rel="nofollow"&gt;further down&lt;/a&gt; into the passage you quote, Dörr goes on to make the distinction between monomorphism and isomorphism. If the induced edge map is &lt;a href="http://en.wikipedia.org/wiki/Bijection" rel="nofollow"&gt;bijective&lt;/a&gt;, then we have an isomorphism. If not, we have a monomorphism.&lt;/p&gt;

&lt;p&gt;In other words, isomorphism is a special case of monomorphism.&lt;/p&gt;

&lt;p&gt;Rajarshi was kind enough to forward a communication with Pasquale Foggia, a VFLib author, who holds the same view, if I'm interpreting it correctly.&lt;/p&gt;

&lt;p&gt;This also appears to be the way VFLib is set up, with its "sub" and "mono" states behaving as I outlined above. See, for example, lines 313 and 333 for the VFLib vf2_mono_state.cc file, in which it appears that the bijective edge constraint has been commented out relative to vf2_sub_state.cc. These are the only differences between the two files.&lt;/p&gt;

&lt;p&gt;Like I say, I'm way outside my field on this and I could be missing something fundamental.&lt;/p&gt;

&lt;p&gt;The big picture, though, is that there are Ullmann implementations that behave as if they're edge-induced. Verbatim translation in such a case will not give a system that can work as a chemical substructure search utility. Figuring out how to apply a fix might be non-trivial.&lt;/p&gt;</description>
      <pubDate>Sun, 16 Nov 2008 02:41:13 +0000</pubDate>
      <guid isPermaLink="false">urn:uuid:a61403f9-c43d-4317-aecc-926bb2171a63</guid>
      <link>http://depth-first.com/articles/2008/11/13/one-of-these-things-is-not-like-the-other#comment-894</link>
    </item>
    <item>
      <title>"One of These Things is Not Like The Others" by Andrew Dalke</title>
      <description>&lt;p&gt;The web sites for OpenEye and Daylight contain comments which suggest that they use Ullmann. For example, see &lt;a href="http://www.eyesopen.com/docs/html/cplusprog/Bibliography.html" rel="nofollow"&gt;http://www.eyesopen.com/docs/html/cplusprog/Bibliography.html&lt;/a&gt; and &lt;a href="http://www.daylight.com/meetings/emug97/Sayle/" rel="nofollow"&gt;http://www.daylight.com/meetings/emug97/Sayle/&lt;/a&gt; . These are not conclusive, though I know from talking to Roger Sayle that they do use Ullmann.&lt;/p&gt;

&lt;p&gt;OpenBabel's implementation is in parsmart.cpp, but I've never read the original Ullmann paper, so I can't compare them: I wouldn't know whether something was the original Ullmann algorithm or a modified version.&lt;/p&gt;

&lt;p&gt;I assume they are unmodified, since no one has ever mentioned this problem.&lt;/p&gt;

&lt;p&gt;But this isn't a question of implementations. You said the Ullmann algorithm was edge-based and gave an example of how Ullmann wouldn't work for matching CCC against C1CC1. What I do know of the algorithm, through other references, suggests otherwise. My memory is that Ullmann is a vertex-based approach, and it doesn't have the behavior you state.&lt;/p&gt;

&lt;p&gt;So I'm not asking for source code verification that those tools use Ullmann; I'm asking for more evidence that the Ullmann algorithm has the problems you say it does.&lt;/p&gt;

&lt;p&gt;You pointed out two examples, one written by you and another a comment by Rajarshi Guha. The first isn't from another source. Rajarshi's comment is that the Ullmann algorithm implemented by VFLib "differentiates between subgraph isomorphism and monomorphism."&lt;/p&gt;

&lt;p&gt;I did a Google book search and found "Efficient Graph Rewriting and Its Implementation" by Heiko Dörr. He writes:&lt;/p&gt;

&lt;p&gt;"""The graph monomorphism is induced by an injective vertex map which must respect the connectivity and labels of the vertices. Additionally, adjacent vertices must be mapped to adjacent ones, and the label of the connecting edge in the image must be equal to that of the original edge."""&lt;/p&gt;

&lt;p&gt;That is, a monomorphism is a subgraph isomorphism with the additional constraint that if two atoms are connected in one graph then they are connected in the other. Hence CCC does not match C1CC1 because the query says that the first and last atoms are not neighbors, and the solution is not allowed to have that bond. (At least, I think that's what monomorphism means.)&lt;/p&gt;

&lt;p&gt;I therefore infer that the Ullmann algorithm in the VFLib implementation is a modified version which also checks for this connectivity constraint, and not the original Ullmann algorithm.&lt;/p&gt;

&lt;p&gt;When I get back to a good research library I should dig up the Ullmann paper. BTW, there are other Ullmann implementations, according to a Google code search for "Ullman isomorphism"; &lt;a href="http://www.google.com/codesearch?hl=en&amp;amp;lr=&amp;amp;q=ullman+isomorphism&amp;amp;sbtn=Search" rel="nofollow"&gt;http://www.google.com/codesearch?hl=en&amp;amp;lr=&amp;amp;q=ullman+isomorphism&amp;amp;sbtn=Search&lt;/a&gt; .
For example, there's one in mmdb_graph.cpp. I still think the algorithm is vertex oriented, given a cursory look at that implementation. (E.g., it builds an array based on the vertex size.)&lt;/p&gt;</description>
      <pubDate>Sat, 15 Nov 2008 20:21:14 +0000</pubDate>
      <guid isPermaLink="false">urn:uuid:5e46b7d9-62f8-4d01-b9ee-0fdf88d75f44</guid>
      <link>http://depth-first.com/articles/2008/11/13/one-of-these-things-is-not-like-the-other#comment-893</link>
    </item>
    <item>
      <title>"One of These Things is Not Like The Others" by Rajarshi Guha</title>
      <description>&lt;p&gt;One source would be to look at MQL (from Gisbert Schneider et al.), where they use an Ullmann implementation. That code is LGPL (though I can't find the source code!)&lt;/p&gt;</description>
      <pubDate>Thu, 13 Nov 2008 20:10:20 +0000</pubDate>
      <guid isPermaLink="false">urn:uuid:0366a2ac-1e0b-4c07-ac59-b48ec100539c</guid>
      <link>http://depth-first.com/articles/2008/11/13/one-of-these-things-is-not-like-the-other#comment-892</link>
    </item>
    <item>
      <title>"One of These Things is Not Like The Others" by Rich Apodaca</title>
      <description>&lt;p&gt;That should be "The effect is not restricted to cyclopropane..."&lt;/p&gt;</description>
      <pubDate>Thu, 13 Nov 2008 15:38:52 +0000</pubDate>
      <guid isPermaLink="false">urn:uuid:1982fd48-2280-41af-a0e1-68df99e14f15</guid>
      <link>http://depth-first.com/articles/2008/11/13/one-of-these-things-is-not-like-the-other#comment-891</link>
    </item>
    <item>
      <title>"One of These Things is Not Like The Others" by Rich Apodaca</title>
      <description>&lt;p&gt;Andrew, I'm not sure what algorithms Daylight and OpenEye use - they may be a modified Ullmann that relaxes the matching constraint so that propane matches cyclopropane. If we had the source code, &lt;a href="http://dalkescientific.blogspot.com/2008/11/open-source-peer-review.html" rel="nofollow"&gt;we could check&lt;/a&gt; ;-). CDK doesn't use Ullmann, but rather appears to use a specially-designed algorithm. JOELib doesn't appear to use Ullmann. I'm not sure about Open Babel.&lt;/p&gt;

&lt;p&gt;The effect is not restricted to cyclopropane; any n-alkane query will fail to match its cycloalkane counterpart.&lt;/p&gt;

&lt;p&gt;Apparently, the distinction comes down to the difference between "isomorphism" and "monomorphism". The former is a bidirectional relationship (edge-induced) whereas the latter is unidirectional between query and target graphs (node-induced) - but I'm no expert on this.&lt;/p&gt;

&lt;p&gt;VFLib takes this into account by offering separate classes for monomorphism matching: "foo_mono_state.cc". Unfortunately, there's no "ull_mono_state.cc".&lt;/p&gt;

&lt;p&gt;I haven't seen much discussion of this issue, and so I feel like I'm going out on a limb mentioning it. If anyone knows better, feel free to chime in.&lt;/p&gt;

&lt;p&gt;But in two cases now with separate Ullmann implementations I've run into the propane-&gt;cyclopropane issue: &lt;a href="http://octet.cvs.sourceforge.net/octet/octet/src/net/sf/octet/traversal/UllmanIsomorphismTraverser.java?revision=1.2&amp;amp;view=markup" rel="nofollow"&gt;Octet&lt;/a&gt; and &lt;a href="http://rguha.wordpress.com/2008/09/19/faster-substructure-search-in-the-cdk/" rel="nofollow"&gt;Rajarshi's example&lt;/a&gt; (see comment #6).&lt;/p&gt;

&lt;p&gt;Unfortunately, all of the discussions on the Ullmann algorithm I've seen have been in terms of matrices and matrix algebra, which I haven't looked at in a long time. This is another example of the kind of implementation that's functional, but not particularly readable.&lt;/p&gt;</description>
      <pubDate>Thu, 13 Nov 2008 15:36:37 +0000</pubDate>
      <guid isPermaLink="false">urn:uuid:6fe8b1e8-5178-4995-b412-aaa9472e4355</guid>
      <link>http://depth-first.com/articles/2008/11/13/one-of-these-things-is-not-like-the-other#comment-890</link>
    </item>
  </channel>
</rss>
